
ClickHouse Analytics

ClickHouse schema, ETL pipeline, and query patterns for NexusCommerce analytics.

Overview

NexusCommerce uses ClickHouse as a dedicated analytics store alongside the primary Supabase (PostgreSQL) database. ClickHouse's columnar storage and vectorized query execution deliver sub-second analytics queries on datasets that would take seconds or minutes in PostgreSQL.

The schema is in services/clickhouse/schema.sql. The ETL pipeline runs on a schedule inside the NestJS API, syncing new and updated records from Supabase to ClickHouse.

Key Concepts

Columnar Storage — ClickHouse stores data column by column rather than row by row. Analytics queries typically access a few columns across many rows, which is extremely efficient in columnar layout.

MergeTree Engine — The primary ClickHouse table engine. ReplacingMergeTree is used for tables that receive updates (e.g., order status changes). AggregatingMergeTree is used for pre-computed aggregations.

Materialized View — A pre-computed query result that ClickHouse updates incrementally as new data arrives. NexusCommerce's dashboard charts load from materialized views, not from raw tables.

ETL — Extract, Transform, Load. The NestJS API's ETL service reads new/updated records from Supabase change events (via Supabase Realtime or polling) and inserts them into ClickHouse.

Getting Started

Initialize ClickHouse schema:

docker compose run --rm clickhouse-init

This executes services/clickhouse/schema.sql against the ClickHouse instance.

Verify the schema:

clickhouse-client --query "SHOW TABLES FROM nexuscommerce"
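To also confirm that each table was created with the expected engine, ClickHouse's system tables can be queried (a quick sanity check; this is not part of the init script):

```sql
-- List every table in the analytics database along with its engine
SELECT name, engine
FROM system.tables
WHERE database = 'nexuscommerce';
```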

Features

Core Tables

orders

CREATE TABLE nexuscommerce.orders (
  tenant_id UUID,
  id UUID,
  external_order_id String,
  marketplace LowCardinality(String),
  status LowCardinality(String),
  fulfillment_method LowCardinality(String),
  total_amount Decimal(10, 2),
  currency LowCardinality(String),
  line_items String,  -- JSON array of line items (referenced by the SKU query below)
  created_at DateTime,
  updated_at DateTime,
  sign Int8
) ENGINE = CollapsingMergeTree(sign)
PARTITION BY toYYYYMM(created_at)
ORDER BY (tenant_id, created_at, id);

The sign column supports updates: to change an order's state, insert a row that mirrors the previously stored values with sign = -1 (cancelling the old state), followed by a row with sign = 1 (carrying the new state). CollapsingMergeTree collapses matching +1/-1 pairs during background merges, so rows never need to be updated or deleted in place.
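As a concrete sketch (all column values are illustrative), a status change from pending to shipped becomes two inserts, and aggregations multiply by sign so results stay correct even before the background merge collapses the pair:

```sql
-- Cancel the old state: repeat the previously inserted values with sign = -1
INSERT INTO nexuscommerce.orders
  (tenant_id, id, external_order_id, marketplace, status, fulfillment_method,
   total_amount, currency, created_at, updated_at, sign)
VALUES
  ('aaaaaaaa-0000-0000-0000-000000000001', 'bbbbbbbb-0000-0000-0000-000000000002',
   'AMZ-1001', 'amazon', 'pending', 'fba', 49.99, 'USD',
   '2024-05-01 10:00:00', '2024-05-01 10:00:00', -1);

-- Record the new state with sign = 1
INSERT INTO nexuscommerce.orders
  (tenant_id, id, external_order_id, marketplace, status, fulfillment_method,
   total_amount, currency, created_at, updated_at, sign)
VALUES
  ('aaaaaaaa-0000-0000-0000-000000000001', 'bbbbbbbb-0000-0000-0000-000000000002',
   'AMZ-1001', 'amazon', 'shipped', 'fba', 49.99, 'USD',
   '2024-05-01 10:00:00', '2024-05-02 08:30:00', 1);

-- sum(sign) nets out uncollapsed pairs: this counts the order exactly once
SELECT sum(sign) AS live_orders
FROM nexuscommerce.orders
WHERE id = 'bbbbbbbb-0000-0000-0000-000000000002';
```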

pricing_events

CREATE TABLE nexuscommerce.pricing_events (
  tenant_id UUID,
  sku String,
  marketplace LowCardinality(String),
  old_price Decimal(10, 2),
  new_price Decimal(10, 2),
  trigger LowCardinality(String),  -- 'rule', 'ai_recommendation', 'manual'
  rule_id Nullable(String),
  created_at DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (tenant_id, sku, created_at);
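The trigger column makes it cheap to ask which mechanism drives repricing. As a sketch (following the document's :tenantId parameter convention), change volume and average price delta per marketplace and trigger over the last 30 days:

```sql
SELECT
  marketplace,
  trigger,
  count() AS price_changes,
  avg(new_price - old_price) AS avg_delta
FROM nexuscommerce.pricing_events
WHERE tenant_id = :tenantId
  AND created_at >= now() - INTERVAL 30 DAY
GROUP BY marketplace, trigger
ORDER BY price_changes DESC;
```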

inventory_events

CREATE TABLE nexuscommerce.inventory_events (
  tenant_id UUID,
  sku String,
  warehouse_id String,
  before_quantity Int32,
  after_quantity Int32,
  reason LowCardinality(String),
  created_at DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (tenant_id, sku, created_at);
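Because every stock change is an append-only event, point-in-time quantities can be reconstructed from the log. A sketch of the latest known quantity per SKU and warehouse (the primary Supabase database remains the operational source of truth for live stock levels):

```sql
-- argMax returns the after_quantity of the most recent event per group
SELECT
  sku,
  warehouse_id,
  argMax(after_quantity, created_at) AS latest_quantity
FROM nexuscommerce.inventory_events
WHERE tenant_id = :tenantId
GROUP BY sku, warehouse_id;
```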

Materialized Views

daily_revenue

Pre-computes daily revenue per tenant per marketplace:

CREATE MATERIALIZED VIEW nexuscommerce.daily_revenue
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (tenant_id, marketplace, day)
AS SELECT
  tenant_id,
  marketplace,
  toStartOfDay(created_at) AS day,
  sumState(total_amount * sign) AS revenue,
  sumState(toInt64(sign)) AS order_count
FROM nexuscommerce.orders
GROUP BY tenant_id, marketplace, day;

Multiplying by sign (rather than filtering on sign = 1) keeps the view correct under updates: a status change inserts a fresh +1 row whose amount would otherwise be counted twice, while its cancelling -1 row subtracts the old state.

Query daily revenue:

SELECT
  day,
  marketplace,
  sumMerge(revenue) AS revenue,
  sumMerge(order_count) AS orders
FROM nexuscommerce.daily_revenue
WHERE tenant_id = :tenantId
  AND day >= :startDate
GROUP BY day, marketplace
ORDER BY day;

ETL Pipeline

The NestJS ETL service (apps/api/src/analytics/etl.service.ts) runs on a 5-minute cron:

  1. Query Supabase for records with updated_at > last_etl_run
  2. Batch records (1,000 per batch)
  3. Insert into ClickHouse via HTTP interface
  4. Update last_etl_run checkpoint in Redis

For orders (which can change status), the ETL inserts two rows per update: a -1 sign row (cancelling the old state) and a +1 sign row (inserting the new state). ClickHouse's CollapsingMergeTree engine merges these during background compaction.
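Step 3 typically uses ClickHouse's JSONEachRow input format over the HTTP interface. Sketching an order status change as the two-row insert described above (field values are illustrative; the real rows are built from the Supabase change payload):

```sql
INSERT INTO nexuscommerce.orders FORMAT JSONEachRow
{"tenant_id": "aaaaaaaa-0000-0000-0000-000000000001", "id": "bbbbbbbb-0000-0000-0000-000000000002", "external_order_id": "AMZ-1001", "marketplace": "amazon", "status": "pending", "fulfillment_method": "fba", "total_amount": 49.99, "currency": "USD", "created_at": "2024-05-01 10:00:00", "updated_at": "2024-05-01 10:00:00", "sign": -1}
{"tenant_id": "aaaaaaaa-0000-0000-0000-000000000001", "id": "bbbbbbbb-0000-0000-0000-000000000002", "external_order_id": "AMZ-1001", "marketplace": "amazon", "status": "shipped", "fulfillment_method": "fba", "total_amount": 49.99, "currency": "USD", "created_at": "2024-05-01 10:00:00", "updated_at": "2024-05-02 08:30:00", "sign": 1}
```

Fields omitted from a JSONEachRow row fall back to the column's default value, so the payload only needs the columns the ETL actually syncs.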

Example Analytics Queries

Revenue by marketplace, last 30 days:

SELECT
  marketplace,
  sumMerge(revenue) AS total_revenue,
  sumMerge(order_count) AS total_orders
FROM nexuscommerce.daily_revenue
WHERE tenant_id = :tenantId
  AND day >= now() - INTERVAL 30 DAY
GROUP BY marketplace
ORDER BY total_revenue DESC;

Top 10 SKUs by revenue, last 7 days:

-- line_items is assumed to be a JSON array per order,
-- e.g. [{"sku": "SKU-1", "quantity": 2, "price": 19.99}, ...]
SELECT
  JSONExtractString(item, 'sku') AS sku,
  sum(JSONExtractFloat(item, 'price') * JSONExtractUInt(item, 'quantity') * sign) AS revenue
FROM nexuscommerce.orders
ARRAY JOIN JSONExtractArrayRaw(line_items) AS item
WHERE tenant_id = :tenantId
  AND created_at >= now() - INTERVAL 7 DAY
GROUP BY sku
ORDER BY revenue DESC
LIMIT 10;

Return rate by marketplace:

SELECT
  o.marketplace,
  countIf(r.id IS NOT NULL) / count() AS return_rate
FROM nexuscommerce.orders o
LEFT JOIN nexuscommerce.returns r ON r.order_id = o.id
WHERE o.tenant_id = :tenantId
  AND o.created_at >= now() - INTERVAL 30 DAY
  AND o.sign = 1
GROUP BY o.marketplace
SETTINGS join_use_nulls = 1;  -- unmatched LEFT JOIN rows become NULL (not column defaults), so IS NOT NULL works

Configuration

| Variable | Description | Default |
| --- | --- | --- |
| CLICKHOUSE_HOST | ClickHouse server host | localhost |
| CLICKHOUSE_PORT | HTTP interface port | 8123 |
| CLICKHOUSE_DATABASE | Database name | nexuscommerce |
| CLICKHOUSE_USER | Username | default |
| CLICKHOUSE_PASSWORD | Password | Required in production |
| ETL_INTERVAL_MINUTES | How often the ETL pipeline runs | 5 |
| ETL_BATCH_SIZE | Records per ClickHouse insert batch | 1000 |
| CLICKHOUSE_RETENTION_MONTHS | Data retention policy | 24 |
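One way CLICKHOUSE_RETENTION_MONTHS can be enforced is a table-level TTL (how the schema script actually applies it is not shown here). A sketch for the orders table at the default of 24 months; since the table is partitioned by month, expired data is cheap to drop:

```sql
ALTER TABLE nexuscommerce.orders
  MODIFY TTL created_at + INTERVAL 24 MONTH;
```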