feat: Add v2 architecture with multi-state support and orchestrator services
Major additions: - Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare - Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator - Discovery system: dutchie discovery service, geo validation, city seeding scripts - Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages - Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram) - Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata Frontend pages added: - Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores - StateHeatmap, CrossStateCompare, SyncInfoPanel Components added: - StateSelector, OrchestratorTraceModal, WorkflowStepper 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
345
docs/multi-state.md
Normal file
345
docs/multi-state.md
Normal file
@@ -0,0 +1,345 @@
|
||||
# Multi-State Support
|
||||
|
||||
## Overview
|
||||
|
||||
Phase 4 implements full multi-state support for CannaiQ, transforming it from an Arizona-only platform to a national cannabis intelligence system. This document covers schema updates, API structure, frontend usage, and operational guidelines.
|
||||
|
||||
## Schema Updates
|
||||
|
||||
### Core Tables Modified
|
||||
|
||||
#### 1. `dispensaries` table
|
||||
Already has `state` column:
|
||||
```sql
|
||||
state CHAR(2) DEFAULT 'AZ' -- State code (AZ, CA, CO, etc.)
|
||||
state_id INTEGER REFERENCES states(id) -- FK to canonical states table
|
||||
```
|
||||
|
||||
#### 2. `raw_payloads` table (Migration 047)
|
||||
Added state column for query optimization:
|
||||
```sql
|
||||
state CHAR(2) -- Denormalized from dispensary for fast filtering
|
||||
```
|
||||
|
||||
#### 3. `states` table
|
||||
Canonical reference for all US states:
|
||||
```sql
|
||||
CREATE TABLE states (
|
||||
id SERIAL PRIMARY KEY,
|
||||
code VARCHAR(2) NOT NULL UNIQUE, -- 'AZ', 'CA', etc.
|
||||
name VARCHAR(100) NOT NULL, -- 'Arizona', 'California', etc.
|
||||
created_at TIMESTAMPTZ DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ DEFAULT NOW()
|
||||
);
|
||||
```
|
||||
|
||||
### New Indexes (Migration 047)
|
||||
|
||||
```sql
|
||||
-- State-based payload filtering
|
||||
CREATE INDEX idx_raw_payloads_state ON raw_payloads(state);
|
||||
CREATE INDEX idx_raw_payloads_state_unprocessed ON raw_payloads(state, processed) WHERE processed = FALSE;
|
||||
|
||||
-- Dispensary state queries
|
||||
CREATE INDEX idx_dispensaries_state_menu_type ON dispensaries(state, menu_type);
|
||||
CREATE INDEX idx_dispensaries_state_crawl_status ON dispensaries(state, crawl_status);
|
||||
CREATE INDEX idx_dispensaries_state_active ON dispensaries(state) WHERE crawl_status != 'disabled';
|
||||
```
|
||||
|
||||
### Materialized Views
|
||||
|
||||
#### `mv_state_metrics`
|
||||
Pre-aggregated state-level metrics for fast dashboard queries:
|
||||
```sql
|
||||
CREATE MATERIALIZED VIEW mv_state_metrics AS
|
||||
SELECT
|
||||
d.state,
|
||||
s.name AS state_name,
|
||||
COUNT(DISTINCT d.id) AS store_count,
|
||||
COUNT(DISTINCT sp.id) AS total_products,
|
||||
COUNT(DISTINCT sp.brand_id) AS unique_brands,
|
||||
AVG(sp.price_rec) AS avg_price_rec,
|
||||
-- ... more metrics
|
||||
FROM dispensaries d
|
||||
LEFT JOIN states s ON d.state = s.code
|
||||
LEFT JOIN store_products sp ON d.id = sp.dispensary_id
|
||||
GROUP BY d.state, s.name;
|
||||
```
|
||||
|
||||
Refresh with:
|
||||
```sql
|
||||
SELECT refresh_state_metrics();
|
||||
```
|
||||
|
||||
### Views
|
||||
|
||||
#### `v_brand_state_presence`
|
||||
Brand presence and metrics per state:
|
||||
```sql
|
||||
SELECT brand_id, brand_name, state, store_count, product_count, avg_price
|
||||
FROM v_brand_state_presence
|
||||
WHERE state = 'AZ';
|
||||
```
|
||||
|
||||
#### `v_category_state_distribution`
|
||||
Category distribution by state:
|
||||
```sql
|
||||
SELECT state, category, product_count, store_count, avg_price
|
||||
FROM v_category_state_distribution
|
||||
WHERE state = 'CA';
|
||||
```
|
||||
|
||||
### Functions
|
||||
|
||||
#### `fn_national_price_comparison(category, brand_id)`
|
||||
Compare prices across all states:
|
||||
```sql
|
||||
SELECT * FROM fn_national_price_comparison('Flower', NULL);
|
||||
```
|
||||
|
||||
#### `fn_brand_state_penetration(brand_id)`
|
||||
Get brand penetration across states:
|
||||
```sql
|
||||
SELECT * FROM fn_brand_state_penetration(123);
|
||||
```
|
||||
|
||||
## API Structure
|
||||
|
||||
### State List Endpoints
|
||||
|
||||
```
|
||||
GET /api/states # All configured states
|
||||
GET /api/states?active=true # Only states with dispensary data
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
{
|
||||
"success": true,
|
||||
"data": {
|
||||
"states": [
|
||||
{ "code": "AZ", "name": "Arizona" },
|
||||
{ "code": "CA", "name": "California" }
|
||||
],
|
||||
"count": 2
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### State-Specific Endpoints
|
||||
|
||||
```
|
||||
GET /api/state/:state/summary # Full state summary with metrics
|
||||
GET /api/state/:state/brands # Brands in state (paginated)
|
||||
GET /api/state/:state/categories # Categories in state (paginated)
|
||||
GET /api/state/:state/stores # Stores in state (paginated)
|
||||
GET /api/state/:state/analytics/prices # Price distribution
|
||||
```
|
||||
|
||||
Query parameters:
|
||||
- `limit` - Results per page (default: 50)
|
||||
- `offset` - Pagination offset
|
||||
- `sortBy` - Sort field (e.g., `productCount`, `avgPrice`)
|
||||
- `sortDir` - Sort direction (`asc` or `desc`)
|
||||
- `includeInactive` - Include disabled stores
|
||||
|
||||
### National Analytics Endpoints
|
||||
|
||||
```
|
||||
GET /api/analytics/national/summary # National aggregate metrics
|
||||
GET /api/analytics/national/prices # Price comparison across states
|
||||
GET /api/analytics/national/heatmap # State heatmap data
|
||||
GET /api/analytics/national/metrics # All state metrics
|
||||
```
|
||||
|
||||
Heatmap metrics:
|
||||
- `stores` - Store count per state
|
||||
- `products` - Product count per state
|
||||
- `brands` - Brand count per state
|
||||
- `avgPrice` - Average price per state
|
||||
- `penetration` - Brand penetration (requires `brandId`)
|
||||
|
||||
### Cross-State Comparison Endpoints
|
||||
|
||||
```
|
||||
GET /api/analytics/compare/brand/:brandId # Compare brand across states
|
||||
GET /api/analytics/compare/category/:category # Compare category across states
|
||||
GET /api/analytics/brand/:brandId/penetration # Brand penetration by state
|
||||
GET /api/analytics/brand/:brandId/trend # Historical penetration trend
|
||||
```
|
||||
|
||||
Query parameters:
|
||||
- `states` - Comma-separated state codes to include (optional)
|
||||
- `days` - Days of history for trends (default: 30)
|
||||
|
||||
### Admin Endpoints
|
||||
|
||||
```
|
||||
POST /api/admin/states/refresh-metrics # Refresh materialized views
|
||||
```
|
||||
|
||||
## Frontend Usage
|
||||
|
||||
### State Selector
|
||||
|
||||
The global state selector is in the sidebar and persists selection via localStorage:
|
||||
|
||||
```tsx
|
||||
import { useStateStore } from '../store/stateStore';
|
||||
|
||||
function MyComponent() {
|
||||
const { selectedState, setSelectedState, isNationalView } = useStateStore();
|
||||
|
||||
// null = All States / National view
|
||||
// 'AZ' = Arizona only
|
||||
|
||||
if (isNationalView()) {
|
||||
// Show national data
|
||||
} else {
|
||||
// Filter by selectedState
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### State Badge Component
|
||||
|
||||
Show current state selection:
|
||||
```tsx
|
||||
import { StateBadge } from '../components/StateSelector';
|
||||
|
||||
<StateBadge /> // Shows "National" or state name
|
||||
```
|
||||
|
||||
### API Calls with State Filter
|
||||
|
||||
```tsx
|
||||
import { api } from '../lib/api';
|
||||
import { useStateStore } from '../store/stateStore';
|
||||
|
||||
function useStateData() {
|
||||
const { selectedState } = useStateStore();
|
||||
|
||||
useEffect(() => {
|
||||
if (selectedState) {
|
||||
// State-specific data
|
||||
api.get(`/state/${selectedState}/summary`);
|
||||
} else {
|
||||
// National data
|
||||
api.get('/analytics/national/summary');
|
||||
}
|
||||
}, [selectedState]);
|
||||
}
|
||||
```
|
||||
|
||||
### Navigation Routes
|
||||
|
||||
| Route | Component | Description |
|
||||
|-------|-----------|-------------|
|
||||
| `/national` | NationalDashboard | National overview with all states |
|
||||
| `/national/heatmap` | StateHeatmap | Interactive state heatmap |
|
||||
| `/national/compare` | CrossStateCompare | Brand/category cross-state comparison |
|
||||
|
||||
## Ingestion Rules
|
||||
|
||||
### State Assignment
|
||||
|
||||
Every raw payload MUST include state:
|
||||
1. State is looked up from `dispensaries.state` during payload storage
|
||||
2. Stored on `raw_payloads.state` for query optimization
|
||||
3. Inherited by all normalized products/snapshots via `dispensary_id`
|
||||
|
||||
### Hydration Pipeline
|
||||
|
||||
The hydration worker supports state filtering:
|
||||
```typescript
|
||||
// Process only AZ payloads
|
||||
await getUnprocessedPayloads(pool, { state: 'AZ' });
|
||||
|
||||
// Process multiple states
|
||||
await getUnprocessedPayloads(pool, { states: ['AZ', 'CA', 'NV'] });
|
||||
```
|
||||
|
||||
### Data Isolation
|
||||
|
||||
Critical rules:
|
||||
- **No cross-state contamination** - Product IDs are unique per (dispensary_id, provider_product_id)
|
||||
- **No SKU merging** - Same SKU in AZ and CA are separate products
|
||||
- **No store merging** - Same store name in different states are separate records
|
||||
- Every dispensary maps to exactly ONE state
|
||||
|
||||
## Constraints & Best Practices
|
||||
|
||||
### Query Performance
|
||||
|
||||
1. Use `mv_state_metrics` for dashboard queries (refreshed hourly)
|
||||
2. Use indexed views for brand/category queries
|
||||
3. Filter by state early in queries to leverage indexes
|
||||
4. For cross-state queries, use the dedicated comparison functions
|
||||
|
||||
### Cache Strategy
|
||||
|
||||
API endpoints should be cached with Redis:
|
||||
```typescript
|
||||
// Cache key pattern
|
||||
`state:${state}:summary` // State summary - 5 min TTL
|
||||
`national:summary` // National summary - 5 min TTL
|
||||
`heatmap:${metric}` // Heatmap data - 5 min TTL
|
||||
```
|
||||
|
||||
### Adding New States
|
||||
|
||||
1. Add state to `states` table (if not already present)
|
||||
2. Import dispensary data with correct `state` code
|
||||
3. Run menu detection for new dispensaries
|
||||
4. Crawl dispensaries with resolved platform IDs
|
||||
5. Refresh materialized views: `SELECT refresh_state_metrics()`
|
||||
|
||||
## Migration Guide
|
||||
|
||||
### From Arizona-Only to Multi-State
|
||||
|
||||
1. Apply migration 047:
|
||||
```bash
|
||||
DATABASE_URL="..." npm run migrate
|
||||
```
|
||||
|
||||
2. Existing AZ data requires no changes (already has `state='AZ'`)
|
||||
|
||||
3. New states are added via:
|
||||
- Manual dispensary import
|
||||
- Menu detection crawl
|
||||
- Platform ID resolution
|
||||
|
||||
4. Frontend automatically shows state selector after update
|
||||
|
||||
### Rollback
|
||||
|
||||
Migration 047 is additive - no destructive changes:
|
||||
- New columns have defaults
|
||||
- Views can be dropped without data loss
|
||||
- Indexes can be dropped for performance tuning
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Key Metrics to Watch
|
||||
|
||||
1. **Store count by state** - `SELECT state, COUNT(*) FROM dispensaries GROUP BY state`
|
||||
2. **Product coverage** - `SELECT state, COUNT(DISTINCT sp.id) FROM store_products...`
|
||||
3. **Crawl health by state** - Check `crawl_runs` by dispensary state
|
||||
4. **Materialized view freshness** - `SELECT refreshed_at FROM mv_state_metrics`
|
||||
|
||||
### Alerts
|
||||
|
||||
Set up alerts for:
|
||||
- Materialized view not refreshed in 2+ hours
|
||||
- State with 0 products after having products
|
||||
- Cross-state data appearing (should never happen)
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
Planned for future phases:
|
||||
1. **Redis caching** for all state endpoints
|
||||
2. **Real-time refresh** of materialized views
|
||||
3. **Geographic heatmaps** with actual US map visualization
|
||||
4. **State-specific pricing rules** (tax rates, etc.)
|
||||
5. **Multi-state brand portfolio tracking**
|
||||
Reference in New Issue
Block a user