Major additions: - Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare - Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator - Discovery system: dutchie discovery service, geo validation, city seeding scripts - Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages - Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram) - Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata Frontend pages added: - Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores - StateHeatmap, CrossStateCompare, SyncInfoPanel Components added: - StateSelector, OrchestratorTraceModal, WorkflowStepper 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
346 lines
9.3 KiB
Markdown
346 lines
9.3 KiB
Markdown
# Multi-State Support
|
|
|
|
## Overview
|
|
|
|
Phase 4 implements full multi-state support for CannaiQ, transforming it from an Arizona-only platform to a national cannabis intelligence system. This document covers schema updates, API structure, frontend usage, and operational guidelines.
|
|
|
|
## Schema Updates
|
|
|
|
### Core Tables Modified
|
|
|
|
#### 1. `dispensaries` table
|
|
Already has `state` column:
|
|
```sql
|
|
state CHAR(2) DEFAULT 'AZ' -- State code (AZ, CA, CO, etc.)
|
|
state_id INTEGER REFERENCES states(id) -- FK to canonical states table
|
|
```
|
|
|
|
#### 2. `raw_payloads` table (Migration 047)
|
|
Added state column for query optimization:
|
|
```sql
|
|
state CHAR(2) -- Denormalized from dispensary for fast filtering
|
|
```
|
|
|
|
#### 3. `states` table
|
|
Canonical reference for all US states:
|
|
```sql
|
|
CREATE TABLE states (
|
|
id SERIAL PRIMARY KEY,
|
|
code VARCHAR(2) NOT NULL UNIQUE, -- 'AZ', 'CA', etc.
|
|
name VARCHAR(100) NOT NULL, -- 'Arizona', 'California', etc.
|
|
created_at TIMESTAMPTZ DEFAULT NOW(),
|
|
updated_at TIMESTAMPTZ DEFAULT NOW()
|
|
);
|
|
```
|
|
|
|
### New Indexes (Migration 047)
|
|
|
|
```sql
|
|
-- State-based payload filtering
|
|
CREATE INDEX idx_raw_payloads_state ON raw_payloads(state);
|
|
CREATE INDEX idx_raw_payloads_state_unprocessed ON raw_payloads(state, processed) WHERE processed = FALSE;
|
|
|
|
-- Dispensary state queries
|
|
CREATE INDEX idx_dispensaries_state_menu_type ON dispensaries(state, menu_type);
|
|
CREATE INDEX idx_dispensaries_state_crawl_status ON dispensaries(state, crawl_status);
|
|
CREATE INDEX idx_dispensaries_state_active ON dispensaries(state) WHERE crawl_status != 'disabled';
|
|
```
|
|
|
|
### Materialized Views
|
|
|
|
#### `mv_state_metrics`
|
|
Pre-aggregated state-level metrics for fast dashboard queries:
|
|
```sql
|
|
CREATE MATERIALIZED VIEW mv_state_metrics AS
|
|
SELECT
|
|
d.state,
|
|
s.name AS state_name,
|
|
COUNT(DISTINCT d.id) AS store_count,
|
|
COUNT(DISTINCT sp.id) AS total_products,
|
|
COUNT(DISTINCT sp.brand_id) AS unique_brands,
|
|
AVG(sp.price_rec) AS avg_price_rec,
|
|
-- ... more metrics
|
|
FROM dispensaries d
|
|
LEFT JOIN states s ON d.state = s.code
|
|
LEFT JOIN store_products sp ON d.id = sp.dispensary_id
|
|
GROUP BY d.state, s.name;
|
|
```
|
|
|
|
Refresh with:
|
|
```sql
|
|
SELECT refresh_state_metrics();
|
|
```
|
|
|
|
### Views
|
|
|
|
#### `v_brand_state_presence`
|
|
Brand presence and metrics per state:
|
|
```sql
|
|
SELECT brand_id, brand_name, state, store_count, product_count, avg_price
|
|
FROM v_brand_state_presence
|
|
WHERE state = 'AZ';
|
|
```
|
|
|
|
#### `v_category_state_distribution`
|
|
Category distribution by state:
|
|
```sql
|
|
SELECT state, category, product_count, store_count, avg_price
|
|
FROM v_category_state_distribution
|
|
WHERE state = 'CA';
|
|
```
|
|
|
|
### Functions
|
|
|
|
#### `fn_national_price_comparison(category, brand_id)`
|
|
Compare prices across all states:
|
|
```sql
|
|
SELECT * FROM fn_national_price_comparison('Flower', NULL);
|
|
```
|
|
|
|
#### `fn_brand_state_penetration(brand_id)`
|
|
Get brand penetration across states:
|
|
```sql
|
|
SELECT * FROM fn_brand_state_penetration(123);
|
|
```
|
|
|
|
## API Structure
|
|
|
|
### State List Endpoints
|
|
|
|
```
|
|
GET /api/states # All configured states
|
|
GET /api/states?active=true # Only states with dispensary data
|
|
```
|
|
|
|
Response:
|
|
```json
|
|
{
|
|
"success": true,
|
|
"data": {
|
|
"states": [
|
|
{ "code": "AZ", "name": "Arizona" },
|
|
{ "code": "CA", "name": "California" }
|
|
],
|
|
"count": 2
|
|
}
|
|
}
|
|
```
|
|
|
|
### State-Specific Endpoints
|
|
|
|
```
|
|
GET /api/state/:state/summary # Full state summary with metrics
|
|
GET /api/state/:state/brands # Brands in state (paginated)
|
|
GET /api/state/:state/categories # Categories in state (paginated)
|
|
GET /api/state/:state/stores # Stores in state (paginated)
|
|
GET /api/state/:state/analytics/prices # Price distribution
|
|
```
|
|
|
|
Query parameters:
|
|
- `limit` - Results per page (default: 50)
|
|
- `offset` - Pagination offset
|
|
- `sortBy` - Sort field (e.g., `productCount`, `avgPrice`)
|
|
- `sortDir` - Sort direction (`asc` or `desc`)
|
|
- `includeInactive` - Include disabled stores
|
|
|
|
### National Analytics Endpoints
|
|
|
|
```
|
|
GET /api/analytics/national/summary # National aggregate metrics
|
|
GET /api/analytics/national/prices # Price comparison across states
|
|
GET /api/analytics/national/heatmap # State heatmap data
|
|
GET /api/analytics/national/metrics # All state metrics
|
|
```
|
|
|
|
Heatmap metrics:
|
|
- `stores` - Store count per state
|
|
- `products` - Product count per state
|
|
- `brands` - Brand count per state
|
|
- `avgPrice` - Average price per state
|
|
- `penetration` - Brand penetration (requires `brandId`)
|
|
|
|
### Cross-State Comparison Endpoints
|
|
|
|
```
|
|
GET /api/analytics/compare/brand/:brandId # Compare brand across states
|
|
GET /api/analytics/compare/category/:category # Compare category across states
|
|
GET /api/analytics/brand/:brandId/penetration # Brand penetration by state
|
|
GET /api/analytics/brand/:brandId/trend # Historical penetration trend
|
|
```
|
|
|
|
Query parameters:
|
|
- `states` - Comma-separated state codes to include (optional)
|
|
- `days` - Days of history for trends (default: 30)
|
|
|
|
### Admin Endpoints
|
|
|
|
```
|
|
POST /api/admin/states/refresh-metrics # Refresh materialized views
|
|
```
|
|
|
|
## Frontend Usage
|
|
|
|
### State Selector
|
|
|
|
The global state selector is in the sidebar and persists selection via localStorage:
|
|
|
|
```tsx
|
|
import { useStateStore } from '../store/stateStore';
|
|
|
|
function MyComponent() {
|
|
const { selectedState, setSelectedState, isNationalView } = useStateStore();
|
|
|
|
// null = All States / National view
|
|
// 'AZ' = Arizona only
|
|
|
|
if (isNationalView()) {
|
|
// Show national data
|
|
} else {
|
|
// Filter by selectedState
|
|
}
|
|
}
|
|
```
|
|
|
|
### State Badge Component
|
|
|
|
Show current state selection:
|
|
```tsx
|
|
import { StateBadge } from '../components/StateSelector';
|
|
|
|
<StateBadge /> // Shows "National" or state name
|
|
```
|
|
|
|
### API Calls with State Filter
|
|
|
|
```tsx
|
|
import { api } from '../lib/api';
|
|
import { useStateStore } from '../store/stateStore';
|
|
|
|
function useStateData() {
|
|
const { selectedState } = useStateStore();
|
|
|
|
useEffect(() => {
|
|
if (selectedState) {
|
|
// State-specific data
|
|
api.get(`/state/${selectedState}/summary`);
|
|
} else {
|
|
// National data
|
|
api.get('/analytics/national/summary');
|
|
}
|
|
}, [selectedState]);
|
|
}
|
|
```
|
|
|
|
### Navigation Routes
|
|
|
|
| Route | Component | Description |
|
|
|-------|-----------|-------------|
|
|
| `/national` | NationalDashboard | National overview with all states |
|
|
| `/national/heatmap` | StateHeatmap | Interactive state heatmap |
|
|
| `/national/compare` | CrossStateCompare | Brand/category cross-state comparison |
|
|
|
|
## Ingestion Rules
|
|
|
|
### State Assignment
|
|
|
|
Every raw payload MUST include state:
|
|
1. State is looked up from `dispensaries.state` during payload storage
|
|
2. Stored on `raw_payloads.state` for query optimization
|
|
3. Inherited by all normalized products/snapshots via `dispensary_id`
|
|
|
|
### Hydration Pipeline
|
|
|
|
The hydration worker supports state filtering:
|
|
```typescript
|
|
// Process only AZ payloads
|
|
await getUnprocessedPayloads(pool, { state: 'AZ' });
|
|
|
|
// Process multiple states
|
|
await getUnprocessedPayloads(pool, { states: ['AZ', 'CA', 'NV'] });
|
|
```
|
|
|
|
### Data Isolation
|
|
|
|
Critical rules:
|
|
- **No cross-state contamination** - Product IDs are unique per (dispensary_id, provider_product_id)
|
|
- **No SKU merging** - Same SKU in AZ and CA are separate products
|
|
- **No store merging** - Same store name in different states are separate records
|
|
- Every dispensary maps to exactly ONE state
|
|
|
|
## Constraints & Best Practices
|
|
|
|
### Query Performance
|
|
|
|
1. Use `mv_state_metrics` for dashboard queries (refreshed hourly)
|
|
2. Use indexed views for brand/category queries
|
|
3. Filter by state early in queries to leverage indexes
|
|
4. For cross-state queries, use the dedicated comparison functions
|
|
|
|
### Cache Strategy
|
|
|
|
API endpoints should be cached with Redis:
|
|
```typescript
|
|
// Cache key pattern
|
|
`state:${state}:summary` // State summary - 5 min TTL
|
|
`national:summary` // National summary - 5 min TTL
|
|
`heatmap:${metric}` // Heatmap data - 5 min TTL
|
|
```
|
|
|
|
### Adding New States
|
|
|
|
1. Add state to `states` table (if not already present)
|
|
2. Import dispensary data with correct `state` code
|
|
3. Run menu detection for new dispensaries
|
|
4. Crawl dispensaries with resolved platform IDs
|
|
5. Refresh materialized views: `SELECT refresh_state_metrics()`
|
|
|
|
## Migration Guide
|
|
|
|
### From Arizona-Only to Multi-State
|
|
|
|
1. Apply migration 047:
|
|
```bash
|
|
DATABASE_URL="..." npm run migrate
|
|
```
|
|
|
|
2. Existing AZ data requires no changes (already has `state='AZ'`)
|
|
|
|
3. New states are added via:
|
|
- Manual dispensary import
|
|
- Menu detection crawl
|
|
- Platform ID resolution
|
|
|
|
4. Frontend automatically shows state selector after update
|
|
|
|
### Rollback
|
|
|
|
Migration 047 is additive - no destructive changes:
|
|
- New columns have defaults
|
|
- Views can be dropped without data loss
|
|
- Indexes can be dropped for performance tuning
|
|
|
|
## Monitoring
|
|
|
|
### Key Metrics to Watch
|
|
|
|
1. **Store count by state** - `SELECT state, COUNT(*) FROM dispensaries GROUP BY state`
|
|
2. **Product coverage** - `SELECT state, COUNT(DISTINCT sp.id) FROM store_products...`
|
|
3. **Crawl health by state** - Check `crawl_runs` by dispensary state
|
|
4. **Materialized view freshness** - `SELECT refreshed_at FROM mv_state_metrics`
|
|
|
|
### Alerts
|
|
|
|
Set up alerts for:
|
|
- Materialized view not refreshed in 2+ hours
|
|
- State with 0 products after having products
|
|
- Cross-state data appearing (should never happen)
|
|
|
|
## Future Enhancements
|
|
|
|
Planned for future phases:
|
|
1. **Redis caching** for all state endpoints
|
|
2. **Real-time refresh** of materialized views
|
|
3. **Geographic heatmaps** with actual US map visualization
|
|
4. **State-specific pricing rules** (tax rates, etc.)
|
|
5. **Multi-state brand portfolio tracking**
|