Files
cannaiq/backend/docs/ANALYTICS_RUNBOOK.md
Kelly b4a2fb7d03 feat: Add v2 architecture with multi-state support and orchestrator services
Major additions:
- Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare
- Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator
- Discovery system: dutchie discovery service, geo validation, city seeding scripts
- Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages
- Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram)
- Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata

Frontend pages added:
- Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores
- StateHeatmap, CrossStateCompare, SyncInfoPanel

Components added:
- StateSelector, OrchestratorTraceModal, WorkflowStepper

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:30:57 -07:00

713 lines
20 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# CannaiQ Analytics Runbook
Phase 3: Analytics Engine - Complete Implementation Guide
## Overview
The CannaiQ Analytics Engine provides real-time insights into cannabis market data across price trends, brand penetration, category performance, store changes, and competitive positioning.
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ API Layer │
│ /api/az/analytics/* │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Analytics Services │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │PriceTrend │ │Penetration │ │CategoryAnalytics │ │
│ │Service │ │Service │ │Service │ │
│ └──────────────┘ └──────────────┘ └──────────────────────┘ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │StoreChange │ │BrandOpportunity│ │AnalyticsCache │ │
│ │Service │ │Service │ │(15-min TTL) │ │
│ └──────────────┘ └──────────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Canonical Tables │
│ store_products │ store_product_snapshots │ brands │ categories │
│ dispensaries │ brand_snapshots │ category_snapshots │
└─────────────────────────────────────────────────────────────────┘
```
## Services
### 1. PriceTrendService
Provides time-series price analytics.
**Key Methods:**
| Method | Description |
|--------|-------------|
| `getProductPriceTrend(productId, storeId?, days)` | Price history for a product |
| `getBrandPriceTrend(brandName, filters)` | Average prices for a brand |
| `getCategoryPriceTrend(category, filters)` | Category-level price trends |
| `getPriceSummary(filters)` | 7d/30d/90d price averages |
| `detectPriceCompression(category, state?)` | Price war detection |
| `getGlobalPriceStats()` | Market-wide pricing overview |
**Filters:**
```typescript
interface PriceFilters {
storeId?: number;
brandName?: string;
category?: string;
state?: string;
days?: number; // default: 30
}
```
**Price Compression Detection:**
- Calculates standard deviation of prices within category
- Returns compression score 0-100 (higher = more compressed)
- Identifies brands converging toward mean price
---
### 2. PenetrationService
Tracks brand market presence across stores and states.
**Key Methods:**
| Method | Description |
|--------|-------------|
| `getBrandPenetration(brandName, filters)` | Store count, SKU count, coverage |
| `getTopBrandsByPenetration(limit, filters)` | Leaderboard of dominant brands |
| `getPenetrationTrend(brandName, days)` | Historical penetration growth |
| `getShelfShareByCategory(brandName)` | % of shelf per category |
| `getBrandPresenceByState(brandName)` | Multi-state presence map |
| `getStoresCarryingBrand(brandName)` | List of stores carrying brand |
| `getPenetrationHeatmap(brandName?)` | Geographic distribution |
**Penetration Calculation:**
```
Penetration % = (Stores with Brand / Total Stores in Market) × 100
```
---
### 3. CategoryAnalyticsService
Analyzes category performance and trends.
**Key Methods:**
| Method | Description |
|--------|-------------|
| `getCategorySummary(category?, filters)` | SKU count, avg price, stores |
| `getCategoryGrowth(days, filters)` | 7d/30d/90d growth rates |
| `getCategoryGrowthTrend(category, days)` | Time-series category growth |
| `getCategoryHeatmap(metric, periods)` | Visual heatmap data |
| `getTopMovers(limit, days)` | Fastest growing/declining categories |
| `getSubcategoryBreakdown(category)` | Drill-down into subcategories |
**Time Windows:**
- 7 days: Short-term volatility
- 30 days: Monthly trends
- 90 days: Seasonal patterns
---
### 4. StoreChangeService
Tracks product adds/drops, brand changes, and price movements per store.
**Key Methods:**
| Method | Description |
|--------|-------------|
| `getStoreChangeSummary(storeId)` | Overview of recent changes |
| `getStoreChangeEvents(storeId, filters)` | Event log (add, drop, price, OOS) |
| `getNewBrands(storeId, days)` | Brands added to store |
| `getLostBrands(storeId, days)` | Brands dropped from store |
| `getProductChanges(storeId, type, days)` | Filtered product changes |
| `getCategoryLeaderboard(category, limit)` | Top stores for category |
| `getMostActiveStores(days, limit)` | Stores with most changes |
| `compareStores(store1, store2)` | Side-by-side store comparison |
**Event Types:**
- `added` - New product appeared
- `discontinued` - Product removed
- `price_drop` - Price decreased
- `price_increase` - Price increased
- `restocked` - OOS → In Stock
- `out_of_stock` - In Stock → OOS
---
### 5. BrandOpportunityService
Competitive intelligence and opportunity identification.
**Key Methods:**
| Method | Description |
|--------|-------------|
| `getBrandOpportunity(brandName)` | Full opportunity analysis |
| `getMarketPositionSummary(brandName)` | Market position vs competitors |
| `getAlerts(filters)` | Analytics-generated alerts |
| `markAlertsRead(alertIds)` | Mark alerts as read |
**Opportunity Analysis Includes:**
- White space stores (potential targets)
- Competitive threats (brands gaining share)
- Pricing opportunities (underpriced vs market)
- Missing SKU recommendations
---
### 6. AnalyticsCache
In-memory caching with database fallback.
**Configuration:**
```typescript
const cache = new AnalyticsCache(pool, {
defaultTtlMinutes: 15,
});
```
**Usage Pattern:**
```typescript
const data = await cache.getOrCompute(cacheKey, async () => {
// Expensive query here
return result;
});
```
**Cache Management:**
- `GET /api/az/analytics/cache/stats` - View cache stats
- `POST /api/az/analytics/cache/clear?pattern=price*` - Clear by pattern
- Auto-cleanup of expired entries every 5 minutes
---
## API Endpoints Reference
### Price Endpoints
```bash
# Product price trend (last 30 days)
GET /api/az/analytics/price/product/12345?days=30
# Brand price trend with filters
GET /api/az/analytics/price/brand/Cookies?storeId=101&category=Flower&days=90
# Category median price
GET /api/az/analytics/price/category/Vaporizers?state=AZ
# Price summary (7d/30d/90d)
GET /api/az/analytics/price/summary?brand=Stiiizy&state=AZ
# Detect price wars
GET /api/az/analytics/price/compression/Flower?state=AZ
# Global stats
GET /api/az/analytics/price/global
```
### Penetration Endpoints
```bash
# Brand penetration
GET /api/az/analytics/penetration/brand/Cookies
# Top brands leaderboard
GET /api/az/analytics/penetration/top?limit=20&state=AZ&category=Flower
# Penetration trend
GET /api/az/analytics/penetration/trend/Cookies?days=90
# Shelf share by category
GET /api/az/analytics/penetration/shelf-share/Cookies
# Multi-state presence
GET /api/az/analytics/penetration/by-state/Cookies
# Stores carrying brand
GET /api/az/analytics/penetration/stores/Cookies
# Heatmap data
GET /api/az/analytics/penetration/heatmap?brand=Cookies
```
### Category Endpoints
```bash
# Category summary
GET /api/az/analytics/category/summary?category=Flower&state=AZ
# Category growth (7d/30d/90d)
GET /api/az/analytics/category/growth?days=30&state=AZ
# Category trend
GET /api/az/analytics/category/trend/Concentrates?days=90
# Heatmap
GET /api/az/analytics/category/heatmap?metric=growth&periods=12
# Top movers (growing/declining)
GET /api/az/analytics/category/top-movers?limit=5&days=30
# Subcategory breakdown
GET /api/az/analytics/category/Edibles/subcategories
```
### Store Endpoints
```bash
# Store change summary
GET /api/az/analytics/store/101/summary
# Event log
GET /api/az/analytics/store/101/events?type=price_drop&days=7&limit=50
# New brands
GET /api/az/analytics/store/101/brands/new?days=30
# Lost brands
GET /api/az/analytics/store/101/brands/lost?days=30
# Product changes by type
GET /api/az/analytics/store/101/products/changes?type=added&days=7
# Category leaderboard
GET /api/az/analytics/store/leaderboard/Flower?limit=20
# Most active stores
GET /api/az/analytics/store/most-active?days=7&limit=10
# Compare two stores
GET /api/az/analytics/store/compare?store1=101&store2=102
```
### Brand Opportunity Endpoints
```bash
# Full opportunity analysis
GET /api/az/analytics/brand/Cookies/opportunity
# Market position summary
GET /api/az/analytics/brand/Cookies/position
# Get alerts
GET /api/az/analytics/alerts?brand=Cookies&type=competitive&unreadOnly=true
# Mark alerts read
POST /api/az/analytics/alerts/mark-read
Body: { "alertIds": [1, 2, 3] }
```
### Maintenance Endpoints
```bash
# Capture daily snapshots (run by scheduler)
POST /api/az/analytics/snapshots/capture
# Cache statistics
GET /api/az/analytics/cache/stats
# Clear cache (admin)
POST /api/az/analytics/cache/clear?pattern=price*
```
---
## Incremental Computation
Analytics are designed for real-time queries without full recomputation:
### Snapshot Strategy
1. **Raw Data**: `store_products` (current state)
2. **Historical**: `store_product_snapshots` (time-series)
3. **Aggregated**: `brand_snapshots`, `category_snapshots` (daily rollups)
### Window Calculations
```sql
-- 7-day window
WHERE crawled_at >= NOW() - INTERVAL '7 days'
-- 30-day window
WHERE crawled_at >= NOW() - INTERVAL '30 days'
-- 90-day window
WHERE crawled_at >= NOW() - INTERVAL '90 days'
```
### Materialized Views (Optional)
For heavy queries, create materialized views:
```sql
CREATE MATERIALIZED VIEW mv_brand_daily_metrics AS
SELECT
DATE(sps.captured_at) as date,
sp.brand_id,
COUNT(DISTINCT sp.dispensary_id) as store_count,
COUNT(*) as sku_count,
AVG(sp.price_rec) as avg_price
FROM store_product_snapshots sps
JOIN store_products sp ON sps.store_product_id = sp.id
WHERE sps.captured_at >= NOW() - INTERVAL '90 days'
GROUP BY DATE(sps.captured_at), sp.brand_id;
-- Refresh daily
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_brand_daily_metrics;
```
---
## Scheduled Jobs
### Daily Snapshot Capture
Trigger via cron or scheduler:
```bash
curl -X POST http://localhost:3010/api/az/analytics/snapshots/capture
```
This calls:
- `capture_brand_snapshots()` - Captures brand metrics
- `capture_category_snapshots()` - Captures category metrics
### Cache Cleanup
Automatic cleanup every 5 minutes via in-memory timer.
For manual cleanup:
```bash
curl -X POST http://localhost:3010/api/az/analytics/cache/clear
```
---
## Extending Analytics (Future Phases)
### Phase 6: Intelligence Engine
- Automated alert generation
- Recommendation engine
- Price prediction
### Phase 7: Orders Integration
- Sales velocity analytics
- Reorder predictions
- Inventory turnover
### Phase 8: Advanced ML
- Demand forecasting
- Price elasticity modeling
- Customer segmentation
---
## Troubleshooting
### Common Issues
**1. Slow queries**
- Check cache stats: `GET /api/az/analytics/cache/stats`
- Increase cache TTL if data doesn't need real-time freshness
- Add indexes on frequently filtered columns
**2. Empty results**
- Verify data exists in source tables
- Check filter parameters (case-sensitive brand names)
- Verify state codes are valid
**3. Stale data**
- Run snapshot capture: `POST /api/az/analytics/snapshots/capture`
- Clear cache: `POST /api/az/analytics/cache/clear`
### Debugging
Enable query logging:
```typescript
// In service constructor
this.debug = process.env.ANALYTICS_DEBUG === 'true';
```
---
## Data Contracts
### Price Trend Response
```typescript
interface PriceTrend {
productId?: number;
storeId?: number;
brandName?: string;
category?: string;
dataPoints: Array<{
date: string;
minPrice: number | null;
maxPrice: number | null;
avgPrice: number | null;
wholesalePrice: number | null;
sampleSize: number;
}>;
summary: {
currentAvg: number | null;
previousAvg: number | null;
changePercent: number | null;
trend: 'up' | 'down' | 'stable';
volatilityScore: number | null;
};
}
```
### Brand Penetration Response
```typescript
interface BrandPenetration {
brandName: string;
totalStores: number;
storesWithBrand: number;
penetrationPercent: number;
skuCount: number;
avgPrice: number | null;
priceRange: { min: number; max: number } | null;
topCategories: Array<{ category: string; count: number }>;
stateBreakdown?: Array<{ state: string; storeCount: number }>;
}
```
### Category Growth Response
```typescript
interface CategoryGrowth {
category: string;
currentCount: number;
previousCount: number;
growthPercent: number;
growthTrend: 'up' | 'down' | 'stable';
avgPrice: number | null;
priceChange: number | null;
topBrands: Array<{ brandName: string; count: number }>;
}
```
---
## Files Reference
| File | Purpose |
|------|---------|
| `src/dutchie-az/services/analytics/price-trends.ts` | Price analytics |
| `src/dutchie-az/services/analytics/penetration.ts` | Brand penetration |
| `src/dutchie-az/services/analytics/category-analytics.ts` | Category metrics |
| `src/dutchie-az/services/analytics/store-changes.ts` | Store event tracking |
| `src/dutchie-az/services/analytics/brand-opportunity.ts` | Competitive intel |
| `src/dutchie-az/services/analytics/cache.ts` | Caching layer |
| `src/dutchie-az/services/analytics/index.ts` | Module exports |
| `src/dutchie-az/routes/analytics.ts` | API routes (680 LOC) |
| `src/multi-state/state-query-service.ts` | Cross-state analytics |
---
---
## Analytics V2: Rec/Med State Segmentation
Phase 3 Enhancement: Enhanced analytics with recreational vs medical-only state analysis.
### V2 API Endpoints
All V2 endpoints are prefixed with `/api/analytics/v2`
#### V2 Price Analytics
```bash
# Price trends for a specific product
GET /api/analytics/v2/price/product/12345?window=30d
# Price by category and state (with rec/med segmentation)
GET /api/analytics/v2/price/category/Flower?state=AZ
# Price by brand and state
GET /api/analytics/v2/price/brand/Cookies?state=AZ
# Most volatile products
GET /api/analytics/v2/price/volatile?window=30d&limit=50&state=AZ
# Rec vs Med price comparison by category
GET /api/analytics/v2/price/rec-vs-med?category=Flower
```
#### V2 Brand Penetration
```bash
# Brand penetration metrics with state breakdown
GET /api/analytics/v2/brand/Cookies/penetration?window=30d
# Brand market position within categories
GET /api/analytics/v2/brand/Cookies/market-position?category=Flower&state=AZ
# Brand presence in rec vs med-only states
GET /api/analytics/v2/brand/Cookies/rec-vs-med
# Top brands by penetration
GET /api/analytics/v2/brand/top?limit=25&state=AZ
# Brands expanding or contracting
GET /api/analytics/v2/brand/expansion-contraction?window=30d&limit=25
```
#### V2 Category Analytics
```bash
# Category growth metrics
GET /api/analytics/v2/category/Flower/growth?window=30d
# Category growth trend over time
GET /api/analytics/v2/category/Flower/trend?window=30d
# Top brands in category
GET /api/analytics/v2/category/Flower/top-brands?limit=25&state=AZ
# All categories with metrics
GET /api/analytics/v2/category/all?state=AZ&limit=50
# Rec vs Med category comparison
GET /api/analytics/v2/category/rec-vs-med?category=Flower
# Fastest growing categories
GET /api/analytics/v2/category/fastest-growing?window=30d&limit=25
```
#### V2 Store Analytics
```bash
# Store change summary
GET /api/analytics/v2/store/101/summary?window=30d
# Product change events
GET /api/analytics/v2/store/101/events?window=7d&limit=100
# Store inventory composition
GET /api/analytics/v2/store/101/inventory
# Store price positioning vs market
GET /api/analytics/v2/store/101/price-position
# Most active stores by changes
GET /api/analytics/v2/store/most-active?window=7d&limit=25&state=AZ
```
#### V2 State Analytics
```bash
# State market summary
GET /api/analytics/v2/state/AZ/summary
# All states with coverage metrics
GET /api/analytics/v2/state/all
# Legal state breakdown (rec, med-only, no program)
GET /api/analytics/v2/state/legal-breakdown
# Rec vs Med pricing by category
GET /api/analytics/v2/state/rec-vs-med-pricing?category=Flower
# States with coverage gaps
GET /api/analytics/v2/state/coverage-gaps
# Cross-state pricing comparison
GET /api/analytics/v2/state/price-comparison
```
### V2 Services Architecture
```
src/services/analytics/
├── index.ts # Exports all V2 services
├── types.ts # Shared type definitions
├── PriceAnalyticsService.ts # Price trends and volatility
├── BrandPenetrationService.ts # Brand market presence
├── CategoryAnalyticsService.ts # Category growth analysis
├── StoreAnalyticsService.ts # Store change tracking
└── StateAnalyticsService.ts # State-level analytics
src/routes/analytics-v2.ts # V2 API route handlers
```
### Key V2 Features
1. **Rec/Med State Segmentation**: All analytics can be filtered and compared by legal status
2. **State Coverage Gaps**: Identify legal states with missing or stale data
3. **Cross-State Pricing**: Compare prices across recreational and medical-only markets
4. **Brand Footprint Analysis**: Track brand presence in rec vs med states
5. **Category Comparison**: Compare category performance by legal status
### V2 Migration Path
1. Run migration 052 for state cannabis flags:
```bash
psql "$DATABASE_URL" -f migrations/052_add_state_cannabis_flags.sql
```
2. Run migration 053 for analytics indexes:
```bash
psql "$DATABASE_URL" -f migrations/053_analytics_indexes.sql
```
3. Restart backend to pick up new routes
### V2 Response Examples
**Rec vs Med Price Comparison:**
```json
{
"category": "Flower",
"recreational": {
"state_count": 15,
"product_count": 12500,
"avg_price": 35.50,
"median_price": 32.00
},
"medical_only": {
"state_count": 8,
"product_count": 5200,
"avg_price": 42.00,
"median_price": 40.00
},
"price_diff_percent": -15.48
}
```
**Legal State Breakdown:**
```json
{
"recreational_states": {
"count": 24,
"dispensary_count": 850,
"product_count": 125000,
"states": [
{ "code": "CA", "name": "California", "dispensary_count": 250 },
{ "code": "CO", "name": "Colorado", "dispensary_count": 150 }
]
},
"medical_only_states": {
"count": 18,
"dispensary_count": 320,
"product_count": 45000,
"states": [
{ "code": "FL", "name": "Florida", "dispensary_count": 120 }
]
},
"no_program_states": {
"count": 9,
"states": [
{ "code": "ID", "name": "Idaho" }
]
}
}
```
---
*Phase 3 Analytics Engine - Fully Implemented*
*V2 Rec/Med State Analytics - Added December 2024*