- Move deprecated directories to src/_deprecated/:
  - hydration/ (old pipeline approach)
  - scraper-v2/ (old Puppeteer scraper)
  - canonical-hydration/ (merged into tasks)
  - Unused services: availability, crawler-logger, geolocation, etc.
  - Unused utils: age-gate-playwright, HomepageValidator, stealthBrowser
- Archive outdated docs to docs/_archive/:
  - ANALYTICS_RUNBOOK.md
  - ANALYTICS_V2_EXAMPLES.md
  - BRAND_INTELLIGENCE_API.md
  - CRAWL_PIPELINE.md
  - TASK_WORKFLOW_2024-12-10.md
  - WORKER_TASK_ARCHITECTURE.md
  - ORGANIC_SCRAPING_GUIDE.md
- Add docs/CODEBASE_MAP.md as single source of truth
- Add warning files to deprecated/archived directories
- Slim down CLAUDE.md to essential rules only

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CannaiQ Analytics Runbook
Phase 3: Analytics Engine - Complete Implementation Guide
Overview
The CannaiQ Analytics Engine provides real-time insights into cannabis market data across price trends, brand penetration, category performance, store changes, and competitive positioning.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                            API Layer                            │
│                       /api/az/analytics/*                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Analytics Services                        │
│  ┌──────────────┐ ┌────────────────┐ ┌──────────────────────┐   │
│  │PriceTrend    │ │Penetration     │ │CategoryAnalytics     │   │
│  │Service       │ │Service         │ │Service               │   │
│  └──────────────┘ └────────────────┘ └──────────────────────┘   │
│  ┌──────────────┐ ┌────────────────┐ ┌──────────────────────┐   │
│  │StoreChange   │ │BrandOpportunity│ │AnalyticsCache        │   │
│  │Service       │ │Service         │ │(15-min TTL)          │   │
│  └──────────────┘ └────────────────┘ └──────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Canonical Tables                         │
│ store_products │ store_product_snapshots │ brands │ categories  │
│       dispensaries │ brand_snapshots │ category_snapshots       │
└─────────────────────────────────────────────────────────────────┘
Services
1. PriceTrendService
Provides time-series price analytics.
Key Methods:
| Method | Description |
|---|---|
| `getProductPriceTrend(productId, storeId?, days)` | Price history for a product |
| `getBrandPriceTrend(brandName, filters)` | Average prices for a brand |
| `getCategoryPriceTrend(category, filters)` | Category-level price trends |
| `getPriceSummary(filters)` | 7d/30d/90d price averages |
| `detectPriceCompression(category, state?)` | Price war detection |
| `getGlobalPriceStats()` | Market-wide pricing overview |
Filters:
interface PriceFilters {
  storeId?: number;
  brandName?: string;
  category?: string;
  state?: string;
  days?: number; // default: 30
}
Price Compression Detection:
- Calculates standard deviation of prices within category
- Returns compression score 0-100 (higher = more compressed)
- Identifies brands converging toward mean price
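The runbook does not give the scoring formula, so the following is only a minimal sketch of one way to turn a standard deviation into a 0-100 score. It assumes the score is based on the coefficient of variation (standard deviation divided by mean); the function name `compressionScore` and this mapping are hypothetical, not the service's actual implementation.

```typescript
// Hypothetical scoring: 100 means every price equals the mean,
// 0 means the standard deviation is at least as large as the mean.
function compressionScore(prices: number[]): number | null {
  if (prices.length < 2) return null; // not enough data to measure spread
  const mean = prices.reduce((sum, p) => sum + p, 0) / prices.length;
  if (mean <= 0) return null; // guard against division by zero

  // Population variance of the prices within the category.
  const variance =
    prices.reduce((sum, p) => {
      const delta = p - mean;
      return sum + delta * delta;
    }, 0) / prices.length;

  const coefficientOfVariation = Math.sqrt(variance) / mean;
  const score = 100 * (1 - coefficientOfVariation);

  // Clamp to the documented 0-100 range.
  return Math.round(Math.min(100, Math.max(0, score)));
}
```

Under this assumption, a category whose prices are all identical scores 100, and wider spreads score progressively lower.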
2. PenetrationService
Tracks brand market presence across stores and states.
Key Methods:
| Method | Description |
|---|---|
| `getBrandPenetration(brandName, filters)` | Store count, SKU count, coverage |
| `getTopBrandsByPenetration(limit, filters)` | Leaderboard of dominant brands |
| `getPenetrationTrend(brandName, days)` | Historical penetration growth |
| `getShelfShareByCategory(brandName)` | % of shelf per category |
| `getBrandPresenceByState(brandName)` | Multi-state presence map |
| `getStoresCarryingBrand(brandName)` | List of stores carrying brand |
| `getPenetrationHeatmap(brandName?)` | Geographic distribution |
Penetration Calculation:
Penetration % = (Stores with Brand / Total Stores in Market) × 100
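As a small illustration of the formula above, the helper below computes the percentage from the two counts. The function name `penetrationPercent`, the two-decimal rounding, and the zero-store fallback are assumptions made for the example.

```typescript
// Penetration % = (stores with brand / total stores in market) x 100,
// rounded to two decimals. Returns 0 for an empty market (assumption).
function penetrationPercent(storesWithBrand: number, totalStores: number): number {
  if (totalStores <= 0) return 0; // avoid division by zero
  return Math.round((storesWithBrand / totalStores) * 10000) / 100;
}
```

For example, a brand carried by 45 of 180 stores has a penetration of 25%.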
3. CategoryAnalyticsService
Analyzes category performance and trends.
Key Methods:
| Method | Description |
|---|---|
| `getCategorySummary(category?, filters)` | SKU count, avg price, stores |
| `getCategoryGrowth(days, filters)` | 7d/30d/90d growth rates |
| `getCategoryGrowthTrend(category, days)` | Time-series category growth |
| `getCategoryHeatmap(metric, periods)` | Visual heatmap data |
| `getTopMovers(limit, days)` | Fastest growing/declining categories |
| `getSubcategoryBreakdown(category)` | Drill-down into subcategories |
Time Windows:
- 7 days: Short-term volatility
- 30 days: Monthly trends
- 90 days: Seasonal patterns
4. StoreChangeService
Tracks product adds/drops, brand changes, and price movements per store.
Key Methods:
| Method | Description |
|---|---|
| `getStoreChangeSummary(storeId)` | Overview of recent changes |
| `getStoreChangeEvents(storeId, filters)` | Event log (add, drop, price, OOS) |
| `getNewBrands(storeId, days)` | Brands added to store |
| `getLostBrands(storeId, days)` | Brands dropped from store |
| `getProductChanges(storeId, type, days)` | Filtered product changes |
| `getCategoryLeaderboard(category, limit)` | Top stores for category |
| `getMostActiveStores(days, limit)` | Stores with most changes |
| `compareStores(store1, store2)` | Side-by-side store comparison |
Event Types:
- `added` - New product appeared
- `discontinued` - Product removed
- `price_drop` - Price decreased
- `price_increase` - Price increased
- `restocked` - OOS → In Stock
- `out_of_stock` - In Stock → OOS
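One way these event types could be derived is by comparing two consecutive snapshots of the same product. The sketch below assumes each snapshot exposes a price and an in-stock flag; the `SnapshotState` shape and the `classifyChange` function are hypothetical and only illustrate the mapping.

```typescript
type StoreChangeEvent =
  | 'added'
  | 'discontinued'
  | 'price_drop'
  | 'price_increase'
  | 'restocked'
  | 'out_of_stock';

// Hypothetical snapshot shape; null means the product was absent from the crawl.
interface SnapshotState {
  price: number | null;
  inStock: boolean;
}

function classifyChange(
  prev: SnapshotState | null,
  curr: SnapshotState | null
): StoreChangeEvent[] {
  if (prev === null && curr === null) return [];
  if (prev === null) return ['added'];
  if (curr === null) return ['discontinued'];

  const events: StoreChangeEvent[] = [];
  // Price events only apply when both snapshots carry a price.
  if (prev.price !== null && curr.price !== null) {
    if (curr.price < prev.price) events.push('price_drop');
    if (curr.price > prev.price) events.push('price_increase');
  }
  // Stock transitions are independent of price movements.
  if (!prev.inStock && curr.inStock) events.push('restocked');
  if (prev.inStock && !curr.inStock) events.push('out_of_stock');
  return events;
}
```

A single pair of snapshots can yield more than one event, for instance a price drop on the same crawl in which the product is restocked.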
5. BrandOpportunityService
Competitive intelligence and opportunity identification.
Key Methods:
| Method | Description |
|---|---|
| `getBrandOpportunity(brandName)` | Full opportunity analysis |
| `getMarketPositionSummary(brandName)` | Market position vs competitors |
| `getAlerts(filters)` | Analytics-generated alerts |
| `markAlertsRead(alertIds)` | Mark alerts as read |
Opportunity Analysis Includes:
- White space stores (potential targets)
- Competitive threats (brands gaining share)
- Pricing opportunities (underpriced vs market)
- Missing SKU recommendations
6. AnalyticsCache
In-memory caching with database fallback.
Configuration:
const cache = new AnalyticsCache(pool, {
  defaultTtlMinutes: 15,
});
Usage Pattern:
const data = await cache.getOrCompute(cacheKey, async () => {
  // Expensive query here
  return result;
});
Cache Management:
- `GET /api/az/analytics/cache/stats` - View cache stats
- `POST /api/az/analytics/cache/clear?pattern=price*` - Clear by pattern
- Auto-cleanup of expired entries every 5 minutes
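To make the get-or-compute and clear-by-pattern behaviour concrete, here is a minimal sketch of the in-memory half only. The class name `TtlCache`, the injectable `now` parameter, and the glob handling (only `*` is treated as a wildcard) are assumptions; the database fallback of the real `AnalyticsCache` is deliberately left out.

```typescript
interface CacheEntry {
  value: unknown;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical stand-in for the in-memory layer of AnalyticsCache.
class TtlCache {
  private entries = new Map<string, CacheEntry>();

  constructor(private ttlMinutes = 15) {}

  // Returns the cached value if it is still fresh, otherwise recomputes and stores it.
  async getOrCompute<T>(
    key: string,
    compute: () => Promise<T>,
    now: number = Date.now()
  ): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value as T;
    const value = await compute();
    this.entries.set(key, { value, expiresAt: now + this.ttlMinutes * 60000 });
    return value;
  }

  // Removes every key matching a glob such as "price*"; returns how many were removed.
  clear(pattern = '*'): number {
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except *
      .replace(/\*/g, '.*'); // treat * as "any characters"
    const matcher = new RegExp(`^${escaped}$`);
    let removed = 0;
    for (const key of Array.from(this.entries.keys())) {
      if (matcher.test(key)) {
        this.entries.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```

Passing `now` explicitly keeps expiry deterministic in tests; production code would rely on the default `Date.now()`.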
API Endpoints Reference
Price Endpoints
# Product price trend (last 30 days)
GET /api/az/analytics/price/product/12345?days=30
# Brand price trend with filters
GET /api/az/analytics/price/brand/Cookies?storeId=101&category=Flower&days=90
# Category median price
GET /api/az/analytics/price/category/Vaporizers?state=AZ
# Price summary (7d/30d/90d)
GET /api/az/analytics/price/summary?brand=Stiiizy&state=AZ
# Detect price wars
GET /api/az/analytics/price/compression/Flower?state=AZ
# Global stats
GET /api/az/analytics/price/global
Penetration Endpoints
# Brand penetration
GET /api/az/analytics/penetration/brand/Cookies
# Top brands leaderboard
GET /api/az/analytics/penetration/top?limit=20&state=AZ&category=Flower
# Penetration trend
GET /api/az/analytics/penetration/trend/Cookies?days=90
# Shelf share by category
GET /api/az/analytics/penetration/shelf-share/Cookies
# Multi-state presence
GET /api/az/analytics/penetration/by-state/Cookies
# Stores carrying brand
GET /api/az/analytics/penetration/stores/Cookies
# Heatmap data
GET /api/az/analytics/penetration/heatmap?brand=Cookies
Category Endpoints
# Category summary
GET /api/az/analytics/category/summary?category=Flower&state=AZ
# Category growth (7d/30d/90d)
GET /api/az/analytics/category/growth?days=30&state=AZ
# Category trend
GET /api/az/analytics/category/trend/Concentrates?days=90
# Heatmap
GET /api/az/analytics/category/heatmap?metric=growth&periods=12
# Top movers (growing/declining)
GET /api/az/analytics/category/top-movers?limit=5&days=30
# Subcategory breakdown
GET /api/az/analytics/category/Edibles/subcategories
Store Endpoints
# Store change summary
GET /api/az/analytics/store/101/summary
# Event log
GET /api/az/analytics/store/101/events?type=price_drop&days=7&limit=50
# New brands
GET /api/az/analytics/store/101/brands/new?days=30
# Lost brands
GET /api/az/analytics/store/101/brands/lost?days=30
# Product changes by type
GET /api/az/analytics/store/101/products/changes?type=added&days=7
# Category leaderboard
GET /api/az/analytics/store/leaderboard/Flower?limit=20
# Most active stores
GET /api/az/analytics/store/most-active?days=7&limit=10
# Compare two stores
GET /api/az/analytics/store/compare?store1=101&store2=102
Brand Opportunity Endpoints
# Full opportunity analysis
GET /api/az/analytics/brand/Cookies/opportunity
# Market position summary
GET /api/az/analytics/brand/Cookies/position
# Get alerts
GET /api/az/analytics/alerts?brand=Cookies&type=competitive&unreadOnly=true
# Mark alerts read
POST /api/az/analytics/alerts/mark-read
Body: { "alertIds": [1, 2, 3] }
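For client code, the mark-read call can be prepared as a plain request description. This is a sketch only: the helper name `buildMarkAlertsReadRequest`, the base URL argument, and the JSON `Content-Type` header are assumptions, while the path and body shape come from the example above.

```typescript
interface MarkReadRequest {
  url: string;
  method: 'POST';
  headers: { 'Content-Type': string };
  body: string;
}

// Builds the request for POST /api/az/analytics/alerts/mark-read.
function buildMarkAlertsReadRequest(baseUrl: string, alertIds: number[]): MarkReadRequest {
  // Drop duplicates and anything that is not a positive integer ID.
  const ids = alertIds.filter(
    (id, index) => Number.isInteger(id) && id > 0 && alertIds.indexOf(id) === index
  );
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/api/az/analytics/alerts/mark-read`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ alertIds: ids }),
  };
}
```

The returned object can be handed to an HTTP client, for example `fetch(request.url, request)`.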
Maintenance Endpoints
# Capture daily snapshots (run by scheduler)
POST /api/az/analytics/snapshots/capture
# Cache statistics
GET /api/az/analytics/cache/stats
# Clear cache (admin)
POST /api/az/analytics/cache/clear?pattern=price*
Incremental Computation
Analytics are designed for real-time queries without full recomputation:
Snapshot Strategy
- Raw Data: `store_products` (current state)
- Historical: `store_product_snapshots` (time-series)
- Aggregated: `brand_snapshots`, `category_snapshots` (daily rollups)
Window Calculations
-- 7-day window
WHERE crawled_at >= NOW() - INTERVAL '7 days'
-- 30-day window
WHERE crawled_at >= NOW() - INTERVAL '30 days'
-- 90-day window
WHERE crawled_at >= NOW() - INTERVAL '90 days'
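On the application side, the same three windows can be modelled with a narrow type so that only 7, 30, or 90 ever reaches the SQL text. The helper names `windowCutoff` and `windowPredicate` are hypothetical, and the default column name simply mirrors the snippets above.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Only the three documented windows are allowed.
type WindowDays = 7 | 30 | 90;

// Cutoff timestamp for a window, approximating NOW() - INTERVAL 'N days'
// (ignores daylight-saving shifts, which Postgres handles per time zone).
function windowCutoff(days: WindowDays, now: Date = new Date()): Date {
  return new Date(now.getTime() - days * DAY_MS);
}

// Builds the WHERE predicate; the union type keeps arbitrary input out of the SQL string.
function windowPredicate(days: WindowDays, column = 'crawled_at'): string {
  return `${column} >= NOW() - INTERVAL '${days} days'`;
}
```

Because `days` is restricted to a literal union, an unchecked query-string value cannot be interpolated without first being validated against the three allowed windows.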
Materialized Views (Optional)
For heavy queries, create materialized views:
CREATE MATERIALIZED VIEW mv_brand_daily_metrics AS
SELECT
  DATE(sps.captured_at) AS date,
  sp.brand_id,
  COUNT(DISTINCT sp.dispensary_id) AS store_count,
  COUNT(*) AS sku_count,
  AVG(sp.price_rec) AS avg_price
FROM store_product_snapshots sps
JOIN store_products sp ON sps.store_product_id = sp.id
WHERE sps.captured_at >= NOW() - INTERVAL '90 days'
GROUP BY DATE(sps.captured_at), sp.brand_id;
-- REFRESH ... CONCURRENTLY requires a unique index on the view (index name is an example)
CREATE UNIQUE INDEX idx_mv_brand_daily_metrics
  ON mv_brand_daily_metrics (date, brand_id);
-- Refresh daily
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_brand_daily_metrics;
Scheduled Jobs
Daily Snapshot Capture
Trigger via cron or scheduler:
curl -X POST http://localhost:3010/api/az/analytics/snapshots/capture
This calls:
- `capture_brand_snapshots()` - Captures brand metrics
- `capture_category_snapshots()` - Captures category metrics
Cache Cleanup
Automatic cleanup every 5 minutes via in-memory timer.
For manual cleanup:
curl -X POST http://localhost:3010/api/az/analytics/cache/clear
Extending Analytics (Future Phases)
Phase 6: Intelligence Engine
- Automated alert generation
- Recommendation engine
- Price prediction
Phase 7: Orders Integration
- Sales velocity analytics
- Reorder predictions
- Inventory turnover
Phase 8: Advanced ML
- Demand forecasting
- Price elasticity modeling
- Customer segmentation
Troubleshooting
Common Issues
1. Slow queries
- Check cache stats: `GET /api/az/analytics/cache/stats`
- Increase cache TTL if data doesn't need real-time freshness
- Add indexes on frequently filtered columns
2. Empty results
- Verify data exists in source tables
- Check filter parameters (case-sensitive brand names)
- Verify state codes are valid
3. Stale data
- Run snapshot capture: `POST /api/az/analytics/snapshots/capture`
- Clear cache: `POST /api/az/analytics/cache/clear`
Debugging
Enable query logging:
// In service constructor
this.debug = process.env.ANALYTICS_DEBUG === 'true';
Data Contracts
Price Trend Response
interface PriceTrend {
  productId?: number;
  storeId?: number;
  brandName?: string;
  category?: string;
  dataPoints: Array<{
    date: string;
    minPrice: number | null;
    maxPrice: number | null;
    avgPrice: number | null;
    wholesalePrice: number | null;
    sampleSize: number;
  }>;
  summary: {
    currentAvg: number | null;
    previousAvg: number | null;
    changePercent: number | null;
    trend: 'up' | 'down' | 'stable';
    volatilityScore: number | null;
  };
}
Brand Penetration Response
interface BrandPenetration {
  brandName: string;
  totalStores: number;
  storesWithBrand: number;
  penetrationPercent: number;
  skuCount: number;
  avgPrice: number | null;
  priceRange: { min: number; max: number } | null;
  topCategories: Array<{ category: string; count: number }>;
  stateBreakdown?: Array<{ state: string; storeCount: number }>;
}
Category Growth Response
interface CategoryGrowth {
  category: string;
  currentCount: number;
  previousCount: number;
  growthPercent: number;
  growthTrend: 'up' | 'down' | 'stable';
  avgPrice: number | null;
  priceChange: number | null;
  topBrands: Array<{ brandName: string; count: number }>;
}
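The `growthPercent` and `growthTrend` fields can be derived from the two counts. The sketch below is illustrative: the function name `computeGrowth`, the ±1% band that counts as `'stable'`, and the handling of a zero previous count are all assumptions, since the runbook does not define them.

```typescript
type GrowthTrend = 'up' | 'down' | 'stable';

function computeGrowth(
  currentCount: number,
  previousCount: number,
  stableBandPercent = 1 // assumption: changes within ±1% count as stable
): { growthPercent: number; growthTrend: GrowthTrend } {
  let growthPercent: number;
  if (previousCount === 0) {
    // Assumption: growth from zero is reported as 100%, zero to zero as 0%.
    growthPercent = currentCount > 0 ? 100 : 0;
  } else {
    growthPercent = ((currentCount - previousCount) / previousCount) * 100;
  }
  growthPercent = Math.round(growthPercent * 100) / 100; // two decimals

  const growthTrend: GrowthTrend =
    growthPercent > stableBandPercent
      ? 'up'
      : growthPercent < -stableBandPercent
        ? 'down'
        : 'stable';
  return { growthPercent, growthTrend };
}
```

For instance, moving from 100 to 120 SKUs gives 20% and `'up'`, while 100 to 100 gives 0% and `'stable'`.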
Files Reference
| File | Purpose |
|---|---|
| `src/dutchie-az/services/analytics/price-trends.ts` | Price analytics |
| `src/dutchie-az/services/analytics/penetration.ts` | Brand penetration |
| `src/dutchie-az/services/analytics/category-analytics.ts` | Category metrics |
| `src/dutchie-az/services/analytics/store-changes.ts` | Store event tracking |
| `src/dutchie-az/services/analytics/brand-opportunity.ts` | Competitive intel |
| `src/dutchie-az/services/analytics/cache.ts` | Caching layer |
| `src/dutchie-az/services/analytics/index.ts` | Module exports |
| `src/dutchie-az/routes/analytics.ts` | API routes (680 LOC) |
| `src/multi-state/state-query-service.ts` | Cross-state analytics |
Analytics V2: Rec/Med State Segmentation
Phase 3 Enhancement: Analytics segmented by recreational vs. medical-only states.
V2 API Endpoints
All V2 endpoints are prefixed with /api/analytics/v2
V2 Price Analytics
# Price trends for a specific product
GET /api/analytics/v2/price/product/12345?window=30d
# Price by category and state (with rec/med segmentation)
GET /api/analytics/v2/price/category/Flower?state=AZ
# Price by brand and state
GET /api/analytics/v2/price/brand/Cookies?state=AZ
# Most volatile products
GET /api/analytics/v2/price/volatile?window=30d&limit=50&state=AZ
# Rec vs Med price comparison by category
GET /api/analytics/v2/price/rec-vs-med?category=Flower
V2 Brand Penetration
# Brand penetration metrics with state breakdown
GET /api/analytics/v2/brand/Cookies/penetration?window=30d
# Brand market position within categories
GET /api/analytics/v2/brand/Cookies/market-position?category=Flower&state=AZ
# Brand presence in rec vs med-only states
GET /api/analytics/v2/brand/Cookies/rec-vs-med
# Top brands by penetration
GET /api/analytics/v2/brand/top?limit=25&state=AZ
# Brands expanding or contracting
GET /api/analytics/v2/brand/expansion-contraction?window=30d&limit=25
V2 Category Analytics
# Category growth metrics
GET /api/analytics/v2/category/Flower/growth?window=30d
# Category growth trend over time
GET /api/analytics/v2/category/Flower/trend?window=30d
# Top brands in category
GET /api/analytics/v2/category/Flower/top-brands?limit=25&state=AZ
# All categories with metrics
GET /api/analytics/v2/category/all?state=AZ&limit=50
# Rec vs Med category comparison
GET /api/analytics/v2/category/rec-vs-med?category=Flower
# Fastest growing categories
GET /api/analytics/v2/category/fastest-growing?window=30d&limit=25
V2 Store Analytics
# Store change summary
GET /api/analytics/v2/store/101/summary?window=30d
# Product change events
GET /api/analytics/v2/store/101/events?window=7d&limit=100
# Store inventory composition
GET /api/analytics/v2/store/101/inventory
# Store price positioning vs market
GET /api/analytics/v2/store/101/price-position
# Most active stores by changes
GET /api/analytics/v2/store/most-active?window=7d&limit=25&state=AZ
V2 State Analytics
# State market summary
GET /api/analytics/v2/state/AZ/summary
# All states with coverage metrics
GET /api/analytics/v2/state/all
# Legal state breakdown (rec, med-only, no program)
GET /api/analytics/v2/state/legal-breakdown
# Rec vs Med pricing by category
GET /api/analytics/v2/state/rec-vs-med-pricing?category=Flower
# States with coverage gaps
GET /api/analytics/v2/state/coverage-gaps
# Cross-state pricing comparison
GET /api/analytics/v2/state/price-comparison
V2 Services Architecture
src/services/analytics/
├── index.ts # Exports all V2 services
├── types.ts # Shared type definitions
├── PriceAnalyticsService.ts # Price trends and volatility
├── BrandPenetrationService.ts # Brand market presence
├── CategoryAnalyticsService.ts # Category growth analysis
├── StoreAnalyticsService.ts # Store change tracking
└── StateAnalyticsService.ts # State-level analytics
src/routes/analytics-v2.ts # V2 API route handlers
Key V2 Features
- Rec/Med State Segmentation: All analytics can be filtered and compared by legal status
- State Coverage Gaps: Identify legal states with missing or stale data
- Cross-State Pricing: Compare prices across recreational and medical-only markets
- Brand Footprint Analysis: Track brand presence in rec vs med states
- Category Comparison: Compare category performance by legal status
V2 Migration Path
1. Run migration 052 for state cannabis flags:
   psql "$DATABASE_URL" -f migrations/052_add_state_cannabis_flags.sql
2. Run migration 053 for analytics indexes:
   psql "$DATABASE_URL" -f migrations/053_analytics_indexes.sql
3. Restart backend to pick up new routes
V2 Response Examples
Rec vs Med Price Comparison:
{
  "category": "Flower",
  "recreational": {
    "state_count": 15,
    "product_count": 12500,
    "avg_price": 35.50,
    "median_price": 32.00
  },
  "medical_only": {
    "state_count": 8,
    "product_count": 5200,
    "avg_price": 42.00,
    "median_price": 40.00
  },
  "price_diff_percent": -15.48
}
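The sample `price_diff_percent` is consistent with expressing the recreational average relative to the medical-only average: (35.50 − 42.00) / 42.00 × 100 ≈ −15.48. That reading is inferred from the example numbers rather than confirmed by the service code; the helper name `priceDiffPercent` is hypothetical.

```typescript
// Percentage difference of the recreational average price relative to the
// medical-only average, rounded to two decimals. Returns null when the
// medical-only average is zero (no meaningful baseline).
function priceDiffPercent(recAvgPrice: number, medAvgPrice: number): number | null {
  if (medAvgPrice === 0) return null;
  return Math.round(((recAvgPrice - medAvgPrice) / medAvgPrice) * 10000) / 100;
}
```

A negative value means recreational prices are lower than medical-only prices on average.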
Legal State Breakdown:
{
  "recreational_states": {
    "count": 24,
    "dispensary_count": 850,
    "product_count": 125000,
    "states": [
      { "code": "CA", "name": "California", "dispensary_count": 250 },
      { "code": "CO", "name": "Colorado", "dispensary_count": 150 }
    ]
  },
  "medical_only_states": {
    "count": 18,
    "dispensary_count": 320,
    "product_count": 45000,
    "states": [
      { "code": "FL", "name": "Florida", "dispensary_count": 120 }
    ]
  },
  "no_program_states": {
    "count": 9,
    "states": [
      { "code": "ID", "name": "Idaho" }
    ]
  }
}
Phase 3 Analytics Engine - Fully Implemented
V2 Rec/Med State Analytics - Added December 2024