Compare commits


91 Commits

Author SHA1 Message Date
Kelly
baf3b2a76a fix: Use registry.spdy.io for all CI images + fix TypeScript errors
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Update .woodpecker.yml to use registry.spdy.io for node, alpine, kaniko, kubectl images
- Fix Buffer type errors in script files (Node 22 TypeScript compatibility)
- Add explicit typing for Map iterations in diff scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 23:15:07 -07:00
Kelly
b88b9ab8bd fix: use local registry for all base images
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
All Dockerfiles now pull from registry.spdy.io instead of Docker Hub
to avoid rate limits. Added python:3.11-slim to daily sync job.

Images synced daily at 3 AM:
- node:20-slim, node:22-slim, node:22, node:22-alpine, node:20-alpine
- nginx:alpine
- python:3.11-slim
- alpine:latest
- busybox:latest
- bitnami/kubectl:latest
2025-12-17 23:03:36 -07:00
Kelly
09057b5756 feat: Add Hoodie comparison reports with scheduled jobs
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Migration for hoodie_comparison_reports table
- Comparison service: pulls all Hoodie data, compares to CannaIQ
- Stores delta results (hoodie_only, cannaiq_only, matched)
- Raw Hoodie data stays remote (proxy only)
- New endpoints:
  - POST /api/hoodie/reports/run/:state - run comparison
  - GET /api/hoodie/reports - latest reports
  - GET /api/hoodie/reports/:id - full report details
  - GET /api/hoodie/reports/history/:type/:state - report history

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 19:38:56 -07:00
Kelly
8e2c2ad2c3 feat: Add Hoodie Analytics proxy client and API routes
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Query Hoodie's Algolia indexes directly (no local sync)
- Indexes: dispensaries, products, brands, master_products, locations
- Endpoints:
  - GET /api/hoodie/stats - total counts
  - GET /api/hoodie/stats/:state - state stats
  - GET /api/hoodie/dispensaries - search/filter dispensaries
  - GET /api/hoodie/products - search/filter products
  - GET /api/hoodie/brands - search/filter brands
  - GET /api/hoodie/compare/dispensaries/:state - delta comparison
  - GET /api/hoodie/compare/brands/:state - delta comparison

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 19:33:27 -07:00
Kelly
4a0ed6c80a fix(proxy): Add city fallback chain for Evomi geo-targeting
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
When a dispensary's city isn't available in Evomi (412 error), now tries:
1. Exact dispensary city (e.g., "el.mirage")
2. State's major city (e.g., "phoenix" for AZ)
3. State-only targeting as last resort

This ensures workers can always get a proxy while preferring
the closest available city to the dispensary.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 19:09:30 -07:00
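A minimal sketch of the fallback order described in the commit above, assuming a hypothetical `requestProxy({ state, city? })` helper that rejects when Evomi answers 412; the names and shapes are illustrative, not the actual proxy client API.

```typescript
// Sketch only: try the exact city, then the state's major city, then state-only.
type ProxyRequest = { state: string; city?: string };

const MAJOR_CITY: Record<string, string> = { AZ: "phoenix" }; // illustrative mapping

async function getProxyWithFallback(
  state: string,
  city: string | undefined,
  requestProxy: (req: ProxyRequest) => Promise<string>,
): Promise<string> {
  const attempts: ProxyRequest[] = [
    ...(city ? [{ state, city }] : []),                                 // 1. exact dispensary city, e.g. "el.mirage"
    ...(MAJOR_CITY[state] ? [{ state, city: MAJOR_CITY[state] }] : []), // 2. state's major city, e.g. "phoenix"
    { state },                                                          // 3. state-only targeting, last resort
  ];
  for (const req of attempts) {
    try {
      return await requestProxy(req);
    } catch {
      // Evomi 412 ("failed to select from local pool"): try the next target
    }
  }
  throw new Error(`no Evomi proxy available for ${state}`);
}
```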
Kelly
caab7e2e5d fix(proxy): Disable city-level targeting - Evomi lacks small city proxies
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Evomi returns 412 "failed to select from local pool" when requesting
proxies for small cities like "El Mirage". State-level targeting
(e.g., arizona) works fine and provides sufficient geo coverage.

This was causing all workers to fail with 412 errors on every request.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 19:07:03 -07:00
Kelly
5f5f5edf73 fix(k8s): Prevent worker pods from exceeding 8 during rollouts
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Added explicit rolling update strategy:
- maxSurge: 0 - Never create extra pods above replica count
- maxUnavailable: 1 - Roll out one pod at a time

This ensures the 8-pod limit is NEVER exceeded, even during deployments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 19:04:56 -07:00
Kelly
28f5ca8666 fix(docker): Add ca-certificates for HTTPS proxy support
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Workers were failing with "error setting certificate file" when making
HTTPS requests through the Evomi proxy. The node:22-slim base image
was missing the ca-certificates package needed for SSL verification.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 18:43:02 -07:00
Kelly
ee9d3bdef5 feat(brands): Add normalize_brand() for fuzzy brand matching
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Adds brand name normalization for Cannabrands integration:
- normalize_brand() SQL function removes spaces/punctuation, lowercases
- Functional indexes on store_products and store_product_snapshots
- Updated 10 brand API queries to use normalized matching

Now "Aloha TymeMachine", "ALOHA TYME MACHINE", "Aloha Tyme Machine"
all match the same brand when querying the API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 18:13:00 -07:00
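The same normalization can be sketched in TypeScript to show the matching behavior the commit describes (lowercase, strip spaces and punctuation); the real implementation is the SQL `normalize_brand()` function, so this is illustrative only.

```typescript
// Collapse brand name variants to one comparison key (sketch of the
// behavior described for the SQL normalize_brand() function).
function normalizeBrand(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

normalizeBrand("Aloha TymeMachine");  // "alohatymemachine"
normalizeBrand("ALOHA TYME MACHINE"); // "alohatymemachine"
normalizeBrand("Aloha Tyme Machine"); // "alohatymemachine"
```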
Kelly
7485123390 feat(inventory): Add inventory API routes and complete materialized views
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add /api/inventory/* routes for high-frequency, snapshots, and events
- Created mv_brand_market_share and mv_store_performance views in production
- Register inventory routes in backend index.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 17:31:43 -07:00
Kelly
1155236c74 fix: Build product URLs using dispensary slug and product name
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Use dispensary.slug + slugified product name for URLs
- Support Dutchie, Jane/iHeartJane, and Treez URL formats
- Add regex fallback for dispensaries without slug field
- Dutchie: /embedded-menu/{slug}/product/{product-slug}
- Jane: /stores/{slug}/products/{product-id}
- Treez: /onlinemenu/{slug}?product={product-slug}

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 16:57:57 -07:00
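A sketch of the URL patterns listed in the commit above; `slugify()` and the provider handling are illustrative assumptions, not the production URL builder.

```typescript
// Build provider-specific product paths from the dispensary slug and
// product name/id, per the formats listed above (sketch).
function slugify(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");
}

function productPath(
  provider: "dutchie" | "jane" | "treez",
  dispensarySlug: string,
  product: { id: string; name: string },
): string {
  const productSlug = slugify(product.name);
  switch (provider) {
    case "dutchie":
      return `/embedded-menu/${dispensarySlug}/product/${productSlug}`;
    case "jane":
      return `/stores/${dispensarySlug}/products/${product.id}`;
    case "treez":
      return `/onlinemenu/${dispensarySlug}?product=${productSlug}`;
  }
}
```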
Kelly
0a26944374 fix(workers): Claim tasks before initializing proxy/preflight
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Two fixes:
1. sessionPoolMainLoop now checks for tasks BEFORE initializing stealth.
   Previously, workers would get proxy IPs and run preflights even when
   no tasks were available, wasting proxy bandwidth.

2. Fix Evomi city parameter formatting - spaces must be replaced with dots
   and lowercased (e.g., "El Mirage" -> "el.mirage"). This was causing
   412 errors from Evomi when claiming tasks for cities with spaces.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 16:14:00 -07:00
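The city formatting described in fix 2 above amounts to a one-line transform; this is an illustrative sketch, not the worker's actual helper.

```typescript
// "El Mirage" -> "el.mirage": lowercase and replace whitespace with dots
// before passing the city to Evomi (sketch).
function formatEvomiCity(city: string): string {
  return city.trim().toLowerCase().replace(/\s+/g, ".");
}

formatEvomiCity("El Mirage"); // "el.mirage"
```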
Kelly
239feaf069 fix: Build product-specific URLs instead of dispensary homepage
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Parse Dutchie URLs and construct product links:
  dutchie.com/dispensary/{slug}/product/{product-id}
- Parse Jane/iHeartJane URLs similarly
- Applied to both /products and /specials endpoints

This fixes products linking to dispensary homepage instead of
the specific product page.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 16:02:33 -07:00
Kelly
a2324214e9 feat: Enhanced cart button tracking and automatic outbound click tracking
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Plugin:
- Enhanced [cannaiq_cart_button] shortcode with full tracking data
  - Includes product_id, name, price, category, on_special, store_id
  - Added size, full-width, and custom class options
- Added [cannaiq_product_wrapper] shortcode for custom buttons
- Auto-track ALL outbound clicks to menu providers:
  - dutchie.com, iheartjane.com, jane.com, treez.io, weedmaps.com, leafly.com

API:
- Added menu_url to /products response
- Added menu_url to /specials response

Docs:
- Updated Elementor guide with button tracking options
- Documented shortcode attributes and examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 15:25:15 -07:00
Kelly
141f95e88e feat: Add menu_url to products API and document button usage
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
API:
- Add menu_url to /products response (joins dispensaries table)
- Add menu_url to /specials response
- Products now include link to dispensary's online ordering system

Docs:
- Document how to use any Elementor button with Menu URL dynamic tag
- Add examples for brand specials and combined filters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 15:20:29 -07:00
Kelly
d7f2edf630 docs: Add Loop Grid filtering examples to Elementor guide
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Document Products and Specials data sources
- List all available filters (category, brand, strain, etc.)
- Add examples: Flower Specials, Top THC, Brand Showcase
- Add examples: Brand Specials, combined filters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 15:16:46 -07:00
Kelly
214cbaee7d feat: Add Elementor Loop Grid support with Products and Specials sources
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add loop-builder.php for Loop Grid integration
- Register "Products" and "Specials" as custom query sources
- Show "{Dispensary Name} Products/Specials" in query dropdown
- Auto-save dispensary name when API key is saved
- Specials source auto-filters to on_special=true
- Support all CannaiQ filters (category, brand, strain, etc.)
- Set $cannaiq_current_product for dynamic tags in loop items

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 15:12:34 -07:00
Kelly
15711d21fc fix(k8s): Use cannaiq-config ConfigMap for workers
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Changed configMapRef from scraper-config to cannaiq-config to match
existing cluster ConfigMap name. This fixes CreateContainerConfigError
on new worker pods.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 14:54:40 -07:00
Kelly
4ab2679d55 feat(wordpress): API key routes to store automatically (v2.2.0)
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Remove Store ID control from Product Loop widget
- Widget calls /api/v1/products with X-API-Key header
- Backend determines store from API key (no store_id needed)
- Add store_id to /menu endpoint response
- Update K8s docs: deploy-and-forget model

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-17 13:25:19 -07:00
Kelly
3d2718ffbe fix(ci): apply scraper-worker.yaml on deploy
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Ensures replica count and resource changes in scraper-worker.yaml
are deployed via CI instead of requiring manual kubectl.
2025-12-17 12:45:27 -07:00
Kelly
3fa7987651 docs: Add Elementor guide for WordPress plugin
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-17 12:45:03 -07:00
Kelly
b6369d0191 fix: WordPress plugin ZIP structure for WP installer
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-17 12:37:01 -07:00
Kelly
69a013538c feat(workers): Session pool with preflight qualification and gold badge
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Complete worker session flow:
1. Claim ONE task first (determines geo)
2. Get IP matching task's city/state
3. Sticky fingerprint: 1 IP = 1 fingerprint (reuse or generate new)
4. Preflight/qualify: verify antidetect working
5. Gold badge awarded on preflight pass
6. Claim 2-4 more tasks for same geo (total 3-5)
7. Execute tasks
8. Retire session: badge cleared, IP on 8hr cooldown

Changes:
- Enable USE_SESSION_POOL=true in K8s deployment
- Add preflight step after IP acquisition
- Enforce 1 IP = 1 fingerprint rule
- Block unqualified workers from claiming tasks
- Close qualification bypass in task-service.ts
- Add worker badge column (gold = qualified)
- Random 3-5 tasks per session for natural traffic

Migrations:
- 129: claim_tasks_batch_for_geo function
- 130: worker_registry badge column and functions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 11:57:19 -07:00
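A compact sketch of the session flow enumerated above; every helper here (`claimTask`, `acquireIp`, `getOrCreateFingerprint`, `runPreflight`, and so on) is a hypothetical placeholder, not the worker's real API.

```typescript
// Session lifecycle sketch: claim -> geo IP -> sticky fingerprint ->
// preflight/badge -> batch claim -> execute -> retire. Placeholder types.
type Geo = { state: string; city?: string };

interface SessionDeps {
  claimTask: () => Promise<Geo | null>;
  claimMoreForGeo: (geo: Geo, n: number) => Promise<Geo[]>;
  acquireIp: (geo: Geo) => Promise<string>;
  getOrCreateFingerprint: (ip: string) => Promise<unknown>; // 1 IP = 1 fingerprint
  runPreflight: (ip: string, fp: unknown) => Promise<boolean>;
  awardGoldBadge: () => Promise<void>;
  execute: (task: Geo) => Promise<void>;
  retire: (ip: string) => Promise<void>; // clear badge, put IP on 8h cooldown
}

async function runSession(w: SessionDeps): Promise<void> {
  const first = await w.claimTask();               // 1. claim ONE task first (sets geo)
  if (!first) return;                              // no tasks: skip proxy/preflight entirely
  const ip = await w.acquireIp(first);             // 2. IP matching the task's city/state
  const fp = await w.getOrCreateFingerprint(ip);   // 3. sticky fingerprint for this IP
  if (!(await w.runPreflight(ip, fp))) return;     // 4. preflight/qualify antidetect
  await w.awardGoldBadge();                        // 5. gold badge on pass
  const extra = 2 + Math.floor(Math.random() * 3); // 6. claim 2-4 more (3-5 total)
  const tasks = [first, ...(await w.claimMoreForGeo(first, extra))];
  for (const task of tasks) await w.execute(task); // 7. execute
  await w.retire(ip);                              // 8. retire session
}
```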
Kelly
11088b87ff fix: Add Next.js App Router support for Treez config extraction
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Sites like BEST Dispensary use the App Router streaming format
(self.__next_f.push) instead of __NEXT_DATA__. This adds a fallback
parser to extract Treez credentials from the streaming format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 11:08:42 -07:00
Kelly
d1d58da0b2 feat: Add K8s worker restart and pool control APIs
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
K8s routes (/api/k8s):
- POST /workers/restart - Restart all worker pods

Pool routes (/api/pool):
- GET /status - Pool status, task counts, pending by role/state
- POST /open - Open pool (allow task claiming)
- POST /close - Close pool (stop task claiming)
- POST /clear - Clear pending tasks (by role/state)
- GET /tasks - List tasks in pool
- DELETE /tasks/:id - Remove specific task
- POST /release-stale - Release stuck tasks

Migration 128: pool_config table with pool_open check in claim_task

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-17 10:53:58 -07:00
Kelly
93c8bc3598 feat(plugin): Add cannaiq_track shortcode for click tracking
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Usage:
  [cannaiq_track event="banner_click"]<div>Content</div>[/cannaiq_track]
  [cannaiq_track event="promo" category="holiday" tag="span"]Click me[/cannaiq_track]

Attributes:
  - event (required): Event name
  - category: Event category
  - label: Custom label
  - value: Numeric value
  - tag: HTML tag (div, span, a, button, section, article)
  - class: Additional CSS classes

Bumps plugin to v2.0.3

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-17 10:24:45 -07:00
Kelly
c5ed50f1b3 chore: Bump plugin version to 2.0.2 in PHP header
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-17 10:20:21 -07:00
Kelly
ec2e942810 feat(plugin): Add generic data-cannaiq-track attribute for custom tracking
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Usage:
  <a href='...' data-cannaiq-track='shop_now'>Shop Now</a>
  <button data-cannaiq-track='filter_click' data-cannaiq-track-category='flower'>Filter</button>

Optional attributes:
  - data-cannaiq-track-category
  - data-cannaiq-track-label
  - data-cannaiq-track-value

Bumps plugin to v2.0.2

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-17 10:17:53 -07:00
Kelly
e8e7261409 fix: TypeScript null check in StockStatusBadge
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Use != null to handle both null and undefined for optional daysUntilOOS prop

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-17 02:33:07 -07:00
Kelly
1d254238f3 chore: Bump plugin version to 2.0.1
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 02:04:06 -07:00
Kelly
0b4ed48d2f feat: Add premade card templates and click analytics
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
WordPress Plugin v2.0.0:
- Add Promo Banner widget (dark banner with deal text)
- Add Horizontal Product Row widget (wide list format)
- Add Category Card widget (image-based categories)
- Add Compact Card widget (dense grid layout)
- Add CannaiQAnalytics click tracking (tracks add_to_cart,
  product_view, promo_click, category_click events)
- Register cannaiq-templates Elementor category
- Fix branding: CannaiQAnalytics (not CannaIQAnalytics)

Backend:
- Add POST /api/analytics/click endpoint for WordPress plugin
- Accepts API token auth, records to product_click_events table
- Stores metadata: product_name, price, category, url, referrer

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 02:03:28 -07:00
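An illustrative call to the new click endpoint; the exact field names, auth header, and host are assumptions based on the description above, not the documented contract.

```typescript
// Sketch of a browser-side call to POST /api/analytics/click.
// Header and payload field names are assumed, not verified.
async function trackClick(apiBase: string, apiToken: string): Promise<void> {
  await fetch(`${apiBase}/api/analytics/click`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": apiToken, // assumed to mirror the plugin's X-API-Key auth
    },
    body: JSON.stringify({
      event: "add_to_cart", // add_to_cart | product_view | promo_click | category_click
      product_name: "Example Gummies",
      price: 25,
      category: "Edible",
      url: window.location.href,
      referrer: document.referrer,
    }),
  });
}
```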
Kelly
87da7625cd feat: WordPress plugin v2.0.0 - modular component library
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add 14 new shortcodes for building custom product cards
- Add visual builder guide with Hot Lava example in admin page
- Add comprehensive component documentation
- Update branding to "CannaiQ" throughout
- Include layout shortcodes: specials, brands, categories
- Include component shortcodes: discount_badge, strain_badge,
  thc, cbd, effects, price, cart_button, stock, terpenes
- Add build steps and instructions for assembling cards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 01:58:20 -07:00
Kelly
9f3bc8a843 fix: Worker task concurrency limit and inventory tracking
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Fix claim_task to enforce max 5 tasks per worker (was unlimited)
- Add session_task_count check before ANY claiming path
- Add triggers to auto-decrement count on task complete/release
- Update MAX_CONCURRENT_TASKS default from 3 to 5
- Update frontend fallback to show 5 task slots

- Add Wasabi S3 storage for payload archival
- Add inventory snapshots service (delta-only tracking)
- Add sales analytics views and routes
- Add high-frequency manager UI components
- Reset hardcoded AZ 5-minute intervals (use UI to configure)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 01:34:38 -07:00
Kelly
c33ed1cae9 feat: CannaiQ Menus WordPress Plugin v2.0.0 - Modular Component Library
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
New modular component widgets:
- Discount Ribbon (ribbon/pill/text styles)
- Strain Badge (Sativa/Indica/Hybrid colored pills)
- THC/CBD Meter (progress bars or badges)
- Effects Display (styled chips with icons)
- Price Block (original + sale price)
- Cart Button (styled CTA linking to menu)
- Stock Indicator (in/out of stock badges)
- Product Image + Badges (image with overlays)

New card template:
- Premium Product Card (ready-to-use template)

Extended dynamic tags (30+ total):
- Discount %, Strain Badge, THC/CBD Badge
- Effects Chips, Terpenes, Price Display
- Menu URL, Stock Status, and more

New files:
- assets/css/components.css
- includes/effects-icons.php (SVG icons)
- 10 new widget files
- dynamic-tags-extended.php

Branding updated to "CannaiQ" throughout.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 00:21:40 -07:00
Kelly
38e7980cf4 fix: Add missing imports and type annotations
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add authMiddleware import to tasks.ts
- Fix pool import in SalesAnalyticsService.ts
- Add type annotations to map callbacks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 00:03:52 -07:00
Kelly
887ce33b11 fix: Add health probes to scraper deployment
- Add liveness probe (restarts pod if unresponsive)
- Add readiness probe (removes from service if not ready)
- Add resource limits (512Mi-2Gi memory, 250m-1000m CPU)
- Update CI to apply full manifest on deploy
- Increase frontend rollout timeout to 300s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 23:06:22 -07:00
Kelly
de239df314 fix(k8s): update registry-sync to use registry.spdy.io
Use registry.spdy.io instead of internal IP for base image syncing.
Add library/busybox:latest to the sync list.
2025-12-16 22:17:44 -07:00
Kelly
6fcc64933a fix: Increase cannaiq-frontend rollout timeout to 300s
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Prevents false CI failures when rollout takes longer than 120s.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 22:06:00 -07:00
Kelly
3488905ccc fix: Delete completed tasks from pool instead of marking complete
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Completed tasks are now deleted from worker_tasks table.
Only failed tasks remain in the pool for retry/review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 21:19:20 -07:00
Kelly
3ee09fbe84 feat: Treez SSR support, task improvements, worker geo display
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add SSR config extraction for Treez sites (BEST Dispensary)
- Increase MAX_RETRIES from 3 to 5 for task failures
- Update task list ordering: active > pending > failed > completed
- Show detected proxy location in worker dashboard (from fingerprint)
- Hardcode 'dutchie' menu_type in promotion.ts (remove deriveMenuType)
- Update provider display to show actual provider names

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 19:22:04 -07:00
Kelly
7d65e0ae59 fix: Use cannaiq namespace for deployments
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Revert namespace from dispensary-scraper to cannaiq
- Keep registry.spdy.io for image URLs (k8s nodes need HTTPS)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 16:37:50 -07:00
Kelly
25f9118662 fix: Use registry.spdy.io for k8s deployments
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Update kubectl set image commands to use HTTPS registry URL
- Fix namespace from cannaiq to dispensary-scraper
- Add guidance on when to use which registry URL

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 12:37:11 -07:00
Kelly
5c0de752af fix: Check inventory_snapshots for product_discovery output verification
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
raw_crawl_payloads are only saved during the baseline window (12:01-3:00 AM),
but inventory_snapshots are always saved. This caused product_discovery
tasks to fail verification outside the baseline window.
2025-12-16 10:20:48 -07:00
Kelly
a90b10a1f7 feat(k8s): Add daily registry sync cronjob for base images
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2025-12-16 09:49:36 -07:00
Kelly
75822ab67d docs: Add Docker registry cache instructions
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2025-12-16 09:34:55 -07:00
Kelly
df4d599478 chore: test CI after fixes
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2025-12-16 09:22:53 -07:00
Kelly
4544718cad chore: trigger CI after DNS fix
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-16 09:21:13 -07:00
Kelly
47da61ed71 chore: trigger CI rebuild
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-16 09:19:58 -07:00
Kelly
e450d2e99e fix(ci): use local registry mirror instead of mirror.gcr.io
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Switch Kaniko registry-mirror from mirror.gcr.io to 10.100.9.70:5000
to pull base images from local registry instead of GCR.
2025-12-16 09:09:15 -07:00
Kelly
205a8b3159 chore: retry CI for visibility-events fix
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-16 08:56:59 -07:00
Kelly
8bd29d11bb fix: Use correct column names in visibility-events query
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Changed name -> name_raw and brand -> brand_name_raw to match
store_products table schema.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 02:21:01 -07:00
Kelly
4e7b3d2336 fix: Update DATABASE_URL to point to primary PostgreSQL server
Changed from 10.100.6.50 (secondary/replica in read-only mode) to
10.100.7.50 (primary) to fix read-only transaction errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 01:13:49 -07:00
Kelly
849123693a fix(ci): Use unquoted heredoc for kubeconfig token injection
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Changed heredoc from 'KUBEEOF' (quoted) to KUBEEOF (unquoted)
- This allows shell variable expansion of $K8S_TOKEN directly
- Removed sed replacement step that was failing due to YAML escaping issues
2025-12-15 21:55:52 -07:00
Kelly
a1227f77b9 chore: retry CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 21:53:00 -07:00
Kelly
415e89a012 chore: retry CI with k8s_token secret
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 21:26:06 -07:00
Kelly
45844c6281 ci: Embed kubeconfig, use k8s_token secret for token only 2025-12-15 21:19:26 -07:00
Kelly
24c9586d81 ci: Skip base64 - use raw kubeconfig in secret
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 21:09:54 -07:00
Kelly
f8d61446d5 chore: retry CI with correct kubeconfig
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 20:57:19 -07:00
Kelly
0f859d1c75 chore: retry CI after kubeconfig fix
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 20:43:40 -07:00
Kelly
52dc669782 ci: Remove clone/volume config (requires admin trust)
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Woodpecker doesn't allow custom clone or volumes without elevated trust.
Kaniko layer caching (--cache-repo) still works (registry-based).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 20:16:18 -07:00
Kelly
2e47996354 ci: Add shallow git clone (depth: 1)
Only fetch latest commit instead of full history.
Reduces checkout time and bandwidth.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 20:05:33 -07:00
Kelly
f25d4eaf27 ci: Add npm and Docker layer caching
- PR steps: shared npm-cache volume for faster npm ci
- Docker builds: --cache-repo to local registry for layer caching
- Kaniko will reuse npm install layer when package.json unchanged

First build populates cache, subsequent builds much faster.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 20:04:15 -07:00
Kelly
61a6be888c ci: Consolidate back to 4 docker steps
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Remove separate build steps (didn't save time)
- Use original multi-stage Dockerfiles
- Delete unused Dockerfile.ci files
- 4 parallel docker builds + deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 20:01:11 -07:00
Kelly
09c2b3a0e1 ci: Use node:22 instead of node:22-alpine for builds
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Alpine uses musl libc which breaks Rollup's native bindings.
Debian-based node:22 uses glibc and works correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 19:56:48 -07:00
Kelly
cec34198c7 ci: Add slim Dockerfile.ci files for faster CI builds
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add Dockerfile.ci for backend, cannaiq, findadispo, findagram
- Frontend Dockerfiles just copy pre-built assets to nginx
- Backend Dockerfile copies pre-built dist/node_modules
- Reduces Docker build time by doing npm ci/build in CI step

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 19:43:08 -07:00
Kelly
3c10e07e45 feat(ci): Push built images to local registry for faster K8s pulls
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Build images push to 10.100.9.70:5000/cannaiq/*
- Deploy pulls from local registry (no external network)
- Removed git.spdy.io registry auth (not needed for local)
- Added --insecure-registry for HTTP local registry
2025-12-15 19:16:16 -07:00
Kelly
3582c2e9e2 fix(k8s): Use external Postgres/Redis/MinIO services
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Update secrets.yaml with correct MinIO credentials
- Add Redis connection details
- Remove postgres.yaml (use external 10.100.6.50)
- Remove redis.yaml (use external 10.100.9.50)
2025-12-15 19:03:05 -07:00
Kelly
c6874977ee docs: Add spdy.io infrastructure credentials
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:59:18 -07:00
Kelly
68430f5c22 fix(ci): Use mirror.gcr.io as registry mirror for Kaniko
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:52:02 -07:00
Kelly
ccefd325aa fix(ci): Use hardcoded Woodpecker workspace path for Kaniko
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:49:56 -07:00
Kelly
e119c5af53 chore: trigger CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:44:24 -07:00
Kelly
e61224aaed fix(ci): Use CI_WORKSPACE for Kaniko context paths
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:42:33 -07:00
Kelly
7cf1b7643f feat(ci): Use local registry 10.100.9.70:5000 for base images
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:28:20 -07:00
Kelly
74f813d68f feat(ci): Switch to Kaniko for Docker builds (no daemon, better DNS)
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:20:53 -07:00
Kelly
f38f1024de fix(docker): Use mirror.gcr.io in all Dockerfiles to avoid rate limits
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:19:18 -07:00
Kelly
358099c58a chore: trigger CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:12:41 -07:00
Kelly
7fdcfc4fc4 fix(ci): Use mirror.gcr.io to avoid Docker Hub rate limits
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:11:16 -07:00
Kelly
541b461283 fix(ci): Use public node:20 image for typecheck steps
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 18:08:46 -07:00
Kelly
8f25cf10ab chore: retry CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 17:06:42 -07:00
Kelly
79e434212f chore: retry CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 16:45:05 -07:00
Kelly
600172eff6 chore: retry CI
Some checks are pending
ci/woodpecker/push/woodpecker Pipeline is running
2025-12-15 15:51:40 -07:00
Kelly
4c12763fa1 chore: retry CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 13:18:53 -07:00
Kelly
2cb9a093f4 chore: retry CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 12:29:45 -07:00
Kelly
15ab40a820 chore: trigger CI build
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 12:14:14 -07:00
Kelly
2708fbe319 feat(brands): Add calculated tags with configurable thresholds
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Tags assigned per store:
- must_win: High-revenue store with room to grow SKUs
- at_risk: High OOS% (losing shelf presence)
- top_performer: High sales + good inventory management
- growth: Above-average velocity
- low_inventory: Low days on hand

Configurable via query params:
- ?must_win_max_skus=5
- ?at_risk_oos_pct=30
- ?top_performer_max_oos=15
- ?low_inventory_days=7

Response includes tag_thresholds showing applied values.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 12:06:44 -07:00
Kelly
231d49e3e8 feat(brands): Add margin estimation to stores/performance endpoint
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add ?margin_pct query param (default 50% industry standard)
- Returns margin_pct and margin_est per store
- Includes margin_pct_assumed in response metadata

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 12:02:36 -07:00
Kelly
17defa046c feat(api): Add /api/brands/:brand/stores/performance endpoint
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Add comprehensive per-store performance endpoint for Cannabrands integration.
Returns all metrics in one call for easy merging with internal order data.

Response includes per store:
- active_skus, oos_skus, total_skus, oos_pct
- avg_daily_units (velocity from inventory deltas)
- avg_days_on_hand (stock / daily velocity)
- total_sales_est (units × price × days)
- lost_opportunity (OOS days × velocity × price)
- categories breakdown (JSON object)
- avg_price, total_stock

Query params: ?days=28&state=AZ&limit=100&offset=0

Matches Hoodie Analytics columns for Order Management view.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 11:57:38 -07:00
Kelly
d76a5fb3c5 feat(api): Add brand analytics API endpoints
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Add comprehensive brand-level analytics endpoints at /api/brands:

Brand Discovery:
- GET /api/brands - List all brands with summary metrics
- GET /api/brands/search - Search brands by name
- GET /api/brands/top - Top brands by distribution

Brand Overview:
- GET /api/brands/:brand - Full brand intelligence dashboard
- GET /api/brands/:brand/analytics - Alias for overview

Sales & Velocity:
- GET /api/brands/:brand/sales - Sales data (4wk, daily avg)
- GET /api/brands/:brand/velocity - Units/day by SKU
- GET /api/brands/:brand/trends - Weekly sales trends

Inventory & Stock:
- GET /api/brands/:brand/inventory - Current stock levels
- GET /api/brands/:brand/oos - Out-of-stock products
- GET /api/brands/:brand/low-stock - Products below threshold

Pricing:
- GET /api/brands/:brand/pricing - Current prices
- GET /api/brands/:brand/price-history - Price changes over time

Distribution:
- GET /api/brands/:brand/distribution - Store count, market coverage
- GET /api/brands/:brand/stores - Stores carrying brand
- GET /api/brands/:brand/gaps - Whitespace opportunities

Events & Alerts:
- GET /api/brands/:brand/events - Visibility events
- POST /api/brands/:brand/events/:id/ack - Acknowledge alert

Products:
- GET /api/brands/:brand/products - All SKUs with metrics
- GET /api/brands/:brand/products/:sku - Single product deep dive

All endpoints support ?state=XX, ?days=N, and ?category=X filters.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 11:06:23 -07:00
Kelly
f19fc59583 chore: retry CI
Some checks are pending
ci/woodpecker/push/woodpecker Pipeline is running
2025-12-15 09:59:11 -07:00
Kelly
4c183c87a9 chore: retry CI after registry fix
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 09:35:05 -07:00
Kelly
ffa05f89c4 chore: trigger CI on develop
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2025-12-15 09:06:43 -07:00
136 changed files with 29174 additions and 666 deletions

.woodpecker.yml

@@ -3,7 +3,7 @@ steps:
# PR VALIDATION: Parallel type checks (PRs only)
# ===========================================
typecheck-backend:
image: git.spdy.io/creationshop/node:20
image: registry.spdy.io/library/node:22
commands:
- cd backend
- npm ci --prefer-offline
@@ -13,7 +13,7 @@ steps:
event: pull_request
typecheck-cannaiq:
image: git.spdy.io/creationshop/node:20
image: registry.spdy.io/library/node:22
commands:
- cd cannaiq
- npm ci --prefer-offline
@@ -23,7 +23,7 @@ steps:
event: pull_request
typecheck-findadispo:
image: git.spdy.io/creationshop/node:20
image: registry.spdy.io/library/node:22
commands:
- cd findadispo/frontend
- npm ci --prefer-offline
@@ -33,7 +33,7 @@ steps:
event: pull_request
typecheck-findagram:
image: git.spdy.io/creationshop/node:20
image: registry.spdy.io/library/node:22
commands:
- cd findagram/frontend
- npm ci --prefer-offline
@@ -46,7 +46,7 @@ steps:
# AUTO-MERGE: Merge PR after all checks pass
# ===========================================
auto-merge:
image: alpine:latest
image: registry.spdy.io/library/alpine:latest
environment:
GITEA_TOKEN:
from_secret: gitea_token
@@ -68,114 +68,117 @@ steps:
event: pull_request
# ===========================================
# MASTER DEPLOY: Parallel Docker builds
# NOTE: cache_from/cache_to removed due to plugin bug splitting on commas
# DOCKER: Multi-stage builds with layer caching
# ===========================================
docker-backend:
image: plugins/docker
settings:
registry: git.spdy.io
repo: git.spdy.io/creationshop/cannaiq
tags:
- latest
- sha-${CI_COMMIT_SHA:0:8}
dockerfile: backend/Dockerfile
context: backend
username:
from_secret: registry_username
password:
from_secret: registry_password
build_args:
- APP_BUILD_VERSION=sha-${CI_COMMIT_SHA:0:8}
- APP_GIT_SHA=${CI_COMMIT_SHA}
- APP_BUILD_TIME=${CI_PIPELINE_CREATED}
- CONTAINER_IMAGE_TAG=sha-${CI_COMMIT_SHA:0:8}
image: registry.spdy.io/library/kaniko:debug
commands:
- /kaniko/executor
--context=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/backend
--dockerfile=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/backend/Dockerfile
--destination=registry.spdy.io/cannaiq/backend:latest
--destination=registry.spdy.io/cannaiq/backend:sha-${CI_COMMIT_SHA:0:8}
--build-arg=APP_BUILD_VERSION=sha-${CI_COMMIT_SHA:0:8}
--build-arg=APP_GIT_SHA=${CI_COMMIT_SHA}
--build-arg=APP_BUILD_TIME=${CI_PIPELINE_CREATED}
--cache=true
--cache-repo=registry.spdy.io/cannaiq/cache-backend
--cache-ttl=168h
depends_on: []
when:
branch: [master, develop]
event: push
docker-cannaiq:
image: plugins/docker
settings:
registry: git.spdy.io
repo: git.spdy.io/creationshop/cannaiq-frontend
tags:
- latest
- sha-${CI_COMMIT_SHA:0:8}
dockerfile: cannaiq/Dockerfile
context: cannaiq
username:
from_secret: registry_username
password:
from_secret: registry_password
image: registry.spdy.io/library/kaniko:debug
commands:
- /kaniko/executor
--context=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/cannaiq
--dockerfile=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/cannaiq/Dockerfile
--destination=registry.spdy.io/cannaiq/frontend:latest
--destination=registry.spdy.io/cannaiq/frontend:sha-${CI_COMMIT_SHA:0:8}
--cache=true
--cache-repo=registry.spdy.io/cannaiq/cache-cannaiq
--cache-ttl=168h
depends_on: []
when:
branch: [master, develop]
event: push
docker-findadispo:
image: plugins/docker
settings:
registry: git.spdy.io
repo: git.spdy.io/creationshop/findadispo-frontend
tags:
- latest
- sha-${CI_COMMIT_SHA:0:8}
dockerfile: findadispo/frontend/Dockerfile
context: findadispo/frontend
username:
from_secret: registry_username
password:
from_secret: registry_password
image: registry.spdy.io/library/kaniko:debug
commands:
- /kaniko/executor
--context=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/findadispo/frontend
--dockerfile=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/findadispo/frontend/Dockerfile
--destination=registry.spdy.io/cannaiq/findadispo:latest
--destination=registry.spdy.io/cannaiq/findadispo:sha-${CI_COMMIT_SHA:0:8}
--cache=true
--cache-repo=registry.spdy.io/cannaiq/cache-findadispo
--cache-ttl=168h
depends_on: []
when:
branch: [master, develop]
event: push
docker-findagram:
image: plugins/docker
settings:
registry: git.spdy.io
repo: git.spdy.io/creationshop/findagram-frontend
tags:
- latest
- sha-${CI_COMMIT_SHA:0:8}
dockerfile: findagram/frontend/Dockerfile
context: findagram/frontend
username:
from_secret: registry_username
password:
from_secret: registry_password
image: registry.spdy.io/library/kaniko:debug
commands:
- /kaniko/executor
--context=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/findagram/frontend
--dockerfile=/woodpecker/src/git.spdy.io/Creationshop/cannaiq/findagram/frontend/Dockerfile
--destination=registry.spdy.io/cannaiq/findagram:latest
--destination=registry.spdy.io/cannaiq/findagram:sha-${CI_COMMIT_SHA:0:8}
--cache=true
--cache-repo=registry.spdy.io/cannaiq/cache-findagram
--cache-ttl=168h
depends_on: []
when:
branch: [master, develop]
event: push
# ===========================================
# STAGE 3: Deploy and Run Migrations
# DEPLOY: Pull from local registry
# ===========================================
deploy:
image: bitnami/kubectl:latest
image: registry.spdy.io/library/kubectl:latest
environment:
KUBECONFIG_CONTENT:
from_secret: kubeconfig_data
K8S_TOKEN:
from_secret: k8s_token
commands:
- mkdir -p ~/.kube
- echo "$KUBECONFIG_CONTENT" | tr -d '[:space:]' | base64 -d > ~/.kube/config
- |
cat > ~/.kube/config << KUBEEOF
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJkakNDQVIyZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWpNU0V3SHdZRFZRUUREQmhyTTNNdGMyVnkKZG1WeUxXTmhRREUzTmpVM05UUTNPRE13SGhjTk1qVXhNakUwTWpNeU5qSXpXaGNOTXpVeE1qRXlNak15TmpJegpXakFqTVNFd0h3WURWUVFEREJock0zTXRjMlZ5ZG1WeUxXTmhRREUzTmpVM05UUTNPRE13V1RBVEJnY3Foa2pPClBRSUJCZ2dxaGtqT1BRTUJCd05DQUFRWDRNdFJRTW5lWVJVV0s2cjZ3VEV2WjAxNnV4T3NUR3JJZ013TXVnNGwKajQ1bHZ6ZkM1WE1NY1pESnUxZ0t1dVJhVGxlb0xVOVJnSERIUUI4TUwzNTJvMEl3UURBT0JnTlZIUThCQWY4RQpCQU1DQXFRd0R3WURWUjBUQVFIL0JBVXdBd0VCL3pBZEJnTlZIUTRFRmdRVXIzNDZpNE42TFhzaEZsREhvSlU0CjJ1RjZseGN3Q2dZSUtvWkl6ajBFQXdJRFJ3QXdSQUlnVUtqdWRFQWJyS1JDVHROVXZTc1Rmb3FEaHFSeDM5MkYKTFFSVWlKK0hCVElDSUJqOFIxbG1zSnFSRkRHMEpwMGN4OG5ZZnFCaElRQzh6WWdRdTdBZmR4L3IKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
server: https://10.100.6.10:6443
name: spdy-k3s
contexts:
- context:
cluster: spdy-k3s
namespace: cannaiq
user: cannaiq-admin
name: cannaiq
current-context: cannaiq
users:
- name: cannaiq-admin
user:
token: $K8S_TOKEN
KUBEEOF
- chmod 600 ~/.kube/config
# Deploy backend first
- kubectl set image deployment/scraper scraper=git.spdy.io/creationshop/cannaiq:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
# Apply manifests to ensure probes and resource limits are set
- kubectl apply -f /woodpecker/src/git.spdy.io/Creationshop/cannaiq/k8s/scraper.yaml
- kubectl apply -f /woodpecker/src/git.spdy.io/Creationshop/cannaiq/k8s/scraper-worker.yaml
- kubectl set image deployment/scraper scraper=registry.spdy.io/cannaiq/backend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl rollout status deployment/scraper -n cannaiq --timeout=300s
# Note: Migrations run automatically at startup via auto-migrate
# Deploy remaining services
# Resilience: ensure workers are scaled up if at 0
- REPLICAS=$(kubectl get deployment scraper-worker -n cannaiq -o jsonpath='{.spec.replicas}'); if [ "$REPLICAS" = "0" ]; then echo "Scaling workers from 0 to 5"; kubectl scale deployment/scraper-worker --replicas=5 -n cannaiq; fi
- kubectl set image deployment/scraper-worker worker=git.spdy.io/creationshop/cannaiq:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/cannaiq-frontend cannaiq-frontend=git.spdy.io/creationshop/cannaiq-frontend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/findadispo-frontend findadispo-frontend=git.spdy.io/creationshop/findadispo-frontend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/findagram-frontend findagram-frontend=git.spdy.io/creationshop/findagram-frontend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl rollout status deployment/cannaiq-frontend -n cannaiq --timeout=120s
- kubectl set image deployment/scraper-worker worker=registry.spdy.io/cannaiq/backend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/cannaiq-frontend cannaiq-frontend=registry.spdy.io/cannaiq/frontend:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/findadispo-frontend findadispo-frontend=registry.spdy.io/cannaiq/findadispo:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl set image deployment/findagram-frontend findagram-frontend=registry.spdy.io/cannaiq/findagram:sha-${CI_COMMIT_SHA:0:8} -n cannaiq
- kubectl rollout status deployment/cannaiq-frontend -n cannaiq --timeout=300s
depends_on:
- docker-backend
- docker-cannaiq

CLAUDE.md

@@ -42,53 +42,49 @@ Never import `src/db/migrate.ts` at runtime. Use `src/db/pool.ts` for DB access.
Batch everything, push once, wait for user feedback.
### 7. K8S POD LIMITS — CRITICAL
**EXACTLY 8 PODS** for `scraper-worker` deployment. NEVER CHANGE THIS.
### 7. K8S — DEPLOY AND FORGET
**DO NOT run kubectl commands.** The system is self-managing.
**Replica Count is LOCKED:**
- Always 8 replicas — no more, no less
- NEVER scale down (even temporarily)
- NEVER scale up beyond 8
- If pods are not 8, restore to 8 immediately
**Operational Model:**
```
┌─────────────────────────────────────────────────────────┐
│ DEPLOY ONCE → WORKERS RUN FOREVER → CREATE TASKS ONLY │
└─────────────────────────────────────────────────────────┘
**Pods vs Workers:**
- **Pod** = Kubernetes container instance (ALWAYS 8)
- **Worker** = Concurrent task runner INSIDE a pod (controlled by `MAX_CONCURRENT_TASKS` env var)
- Formula: `8 pods × MAX_CONCURRENT_TASKS = 24 total concurrent workers`
**Browser Task Memory Limits:**
- Each Puppeteer/Chrome browser uses ~400 MB RAM
- Pod memory limit is 2 GB
- **MAX_CONCURRENT_TASKS=3** is the safe maximum for browser tasks
- More than 3 concurrent browsers per pod = OOM crash
| Browsers | RAM Used | Status |
|----------|----------|--------|
| 3 | ~1.3 GB | Safe (recommended) |
| 4 | ~1.7 GB | Risky |
| 5+ | >2 GB | OOM crash |
**To increase throughput:** Add more pods (up to 8), NOT more concurrent tasks per pod.
```bash
# CORRECT - scale pods (up to 8)
kubectl scale deployment/scraper-worker -n dispensary-scraper --replicas=8
# WRONG - will cause OOM crashes
kubectl set env deployment/scraper-worker -n dispensary-scraper MAX_CONCURRENT_TASKS=10
1. CI deploys code changes (automatic on push)
2. K8s maintains 8 pods (self-healing)
3. Workers poll DB for tasks (autonomous)
4. Create tasks via API or DB → workers pick them up
5. Never touch K8s directly
```
**If K8s API returns ServiceUnavailable:** STOP IMMEDIATELY. Do not retry. The cluster is overloaded.
**Fixed Configuration (NEVER CHANGE):**
- **8 replicas** — locked in `k8s/scraper-worker.yaml`
- **MAX_CONCURRENT_TASKS=3** — 3 browsers per pod (memory safe)
- **Total capacity:** 8 pods × 3 = 24 concurrent tasks
### 7. K8S REQUIRES EXPLICIT PERMISSION
**NEVER run kubectl commands without explicit user permission.**
**DO NOT:**
- Run `kubectl` commands (scale, rollout, logs, get pods, etc.)
- Manually restart pods
- Change replica count
- Check deployment status
Before running ANY `kubectl` command (scale, rollout, set env, delete, apply, etc.):
1. Tell the user what you want to do
2. Wait for explicit approval
3. Only then execute the command
**To interact with the system:**
- Create tasks in DB → workers pick them up automatically
- Check task status via DB queries or API
- View worker status via dashboard (cannaiq.co)
This applies to ALL kubectl operations - even read-only ones like `kubectl get pods`.
**Why no kubectl?**
- K8s auto-restarts crashed pods
- Workers self-heal (reconnect to DB, retry failed tasks)
- No manual intervention needed in steady state
- Only CI touches K8s (on code deployments)
**Scaling Decision:**
- Monitor pool drain rate via dashboard/DB queries
- If pool drains too slowly, manually increase replicas in `k8s/scraper-worker.yaml`
- Commit + push → CI deploys new replica count
- No runtime kubectl scaling — all changes via code
---
@@ -294,7 +290,7 @@ Workers use Evomi's residential proxy API for geo-targeted proxies on-demand.
**K8s Secret**: Credentials stored in `scraper-secrets`:
```bash
kubectl get secret scraper-secrets -n dispensary-scraper -o jsonpath='{.data.EVOMI_PASS}' | base64 -d
kubectl get secret scraper-secrets -n cannaiq -o jsonpath='{.data.EVOMI_PASS}' | base64 -d
```
**Proxy URL Format**: `http://{user}_{session}_{geo}:{pass}@{host}:{port}`
@@ -373,6 +369,122 @@ curl -X POST http://localhost:3010/api/tasks/crawl-state/AZ \
---
## Wasabi S3 Storage (Payload Archive)
Raw crawl payloads are archived to Wasabi S3 for long-term storage and potential reprocessing.
### Configuration
| Variable | Description | Default |
|----------|-------------|---------|
| `WASABI_ACCESS_KEY` | Wasabi access key ID | - |
| `WASABI_SECRET_KEY` | Wasabi secret access key | - |
| `WASABI_BUCKET` | Bucket name | `cannaiq` |
| `WASABI_REGION` | Wasabi region | `us-west-2` |
| `WASABI_ENDPOINT` | S3 endpoint URL | `https://s3.us-west-2.wasabisys.com` |
### Storage Path Format
```
payloads/{state}/{YYYY-MM-DD}/{dispensary_id}/{platform}_{timestamp}.json.gz
```
Example: `payloads/AZ/2025-12-16/123/dutchie_2025-12-16T10-30-00-000Z.json.gz`
### Features
- **Gzip compression**: ~70% size reduction on JSON payloads
- **Automatic archival**: Every crawl is archived (not just daily baselines)
- **Metadata**: taskId, productCount, platform stored with each object
- **Graceful fallback**: If Wasabi not configured, archival is skipped (no task failure)
### Files
| File | Purpose |
|------|---------|
| `src/services/wasabi-storage.ts` | S3 client and storage functions |
| `src/tasks/handlers/product-discovery-dutchie.ts` | Archives Dutchie payloads |
| `src/tasks/handlers/product-discovery-jane.ts` | Archives Jane payloads |
| `src/tasks/handlers/product-discovery-treez.ts` | Archives Treez payloads |
### K8s Secret Setup
```bash
kubectl patch secret scraper-secrets -n cannaiq -p '{"stringData":{
"WASABI_ACCESS_KEY": "<access-key>",
"WASABI_SECRET_KEY": "<secret-key>"
}}'
```
### Usage in Code
```typescript
import { storePayload, getPayload, listPayloads } from '../services/wasabi-storage';
// Store a payload
const result = await storePayload(dispensaryId, 'AZ', 'dutchie', rawPayload);
console.log(result.path); // payloads/AZ/2025-12-16/123/dutchie_...
console.log(result.compressedBytes); // Size after gzip
// Retrieve a payload
const payload = await getPayload(result.path);
// List payloads for a store on a date
const paths = await listPayloads(123, 'AZ', '2025-12-16');
```
### Estimated Storage
- ~100KB per crawl (compressed)
- ~200 stores × 12 crawls/day = 240MB/day
- ~7.2GB/month
- 5TB capacity = ~5+ years of storage
---
## Real-Time Inventory Tracking
High-frequency crawling for sales velocity and inventory analytics.
### Crawl Intervals
| State | Interval | Jitter | Effective Range |
|-------|----------|--------|-----------------|
| AZ | 5 min | ±3 min | 2-8 min |
| Others | 60 min | ±3 min | 57-63 min |
### Delta-Only Snapshots
Only store inventory changes, not full state. Reduces storage by ~95%.
**Change Types**:
- `sale`: quantity decreased (qty_delta < 0)
- `restock`: quantity increased (qty_delta > 0)
- `price_change`: price changed, quantity same
- `oos`: went out of stock (qty → 0)
- `back_in_stock`: returned to stock (0 → qty)
- `new_product`: first time seeing product
### Revenue Calculation
```
revenue = ABS(qty_delta) × effective_price
effective_price = sale_price if on_special else regular_price
```
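A sketch of how a single delta row could be classified and priced under the rules above; the observation shape and the tie-breaking between `sale` and `oos` are assumptions, and the real logic lives in `src/services/inventory-snapshots.ts`.

```typescript
// Classify one product's change between two crawls and derive revenue
// per the rules above (sketch; field names are illustrative).
type Obs = { qty: number; price: number; salePrice?: number; onSpecial: boolean };
type Delta = { changeType: string; qtyDelta: number; revenue: number };

function classifyDelta(prev: Obs | null, curr: Obs): Delta | null {
  if (!prev) return { changeType: "new_product", qtyDelta: curr.qty, revenue: 0 };
  const qtyDelta = curr.qty - prev.qty;
  const effectivePrice =
    curr.onSpecial && curr.salePrice != null ? curr.salePrice : curr.price;
  if (qtyDelta < 0) {
    // quantity dropped: a sale, or an out-of-stock event if it hit zero
    const changeType = curr.qty === 0 ? "oos" : "sale";
    return { changeType, qtyDelta, revenue: Math.abs(qtyDelta) * effectivePrice };
  }
  if (qtyDelta > 0) {
    const changeType = prev.qty === 0 ? "back_in_stock" : "restock";
    return { changeType, qtyDelta, revenue: 0 };
  }
  if (curr.price !== prev.price) return { changeType: "price_change", qtyDelta: 0, revenue: 0 };
  return null; // unchanged: delta-only storage records nothing
}
```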
### Key Views
| View | Purpose |
|------|---------|
| `v_hourly_sales` | Sales aggregated by hour |
| `v_daily_store_sales` | Daily revenue by store |
| `v_daily_brand_sales` | Daily brand performance |
| `v_product_velocity` | Hot/steady/slow/stale rankings |
| `v_stock_out_prediction` | Days until OOS based on velocity |
| `v_brand_variants` | SKU counts per brand |
### Files
| File | Purpose |
|------|---------|
| `src/services/inventory-snapshots.ts` | Delta calculation and storage |
| `src/services/task-scheduler.ts` | High-frequency scheduling with jitter |
| `migrations/125_delta_only_snapshots.sql` | Delta columns and views |
| `migrations/126_az_high_frequency.sql` | AZ 5-min intervals |
---
## Documentation
| Doc | Purpose |


@@ -1,6 +1,6 @@
# Build stage
# Image: git.spdy.io/creationshop/dispensary-scraper
FROM node:20-slim AS builder
FROM registry.spdy.io/library/node:22-slim AS builder
# Install build tools for native modules (bcrypt, sharp)
RUN apt-get update && apt-get install -y \
@@ -27,7 +27,7 @@ RUN npm run build
RUN npm prune --production
# Production stage
FROM node:20-slim
FROM registry.spdy.io/library/node:22-slim
# Build arguments for version info
ARG APP_BUILD_VERSION=dev
@@ -41,9 +41,10 @@ ENV APP_GIT_SHA=${APP_GIT_SHA}
ENV APP_BUILD_TIME=${APP_BUILD_TIME}
ENV CONTAINER_IMAGE_TAG=${CONTAINER_IMAGE_TAG}
# Install Chromium dependencies and curl for HTTP requests
# Install Chromium dependencies, curl, and CA certificates for HTTPS
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
chromium \
fonts-liberation \
libnss3 \


@@ -0,0 +1,937 @@
# CannaIQ Market Intelligence System
## Overview
Real-time cannabis market intelligence platform that competes with Headset.io and Hoodie Analytics by capturing minute-level inventory changes across dispensaries.
**Key Insight**: Every payload diff tells a story - new products, sales, restocks, price changes, and removals are all valuable market signals.
## Data We Capture
### From Each Payload Diff
| Change Type | Business Intelligence |
|-------------|----------------------|
| **sale** | Revenue, velocity, customer demand |
| **restock** | Supply chain timing, inventory turns |
| **new** | Product launches, brand expansion |
| **removed** | Discontinued products, seasonal items |
| **price_change** | Pricing strategy, promotions |
| **cannabinoid_change** | Quality/potency changes |
| **effect_change** | Consumer feedback updates |
### Product Data Fields (from POSMetaData)
```
POSMetaData.children[i]:
├── quantity → Stock level (KEY for sales calculation)
├── quantityAvailable → Available for online orders
├── option → Variant size ("1/8oz", "1g", "500mg")
├── canonicalID → Cross-store product matching
├── canonicalSKU → Brand's internal SKU
├── vendorId → Distributor/supplier
├── strainId → Strain genetics ID
├── price / recPrice → Regular retail price
├── effectivePotencyMg → Actual potency per unit
└── kioskQuantityAvailable → In-store kiosk stock
```
### Pricing Data
```
Product level:
├── Prices[] → Base prices per variant
├── recPrices[] → Recreational prices
├── recSpecialPrices[] → Sale prices
├── specialData:
│ ├── saleSpecials[]:
│ │ ├── percentOff / dollarOff
│ │ ├── brandIds[] → Which brands on sale
│ │ └── categoryIds[] → Which categories
│ └── bogoSpecials[]:
│ ├── buy X, get Y
│ └── discountPercent
└── special: boolean → Is product on sale?
```
### Product Metadata
```
├── brand.name → Brand
├── type → Category (Flower, Concentrate, Edible)
├── subcategory → Subcategory (gummies, live-resin, pods)
├── strainType → Indica / Sativa / Hybrid
├── THCContent.range → THC percentage
├── CBDContent.range → CBD percentage
├── cannabinoidsV2[] → Full cannabinoid profile
├── effects → User-reported effects
├── weights[] → Available sizes
└── image → Product image URL
```
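For downstream diffing, the fields above can be collected into one TypeScript shape; the optionality and types here are a sketch inferred from the trees above, not Dutchie's published schema.

```typescript
// Sketch of the payload fields listed above; the actual menu schema may
// differ in naming, optionality, and types.
interface PosVariant {
  quantity: number;               // stock level (key for sales calculation)
  quantityAvailable?: number;     // available for online orders
  option?: string;                // variant size: "1/8oz", "1g", "500mg"
  canonicalID?: string;           // cross-store product matching
  canonicalSKU?: string;          // brand's internal SKU
  vendorId?: string;
  strainId?: string;
  price?: number;
  recPrice?: number;
  effectivePotencyMg?: number;
  kioskQuantityAvailable?: number;
}

interface MenuProduct {
  POSMetaData: { children: PosVariant[] };
  Prices: number[];               // base prices per variant
  recPrices?: number[];
  recSpecialPrices?: number[];
  specialData?: {
    saleSpecials?: { percentOff?: number; dollarOff?: number; brandIds?: string[]; categoryIds?: string[] }[];
    bogoSpecials?: { discountPercent?: number }[];
  };
  special: boolean;               // is the product on sale?
  brand?: { name: string };
  type?: string;                  // Flower, Concentrate, Edible
  subcategory?: string;           // gummies, live-resin, pods
  strainType?: "Indica" | "Sativa" | "Hybrid";
  THCContent?: { range?: number[] };
  CBDContent?: { range?: number[] };
  cannabinoidsV2?: unknown[];
  effects?: Record<string, number>;
  weights?: string[];
  image?: string;
}
```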
---
## Analytics We Can Provide
### 1. Sales Analytics
| Metric | Description | Formula |
|--------|-------------|---------|
| **Daily Revenue** | Total sales per store | `SUM(revenue) WHERE change_type='sale'` |
| **Units Sold** | Total units moved | `SUM(ABS(quantity_delta))` |
| **Transactions** | Number of sale events | `COUNT(*) WHERE change_type='sale'` |
| **Average Basket** | Revenue per transaction | `revenue / transactions` |
| **Sales by Hour** | Peak selling hours | `GROUP BY date_trunc('hour')` |
| **Weekend vs Weekday** | Day-of-week patterns | `GROUP BY EXTRACT(dow)` |
### 2. Velocity Metrics (Like Hoodie)
| Metric | Description | Formula |
|--------|-------------|---------|
| **Sales Velocity** | Units/day per SKU | `units_sold / days_in_stock` |
| **Days of Supply** | How long stock lasts | `current_qty / daily_sell_rate` |
| **Sell-Through Rate** | % inventory sold | `units_sold / (units_sold + remaining_qty)` |
| **Turn Rate** | Inventory turns/year | `annual_sales / avg_inventory` |
| **Stock-Out Frequency** | Times hitting zero | `COUNT(*) WHERE quantity_after=0` |
| **Restock Cadence** | Days between restocks | `AVG(days_between_restocks)` |
### 3. Brand Performance (Like Headset)
| Metric | Description | Formula |
|--------|-------------|---------|
| **Brand Revenue** | Total by brand | `SUM(revenue) GROUP BY brand_name` |
| **Brand Market Share** | % of store revenue | `brand_revenue / total_revenue * 100` |
| **Category Share** | % within category | `brand_category_rev / category_rev * 100` |
| **Brand Velocity** | Units/day for brand | `brand_units / days` |
| **Brand Ranking** | Position vs competitors | `RANK() OVER (ORDER BY revenue)` |
| **Brand Momentum** | Week-over-week growth | `(this_week - last_week) / last_week` |
### 4. Distribution Metrics (Like Hoodie)
| Metric | Description | Formula |
|--------|-------------|---------|
| **Store Count** | Stores carrying brand | `COUNT(DISTINCT dispensary_id)` |
| **Weighted Distribution** | Distribution by store size | `SUM(store_revenue WHERE has_brand)` |
| **Category Penetration** | % of category stores | `stores_with_brand / total_category_stores` |
| **State Coverage** | States where brand is sold | `COUNT(DISTINCT state)` |
| **New Store Additions** | Stores added this period | `COUNT(new first_appearance)` |
| **Store Loss** | Stores dropped brand | `COUNT(removed AND no_reappear)` |
### 5. Pricing Intelligence
| Metric | Description | Formula |
|--------|-------------|---------|
| **Average Price** | Mean price by product | `AVG(price)` |
| **Price Range** | Spread from min to max | `MAX(price) - MIN(price)` |
| **Price vs Market** | Above/below average | `price - market_avg` |
| **Promotional Depth** | % discount when on sale | `(price - special_price) / price` |
| **Time on Promotion** | % days on special | `special_days / total_days` |
| **Promotional Lift** | Sales increase from special | `special_velocity / regular_velocity` |
### 6. Competitive Analysis
| Metric | Description | Formula |
|--------|-------------|---------|
| **Share of Shelf** | % of category SKUs | `brand_skus / total_category_skus` |
| **Price Position** | Premium/Value/Budget | `NTILE(3) OVER (ORDER BY price)` |
| **Head-to-Head** | vs specific competitor | `brand_a_share - brand_b_share` |
| **Category Leadership** | Top brand per category | `RANK() OVER (PARTITION BY category)` |
| **Substitution Pattern** | What sells when OOS | Correlation analysis |
### 7. Product Lifecycle
| Metric | Description | Formula |
|--------|-------------|---------|
| **New Product Velocity** | First 30-day performance | `velocity WHERE age < 30` |
| **Launch Success Rate** | % hitting sales targets | `successful_launches / total_launches` |
| **Time to Peak** | Days to max velocity | `MIN(date WHERE velocity = max)` |
| **Product Lifespan** | Days from new to removed | `removed_date - first_seen_date` |
| **Seasonal Products** | Appear/disappear patterns | Seasonal decomposition |
### 8. Category Trends
| Metric | Description | Formula |
|--------|-------------|---------|
| **Category Growth** | WoW/MoM revenue change | `(this_period - last_period) / last_period` |
| **Category Mix** | % of total revenue | `category_rev / total_rev` |
| **Emerging Categories** | Fastest growing | `ORDER BY growth_rate DESC` |
| **Category Cannibalization** | New category stealing share | Correlation analysis |
| **Format Trends** | Vape vs Flower vs Edible | `GROUP BY type` |
### 9. Potency & Effects
| Metric | Description | Formula |
|--------|-------------|---------|
| **THC Preference** | Avg THC of sold products | `AVG(thc_content) weighted by units` |
| **Potency Trends** | THC changes over time | `AVG(thc) GROUP BY month` |
| **Effect Correlation** | Effects that drive sales | `CORR(effect_score, velocity)` |
| **Strain Performance** | Indica vs Sativa vs Hybrid | `GROUP BY strain_type` |
| **THC Price Premium** | Price paid per % THC | `price / thc_content` |
### 10. Stock & Supply Chain
| Metric | Description | Formula |
|--------|-------------|---------|
| **Out-of-Stock Rate** | % time product is OOS | `oos_hours / total_hours` |
| **Lost Sales Estimate** | Revenue lost to OOS | `avg_hourly_rev * oos_hours` |
| **Restock Lead Time** | Days from OOS to restock | `restock_date - stockout_date` |
| **Over-Stock Risk** | Slow movers with high qty | `qty > 30 AND velocity < 1` |
| **Vendor Performance** | Restock reliability | `on_time_restocks / total_restocks` |
---
## Database Views for Analytics
```sql
-- Sales velocity by SKU
CREATE MATERIALIZED VIEW mv_sku_velocity AS
SELECT
dispensary_id,
product_id,
option,
brand_name,
category,
SUM(ABS(quantity_delta)) as units_30d,
SUM(revenue) as revenue_30d,
COUNT(*) as transactions_30d,
SUM(ABS(quantity_delta)) / 30.0 as daily_velocity
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, product_id, option, brand_name, category;
-- Brand market share by store
CREATE MATERIALIZED VIEW mv_brand_share AS
SELECT
dispensary_id,
brand_name,
SUM(revenue) as brand_revenue,
SUM(revenue) / SUM(SUM(revenue)) OVER (PARTITION BY dispensary_id) * 100 as market_share,
RANK() OVER (PARTITION BY dispensary_id ORDER BY SUM(revenue) DESC) as rank
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, brand_name;
-- Category performance
CREATE MATERIALIZED VIEW mv_category_performance AS
SELECT
dispensary_id,
category,
SUM(revenue) as category_revenue,
SUM(ABS(quantity_delta)) as units_sold,
COUNT(DISTINCT product_id) as unique_products,
COUNT(DISTINCT brand_name) as unique_brands
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, category;
-- Stock-out events
CREATE MATERIALIZED VIEW mv_stockouts AS
SELECT
dispensary_id,
product_id,
product_name,
brand_name,
option,
COUNT(*) as stockout_count,
SUM(EXTRACT(EPOCH FROM (COALESCE(restock_time, NOW()) - detected_at)) / 3600) as total_oos_hours
FROM (
SELECT
*,
-- Include restock rows so LEAD() finds the restock that ends each out-of-stock window
LEAD(detected_at) OVER (PARTITION BY dispensary_id, product_id, option ORDER BY detected_at) as restock_time
FROM inventory_changes
WHERE quantity_after = 0 OR change_type = 'restock'
) sub
WHERE quantity_after = 0
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, product_id, product_name, brand_name, option;
-- Price history
CREATE VIEW v_price_history AS
SELECT
dispensary_id,
product_id,
product_name,
brand_name,
option,
detected_at,
price,
special_price,
is_special,
LAG(price) OVER w as prev_price,
LAG(special_price) OVER w as prev_special_price,
price - LAG(price) OVER w as price_change
FROM inventory_changes
WHERE change_type = 'price_change'
WINDOW w AS (PARTITION BY dispensary_id, product_id, option ORDER BY detected_at);
```
---
## API Endpoints
### Sales
- `GET /api/inventory/sales/hourly/:dispensaryId`
- `GET /api/inventory/sales/daily/:dispensaryId`
- `GET /api/inventory/sales/by-category/:dispensaryId`
### Brands
- `GET /api/inventory/brands/:dispensaryId` - Brand performance
- `GET /api/inventory/brands/market-share/:dispensaryId`
- `GET /api/inventory/brands/velocity/:brandName`
- `GET /api/inventory/brands/distribution/:brandName`
### Products
- `GET /api/inventory/products/velocity/:dispensaryId`
- `GET /api/inventory/products/new/:dispensaryId`
- `GET /api/inventory/products/removed/:dispensaryId`
### Stock
- `GET /api/inventory/stockouts/:dispensaryId`
- `GET /api/inventory/restocks/:dispensaryId`
- `GET /api/inventory/stock-levels/:dispensaryId`
### Pricing
- `GET /api/inventory/price-changes/:dispensaryId`
- `GET /api/inventory/promotions/:dispensaryId`
- `GET /api/inventory/price-comparison/:productId`
### Market
- `GET /api/inventory/market/category-trends`
- `GET /api/inventory/market/brand-rankings`
- `GET /api/inventory/market/state-comparison`
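A minimal client sketch for calling these endpoints. The response shape and bearer-token auth shown here are assumptions, not part of the API contract above:
```typescript
// Hypothetical response shape for /api/inventory/market/brand-rankings
interface BrandRanking {
  brandName: string;
  revenue30d: number;
  marketSharePct: number;
}

async function fetchBrandRankings(baseUrl: string, token: string): Promise<BrandRanking[]> {
  const res = await fetch(`${baseUrl}/api/inventory/market/brand-rankings`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`brand-rankings request failed: ${res.status}`);
  return (await res.json()) as BrandRanking[];
}
```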
---
## Revenue Formula
For every sale:
```
revenue = ABS(quantity_delta) × effective_price
effective_price = is_special ? special_price : price
```
Example:
- Product: Clout King Flower Jar | Hood Snacks (1/8oz)
- Regular price: $45.00
- Special price: $33.75
- is_special: true
- quantity_delta: -2
**Revenue = 2 × $33.75 = $67.50**
---
## Data Quality
### Deduplication
- Hash each payload (SHA-256) to prevent reprocessing (see the sketch below)
- Track `payload_processing_log` for audit trail
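A minimal sketch of the dedup check, assuming a checksum table shaped like the `payload_checksums` schema in the inventory tracking doc (dispensary_id + SHA-256 hash); the `db.query` signature here is illustrative:
```typescript
import { createHash } from 'crypto';

// Hash the raw product array so identical payloads are skipped
function hashPayload(products: unknown[]): string {
  return createHash('sha256').update(JSON.stringify(products)).digest('hex');
}

// Returns true if this exact payload was already processed for the store
async function isAlreadyProcessed(
  db: { query: (sql: string, params: unknown[]) => Promise<{ rowCount: number }> },
  dispensaryId: number,
  payloadHash: string
): Promise<boolean> {
  const result = await db.query(
    'SELECT 1 FROM payload_checksums WHERE dispensary_id = $1 AND payload_hash = $2',
    [dispensaryId, payloadHash]
  );
  return result.rowCount > 0;
}
```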
### Consistency
- Use daily snapshots as baseline for new product detection
- Compare consecutive payloads for change detection
### Coverage
- Track all 987+ products per store
- All variants/options (1/8oz, 1/4oz, 1/2oz, etc.)
- Full cannabinoid profiles (THC, CBD, THCA, CBG, etc.)
- User-reported effects (calm, happy, relaxed, etc.)
---
## Implementation Status
| Component | Status | File |
|-----------|--------|------|
| Schema | ✅ Complete | `migrations/132_inventory_changes.sql` |
| Diff Calculator | ✅ Complete | `src/services/inventory-tracker.ts` |
| Daily Snapshots | ✅ Complete | `src/services/daily-snapshot.ts` |
| Analytics API | ✅ Complete | `src/routes/inventory-analytics.ts` |
| Task Handler | 🔲 Pending | `src/tasks/handlers/realtime-inventory.ts` |
| Materialized Views | 🔲 Pending | SQL migration |
---
## Competitive Analysis
### Headset Product Suite
| Product | What It Does | Can We Build? |
|---------|--------------|---------------|
| **Retailer** | POS integration, sales tracking, inventory optimization, employee performance | YES (except employee data) |
| **Insights** | Market data across 17 US + 4 CA markets, consumer behavior, product intelligence | YES - we have the data |
| **Bridge Nexus** | AI-powered ordering, auto-generated POs, ERP sync for suppliers | FUTURE - data foundation exists |
| **Vault** | Snowflake data warehouse, SQL access, Tableau integration | YES - expose via API |
### Hoodie Analytics Features
| Feature | Can We Build? |
|---------|---------------|
| Sales velocity | YES - quantity_delta over time |
| Weighted distribution | YES - brand presence × store revenue |
| Market share | YES - brand_revenue / total_revenue |
| Pricing trends | YES - price_change events |
| Out-of-stock detection | YES - quantity_after = 0 |
| Competitive positioning | YES - brand vs brand comparisons |
| Consumer geofencing | NO - requires mobile app data |
| Retail demographic profiling | PARTIAL - can integrate external data |
### Feature Comparison Matrix
| Feature | Headset | Hoodie | CannaIQ |
|---------|---------|--------|---------|
| **Data Source** | Direct POS | Menu scraping | Menu scraping |
| **Update Frequency** | Hourly | Daily | **Per-minute** |
| **Markets Covered** | 17 US + 4 CA | US focus | Growing |
| **Cannabinoid Tracking** | Basic THC/CBD | None | **Full profile (THC, CBD, THCA, CBG, etc.)** |
| **Effect Tracking** | None | None | **User-reported effects** |
| **Pricing History** | Limited | Yes | **Complete with specials** |
| **BOGO/Deal Tracking** | Unknown | Unknown | **Yes - specialData** |
| **Real-time Alerts** | Yes | No | **Yes** |
| **Data Warehouse** | Snowflake (Vault) | No | **API + future Snowflake** |
| **AI Ordering** | Yes (Bridge) | No | **Future** |
| **Cost** | $$$$$ | $$$$ | **Lower** |
### Our Unique Advantages
1. **Per-minute granularity** - No one else has this level of detail
2. **Full cannabinoid profiles** - THCA, CBG, etc. not just THC/CBD
3. **User-reported effects** - Calm, happy, relaxed, etc.
4. **Special/deal tracking** - BOGO deals, percentage off, category-wide sales
5. **No POS integration required** - Works with any Dutchie-powered store
### What We're Missing (Future Development)
| Gap | How to Address |
|-----|----------------|
| Individual customer data | Would need POS integration |
| Basket analysis | Can infer from correlated changes |
| Employee performance | Not available via menu data |
| Consumer geofencing | Would need mobile app |
| Store demographics | Integrate Census/external data |
| AI-powered ordering | Build on our data foundation |
---
## Cannabrands B2B Platform Integration
Cannabrands uses CannaIQ data for their B2B platform. Here's what we provide:
### For Brand Owners (Sellers)
| Metric | Query | Value |
|--------|-------|-------|
| **Where am I sold?** | Stores with brand sales | Distribution footprint |
| **How am I performing?** | Revenue, units, velocity by store | Account prioritization |
| **Am I in stock?** | Current qty at each store | Reorder alerts |
| **What's my market share?** | Brand % of category/store | Competitive position |
| **How are my competitors doing?** | Side-by-side brand comparison | Strategic planning |
| **Are my specials working?** | Special vs regular price sales | Promotional ROI |
| **Which SKUs are hot?** | Velocity by product/option | Production planning |
| **Where am I losing distribution?** | Stores that dropped brand | Account recovery |
| **New store opportunities** | Stores without brand in category | Sales targets |
### For Retailers (Buyers)
| Metric | Query | Value |
|--------|-------|-------|
| **What's selling?** | Top products by velocity | Buying decisions |
| **What should I reorder?** | Low stock + high velocity | Inventory management |
| **What's trending?** | Category growth rates | Assortment planning |
| **Am I priced right?** | Price vs market average | Pricing strategy |
| **What am I missing?** | Hot products not carried | Assortment gaps |
| **Brand performance** | All brands ranked | Vendor negotiations |
### API Endpoints for Cannabrands
```
# Brand analytics
GET /api/brands/:brandName/distribution
GET /api/brands/:brandName/velocity
GET /api/brands/:brandName/market-share
GET /api/brands/:brandName/stock-status
GET /api/brands/:brandName/competitors
# Store analytics
GET /api/stores/:storeId/brand-performance
GET /api/stores/:storeId/category-mix
GET /api/stores/:storeId/reorder-suggestions
# Market analytics
GET /api/market/category-trends
GET /api/market/brand-rankings
GET /api/market/price-index
```
### Normalized Brand Matching
Cannabrands stores a normalized brand key to match across spelling variations:
```sql
-- "Aloha Tyme Machine", "ALOHA TYMEMACHINE", "Aloha TymeMachine"
-- All normalize to: "alohatymemachine"
SELECT * FROM inventory_changes
WHERE normalize_brand(brand_name) = normalize_brand('Aloha Tyme Machine');
```
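`normalize_brand` itself isn't defined in this doc; a minimal TypeScript sketch of the same idea (lowercase, strip everything non-alphanumeric) that reproduces the example above:
```typescript
// Collapse spelling, spacing, and casing variants into a single brand key
function normalizeBrand(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// normalizeBrand('Aloha Tyme Machine') === 'alohatymemachine'
// normalizeBrand('ALOHA TYMEMACHINE')  === 'alohatymemachine'
// normalizeBrand('Aloha TymeMachine')  === 'alohatymemachine'
```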
### Real-Time Alerts for Brands
| Alert | Trigger | Action |
|-------|---------|--------|
| **Stock-out** | quantity_after = 0 | Notify sales rep |
| **Low stock** | days_of_supply < 7 | Suggest reorder |
| **Lost distribution** | No sales in 30 days | Account review |
| **Price change** | Competitor price drop | Pricing alert |
| **New competitor** | New brand in category | Competitive intel |
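A sketch of how the first three triggers could be evaluated against the tracked data; the `SkuState` shape mirrors the table above but is otherwise illustrative:
```typescript
type BrandAlert = 'stock_out' | 'low_stock' | 'lost_distribution';

interface SkuState {
  quantityAfter: number;     // latest quantity from inventory_changes
  dailyVelocity: number;     // avg units/day, e.g. from mv_sku_velocity
  daysSinceLastSale: number; // days since the last 'sale' change for this SKU
}

function evaluateBrandAlerts(sku: SkuState): BrandAlert[] {
  const alerts: BrandAlert[] = [];
  if (sku.quantityAfter === 0) alerts.push('stock_out');
  const daysOfSupply = sku.dailyVelocity > 0 ? sku.quantityAfter / sku.dailyVelocity : Infinity;
  if (sku.quantityAfter > 0 && daysOfSupply < 7) alerts.push('low_stock');
  if (sku.daysSinceLastSale >= 30) alerts.push('lost_distribution');
  return alerts;
}
```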
---
## Data Pipeline Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ DATA COLLECTION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Browser Worker (Puppeteer + Stealth + Evomi Proxy) │
│ ├── Visit store menu page (natural browsing) │
│ ├── Execute GraphQL query from browser context │
│ ├── Capture full product payload │
│ └── Send to diff calculator │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ DIFF CALCULATION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ inventory-tracker.ts │
│ ├── Compare current payload to previous │
│ ├── Detect: new, removed, sale, restock, price_change │
│ ├── Detect: cannabinoid_change, effect_change │
│ ├── Calculate revenue: qty_delta × effective_price │
│ └── Insert changes to inventory_changes table │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ DATA STORAGE │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ PostgreSQL │
│ ├── inventory_changes (diffs - queryable) │
│ ├── daily_snapshots (benchmarks - reconstruction) │
│ ├── payload_processing_log (deduplication) │
│ └── Materialized views (pre-computed analytics) │
│ │
│ MinIO/S3 (optional cold storage) │
│ └── Raw payloads for re-analysis (30-90 day retention) │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ ANALYTICS LAYER │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ REST API │
│ ├── /api/inventory/sales/* (revenue, units, transactions) │
│ ├── /api/inventory/brands/* (market share, velocity) │
│ ├── /api/inventory/products/* (new, removed, velocity) │
│ ├── /api/inventory/stock/* (stockouts, restocks) │
│ └── /api/inventory/market/* (trends, rankings) │
│ │
│ Materialized Views (refreshed hourly) │
│ ├── mv_sku_velocity │
│ ├── mv_brand_share │
│ ├── mv_category_performance │
│ ├── mv_brand_distribution │
│ └── mv_promotional_analysis │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ CONSUMERS │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Cannabrands B2B Platform │
│ ├── Brand dashboards │
│ ├── Store analytics │
│ └── Market intelligence │
│ │
│ CannaIQ Admin │
│ ├── System monitoring │
│ └── Data quality checks │
│ │
│ Future: Snowflake/Tableau export │
│ │
└─────────────────────────────────────────────────────────────────────┘
```
---
## Novel Analytics (Our Unique IP)
These are analytics **only we can build** due to our unique data:
### 1. Potency-Price Optimization
No one else tracks full cannabinoid profiles. We can answer:
```sql
-- What's the optimal THC% for each price tier?
SELECT
NTILE(5) OVER (ORDER BY price) as price_tier,
AVG(thc_content) as avg_thc,
SUM(ABS(quantity_delta)) as units_sold,
AVG(daily_velocity) as avg_velocity
FROM inventory_changes
WHERE change_type = 'sale' AND thc_content IS NOT NULL
GROUP BY 1;
-- Value per mg THC - are customers paying for potency?
SELECT
brand_name,
AVG(price / NULLIF(thc_content, 0)) as price_per_thc_pct,
SUM(revenue) as total_revenue
FROM inventory_changes
WHERE change_type = 'sale' AND thc_content > 0
GROUP BY brand_name
ORDER BY total_revenue DESC;
```
**Business value**: Brands can optimize potency targeting. Retailers can price by potency.
### 2. Effect-Driven Product Development
We track user-reported effects (calm, happy, relaxed, euphoric, energetic, etc.):
```sql
-- Which effect profiles drive highest sales?
SELECT
effects->>'calm' as calm_score,
effects->>'happy' as happy_score,
SUM(revenue) as total_revenue,
AVG(daily_velocity) as avg_velocity
FROM inventory_changes
WHERE change_type = 'sale' AND effects IS NOT NULL
GROUP BY 1, 2
ORDER BY total_revenue DESC;
-- Effect preferences by time of day
SELECT
EXTRACT(hour FROM detected_at) as hour,
AVG((effects->>'calm')::int) as avg_calm_sold,
AVG((effects->>'energetic')::int) as avg_energetic_sold
FROM inventory_changes
WHERE change_type = 'sale' AND effects IS NOT NULL
GROUP BY 1;
```
**Business value**: Brands develop products for specific effects. Morning = energetic, evening = calm.
### 3. Minute-Level Flash Sale Analysis
No one else has per-minute data. We can detect:
```sql
-- Flash sale effectiveness (sales spike detection)
-- Pre-aggregate per minute first; window functions can't be used in HAVING
WITH per_minute AS (
SELECT
date_trunc('minute', detected_at) as minute,
COUNT(*) as transactions,
SUM(revenue) as minute_revenue
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY 1
)
SELECT *
FROM (
SELECT
minute,
transactions,
minute_revenue,
LAG(minute_revenue) OVER (ORDER BY minute) as prev_minute_revenue,
minute_revenue / NULLIF(LAG(minute_revenue) OVER (ORDER BY minute), 0) as spike_ratio
FROM per_minute
) m
WHERE spike_ratio > 2;
```
**Business value**: Detect when promotions go live. Measure immediate impact.
### 4. Substitution Pattern Analysis
When product A sells out, what do customers buy instead?
```sql
-- Find substitution patterns after stock-outs
WITH stockouts AS (
SELECT dispensary_id, product_id, brand_name, category, detected_at as stockout_time
FROM inventory_changes
WHERE quantity_after = 0
),
subsequent_sales AS (
SELECT ic.*, so.brand_name as stockout_brand, so.product_id as stockout_product
FROM inventory_changes ic
JOIN stockouts so ON ic.dispensary_id = so.dispensary_id
AND ic.category = so.category
AND ic.brand_name != so.brand_name
AND ic.detected_at BETWEEN so.stockout_time AND so.stockout_time + INTERVAL '4 hours'
WHERE ic.change_type = 'sale'
)
SELECT
stockout_brand,
brand_name as substitute_brand,
COUNT(*) as substitutions,
SUM(revenue) as captured_revenue
FROM subsequent_sales
GROUP BY stockout_brand, brand_name
ORDER BY substitutions DESC;
```
**Business value**: Identify who steals your sales when you're OOS. Understand brand loyalty.
### 5. Cannabinoid Trend Forecasting
Track shifts in cannabinoid preferences over time:
```sql
-- THC preference trend over time
SELECT
date_trunc('week', detected_at) as week,
AVG(thc_content) FILTER (WHERE change_type = 'sale') as avg_thc_sold,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY thc_content) as p90_thc,
SUM(ABS(quantity_delta)) FILTER (WHERE cbg_content > 0) as cbg_units,
SUM(ABS(quantity_delta)) FILTER (WHERE thca_content > 25) as high_thca_units
FROM inventory_changes
WHERE thc_content IS NOT NULL
GROUP BY 1
ORDER BY 1;
```
**Business value**: Predict demand for high-THC products. Spot emerging cannabinoids (CBG boom).
### 6. Price Elasticity by Product Attribute
How does price sensitivity vary by THC%, category, brand tier?
```sql
-- Price elasticity by THC tier
SELECT
CASE
WHEN thc_content < 15 THEN 'low_thc'
WHEN thc_content < 25 THEN 'mid_thc'
ELSE 'high_thc'
END as thc_tier,
CORR(price, ABS(quantity_delta)) as price_volume_correlation,
AVG(CASE WHEN is_special THEN ABS(quantity_delta) END) /
NULLIF(AVG(CASE WHEN NOT is_special THEN ABS(quantity_delta) END), 0) as special_lift
FROM inventory_changes
WHERE change_type = 'sale' AND thc_content IS NOT NULL
GROUP BY 1;
```
**Business value**: Set optimal prices by product segment. Know where discounts work.
### 7. Strain Performance Leaderboard
Track which strains sell best across all stores:
```sql
-- Top performing strains (via product name parsing)
SELECT
REGEXP_REPLACE(product_name, '.*(Indica|Sativa|Hybrid).*', '\1') as strain_type,
-- Extract strain name from common patterns
SPLIT_PART(product_name, '|', 2) as strain_name,
COUNT(DISTINCT dispensary_id) as stores_carrying,
SUM(revenue) as total_revenue,
SUM(ABS(quantity_delta)) as units_sold
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY 1, 2
HAVING COUNT(DISTINCT dispensary_id) > 3
ORDER BY total_revenue DESC
LIMIT 50;
```
**Business value**: Growers know which genetics to cultivate. Buyers know what to stock.
### 8. Promotional Strategy Optimization
BOGO vs % off vs $ off - what works best?
```sql
-- Analyze special types from specialData (if captured)
SELECT
category,
is_special,
AVG(ABS(quantity_delta)) as avg_units_per_sale,
AVG((price - COALESCE(special_price, price)) / NULLIF(price, 0) * 100) as avg_discount_pct,
SUM(revenue) as total_revenue
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY category, is_special;
```
**Business value**: Optimize promotional calendar. Know which discount types move product.
### 9. Restock Prediction Model
Predict when stores will run out based on velocity:
```sql
-- Days until stockout prediction
SELECT
dispensary_id,
product_id,
product_name,
brand_name,
option,
current_qty,
daily_velocity,
CASE
WHEN daily_velocity > 0 THEN current_qty / daily_velocity
ELSE 999
END as days_until_stockout
FROM (
SELECT DISTINCT ON (dispensary_id, product_id, option)
dispensary_id,
product_id,
product_name,
brand_name,
option,
quantity_after as current_qty
FROM inventory_changes
ORDER BY dispensary_id, product_id, option, detected_at DESC
) current_stock
JOIN mv_sku_velocity USING (dispensary_id, product_id, option)
WHERE current_qty > 0 AND daily_velocity > 0
ORDER BY days_until_stockout ASC;
```
**Business value**: Proactive reorder alerts. Prevent lost sales.
### 10. Category Cannibalization Detection
Does a new category steal from existing ones?
```sql
-- Week-over-week category share shift
WITH weekly_category AS (
SELECT
date_trunc('week', detected_at) as week,
category,
SUM(revenue) as category_revenue
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY 1, 2
)
SELECT
category,
week,
category_revenue,
category_revenue / SUM(category_revenue) OVER (PARTITION BY week) * 100 as share_pct,
category_revenue / NULLIF(LAG(category_revenue) OVER (PARTITION BY category ORDER BY week), 0) - 1 as wow_growth
FROM weekly_category
ORDER BY week DESC, category_revenue DESC;
```
**Business value**: Spot when vapes steal from flower. Understand format shifts.
### 11. Cross-Store Price Arbitrage
Same product, different prices at different stores:
```sql
-- Price variance for same product across stores
SELECT
product_name,
brand_name,
option,
COUNT(DISTINCT dispensary_id) as store_count,
MIN(price) as min_price,
MAX(price) as max_price,
MAX(price) - MIN(price) as price_spread,
STDDEV(price) as price_stddev
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY product_name, brand_name, option
HAVING COUNT(DISTINCT dispensary_id) > 3 AND MAX(price) - MIN(price) > 5
ORDER BY price_spread DESC;
```
**Business value**: Brands ensure MSRP compliance. Retailers benchmark pricing.
### 12. Time-of-Day × Category Heatmap
When do different categories sell?
```sql
-- Sales heatmap by hour and category
SELECT
EXTRACT(hour FROM detected_at) as hour,
category,
SUM(revenue) as revenue,
SUM(ABS(quantity_delta)) as units
FROM inventory_changes
WHERE change_type = 'sale' AND category IS NOT NULL
GROUP BY 1, 2
ORDER BY hour, revenue DESC;
```
**Business value**: Schedule staff by category demand. Time promotions optimally.
### 13. New Product Launch Velocity Benchmark
How fast should a new product sell to be "successful"?
```sql
-- Benchmark: velocity in first 7 days vs established products
SELECT
category,
CASE
WHEN first_seen > NOW() - INTERVAL '7 days' THEN 'new_launch'
ELSE 'established'
END as product_age,
AVG(daily_velocity) as avg_velocity,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY daily_velocity) as median_velocity,
PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY daily_velocity) as p75_velocity
FROM mv_new_product_performance
GROUP BY category, product_age;
```
**Business value**: Set launch KPIs. Identify winners/losers early.
### 14. Weather Impact Analysis (External Integration)
Correlate sales with weather data:
```sql
-- (Requires weather data integration)
-- Do edibles sell more on rainy days?
SELECT
w.condition, -- sunny, rainy, cloudy
ic.category,
AVG(ic.revenue) as avg_daily_revenue
FROM inventory_changes ic
JOIN weather_data w ON DATE(ic.detected_at) = w.date AND ic.state = w.state
WHERE ic.change_type = 'sale'
GROUP BY w.condition, ic.category;
```
**Business value**: Predict demand spikes. Optimize inventory for weather.
---
## Summary: Our Unique Value Proposition
| Capability | Headset | Hoodie | CannaIQ |
|------------|---------|--------|---------|
| Cannabinoid optimization | ❌ | ❌ | ✅ |
| Effect-based analytics | ❌ | ❌ | ✅ |
| Per-minute flash sale detection | ❌ | ❌ | ✅ |
| Substitution patterns | Limited | ❌ | ✅ |
| Cannabinoid trend forecasting | ❌ | ❌ | ✅ |
| Strain genetics performance | Basic | ❌ | ✅ |
| Price elasticity by potency | ❌ | ❌ | ✅ |
| Real-time stockout prediction | ❌ | ❌ | ✅ |
**Bottom line**: We have data granularity and cannabinoid/effect data that no competitor has. This enables entirely new categories of analytics.

View File

@@ -0,0 +1,362 @@
# Payload-Based Sales Intelligence System
## Overview
CannaIQ stores raw product payloads from dispensary platforms (Dutchie, Jane, etc.) in MinIO/Wasabi S3 storage. By comparing payloads over time, we can calculate:
- **Units sold** (quantity decreases)
- **Revenue** (qty × price, using special pricing when applicable)
- **New arrivals** (products appearing)
- **Sold out items** (products disappearing)
- **Price changes** (regular and special)
- **Brand performance** (aggregated by brand)
## Data Structure
### Payload Storage
Payloads are stored in MinIO with path pattern:
```
payloads/{platform}/{year}/{month}/{day}/store_{id}_{timestamp}.json[.gz]
```
Example:
```
payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json
```
### Key Fields for Sales Calculation
| Field | Location | Description |
|-------|----------|-------------|
| `quantity` | `POSMetaData.children[i].quantity` | Current stock for variant |
| `quantityAvailable` | `POSMetaData.children[i].quantityAvailable` | Available stock |
| `kioskQuantityAvailable` | `POSMetaData.children[i].kioskQuantityAvailable` | Kiosk-specific stock |
| `price` | `POSMetaData.children[i].price` or `Prices[i]` | Regular price |
| `recPrice` | `POSMetaData.children[i].recPrice` | Recreational price |
| `medPrice` | `POSMetaData.children[i].medPrice` | Medical price |
| `special` | Product root | Boolean - is product on sale |
| `recSpecialPrices` | Product root | Array of special prices per variant |
| `medicalSpecialPrices` | Product root | Array of medical special prices |
| `Options` | Product root | Array of weight/size strings (e.g., "1/8oz", "1g") |
### POSMetaData.children[i] - Full Structure
Each variant (size/weight option) has its own child object:
```json
{
"option": "1/2oz",
"quantity": 3,
"quantityAvailable": 0,
"kioskQuantityAvailable": 3,
"price": 60,
"recPrice": 60,
"medPrice": null,
"canonicalID": "3529641",
"canonicalSKU": "HWF14GMTW",
"canonicalBrandId": "113707",
"canonicalVendorId": "81035",
"canonicalStrainId": "613696",
"canonicalCategory": "Flower | Half-Ounce",
"canonicalCategoryId": "1840844",
"canonicalEffectivePotencyMg": 3948,
"canonicalImgUrl": "https://...",
"canonicalLabResultUrl": "https://...",
"activeBatchTags": [{"tagId": "3944"}]
}
```
### specialData - Deals & Promotions
#### Sale Specials
```json
{
"saleSpecials": [{
"discount": 40,
"percentDiscount": true,
"dollarDiscount": false,
"specialName": "40% Off STIIIZY",
"specialType": "sale",
"eligibleProductOptions": ["2.5g"],
"saleDiscounts": [{
"brandIds": ["27733"],
"categoryIds": ["1842571"],
"productIds": ["2530095", ...],
"discountAmount": 40,
"discountType": "percentDiscount",
"weights": ["2.5g", "1g"]
}]
}]
}
```
#### BOGO Specials
```json
{
"bogoSpecials": [{
"bogoConditions": [],
"bogoRewards": [{
"brandIds": ["113707"],
"discountAmount": 25,
"discountType": "targetPrice",
"totalQuantity": {
"quantity": 2,
"quantityOperator": "greaterThanEqualTo"
},
"weights": ["14.0g"]
}],
"endStamp": "1767164400000"
}]
}
```
### Product Structure Example
```json
{
"id": "67d206d0df131282d5ce95a0",
"Name": "High West Farms Flower Mylar | Timewreck",
"brand": { "name": "High West Farms" },
"Options": ["1/2oz"],
"Prices": [60],
"special": true,
"recSpecialPrices": [45],
"POSMetaData": {
"children": [
{
"option": "1/2oz",
"quantity": 3,
"quantityAvailable": 0,
"price": 60,
"recPrice": 60
}
]
}
}
```
## Sales Calculation Formula
```
For each SKU variant:
qty_sold = before.POSMetaData.children[i].quantity - after.POSMetaData.children[i].quantity
If qty_sold > 0:
price = after.recSpecialPrices[i] if after.special else after.Prices[i]
revenue = qty_sold × price
```
**Key principle**: Use TODAY's price (from the newer payload) because that's what the customer paid.
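A TypeScript sketch of this calculation for a single product, matching variants by `option`; the index-aligned `Prices`/`recSpecialPrices` lookup follows the pseudocode above and is an assumption about the payload:
```typescript
interface VariantChild { option: string; quantity: number; }
interface DutchieProductSnapshot {
  Prices: number[];
  special: boolean;
  recSpecialPrices?: number[];
  POSMetaData: { children: VariantChild[] };
}

// Units sold and revenue per variant between two snapshots of the same product
function variantSales(before: DutchieProductSnapshot, after: DutchieProductSnapshot) {
  const results: { option: string; qtySold: number; revenue: number }[] = [];
  after.POSMetaData.children.forEach((curr, i) => {
    const prev = before.POSMetaData.children.find((c) => c.option === curr.option);
    if (!prev) return;
    const qtySold = prev.quantity - curr.quantity;
    if (qtySold <= 0) return; // an increase is a restock, handled separately
    // Use TODAY's price: special price when on sale, otherwise the regular price
    const specialPrice = after.recSpecialPrices ? after.recSpecialPrices[i] : undefined;
    const price = after.special && specialPrice != null ? specialPrice : after.Prices[i];
    results.push({ option: curr.option, qtySold, revenue: qtySold * price });
  });
  return results;
}
```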
## Scripts
### 1. Track Quantity Over Time
`src/scripts/track-qty-over-time.ts`
Shows SKU quantity at each snapshot:
```
DR Flower Mylar | AK 1995 (1/8oz)
Price: $7.98 (special) | Sold: 23 units | Revenue: $183.54
Qty: Dec 12 11:57 PM: 25 → Dec 13 04:49 AM: 25 → Dec 14 03:34 AM: 22 → Dec 16 07:41 PM: 2
```
### 2. Calculate Sales Between Payloads
`src/scripts/calculate-sales.ts`
Compares two payloads and outputs:
- Total units sold
- Total revenue
- Top products by revenue
- Revenue by brand
- Special vs regular pricing breakdown
### 3. Diff With Specials
`src/scripts/diff-with-specials.ts`
Shows inventory changes with special pricing context:
- New arrivals
- Sold out / removed
- Price changes
- New specials added
- Specials ended
- Special price adjustments
## MinIO Access
### Connection
```typescript
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002', // SSH tunnel to production
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
```
### SSH Tunnel (for local development)
```bash
sshpass -p 'PASSWORD' ssh -o StrictHostKeyChecking=no -f -N \
-L 9002:10.100.9.80:9000 kelly@ci.spdy.io
```
### List Payloads
```typescript
const response = await client.send(new ListObjectsV2Command({
Bucket: 'cannaiq',
Prefix: 'payloads/',
}));
```
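Fetching and parsing one of the listed objects, handling the optional `.gz` suffix noted in the path pattern (a sketch; assumes the SDK's `transformToByteArray()` stream helper is available):
```typescript
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { gunzipSync } from 'zlib';

// Download one stored payload and return the parsed JSON
async function loadPayload(client: S3Client, key: string): Promise<unknown> {
  const res = await client.send(new GetObjectCommand({ Bucket: 'cannaiq', Key: key }));
  const bytes = Buffer.from(await res.Body!.transformToByteArray());
  const json = key.endsWith('.gz') ? gunzipSync(bytes).toString('utf8') : bytes.toString('utf8');
  return JSON.parse(json);
}
```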
## Sample Output
### Deeply Rooted Phoenix (Dec 13-17, 2025)
```
Total Units Sold: 1,134
Total Revenue: $32,534.47
Avg Price/Unit: $28.69
TOP BRANDS BY REVENUE:
1. Deeply Rooted: $2,564.79 (7.9%)
2. Clout King: $2,400 (7.4%)
3. Mfused: $1,990 (6.1%)
4. Cure Injoy: $1,610 (4.9%)
5. Select: $1,411.50 (4.3%)
SPECIAL vs REGULAR:
- Special (sale): 501 units, $11,354.47 (34.9%)
- Regular: 633 units, $21,180.00 (65.1%)
```
## Payload Capture Requirements
For accurate sales calculation, payloads MUST include:
1. **`includeEnterpriseSpecials: true`** in GraphQL query variables
- Provides `recSpecialPrices` and `specialData` (see the variables sketch after this list)
2. **`POSMetaData` with children array**
- Contains per-variant quantity data
3. **Multiple snapshots over time**
- More frequent = more granular sales data
- Recommended: every 4-6 hours minimum
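For reference, the variables object might look like the sketch below. Only `includeEnterpriseSpecials` is confirmed by this doc; the other variable names are placeholders:
```typescript
// Illustrative GraphQL variables for the menu/products query
const filteredProductsVariables = {
  includeEnterpriseSpecials: true, // required for recSpecialPrices and specialData
  dispensaryId: '<store-id>',      // placeholder
  pricingType: 'rec',              // placeholder
};
```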
## Additional Data Available
### Cannabinoids (cannabinoidsV2)
844/987 products have detailed cannabinoid data:
```json
{
"cannabinoidsV2": [
{"value": 0.1, "unit": "PERCENTAGE", "cannabinoid": {"name": "CBG"}},
{"value": 0.91, "unit": "PERCENTAGE", "cannabinoid": {"name": "CBGA"}},
{"value": 0.45, "unit": "PERCENTAGE", "cannabinoid": {"name": "THC-D9"}},
{"value": 31.64, "unit": "PERCENTAGE", "cannabinoid": {"name": "THCA"}}
]
}
```
### THC Content Distribution
| Range | Products | Percentage |
|-------|----------|------------|
| <10% | 132 | 13.9% |
| 10-20% | 34 | 3.6% |
| 20-30% | 219 | 23.1% |
| 30-40% | 70 | 7.4% |
| >40% | 493 | 52.0% |
### Effects Data
403/987 products have user-reported effects:
```json
{
"effects": {
"Calm": 9,
"Happy": 8,
"Relaxed": 6,
"Energetic": 5,
"Pain-Relief": 5
}
}
```
### Strain Types
| Type | Count |
|------|-------|
| N/A | 280 |
| Hybrid | 259 |
| Indica-Hybrid | 198 |
| Sativa-Hybrid | 83 |
| Indica | 68 |
| Sativa | 44 |
### Categories & Subcategories
| Category | Count |
|----------|-------|
| Flower | 321 |
| Concentrate | 241 |
| Edible | 156 |
| Vaporizers | 143 |
| Pre-Rolls | 91 |
| Subcategory | Count |
|-------------|-------|
| gummies | 106 |
| live-resin | 88 |
| pods | 84 |
| rosin | 70 |
| singles | 64 |
| cartridges | 55 |
### Canonical/Enterprise IDs
Cross-references for matching products across stores:
- `enterpriseProductId` - Chain-level product ID
- `canonicalID` - Standard product ID
- `canonicalSKU` - SKU number
- `canonicalVendorId` - Vendor/distributor ID
- `canonicalBrandId` - Standard brand ID
- `canonicalStrainId` - Strain database ID
### Batch Tags
725/987 products have batch/lot tracking tags for compliance.
## Analytics Possibilities
### Sales Intelligence
- Revenue by brand, category, strain type
- Sales velocity (units/hour by time of day)
- Sell-through rate by product
- Inventory turnover analysis
### Pricing Intelligence
- Special/promo effectiveness (% of sales at special price)
- Price elasticity by category
- Competitor price comparison
- BOGO deal performance
### Product Intelligence
- THC content vs sales correlation
- Effect profile popularity
- Category mix optimization
- Strain popularity trends
### Inventory Intelligence
- Stock-out prediction
- Reorder point optimization
- Dead stock identification
- Seasonal demand patterns
## Future Enhancements
1. **Automated hourly collection** - Capture payloads on schedule
2. **Real-time sales dashboard** - Visualize sales velocity
3. **Brand performance alerts** - Notify when brands are selling fast
4. **Inventory prediction** - Estimate restock timing
5. **Price elasticity analysis** - Measure sales response to price changes
6. **Cross-store comparison** - Compare same brand across stores
7. **THC/effect correlation** - Analyze what sells by potency/effects
8. **Deal effectiveness** - Track BOGO/special performance

View File

@@ -0,0 +1,483 @@
# Real-Time Inventory Tracking System
## Overview
Track dispensary inventory changes at minute-level granularity to calculate:
- Real-time sales (units sold × price)
- Revenue by brand/category/store
- Stock velocity and sell-through rates
- Price change history
- Restock timing
## Architecture
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Browser Worker │────▶│ Diff Calculator │────▶│ PostgreSQL │
│ (Puppeteer) │ │ │ │ │
│ - Stealth │ │ Compare prev │ │ inventory_changes│
│ - Evomi Proxy │ │ vs current │ │ daily_snapshots │
│ - Fingerprint │ │ payload │ │ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│ │
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ MinIO/S3 │ │ Analytics API │
│ (Cold Storage) │ │ - Sales/hour │
│ - Raw payloads │ │ - Brand perf │
│ - 30-90 days │ │ - Alerts │
└─────────────────┘ └─────────────────┘
```
## Database Schema
### Table: `inventory_changes`
Tracks every quantity/price change at the SKU level.
```sql
CREATE TABLE inventory_changes (
id BIGSERIAL PRIMARY KEY,
-- Store reference
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
-- Product identification
product_id VARCHAR(50) NOT NULL, -- Dutchie product._id
canonical_id VARCHAR(50), -- POSMetaData.children[].canonicalID
canonical_sku VARCHAR(100), -- POSMetaData.children[].canonicalSKU
product_name VARCHAR(500),
brand_name VARCHAR(200),
option VARCHAR(50), -- Weight/size: "1/8oz", "1g", etc.
-- Change type
change_type VARCHAR(20) NOT NULL, -- 'sale', 'restock', 'price_change', 'new', 'removed'
-- Quantity tracking
quantity_before INTEGER,
quantity_after INTEGER,
quantity_delta INTEGER, -- Negative = sale, Positive = restock
-- Price tracking (use today's price for revenue)
price DECIMAL(10,2), -- Regular price at time of change
special_price DECIMAL(10,2), -- Sale price if on special
is_special BOOLEAN DEFAULT FALSE,
-- Calculated revenue (for sales only)
revenue DECIMAL(10,2), -- quantity_delta * effective_price
-- Metadata
category VARCHAR(100), -- Flower, Concentrate, Edible, etc.
subcategory VARCHAR(100), -- gummies, live-resin, pods, etc.
strain_type VARCHAR(50), -- Indica, Sativa, Hybrid
thc_content DECIMAL(5,2), -- THC percentage
-- Timestamps
detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
payload_timestamp TIMESTAMPTZ, -- When payload was captured
-- Indexes for common queries
CONSTRAINT inventory_changes_qty_check CHECK (
(change_type IN ('sale', 'restock') AND quantity_delta IS NOT NULL) OR
(change_type NOT IN ('sale', 'restock'))
)
);
-- Indexes for performance
CREATE INDEX idx_inventory_changes_dispensary_time
ON inventory_changes(dispensary_id, detected_at DESC);
CREATE INDEX idx_inventory_changes_brand_time
ON inventory_changes(brand_name, detected_at DESC);
CREATE INDEX idx_inventory_changes_type_time
ON inventory_changes(change_type, detected_at DESC);
CREATE INDEX idx_inventory_changes_product
ON inventory_changes(dispensary_id, product_id, option);
-- Partitioning by month for large scale
-- Consider: CREATE TABLE inventory_changes_2025_12 PARTITION OF inventory_changes ...
```
### Table: `daily_snapshots`
Full product state once per day for reconstruction.
```sql
CREATE TABLE daily_snapshots (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
snapshot_date DATE NOT NULL,
-- Store full product data as JSONB
products JSONB NOT NULL,
-- Summary stats
product_count INTEGER,
total_skus INTEGER,
total_inventory_units INTEGER,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(dispensary_id, snapshot_date)
);
CREATE INDEX idx_daily_snapshots_lookup
ON daily_snapshots(dispensary_id, snapshot_date DESC);
```
### Table: `payload_checksums`
Track what we've already processed to avoid duplicates.
```sql
CREATE TABLE payload_checksums (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
payload_hash VARCHAR(64) NOT NULL, -- SHA256 of payload
captured_at TIMESTAMPTZ NOT NULL,
processed_at TIMESTAMPTZ DEFAULT NOW(),
changes_detected INTEGER DEFAULT 0,
UNIQUE(dispensary_id, payload_hash)
);
```
## Data Flow
### 1. Payload Capture (Every Minute)
```typescript
interface PayloadCapture {
dispensaryId: number;
capturedAt: Date;
products: DutchieProduct[];
payloadHash: string;
}
```
### 2. Diff Calculation
```typescript
interface InventoryChange {
dispensaryId: number;
productId: string;
canonicalId?: string;
canonicalSku?: string;
productName: string;
brandName: string;
option: string;
changeType: 'sale' | 'restock' | 'price_change' | 'new' | 'removed';
quantityBefore?: number;
quantityAfter?: number;
quantityDelta?: number;
price: number;
specialPrice?: number;
isSpecial: boolean;
revenue?: number; // For sales: abs(quantityDelta) * effectivePrice
category?: string;
subcategory?: string;
strainType?: string;
thcContent?: number;
detectedAt: Date;
payloadTimestamp: Date;
}
```
### 3. Change Detection Logic
```typescript
function detectChanges(
prevPayload: DutchieProduct[],
currPayload: DutchieProduct[],
dispensaryId: number,
timestamp: Date
): InventoryChange[] {
const changes: InventoryChange[] = [];
const prevMap = buildVariantMap(prevPayload);
const currMap = buildVariantMap(currPayload);
// Check for sales/restocks (quantity changes)
for (const [key, curr] of currMap) {
const prev = prevMap.get(key);
if (!prev) {
// NEW product
changes.push({
changeType: 'new',
quantityAfter: curr.quantity,
...curr
});
continue;
}
const qtyDelta = curr.quantity - prev.quantity;
if (qtyDelta < 0) {
// SALE
const effectivePrice = curr.isSpecial ? curr.specialPrice : curr.price;
changes.push({
changeType: 'sale',
quantityBefore: prev.quantity,
quantityAfter: curr.quantity,
quantityDelta: qtyDelta,
revenue: Math.abs(qtyDelta) * effectivePrice,
...curr
});
} else if (qtyDelta > 0) {
// RESTOCK
changes.push({
changeType: 'restock',
quantityBefore: prev.quantity,
quantityAfter: curr.quantity,
quantityDelta: qtyDelta,
...curr
});
}
// Check for price changes (separate from qty)
if (prev.price !== curr.price || prev.specialPrice !== curr.specialPrice) {
changes.push({
changeType: 'price_change',
priceBefore: prev.price,
priceAfter: curr.price,
...curr
});
}
}
// Check for REMOVED products
for (const [key, prev] of prevMap) {
if (!currMap.has(key)) {
changes.push({
changeType: 'removed',
quantityBefore: prev.quantity,
quantityAfter: 0,
...prev
});
}
}
return changes;
}
```
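`buildVariantMap` is used above but not defined here; a minimal sketch, keying each variant by `productId::option` and pulling quantity/price from the payload fields listed at the end of this doc (the `DutchieProduct` shape is assumed to match those fields):
```typescript
interface VariantState {
  productId: string;
  productName: string;
  brandName: string;
  option: string;
  quantity: number;
  price: number;
  specialPrice?: number;
  isSpecial: boolean;
}

// Flatten a payload into a map keyed by productId::option so variants can be diffed directly
function buildVariantMap(products: DutchieProduct[]): Map<string, VariantState> {
  const map = new Map<string, VariantState>();
  for (const product of products) {
    product.POSMetaData.children.forEach((child, i) => {
      map.set(`${product.id}::${child.option}`, {
        productId: product.id,
        productName: product.Name,
        brandName: product.brand?.name ?? 'Unknown',
        option: child.option,
        quantity: child.quantity,
        price: child.price ?? product.Prices[i],
        specialPrice: product.recSpecialPrices?.[i],
        isSpecial: product.special === true,
      });
    });
  }
  return map;
}
```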
## Storage Estimates
### At 1 Payload/Minute
| Metric | 1 Store | 10 Stores | 100 Stores |
|--------|---------|-----------|------------|
| **Full Payloads** | | | |
| Daily | 828 MB | 8.1 GB | 81 GB |
| Monthly | 24 GB | 242 GB | 2.4 TB |
| **Diffs Only** | | | |
| Daily | 1.4 MB | 14 MB | 137 MB |
| Monthly | 41 MB | 412 MB | 4 GB |
| **DB Rows/Day** | 7,200 | 72,000 | 720,000 |
| **DB Rows/Year** | 2.6M | 26M | 263M |
### Recommended Approach
1. **Diffs → PostgreSQL** (queryable, ~4 GB/month for 100 stores)
2. **Raw Payloads → MinIO** (cold storage, 30-90 day retention)
3. **Daily Snapshots → PostgreSQL** (point-in-time reconstruction)
## Browser Fingerprinting Strategy
### Why Not Curl
- Curl has distinctive TLS fingerprint
- No JavaScript execution
- Missing browser-specific headers
- Easily detected and blocked
### Browser-Based Approach
```typescript
interface BrowserSession {
// Puppeteer with stealth plugin
browser: Browser;
page: Page;
// Evomi residential proxy (rotates per session)
proxy: {
host: string;
port: number;
username: string;
password: string;
geo: string; // State targeting
};
// Randomized fingerprint
fingerprint: {
userAgent: string;
viewport: { width: number; height: number };
platform: 'Windows' | 'MacOS' | 'Linux';
deviceType: 'desktop' | 'mobile' | 'tablet';
};
// Session management
createdAt: Date;
requestCount: number;
maxRequests: number; // Rotate after N requests
maxAge: number; // Rotate after N minutes
}
```
### Session Rotation Strategy
```
┌─────────────────────────────────────────────────────────────┐
│ Session Lifecycle │
├─────────────────────────────────────────────────────────────┤
│ │
│ CREATE ──▶ USE (10-15 requests) ──▶ ROTATE ──▶ CREATE │
│ │ │ │ │
│ │ ▼ │ │
│ │ • Same fingerprint │ │
│ │ • Same proxy IP │ │
│ │ • Natural timing │ │
│ │ │ │
│ ▼ ▼ │
│ • New proxy IP • New fingerprint │
│ • New fingerprint • New proxy IP │
│ • Fresh cookies • Fresh session │
│ │
└─────────────────────────────────────────────────────────────┘
```
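A sketch of the rotation check implied by this lifecycle, using the `BrowserSession` fields defined above:
```typescript
// Rotate when either the request budget or the session age limit is exceeded
function shouldRotate(session: BrowserSession): boolean {
  const ageMinutes = (Date.now() - session.createdAt.getTime()) / 60_000;
  return session.requestCount >= session.maxRequests || ageMinutes >= session.maxAge;
}
```
The worker would call this before each request and tear down / recreate the browser, proxy, and fingerprint when it returns true.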
### Request Pattern (Natural Behavior)
```typescript
async function naturalRequest(session: BrowserSession, storeUrl: string) {
// 1. Random delay (human-like)
await sleep(randomBetween(2000, 5000));
// 2. Visit store page first (establish session)
await session.page.goto(storeUrl);
await sleep(randomBetween(1000, 3000));
// 3. Scroll like a human
await humanScroll(session.page);
// 4. Make GraphQL request from browser context
// Pass the query in as an evaluate argument; outer-scope variables aren't visible inside the browser context
const payload = await session.page.evaluate(async (query: string) => {
const response = await fetch('/graphql', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ query }),
});
return response.json();
}, FILTERED_PRODUCTS_QUERY);
return payload;
}
```
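The `sleep`, `randomBetween`, and `humanScroll` helpers above aren't defined in this doc; one minimal way to write them (the scroll step sizes and pauses are arbitrary choices):
```typescript
import type { Page } from 'puppeteer';

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const randomBetween = (min: number, max: number) => min + Math.random() * (max - min);

// Scroll the page in a few uneven steps with pauses, roughly imitating a human reader
async function humanScroll(page: Page): Promise<void> {
  const steps = 3 + Math.floor(Math.random() * 4);
  for (let i = 0; i < steps; i++) {
    await page.evaluate((dy) => window.scrollBy(0, dy), Math.floor(randomBetween(300, 900)));
    await sleep(randomBetween(400, 1200));
  }
}
```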
## Analytics Queries
### Sales by Hour
```sql
SELECT
date_trunc('hour', detected_at) as hour,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue
FROM inventory_changes
WHERE change_type = 'sale'
AND dispensary_id = 112
AND detected_at > NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;
```
### Brand Performance
```sql
SELECT
brand_name,
COUNT(*) as sales_count,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
AVG(CASE WHEN is_special THEN special_price ELSE price END) as avg_price
FROM inventory_changes
WHERE change_type = 'sale'
AND dispensary_id = 112
AND detected_at > NOW() - INTERVAL '7 days'
GROUP BY brand_name
ORDER BY total_revenue DESC
LIMIT 20;
```
### Stock-Out Detection
```sql
SELECT
product_name,
brand_name,
option,
quantity_after,
detected_at
FROM inventory_changes
WHERE change_type = 'sale'
AND quantity_after = 0
AND dispensary_id = 112
AND detected_at > NOW() - INTERVAL '24 hours'
ORDER BY detected_at DESC;
```
### Price Change History
```sql
SELECT
product_name,
brand_name,
detected_at,
price as new_price,
LAG(price) OVER (PARTITION BY product_id, option ORDER BY detected_at) as old_price
FROM inventory_changes
WHERE change_type = 'price_change'
AND dispensary_id = 112
ORDER BY detected_at DESC;
```
## Implementation Files
| File | Purpose |
|------|---------|
| `migrations/XXX_inventory_changes.sql` | Database schema |
| `src/services/inventory-tracker.ts` | Diff calculation service |
| `src/services/browser-session-pool.ts` | Managed browser sessions |
| `src/tasks/handlers/realtime-inventory.ts` | Task handler |
| `src/routes/inventory-analytics.ts` | Analytics API |
## Key Data Fields from Payload
### For Sales Calculation (from `POSMetaData.children[i]`)
- `quantity` - current stock
- `quantityAvailable` - available for sale
- `price` / `recPrice` - regular price
- `option` - weight/size variant
### For Revenue (from product root)
- `special` - boolean, is on sale
- `recSpecialPrices[i]` - sale price per variant
- `Prices[i]` - regular price per variant
### For Analytics
- `brand.name` - brand
- `type` - category (Flower, Concentrate, etc.)
- `subcategory` - subcategory (gummies, live-resin, etc.)
- `strainType` - Indica/Sativa/Hybrid
- `THCContent.range[0]` - THC percentage
- `canonicalID` - for cross-store matching
- `canonicalSKU` - SKU for inventory systems

View File

@@ -0,0 +1,383 @@
-- Migration 121: Sales Analytics Materialized Views
-- Pre-computed views for sales velocity, brand market share, and store performance
-- ============================================================
-- VIEW 1: Daily Sales Estimates (per product/store)
-- Calculates delta between consecutive snapshots
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_daily_sales_estimates AS
WITH qty_deltas AS (
SELECT
dispensary_id,
product_id,
brand_name,
category,
DATE(captured_at) AS sale_date,
price_rec,
quantity_available,
LAG(quantity_available) OVER (
PARTITION BY dispensary_id, product_id
ORDER BY captured_at
) AS prev_quantity
FROM inventory_snapshots
WHERE quantity_available IS NOT NULL
AND captured_at >= NOW() - INTERVAL '30 days'
)
SELECT
dispensary_id,
product_id,
brand_name,
category,
sale_date,
AVG(price_rec) AS avg_price,
SUM(GREATEST(0, COALESCE(prev_quantity, 0) - quantity_available)) AS units_sold,
SUM(GREATEST(0, quantity_available - COALESCE(prev_quantity, 0))) AS units_restocked,
SUM(GREATEST(0, COALESCE(prev_quantity, 0) - quantity_available) * COALESCE(price_rec, 0)) AS revenue_estimate,
COUNT(*) AS snapshot_count
FROM qty_deltas
WHERE prev_quantity IS NOT NULL
GROUP BY dispensary_id, product_id, brand_name, category, sale_date;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_daily_sales_pk
ON mv_daily_sales_estimates(dispensary_id, product_id, sale_date);
CREATE INDEX IF NOT EXISTS idx_mv_daily_sales_brand
ON mv_daily_sales_estimates(brand_name, sale_date);
CREATE INDEX IF NOT EXISTS idx_mv_daily_sales_category
ON mv_daily_sales_estimates(category, sale_date);
CREATE INDEX IF NOT EXISTS idx_mv_daily_sales_date
ON mv_daily_sales_estimates(sale_date DESC);
-- ============================================================
-- VIEW 2: Brand Market Share by State
-- Weighted distribution across stores
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_brand_market_share AS
WITH brand_presence AS (
SELECT
sp.brand AS brand_name,
d.state AS state_code,
COUNT(DISTINCT sp.dispensary_id) AS stores_carrying,
COUNT(*) AS sku_count,
SUM(CASE WHEN sp.is_in_stock THEN 1 ELSE 0 END) AS in_stock_skus,
AVG(sp.price_rec) AS avg_price
FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand IS NOT NULL
AND d.state IS NOT NULL
GROUP BY sp.brand, d.state
),
state_totals AS (
SELECT
d.state AS state_code,
COUNT(DISTINCT d.id) FILTER (WHERE d.crawl_enabled) AS total_stores
FROM dispensaries d
WHERE d.state IS NOT NULL
GROUP BY d.state
)
SELECT
bp.brand_name,
bp.state_code,
bp.stores_carrying,
st.total_stores,
ROUND(bp.stores_carrying::NUMERIC * 100 / NULLIF(st.total_stores, 0), 2) AS penetration_pct,
bp.sku_count,
bp.in_stock_skus,
bp.avg_price,
NOW() AS calculated_at
FROM brand_presence bp
JOIN state_totals st ON st.state_code = bp.state_code;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_brand_market_pk
ON mv_brand_market_share(brand_name, state_code);
CREATE INDEX IF NOT EXISTS idx_mv_brand_market_state
ON mv_brand_market_share(state_code);
CREATE INDEX IF NOT EXISTS idx_mv_brand_market_penetration
ON mv_brand_market_share(penetration_pct DESC);
-- ============================================================
-- VIEW 3: SKU Velocity (30-day rolling)
-- Average daily units sold per SKU
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_sku_velocity AS
SELECT
dse.product_id,
dse.brand_name,
dse.category,
dse.dispensary_id,
d.name AS dispensary_name,
d.state AS state_code,
SUM(dse.units_sold) AS total_units_30d,
SUM(dse.revenue_estimate) AS total_revenue_30d,
COUNT(DISTINCT dse.sale_date) AS days_with_sales,
ROUND(SUM(dse.units_sold)::NUMERIC / NULLIF(COUNT(DISTINCT dse.sale_date), 0), 2) AS avg_daily_units,
AVG(dse.avg_price) AS avg_price,
CASE
WHEN SUM(dse.units_sold)::NUMERIC / NULLIF(COUNT(DISTINCT dse.sale_date), 0) >= 5 THEN 'hot'
WHEN SUM(dse.units_sold)::NUMERIC / NULLIF(COUNT(DISTINCT dse.sale_date), 0) >= 1 THEN 'steady'
WHEN SUM(dse.units_sold)::NUMERIC / NULLIF(COUNT(DISTINCT dse.sale_date), 0) >= 0.1 THEN 'slow'
ELSE 'stale'
END AS velocity_tier,
NOW() AS calculated_at
FROM mv_daily_sales_estimates dse
JOIN dispensaries d ON d.id = dse.dispensary_id
WHERE dse.sale_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY dse.product_id, dse.brand_name, dse.category, dse.dispensary_id, d.name, d.state;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_sku_velocity_pk
ON mv_sku_velocity(dispensary_id, product_id);
CREATE INDEX IF NOT EXISTS idx_mv_sku_velocity_brand
ON mv_sku_velocity(brand_name);
CREATE INDEX IF NOT EXISTS idx_mv_sku_velocity_tier
ON mv_sku_velocity(velocity_tier);
CREATE INDEX IF NOT EXISTS idx_mv_sku_velocity_state
ON mv_sku_velocity(state_code);
CREATE INDEX IF NOT EXISTS idx_mv_sku_velocity_units
ON mv_sku_velocity(total_units_30d DESC);
-- ============================================================
-- VIEW 4: Store Performance Rankings
-- Revenue estimates and brand diversity per store
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_store_performance AS
SELECT
d.id AS dispensary_id,
d.name AS dispensary_name,
d.city,
d.state AS state_code,
-- Revenue metrics from sales estimates
COALESCE(sales.total_revenue_30d, 0) AS total_revenue_30d,
COALESCE(sales.total_units_30d, 0) AS total_units_30d,
-- Inventory metrics
COUNT(DISTINCT sp.id) AS total_skus,
COUNT(DISTINCT sp.id) FILTER (WHERE sp.is_in_stock) AS in_stock_skus,
-- Brand diversity
COUNT(DISTINCT sp.brand) AS unique_brands,
COUNT(DISTINCT sp.category) AS unique_categories,
-- Pricing
AVG(sp.price_rec) AS avg_price,
-- Activity
MAX(sp.updated_at) AS last_updated,
NOW() AS calculated_at
FROM dispensaries d
LEFT JOIN store_products sp ON sp.dispensary_id = d.id
LEFT JOIN (
SELECT
dispensary_id,
SUM(revenue_estimate) AS total_revenue_30d,
SUM(units_sold) AS total_units_30d
FROM mv_daily_sales_estimates
WHERE sale_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY dispensary_id
) sales ON sales.dispensary_id = d.id
WHERE d.crawl_enabled = TRUE
GROUP BY d.id, d.name, d.city, d.state, sales.total_revenue_30d, sales.total_units_30d;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_store_perf_pk
ON mv_store_performance(dispensary_id);
CREATE INDEX IF NOT EXISTS idx_mv_store_perf_state
ON mv_store_performance(state_code);
CREATE INDEX IF NOT EXISTS idx_mv_store_perf_revenue
ON mv_store_performance(total_revenue_30d DESC);
-- ============================================================
-- VIEW 5: Weekly Category Trends
-- Category performance over time
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_category_weekly_trends AS
SELECT
dse.category,
d.state AS state_code,
DATE_TRUNC('week', dse.sale_date)::DATE AS week_start,
COUNT(DISTINCT dse.product_id) AS sku_count,
COUNT(DISTINCT dse.dispensary_id) AS store_count,
SUM(dse.units_sold) AS total_units,
SUM(dse.revenue_estimate) AS total_revenue,
AVG(dse.avg_price) AS avg_price,
NOW() AS calculated_at
FROM mv_daily_sales_estimates dse
JOIN dispensaries d ON d.id = dse.dispensary_id
WHERE dse.category IS NOT NULL
AND dse.sale_date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY dse.category, d.state, DATE_TRUNC('week', dse.sale_date);
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_cat_weekly_pk
ON mv_category_weekly_trends(category, state_code, week_start);
CREATE INDEX IF NOT EXISTS idx_mv_cat_weekly_state
ON mv_category_weekly_trends(state_code, week_start);
CREATE INDEX IF NOT EXISTS idx_mv_cat_weekly_date
ON mv_category_weekly_trends(week_start DESC);
-- ============================================================
-- VIEW 6: Product Intelligence (Hoodie-style per-product metrics)
-- Includes stock diff, days since OOS, days until stockout
-- ============================================================
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_product_intelligence AS
WITH
-- Calculate stock diff over 120 days
stock_diff AS (
SELECT
dispensary_id,
product_id,
-- Get oldest and newest quantity in last 120 days
FIRST_VALUE(quantity_available) OVER (
PARTITION BY dispensary_id, product_id
ORDER BY captured_at ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS qty_120d_ago,
LAST_VALUE(quantity_available) OVER (
PARTITION BY dispensary_id, product_id
ORDER BY captured_at ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS qty_current
FROM inventory_snapshots
WHERE captured_at >= NOW() - INTERVAL '120 days'
),
stock_diff_calc AS (
SELECT DISTINCT
dispensary_id,
product_id,
qty_current - COALESCE(qty_120d_ago, qty_current) AS stock_diff_120
FROM stock_diff
),
-- Get days since last OOS event
last_oos AS (
SELECT
dispensary_id,
product_id,
MAX(detected_at) AS last_oos_date
FROM product_visibility_events
WHERE event_type = 'oos'
GROUP BY dispensary_id, product_id
),
-- Calculate avg daily units sold (from velocity view)
velocity AS (
SELECT
dispensary_id,
product_id,
avg_daily_units
FROM mv_sku_velocity
)
SELECT
sp.dispensary_id,
d.name AS dispensary_name,
d.state AS state_code,
d.city,
sp.provider_product_id AS sku,
sp.name_raw AS product_name,
sp.brand_name_raw AS brand,
sp.category_raw AS category,
sp.is_in_stock,
sp.stock_status,
sp.stock_quantity,
sp.price_rec AS price,
sp.first_seen_at AS first_seen,
sp.last_seen_at AS last_seen,
-- Calculated fields
COALESCE(sd.stock_diff_120, 0) AS stock_diff_120,
CASE
WHEN lo.last_oos_date IS NOT NULL
THEN EXTRACT(DAY FROM NOW() - lo.last_oos_date)::INT
ELSE NULL
END AS days_since_oos,
-- Days until stockout = current stock / daily burn rate
CASE
WHEN v.avg_daily_units > 0 AND sp.stock_quantity > 0
THEN ROUND(sp.stock_quantity::NUMERIC / v.avg_daily_units)::INT
ELSE NULL
END AS days_until_stock_out,
v.avg_daily_units,
NOW() AS calculated_at
FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id
LEFT JOIN stock_diff_calc sd ON sd.dispensary_id = sp.dispensary_id
AND sd.product_id = sp.provider_product_id
LEFT JOIN last_oos lo ON lo.dispensary_id = sp.dispensary_id
AND lo.product_id = sp.provider_product_id
LEFT JOIN velocity v ON v.dispensary_id = sp.dispensary_id
AND v.product_id = sp.provider_product_id
WHERE d.crawl_enabled = TRUE;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_prod_intel_pk
ON mv_product_intelligence(dispensary_id, sku);
CREATE INDEX IF NOT EXISTS idx_mv_prod_intel_brand
ON mv_product_intelligence(brand);
CREATE INDEX IF NOT EXISTS idx_mv_prod_intel_state
ON mv_product_intelligence(state_code);
CREATE INDEX IF NOT EXISTS idx_mv_prod_intel_stock_out
ON mv_product_intelligence(days_until_stock_out ASC NULLS LAST);
CREATE INDEX IF NOT EXISTS idx_mv_prod_intel_oos
ON mv_product_intelligence(days_since_oos DESC NULLS LAST);
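-- Example (sketch): SKUs projected to stock out within a week, served by the
-- days_until_stock_out index above. The 7-day threshold is illustrative.
-- SELECT dispensary_name, brand, product_name, stock_quantity,
--        avg_daily_units, days_until_stock_out
-- FROM mv_product_intelligence
-- WHERE days_until_stock_out IS NOT NULL
--   AND days_until_stock_out <= 7
-- ORDER BY days_until_stock_out ASC
-- LIMIT 50;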
-- ============================================================
-- REFRESH FUNCTION
-- ============================================================
CREATE OR REPLACE FUNCTION refresh_sales_analytics_views()
RETURNS TABLE(view_name TEXT, rows_affected BIGINT) AS $$
DECLARE
row_count BIGINT;
BEGIN
-- Must refresh in dependency order:
-- 1. daily_sales (base view)
-- 2. sku_velocity (depends on daily_sales)
-- 3. product_intelligence (depends on sku_velocity)
-- 4. others (independent)
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales_estimates;
SELECT COUNT(*) INTO row_count FROM mv_daily_sales_estimates;
view_name := 'mv_daily_sales_estimates';
rows_affected := row_count;
RETURN NEXT;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_brand_market_share;
SELECT COUNT(*) INTO row_count FROM mv_brand_market_share;
view_name := 'mv_brand_market_share';
rows_affected := row_count;
RETURN NEXT;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_sku_velocity;
SELECT COUNT(*) INTO row_count FROM mv_sku_velocity;
view_name := 'mv_sku_velocity';
rows_affected := row_count;
RETURN NEXT;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_store_performance;
SELECT COUNT(*) INTO row_count FROM mv_store_performance;
view_name := 'mv_store_performance';
rows_affected := row_count;
RETURN NEXT;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_category_weekly_trends;
SELECT COUNT(*) INTO row_count FROM mv_category_weekly_trends;
view_name := 'mv_category_weekly_trends';
rows_affected := row_count;
RETURN NEXT;
-- Product intelligence depends on sku_velocity, so refresh last
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_product_intelligence;
SELECT COUNT(*) INTO row_count FROM mv_product_intelligence;
view_name := 'mv_product_intelligence';
rows_affected := row_count;
RETURN NEXT;
END;
$$ LANGUAGE plpgsql;
COMMENT ON FUNCTION refresh_sales_analytics_views IS
'Refresh all sales analytics materialized views. Call hourly via scheduler.';
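-- Example (sketch): invoke the refresh manually or from a scheduler job.
-- Each returned row reports a view name and its post-refresh row count.
-- SELECT * FROM refresh_sales_analytics_views();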
-- ============================================================
-- INITIAL REFRESH (populate views)
-- ============================================================
-- Note: the initial refresh must be non-concurrent (REFRESH ... CONCURRENTLY
-- requires a unique index on an already-populated view).
-- Run these manually after migration, in dependency order:
-- REFRESH MATERIALIZED VIEW mv_daily_sales_estimates;
-- REFRESH MATERIALIZED VIEW mv_brand_market_share;
-- REFRESH MATERIALIZED VIEW mv_sku_velocity;
-- REFRESH MATERIALIZED VIEW mv_store_performance;
-- REFRESH MATERIALIZED VIEW mv_category_weekly_trends;
-- REFRESH MATERIALIZED VIEW mv_product_intelligence;

View File

@@ -0,0 +1,359 @@
-- Migration 122: Market Intelligence Schema
-- Separate schema for external market data ingestion
-- Supports product, brand, and dispensary data from third-party sources
-- Create dedicated schema
CREATE SCHEMA IF NOT EXISTS market_intel;
-- ============================================================
-- BRANDS: Brand/Company Intelligence
-- ============================================================
CREATE TABLE IF NOT EXISTS market_intel.brands (
id SERIAL PRIMARY KEY,
-- Identity
brand_name VARCHAR(255) NOT NULL,
parent_brand VARCHAR(255),
parent_company VARCHAR(255),
slug VARCHAR(255),
external_id VARCHAR(255) UNIQUE, -- objectID from source
-- Details
brand_description TEXT,
brand_logo_url TEXT,
brand_url TEXT,
linkedin_url TEXT,
-- Presence
states JSONB DEFAULT '[]', -- Array of state names
active_variants INTEGER DEFAULT 0,
all_variants INTEGER DEFAULT 0,
-- Metadata
source VARCHAR(50) DEFAULT 'external',
fetched_at TIMESTAMPTZ DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX IF NOT EXISTS idx_brands_name ON market_intel.brands(brand_name);
CREATE INDEX IF NOT EXISTS idx_brands_parent ON market_intel.brands(parent_brand);
CREATE INDEX IF NOT EXISTS idx_brands_external ON market_intel.brands(external_id);
CREATE INDEX IF NOT EXISTS idx_brands_states ON market_intel.brands USING GIN(states);
-- ============================================================
-- DISPENSARIES: Dispensary/Store Intelligence
-- ============================================================
CREATE TABLE IF NOT EXISTS market_intel.dispensaries (
id SERIAL PRIMARY KEY,
-- Identity
dispensary_name VARCHAR(255) NOT NULL,
dispensary_company_name VARCHAR(255),
dispensary_company_id VARCHAR(255),
slug VARCHAR(255),
external_id VARCHAR(255) UNIQUE, -- objectID from source
-- Location
street_address VARCHAR(255),
city VARCHAR(100),
state VARCHAR(100),
postal_code VARCHAR(20),
county_name VARCHAR(100),
country_code VARCHAR(10) DEFAULT 'USA',
full_address TEXT,
latitude DECIMAL(10, 7),
longitude DECIMAL(10, 7),
timezone VARCHAR(50),
urbanicity VARCHAR(50), -- Urban, Suburban, Rural
-- Contact
phone VARCHAR(50),
email VARCHAR(255),
website TEXT,
linkedin_url TEXT,
-- License
license_number VARCHAR(100),
license_type VARCHAR(100),
-- Store Type
is_medical BOOLEAN DEFAULT FALSE,
is_recreational BOOLEAN DEFAULT FALSE,
delivery_enabled BOOLEAN DEFAULT FALSE,
curbside_pickup BOOLEAN DEFAULT FALSE,
instore_pickup BOOLEAN DEFAULT FALSE,
location_type VARCHAR(50), -- RETAIL, DELIVERY, etc.
-- Sales Estimates
estimated_daily_sales DECIMAL(12, 2),
estimated_sales DECIMAL(12, 2),
avg_daily_sales DECIMAL(12, 2),
state_sales_bucket INTEGER,
-- Customer Demographics
affluency JSONB DEFAULT '[]', -- Array of affluency segments
age_skew JSONB DEFAULT '[]', -- Array of age brackets
customer_segments JSONB DEFAULT '[]', -- Array of segment names
-- Inventory Stats
menus_count INTEGER DEFAULT 0,
menus_count_med INTEGER DEFAULT 0,
menus_count_rec INTEGER DEFAULT 0,
parent_brands JSONB DEFAULT '[]',
brand_company_names JSONB DEFAULT '[]',
-- Business Info
banner VARCHAR(255), -- Chain/banner name
business_type VARCHAR(50), -- MSO, Independent, etc.
pos_system VARCHAR(100),
atm_presence BOOLEAN DEFAULT FALSE,
tax_included BOOLEAN DEFAULT FALSE,
-- Ratings
rating DECIMAL(3, 2),
reviews_count INTEGER DEFAULT 0,
-- Status
is_closed BOOLEAN DEFAULT FALSE,
open_date TIMESTAMPTZ,
last_updated_at TIMESTAMPTZ,
-- Media
logo_url TEXT,
cover_url TEXT,
-- Metadata
source VARCHAR(50) DEFAULT 'external',
fetched_at TIMESTAMPTZ DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX IF NOT EXISTS idx_dispensaries_name ON market_intel.dispensaries(dispensary_name);
CREATE INDEX IF NOT EXISTS idx_dispensaries_state ON market_intel.dispensaries(state);
CREATE INDEX IF NOT EXISTS idx_dispensaries_city ON market_intel.dispensaries(city);
CREATE INDEX IF NOT EXISTS idx_dispensaries_external ON market_intel.dispensaries(external_id);
CREATE INDEX IF NOT EXISTS idx_dispensaries_banner ON market_intel.dispensaries(banner);
CREATE INDEX IF NOT EXISTS idx_dispensaries_business_type ON market_intel.dispensaries(business_type);
CREATE INDEX IF NOT EXISTS idx_dispensaries_geo ON market_intel.dispensaries(latitude, longitude);
CREATE INDEX IF NOT EXISTS idx_dispensaries_segments ON market_intel.dispensaries USING GIN(customer_segments);
-- ============================================================
-- PRODUCTS: Product/SKU Intelligence
-- ============================================================
CREATE TABLE IF NOT EXISTS market_intel.products (
id SERIAL PRIMARY KEY,
-- Identity
name VARCHAR(500) NOT NULL,
brand VARCHAR(255),
brand_id VARCHAR(255),
brand_company_name VARCHAR(255),
parent_brand VARCHAR(255),
external_id VARCHAR(255) UNIQUE, -- objectID from source
cm_id VARCHAR(100), -- Canonical menu ID
-- Category Hierarchy
category_0 VARCHAR(100), -- Top level: Flower, Edibles, Vapes
category_1 VARCHAR(255), -- Mid level: Flower > Pre-Rolls
category_2 VARCHAR(500), -- Detailed: Flower > Pre-Rolls > Singles
-- Cannabis Classification
cannabis_type VARCHAR(50), -- SATIVA, INDICA, HYBRID
strain VARCHAR(255),
flavor VARCHAR(255),
pack_size VARCHAR(100),
description TEXT,
-- Cannabinoids
thc_mg DECIMAL(10, 2),
cbd_mg DECIMAL(10, 2),
percent_thc DECIMAL(5, 2),
percent_cbd DECIMAL(5, 2),
-- Dispensary Context (denormalized for query performance)
master_dispensary_name VARCHAR(255),
master_dispensary_id VARCHAR(255),
dispensary_count INTEGER DEFAULT 0, -- How many stores carry this
d_state VARCHAR(100),
d_city VARCHAR(100),
d_banner VARCHAR(255),
d_business_type VARCHAR(50),
d_medical BOOLEAN,
d_recreational BOOLEAN,
-- Customer Demographics (from dispensary)
d_customer_segments JSONB DEFAULT '[]',
d_age_skew JSONB DEFAULT '[]',
d_affluency JSONB DEFAULT '[]',
d_urbanicity VARCHAR(50),
-- Stock Status
in_stock BOOLEAN DEFAULT TRUE,
last_seen_at DATE,
last_seen_at_ts BIGINT,
-- Media
img_url TEXT,
product_url TEXT,
menu_slug VARCHAR(500),
-- Geo
latitude DECIMAL(10, 7),
longitude DECIMAL(10, 7),
-- Metadata
source VARCHAR(50) DEFAULT 'external',
fetched_at TIMESTAMPTZ DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX IF NOT EXISTS idx_products_name ON market_intel.products(name);
CREATE INDEX IF NOT EXISTS idx_products_brand ON market_intel.products(brand);
CREATE INDEX IF NOT EXISTS idx_products_external ON market_intel.products(external_id);
CREATE INDEX IF NOT EXISTS idx_products_category ON market_intel.products(category_0, category_1);
CREATE INDEX IF NOT EXISTS idx_products_cannabis_type ON market_intel.products(cannabis_type);
CREATE INDEX IF NOT EXISTS idx_products_strain ON market_intel.products(strain);
CREATE INDEX IF NOT EXISTS idx_products_state ON market_intel.products(d_state);
CREATE INDEX IF NOT EXISTS idx_products_in_stock ON market_intel.products(in_stock);
CREATE INDEX IF NOT EXISTS idx_products_dispensary_count ON market_intel.products(dispensary_count DESC);
CREATE INDEX IF NOT EXISTS idx_products_segments ON market_intel.products USING GIN(d_customer_segments);
-- ============================================================
-- PRODUCT_VARIANTS: Variant-Level Data (Pricing, Stock)
-- ============================================================
CREATE TABLE IF NOT EXISTS market_intel.product_variants (
id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES market_intel.products(id) ON DELETE CASCADE,
-- Identity
variant_id VARCHAR(255) NOT NULL,
pos_sku VARCHAR(255),
pos_product_id VARCHAR(255),
pos_system VARCHAR(100),
-- Pricing
actual_price DECIMAL(10, 2),
original_price DECIMAL(10, 2),
discounted_price DECIMAL(10, 2),
-- Presentation
product_presentation VARCHAR(255), -- "100.00 mg", "3.5g", etc.
quantity DECIMAL(10, 2),
unit VARCHAR(50), -- mg, g, oz, each
-- Availability
is_medical BOOLEAN DEFAULT FALSE,
is_recreational BOOLEAN DEFAULT FALSE,
is_active BOOLEAN DEFAULT TRUE,
-- Stock Intelligence
stock_status VARCHAR(50), -- In Stock, Low Stock, Out of Stock
stock_diff_120 DECIMAL(10, 2), -- 120-day stock change
days_since_oos INTEGER,
days_until_stock_out INTEGER,
-- Timestamps
first_seen_at_ts BIGINT,
first_seen_at TIMESTAMPTZ,
last_seen_at DATE,
-- Metadata
fetched_at TIMESTAMPTZ DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(product_id, variant_id)
);
CREATE INDEX IF NOT EXISTS idx_variants_product ON market_intel.product_variants(product_id);
CREATE INDEX IF NOT EXISTS idx_variants_sku ON market_intel.product_variants(pos_sku);
CREATE INDEX IF NOT EXISTS idx_variants_stock_status ON market_intel.product_variants(stock_status);
CREATE INDEX IF NOT EXISTS idx_variants_price ON market_intel.product_variants(actual_price);
CREATE INDEX IF NOT EXISTS idx_variants_days_out ON market_intel.product_variants(days_until_stock_out);
-- ============================================================
-- FETCH_LOG: Track data fetches
-- ============================================================
CREATE TABLE IF NOT EXISTS market_intel.fetch_log (
id SERIAL PRIMARY KEY,
fetch_type VARCHAR(50) NOT NULL, -- brands, dispensaries, products
state_code VARCHAR(10),
query_params JSONB,
records_fetched INTEGER DEFAULT 0,
records_inserted INTEGER DEFAULT 0,
records_updated INTEGER DEFAULT 0,
duration_ms INTEGER,
error_message TEXT,
started_at TIMESTAMPTZ DEFAULT NOW(),
completed_at TIMESTAMPTZ
);
CREATE INDEX IF NOT EXISTS idx_fetch_log_type ON market_intel.fetch_log(fetch_type);
CREATE INDEX IF NOT EXISTS idx_fetch_log_state ON market_intel.fetch_log(state_code);
CREATE INDEX IF NOT EXISTS idx_fetch_log_started ON market_intel.fetch_log(started_at DESC);
-- ============================================================
-- HELPER VIEWS
-- ============================================================
-- Brand market presence summary
CREATE OR REPLACE VIEW market_intel.v_brand_presence AS
SELECT
b.brand_name,
b.parent_company,
b.active_variants,
b.all_variants,
jsonb_array_length(b.states) as state_count,
b.states,
b.fetched_at
FROM market_intel.brands b
ORDER BY b.active_variants DESC;
-- Dispensary sales rankings by state
CREATE OR REPLACE VIEW market_intel.v_dispensary_rankings AS
SELECT
d.dispensary_name,
d.city,
d.state,
d.banner,
d.business_type,
d.estimated_daily_sales,
d.menus_count,
d.is_medical,
d.is_recreational,
d.customer_segments,
RANK() OVER (PARTITION BY d.state ORDER BY d.estimated_daily_sales DESC NULLS LAST) as state_rank
FROM market_intel.dispensaries d
WHERE d.is_closed = FALSE;
-- Product distribution by brand and state
CREATE OR REPLACE VIEW market_intel.v_product_distribution AS
SELECT
p.brand,
p.d_state as state,
p.category_0 as category,
COUNT(*) as product_count,
COUNT(*) FILTER (WHERE p.in_stock) as in_stock_count,
AVG(p.dispensary_count) as avg_store_count,
COUNT(DISTINCT p.master_dispensary_id) as unique_stores
FROM market_intel.products p
GROUP BY p.brand, p.d_state, p.category_0;
-- ============================================================
-- COMMENTS
-- ============================================================
COMMENT ON SCHEMA market_intel IS 'Market intelligence data from external sources';
COMMENT ON TABLE market_intel.brands IS 'Brand/company data with multi-state presence';
COMMENT ON TABLE market_intel.dispensaries IS 'Dispensary data with sales estimates and demographics';
COMMENT ON TABLE market_intel.products IS 'Product/SKU data with cannabinoid and category info';
COMMENT ON TABLE market_intel.product_variants IS 'Variant-level pricing and stock data';
COMMENT ON TABLE market_intel.fetch_log IS 'Log of data fetches for monitoring';
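-- Example (sketch): idempotent ingest of a brand record keyed on external_id.
-- All values shown are illustrative, not real source data.
-- INSERT INTO market_intel.brands (brand_name, external_id, states, active_variants)
-- VALUES ('Example Brand', 'src-12345', '["Arizona","Nevada"]'::jsonb, 42)
-- ON CONFLICT (external_id) DO UPDATE
-- SET brand_name = EXCLUDED.brand_name,
--     states = EXCLUDED.states,
--     active_variants = EXCLUDED.active_variants,
--     updated_at = NOW();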

View File

@@ -0,0 +1,159 @@
-- Migration 123: Extract unmapped fields from provider_data
-- These fields exist in our crawl payloads but weren't being stored in columns
-- ============================================================
-- ADD NEW COLUMNS TO store_products
-- ============================================================
-- Cannabis classification (SATIVA, INDICA, HYBRID, CBD)
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS cannabis_type VARCHAR(50);
-- Canonical IDs from POS systems
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS canonical_strain_id VARCHAR(100);
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS canonical_vendor_id VARCHAR(100);
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS canonical_brand_id VARCHAR(100);
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS canonical_category_id VARCHAR(100);
-- Lab results
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS lab_result_url TEXT;
-- Flavors (extracted from JSONB to text array for easier querying)
ALTER TABLE store_products ADD COLUMN IF NOT EXISTS flavors_list TEXT[];
-- ============================================================
-- BACKFILL FROM provider_data
-- ============================================================
-- Backfill cannabis_type from classification
UPDATE store_products
SET cannabis_type = CASE
WHEN provider_data->>'classification' IN ('HYBRID', 'H') THEN 'HYBRID'
WHEN provider_data->>'classification' IN ('INDICA', 'I') THEN 'INDICA'
WHEN provider_data->>'classification' IN ('SATIVA', 'S') THEN 'SATIVA'
WHEN provider_data->>'classification' = 'I/S' THEN 'INDICA_DOMINANT'
WHEN provider_data->>'classification' = 'S/I' THEN 'SATIVA_DOMINANT'
WHEN provider_data->>'classification' = 'CBD' THEN 'CBD'
ELSE provider_data->>'classification'
END
WHERE provider_data->>'classification' IS NOT NULL
AND cannabis_type IS NULL;
-- Also backfill from strain_type if cannabis_type still null
UPDATE store_products
SET cannabis_type = CASE
WHEN strain_type ILIKE '%indica%hybrid%' OR strain_type ILIKE '%hybrid%indica%' THEN 'INDICA_DOMINANT'
WHEN strain_type ILIKE '%sativa%hybrid%' OR strain_type ILIKE '%hybrid%sativa%' THEN 'SATIVA_DOMINANT'
WHEN strain_type ILIKE '%indica%' THEN 'INDICA'
WHEN strain_type ILIKE '%sativa%' THEN 'SATIVA'
WHEN strain_type ILIKE '%hybrid%' THEN 'HYBRID'
WHEN strain_type ILIKE '%cbd%' THEN 'CBD'
ELSE NULL
END
WHERE strain_type IS NOT NULL
AND cannabis_type IS NULL;
-- Backfill canonical IDs from POSMetaData
UPDATE store_products
SET
canonical_strain_id = provider_data->'POSMetaData'->>'canonicalStrainId',
canonical_vendor_id = provider_data->'POSMetaData'->>'canonicalVendorId',
canonical_brand_id = provider_data->'POSMetaData'->>'canonicalBrandId',
canonical_category_id = provider_data->'POSMetaData'->>'canonicalCategoryId'
WHERE provider_data->'POSMetaData' IS NOT NULL
AND canonical_strain_id IS NULL;
-- Backfill lab result URLs
UPDATE store_products
SET lab_result_url = provider_data->'POSMetaData'->>'canonicalLabResultUrl'
WHERE provider_data->'POSMetaData'->>'canonicalLabResultUrl' IS NOT NULL
AND lab_result_url IS NULL;
-- ============================================================
-- INDEXES
-- ============================================================
CREATE INDEX IF NOT EXISTS idx_store_products_cannabis_type ON store_products(cannabis_type);
CREATE INDEX IF NOT EXISTS idx_store_products_vendor_id ON store_products(canonical_vendor_id);
CREATE INDEX IF NOT EXISTS idx_store_products_strain_id ON store_products(canonical_strain_id);
-- ============================================================
-- ADD MSO FLAG TO DISPENSARIES
-- ============================================================
-- Multi-State Operator flag (calculated from chain presence in multiple states)
ALTER TABLE dispensaries ADD COLUMN IF NOT EXISTS is_mso BOOLEAN DEFAULT FALSE;
-- Update MSO flag based on chain presence in multiple states
WITH mso_chains AS (
SELECT chain_id
FROM dispensaries
WHERE chain_id IS NOT NULL
GROUP BY chain_id
HAVING COUNT(DISTINCT state) > 1
)
UPDATE dispensaries d
SET is_mso = TRUE
WHERE d.chain_id IN (SELECT chain_id FROM mso_chains);
-- Index for MSO queries
CREATE INDEX IF NOT EXISTS idx_dispensaries_is_mso ON dispensaries(is_mso) WHERE is_mso = TRUE;
-- ============================================================
-- PRODUCT DISTRIBUTION VIEW
-- ============================================================
-- View: How many stores carry each product (by brand + canonical name)
CREATE OR REPLACE VIEW v_product_distribution AS
SELECT
sp.brand_name_raw as brand,
sp.c_name as product_canonical_name,
COUNT(DISTINCT sp.dispensary_id) as store_count,
COUNT(DISTINCT d.state) as state_count,
ARRAY_AGG(DISTINCT d.state) as states,
AVG(sp.price_rec) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price
FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.c_name IS NOT NULL
AND sp.brand_name_raw IS NOT NULL
AND sp.is_in_stock = TRUE
GROUP BY sp.brand_name_raw, sp.c_name
HAVING COUNT(DISTINCT sp.dispensary_id) > 1
ORDER BY store_count DESC;
-- ============================================================
-- MSO SUMMARY VIEW
-- ============================================================
CREATE OR REPLACE VIEW v_mso_summary AS
SELECT
c.name as chain_name,
COUNT(DISTINCT d.id) as store_count,
COUNT(DISTINCT d.state) as state_count,
ARRAY_AGG(DISTINCT d.state ORDER BY d.state) as states,
SUM(d.product_count) as total_products,
TRUE as is_mso
FROM dispensaries d
JOIN chains c ON c.id = d.chain_id
WHERE d.chain_id IN (
SELECT chain_id
FROM dispensaries
WHERE chain_id IS NOT NULL
GROUP BY chain_id
HAVING COUNT(DISTINCT state) > 1
)
GROUP BY c.id, c.name
ORDER BY state_count DESC, store_count DESC;
-- ============================================================
-- COMMENTS
-- ============================================================
COMMENT ON COLUMN store_products.cannabis_type IS 'Normalized cannabis classification: SATIVA, INDICA, HYBRID, INDICA_DOMINANT, SATIVA_DOMINANT, CBD';
COMMENT ON COLUMN store_products.canonical_strain_id IS 'POS system strain identifier for cross-store matching';
COMMENT ON COLUMN store_products.canonical_vendor_id IS 'POS system vendor/supplier identifier';
COMMENT ON COLUMN store_products.lab_result_url IS 'Link to Certificate of Analysis / lab test results';
COMMENT ON COLUMN dispensaries.is_mso IS 'Multi-State Operator: chain operates in 2+ states';
COMMENT ON VIEW v_product_distribution IS 'Shows how many stores carry each product for distribution analysis';
COMMENT ON VIEW v_mso_summary IS 'Summary of multi-state operator chains';
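-- Example (sketch): cannabis_type mix for one brand's in-stock SKUs using the
-- new column. The brand value is illustrative.
-- SELECT cannabis_type, COUNT(*) AS sku_count
-- FROM store_products
-- WHERE brand_name_raw = 'Example Brand'
--   AND is_in_stock = TRUE
-- GROUP BY cannabis_type
-- ORDER BY sku_count DESC;
-- Quick look at multi-state operators:
-- SELECT * FROM v_mso_summary LIMIT 10;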

View File

@@ -0,0 +1,73 @@
-- Migration 124: Convert inventory_snapshots to TimescaleDB hypertable
-- Requires: CREATE EXTENSION timescaledb; (run after installing TimescaleDB)
-- ============================================================
-- STEP 1: Enable TimescaleDB extension
-- ============================================================
CREATE EXTENSION IF NOT EXISTS timescaledb;
-- ============================================================
-- STEP 2: Convert to hypertable
-- ============================================================
-- Note: Table must have a time column and no foreign key constraints
-- First, drop any foreign keys if they exist
ALTER TABLE inventory_snapshots DROP CONSTRAINT IF EXISTS inventory_snapshots_dispensary_id_fkey;
-- Convert to hypertable, partitioned by captured_at (1 day chunks)
SELECT create_hypertable(
'inventory_snapshots',
'captured_at',
chunk_time_interval => INTERVAL '1 day',
if_not_exists => TRUE,
migrate_data => TRUE
);
-- ============================================================
-- STEP 3: Enable compression
-- ============================================================
-- Compress by dispensary_id and product_id (common query patterns)
ALTER TABLE inventory_snapshots SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'dispensary_id, product_id',
timescaledb.compress_orderby = 'captured_at DESC'
);
-- ============================================================
-- STEP 4: Compression policy (compress chunks older than 1 day)
-- ============================================================
SELECT add_compression_policy('inventory_snapshots', INTERVAL '1 day');
-- ============================================================
-- STEP 5: Retention policy (optional - drop chunks older than 90 days)
-- ============================================================
-- Uncomment if you want automatic cleanup:
-- SELECT add_retention_policy('inventory_snapshots', INTERVAL '90 days');
-- ============================================================
-- STEP 6: Optimize indexes for time-series queries
-- ============================================================
-- TimescaleDB automatically creates time-based indexes
-- Add composite index for common queries
CREATE INDEX IF NOT EXISTS idx_snapshots_disp_prod_time
ON inventory_snapshots (dispensary_id, product_id, captured_at DESC);
-- ============================================================
-- VERIFICATION QUERIES (run after migration)
-- ============================================================
-- Check hypertable status:
-- SELECT * FROM timescaledb_information.hypertables WHERE hypertable_name = 'inventory_snapshots';
-- Check compression status:
-- SELECT * FROM timescaledb_information.compression_settings WHERE hypertable_name = 'inventory_snapshots';
-- Check chunk sizes:
-- SELECT chunk_name, pg_size_pretty(before_compression_total_bytes) as before,
-- pg_size_pretty(after_compression_total_bytes) as after,
-- round(100 - (after_compression_total_bytes::numeric / before_compression_total_bytes * 100), 1) as compression_pct
-- FROM chunk_compression_stats('inventory_snapshots');
-- ============================================================
-- COMMENTS
-- ============================================================
COMMENT ON TABLE inventory_snapshots IS 'TimescaleDB hypertable for inventory time-series data. Compressed after 1 day.';
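-- Example (sketch): time-bounded reads like this let TimescaleDB exclude
-- irrelevant chunks, so only the matching 1-day partitions are scanned.
-- The dispensary id is illustrative.
-- SELECT product_id, quantity_available, captured_at
-- FROM inventory_snapshots
-- WHERE dispensary_id = 123
--   AND captured_at >= NOW() - INTERVAL '7 days'
-- ORDER BY captured_at DESC;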

View File

@@ -0,0 +1,402 @@
-- Migration 125: Delta-only inventory snapshots
-- Only store a row when something meaningful changes
-- Revenue calculated as: effective_price × qty_sold
-- ============================================================
-- ADD DELTA TRACKING COLUMNS
-- ============================================================
-- Previous values (to show what changed)
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS prev_quantity INTEGER;
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS prev_price_rec DECIMAL(10,2);
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS prev_price_med DECIMAL(10,2);
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS prev_status VARCHAR(50);
-- Calculated deltas
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS qty_delta INTEGER; -- negative = sold, positive = restocked
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS price_delta DECIMAL(10,2);
-- Change type flags
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS change_type VARCHAR(50); -- 'sale', 'restock', 'price_change', 'oos', 'back_in_stock'
-- ============================================================
-- INDEX FOR CHANGE TYPE QUERIES
-- ============================================================
CREATE INDEX IF NOT EXISTS idx_snapshots_change_type ON inventory_snapshots(change_type);
CREATE INDEX IF NOT EXISTS idx_snapshots_qty_delta ON inventory_snapshots(qty_delta) WHERE qty_delta != 0;
-- ============================================================
-- VIEW: Latest product state (for delta comparison)
-- ============================================================
CREATE OR REPLACE VIEW v_product_latest_state AS
SELECT DISTINCT ON (dispensary_id, product_id)
dispensary_id,
product_id,
quantity_available,
price_rec,
price_med,
status,
captured_at
FROM inventory_snapshots
ORDER BY dispensary_id, product_id, captured_at DESC;
-- ============================================================
-- FUNCTION: Check if product state changed
-- ============================================================
CREATE OR REPLACE FUNCTION should_capture_snapshot(
p_dispensary_id INTEGER,
p_product_id TEXT,
p_quantity INTEGER,
p_price_rec DECIMAL,
p_price_med DECIMAL,
p_status VARCHAR
) RETURNS TABLE (
should_capture BOOLEAN,
prev_quantity INTEGER,
prev_price_rec DECIMAL,
prev_price_med DECIMAL,
prev_status VARCHAR,
qty_delta INTEGER,
price_delta DECIMAL,
change_type VARCHAR
) AS $$
DECLARE
v_prev RECORD;
BEGIN
-- Get previous state
SELECT
ls.quantity_available,
ls.price_rec,
ls.price_med,
ls.status
INTO v_prev
FROM v_product_latest_state ls
WHERE ls.dispensary_id = p_dispensary_id
AND ls.product_id = p_product_id;
-- First time seeing this product
IF NOT FOUND THEN
RETURN QUERY SELECT
TRUE,
NULL::INTEGER,
NULL::DECIMAL,
NULL::DECIMAL,
NULL::VARCHAR,
NULL::INTEGER,
NULL::DECIMAL,
'new_product'::VARCHAR;
RETURN;
END IF;
-- Check for changes
IF v_prev.quantity_available IS DISTINCT FROM p_quantity
OR v_prev.price_rec IS DISTINCT FROM p_price_rec
OR v_prev.price_med IS DISTINCT FROM p_price_med
OR v_prev.status IS DISTINCT FROM p_status THEN
RETURN QUERY SELECT
TRUE,
v_prev.quantity_available,
v_prev.price_rec,
v_prev.price_med,
v_prev.status,
COALESCE(p_quantity, 0) - COALESCE(v_prev.quantity_available, 0),
COALESCE(p_price_rec, 0) - COALESCE(v_prev.price_rec, 0),
CASE
WHEN COALESCE(p_quantity, 0) < COALESCE(v_prev.quantity_available, 0) THEN 'sale'
WHEN COALESCE(p_quantity, 0) > COALESCE(v_prev.quantity_available, 0) THEN 'restock'
WHEN p_quantity = 0 AND v_prev.quantity_available > 0 THEN 'oos'
WHEN p_quantity > 0 AND v_prev.quantity_available = 0 THEN 'back_in_stock'
WHEN p_price_rec IS DISTINCT FROM v_prev.price_rec THEN 'price_change'
ELSE 'status_change'
END;
RETURN;
END IF;
-- No change
RETURN QUERY SELECT
FALSE,
NULL::INTEGER,
NULL::DECIMAL,
NULL::DECIMAL,
NULL::VARCHAR,
NULL::INTEGER,
NULL::DECIMAL,
NULL::VARCHAR;
END;
$$ LANGUAGE plpgsql;
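-- Example (sketch): how a crawler might consult the function before writing a
-- snapshot row. All argument values are illustrative.
-- SELECT should_capture, qty_delta, change_type
-- FROM should_capture_snapshot(123, 'prod-abc', 8, 25.00, 22.00, 'Active');
-- should_capture is TRUE with change_type 'sale' when the quantity dropped
-- since the last captured state.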
-- ============================================================
-- REVENUE CALCULATION COLUMNS
-- ============================================================
-- Effective prices (sale price if on special, otherwise regular)
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS effective_price_rec DECIMAL(10,2);
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS effective_price_med DECIMAL(10,2);
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS is_on_special BOOLEAN DEFAULT FALSE;
-- Revenue by market type
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS revenue_rec DECIMAL(10,2);
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS revenue_med DECIMAL(10,2);
-- Time between snapshots (for velocity calc)
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS time_since_last_snapshot INTERVAL;
ALTER TABLE inventory_snapshots ADD COLUMN IF NOT EXISTS hours_since_last DECIMAL(10,2);
-- ============================================================
-- VIEW: Hourly Sales Velocity
-- ============================================================
CREATE OR REPLACE VIEW v_hourly_sales AS
SELECT
dispensary_id,
DATE(captured_at) as sale_date,
EXTRACT(HOUR FROM captured_at) as sale_hour,
COUNT(*) FILTER (WHERE qty_delta < 0) as transactions,
SUM(ABS(qty_delta)) FILTER (WHERE qty_delta < 0) as units_sold,
SUM(revenue_estimate) FILTER (WHERE qty_delta < 0) as revenue,
COUNT(DISTINCT product_id) FILTER (WHERE qty_delta < 0) as unique_products_sold
FROM inventory_snapshots
WHERE change_type = 'sale'
GROUP BY dispensary_id, DATE(captured_at), EXTRACT(HOUR FROM captured_at);
-- ============================================================
-- VIEW: Daily Sales by Store
-- ============================================================
CREATE OR REPLACE VIEW v_daily_store_sales AS
SELECT
s.dispensary_id,
d.name as store_name,
d.state,
DATE(s.captured_at) as sale_date,
SUM(ABS(s.qty_delta)) as units_sold,
SUM(s.revenue_estimate) as revenue,
COUNT(*) as sale_events,
COUNT(DISTINCT s.product_id) as unique_products
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE s.change_type = 'sale'
GROUP BY s.dispensary_id, d.name, d.state, DATE(s.captured_at);
-- ============================================================
-- VIEW: Daily Sales by Brand
-- ============================================================
CREATE OR REPLACE VIEW v_daily_brand_sales AS
SELECT
s.brand_name,
d.state,
DATE(s.captured_at) as sale_date,
SUM(ABS(s.qty_delta)) as units_sold,
SUM(s.revenue_estimate) as revenue,
COUNT(DISTINCT s.dispensary_id) as stores_with_sales,
COUNT(DISTINCT s.product_id) as unique_skus_sold
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE s.change_type = 'sale'
AND s.brand_name IS NOT NULL
GROUP BY s.brand_name, d.state, DATE(s.captured_at);
-- ============================================================
-- VIEW: Product Velocity Rankings
-- ============================================================
CREATE OR REPLACE VIEW v_product_velocity AS
SELECT
s.product_id,
s.brand_name,
s.category,
s.dispensary_id,
d.name as store_name,
d.state,
SUM(ABS(s.qty_delta)) as units_sold_30d,
SUM(s.revenue_estimate) as revenue_30d,
COUNT(*) as sale_events,
ROUND(SUM(ABS(s.qty_delta))::NUMERIC / NULLIF(COUNT(DISTINCT DATE(s.captured_at)), 0), 2) as avg_daily_units,
ROUND(SUM(s.revenue_estimate) / NULLIF(COUNT(DISTINCT DATE(s.captured_at)), 0), 2) as avg_daily_revenue,
CASE
WHEN SUM(ABS(s.qty_delta)) / NULLIF(COUNT(DISTINCT DATE(s.captured_at)), 0) >= 10 THEN 'hot'
WHEN SUM(ABS(s.qty_delta)) / NULLIF(COUNT(DISTINCT DATE(s.captured_at)), 0) >= 3 THEN 'steady'
WHEN SUM(ABS(s.qty_delta)) / NULLIF(COUNT(DISTINCT DATE(s.captured_at)), 0) >= 1 THEN 'slow'
ELSE 'stale'
END as velocity_tier
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE s.change_type = 'sale'
AND s.captured_at >= NOW() - INTERVAL '30 days'
GROUP BY s.product_id, s.brand_name, s.category, s.dispensary_id, d.name, d.state;
-- ============================================================
-- VIEW: Busiest Hours by Store
-- ============================================================
CREATE OR REPLACE VIEW v_busiest_hours AS
SELECT
dispensary_id,
sale_hour,
AVG(units_sold) as avg_units_per_hour,
AVG(revenue) as avg_revenue_per_hour,
SUM(units_sold) as total_units,
SUM(revenue) as total_revenue,
COUNT(*) as days_with_data,
RANK() OVER (PARTITION BY dispensary_id ORDER BY AVG(revenue) DESC) as hour_rank
FROM v_hourly_sales
GROUP BY dispensary_id, sale_hour;
-- ============================================================
-- VIEW: Promotion Effectiveness (compare sale vs non-sale prices)
-- ============================================================
CREATE OR REPLACE VIEW v_promotion_effectiveness AS
SELECT
s.dispensary_id,
d.name as store_name,
s.product_id,
s.brand_name,
DATE(s.captured_at) as sale_date,
SUM(ABS(s.qty_delta)) FILTER (WHERE s.price_rec < s.prev_price_rec) as units_on_discount,
SUM(ABS(s.qty_delta)) FILTER (WHERE s.price_rec >= COALESCE(s.prev_price_rec, s.price_rec)) as units_full_price,
SUM(s.revenue_estimate) FILTER (WHERE s.price_rec < s.prev_price_rec) as revenue_discounted,
SUM(s.revenue_estimate) FILTER (WHERE s.price_rec >= COALESCE(s.prev_price_rec, s.price_rec)) as revenue_full_price
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE s.change_type = 'sale'
GROUP BY s.dispensary_id, d.name, s.product_id, s.brand_name, DATE(s.captured_at);
-- ============================================================
-- COMMENTS
-- ============================================================
COMMENT ON COLUMN inventory_snapshots.qty_delta IS 'Quantity change: negative=sold, positive=restocked';
COMMENT ON COLUMN inventory_snapshots.revenue_estimate IS 'Estimated revenue: ABS(qty_delta) * price_rec when qty_delta < 0';
COMMENT ON COLUMN inventory_snapshots.change_type IS 'Type of change: sale, restock, price_change, oos, back_in_stock, new_product';
COMMENT ON FUNCTION should_capture_snapshot IS 'Returns whether a snapshot should be captured and delta values';
COMMENT ON VIEW v_hourly_sales IS 'Sales aggregated by hour - find busiest times';
COMMENT ON VIEW v_daily_store_sales IS 'Daily revenue by store';
COMMENT ON VIEW v_daily_brand_sales IS 'Daily brand performance by state';
COMMENT ON VIEW v_product_velocity IS 'Product sales velocity rankings (hot/steady/slow/stale)';
COMMENT ON VIEW v_busiest_hours IS 'Rank hours by sales volume per store';
-- ============================================================
-- VIEW: Days Until Stock Out (Predictive)
-- ============================================================
CREATE OR REPLACE VIEW v_stock_out_prediction AS
WITH velocity AS (
SELECT
dispensary_id,
product_id,
brand_name,
-- Average units sold per day (last 7 days)
ROUND(SUM(ABS(qty_delta))::NUMERIC / NULLIF(COUNT(DISTINCT DATE(captured_at)), 0), 2) as daily_velocity,
-- Hours between sales
AVG(hours_since_last) FILTER (WHERE qty_delta < 0) as avg_hours_between_sales
FROM inventory_snapshots
WHERE change_type = 'sale'
AND captured_at >= NOW() - INTERVAL '7 days'
GROUP BY dispensary_id, product_id, brand_name
),
current_stock AS (
SELECT DISTINCT ON (dispensary_id, product_id)
dispensary_id,
product_id,
quantity_available as current_qty,
captured_at as last_seen
FROM inventory_snapshots
ORDER BY dispensary_id, product_id, captured_at DESC
)
SELECT
cs.dispensary_id,
d.name as store_name,
cs.product_id,
v.brand_name,
cs.current_qty,
v.daily_velocity,
CASE
WHEN v.daily_velocity > 0 THEN ROUND(cs.current_qty / v.daily_velocity, 1)
ELSE NULL
END as days_until_stock_out,
CASE
WHEN v.daily_velocity > 0 AND cs.current_qty / v.daily_velocity <= 3 THEN 'critical'
WHEN v.daily_velocity > 0 AND cs.current_qty / v.daily_velocity <= 7 THEN 'low'
WHEN v.daily_velocity > 0 AND cs.current_qty / v.daily_velocity <= 14 THEN 'moderate'
ELSE 'healthy'
END as stock_health,
cs.last_seen
FROM current_stock cs
JOIN velocity v ON v.dispensary_id = cs.dispensary_id AND v.product_id = cs.product_id
JOIN dispensaries d ON d.id = cs.dispensary_id
WHERE cs.current_qty > 0
AND v.daily_velocity > 0;
-- ============================================================
-- VIEW: Days Since OOS (for products currently out of stock)
-- ============================================================
CREATE OR REPLACE VIEW v_days_since_oos AS
SELECT
s.dispensary_id,
d.name as store_name,
s.product_id,
s.brand_name,
s.captured_at as went_oos_at,
EXTRACT(EPOCH FROM (NOW() - s.captured_at)) / 86400 as days_since_oos,
s.prev_quantity as last_known_qty
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE s.change_type = 'oos'
AND NOT EXISTS (
-- No back_in_stock event after this OOS
SELECT 1 FROM inventory_snapshots s2
WHERE s2.dispensary_id = s.dispensary_id
AND s2.product_id = s.product_id
AND s2.change_type = 'back_in_stock'
AND s2.captured_at > s.captured_at
);
-- ============================================================
-- VIEW: Brand Variant Counts (track brand growth)
-- ============================================================
CREATE OR REPLACE VIEW v_brand_variants AS
SELECT
sp.brand_name_raw as brand_name,
d.state,
COUNT(DISTINCT sp.id) as total_variants,
COUNT(DISTINCT sp.id) FILTER (WHERE sp.is_in_stock = TRUE) as active_variants,
COUNT(DISTINCT sp.id) FILTER (WHERE sp.is_in_stock = FALSE) as inactive_variants,
COUNT(DISTINCT sp.dispensary_id) as stores_carrying,
COUNT(DISTINCT sp.category_raw) as categories,
MIN(sp.first_seen_at) as brand_first_seen,
MAX(sp.last_seen_at) as brand_last_seen
FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name_raw IS NOT NULL
GROUP BY sp.brand_name_raw, d.state;
-- ============================================================
-- VIEW: Brand Growth (compare variant counts over time)
-- ============================================================
CREATE OR REPLACE VIEW v_brand_growth AS
WITH weekly_counts AS (
SELECT
brand_name_raw as brand_name,
DATE_TRUNC('week', last_seen_at) as week,
COUNT(DISTINCT id) as variant_count
FROM store_products
WHERE brand_name_raw IS NOT NULL
AND last_seen_at >= NOW() - INTERVAL '90 days'
GROUP BY brand_name_raw, DATE_TRUNC('week', last_seen_at)
)
SELECT
w1.brand_name,
w1.week as current_week,
w1.variant_count as current_variants,
w2.variant_count as prev_week_variants,
w1.variant_count - COALESCE(w2.variant_count, 0) as variant_change,
CASE
WHEN w2.variant_count IS NULL THEN 'new'
WHEN w1.variant_count > w2.variant_count THEN 'growing'
WHEN w1.variant_count < w2.variant_count THEN 'declining'
ELSE 'stable'
END as growth_status
FROM weekly_counts w1
LEFT JOIN weekly_counts w2
ON w2.brand_name = w1.brand_name
AND w2.week = w1.week - INTERVAL '1 week'
ORDER BY w1.brand_name, w1.week DESC;
COMMENT ON VIEW v_stock_out_prediction IS 'Predict days until stock out based on velocity';
COMMENT ON VIEW v_days_since_oos IS 'Products currently OOS and how long they have been out';
COMMENT ON VIEW v_brand_variants IS 'Active vs inactive SKU counts per brand per state';
COMMENT ON VIEW v_brand_growth IS 'Week-over-week brand variant growth tracking';
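-- Example (sketch): brands adding variants week over week, and SKUs at risk.
-- SELECT brand_name, current_week, current_variants, variant_change
-- FROM v_brand_growth
-- WHERE growth_status = 'growing'
-- ORDER BY variant_change DESC
-- LIMIT 20;
-- SELECT store_name, product_id, current_qty, days_until_stock_out
-- FROM v_stock_out_prediction
-- WHERE stock_health IN ('critical', 'low')
-- ORDER BY days_until_stock_out ASC;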

View File

@@ -0,0 +1,53 @@
-- Migration 126: Set AZ stores to 5-minute high-frequency crawls
-- Other states default to 60-minute (1 hour) intervals
-- ============================================================
-- SET AZ STORES TO 5-MINUTE INTERVALS (with 3-min jitter)
-- ============================================================
-- Base interval: 5 minutes
-- Jitter: +/- 3 minutes (so 2-8 minute effective range)
UPDATE dispensaries
SET
crawl_interval_minutes = 5,
next_crawl_at = NOW() + (RANDOM() * INTERVAL '5 minutes') -- Stagger initial crawls
WHERE state = 'AZ'
AND crawl_enabled = TRUE;
-- ============================================================
-- SET OTHER STATES TO 60-MINUTE INTERVALS (with 3-min jitter)
-- ============================================================
UPDATE dispensaries
SET
crawl_interval_minutes = 60,
next_crawl_at = NOW() + (RANDOM() * INTERVAL '60 minutes') -- Stagger initial crawls
WHERE state != 'AZ'
AND crawl_enabled = TRUE
AND crawl_interval_minutes IS NULL;
-- ============================================================
-- VERIFY RESULTS
-- ============================================================
-- SELECT state, crawl_interval_minutes, COUNT(*)
-- FROM dispensaries
-- WHERE crawl_enabled = TRUE
-- GROUP BY state, crawl_interval_minutes
-- ORDER BY state;
-- ============================================================
-- CREATE VIEW FOR MONITORING CRAWL LOAD
-- ============================================================
CREATE OR REPLACE VIEW v_crawl_load AS
SELECT
state,
crawl_interval_minutes,
COUNT(*) as store_count,
-- Crawls per hour = stores * (60 / interval)
ROUND(COUNT(*) * (60.0 / COALESCE(crawl_interval_minutes, 60))) as crawls_per_hour,
-- Assuming 30 sec per crawl, workers needed = crawls_per_hour / 120
ROUND(COUNT(*) * (60.0 / COALESCE(crawl_interval_minutes, 60)) / 120, 1) as workers_needed
FROM dispensaries
WHERE crawl_enabled = TRUE
GROUP BY state, crawl_interval_minutes
ORDER BY crawls_per_hour DESC;
COMMENT ON VIEW v_crawl_load IS 'Monitor crawl load by state and interval';
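-- Worked example (illustrative numbers): 120 AZ stores at a 5-minute interval
-- produce 120 * (60 / 5) = 1,440 crawls/hour; at ~30 sec per crawl one worker
-- handles ~120 crawls/hour, so workers_needed = 1,440 / 120 = 12.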

View File

@@ -0,0 +1,164 @@
-- Migration 127: Fix worker task concurrency limit
-- Problem: claim_task function checks session_task_count but never increments it
-- Solution: Increment on claim, decrement on complete/fail/release
-- =============================================================================
-- STEP 1: Set max tasks to 5 for all workers
-- =============================================================================
UPDATE worker_registry SET session_max_tasks = 5;
-- Set default to 5 for new workers
ALTER TABLE worker_registry ALTER COLUMN session_max_tasks SET DEFAULT 5;
-- =============================================================================
-- STEP 2: Reset all session_task_count to match actual active tasks
-- =============================================================================
UPDATE worker_registry wr SET session_task_count = (
SELECT COUNT(*) FROM worker_tasks wt
WHERE wt.worker_id = wr.worker_id
AND wt.status IN ('claimed', 'running')
);
-- =============================================================================
-- STEP 3: Update claim_task function to increment session_task_count
-- =============================================================================
CREATE OR REPLACE FUNCTION claim_task(
p_role VARCHAR(50),
p_worker_id VARCHAR(100),
p_curl_passed BOOLEAN DEFAULT TRUE,
p_http_passed BOOLEAN DEFAULT FALSE
) RETURNS worker_tasks AS $$
DECLARE
claimed_task worker_tasks;
worker_state VARCHAR(2);
session_valid BOOLEAN;
session_tasks INT;
max_tasks INT;
BEGIN
-- Get worker's current geo session info
SELECT
current_state,
session_task_count,
session_max_tasks,
(geo_session_started_at IS NOT NULL AND geo_session_started_at > NOW() - INTERVAL '60 minutes')
INTO worker_state, session_tasks, max_tasks, session_valid
FROM worker_registry
WHERE worker_id = p_worker_id;
-- Check if worker has reached max concurrent tasks (default 5)
IF session_tasks >= COALESCE(max_tasks, 5) THEN
RETURN NULL;
END IF;
-- If no valid geo session, or session expired, worker can't claim tasks
-- Worker must re-qualify first
IF worker_state IS NULL OR NOT session_valid THEN
RETURN NULL;
END IF;
-- Claim task matching worker's state
UPDATE worker_tasks
SET
status = 'claimed',
worker_id = p_worker_id,
claimed_at = NOW(),
updated_at = NOW()
WHERE id = (
SELECT wt.id FROM worker_tasks wt
JOIN dispensaries d ON wt.dispensary_id = d.id
WHERE wt.role = p_role
AND wt.status = 'pending'
AND (wt.scheduled_for IS NULL OR wt.scheduled_for <= NOW())
-- GEO FILTER: Task's dispensary must match worker's state
AND d.state = worker_state
-- Method compatibility: worker must have passed the required preflight
AND (
wt.method IS NULL -- No preference, any worker can claim
OR (wt.method = 'curl' AND p_curl_passed = TRUE)
OR (wt.method = 'http' AND p_http_passed = TRUE)
)
-- Exclude stores that already have an active task
AND (wt.dispensary_id IS NULL OR wt.dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY wt.priority DESC, wt.created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING * INTO claimed_task;
-- INCREMENT session_task_count if we claimed a task
IF claimed_task.id IS NOT NULL THEN
UPDATE worker_registry
SET session_task_count = session_task_count + 1
WHERE worker_id = p_worker_id;
END IF;
RETURN claimed_task;
END;
$$ LANGUAGE plpgsql;
-- =============================================================================
-- STEP 4: Create trigger to decrement on task completion/failure/release
-- =============================================================================
CREATE OR REPLACE FUNCTION decrement_worker_task_count()
RETURNS TRIGGER AS $$
BEGIN
-- Only decrement when task was assigned to a worker and is now complete/released
IF OLD.worker_id IS NOT NULL AND OLD.status IN ('claimed', 'running') THEN
-- Task completed/failed/released - decrement count
IF NEW.status IN ('pending', 'completed', 'failed') OR NEW.worker_id IS NULL THEN
UPDATE worker_registry
SET session_task_count = GREATEST(0, session_task_count - 1)
WHERE worker_id = OLD.worker_id;
END IF;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Drop existing trigger if any
DROP TRIGGER IF EXISTS trg_decrement_worker_task_count ON worker_tasks;
-- Create trigger on UPDATE (status change or worker_id cleared)
CREATE TRIGGER trg_decrement_worker_task_count
AFTER UPDATE ON worker_tasks
FOR EACH ROW
EXECUTE FUNCTION decrement_worker_task_count();
-- Also handle DELETE (completed tasks are deleted from pool)
CREATE OR REPLACE FUNCTION decrement_worker_task_count_delete()
RETURNS TRIGGER AS $$
BEGIN
IF OLD.worker_id IS NOT NULL AND OLD.status IN ('claimed', 'running') THEN
UPDATE worker_registry
SET session_task_count = GREATEST(0, session_task_count - 1)
WHERE worker_id = OLD.worker_id;
END IF;
RETURN OLD;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS trg_decrement_worker_task_count_delete ON worker_tasks;
CREATE TRIGGER trg_decrement_worker_task_count_delete
AFTER DELETE ON worker_tasks
FOR EACH ROW
EXECUTE FUNCTION decrement_worker_task_count_delete();
-- =============================================================================
-- STEP 5: Verify current state
-- =============================================================================
SELECT
wr.worker_id,
wr.friendly_name,
wr.session_task_count,
wr.session_max_tasks,
(SELECT COUNT(*) FROM worker_tasks wt WHERE wt.worker_id = wr.worker_id AND wt.status IN ('claimed', 'running')) as actual_count
FROM worker_registry wr
WHERE wr.status = 'active'
ORDER BY wr.friendly_name;
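-- Example (sketch): a worker claiming its next task. The role and worker id
-- are illustrative values.
-- SELECT * FROM claim_task('crawl', 'worker-az-01');
-- Returns the claimed task row, or all-NULL columns when the worker is at its
-- concurrency limit, has no valid geo session, or no matching task is pending.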

View File

@@ -0,0 +1,109 @@
-- Migration 128: Pool configuration table
-- Controls whether workers can claim tasks from the pool
CREATE TABLE IF NOT EXISTS pool_config (
id SERIAL PRIMARY KEY,
pool_open BOOLEAN NOT NULL DEFAULT true,
closed_reason TEXT,
closed_at TIMESTAMPTZ,
closed_by VARCHAR(100),
opened_at TIMESTAMPTZ DEFAULT NOW(),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Insert default config (pool open) only if no row exists yet
-- (ON CONFLICT DO NOTHING would never fire here: the table has no unique
-- constraint besides the serial primary key)
INSERT INTO pool_config (pool_open, opened_at)
SELECT true, NOW()
WHERE NOT EXISTS (SELECT 1 FROM pool_config);
-- Update claim_task function to check pool status
CREATE OR REPLACE FUNCTION claim_task(
p_role VARCHAR(50),
p_worker_id VARCHAR(100),
p_curl_passed BOOLEAN DEFAULT TRUE,
p_http_passed BOOLEAN DEFAULT FALSE
) RETURNS worker_tasks AS $$
DECLARE
claimed_task worker_tasks;
worker_state VARCHAR(2);
session_valid BOOLEAN;
session_tasks INT;
max_tasks INT;
is_pool_open BOOLEAN;
BEGIN
-- Check if pool is open
SELECT pool_open INTO is_pool_open FROM pool_config LIMIT 1;
IF NOT COALESCE(is_pool_open, true) THEN
RETURN NULL; -- Pool is closed, no claiming allowed
END IF;
-- Get worker's current geo session info
SELECT
current_state,
session_task_count,
session_max_tasks,
(geo_session_started_at IS NOT NULL AND geo_session_started_at > NOW() - INTERVAL '60 minutes')
INTO worker_state, session_tasks, max_tasks, session_valid
FROM worker_registry
WHERE worker_id = p_worker_id;
-- Check if worker has reached max concurrent tasks (default 5)
IF session_tasks >= COALESCE(max_tasks, 5) THEN
RETURN NULL;
END IF;
-- If no valid geo session, or session expired, worker can't claim tasks
-- Worker must re-qualify first
IF worker_state IS NULL OR NOT session_valid THEN
RETURN NULL;
END IF;
-- Claim task matching worker's state
UPDATE worker_tasks
SET
status = 'claimed',
worker_id = p_worker_id,
claimed_at = NOW(),
updated_at = NOW()
WHERE id = (
SELECT wt.id FROM worker_tasks wt
JOIN dispensaries d ON wt.dispensary_id = d.id
WHERE wt.role = p_role
AND wt.status = 'pending'
AND (wt.scheduled_for IS NULL OR wt.scheduled_for <= NOW())
-- GEO FILTER: Task's dispensary must match worker's state
AND d.state = worker_state
-- Method compatibility: worker must have passed the required preflight
AND (
wt.method IS NULL -- No preference, any worker can claim
OR (wt.method = 'curl' AND p_curl_passed = TRUE)
OR (wt.method = 'http' AND p_http_passed = TRUE)
)
-- Exclude stores that already have an active task
AND (wt.dispensary_id IS NULL OR wt.dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY wt.priority DESC, wt.created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING * INTO claimed_task;
-- INCREMENT session_task_count if we claimed a task
IF claimed_task.id IS NOT NULL THEN
UPDATE worker_registry
SET session_task_count = session_task_count + 1
WHERE worker_id = p_worker_id;
END IF;
RETURN claimed_task;
END;
$$ LANGUAGE plpgsql;
-- Verify
SELECT 'pool_config table created' as status;
SELECT * FROM pool_config;
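-- Example (sketch): close the pool for maintenance, then reopen it.
-- The reason and operator values are illustrative.
-- UPDATE pool_config
-- SET pool_open = false, closed_reason = 'maintenance', closed_at = NOW(),
--     closed_by = 'admin', updated_at = NOW();
-- UPDATE pool_config
-- SET pool_open = true, opened_at = NOW(), closed_reason = NULL, updated_at = NOW();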

View File

@@ -0,0 +1,60 @@
-- Migration 129: Claim tasks for specific geo
-- Used after worker gets IP to claim more tasks for same geo
-- Function: Claim up to N tasks for a SPECIFIC geo (state/city)
-- Different from claim_tasks_batch which picks the geo with most tasks
CREATE OR REPLACE FUNCTION claim_tasks_batch_for_geo(
p_worker_id VARCHAR(255),
p_state_code VARCHAR(2), -- required params first: Postgres rejects a non-default param after a defaulted one
p_max_tasks INTEGER DEFAULT 4,
p_city VARCHAR(100) DEFAULT NULL,
p_role VARCHAR(50) DEFAULT NULL
) RETURNS TABLE (
task_id INTEGER,
role VARCHAR(50),
dispensary_id INTEGER,
dispensary_name VARCHAR(255),
city VARCHAR(100),
state_code VARCHAR(2),
platform VARCHAR(50),
method VARCHAR(20)
) AS $$
BEGIN
-- Claim up to p_max_tasks for the specified geo
RETURN QUERY
WITH claimed AS (
UPDATE worker_tasks t SET
status = 'claimed',
worker_id = p_worker_id,
claimed_at = NOW()
FROM (
SELECT t2.id
FROM worker_tasks t2
JOIN dispensaries d ON t2.dispensary_id = d.id
WHERE t2.status = 'pending'
AND d.state = p_state_code
AND (p_city IS NULL OR d.city = p_city)
AND (p_role IS NULL OR t2.role = p_role)
ORDER BY t2.priority DESC, t2.created_at ASC
FOR UPDATE SKIP LOCKED
LIMIT p_max_tasks
) sub
WHERE t.id = sub.id
RETURNING t.id, t.role, t.dispensary_id, t.method
)
SELECT
c.id as task_id,
c.role,
c.dispensary_id,
d.name as dispensary_name,
d.city,
d.state as state_code,
d.platform,
c.method
FROM claimed c
JOIN dispensaries d ON c.dispensary_id = d.id;
END;
$$ LANGUAGE plpgsql;
-- Verify
SELECT 'claim_tasks_batch_for_geo function created' as status;
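-- Example (sketch): after qualifying with a Phoenix, AZ proxy, claim up to the
-- default 4 tasks for that geo. Worker id is illustrative, and the city literal
-- must match how dispensaries.city is stored.
-- SELECT * FROM claim_tasks_batch_for_geo(
--     p_worker_id => 'worker-az-01',
--     p_state_code => 'AZ',
--     p_city => 'Phoenix');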

View File

@@ -0,0 +1,49 @@
-- Hoodie Comparison Reports
-- Stores delta results from comparing Hoodie data against CannaIQ
-- Raw Hoodie data stays remote (proxy only) - we only store comparison results
CREATE TABLE IF NOT EXISTS hoodie_comparison_reports (
id SERIAL PRIMARY KEY,
report_type VARCHAR(50) NOT NULL, -- 'dispensaries', 'brands', 'products'
state VARCHAR(50) NOT NULL,
-- Counts
hoodie_total INT NOT NULL DEFAULT 0,
cannaiq_total INT NOT NULL DEFAULT 0,
in_both INT NOT NULL DEFAULT 0,
hoodie_only INT NOT NULL DEFAULT 0,
cannaiq_only INT NOT NULL DEFAULT 0,
-- Delta details (JSONB for flexibility)
hoodie_only_items JSONB DEFAULT '[]', -- Items in Hoodie but not CannaIQ
cannaiq_only_items JSONB DEFAULT '[]', -- Items in CannaIQ but not Hoodie
matched_items JSONB DEFAULT '[]', -- Items in both (with any differences)
-- Metadata
created_at TIMESTAMPTZ DEFAULT NOW(),
duration_ms INT, -- How long the comparison took
error TEXT -- Any errors during comparison
);
-- Index for querying latest reports
CREATE INDEX IF NOT EXISTS idx_hoodie_reports_type_state ON hoodie_comparison_reports(report_type, state, created_at DESC);
CREATE INDEX IF NOT EXISTS idx_hoodie_reports_created ON hoodie_comparison_reports(created_at DESC);
-- View for latest report per type/state
CREATE OR REPLACE VIEW v_hoodie_latest_reports AS
SELECT DISTINCT ON (report_type, state)
id,
report_type,
state,
hoodie_total,
cannaiq_total,
in_both,
hoodie_only,
cannaiq_only,
created_at,
duration_ms
FROM hoodie_comparison_reports
WHERE error IS NULL
ORDER BY report_type, state, created_at DESC;
COMMENT ON TABLE hoodie_comparison_reports IS 'Stores comparison results between Hoodie and CannaIQ data. Raw Hoodie data stays remote.';
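-- Example (sketch): latest comparison summary per report type for one state.
-- The state literal must match however states are stored in this table
-- (name vs. code); 'AZ' is illustrative.
-- SELECT report_type, hoodie_total, cannaiq_total, in_both, hoodie_only, cannaiq_only
-- FROM v_hoodie_latest_reports
-- WHERE state = 'AZ';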

View File

@@ -0,0 +1,53 @@
-- Migration 130: Worker qualification badge
-- Session-scoped badge showing worker qualification status
-- Add badge column to worker_registry
ALTER TABLE worker_registry
ADD COLUMN IF NOT EXISTS badge VARCHAR(20) DEFAULT NULL;
-- Add qualified_at timestamp
ALTER TABLE worker_registry
ADD COLUMN IF NOT EXISTS qualified_at TIMESTAMPTZ DEFAULT NULL;
-- Add current_session_id to link worker to their active session
ALTER TABLE worker_registry
ADD COLUMN IF NOT EXISTS current_session_id INTEGER DEFAULT NULL;
-- Badge values:
-- 'gold' = preflight passed, actively qualified with valid session
-- NULL = not qualified (no active session or session expired)
-- Function: Set worker badge to gold when qualified
CREATE OR REPLACE FUNCTION set_worker_qualified(
p_worker_id VARCHAR(255),
p_session_id INTEGER
) RETURNS BOOLEAN AS $$
BEGIN
UPDATE worker_registry
SET badge = 'gold',
qualified_at = NOW(),
current_session_id = p_session_id
WHERE worker_id = p_worker_id;
RETURN FOUND;
END;
$$ LANGUAGE plpgsql;
-- Function: Clear worker badge when session ends
CREATE OR REPLACE FUNCTION clear_worker_badge(p_worker_id VARCHAR(255))
RETURNS BOOLEAN AS $$
BEGIN
UPDATE worker_registry
SET badge = NULL,
qualified_at = NULL,
current_session_id = NULL
WHERE worker_id = p_worker_id;
RETURN FOUND;
END;
$$ LANGUAGE plpgsql;
-- Index for finding qualified workers
CREATE INDEX IF NOT EXISTS idx_worker_registry_badge
ON worker_registry(badge) WHERE badge IS NOT NULL;
-- Verify
SELECT 'worker_registry badge column added' as status;
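For reference, the intended call pattern looks like this (worker ID and session ID are illustrative):
-- After preflight passes
SELECT set_worker_qualified('worker-az-01', 42);
-- When the session ends or expires
SELECT clear_worker_badge('worker-az-01');
-- Find currently qualified workers (served by the partial index)
SELECT worker_id, qualified_at, current_session_id
FROM worker_registry
WHERE badge = 'gold';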

View File

@@ -0,0 +1,21 @@
-- Migration: 131_normalize_brand
-- Purpose: Add normalize_brand() function for fuzzy brand matching across dispensaries
-- Used by Cannabrands integration to match brand names regardless of spelling variations
-- Function to normalize brand names for matching
-- "Aloha TymeMachine" → "alohatymemachine"
-- "ALOHA TYME MACHINE" → "alohatymemachine"
-- "Aloha Tyme Machine" → "alohatymemachine"
CREATE OR REPLACE FUNCTION normalize_brand(name TEXT)
RETURNS TEXT AS $$
SELECT LOWER(REGEXP_REPLACE(COALESCE(name, ''), '[^a-zA-Z0-9]', '', 'g'))
$$ LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
-- Create functional index for efficient lookups
-- This allows queries like: WHERE normalize_brand(brand_name_raw) = 'alohatymemachine'
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_store_products_brand_normalized
ON store_products (normalize_brand(brand_name_raw));
-- Also index on snapshots table for historical queries
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_store_product_snapshots_brand_normalized
ON store_product_snapshots (normalize_brand(brand_name_raw));
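Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so this migration has to be applied outside the usual transactional wrapper. The functional indexes are only used when queries wrap the column in the same expression; a lookup sketch (brand value illustrative):
-- Match a brand regardless of spelling, spacing, or casing
SELECT p.dispensary_id, p.brand_name_raw, COUNT(*) AS product_count
FROM store_products p
WHERE normalize_brand(p.brand_name_raw) = normalize_brand('Aloha Tyme Machine')
GROUP BY p.dispensary_id, p.brand_name_raw;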

View File

@@ -0,0 +1,373 @@
-- Real-Time Inventory Tracking System
-- Tracks every quantity/price change at the SKU level
-- DIFFS ONLY - no full payload storage (too much data)
-- Trade-off: Cannot re-analyze historical cannabinoids/effects, but saves ~2TB/month at scale
-- ============================================================================
-- INVENTORY CHANGES TABLE
-- ============================================================================
-- One row per change event (sale, restock, price change, new, removed)
-- At 1 payload/minute with ~5 changes each = ~7,200 rows/day/store
CREATE TABLE IF NOT EXISTS inventory_changes (
id BIGSERIAL PRIMARY KEY,
-- Store reference
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
-- Product identification
product_id VARCHAR(50) NOT NULL, -- Dutchie product._id
canonical_id VARCHAR(50), -- POSMetaData.children[].canonicalID
canonical_sku VARCHAR(100), -- POSMetaData.children[].canonicalSKU
product_name VARCHAR(500),
brand_name VARCHAR(200),
option VARCHAR(50), -- Weight/size: "1/8oz", "1g", etc.
-- Change type
change_type VARCHAR(20) NOT NULL, -- 'sale', 'restock', 'price_change', 'new', 'removed'
-- Quantity tracking
quantity_before INTEGER,
quantity_after INTEGER,
quantity_delta INTEGER, -- Negative = sale, Positive = restock
-- Price tracking (use today's price for revenue calculation)
price DECIMAL(10,2), -- Regular price at time of change
special_price DECIMAL(10,2), -- Sale price if product is on special
is_special BOOLEAN DEFAULT FALSE,
-- Calculated revenue (for sales only)
-- Formula: ABS(quantity_delta) * COALESCE(special_price, price)
revenue DECIMAL(10,2),
-- Product metadata for analytics
category VARCHAR(100), -- Flower, Concentrate, Edible, etc.
subcategory VARCHAR(100), -- gummies, live-resin, pods, etc.
strain_type VARCHAR(50), -- Indica, Sativa, Hybrid
-- Potency data (captured on 'new' and when changed)
thc_content DECIMAL(5,2), -- THC percentage
cbd_content DECIMAL(5,2), -- CBD percentage
thca_content DECIMAL(5,2), -- THCA percentage
cbg_content DECIMAL(5,2), -- CBG percentage
-- Full cannabinoid profile (JSONB for flexibility)
-- Stored on 'new' products and when cannabinoids change
cannabinoids JSONB, -- [{name: "THCA", value: 31.64, unit: "PERCENTAGE"}, ...]
-- Effects data (user-reported, stored on 'new' and changes)
effects JSONB, -- {calm: 9, happy: 8, relaxed: 6, ...}
-- Timestamps
detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
payload_timestamp TIMESTAMPTZ -- When the source payload was captured
);
-- Performance indexes
CREATE INDEX idx_inventory_changes_dispensary_time
ON inventory_changes(dispensary_id, detected_at DESC);
CREATE INDEX idx_inventory_changes_brand_time
ON inventory_changes(brand_name, detected_at DESC)
WHERE brand_name IS NOT NULL;
CREATE INDEX idx_inventory_changes_type_time
ON inventory_changes(change_type, detected_at DESC);
CREATE INDEX idx_inventory_changes_product_lookup
ON inventory_changes(dispensary_id, product_id, option);
CREATE INDEX idx_inventory_changes_sales
ON inventory_changes(dispensary_id, detected_at DESC)
WHERE change_type = 'sale';
CREATE INDEX idx_inventory_changes_category
ON inventory_changes(category, detected_at DESC)
WHERE category IS NOT NULL;
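A worked example of the rows this table is meant to hold (all values illustrative): if a product drops from 10 to 8 units while on special at $20.00 against a $25.00 regular price, the processor records a 'sale' row with quantity_delta = -2 and revenue = ABS(-2) * COALESCE(20.00, 25.00) = 40.00.
-- Illustrative sale row (dispensary and product identifiers are placeholders)
INSERT INTO inventory_changes
  (dispensary_id, product_id, product_name, change_type,
   quantity_before, quantity_after, quantity_delta,
   price, special_price, is_special, revenue)
VALUES
  (106, 'abc123', 'Example Eighth', 'sale', 10, 8, -2, 25.00, 20.00, TRUE, 40.00);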
-- ============================================================================
-- PAYLOAD PROCESSING TRACKER
-- ============================================================================
-- Track what we've processed to avoid duplicate processing
-- We don't store payloads, just track what we've seen
CREATE TABLE IF NOT EXISTS payload_processing_log (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
-- Payload identification (hash only, no storage)
payload_hash VARCHAR(64) NOT NULL, -- SHA256 of payload content
payload_timestamp TIMESTAMPTZ NOT NULL, -- When payload was captured
-- Processing results
processed_at TIMESTAMPTZ DEFAULT NOW(),
product_count INTEGER DEFAULT 0, -- Products in this payload
changes_detected INTEGER DEFAULT 0,
sales_detected INTEGER DEFAULT 0,
revenue_detected DECIMAL(10,2) DEFAULT 0,
-- Previous payload for diff reference
previous_payload_hash VARCHAR(64),
UNIQUE(dispensary_id, payload_hash)
);
CREATE INDEX idx_payload_processing_dispensary
ON payload_processing_log(dispensary_id, payload_timestamp DESC);
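A sketch of the dedup check the processor can run before diffing a payload (dispensary ID and hash are illustrative):
-- Has this exact payload already been processed for this store?
SELECT 1
FROM payload_processing_log
WHERE dispensary_id = 106
  AND payload_hash = 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855';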
-- ============================================================================
-- HELPER VIEWS
-- ============================================================================
-- Hourly sales summary
CREATE OR REPLACE VIEW v_hourly_sales AS
SELECT
dispensary_id,
date_trunc('hour', detected_at) as hour,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
COUNT(DISTINCT brand_name) as brands_sold,
COUNT(DISTINCT product_id) as products_sold
FROM inventory_changes
WHERE change_type = 'sale'
GROUP BY dispensary_id, date_trunc('hour', detected_at);
-- Brand performance (last 7 days)
CREATE OR REPLACE VIEW v_brand_performance AS
SELECT
dispensary_id,
brand_name,
COUNT(*) as sales_count,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
AVG(COALESCE(special_price, price)) as avg_price,
SUM(CASE WHEN is_special THEN revenue ELSE 0 END) as special_revenue,
SUM(CASE WHEN NOT is_special THEN revenue ELSE 0 END) as regular_revenue
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '7 days'
GROUP BY dispensary_id, brand_name;
-- Stock-outs (products that hit zero)
CREATE OR REPLACE VIEW v_stock_outs AS
SELECT
dispensary_id,
product_id,
product_name,
brand_name,
option,
quantity_before,
detected_at as stock_out_at
FROM inventory_changes
WHERE change_type = 'sale'
AND quantity_after = 0
AND detected_at > NOW() - INTERVAL '24 hours';
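The helper views are intended for quick reads like the following (dispensary ID illustrative):
-- Top brands by revenue at one store over the last 7 days
SELECT brand_name, total_revenue, units_sold
FROM v_brand_performance
WHERE dispensary_id = 106
ORDER BY total_revenue DESC
LIMIT 10;
-- Products that sold out in the last 24 hours
SELECT product_name, brand_name, option, stock_out_at
FROM v_stock_outs
WHERE dispensary_id = 106
ORDER BY stock_out_at DESC;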
-- ============================================================================
-- COMMENTS
-- ============================================================================
COMMENT ON TABLE inventory_changes IS 'Real-time inventory change tracking. One row per change event.';
COMMENT ON COLUMN inventory_changes.change_type IS 'sale, restock, price_change, new, removed, cannabinoid_change, effect_change';
COMMENT ON COLUMN inventory_changes.quantity_delta IS 'Negative = sale (qty decreased), Positive = restock (qty increased)';
COMMENT ON COLUMN inventory_changes.revenue IS 'For sales: ABS(quantity_delta) * COALESCE(special_price, price)';
COMMENT ON COLUMN inventory_changes.cannabinoids IS 'Full cannabinoid profile as JSONB. Stored on new products and when values change.';
COMMENT ON COLUMN inventory_changes.effects IS 'User-reported effects as JSONB. Stored on new products and when values change.';
COMMENT ON TABLE payload_processing_log IS 'Tracks processed payloads to avoid duplicates. No payload storage.';
-- ============================================================================
-- DAILY SNAPSHOTS TABLE
-- ============================================================================
-- Stores the first payload of each day as the "benchmark" for new product detection
-- New products are those that appear in current payload but NOT in daily snapshot
CREATE TABLE IF NOT EXISTS daily_snapshots (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
snapshot_date DATE NOT NULL,
-- Full product state as JSONB
products JSONB NOT NULL,
-- Summary stats
product_count INTEGER,
total_skus INTEGER,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(dispensary_id, snapshot_date)
);
CREATE INDEX IF NOT EXISTS idx_daily_snapshots_lookup
ON daily_snapshots(dispensary_id, snapshot_date DESC);
COMMENT ON TABLE daily_snapshots IS 'Daily benchmark payload for each store. First payload of day becomes the benchmark for new product detection.';
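A sketch of how a processor might check whether today's benchmark already exists (dispensary ID illustrative); if no row comes back, the next payload becomes the day's benchmark:
SELECT product_count, total_skus, created_at
FROM daily_snapshots
WHERE dispensary_id = 106
  AND snapshot_date = CURRENT_DATE;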
-- ============================================================================
-- MATERIALIZED VIEWS FOR MARKET INTELLIGENCE
-- ============================================================================
-- SKU-level sales velocity (refresh hourly)
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_sku_velocity AS
SELECT
dispensary_id,
product_id,
product_name,
brand_name,
option,
category,
SUM(ABS(quantity_delta)) as units_30d,
SUM(revenue) as revenue_30d,
COUNT(*) as transactions_30d,
SUM(ABS(quantity_delta)) / 30.0 as daily_velocity,
AVG(COALESCE(special_price, price)) as avg_price
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, product_id, product_name, brand_name, option, category;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_sku_velocity_pk
ON mv_sku_velocity(dispensary_id, product_id, option);
-- Brand market share by store (refresh hourly)
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_brand_share AS
SELECT
dispensary_id,
brand_name,
SUM(revenue) as brand_revenue,
SUM(ABS(quantity_delta)) as units_sold,
COUNT(DISTINCT product_id) as unique_products,
COUNT(*) as transactions
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
AND brand_name IS NOT NULL
GROUP BY dispensary_id, brand_name;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_brand_share_pk
ON mv_brand_share(dispensary_id, brand_name);
-- Category performance by store (refresh hourly)
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_category_performance AS
SELECT
dispensary_id,
category,
SUM(revenue) as category_revenue,
SUM(ABS(quantity_delta)) as units_sold,
COUNT(DISTINCT product_id) as unique_products,
COUNT(DISTINCT brand_name) as unique_brands,
COUNT(*) as transactions,
AVG(COALESCE(special_price, price)) as avg_price
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
AND category IS NOT NULL
GROUP BY dispensary_id, category;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_category_performance_pk
ON mv_category_performance(dispensary_id, category);
-- Hourly sales patterns (refresh hourly)
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_hourly_patterns AS
SELECT
dispensary_id,
EXTRACT(dow FROM detected_at) as day_of_week,
EXTRACT(hour FROM detected_at) as hour_of_day,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
AVG(revenue) as avg_transaction_value
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, EXTRACT(dow FROM detected_at), EXTRACT(hour FROM detected_at);
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_hourly_patterns_pk
ON mv_hourly_patterns(dispensary_id, day_of_week, hour_of_day);
-- Brand distribution across stores
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_brand_distribution AS
SELECT
brand_name,
COUNT(DISTINCT dispensary_id) as store_count,
SUM(revenue) as total_revenue,
SUM(ABS(quantity_delta)) as total_units,
AVG(COALESCE(special_price, price)) as avg_price_across_stores
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
AND brand_name IS NOT NULL
GROUP BY brand_name;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_brand_distribution_pk
ON mv_brand_distribution(brand_name);
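A read sketch against the cross-store rollup:
-- Brands with the widest selling footprint over the trailing 30 days
SELECT brand_name, store_count, total_units, total_revenue
FROM mv_brand_distribution
ORDER BY store_count DESC, total_revenue DESC
LIMIT 20;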
-- New products performance (first 30 days)
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_new_product_performance AS
SELECT
ic.dispensary_id,
ic.product_id,
ic.product_name,
ic.brand_name,
ic.option,
ic.category,
new_product.first_seen,
SUM(ABS(ic.quantity_delta)) as units_sold_30d,
SUM(ic.revenue) as revenue_30d,
COUNT(*) as transactions_30d
FROM inventory_changes ic
JOIN (
SELECT dispensary_id, product_id, option, MIN(detected_at) as first_seen
FROM inventory_changes
WHERE change_type = 'new'
AND detected_at > NOW() - INTERVAL '60 days'
GROUP BY dispensary_id, product_id, option
) new_product ON ic.dispensary_id = new_product.dispensary_id
AND ic.product_id = new_product.product_id
AND ic.option = new_product.option
WHERE ic.change_type = 'sale'
AND ic.detected_at >= new_product.first_seen
AND ic.detected_at < new_product.first_seen + INTERVAL '30 days'
GROUP BY ic.dispensary_id, ic.product_id, ic.product_name, ic.brand_name,
ic.option, ic.category, new_product.first_seen;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_new_product_performance_pk
ON mv_new_product_performance(dispensary_id, product_id, option);
-- Promotional analysis
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_promotional_analysis AS
SELECT
dispensary_id,
brand_name,
category,
is_special,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
AVG(price) as avg_regular_price,
AVG(special_price) FILTER (WHERE is_special) as avg_special_price
FROM inventory_changes
WHERE change_type = 'sale'
AND detected_at > NOW() - INTERVAL '30 days'
GROUP BY dispensary_id, brand_name, category, is_special;
CREATE UNIQUE INDEX IF NOT EXISTS idx_mv_promotional_analysis_pk
ON mv_promotional_analysis(dispensary_id, brand_name, category, is_special);
-- Function to refresh all materialized views
CREATE OR REPLACE FUNCTION refresh_inventory_views()
RETURNS void AS $$
BEGIN
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_sku_velocity;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_brand_share;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_category_performance;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_hourly_patterns;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_brand_distribution;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_new_product_performance;
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_promotional_analysis;
END;
$$ LANGUAGE plpgsql;
COMMENT ON FUNCTION refresh_inventory_views IS 'Refresh all inventory analytics materialized views. Run hourly via cron.';
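As the comment notes, the refresh is meant to run hourly; a sketch of the scheduled call and a typical read afterwards (dispensary ID illustrative):
-- Hourly, e.g. from a cron-driven psql invocation
SELECT refresh_inventory_views();
-- Fastest-moving SKUs at one store over the trailing 30 days
SELECT product_name, option, daily_velocity, revenue_30d
FROM mv_sku_velocity
WHERE dispensary_id = 106
ORDER BY daily_velocity DESC
LIMIT 10;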

View File

@@ -0,0 +1,179 @@
-- ============================================================
-- Store Status Observations
--
-- Records real-time open/closed status from ConsumerDispensaries API.
-- Over time, patterns emerge showing actual operating hours.
--
-- Data source: ConsumerDispensaries GraphQL query `status` field
-- Values: "open" or "closed"
-- ============================================================
-- Store timezone for local time calculation
ALTER TABLE dispensaries ADD COLUMN IF NOT EXISTS timezone VARCHAR(50);
-- Table to record status observations
CREATE TABLE IF NOT EXISTS store_status_observations (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER REFERENCES dispensaries(id),
location_id INTEGER REFERENCES dutchie_discovery_locations(id),
status VARCHAR(20) NOT NULL, -- 'open' or 'closed'
observed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
local_time TIME, -- time in store's timezone
day_of_week SMALLINT, -- 0=Sunday, 6=Saturday
source VARCHAR(50) DEFAULT 'crawl', -- 'crawl', 'discovery', 'manual'
CONSTRAINT chk_status_has_reference CHECK (dispensary_id IS NOT NULL OR location_id IS NOT NULL)
);
-- Index for pattern analysis
CREATE INDEX IF NOT EXISTS idx_store_status_dispensary_time
ON store_status_observations(dispensary_id, day_of_week, local_time)
WHERE dispensary_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_store_status_location_time
ON store_status_observations(location_id, day_of_week, local_time)
WHERE location_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_store_status_observed
ON store_status_observations(observed_at DESC);
-- Partition-friendly index for cleanup
CREATE INDEX IF NOT EXISTS idx_store_status_dispensary_observed
ON store_status_observations(dispensary_id, observed_at DESC)
WHERE dispensary_id IS NOT NULL;
COMMENT ON TABLE store_status_observations IS 'Records open/closed status observations to learn store hours over time';
-- ============================================================
-- record_store_status() - Record an observation
-- Accepts either dispensary_id or location_id (or both)
-- ============================================================
CREATE OR REPLACE FUNCTION record_store_status(
p_dispensary_id INTEGER,
p_status VARCHAR(20),
p_source VARCHAR(50) DEFAULT 'crawl',
p_location_id INTEGER DEFAULT NULL
)
RETURNS BIGINT AS $$
DECLARE
v_tz TEXT;
v_local_time TIME;
v_dow SMALLINT;
v_id BIGINT;
BEGIN
-- Get timezone from dispensary or location
IF p_dispensary_id IS NOT NULL THEN
SELECT timezone INTO v_tz FROM dispensaries WHERE id = p_dispensary_id;
ELSIF p_location_id IS NOT NULL THEN
SELECT timezone INTO v_tz FROM dutchie_discovery_locations WHERE id = p_location_id;
END IF;
-- Calculate local time (falls back to the server's session time zone when the store has none set)
IF v_tz IS NOT NULL THEN
v_local_time := (NOW() AT TIME ZONE v_tz)::TIME;
v_dow := EXTRACT(DOW FROM NOW() AT TIME ZONE v_tz)::SMALLINT;
ELSE
v_local_time := NOW()::TIME;
v_dow := EXTRACT(DOW FROM NOW())::SMALLINT;
END IF;
INSERT INTO store_status_observations (dispensary_id, location_id, status, local_time, day_of_week, source)
VALUES (p_dispensary_id, p_location_id, p_status, v_local_time, v_dow, p_source)
RETURNING id INTO v_id;
RETURN v_id;
END;
$$ LANGUAGE plpgsql;
-- ============================================================
-- View: v_store_hours_pattern - Derived hours from observations
-- Shows the typical open/close times per day based on status changes
-- Includes both promoted dispensaries and discovery locations
-- ============================================================
CREATE OR REPLACE VIEW v_store_hours_pattern AS
WITH status_changes AS (
-- Find transitions between open/closed
SELECT
COALESCE(o.dispensary_id, l.dispensary_id) as dispensary_id,
o.location_id,
o.day_of_week,
o.local_time,
o.status,
LAG(o.status) OVER (
PARTITION BY COALESCE(o.dispensary_id, o.location_id), o.day_of_week
ORDER BY o.local_time
) as prev_status
FROM store_status_observations o
LEFT JOIN dutchie_discovery_locations l ON o.location_id = l.id
WHERE o.observed_at > NOW() - INTERVAL '30 days'
),
boundaries AS (
-- Identify open/close boundaries
SELECT
dispensary_id,
location_id,
day_of_week,
local_time,
CASE
WHEN status = 'open' AND (prev_status = 'closed' OR prev_status IS NULL) THEN 'opens'
WHEN status = 'closed' AND prev_status = 'open' THEN 'closes'
END as boundary_type
FROM status_changes
WHERE status != prev_status OR prev_status IS NULL
)
SELECT
COALESCE(d.id, l.dispensary_id) as dispensary_id,
b.location_id,
COALESCE(d.name, l.name) as name,
COALESCE(d.state, l.state_code) as state,
b.day_of_week,
CASE b.day_of_week
WHEN 0 THEN 'Sunday'
WHEN 1 THEN 'Monday'
WHEN 2 THEN 'Tuesday'
WHEN 3 THEN 'Wednesday'
WHEN 4 THEN 'Thursday'
WHEN 5 THEN 'Friday'
WHEN 6 THEN 'Saturday'
END as day_name,
MIN(CASE WHEN b.boundary_type = 'opens' THEN b.local_time END) as typical_open,
MAX(CASE WHEN b.boundary_type = 'closes' THEN b.local_time END) as typical_close,
COUNT(*) as observations
FROM boundaries b
LEFT JOIN dispensaries d ON d.id = b.dispensary_id
LEFT JOIN dutchie_discovery_locations l ON l.id = b.location_id
WHERE b.boundary_type IS NOT NULL
GROUP BY COALESCE(d.id, l.dispensary_id), b.location_id, COALESCE(d.name, l.name), COALESCE(d.state, l.state_code), b.day_of_week;
COMMENT ON VIEW v_store_hours_pattern IS 'Derived operating hours from status observations';
-- ============================================================
-- Cleanup: Keep 90 days of observations
-- Run periodically: SELECT cleanup_old_status_observations();
-- ============================================================
CREATE OR REPLACE FUNCTION cleanup_old_status_observations(p_days INTEGER DEFAULT 90)
RETURNS INTEGER AS $$
DECLARE
v_deleted INTEGER;
BEGIN
DELETE FROM store_status_observations
WHERE observed_at < NOW() - (p_days || ' days')::INTERVAL;
GET DIAGNOSTICS v_deleted = ROW_COUNT;
RETURN v_deleted;
END;
$$ LANGUAGE plpgsql;
-- ============================================================
-- Example usage:
--
-- Record a status observation:
-- SELECT record_store_status(106, 'open');
-- SELECT record_store_status(106, 'closed', 'discovery');
--
-- Check recent observations for a store:
-- SELECT * FROM store_status_observations
-- WHERE dispensary_id = 106 ORDER BY observed_at DESC LIMIT 20;
--
-- See derived hours pattern:
-- SELECT * FROM v_store_hours_pattern WHERE dispensary_id = 106;
-- ============================================================

backend/node_modules/.package-lock.json (generated, vendored): 1989 changed lines; diff suppressed because it is too large

backend/package-lock.json (generated): 1756 changed lines; diff suppressed because it is too large

View File

@@ -22,8 +22,10 @@
"seed:dt:cities:bulk": "tsx src/scripts/seed-dt-cities-bulk.ts"
},
"dependencies": {
"@aws-sdk/client-s3": "^3.953.0",
"@kubernetes/client-node": "^1.4.0",
"@types/bcryptjs": "^3.0.0",
"algoliasearch": "^5.46.1",
"axios": "^1.6.2",
"bcrypt": "^5.1.1",
"bcryptjs": "^3.0.3",

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -1 +1 @@
cannaiq-menus-1.7.0.zip
cannaiq-menus-2.3.0.zip

View File

@@ -151,18 +151,6 @@ function generateSlug(name: string, city: string, state: string): string {
return base;
}
/**
* Derive menu_type from platform_menu_url pattern
*/
function deriveMenuType(url: string | null): string {
if (!url) return 'unknown';
if (url.includes('/dispensary/')) return 'standalone';
if (url.includes('/embedded-menu/')) return 'embedded';
if (url.includes('/stores/')) return 'standalone';
// Custom domain = embedded widget on store's site
if (!url.includes('dutchie.com')) return 'embedded';
return 'unknown';
}
/**
* Log a promotion action to dutchie_promotion_log
@@ -415,7 +403,7 @@ async function promoteLocation(
loc.timezone, // $15 timezone
loc.platform_location_id, // $16 platform_dispensary_id
loc.platform_menu_url, // $17 menu_url
deriveMenuType(loc.platform_menu_url), // $18 menu_type
'dutchie', // $18 menu_type
loc.description, // $19 description
loc.logo_image, // $20 logo_image
loc.banner_image, // $21 banner_image

View File

@@ -105,6 +105,7 @@ import { createSystemRouter, createPrometheusRouter } from './system/routes';
import { createPortalRoutes } from './portals';
import { createStatesRouter } from './routes/states';
import { createAnalyticsV2Router } from './routes/analytics-v2';
import { createBrandsRouter } from './routes/brands';
import { createDiscoveryRoutes } from './discovery';
import pipelineRoutes from './routes/pipeline';
@@ -123,6 +124,7 @@ import workerRegistryRoutes from './routes/worker-registry';
// Per TASK_WORKFLOW_2024-12-10.md: Raw payload access API
import payloadsRoutes from './routes/payloads';
import k8sRoutes from './routes/k8s';
import poolRoutes from './routes/pool';
// Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com)
@@ -220,6 +222,10 @@ console.log('[Payloads] Routes registered at /api/payloads');
app.use('/api/k8s', k8sRoutes);
console.log('[K8s] Routes registered at /api/k8s');
// Pool control routes - open/close pool, manage tasks
app.use('/api/pool', poolRoutes);
console.log('[Pool] Routes registered at /api/pool');
// Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation
try {
const analyticsV2Router = createAnalyticsV2Router(getPool());
@@ -229,6 +235,15 @@ try {
console.warn('[AnalyticsV2] Failed to register routes:', error);
}
// Brand Analytics API - Hoodie Analytics-style market intelligence
try {
const brandsRouter = createBrandsRouter(getPool());
app.use('/api/brands', brandsRouter);
console.log('[Brands] Routes registered at /api/brands');
} catch (error) {
console.warn('[Brands] Failed to register routes:', error);
}
// Public API v1 - External consumer endpoints (WordPress, etc.)
// Uses dutchie_az data pipeline with per-dispensary API key auth
app.use('/api/v1', publicApiRoutes);
@@ -256,6 +271,26 @@ console.log('[ClickAnalytics] Routes registered at /api/analytics/clicks');
app.use('/api/analytics/price', priceAnalyticsRoutes);
console.log('[PriceAnalytics] Routes registered at /api/analytics/price');
// Sales Analytics API - sales velocity, brand market share, product intelligence
import salesAnalyticsRoutes from './routes/sales-analytics';
app.use('/api/sales-analytics', salesAnalyticsRoutes);
console.log('[SalesAnalytics] Routes registered at /api/sales-analytics');
// Inventory Tracking API - high-frequency crawls, snapshots, visibility events
import inventoryRoutes from './routes/inventory';
app.use('/api/inventory', inventoryRoutes);
console.log('[Inventory] Routes registered at /api/inventory');
// Inventory Analytics API - real-time sales, brand performance, stock-outs
import inventoryAnalyticsRoutes from './routes/inventory-analytics';
app.use('/api/inventory/analytics', inventoryAnalyticsRoutes);
console.log('[InventoryAnalytics] Routes registered at /api/inventory/analytics');
// Hoodie Analytics API - proxy queries to Hoodie's Algolia
import hoodieRoutes from './routes/hoodie';
app.use('/api/hoodie', hoodieRoutes);
console.log('[Hoodie] Routes registered at /api/hoodie');
// States API routes - cannabis legalization status and targeting
try {
const statesRouter = createStatesRouter(getPool());

View File

@@ -289,6 +289,160 @@ export function getStoreConfig(): TreezStoreConfig | null {
return currentStoreConfig;
}
/**
* Extract store config from page HTML for SSR sites.
*
* SSR sites (like BEST Dispensary) pre-render data and don't make client-side
* API requests. The config is embedded in __NEXT_DATA__ or window variables.
*
* Looks for:
* - __NEXT_DATA__.props.pageProps.msoStoreConfig.orgId / entityId
* - window.__SETTINGS__.msoOrgId / msoStoreEntityId
* - treezStores config in page data
*/
async function extractConfigFromPage(page: Page): Promise<TreezStoreConfig | null> {
console.log('[Treez Client] Attempting to extract config from page HTML (SSR fallback)...');
const config = await page.evaluate(() => {
// Try __NEXT_DATA__ first (Next.js SSR)
const nextDataEl = document.getElementById('__NEXT_DATA__');
if (nextDataEl) {
try {
const nextData = JSON.parse(nextDataEl.textContent || '{}');
const pageProps = nextData?.props?.pageProps;
// Look for MSO config in various locations
const msoConfig = pageProps?.msoStoreConfig || pageProps?.storeConfig || {};
const settings = pageProps?.settings || {};
// Extract org-id and entity-id
let orgId = msoConfig.orgId || msoConfig.msoOrgId || settings.msoOrgId;
let entityId = msoConfig.entityId || msoConfig.msoStoreEntityId || settings.msoStoreEntityId;
// Also check treezStores array
if (!orgId || !entityId) {
const treezStores = pageProps?.treezStores || nextData?.props?.treezStores;
if (treezStores && Array.isArray(treezStores) && treezStores.length > 0) {
const store = treezStores[0];
orgId = orgId || store.orgId || store.organization_id;
entityId = entityId || store.entityId || store.entity_id || store.storeId;
}
}
// Check for API settings
const apiSettings = pageProps?.apiSettings || settings.api || {};
if (orgId && entityId) {
return {
orgId,
entityId,
esUrl: apiSettings.esUrl || null,
apiKey: apiSettings.apiKey || null,
};
}
} catch (e) {
console.error('Error parsing __NEXT_DATA__:', e);
}
}
// Try window variables
const win = window as any;
if (win.__SETTINGS__) {
const s = win.__SETTINGS__;
if (s.msoOrgId && s.msoStoreEntityId) {
return {
orgId: s.msoOrgId,
entityId: s.msoStoreEntityId,
esUrl: s.esUrl || null,
apiKey: s.apiKey || null,
};
}
}
// Try Next.js App Router streaming format (self.__next_f.push)
// This format is used by newer Next.js sites like BEST Dispensary
try {
const scripts = Array.from(document.querySelectorAll('script'));
for (const script of scripts) {
const content = script.textContent || '';
if (content.includes('self.__next_f.push') && content.includes('"Treez"')) {
// Extract JSON data from the streaming format
// Format: self.__next_f.push([1,"...escaped json..."])
const matches = content.match(/self\.__next_f\.push\(\[1,"(.+?)"\]\)/g);
if (matches) {
for (const match of matches) {
try {
// Extract the JSON string and unescape it
const jsonMatch = match.match(/self\.__next_f\.push\(\[1,"(.+?)"\]\)/);
if (jsonMatch && jsonMatch[1]) {
// Unescape the JSON string
const unescaped = jsonMatch[1]
.replace(/\\"/g, '"')
.replace(/\\n/g, '\n')
.replace(/\\\\/g, '\\');
// Look for Treez credentials in the data
// Pattern: "apps":[{"name":"Treez","credentials":[{"store_id":"..."}]}]
const treezMatch = unescaped.match(/"apps":\s*\[([\s\S]*?)\]/);
if (treezMatch) {
const appsStr = '[' + treezMatch[1] + ']';
try {
const apps = JSON.parse(appsStr);
const treezApp = apps.find((a: any) => a.name === 'Treez' || a.handler === 'treez');
if (treezApp && treezApp.credentials && treezApp.credentials.length > 0) {
const cred = treezApp.credentials[0];
if (cred.store_id) {
console.log('[Treez Client] Found config in App Router streaming data');
return {
orgId: cred.headless_client_id || null,
entityId: cred.store_id,
esUrl: null,
apiKey: cred.headless_client_secret || null,
};
}
}
} catch (e) {
// Continue searching
}
}
}
} catch (e) {
// Continue to next match
}
}
}
}
}
} catch (e) {
console.error('Error parsing App Router streaming data:', e);
}
return null;
});
if (!config || !config.orgId || !config.entityId) {
console.log('[Treez Client] Could not extract config from page');
return null;
}
// Build full config with defaults for missing values
const fullConfig: TreezStoreConfig = {
orgId: config.orgId,
entityId: config.entityId,
// Default ES URL pattern - gapcommerce is the common tenant
esUrl: config.esUrl || 'https://search-gapcommerce.gapcommerceapi.com/product/search',
// Use default API key from config
apiKey: config.apiKey || TREEZ_CONFIG.esApiKey,
};
console.log('[Treez Client] Extracted config from page (SSR):');
console.log(` ES URL: ${fullConfig.esUrl}`);
console.log(` Org ID: ${fullConfig.orgId}`);
console.log(` Entity ID: ${fullConfig.entityId}`);
return fullConfig;
}
// ============================================================
// PRODUCT FETCHING (Direct API Approach)
// ============================================================
@@ -343,9 +497,15 @@ export async function fetchAllProducts(
// Wait for initial page load to trigger first API request
await sleep(3000);
// Check if we captured the store config
// Check if we captured the store config from network requests
if (!currentStoreConfig) {
console.error('[Treez Client] Failed to capture store config from browser requests');
console.log('[Treez Client] No API requests captured - trying SSR fallback...');
// For SSR sites, extract config from page HTML
currentStoreConfig = await extractConfigFromPage(page);
}
if (!currentStoreConfig) {
console.error('[Treez Client] Failed to capture store config from browser requests or page HTML');
throw new Error('Failed to capture Treez store config');
}

backend/src/routes/brands.ts (normal file): 1291 changed lines; diff suppressed because it is too large

View File

@@ -16,7 +16,82 @@ import { authMiddleware } from '../auth/middleware';
const router = Router();
// All click analytics endpoints require authentication
/**
* POST /api/analytics/click
* Record a click event from WordPress plugin
* This endpoint is public but requires API token in Authorization header
*/
router.post('/click', async (req: Request, res: Response) => {
try {
// Get API token from Authorization header
const authHeader = req.headers.authorization;
if (!authHeader || !authHeader.startsWith('Bearer ')) {
return res.status(401).json({ error: 'Missing API token' });
}
const apiToken = authHeader.substring(7);
// Validate API token and get store_id
const tokenResult = await pool.query(
'SELECT store_id FROM api_tokens WHERE token = $1 AND is_active = true',
[apiToken]
);
if (tokenResult.rows.length === 0) {
return res.status(401).json({ error: 'Invalid API token' });
}
const tokenStoreId = tokenResult.rows[0].store_id;
const {
event_type,
store_id,
product_id,
product_name,
product_price,
category,
url,
referrer,
timestamp
} = req.body;
// Use store_id from token if not provided in request
const finalStoreId = store_id || tokenStoreId;
// Insert click event
await pool.query(`
INSERT INTO product_click_events (
store_id,
product_id,
brand_id,
action,
metadata,
occurred_at
) VALUES ($1, $2, $3, $4, $5, $6)
`, [
finalStoreId,
product_id || null,
null, // brand_id will be looked up later if needed
event_type || 'click',
JSON.stringify({
product_name,
product_price,
category,
url,
referrer,
source: 'wordpress_plugin'
}),
timestamp || new Date().toISOString()
]);
res.json({ success: true });
} catch (error: any) {
console.error('[ClickAnalytics] Error recording click:', error.message);
res.status(500).json({ error: 'Failed to record click' });
}
});
// All other click analytics endpoints require authentication
router.use(authMiddleware);
/**

View File

@@ -0,0 +1,532 @@
/**
* Hoodie Analytics API Routes
*
* Proxy routes that query Hoodie's Algolia directly.
* Comparison reports stored locally - raw Hoodie data stays remote.
*/
import { Router, Request, Response } from 'express';
import { hoodieClient } from '../services/hoodie/client';
import {
runComparisonReport,
runAllComparisons,
getLatestReports,
getReportById,
getReportHistory,
} from '../services/hoodie/comparison';
const router = Router();
// ============================================================
// STATS
// ============================================================
/**
* GET /api/hoodie/stats
* Get total counts from all Hoodie indexes
*/
router.get('/stats', async (req: Request, res: Response) => {
try {
const counts = await hoodieClient.getIndexCounts();
res.json({ success: true, counts });
} catch (error: any) {
console.error('Hoodie stats error:', error);
res.status(500).json({ error: 'Failed to fetch Hoodie stats', message: error.message });
}
});
/**
* GET /api/hoodie/stats/:state
* Get stats for a specific state
*/
router.get('/stats/:state', async (req: Request, res: Response) => {
try {
const { state } = req.params;
const stats = await hoodieClient.getStateStats(state);
res.json({ success: true, state, stats });
} catch (error: any) {
console.error('Hoodie state stats error:', error);
res.status(500).json({ error: 'Failed to fetch state stats', message: error.message });
}
});
// ============================================================
// DISPENSARIES
// ============================================================
/**
* GET /api/hoodie/dispensaries
* Search dispensaries
* Query params: q, state, city, pos, banner, page, limit
*/
router.get('/dispensaries', async (req: Request, res: Response) => {
try {
const { q, state, city, pos, banner, page = '0', limit = '20' } = req.query;
const filters: string[] = [];
if (state) filters.push(`STATE:"${state}"`);
if (city) filters.push(`CITY:"${city}"`);
if (pos) filters.push(`POS_SYSTEM:"${pos}"`);
if (banner) filters.push(`BANNER:"${banner}"`);
const result = await hoodieClient.searchDispensaries({
query: (q as string) || '',
filters: filters.length > 0 ? filters.join(' AND ') : undefined,
page: parseInt(page as string, 10),
hitsPerPage: Math.min(parseInt(limit as string, 10), 100),
});
res.json({
success: true,
dispensaries: result.hits,
pagination: {
total: result.nbHits,
page: result.page,
pages: result.nbPages,
limit: result.hitsPerPage,
},
});
} catch (error: any) {
console.error('Hoodie dispensaries error:', error);
res.status(500).json({ error: 'Failed to fetch dispensaries', message: error.message });
}
});
/**
* GET /api/hoodie/dispensaries/slug/:slug
* Get dispensary by slug
*/
router.get('/dispensaries/slug/:slug', async (req: Request, res: Response) => {
try {
const { slug } = req.params;
const dispensary = await hoodieClient.getDispensaryBySlug(slug);
if (!dispensary) {
return res.status(404).json({ error: 'Dispensary not found' });
}
res.json({ success: true, dispensary });
} catch (error: any) {
console.error('Hoodie dispensary by slug error:', error);
res.status(500).json({ error: 'Failed to fetch dispensary', message: error.message });
}
});
// ============================================================
// PRODUCTS
// ============================================================
/**
* GET /api/hoodie/products
* Search products
* Query params: q, brand, category, state, page, limit
*/
router.get('/products', async (req: Request, res: Response) => {
try {
const { q, brand, category, state, page = '0', limit = '20' } = req.query;
const filters: string[] = [];
if (brand) filters.push(`BRAND:"${brand}"`);
if (category) filters.push(`CATEGORY_0:"${category}"`);
if (state) filters.push(`D_STATE:"${state}"`);
const result = await hoodieClient.searchProducts({
query: (q as string) || '',
filters: filters.length > 0 ? filters.join(' AND ') : undefined,
page: parseInt(page as string, 10),
hitsPerPage: Math.min(parseInt(limit as string, 10), 100),
});
res.json({
success: true,
products: result.hits,
pagination: {
total: result.nbHits,
page: result.page,
pages: result.nbPages,
limit: result.hitsPerPage,
},
});
} catch (error: any) {
console.error('Hoodie products error:', error);
res.status(500).json({ error: 'Failed to fetch products', message: error.message });
}
});
/**
* GET /api/hoodie/master-products
* Search master (deduplicated) products
* Query params: q, brand, category, page, limit
*/
router.get('/master-products', async (req: Request, res: Response) => {
try {
const { q, brand, category, page = '0', limit = '20' } = req.query;
const filters: string[] = [];
if (brand) filters.push(`BRAND:"${brand}"`);
if (category) filters.push(`CATEGORY_0:"${category}"`);
const result = await hoodieClient.searchMasterProducts({
query: (q as string) || '',
filters: filters.length > 0 ? filters.join(' AND ') : undefined,
page: parseInt(page as string, 10),
hitsPerPage: Math.min(parseInt(limit as string, 10), 100),
});
res.json({
success: true,
products: result.hits,
pagination: {
total: result.nbHits,
page: result.page,
pages: result.nbPages,
limit: result.hitsPerPage,
},
});
} catch (error: any) {
console.error('Hoodie master products error:', error);
res.status(500).json({ error: 'Failed to fetch master products', message: error.message });
}
});
// ============================================================
// BRANDS
// ============================================================
/**
* GET /api/hoodie/brands
* Search brands
* Query params: q, state, page, limit
*/
router.get('/brands', async (req: Request, res: Response) => {
try {
const { q, state, page = '0', limit = '20' } = req.query;
let result;
if (state) {
result = await hoodieClient.getBrandsByState(state as string, {
query: (q as string) || '',
page: parseInt(page as string, 10),
hitsPerPage: Math.min(parseInt(limit as string, 10), 100),
});
} else {
result = await hoodieClient.searchBrands({
query: (q as string) || '',
page: parseInt(page as string, 10),
hitsPerPage: Math.min(parseInt(limit as string, 10), 100),
});
}
res.json({
success: true,
brands: result.hits,
pagination: {
total: result.nbHits,
page: result.page,
pages: result.nbPages,
limit: result.hitsPerPage,
},
});
} catch (error: any) {
console.error('Hoodie brands error:', error);
res.status(500).json({ error: 'Failed to fetch brands', message: error.message });
}
});
/**
* GET /api/hoodie/brands/slug/:slug
* Get brand by slug
*/
router.get('/brands/slug/:slug', async (req: Request, res: Response) => {
try {
const { slug } = req.params;
const brand = await hoodieClient.getBrandBySlug(slug);
if (!brand) {
return res.status(404).json({ error: 'Brand not found' });
}
res.json({ success: true, brand });
} catch (error: any) {
console.error('Hoodie brand by slug error:', error);
res.status(500).json({ error: 'Failed to fetch brand', message: error.message });
}
});
// ============================================================
// COMPARISON / DELTA QUERIES
// ============================================================
/**
* GET /api/hoodie/compare/dispensaries/:state
* Compare Hoodie dispensaries with CannaIQ for a state
* Returns: in_both, hoodie_only, cannaiq_only
*/
router.get('/compare/dispensaries/:state', async (req: Request, res: Response) => {
try {
const { state } = req.params;
const { pool } = await import('../db/pool');
// Get Hoodie dispensaries for state
const hoodieResult = await hoodieClient.getDispensariesByState(state, { hitsPerPage: 1000 });
// Get CannaIQ dispensaries for state
const cannaiqResult = await pool.query(
'SELECT id, name, city, menu_type, slug FROM dispensaries WHERE state = $1',
[state]
);
const hoodieDisps = hoodieResult.hits;
const cannaiqDisps = cannaiqResult.rows;
// Build lookup maps (normalize names for comparison)
const normalize = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
const hoodieByName = new Map(hoodieDisps.map(d => [normalize(d.DISPENSARY_NAME), d]));
const cannaiqByName = new Map(cannaiqDisps.map(d => [normalize(d.name), d]));
const inBoth: any[] = [];
const hoodieOnly: any[] = [];
const cannaiqOnly: any[] = [];
// Find matches and Hoodie-only
for (const [normName, hoodie] of hoodieByName) {
if (cannaiqByName.has(normName)) {
inBoth.push({
name: hoodie.DISPENSARY_NAME,
hoodie: { slug: hoodie.SLUG, pos: hoodie.POS_SYSTEM, menus: hoodie.MENUS_COUNT },
cannaiq: cannaiqByName.get(normName),
});
} else {
hoodieOnly.push({
name: hoodie.DISPENSARY_NAME,
city: hoodie.CITY,
slug: hoodie.SLUG,
pos: hoodie.POS_SYSTEM,
menus: hoodie.MENUS_COUNT,
daily_sales: hoodie.AVG_DAILY_SALES,
});
}
}
// Find CannaIQ-only
for (const [normName, cannaiq] of cannaiqByName) {
if (!hoodieByName.has(normName)) {
cannaiqOnly.push(cannaiq);
}
}
res.json({
success: true,
state,
summary: {
hoodie_total: hoodieDisps.length,
cannaiq_total: cannaiqDisps.length,
in_both: inBoth.length,
hoodie_only: hoodieOnly.length,
cannaiq_only: cannaiqOnly.length,
},
in_both: inBoth,
hoodie_only: hoodieOnly,
cannaiq_only: cannaiqOnly,
});
} catch (error: any) {
console.error('Hoodie compare dispensaries error:', error);
res.status(500).json({ error: 'Failed to compare dispensaries', message: error.message });
}
});
/**
* GET /api/hoodie/compare/brands/:state
* Compare Hoodie brands with CannaIQ for a state
*/
router.get('/compare/brands/:state', async (req: Request, res: Response) => {
try {
const { state } = req.params;
const { pool } = await import('../db/pool');
// Get Hoodie brands for state
const hoodieResult = await hoodieClient.getBrandsByState(state, { hitsPerPage: 1000 });
// Get CannaIQ brands for state (from products)
const cannaiqResult = await pool.query(`
SELECT DISTINCT p.brand_name_raw as name
FROM store_products p
JOIN dispensaries d ON d.id = p.dispensary_id
WHERE d.state = $1 AND p.brand_name_raw IS NOT NULL
`, [state]);
const hoodieBrands = hoodieResult.hits;
const cannaiqBrands = cannaiqResult.rows;
const normalize = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
const hoodieByName = new Map(hoodieBrands.map(b => [normalize(b.BRAND_NAME), b]));
const cannaiqByName = new Set(cannaiqBrands.map(b => normalize(b.name)));
const inBoth: string[] = [];
const hoodieOnly: any[] = [];
const cannaiqOnly: string[] = [];
for (const [normName, hoodie] of hoodieByName) {
if (cannaiqByName.has(normName)) {
inBoth.push(hoodie.BRAND_NAME);
} else {
hoodieOnly.push({
name: hoodie.BRAND_NAME,
slug: hoodie.SLUG,
variants: hoodie.ACTIVE_VARIANTS,
parent: hoodie.PARENT_COMPANY,
});
}
}
for (const brand of cannaiqBrands) {
if (!hoodieByName.has(normalize(brand.name))) {
cannaiqOnly.push(brand.name);
}
}
res.json({
success: true,
state,
summary: {
hoodie_total: hoodieBrands.length,
cannaiq_total: cannaiqBrands.length,
in_both: inBoth.length,
hoodie_only: hoodieOnly.length,
cannaiq_only: cannaiqOnly.length,
},
in_both: inBoth,
hoodie_only: hoodieOnly,
cannaiq_only: cannaiqOnly,
});
} catch (error: any) {
console.error('Hoodie compare brands error:', error);
res.status(500).json({ error: 'Failed to compare brands', message: error.message });
}
});
// ============================================================
// SCHEDULED COMPARISON REPORTS
// ============================================================
/**
* POST /api/hoodie/reports/run/:state
* Run comparison report for a state and save results
* Query params: type (dispensaries, brands, all)
*/
router.post('/reports/run/:state', async (req: Request, res: Response) => {
try {
const { state } = req.params;
const { type = 'all' } = req.query;
if (type === 'all') {
const results = await runAllComparisons(state);
res.json({
success: true,
state,
reports: {
dispensaries: {
id: results.dispensaries.reportId,
summary: {
hoodie_total: results.dispensaries.result.hoodieTotalCount,
cannaiq_total: results.dispensaries.result.cannaiqTotal,
in_both: results.dispensaries.result.inBoth,
hoodie_only: results.dispensaries.result.hoodieOnly,
cannaiq_only: results.dispensaries.result.cannaiqOnly,
},
duration_ms: results.dispensaries.result.durationMs,
},
brands: {
id: results.brands.reportId,
summary: {
hoodie_total: results.brands.result.hoodieTotalCount,
cannaiq_total: results.brands.result.cannaiqTotal,
in_both: results.brands.result.inBoth,
hoodie_only: results.brands.result.hoodieOnly,
cannaiq_only: results.brands.result.cannaiqOnly,
},
duration_ms: results.brands.result.durationMs,
},
},
});
} else {
const reportType = type as 'dispensaries' | 'brands';
const { reportId, result } = await runComparisonReport(reportType, state);
res.json({
success: true,
state,
report: {
id: reportId,
type: reportType,
summary: {
hoodie_total: result.hoodieTotalCount,
cannaiq_total: result.cannaiqTotal,
in_both: result.inBoth,
hoodie_only: result.hoodieOnly,
cannaiq_only: result.cannaiqOnly,
},
duration_ms: result.durationMs,
},
});
}
} catch (error: any) {
console.error('Hoodie run report error:', error);
res.status(500).json({ error: 'Failed to run report', message: error.message });
}
});
/**
* GET /api/hoodie/reports
* Get latest comparison reports
* Query params: state (optional)
*/
router.get('/reports', async (req: Request, res: Response) => {
try {
const { state } = req.query;
const reports = await getLatestReports(state as string | undefined);
res.json({ success: true, reports });
} catch (error: any) {
console.error('Hoodie get reports error:', error);
res.status(500).json({ error: 'Failed to fetch reports', message: error.message });
}
});
/**
* GET /api/hoodie/reports/:id
* Get full comparison report by ID
*/
router.get('/reports/:id', async (req: Request, res: Response) => {
try {
const { id } = req.params;
const report = await getReportById(parseInt(id, 10));
if (!report) {
return res.status(404).json({ error: 'Report not found' });
}
res.json({ success: true, report });
} catch (error: any) {
console.error('Hoodie get report error:', error);
res.status(500).json({ error: 'Failed to fetch report', message: error.message });
}
});
/**
* GET /api/hoodie/reports/history/:type/:state
* Get report history for type/state
*/
router.get('/reports/history/:type/:state', async (req: Request, res: Response) => {
try {
const { type, state } = req.params;
const { limit = '30' } = req.query;
const reports = await getReportHistory(type, state, parseInt(limit as string, 10));
res.json({ success: true, reports });
} catch (error: any) {
console.error('Hoodie get report history error:', error);
res.status(500).json({ error: 'Failed to fetch report history', message: error.message });
}
});
export default router;

View File

@@ -0,0 +1,507 @@
/**
* Inventory Analytics API
*
* Endpoints for querying real-time inventory changes:
* - Hourly/daily sales
* - Brand performance
* - Stock-outs
* - Price change history
* - New products
*/
import { Router, Request, Response } from 'express';
import { pool } from '../db/pool';
const router = Router();
// ============================================================================
// SALES ANALYTICS
// ============================================================================
/**
* GET /api/inventory/sales/hourly/:dispensaryId
* Get hourly sales for a dispensary
*/
router.get('/sales/hourly/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const hours = parseInt(req.query.hours as string) || 24;
const result = await pool.query(`
SELECT
date_trunc('hour', detected_at) as hour,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
COUNT(DISTINCT brand_name) as brands_sold,
COUNT(DISTINCT product_id) as products_sold
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'sale'
AND detected_at > NOW() - INTERVAL '1 hour' * $2
GROUP BY date_trunc('hour', detected_at)
ORDER BY hour DESC
`, [dispensaryId, hours]);
res.json({
dispensaryId,
hours,
data: result.rows.map((row: any) => ({
hour: row.hour,
transactions: parseInt(row.transactions),
unitsSold: parseInt(row.units_sold) || 0,
revenue: parseFloat(row.total_revenue) || 0,
brandsSold: parseInt(row.brands_sold),
productsSold: parseInt(row.products_sold),
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /api/inventory/sales/daily/:dispensaryId
* Get daily sales for a dispensary
*/
router.get('/sales/daily/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const days = parseInt(req.query.days as string) || 7;
const result = await pool.query(`
SELECT
date_trunc('day', detected_at) as day,
COUNT(*) as transactions,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
COUNT(DISTINCT brand_name) as brands_sold,
COUNT(DISTINCT product_id) as products_sold
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'sale'
AND detected_at > NOW() - INTERVAL '1 day' * $2
GROUP BY date_trunc('day', detected_at)
ORDER BY day DESC
`, [dispensaryId, days]);
res.json({
dispensaryId,
days,
data: result.rows.map((row: any) => ({
day: row.day,
transactions: parseInt(row.transactions),
unitsSold: parseInt(row.units_sold) || 0,
revenue: parseFloat(row.total_revenue) || 0,
brandsSold: parseInt(row.brands_sold),
productsSold: parseInt(row.products_sold),
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// BRAND ANALYTICS
// ============================================================================
/**
* GET /api/inventory/brands/:dispensaryId
* Get brand performance for a dispensary
*/
router.get('/brands/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const days = parseInt(req.query.days as string) || 7;
const limit = parseInt(req.query.limit as string) || 20;
const result = await pool.query(`
SELECT
brand_name,
COUNT(*) as sales_count,
SUM(ABS(quantity_delta)) as units_sold,
SUM(revenue) as total_revenue,
AVG(COALESCE(special_price, price)) as avg_price,
SUM(CASE WHEN is_special THEN revenue ELSE 0 END) as special_revenue,
SUM(CASE WHEN NOT is_special THEN revenue ELSE 0 END) as regular_revenue,
COUNT(DISTINCT product_id) as unique_products
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'sale'
AND detected_at > NOW() - INTERVAL '1 day' * $2
AND brand_name IS NOT NULL
GROUP BY brand_name
ORDER BY total_revenue DESC
LIMIT $3
`, [dispensaryId, days, limit]);
res.json({
dispensaryId,
days,
data: result.rows.map((row: any) => ({
brand: row.brand_name,
salesCount: parseInt(row.sales_count),
unitsSold: parseInt(row.units_sold) || 0,
revenue: parseFloat(row.total_revenue) || 0,
avgPrice: parseFloat(row.avg_price) || 0,
specialRevenue: parseFloat(row.special_revenue) || 0,
regularRevenue: parseFloat(row.regular_revenue) || 0,
uniqueProducts: parseInt(row.unique_products),
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// STOCK ANALYTICS
// ============================================================================
/**
* GET /api/inventory/stockouts/:dispensaryId
* Get products that hit zero stock
*/
router.get('/stockouts/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const hours = parseInt(req.query.hours as string) || 24;
const result = await pool.query(`
SELECT
product_id,
product_name,
brand_name,
option,
quantity_before,
detected_at as stock_out_at,
price,
special_price,
is_special
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'sale'
AND quantity_after = 0
AND detected_at > NOW() - INTERVAL '1 hour' * $2
ORDER BY detected_at DESC
`, [dispensaryId, hours]);
res.json({
dispensaryId,
hours,
stockOuts: result.rows.map((row: any) => ({
productId: row.product_id,
productName: row.product_name,
brand: row.brand_name,
option: row.option,
quantityBefore: row.quantity_before,
stockOutAt: row.stock_out_at,
price: parseFloat(row.price) || 0,
specialPrice: row.special_price ? parseFloat(row.special_price) : null,
wasOnSpecial: row.is_special,
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
/**
* GET /api/inventory/restocks/:dispensaryId
* Get recent restocks
*/
router.get('/restocks/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const hours = parseInt(req.query.hours as string) || 24;
const result = await pool.query(`
SELECT
product_id,
product_name,
brand_name,
option,
quantity_before,
quantity_after,
quantity_delta,
detected_at
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'restock'
AND detected_at > NOW() - INTERVAL '1 hour' * $2
ORDER BY detected_at DESC
`, [dispensaryId, hours]);
res.json({
dispensaryId,
hours,
restocks: result.rows.map((row: any) => ({
productId: row.product_id,
productName: row.product_name,
brand: row.brand_name,
option: row.option,
quantityBefore: row.quantity_before,
quantityAfter: row.quantity_after,
quantityAdded: row.quantity_delta,
restockedAt: row.detected_at,
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// NEW PRODUCTS
// ============================================================================
/**
* GET /api/inventory/new-products/:dispensaryId
* Get recently added products
*/
router.get('/new-products/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const days = parseInt(req.query.days as string) || 7;
const result = await pool.query(`
SELECT
product_id,
product_name,
brand_name,
option,
category,
subcategory,
strain_type,
price,
special_price,
is_special,
quantity_after as initial_quantity,
thc_content,
cbd_content,
cannabinoids,
effects,
detected_at as added_at
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'new'
AND detected_at > NOW() - INTERVAL '1 day' * $2
ORDER BY detected_at DESC
`, [dispensaryId, days]);
res.json({
dispensaryId,
days,
newProducts: result.rows.map((row: any) => ({
productId: row.product_id,
productName: row.product_name,
brand: row.brand_name,
option: row.option,
category: row.category,
subcategory: row.subcategory,
strainType: row.strain_type,
price: parseFloat(row.price) || 0,
specialPrice: row.special_price ? parseFloat(row.special_price) : null,
isOnSpecial: row.is_special,
initialQuantity: row.initial_quantity,
thcContent: row.thc_content ? parseFloat(row.thc_content) : null,
cbdContent: row.cbd_content ? parseFloat(row.cbd_content) : null,
cannabinoids: row.cannabinoids,
effects: row.effects,
addedAt: row.added_at,
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// PRICE CHANGES
// ============================================================================
/**
* GET /api/inventory/price-changes/:dispensaryId
* Get recent price changes
*/
router.get('/price-changes/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const days = parseInt(req.query.days as string) || 7;
const result = await pool.query(`
SELECT
product_id,
product_name,
brand_name,
option,
price,
special_price,
is_special,
detected_at
FROM inventory_changes
WHERE dispensary_id = $1
AND change_type = 'price_change'
AND detected_at > NOW() - INTERVAL '1 day' * $2
ORDER BY detected_at DESC
`, [dispensaryId, days]);
res.json({
dispensaryId,
days,
priceChanges: result.rows.map((row: any) => ({
productId: row.product_id,
productName: row.product_name,
brand: row.brand_name,
option: row.option,
newPrice: parseFloat(row.price) || 0,
newSpecialPrice: row.special_price ? parseFloat(row.special_price) : null,
isOnSpecial: row.is_special,
changedAt: row.detected_at,
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// SUMMARY
// ============================================================================
/**
* GET /api/inventory/summary/:dispensaryId
* Get overall inventory summary
*/
router.get('/summary/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const hours = parseInt(req.query.hours as string) || 24;
const result = await pool.query(`
SELECT
change_type,
COUNT(*) as count,
SUM(ABS(COALESCE(quantity_delta, 0))) as total_units,
SUM(COALESCE(revenue, 0)) as total_revenue
FROM inventory_changes
WHERE dispensary_id = $1
AND detected_at > NOW() - INTERVAL '1 hour' * $2
GROUP BY change_type
`, [dispensaryId, hours]);
const summary: any = {
dispensaryId,
hours,
sales: { count: 0, units: 0, revenue: 0 },
restocks: { count: 0, units: 0 },
priceChanges: { count: 0 },
newProducts: { count: 0 },
removedProducts: { count: 0 },
};
for (const row of result.rows) {
switch (row.change_type) {
case 'sale':
summary.sales = {
count: parseInt(row.count),
units: parseInt(row.total_units) || 0,
revenue: parseFloat(row.total_revenue) || 0,
};
break;
case 'restock':
summary.restocks = {
count: parseInt(row.count),
units: parseInt(row.total_units) || 0,
};
break;
case 'price_change':
summary.priceChanges = { count: parseInt(row.count) };
break;
case 'new':
summary.newProducts = { count: parseInt(row.count) };
break;
case 'removed':
summary.removedProducts = { count: parseInt(row.count) };
break;
}
}
res.json(summary);
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
// ============================================================================
// RECENT CHANGES
// ============================================================================
/**
* GET /api/inventory/changes/:dispensaryId
* Get recent inventory changes
*/
router.get('/changes/:dispensaryId', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId);
const limit = parseInt(req.query.limit as string) || 50;
const changeType = req.query.type as string;
let whereClause = 'dispensary_id = $1';
const params: any[] = [dispensaryId, limit];
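// params already holds $1 = dispensary_id and $2 = LIMIT; the optional
// change_type filter therefore binds to $3.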
if (changeType) {
whereClause += ' AND change_type = $3';
params.push(changeType);
}
const result = await pool.query(`
SELECT
id,
product_id,
product_name,
brand_name,
option,
change_type,
quantity_before,
quantity_after,
quantity_delta,
price,
special_price,
is_special,
revenue,
category,
detected_at
FROM inventory_changes
WHERE ${whereClause}
ORDER BY detected_at DESC
LIMIT $2
`, params);
res.json({
dispensaryId,
changes: result.rows.map((row: any) => ({
id: row.id,
productId: row.product_id,
productName: row.product_name,
brand: row.brand_name,
option: row.option,
changeType: row.change_type,
quantityBefore: row.quantity_before,
quantityAfter: row.quantity_after,
quantityDelta: row.quantity_delta,
price: row.price ? parseFloat(row.price) : null,
specialPrice: row.special_price ? parseFloat(row.special_price) : null,
isOnSpecial: row.is_special,
revenue: row.revenue ? parseFloat(row.revenue) : null,
category: row.category,
detectedAt: row.detected_at,
})),
});
} catch (error: any) {
res.status(500).json({ error: error.message });
}
});
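// Illustrative usage sketch for the routes above (not wired into the router):
// assumes the /api/inventory mount shown in the JSDoc paths and bearer-token
// auth, which is an assumption - adjust to the real scheme.
async function fetchDispensarySummary(apiBase: string, token: string, dispensaryId: number, hours = 24) {
  const res = await fetch(`${apiBase}/api/inventory/summary/${dispensaryId}?hours=${hours}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Summary request failed: ${res.status}`);
  // Shape mirrors the handler: { dispensaryId, hours, sales, restocks, priceChanges, newProducts, removedProducts }
  return res.json();
}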
export default router;


@@ -0,0 +1,448 @@
/**
* Inventory Tracking API Routes
*
* Endpoints for high-frequency crawl management, inventory snapshots,
* and product visibility events.
*
* Routes are prefixed with /api/inventory
*/
import { Router, Request, Response } from 'express';
import { authMiddleware } from '../auth/middleware';
import { pool } from '../db/pool';
const router = Router();
// Apply auth middleware to all routes
router.use(authMiddleware);
// ============================================================
// HIGH-FREQUENCY CRAWL MANAGEMENT
// ============================================================
/**
* GET /high-frequency
* Get all stores configured for high-frequency crawling
*/
router.get('/high-frequency', async (req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
d.id,
d.name,
d.city,
d.state,
d.menu_type,
d.crawl_interval_minutes,
d.next_crawl_at,
d.last_crawl_at,
d.last_baseline_at,
(SELECT COUNT(*) FROM store_products sp WHERE sp.dispensary_id = d.id) as product_count,
(SELECT COUNT(*) FROM inventory_snapshots i WHERE i.dispensary_id = d.id AND i.captured_at > NOW() - INTERVAL '24 hours') as snapshots_24h,
(SELECT COUNT(*) FROM product_visibility_events e WHERE e.dispensary_id = d.id AND e.detected_at > NOW() - INTERVAL '24 hours') as events_24h
FROM dispensaries d
WHERE d.crawl_interval_minutes IS NOT NULL
ORDER BY d.crawl_interval_minutes ASC, d.name ASC
`);
res.json({ success: true, data: rows, count: rows.length });
} catch (error: any) {
console.error('[Inventory] High-frequency list error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /high-frequency/stats
* Get overall high-frequency crawling statistics
*/
router.get('/high-frequency/stats', async (req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
COUNT(*) FILTER (WHERE crawl_interval_minutes IS NOT NULL) as configured_stores,
COUNT(*) FILTER (WHERE crawl_interval_minutes = 15) as interval_15m,
COUNT(*) FILTER (WHERE crawl_interval_minutes = 30) as interval_30m,
COUNT(*) FILTER (WHERE crawl_interval_minutes = 60) as interval_1h,
COUNT(*) FILTER (WHERE crawl_interval_minutes = 120) as interval_2h,
COUNT(*) FILTER (WHERE crawl_interval_minutes = 240) as interval_4h,
(SELECT COUNT(*) FROM inventory_snapshots WHERE captured_at > NOW() - INTERVAL '24 hours') as total_snapshots_24h,
(SELECT COUNT(*) FROM product_visibility_events WHERE detected_at > NOW() - INTERVAL '24 hours') as total_events_24h
FROM dispensaries
`);
res.json({ success: true, data: rows[0] });
} catch (error: any) {
console.error('[Inventory] High-frequency stats error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* PUT /high-frequency/:id
* Set or update crawl interval for a dispensary
*/
router.put('/high-frequency/:id', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.id);
const { interval_minutes } = req.body;
if (!interval_minutes || ![15, 30, 60, 120, 240].includes(interval_minutes)) {
return res.status(400).json({
success: false,
error: 'Invalid interval. Must be one of: 15, 30, 60, 120, 240 minutes',
});
}
const { rows } = await pool.query(
`
UPDATE dispensaries
SET
crawl_interval_minutes = $2,
next_crawl_at = COALESCE(next_crawl_at, NOW())
WHERE id = $1
RETURNING id, name, crawl_interval_minutes, next_crawl_at
`,
[dispensaryId, interval_minutes]
);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Dispensary not found' });
}
res.json({ success: true, data: rows[0] });
} catch (error: any) {
console.error('[Inventory] Set high-frequency error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
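// Illustrative sketch: setting a 30-minute crawl interval from an admin script.
// Assumes bearer-token auth and the /api/inventory mount noted in the file header.
async function setCrawlInterval(apiBase: string, token: string, dispensaryId: number, intervalMinutes: 15 | 30 | 60 | 120 | 240) {
  const res = await fetch(`${apiBase}/api/inventory/high-frequency/${dispensaryId}`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
    body: JSON.stringify({ interval_minutes: intervalMinutes }),
  });
  if (!res.ok) throw new Error(`Failed to set crawl interval: ${res.status}`);
  return res.json(); // { success, data: { id, name, crawl_interval_minutes, next_crawl_at } }
}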
/**
* DELETE /high-frequency/:id
* Remove high-frequency crawling for a dispensary
*/
router.delete('/high-frequency/:id', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.id);
const { rows } = await pool.query(
`
UPDATE dispensaries
SET crawl_interval_minutes = NULL
WHERE id = $1
RETURNING id, name
`,
[dispensaryId]
);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Dispensary not found' });
}
res.json({ success: true, data: rows[0], message: 'High-frequency crawling disabled' });
} catch (error: any) {
console.error('[Inventory] Remove high-frequency error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// INVENTORY SNAPSHOTS
// ============================================================
/**
* GET /snapshots
* Get recent inventory snapshots with filtering
*/
router.get('/snapshots', async (req: Request, res: Response) => {
try {
const dispensaryId = req.query.dispensary_id
? parseInt(req.query.dispensary_id as string)
: undefined;
const changeType = req.query.change_type as string | undefined;
const brandName = req.query.brand as string | undefined;
const hours = req.query.hours ? parseInt(req.query.hours as string) : 24;
const limit = req.query.limit ? parseInt(req.query.limit as string) : 100;
const offset = req.query.offset ? parseInt(req.query.offset as string) : 0;
let whereClause = `WHERE i.captured_at > NOW() - INTERVAL '1 hour' * $1`;
const params: any[] = [hours];
let paramIndex = 2;
if (dispensaryId) {
whereClause += ` AND i.dispensary_id = $${paramIndex++}`;
params.push(dispensaryId);
}
if (changeType) {
whereClause += ` AND i.change_type = $${paramIndex++}`;
params.push(changeType);
}
if (brandName) {
whereClause += ` AND i.brand_name ILIKE $${paramIndex++}`;
params.push(`%${brandName}%`);
}
params.push(limit, offset);
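// limit and offset take the last two placeholders; the count query below drops
// them again via params.slice(0, -2).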
const { rows } = await pool.query(
`
SELECT
i.id,
i.dispensary_id,
d.name as dispensary_name,
d.state,
i.product_id,
i.product_name,
i.brand_name,
i.category,
i.change_type,
i.quantity_available,
i.prev_quantity,
i.qty_delta,
i.price_rec,
i.prev_price_rec,
i.price_delta,
i.revenue_rec,
i.captured_at,
i.platform
FROM inventory_snapshots i
JOIN dispensaries d ON d.id = i.dispensary_id
${whereClause}
ORDER BY i.captured_at DESC
LIMIT $${paramIndex++} OFFSET $${paramIndex}
`,
params
);
// Get total count
const countResult = await pool.query(
`SELECT COUNT(*) FROM inventory_snapshots i ${whereClause}`,
params.slice(0, -2)
);
res.json({
success: true,
data: rows,
count: rows.length,
total: parseInt(countResult.rows[0].count),
limit,
offset,
});
} catch (error: any) {
console.error('[Inventory] Snapshots error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /snapshots/stats
* Get snapshot statistics summary
*/
router.get('/snapshots/stats', async (req: Request, res: Response) => {
try {
const hours = req.query.hours ? parseInt(req.query.hours as string) : 24;
const { rows } = await pool.query(
`
SELECT
COUNT(*) as total_snapshots,
COUNT(*) FILTER (WHERE change_type = 'sale') as sales,
COUNT(*) FILTER (WHERE change_type = 'restock') as restocks,
COUNT(*) FILTER (WHERE change_type = 'price_change') as price_changes,
COUNT(*) FILTER (WHERE change_type = 'oos') as oos_events,
COUNT(*) FILTER (WHERE change_type = 'back_in_stock') as back_in_stock,
COUNT(*) FILTER (WHERE change_type = 'new_product') as new_products,
COALESCE(SUM(revenue_rec), 0) as total_revenue,
COUNT(DISTINCT dispensary_id) as stores_with_activity
FROM inventory_snapshots
WHERE captured_at > NOW() - INTERVAL '1 hour' * $1
`,
[hours]
);
res.json({ success: true, data: rows[0] });
} catch (error: any) {
console.error('[Inventory] Snapshot stats error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// VISIBILITY EVENTS
// ============================================================
/**
* GET /events
* Get recent visibility events with filtering
*/
router.get('/events', async (req: Request, res: Response) => {
try {
const dispensaryId = req.query.dispensary_id
? parseInt(req.query.dispensary_id as string)
: undefined;
const eventType = req.query.event_type as string | undefined;
const brandName = req.query.brand as string | undefined;
const hours = req.query.hours ? parseInt(req.query.hours as string) : 24;
const limit = req.query.limit ? parseInt(req.query.limit as string) : 100;
const offset = req.query.offset ? parseInt(req.query.offset as string) : 0;
const unacknowledged = req.query.unacknowledged === 'true';
let whereClause = `WHERE e.detected_at > NOW() - INTERVAL '1 hour' * $1`;
const params: any[] = [hours];
let paramIndex = 2;
if (dispensaryId) {
whereClause += ` AND e.dispensary_id = $${paramIndex++}`;
params.push(dispensaryId);
}
if (eventType) {
whereClause += ` AND e.event_type = $${paramIndex++}`;
params.push(eventType);
}
if (brandName) {
whereClause += ` AND e.brand_name ILIKE $${paramIndex++}`;
params.push(`%${brandName}%`);
}
if (unacknowledged) {
whereClause += ` AND e.acknowledged_at IS NULL`;
}
params.push(limit, offset);
const { rows } = await pool.query(
`
SELECT
e.id,
e.dispensary_id,
d.name as dispensary_name,
d.state,
e.product_id,
e.product_name,
e.brand_name,
e.event_type,
e.previous_price,
e.new_price,
e.price_change_pct,
e.detected_at,
e.notified,
e.acknowledged_at,
e.platform
FROM product_visibility_events e
JOIN dispensaries d ON d.id = e.dispensary_id
${whereClause}
ORDER BY e.detected_at DESC
LIMIT $${paramIndex++} OFFSET $${paramIndex}
`,
params
);
// Get total count
const countResult = await pool.query(
`SELECT COUNT(*) FROM product_visibility_events e ${whereClause}`,
params.slice(0, -2)
);
res.json({
success: true,
data: rows,
count: rows.length,
total: parseInt(countResult.rows[0].count),
limit,
offset,
});
} catch (error: any) {
console.error('[Inventory] Events error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /events/stats
* Get event statistics summary
*/
router.get('/events/stats', async (req: Request, res: Response) => {
try {
const hours = req.query.hours ? parseInt(req.query.hours as string) : 24;
const { rows } = await pool.query(
`
SELECT
COUNT(*) as total_events,
COUNT(*) FILTER (WHERE event_type = 'oos') as oos_events,
COUNT(*) FILTER (WHERE event_type = 'back_in_stock') as back_in_stock,
COUNT(*) FILTER (WHERE event_type = 'price_change') as price_changes,
COUNT(*) FILTER (WHERE event_type = 'brand_dropped') as brands_dropped,
COUNT(*) FILTER (WHERE event_type = 'brand_added') as brands_added,
COUNT(*) FILTER (WHERE acknowledged_at IS NULL) as unacknowledged,
COUNT(DISTINCT dispensary_id) as stores_with_events
FROM product_visibility_events
WHERE detected_at > NOW() - INTERVAL '1 hour' * $1
`,
[hours]
);
res.json({ success: true, data: rows[0] });
} catch (error: any) {
console.error('[Inventory] Event stats error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /events/:id/acknowledge
* Mark an event as acknowledged
*/
router.post('/events/:id/acknowledge', async (req: Request, res: Response) => {
try {
const eventId = parseInt(req.params.id);
const { rows } = await pool.query(
`
UPDATE product_visibility_events
SET acknowledged_at = NOW()
WHERE id = $1
RETURNING id, acknowledged_at
`,
[eventId]
);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Event not found' });
}
res.json({ success: true, data: rows[0] });
} catch (error: any) {
console.error('[Inventory] Acknowledge event error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /events/acknowledge-batch
* Mark multiple events as acknowledged
*/
router.post('/events/acknowledge-batch', async (req: Request, res: Response) => {
try {
const { event_ids } = req.body;
if (!event_ids || !Array.isArray(event_ids) || event_ids.length === 0) {
return res.status(400).json({ success: false, error: 'event_ids array required' });
}
const { rowCount } = await pool.query(
`
UPDATE product_visibility_events
SET acknowledged_at = NOW()
WHERE id = ANY($1) AND acknowledged_at IS NULL
`,
[event_ids]
);
res.json({ success: true, acknowledged: rowCount });
} catch (error: any) {
console.error('[Inventory] Acknowledge batch error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
export default router;


@@ -137,4 +137,72 @@ router.post('/workers/scale', async (req: Request, res: Response) => {
}
});
/**
* POST /api/k8s/workers/restart
* Restart worker deployment
* Scales the deployment to zero and back to its previous replica count
*/
router.post('/workers/restart', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: k8sError || 'K8s not available',
});
}
try {
// Restart by scaling to zero and then back to the previous replica count.
// Simpler than patching the pod template annotation (what `kubectl rollout restart`
// does), but workers are briefly offline while replicas are at zero.
const now = new Date().toISOString();
// Get current replicas
const deployment = await client.readNamespacedDeployment({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
});
const currentReplicas = deployment.spec?.replicas || 0;
if (currentReplicas === 0) {
return res.status(400).json({
success: false,
error: 'Deployment has 0 replicas - cannot restart',
});
}
// Scale to 0 then back up to trigger restart
await client.patchNamespacedDeploymentScale({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
body: { spec: { replicas: 0 } },
});
// Brief delay then scale back up
await new Promise(resolve => setTimeout(resolve, 2000));
await client.patchNamespacedDeploymentScale({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
body: { spec: { replicas: currentReplicas } },
});
console.log(`[K8s] Triggered restart of ${WORKER_DEPLOYMENT} (${currentReplicas} replicas)`);
res.json({
success: true,
message: `Restarted ${currentReplicas} workers`,
replicas: currentReplicas,
restartedAt: now,
});
} catch (e: any) {
console.error('[K8s] Error restarting deployment:', e.message);
res.status(500).json({
success: false,
error: e.message,
});
}
});
export default router;

371
backend/src/routes/pool.ts Normal file

@@ -0,0 +1,371 @@
/**
* Task Pool Control Routes
*
* Provides admin control over the task pool:
* - Open/close pool (enable/disable task claiming)
* - View pool status and statistics
* - Clear pending tasks
*/
import { Router, Request, Response } from 'express';
import { pool } from '../db/pool';
const router = Router();
/**
* GET /api/pool/status
* Get current pool status and statistics
*/
router.get('/status', async (_req: Request, res: Response) => {
try {
// Get pool config
const configResult = await pool.query(`
SELECT pool_open, closed_reason, closed_at, closed_by
FROM pool_config
LIMIT 1
`);
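// Fall back to an open pool when pool_config has no row yet.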
const config = configResult.rows[0] || { pool_open: true };
// Get task counts by status
const taskStats = await pool.query(`
SELECT
status,
COUNT(*) as count
FROM worker_tasks
GROUP BY status
`);
const statusCounts: Record<string, number> = {};
for (const row of taskStats.rows) {
statusCounts[row.status] = parseInt(row.count);
}
// Get pending tasks by role
const pendingByRole = await pool.query(`
SELECT
role,
COUNT(*) as count
FROM worker_tasks
WHERE status = 'pending'
GROUP BY role
ORDER BY count DESC
`);
// Get pending tasks by state
const pendingByState = await pool.query(`
SELECT
d.state,
COUNT(*) as count
FROM worker_tasks wt
JOIN dispensaries d ON d.id = wt.dispensary_id
WHERE wt.status = 'pending'
GROUP BY d.state
ORDER BY count DESC
`);
// Get active workers count
const activeWorkers = await pool.query(`
SELECT COUNT(*) as count
FROM worker_registry
WHERE status = 'active'
AND last_heartbeat_at > NOW() - INTERVAL '2 minutes'
`);
res.json({
success: true,
pool_open: config.pool_open,
closed_reason: config.closed_reason,
closed_at: config.closed_at,
closed_by: config.closed_by,
stats: {
total_pending: statusCounts['pending'] || 0,
total_claimed: statusCounts['claimed'] || 0,
total_running: statusCounts['running'] || 0,
total_completed: statusCounts['completed'] || 0,
total_failed: statusCounts['failed'] || 0,
active_workers: parseInt(activeWorkers.rows[0].count),
},
pending_by_role: pendingByRole.rows,
pending_by_state: pendingByState.rows,
});
} catch (error: any) {
console.error('[Pool] Status error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/pool/open
* Open the pool (allow task claiming)
*/
router.post('/open', async (_req: Request, res: Response) => {
try {
await pool.query(`
UPDATE pool_config
SET pool_open = true,
closed_reason = NULL,
closed_at = NULL,
closed_by = NULL,
opened_at = NOW()
`);
console.log('[Pool] Pool opened - workers can claim tasks');
res.json({
success: true,
pool_open: true,
message: 'Pool is now open - workers can claim tasks',
});
} catch (error: any) {
console.error('[Pool] Open error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/pool/close
* Close the pool (stop task claiming)
* Body: { reason?: string }
*/
router.post('/close', async (req: Request, res: Response) => {
try {
const { reason } = req.body;
await pool.query(`
UPDATE pool_config
SET pool_open = false,
closed_reason = $1,
closed_at = NOW(),
closed_by = 'admin'
`, [reason || 'Manually closed']);
console.log(`[Pool] Pool closed - reason: ${reason || 'Manually closed'}`);
res.json({
success: true,
pool_open: false,
message: 'Pool is now closed - workers cannot claim new tasks',
reason: reason || 'Manually closed',
});
} catch (error: any) {
console.error('[Pool] Close error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
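// Illustrative sketch: closing the pool ahead of a deploy. Whether extra auth
// headers are required is not shown in this file, so none are assumed here.
async function closePoolForDeploy(apiBase: string, reason: string) {
  const res = await fetch(`${apiBase}/api/pool/close`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ reason }),
  });
  if (!res.ok) throw new Error(`Pool close failed: ${res.status}`);
  return res.json(); // { success, pool_open: false, message, reason }
}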
/**
* POST /api/pool/clear
* Clear pending tasks (optionally by role or state)
* Body: { role?: string, state?: string, confirm: boolean }
*/
router.post('/clear', async (req: Request, res: Response) => {
try {
const { role, state, confirm } = req.body;
if (!confirm) {
return res.status(400).json({
success: false,
error: 'Must set confirm: true to clear tasks',
});
}
let whereClause = "WHERE status = 'pending'";
const params: any[] = [];
let paramIndex = 1;
if (role) {
whereClause += ` AND role = $${paramIndex++}`;
params.push(role);
}
if (state) {
whereClause += ` AND dispensary_id IN (SELECT id FROM dispensaries WHERE state = $${paramIndex++})`;
params.push(state);
}
// Count first
const countResult = await pool.query(
`SELECT COUNT(*) as count FROM worker_tasks ${whereClause}`,
params
);
const taskCount = parseInt(countResult.rows[0].count);
if (taskCount === 0) {
return res.json({
success: true,
cleared: 0,
message: 'No matching pending tasks found',
});
}
// Delete pending tasks
await pool.query(`DELETE FROM worker_tasks ${whereClause}`, params);
const filterDesc = [
role ? `role=${role}` : null,
state ? `state=${state}` : null,
].filter(Boolean).join(', ') || 'all';
console.log(`[Pool] Cleared ${taskCount} pending tasks (${filterDesc})`);
res.json({
success: true,
cleared: taskCount,
message: `Cleared ${taskCount} pending tasks`,
filter: { role, state },
});
} catch (error: any) {
console.error('[Pool] Clear error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/pool/tasks
* List tasks in pool with pagination
* Query: { status?, role?, state?, limit?, offset? }
*/
router.get('/tasks', async (req: Request, res: Response) => {
try {
const {
status = 'pending',
role,
state,
limit = '50',
offset = '0',
} = req.query;
const conditions: string[] = [];
const params: any[] = [];
let paramIndex = 1;
if (status && status !== 'all') {
conditions.push(`wt.status = $${paramIndex++}`);
params.push(status);
}
if (role) {
conditions.push(`wt.role = $${paramIndex++}`);
params.push(role);
}
if (state) {
conditions.push(`d.state = $${paramIndex++}`);
params.push(state);
}
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
params.push(parseInt(limit as string), parseInt(offset as string));
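// limit and offset bind to the final two placeholders, after any filter params.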
const result = await pool.query(`
SELECT
wt.id,
wt.role,
wt.platform,
wt.status,
wt.priority,
wt.method,
wt.worker_id,
wt.created_at,
wt.scheduled_for,
wt.claimed_at,
wt.started_at,
wt.completed_at,
wt.error_message,
d.id as dispensary_id,
d.name as dispensary_name,
d.city as dispensary_city,
d.state as dispensary_state
FROM worker_tasks wt
LEFT JOIN dispensaries d ON d.id = wt.dispensary_id
${whereClause}
ORDER BY wt.priority DESC, wt.created_at ASC
LIMIT $${paramIndex++} OFFSET $${paramIndex}
`, params);
res.json({
success: true,
tasks: result.rows,
pagination: {
limit: parseInt(limit as string),
offset: parseInt(offset as string),
returned: result.rows.length,
},
});
} catch (error: any) {
console.error('[Pool] Tasks error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* DELETE /api/pool/tasks/:taskId
* Remove a specific task from the pool
*/
router.delete('/tasks/:taskId', async (req: Request, res: Response) => {
try {
const { taskId } = req.params;
const result = await pool.query(`
DELETE FROM worker_tasks
WHERE id = $1 AND status = 'pending'
RETURNING id, role, dispensary_id
`, [taskId]);
if (result.rows.length === 0) {
return res.status(404).json({
success: false,
error: 'Task not found or not in pending status',
});
}
res.json({
success: true,
deleted: result.rows[0],
message: `Task #${taskId} removed from pool`,
});
} catch (error: any) {
console.error('[Pool] Delete task error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/pool/release-stale
* Release tasks that have been claimed/running too long
* Body: { stale_minutes?: number (default: 30) }
*/
router.post('/release-stale', async (req: Request, res: Response) => {
try {
const { stale_minutes = 30 } = req.body;
const result = await pool.query(`
UPDATE worker_tasks
SET status = 'pending',
worker_id = NULL,
claimed_at = NULL,
started_at = NULL,
error_message = 'Released due to stale timeout'
WHERE status IN ('claimed', 'running')
AND (claimed_at < NOW() - INTERVAL '1 minute' * $1
OR started_at < NOW() - INTERVAL '1 minute' * $1)
RETURNING id, role, dispensary_id, worker_id
`, [stale_minutes]);
console.log(`[Pool] Released ${result.rows.length} stale tasks`);
res.json({
success: true,
released: result.rows.length,
tasks: result.rows,
message: `Released ${result.rows.length} tasks older than ${stale_minutes} minutes`,
});
} catch (error: any) {
console.error('[Pool] Release stale error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
export default router;


@@ -532,6 +532,7 @@ router.get('/products', async (req: PublicApiRequest, res: Response) => {
// Query products with latest snapshot data
// Uses store_products + v_product_snapshots (canonical tables with raw_data)
// Join dispensaries to get menu_url for cart links
const { rows: products } = await pool.query(`
SELECT
p.id,
@@ -555,8 +556,12 @@ router.get('/products', async (req: PublicApiRequest, res: Response) => {
s.stock_quantity as total_quantity_available,
s.special,
s.crawled_at as snapshot_at,
d.menu_url as dispensary_menu_url,
d.slug as dispensary_slug,
d.menu_type as dispensary_menu_type,
${include_variants === 'true' || include_variants === '1' ? "s.raw_data->'POSMetaData'->'children' as variants_raw" : 'NULL as variants_raw'}
FROM store_products p
LEFT JOIN dispensaries d ON d.id = p.dispensary_id
LEFT JOIN LATERAL (
SELECT * FROM v_product_snapshots
WHERE store_product_id = p.id
@@ -611,6 +616,51 @@ router.get('/products', async (req: PublicApiRequest, res: Response) => {
? (p.price_rec_special ? parseFloat(p.price_rec_special).toFixed(2) : null)
: (p.price_med_special ? parseFloat(p.price_med_special).toFixed(2) : null);
// Build product-specific URL from dispensary slug and product name
let productUrl = p.dispensary_menu_url || null;
// Helper to create URL-safe slug from product name
const slugify = (str: string) => str
.toLowerCase()
.replace(/[|/\\]/g, ' ') // Replace separators with spaces
.replace(/[^a-z0-9\s-]/g, '') // Remove special chars
.trim()
.replace(/\s+/g, '-') // Spaces to hyphens
.replace(/-+/g, '-'); // Collapse multiple hyphens
if (p.name) {
const menuType = p.dispensary_menu_type?.toLowerCase();
const productSlug = slugify(p.name);
// Dutchie: https://dutchie.com/embedded-menu/{dispensary-slug}/product/{product-slug}
if (menuType === 'dutchie' && p.dispensary_slug) {
productUrl = `https://dutchie.com/embedded-menu/${p.dispensary_slug}/product/${productSlug}`;
}
// Jane/iHeartJane: https://www.iheartjane.com/stores/{store-slug}/products/{product-id}
else if ((menuType === 'jane' || menuType === 'iheartjane') && p.dispensary_slug) {
productUrl = `https://www.iheartjane.com/stores/${p.dispensary_slug}/products/${p.dutchie_id}`;
}
// Treez: https://www.treez.io/onlinemenu/{store-slug}?product={product-slug}
else if (menuType === 'treez' && p.dispensary_slug) {
productUrl = `https://www.treez.io/onlinemenu/${p.dispensary_slug}?product=${productSlug}`;
}
// Fallback: try to extract from menu_url using regex
else if (p.dispensary_menu_url) {
const dutchieMatch = p.dispensary_menu_url.match(/dutchie\.com\/(?:dispensary|embedded-menu)\/([^\/\?]+)/);
if (dutchieMatch) {
productUrl = `https://dutchie.com/embedded-menu/${dutchieMatch[1]}/product/${productSlug}`;
}
const janeMatch = p.dispensary_menu_url.match(/(?:iheartjane|jane)\.com\/(?:embed|stores)\/([^\/\?]+)/);
if (janeMatch) {
productUrl = `https://www.iheartjane.com/stores/${janeMatch[1]}/products/${p.dutchie_id}`;
}
const treezMatch = p.dispensary_menu_url.match(/treez\.io\/onlinemenu\/([^\/\?]+)/);
if (treezMatch) {
productUrl = `https://www.treez.io/onlinemenu/${treezMatch[1]}?product=${productSlug}`;
}
}
}
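// Worked example (hypothetical slug): a Dutchie store with slug "deeply-rooted-phoenix"
// and a product named "Blue Dream | 3.5g" slugifies to "blue-dream-35g", producing:
// https://dutchie.com/embedded-menu/deeply-rooted-phoenix/product/blue-dream-35g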
const result: any = {
id: p.id,
dispensary_id: p.dispensary_id,
@@ -626,6 +676,7 @@ router.get('/products', async (req: PublicApiRequest, res: Response) => {
thc_percentage: p.thc ? parseFloat(p.thc) : null,
cbd_percentage: p.cbd ? parseFloat(p.cbd) : null,
image_url: p.image_url || null,
menu_url: productUrl,
in_stock: p.stock_status === 'in_stock',
on_special: p.special || false,
quantity_available: p.total_quantity_available || 0,
@@ -934,8 +985,12 @@ router.get('/specials', async (req: PublicApiRequest, res: Response) => {
s.special,
s.options,
p.updated_at,
s.crawled_at as snapshot_at
s.crawled_at as snapshot_at,
d.menu_url as dispensary_menu_url,
d.slug as dispensary_slug,
d.menu_type as dispensary_menu_type
FROM v_products p
LEFT JOIN dispensaries d ON d.id = p.dispensary_id
INNER JOIN LATERAL (
SELECT * FROM v_product_snapshots
WHERE store_product_id = p.id
@@ -961,7 +1016,53 @@ router.get('/specials', async (req: PublicApiRequest, res: Response) => {
${whereClause}
`, countParams);
const transformedProducts = products.map((p) => ({
// Helper to create URL-safe slug from product name
const slugify = (str: string) => str
.toLowerCase()
.replace(/[|/\\]/g, ' ') // Replace separators with spaces
.replace(/[^a-z0-9\s-]/g, '') // Remove special chars
.trim()
.replace(/\s+/g, '-') // Spaces to hyphens
.replace(/-+/g, '-'); // Collapse multiple hyphens
const transformedProducts = products.map((p) => {
// Build product-specific URL from dispensary slug and product name
let productUrl = p.dispensary_menu_url || null;
if (p.name) {
const menuType = p.dispensary_menu_type?.toLowerCase();
const productSlug = slugify(p.name);
// Dutchie: https://dutchie.com/embedded-menu/{dispensary-slug}/product/{product-slug}
if (menuType === 'dutchie' && p.dispensary_slug) {
productUrl = `https://dutchie.com/embedded-menu/${p.dispensary_slug}/product/${productSlug}`;
}
// Jane/iHeartJane: https://www.iheartjane.com/stores/{store-slug}/products/{product-id}
else if ((menuType === 'jane' || menuType === 'iheartjane') && p.dispensary_slug) {
productUrl = `https://www.iheartjane.com/stores/${p.dispensary_slug}/products/${p.dutchie_id}`;
}
// Treez: https://www.treez.io/onlinemenu/{store-slug}?product={product-slug}
else if (menuType === 'treez' && p.dispensary_slug) {
productUrl = `https://www.treez.io/onlinemenu/${p.dispensary_slug}?product=${productSlug}`;
}
// Fallback: try to extract from menu_url using regex
else if (p.dispensary_menu_url) {
const dutchieMatch = p.dispensary_menu_url.match(/dutchie\.com\/(?:dispensary|embedded-menu)\/([^\/\?]+)/);
if (dutchieMatch) {
productUrl = `https://dutchie.com/embedded-menu/${dutchieMatch[1]}/product/${productSlug}`;
}
const janeMatch = p.dispensary_menu_url.match(/(?:iheartjane|jane)\.com\/(?:embed|stores)\/([^\/\?]+)/);
if (janeMatch) {
productUrl = `https://www.iheartjane.com/stores/${janeMatch[1]}/products/${p.dutchie_id}`;
}
const treezMatch = p.dispensary_menu_url.match(/treez\.io\/onlinemenu\/([^\/\?]+)/);
if (treezMatch) {
productUrl = `https://www.treez.io/onlinemenu/${treezMatch[1]}?product=${productSlug}`;
}
}
}
return {
id: p.id,
dispensary_id: p.dispensary_id,
dutchie_id: p.dutchie_id,
@@ -972,11 +1073,13 @@ router.get('/specials', async (req: PublicApiRequest, res: Response) => {
regular_price: p.rec_min_price_cents ? (p.rec_min_price_cents / 100).toFixed(2) : null,
sale_price: p.rec_min_special_price_cents ? (p.rec_min_special_price_cents / 100).toFixed(2) : null,
image_url: p.image_url || null,
menu_url: productUrl,
in_stock: p.stock_status === 'in_stock',
options: p.options || [],
updated_at: p.updated_at,
snapshot_at: p.snapshot_at
}));
};
});
res.json({
success: true,
@@ -2002,6 +2105,7 @@ router.get('/menu', async (req: PublicApiRequest, res: Response) => {
success: true,
scope: scope.type,
dispensary: scope.type === 'wordpress' ? req.apiPermission?.store_name : undefined,
store_id: dispensaryId || null,
menu: {
total_products: parseInt(summary.total_products || '0', 10),
in_stock_count: parseInt(summary.in_stock_count || '0', 10),


@@ -0,0 +1,295 @@
/**
* Sales Analytics API Routes
*
* Market intelligence endpoints for sales velocity, brand market share,
* store performance, and product intelligence.
*
* Routes are prefixed with /api/sales-analytics
*
* Data Sources (materialized views):
* - mv_daily_sales_estimates: Daily sales from inventory deltas
* - mv_brand_market_share: Brand penetration by state
* - mv_sku_velocity: SKU velocity rankings
* - mv_store_performance: Dispensary performance rankings
* - mv_category_weekly_trends: Weekly category trends
* - mv_product_intelligence: Per-product Hoodie-style metrics
*/
import { Router, Request, Response } from 'express';
import { authMiddleware } from '../auth/middleware';
import salesAnalyticsService from '../services/analytics/SalesAnalyticsService';
import { TimeWindow, getDateRangeFromWindow } from '../services/analytics/types';
const router = Router();
// Apply auth middleware to all routes
router.use(authMiddleware);
// ============================================================
// DAILY SALES ESTIMATES
// ============================================================
/**
* GET /daily-sales
* Get daily sales estimates by product/dispensary
*/
router.get('/daily-sales', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const brandName = req.query.brand as string | undefined;
const category = req.query.category as string | undefined;
const dispensaryId = req.query.dispensary_id
? parseInt(req.query.dispensary_id as string)
: undefined;
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 100;
const result = await salesAnalyticsService.getDailySalesEstimates({
stateCode,
brandName,
category,
dispensaryId,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Daily sales error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// BRAND MARKET SHARE
// ============================================================
/**
* GET /brand-market-share
* Get brand market share (penetration) by state
*/
router.get('/brand-market-share', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const brandName = req.query.brand as string | undefined;
const minPenetration = req.query.min_penetration
? parseFloat(req.query.min_penetration as string)
: 0;
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 100;
const result = await salesAnalyticsService.getBrandMarketShare({
stateCode,
brandName,
minPenetration,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Brand market share error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// SKU VELOCITY
// ============================================================
/**
* GET /sku-velocity
* Get SKU velocity rankings
*/
router.get('/sku-velocity', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const brandName = req.query.brand as string | undefined;
const category = req.query.category as string | undefined;
const dispensaryId = req.query.dispensary_id
? parseInt(req.query.dispensary_id as string)
: undefined;
const velocityTier = req.query.tier as 'hot' | 'steady' | 'slow' | 'stale' | undefined;
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 100;
const result = await salesAnalyticsService.getSkuVelocity({
stateCode,
brandName,
category,
dispensaryId,
velocityTier,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] SKU velocity error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
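// Illustrative sketch: pulling the "hot" velocity tier for one state. The state-code
// format ("AZ" here) is an assumption; use whatever convention the service stores.
async function fetchHotSkus(apiBase: string, token: string, state = 'AZ') {
  const params = new URLSearchParams({ state, tier: 'hot', limit: '25' });
  const res = await fetch(`${apiBase}/api/sales-analytics/sku-velocity?${params}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`SKU velocity request failed: ${res.status}`);
  const body = await res.json();
  return body.data; // rows sourced from mv_sku_velocity
}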
// ============================================================
// STORE PERFORMANCE
// ============================================================
/**
* GET /store-performance
* Get dispensary performance rankings
*/
router.get('/store-performance', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const sortBy = (req.query.sort_by as 'revenue' | 'units' | 'brands' | 'skus') || 'revenue';
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 100;
const result = await salesAnalyticsService.getStorePerformance({
stateCode,
sortBy,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Store performance error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// CATEGORY TRENDS
// ============================================================
/**
* GET /category-trends
* Get weekly category performance trends
*/
router.get('/category-trends', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const category = req.query.category as string | undefined;
const weeks = req.query.weeks
? parseInt(req.query.weeks as string)
: 12;
const result = await salesAnalyticsService.getCategoryTrends({
stateCode,
category,
weeks,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Category trends error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// PRODUCT INTELLIGENCE (Hoodie-style metrics)
// ============================================================
/**
* GET /product-intelligence
* Get per-product metrics including stock_diff_120, days_since_oos, days_until_stock_out
*/
router.get('/product-intelligence', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const brandName = req.query.brand as string | undefined;
const category = req.query.category as string | undefined;
const dispensaryId = req.query.dispensary_id
? parseInt(req.query.dispensary_id as string)
: undefined;
const inStockOnly = req.query.in_stock === 'true';
const lowStock = req.query.low_stock === 'true';
const recentOOS = req.query.recent_oos === 'true';
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 100;
const result = await salesAnalyticsService.getProductIntelligence({
stateCode,
brandName,
category,
dispensaryId,
inStockOnly,
lowStock,
recentOOS,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Product intelligence error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// TOP BRANDS
// ============================================================
/**
* GET /top-brands
* Get top selling brands by revenue
*/
router.get('/top-brands', async (req: Request, res: Response) => {
try {
const stateCode = req.query.state as string | undefined;
const window = (req.query.window as TimeWindow) || '30d';
const limit = req.query.limit
? parseInt(req.query.limit as string)
: 50;
const result = await salesAnalyticsService.getTopBrands({
stateCode,
window,
limit,
});
res.json({ success: true, data: result, count: result.length });
} catch (error: any) {
console.error('[SalesAnalytics] Top brands error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// VIEW MANAGEMENT
// ============================================================
/**
* POST /refresh
* Manually refresh materialized views (admin only)
*/
router.post('/refresh', async (req: Request, res: Response) => {
try {
console.log('[SalesAnalytics] Manual view refresh requested');
const result = await salesAnalyticsService.refreshViews();
console.log('[SalesAnalytics] View refresh complete:', result);
res.json({ success: true, data: result });
} catch (error: any) {
console.error('[SalesAnalytics] Refresh error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /stats
* Get view statistics (row counts for each materialized view)
*/
router.get('/stats', async (req: Request, res: Response) => {
try {
const stats = await salesAnalyticsService.getViewStats();
res.json({ success: true, data: stats });
} catch (error: any) {
console.error('[SalesAnalytics] Stats error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
export default router;


@@ -26,6 +26,7 @@
*/
import { Router, Request, Response } from 'express';
import { authMiddleware } from '../auth/middleware';
import {
taskService,
TaskRole,
@@ -597,7 +598,7 @@ router.delete('/schedules/:id', async (req: Request, res: Response) => {
});
}
// Delete the schedule
// Delete the schedule (pending tasks remain in pool for manual management)
await pool.query(`DELETE FROM task_schedules WHERE id = $1`, [scheduleId]);
res.json({
@@ -1918,4 +1919,292 @@ router.get('/pools/:id', async (req: Request, res: Response) => {
}
});
// ============================================================
// INVENTORY SNAPSHOTS API
// Part of Real-Time Inventory Tracking feature
// ============================================================
/**
* GET /inventory-snapshots
* Get inventory snapshots with optional filters
*/
router.get('/inventory-snapshots', authMiddleware, async (req: Request, res: Response) => {
try {
const dispensaryId = req.query.dispensary_id ? parseInt(req.query.dispensary_id as string) : undefined;
const productId = req.query.product_id as string | undefined;
const limit = req.query.limit ? parseInt(req.query.limit as string) : 100;
const offset = req.query.offset ? parseInt(req.query.offset as string) : 0;
let query = `
SELECT
s.id,
s.dispensary_id,
d.name as dispensary_name,
s.product_id,
s.platform,
s.quantity_available,
s.is_below_threshold,
s.status,
s.price_rec,
s.price_med,
s.brand_name,
s.category,
s.product_name,
s.captured_at
FROM inventory_snapshots s
JOIN dispensaries d ON d.id = s.dispensary_id
WHERE 1=1
`;
const params: any[] = [];
let paramIndex = 1;
if (dispensaryId) {
query += ` AND s.dispensary_id = $${paramIndex++}`;
params.push(dispensaryId);
}
if (productId) {
query += ` AND s.product_id = $${paramIndex++}`;
params.push(productId);
}
query += ` ORDER BY s.captured_at DESC LIMIT $${paramIndex++} OFFSET $${paramIndex++}`;
params.push(limit, offset);
const { rows } = await pool.query(query, params);
// Get total count
let countQuery = `SELECT COUNT(*) FROM inventory_snapshots WHERE 1=1`;
const countParams: any[] = [];
let countParamIndex = 1;
if (dispensaryId) {
countQuery += ` AND dispensary_id = $${countParamIndex++}`;
countParams.push(dispensaryId);
}
if (productId) {
countQuery += ` AND product_id = $${countParamIndex++}`;
countParams.push(productId);
}
const { rows: countRows } = await pool.query(countQuery, countParams);
const total = parseInt(countRows[0].count);
res.json({
success: true,
snapshots: rows,
count: total,
limit,
offset,
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
/**
* GET /inventory-snapshots/stats
* Get inventory snapshot statistics
*/
router.get('/inventory-snapshots/stats', authMiddleware, async (req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
COUNT(*) as total_snapshots,
COUNT(DISTINCT dispensary_id) as stores_tracked,
COUNT(DISTINCT product_id) as products_tracked,
MIN(captured_at) as oldest_snapshot,
MAX(captured_at) as newest_snapshot,
COUNT(*) FILTER (WHERE captured_at > NOW() - INTERVAL '24 hours') as snapshots_24h,
COUNT(*) FILTER (WHERE captured_at > NOW() - INTERVAL '1 hour') as snapshots_1h
FROM inventory_snapshots
`);
res.json({
success: true,
stats: rows[0],
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
// ============================================================
// VISIBILITY EVENTS API
// Part of Real-Time Inventory Tracking feature
// ============================================================
/**
* GET /visibility-events
* Get visibility events with optional filters
*/
router.get('/visibility-events', authMiddleware, async (req: Request, res: Response) => {
try {
const dispensaryId = req.query.dispensary_id ? parseInt(req.query.dispensary_id as string) : undefined;
const brand = req.query.brand as string | undefined;
const eventType = req.query.event_type as string | undefined;
const limit = req.query.limit ? parseInt(req.query.limit as string) : 100;
const offset = req.query.offset ? parseInt(req.query.offset as string) : 0;
let query = `
SELECT
e.id,
e.dispensary_id,
d.name as dispensary_name,
e.product_id,
e.product_name,
e.brand_name,
e.event_type,
e.detected_at,
e.previous_quantity,
e.previous_price,
e.new_price,
e.price_change_pct,
e.platform,
e.notified,
e.acknowledged_at
FROM product_visibility_events e
JOIN dispensaries d ON d.id = e.dispensary_id
WHERE 1=1
`;
const params: any[] = [];
let paramIndex = 1;
if (dispensaryId) {
query += ` AND e.dispensary_id = $${paramIndex++}`;
params.push(dispensaryId);
}
if (brand) {
query += ` AND e.brand_name ILIKE $${paramIndex++}`;
params.push(`%${brand}%`);
}
if (eventType) {
query += ` AND e.event_type = $${paramIndex++}`;
params.push(eventType);
}
query += ` ORDER BY e.detected_at DESC LIMIT $${paramIndex++} OFFSET $${paramIndex++}`;
params.push(limit, offset);
const { rows } = await pool.query(query, params);
// Get total count
let countQuery = `SELECT COUNT(*) FROM product_visibility_events WHERE 1=1`;
const countParams: any[] = [];
let countParamIndex = 1;
if (dispensaryId) {
countQuery += ` AND dispensary_id = $${countParamIndex++}`;
countParams.push(dispensaryId);
}
if (brand) {
countQuery += ` AND brand_name ILIKE $${countParamIndex++}`;
countParams.push(`%${brand}%`);
}
if (eventType) {
countQuery += ` AND event_type = $${countParamIndex++}`;
countParams.push(eventType);
}
const { rows: countRows } = await pool.query(countQuery, countParams);
const total = parseInt(countRows[0].count);
res.json({
success: true,
events: rows,
count: total,
limit,
offset,
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
/**
* GET /visibility-events/stats
* Get visibility event statistics
*/
router.get('/visibility-events/stats', authMiddleware, async (req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
COUNT(*) as total_events,
COUNT(*) FILTER (WHERE event_type = 'oos') as oos_events,
COUNT(*) FILTER (WHERE event_type = 'back_in_stock') as back_in_stock_events,
COUNT(*) FILTER (WHERE event_type = 'brand_dropped') as brand_dropped_events,
COUNT(*) FILTER (WHERE event_type = 'brand_added') as brand_added_events,
COUNT(*) FILTER (WHERE event_type = 'price_change') as price_change_events,
COUNT(*) FILTER (WHERE detected_at > NOW() - INTERVAL '24 hours') as events_24h,
COUNT(*) FILTER (WHERE acknowledged_at IS NOT NULL) as acknowledged_events,
COUNT(*) FILTER (WHERE notified = TRUE) as notified_events
FROM product_visibility_events
`);
res.json({
success: true,
stats: rows[0],
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
/**
* POST /visibility-events/:id/acknowledge
* Acknowledge a visibility event
*/
router.post('/visibility-events/:id/acknowledge', authMiddleware, async (req: Request, res: Response) => {
try {
const eventId = parseInt(req.params.id);
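// Record who acknowledged; falls back to 'unknown' if the middleware did not attach a user email.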
const acknowledgedBy = (req as any).user?.email || 'unknown';
await pool.query(`
UPDATE product_visibility_events
SET acknowledged_at = NOW(),
acknowledged_by = $2
WHERE id = $1
`, [eventId, acknowledgedBy]);
res.json({
success: true,
message: 'Event acknowledged',
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
/**
* POST /visibility-events/acknowledge-bulk
* Acknowledge multiple visibility events
*/
router.post('/visibility-events/acknowledge-bulk', authMiddleware, async (req: Request, res: Response) => {
try {
const { event_ids } = req.body;
if (!event_ids || !Array.isArray(event_ids)) {
return res.status(400).json({ success: false, error: 'event_ids array required' });
}
const acknowledgedBy = (req as any).user?.email || 'unknown';
const { rowCount } = await pool.query(`
UPDATE product_visibility_events
SET acknowledged_at = NOW(),
acknowledged_by = $2
WHERE id = ANY($1)
`, [event_ids, acknowledgedBy]);
res.json({
success: true,
message: `${rowCount} events acknowledged`,
count: rowCount,
});
} catch (err: any) {
res.status(500).json({ success: false, error: err.message });
}
});
export default router;


@@ -0,0 +1,224 @@
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
interface VariantStock {
productId: string;
productName: string;
brand: string;
option: string;
quantity: number;
price: number; // regular price
specialPrice: number | null; // sale price if on special
isSpecial: boolean;
}
function extractVariantStock(product: any): VariantStock[] {
const variants: VariantStock[] = [];
const children = product.POSMetaData?.children || [];
const options = product.Options || [];
const prices = product.Prices || [];
const specialPrices = product.recSpecialPrices || [];
const isSpecial = product.special === true;
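// Assumes Options, Prices, and recSpecialPrices are parallel arrays indexed the same
// way as POSMetaData.children (an assumption about the payload shape).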
for (let i = 0; i < children.length; i++) {
const child = children[i];
variants.push({
productId: product.id || product._id,
productName: product.Name,
brand: product.brand?.name || product.brandName || 'Unknown',
option: child.option || options[i] || `variant_${i}`,
quantity: child.quantity || 0,
price: child.price || child.recPrice || prices[i] || 0,
specialPrice: isSpecial ? (specialPrices[i] || null) : null,
isSpecial,
});
}
return variants;
}
async function main() {
// Compare Dec 13 (before) to Dec 17 (after/today)
const beforeKey = 'payloads/2025/12/13/store_112_1765609042078.json.gz';
const afterKey = 'payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json';
console.log('Loading payloads...');
const before = await fetchPayload(beforeKey);
const after = await fetchPayload(afterKey);
const beforeProducts = Array.isArray(before) ? before : before.products || [];
const afterProducts = Array.isArray(after) ? after : after.products || [];
console.log(`Before (Dec 13): ${beforeProducts.length} products`);
console.log(`After (Dec 17): ${afterProducts.length} products`);
// Build variant stock maps keyed by productId_option
const beforeStock = new Map<string, VariantStock>();
const afterStock = new Map<string, VariantStock>();
for (const p of beforeProducts) {
for (const v of extractVariantStock(p)) {
beforeStock.set(`${v.productId}_${v.option}`, v);
}
}
for (const p of afterProducts) {
for (const v of extractVariantStock(p)) {
afterStock.set(`${v.productId}_${v.option}`, v);
}
}
console.log(`\nBefore variants: ${beforeStock.size}`);
console.log(`After variants: ${afterStock.size}`);
// Calculate sales
interface Sale {
productName: string;
brand: string;
option: string;
qtySold: number;
priceEach: number;
revenue: number;
wasSpecial: boolean;
}
const sales: Sale[] = [];
let totalRevenue = 0;
let totalUnits = 0;
// Products that existed before and quantity decreased
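// Worked example (hypothetical numbers): a variant at qty 12 on Dec 13 and qty 7 on
// Dec 17, priced at $25.00 in the after snapshot, counts as 5 units × $25.00 = $125.00.
// Restocks between the two snapshots are not visible here, so deltas can understate sales.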
for (const [key, beforeVariant] of beforeStock) {
const afterVariant = afterStock.get(key);
if (!afterVariant) {
// Variant missing from the after snapshot - treat remaining stock as sold out.
// Today's price is unavailable (the product is gone), so fall back to the
// before-snapshot price (special price if it was on special).
const priceEach = beforeVariant.specialPrice || beforeVariant.price;
const qtySold = beforeVariant.quantity;
if (qtySold > 0 && priceEach > 0) {
sales.push({
productName: beforeVariant.productName,
brand: beforeVariant.brand,
option: beforeVariant.option,
qtySold,
priceEach,
revenue: qtySold * priceEach,
wasSpecial: beforeVariant.isSpecial,
});
totalRevenue += qtySold * priceEach;
totalUnits += qtySold;
}
} else if (afterVariant.quantity < beforeVariant.quantity) {
// Quantity decreased - use TODAY's price (afterVariant)
const priceEach = afterVariant.specialPrice || afterVariant.price;
const qtySold = beforeVariant.quantity - afterVariant.quantity;
if (qtySold > 0 && priceEach > 0) {
sales.push({
productName: afterVariant.productName,
brand: afterVariant.brand,
option: afterVariant.option,
qtySold,
priceEach,
revenue: qtySold * priceEach,
wasSpecial: afterVariant.isSpecial,
});
totalRevenue += qtySold * priceEach;
totalUnits += qtySold;
}
}
}
// Sort by revenue descending
sales.sort((a, b) => b.revenue - a.revenue);
console.log(`\n${'='.repeat(70)}`);
console.log('SALES REPORT: Dec 13 -> Dec 17 (Deeply Rooted Phoenix)');
console.log('='.repeat(70));
console.log(`\n Total Units Sold: ${totalUnits}`);
console.log(` Total Revenue: $${totalRevenue.toFixed(2)}`);
console.log(` Avg Price/Unit: $${totalUnits > 0 ? (totalRevenue / totalUnits).toFixed(2) : '0'}`);
// Top selling products
console.log(`\n=== TOP 20 PRODUCTS BY REVENUE ===`);
sales.slice(0, 20).forEach((s, i) => {
const specialTag = s.wasSpecial ? ' [SPECIAL]' : '';
console.log(` ${i + 1}. ${s.brand} - ${s.productName} (${s.option})`);
console.log(`      ${s.qtySold} units × $${s.priceEach.toFixed(2)} = $${s.revenue.toFixed(2)}${specialTag}`);
});
// Revenue by brand
const brandRevenue = new Map<string, { units: number; revenue: number }>();
for (const s of sales) {
const current = brandRevenue.get(s.brand) || { units: 0, revenue: 0 };
current.units += s.qtySold;
current.revenue += s.revenue;
brandRevenue.set(s.brand, current);
}
const sortedBrands = [...brandRevenue.entries()]
.sort((a, b) => b[1].revenue - a[1].revenue);
console.log(`\n=== REVENUE BY BRAND ===`);
sortedBrands.slice(0, 15).forEach(([brand, data]) => {
const pct = ((data.revenue / totalRevenue) * 100).toFixed(1);
console.log(` ${brand}`);
console.log(` ${data.units} units, $${data.revenue.toFixed(2)} (${pct}%)`);
});
// Special vs Regular pricing breakdown
const specialSales = sales.filter(s => s.wasSpecial);
const regularSales = sales.filter(s => !s.wasSpecial);
const specialRevenue = specialSales.reduce((sum, s) => sum + s.revenue, 0);
const specialUnits = specialSales.reduce((sum, s) => sum + s.qtySold, 0);
const regularRevenue = regularSales.reduce((sum, s) => sum + s.revenue, 0);
const regularUnits = regularSales.reduce((sum, s) => sum + s.qtySold, 0);
console.log(`\n=== SPECIAL vs REGULAR PRICING ===`);
console.log(` Special (sale) items:`);
console.log(` ${specialUnits} units, $${specialRevenue.toFixed(2)} (${((specialRevenue / totalRevenue) * 100).toFixed(1)}%)`);
console.log(` Regular price items:`);
console.log(` ${regularUnits} units, $${regularRevenue.toFixed(2)} (${((regularRevenue / totalRevenue) * 100).toFixed(1)}%)`);
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,202 @@
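/**
 * Intra-day quantity comparison for Deeply Rooted Phoenix (store 112).
 * Loads three Dec 13 payload snapshots from MinIO and reports how quantity,
 * quantityAvailable, and kioskQuantityAvailable changed between consecutive
 * snapshots, plus a full-day sold/restocked summary.
 */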
import { S3Client, GetObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
function extractTimestamp(key: string): Date {
const match = key.match(/_(\d{13})/);
if (match) {
return new Date(parseInt(match[1]));
}
return new Date();
}
interface VariantQty {
productName: string;
brand: string;
option: string;
quantity: number;
quantityAvailable: number;
kioskQuantityAvailable: number;
}
function extractVariants(products: any[]): Map<string, VariantQty> {
const map = new Map<string, VariantQty>();
for (const p of products) {
const children = p.POSMetaData?.children || [];
for (const child of children) {
const key = `${p.id}_${child.option}`;
map.set(key, {
productName: p.Name,
brand: p.brand?.name || 'Unknown',
option: child.option,
quantity: child.quantity || 0,
quantityAvailable: child.quantityAvailable ?? -999,
kioskQuantityAvailable: child.kioskQuantityAvailable ?? -999,
});
}
}
return map;
}
async function main() {
// Get Dec 13 payloads (same day, different times)
const keys = [
'payloads/2025/12/13/store_112_1765609042078.json.gz', // 6:57 AM
'payloads/2025/12/13/store_112_1765626591000.json.gz', // 11:49 AM
'payloads/2025/12/13/store_112_1765648448421.json.gz', // 5:54 PM
];
console.log('Loading Dec 13 payloads (same day)...\n');
const snapshots: { time: Date; variants: Map<string, VariantQty> }[] = [];
for (const key of keys) {
const timestamp = extractTimestamp(key);
const data = await fetchPayload(key);
const products = Array.isArray(data) ? data : data.products || [];
const variants = extractVariants(products);
console.log(`${timestamp.toLocaleTimeString()}: ${products.length} products, ${variants.size} variants`);
snapshots.push({ time: timestamp, variants });
}
// Compare consecutive snapshots
console.log('\n' + '='.repeat(80));
console.log('QUANTITY CHANGES BETWEEN SNAPSHOTS');
console.log('='.repeat(80));
for (let i = 1; i < snapshots.length; i++) {
const prev = snapshots[i - 1];
const curr = snapshots[i];
const hoursDiff = (curr.time.getTime() - prev.time.getTime()) / (1000 * 60 * 60);
console.log(`\n--- ${prev.time.toLocaleTimeString()} → ${curr.time.toLocaleTimeString()} (${hoursDiff.toFixed(1)} hours) ---`);
let qtyChanges = 0;
let qtyAvailChanges = 0;
let kioskQtyChanges = 0;
const changes: any[] = [];
for (const [key, currVariant] of curr.variants) {
const prevVariant = prev.variants.get(key);
if (!prevVariant) continue;
const qtyDiff = currVariant.quantity - prevVariant.quantity;
const qtyAvailDiff = currVariant.quantityAvailable - prevVariant.quantityAvailable;
const kioskDiff = currVariant.kioskQuantityAvailable - prevVariant.kioskQuantityAvailable;
if (qtyDiff !== 0) {
qtyChanges++;
changes.push({
name: currVariant.productName,
brand: currVariant.brand,
option: currVariant.option,
field: 'quantity',
before: prevVariant.quantity,
after: currVariant.quantity,
diff: qtyDiff,
});
}
if (qtyAvailDiff !== 0 && prevVariant.quantityAvailable !== -999) {
qtyAvailChanges++;
if (qtyDiff === 0) { // Only log if quantity didn't change
changes.push({
name: currVariant.productName,
brand: currVariant.brand,
option: currVariant.option,
field: 'quantityAvailable',
before: prevVariant.quantityAvailable,
after: currVariant.quantityAvailable,
diff: qtyAvailDiff,
});
}
}
if (kioskDiff !== 0 && prevVariant.kioskQuantityAvailable !== -999) {
kioskQtyChanges++;
}
}
console.log(` quantity changes: ${qtyChanges}`);
console.log(` quantityAvailable changes: ${qtyAvailChanges}`);
console.log(` kioskQuantityAvailable changes: ${kioskQtyChanges}`);
// Show examples
if (changes.length > 0) {
console.log(`\n Examples:`);
changes.slice(0, 15).forEach(c => {
const sign = c.diff > 0 ? '+' : '';
console.log(` ${c.brand} - ${c.name} (${c.option})`);
console.log(`      ${c.field}: ${c.before} → ${c.after} (${sign}${c.diff})`);
});
if (changes.length > 15) console.log(` ... and ${changes.length - 15} more`);
}
}
// Now check the full day totals
console.log('\n' + '='.repeat(80));
console.log('FULL DAY SUMMARY (6:57 AM → 5:54 PM)');
console.log('='.repeat(80));
const first = snapshots[0];
const last = snapshots[snapshots.length - 1];
const totalHours = (last.time.getTime() - first.time.getTime()) / (1000 * 60 * 60);
let totalSold = 0;
let totalRestocked = 0;
for (const [key, lastVariant] of last.variants) {
const firstVariant = first.variants.get(key);
if (!firstVariant) continue;
const diff = firstVariant.quantity - lastVariant.quantity;
if (diff > 0) totalSold += diff;
if (diff < 0) totalRestocked += Math.abs(diff);
}
console.log(`\n Time span: ${totalHours.toFixed(1)} hours`);
console.log(` Total units sold: ${totalSold}`);
console.log(` Total units restocked: ${totalRestocked}`);
console.log(` Sales rate: ${(totalSold / totalHours).toFixed(1)} units/hour`);
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,127 @@
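/**
 * Payload freshness check for a single Dec 17 capture of store 112.
 * Compares each product's updatedAt (and createdAt) against the capture
 * timestamp embedded in the object key to show how recently Dutchie data
 * was refreshed, bucketed into a simple age distribution.
 */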
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
async function main() {
const key = 'payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json';
const data = await fetchPayload(key);
const products = Array.isArray(data) ? data : data.products || [];
// Payload capture time
const captureTime = new Date(1765939307492);
console.log(`Payload captured at: ${captureTime.toISOString()}`);
console.log(`Products: ${products.length}\n`);
// Check updatedAt for all products
const updates: { name: string; updatedAt: Date; minutesAgo: number }[] = [];
for (const p of products) {
if (p.updatedAt) {
const updated = new Date(p.updatedAt);
const minutesAgo = (captureTime.getTime() - updated.getTime()) / (1000 * 60);
updates.push({
name: p.Name,
updatedAt: updated,
minutesAgo,
});
}
}
// Sort by most recent
updates.sort((a, b) => a.minutesAgo - b.minutesAgo);
console.log('=== MOST RECENTLY UPDATED PRODUCTS ===');
console.log('(relative to payload capture time)\n');
updates.slice(0, 30).forEach((u, i) => {
const ago = u.minutesAgo < 60
? `${u.minutesAgo.toFixed(0)} mins ago`
: `${(u.minutesAgo / 60).toFixed(1)} hours ago`;
console.log(`${i + 1}. ${u.name}`);
console.log(` Updated: ${u.updatedAt.toISOString()} (${ago})`);
});
// Distribution of update times
console.log('\n=== UPDATE TIME DISTRIBUTION ===');
const buckets = {
'<1 min': 0,
'1-5 mins': 0,
'5-15 mins': 0,
'15-60 mins': 0,
'1-6 hours': 0,
'6-24 hours': 0,
'>24 hours': 0,
};
for (const u of updates) {
if (u.minutesAgo < 1) buckets['<1 min']++;
else if (u.minutesAgo < 5) buckets['1-5 mins']++;
else if (u.minutesAgo < 15) buckets['5-15 mins']++;
else if (u.minutesAgo < 60) buckets['15-60 mins']++;
else if (u.minutesAgo < 360) buckets['1-6 hours']++;
else if (u.minutesAgo < 1440) buckets['6-24 hours']++;
else buckets['>24 hours']++;
}
for (const [bucket, count] of Object.entries(buckets)) {
const pct = ((count / updates.length) * 100).toFixed(1);
console.log(` ${bucket}: ${count} products (${pct}%)`);
}
// Check createdAt vs updatedAt to see churn
console.log('\n=== PRODUCT AGE vs UPDATE FREQUENCY ===');
let newAndUpdated = 0;
let oldButRecentUpdate = 0;
for (const p of products) {
if (!p.createdAt || !p.updatedAt) continue;
const created = new Date(parseInt(p.createdAt));
const updated = new Date(p.updatedAt);
const ageHours = (captureTime.getTime() - created.getTime()) / (1000 * 60 * 60);
const updateAgeHours = (captureTime.getTime() - updated.getTime()) / (1000 * 60 * 60);
if (ageHours < 24 && updateAgeHours < 1) newAndUpdated++;
if (ageHours > 168 && updateAgeHours < 1) oldButRecentUpdate++; // >1 week old, updated in last hour
}
console.log(` New products (<24h) with recent updates: ${newAndUpdated}`);
console.log(` Old products (>1 week) with recent updates (<1h): ${oldButRecentUpdate}`);
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,212 @@
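/**
 * Four-day inventory change report for store 112 (Dec 13 -> Dec 17).
 * Diffs two payloads by product id to list new arrivals, removed/sold-out
 * items, regular price changes, special/sale changes, and a rough revenue
 * estimate for the removed items.
 */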
import { S3Client, GetObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
async function main() {
// Fetch Dec 13 (earliest) and Dec 17 (latest)
const dec13Key = 'payloads/2025/12/13/store_112_1765609042078.json.gz';
const dec17Key = 'payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json';
console.log('Loading payloads...');
const dec13 = await fetchPayload(dec13Key);
const dec17 = await fetchPayload(dec17Key);
const dec13Products = Array.isArray(dec13) ? dec13 : dec13.products || [];
const dec17Products = Array.isArray(dec17) ? dec17 : dec17.products || [];
console.log(`\nDec 13: ${dec13Products.length} products`);
console.log(`Dec 17: ${dec17Products.length} products`);
// Build maps
const dec13Map: Map<string, any> = new Map(dec13Products.map((p: any) => [p.id || p._id, p]));
const dec17Map: Map<string, any> = new Map(dec17Products.map((p: any) => [p.id || p._id, p]));
// === NEW PRODUCTS ===
const newProducts: any[] = [];
for (const [id, p] of dec17Map) {
if (!dec13Map.has(id)) newProducts.push(p);
}
// === REMOVED PRODUCTS ===
const removedProducts: any[] = [];
for (const [id, p] of dec13Map) {
if (!dec17Map.has(id)) removedProducts.push(p);
}
// === PRICE CHANGES ===
const priceChanges: any[] = [];
const specialChanges: any[] = [];
for (const [id, newP] of dec17Map) {
const oldP = dec13Map.get(id);
if (!oldP) continue;
// Regular price change
const oldPrice = oldP.Prices?.[0];
const newPrice = newP.Prices?.[0];
if (oldPrice && newPrice && oldPrice !== newPrice) {
priceChanges.push({
name: newP.Name,
brand: newP.brand?.name,
oldPrice,
newPrice,
delta: newPrice - oldPrice,
});
}
// Special status change
const wasSpecial = oldP.special === true;
const isSpecial = newP.special === true;
const oldSpecialPrice = oldP.recSpecialPrices?.[0];
const newSpecialPrice = newP.recSpecialPrices?.[0];
if (wasSpecial !== isSpecial || oldSpecialPrice !== newSpecialPrice) {
specialChanges.push({
name: newP.Name,
brand: newP.brand?.name,
wasSpecial,
isSpecial,
regularPrice: newP.Prices?.[0],
oldSpecialPrice,
newSpecialPrice,
});
}
}
// === OUTPUT ===
console.log(`\n${'='.repeat(60)}`);
console.log('INVENTORY CHANGES: Dec 13 -> Dec 17 (4 days)');
console.log('='.repeat(60));
console.log(`\n=== NEW ARRIVALS (${newProducts.length}) ===`);
newProducts.slice(0, 10).forEach((p: any) => {
const brand = p.brand?.name || 'Unknown';
const price = p.Prices?.[0];
const special = p.special ? ` [SPECIAL: $${p.recSpecialPrices?.[0]}]` : '';
console.log(` + ${brand} - ${p.Name}: $${price}${special}`);
});
if (newProducts.length > 10) console.log(` ... and ${newProducts.length - 10} more`);
console.log(`\n=== SOLD OUT / REMOVED (${removedProducts.length}) ===`);
removedProducts.slice(0, 10).forEach((p: any) => {
const brand = p.brand?.name || 'Unknown';
const price = p.Prices?.[0];
console.log(` - ${brand} - ${p.Name}: was $${price}`);
});
if (removedProducts.length > 10) console.log(` ... and ${removedProducts.length - 10} more`);
console.log(`\n=== PRICE CHANGES (${priceChanges.length}) ===`);
priceChanges.slice(0, 10).forEach((c: any) => {
const sign = c.delta > 0 ? '+' : '';
console.log(` ${c.brand} - ${c.name}`);
console.log(` $${c.oldPrice} -> $${c.newPrice} (${sign}$${c.delta})`);
});
if (priceChanges.length > 10) console.log(` ... and ${priceChanges.length - 10} more`);
console.log(`\n=== SPECIAL/SALE CHANGES (${specialChanges.length}) ===`);
// New specials
const newSpecials = specialChanges.filter(c => !c.wasSpecial && c.isSpecial);
const endedSpecials = specialChanges.filter(c => c.wasSpecial && !c.isSpecial);
const priceAdjusted = specialChanges.filter(c =>
c.wasSpecial && c.isSpecial && c.oldSpecialPrice !== c.newSpecialPrice
);
console.log(`\n NEW SPECIALS (${newSpecials.length}):`);
newSpecials.slice(0, 5).forEach((c: any) => {
const discount = c.regularPrice - c.newSpecialPrice;
const pct = ((discount / c.regularPrice) * 100).toFixed(0);
console.log(` + ${c.brand} - ${c.name}`);
console.log(` $${c.regularPrice} -> $${c.newSpecialPrice} (${pct}% off)`);
});
console.log(`\n SPECIALS ENDED (${endedSpecials.length}):`);
endedSpecials.slice(0, 5).forEach((c: any) => {
console.log(` - ${c.brand} - ${c.name} (was $${c.oldSpecialPrice})`);
});
console.log(`\n SPECIAL PRICE ADJUSTED (${priceAdjusted.length}):`);
priceAdjusted.slice(0, 5).forEach((c: any) => {
console.log(` ~ ${c.brand} - ${c.name}`);
console.log(` $${c.oldSpecialPrice} -> $${c.newSpecialPrice}`);
});
// === REVENUE ESTIMATES ===
console.log(`\n${'='.repeat(60)}`);
console.log('ESTIMATED REVENUE FROM SOLD-OUT ITEMS');
console.log('='.repeat(60));
let totalRevenue = 0;
let itemCount = 0;
for (const p of removedProducts) {
const price = p.special ? (p.recSpecialPrices?.[0] || p.Prices?.[0]) : p.Prices?.[0];
if (price) {
totalRevenue += price;
itemCount++;
}
}
console.log(`\n Sold out items: ${itemCount}`);
console.log(` Estimated revenue: $${totalRevenue.toFixed(2)}`);
console.log(` (Using special prices where applicable)`);
// By brand
const brandRevenue = new Map<string, { count: number; revenue: number }>();
for (const p of removedProducts) {
const brand = p.brand?.name || 'Unknown';
const price = p.special ? (p.recSpecialPrices?.[0] || p.Prices?.[0]) : p.Prices?.[0];
if (!price) continue;
const current = brandRevenue.get(brand) || { count: 0, revenue: 0 };
current.count++;
current.revenue += price;
brandRevenue.set(brand, current);
}
console.log('\n BY BRAND:');
const sortedBrands = [...brandRevenue.entries()]
.sort((a, b) => b[1].revenue - a[1].revenue)
.slice(0, 10);
for (const [brand, data] of sortedBrands) {
console.log(` ${brand}: ${data.count} items, $${data.revenue.toFixed(2)}`);
}
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,168 @@
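/**
 * Back-of-the-envelope storage sizing for payload retention.
 * Measures the raw and gzipped size of one Dec 17 payload, then projects
 * full-payload vs diff-only vs hybrid storage costs per store per month
 * and prints a retention recommendation.
 */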
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<{ raw: Buffer; parsed: any }> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
const compressedSize = data.length;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return { raw: data, parsed: JSON.parse(data.toString('utf8')) };
}
async function main() {
const key = 'payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json';
console.log('Analyzing payload storage requirements...\n');
const { raw, parsed } = await fetchPayload(key);
const products = Array.isArray(parsed) ? parsed : parsed.products || [];
const rawSizeKB = raw.length / 1024;
const rawSizeMB = rawSizeKB / 1024;
// Compress to estimate gzip storage
const compressed = await new Promise<Buffer>((resolve, reject) => {
zlib.gzip(raw, (err, result) => {
if (err) reject(err);
else resolve(result);
});
});
const compressedSizeKB = compressed.length / 1024;
console.log(`=== PAYLOAD SIZE (Deeply Rooted - ${products.length} products) ===`);
console.log(` Raw JSON: ${rawSizeMB.toFixed(2)} MB (${rawSizeKB.toFixed(0)} KB)`);
console.log(` Gzipped: ${compressedSizeKB.toFixed(0)} KB`);
console.log(` Compression ratio: ${(rawSizeKB / compressedSizeKB).toFixed(1)}x`);
// Calculate per-product size
const perProductKB = rawSizeKB / products.length;
console.log(`\n Per product: ~${perProductKB.toFixed(2)} KB`);
// Storage projections for FULL PAYLOADS
console.log('\n=== FULL PAYLOAD STORAGE (if keeping everything) ===');
const payloadsPerMinute = 1;
const payloadsPerHour = 60;
const payloadsPerDay = 1440;
const payloadsPerMonth = payloadsPerDay * 30;
const dailyMB = (compressedSizeKB * payloadsPerDay) / 1024;
const monthlyGB = (dailyMB * 30) / 1024;
console.log(` 1 store @ 1/minute:`);
console.log(` Per day: ${dailyMB.toFixed(0)} MB`);
console.log(` Per month: ${monthlyGB.toFixed(1)} GB`);
console.log(`\n 10 stores @ 1/minute:`);
console.log(` Per day: ${(dailyMB * 10 / 1024).toFixed(1)} GB`);
console.log(` Per month: ${(monthlyGB * 10).toFixed(0)} GB`);
console.log(`\n 100 stores @ 1/minute:`);
console.log(` Per day: ${(dailyMB * 100 / 1024).toFixed(0)} GB`);
console.log(` Per month: ${(monthlyGB * 100 / 1024).toFixed(1)} TB`);
// Now estimate DIFF-ONLY storage
console.log('\n=== DIFF-ONLY STORAGE (recommended) ===');
// Typical diff: 20-50 changes per 10 minutes
// Each change record: ~200 bytes (product_id, field, old, new, timestamp)
const avgChangesPerPayload = 5; // Conservative - most payloads have few changes
const changeRecordBytes = 200;
const diffSizePerPayload = avgChangesPerPayload * changeRecordBytes;
const dailyDiffKB = (diffSizePerPayload * payloadsPerDay) / 1024;
const monthlyDiffMB = (dailyDiffKB * 30) / 1024;
console.log(` Avg changes per payload: ~${avgChangesPerPayload}`);
console.log(` Diff record size: ~${changeRecordBytes} bytes`);
console.log(`\n 1 store @ 1/minute:`);
console.log(` Per day: ${dailyDiffKB.toFixed(0)} KB`);
console.log(` Per month: ${monthlyDiffMB.toFixed(1)} MB`);
console.log(`\n 100 stores @ 1/minute:`);
console.log(` Per day: ${(dailyDiffKB * 100 / 1024).toFixed(1)} MB`);
console.log(` Per month: ${(monthlyDiffMB * 100 / 1024).toFixed(1)} GB`);
// Database row estimates
console.log('\n=== DATABASE ROW ESTIMATES ===');
console.log(`\n Option A: Store full snapshots`);
console.log(` 1 row per payload = ${payloadsPerDay.toLocaleString()} rows/day/store`);
console.log(` 100 stores = ${(payloadsPerDay * 100).toLocaleString()} rows/day`);
console.log(` 1 year = ${(payloadsPerDay * 100 * 365 / 1000000).toFixed(0)}M rows`);
console.log(`\n Option B: Store diffs only`);
console.log(` ~${avgChangesPerPayload} rows per payload = ${(avgChangesPerPayload * payloadsPerDay).toLocaleString()} rows/day/store`);
console.log(` 100 stores = ${(avgChangesPerPayload * payloadsPerDay * 100).toLocaleString()} rows/day`);
console.log(` 1 year = ${(avgChangesPerPayload * payloadsPerDay * 100 * 365 / 1000000).toFixed(0)}M rows`);
console.log(`\n Option C: Hybrid (daily snapshot + diffs)`);
console.log(` 1 snapshot/day + diffs`);
console.log(` Storage: ~${(monthlyDiffMB + (compressedSizeKB * 30 / 1024)).toFixed(0)} MB/month/store`);
// What we lose without full payloads
console.log('\n=== WHAT WE LOSE WITHOUT FULL PAYLOADS ===');
console.log(' - Ability to re-analyze with new logic');
console.log(' - Historical cannabinoid/terpene data');
console.log(' - Historical effect scores');
console.log(' - Historical images/descriptions');
console.log(' - Debug/audit trail');
console.log('\n=== WHAT WE NEED FROM DIFFS ===');
console.log(' - product_id, sku');
console.log(' - quantity_before, quantity_after');
console.log(' - price_before, price_after');
console.log(' - special_price_before, special_price_after');
console.log(' - status (new, removed, in_stock, out_of_stock)');
console.log(' - timestamp');
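// A minimal sketch of the diff table implied by the fields above (hypothetical
// schema, not created by this script; column names are illustrative only):
//
//   CREATE TABLE inventory_changes (
//     id                    BIGSERIAL PRIMARY KEY,
//     dispensary_id         INTEGER NOT NULL,
//     product_id            TEXT NOT NULL,
//     sku                   TEXT,
//     quantity_before       INTEGER,
//     quantity_after        INTEGER,
//     price_before          NUMERIC(10,2),
//     price_after           NUMERIC(10,2),
//     special_price_before  NUMERIC(10,2),
//     special_price_after   NUMERIC(10,2),
//     status                TEXT,  -- new | removed | in_stock | out_of_stock
//     captured_at           TIMESTAMPTZ NOT NULL
//   );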
// Recommendation
console.log('\n' + '='.repeat(60));
console.log('RECOMMENDATION');
console.log('='.repeat(60));
console.log(`
1. Store DIFFS in PostgreSQL (small, queryable)
- inventory_changes table
- ~1KB per payload, scales to 100+ stores
2. Store FULL PAYLOADS in MinIO/S3 (cold storage)
- Keep for 30-90 days
- Compressed ~${compressedSizeKB.toFixed(0)}KB each
- Use for re-analysis, debugging
3. Store DAILY SNAPSHOTS in PostgreSQL
- One full product state per day
- Enables point-in-time reconstruction
`);
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,133 @@
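/**
 * Quick product-level diff between the Dec 13 and Dec 17 payloads for
 * store 112: new products, removed products, and price changes.
 * Also writes both payloads to /tmp/payloads-deeply-rooted/ for ad-hoc
 * inspection.
 */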
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import * as fs from 'fs';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any> {
console.log(`Fetching: ${key}`);
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
// Decompress if gzipped
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
async function main() {
// Fetch Dec 13 (first) and Dec 17 (latest)
const dec13Key = 'payloads/2025/12/13/store_112_1765609042078.json.gz';
const dec17Key = 'payloads/dutchie/2025/12/17/store_112_t008jb_1765939307492.json';
const dec13 = await fetchPayload(dec13Key);
const dec17 = await fetchPayload(dec17Key);
// Save locally for further analysis (create the output directory if needed)
fs.mkdirSync('/tmp/payloads-deeply-rooted', { recursive: true });
fs.writeFileSync('/tmp/payloads-deeply-rooted/dec13.json', JSON.stringify(dec13, null, 2));
fs.writeFileSync('/tmp/payloads-deeply-rooted/dec17.json', JSON.stringify(dec17, null, 2));
// Handle different payload formats
const dec13Products = Array.isArray(dec13) ? dec13 : dec13.products || [];
const dec17Products = Array.isArray(dec17) ? dec17 : dec17.products || [];
console.log(`\nDec 13: ${dec13Products.length} products`);
console.log(`Dec 17: ${dec17Products.length} products`);
console.log(`Change: ${dec17Products.length - dec13Products.length} products`);
// Build maps by product ID
const dec13Map: Map<string, any> = new Map(dec13Products.map((p: any) => [p.id || p._id, p]));
const dec17Map: Map<string, any> = new Map(dec17Products.map((p: any) => [p.id || p._id, p]));
// Find new products (in Dec 17, not in Dec 13)
const newProducts: any[] = [];
for (const [id, product] of dec17Map) {
if (!dec13Map.has(id)) {
newProducts.push(product);
}
}
// Find removed products (in Dec 13, not in Dec 17)
const removedProducts: any[] = [];
for (const [id, product] of dec13Map) {
if (!dec17Map.has(id)) {
removedProducts.push(product);
}
}
console.log(`\n=== NEW PRODUCTS (${newProducts.length}) ===`);
newProducts.slice(0, 10).forEach((p: any) => {
const brand = p.brand?.name || p.brandName || 'Unknown';
console.log(` + ${brand} - ${p.Name}`);
});
if (newProducts.length > 10) console.log(` ... and ${newProducts.length - 10} more`);
console.log(`\n=== REMOVED PRODUCTS (${removedProducts.length}) ===`);
removedProducts.slice(0, 10).forEach((p: any) => {
const brand = p.brand?.name || p.brandName || 'Unknown';
console.log(` - ${brand} - ${p.Name}`);
});
if (removedProducts.length > 10) console.log(` ... and ${removedProducts.length - 10} more`);
// Check for price changes on common products
let priceChanges = 0;
const priceChangeExamples: any[] = [];
for (const [id, newProduct] of dec17Map) {
const oldProduct = dec13Map.get(id);
if (!oldProduct) continue;
const oldPrice = oldProduct.Prices?.[0] || oldProduct.recPrices?.[0];
const newPrice = newProduct.Prices?.[0] || newProduct.recPrices?.[0];
if (oldPrice && newPrice && oldPrice !== newPrice) {
priceChanges++;
if (priceChangeExamples.length < 5) {
priceChangeExamples.push({
name: newProduct.Name,
brand: newProduct.brand?.name || newProduct.brandName,
oldPrice,
newPrice,
delta: newPrice - oldPrice,
});
}
}
}
console.log(`\n=== PRICE CHANGES (${priceChanges}) ===`);
priceChangeExamples.forEach(c => {
const sign = c.delta > 0 ? '+' : '';
console.log(` ${c.brand} - ${c.name}: $${c.oldPrice} → $${c.newPrice} (${sign}$${c.delta})`);
});
if (priceChanges > 5) console.log(` ... and ${priceChanges - 5} more`);
console.log('\nPayloads saved to /tmp/payloads-deeply-rooted/');
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,166 @@
/**
* Test the inventory tracker with real MinIO payloads
*
* Usage: npx tsx src/scripts/test-inventory-tracker.ts
*/
import { S3Client, GetObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
import { calculateDiff } from '../services/inventory-tracker';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function fetchPayload(key: string): Promise<any[]> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
const parsed = JSON.parse(data.toString('utf8'));
return Array.isArray(parsed) ? parsed : parsed.products || [];
}
async function listPayloads(prefix: string): Promise<string[]> {
const response = await client.send(new ListObjectsV2Command({
Bucket: 'cannaiq',
Prefix: prefix,
}));
return (response.Contents || [])
.map(obj => obj.Key!)
.filter(Boolean)
.sort((a, b) => {
// Sort by timestamp in filename
const tsA = a.match(/_(\d{13})/)?.[1] || '0';
const tsB = b.match(/_(\d{13})/)?.[1] || '0';
return parseInt(tsA) - parseInt(tsB);
});
}
function extractTimestamp(key: string): Date {
const match = key.match(/_(\d{13})/);
if (match) {
return new Date(parseInt(match[1]));
}
return new Date();
}
async function main() {
console.log('Testing Inventory Tracker with MinIO Payloads\n');
console.log('='.repeat(70));
// Find all Deeply Rooted payloads (store 112)
const keys = await listPayloads('payloads/');
const store112Keys = keys.filter(k => k.includes('store_112'));
console.log(`Found ${store112Keys.length} payloads for store 112\n`);
if (store112Keys.length < 2) {
console.log('Need at least 2 payloads to compare. Exiting.');
return;
}
// Process consecutive payloads
let totalSales = 0;
let totalRevenue = 0;
let totalNewProducts = 0;
let totalRemoved = 0;
for (let i = 1; i < store112Keys.length; i++) {
const prevKey = store112Keys[i - 1];
const currKey = store112Keys[i];
const prevTime = extractTimestamp(prevKey);
const currTime = extractTimestamp(currKey);
console.log(`\n--- Comparing ---`);
console.log(` Previous: ${prevTime.toLocaleString()}`);
console.log(` Current: ${currTime.toLocaleString()}`);
const [prevProducts, currProducts] = await Promise.all([
fetchPayload(prevKey),
fetchPayload(currKey),
]);
console.log(` Products: ${prevProducts.length} → ${currProducts.length}`);
const diff = calculateDiff(prevProducts, currProducts, 112, currTime);
console.log(` Changes detected: ${diff.changes.length}`);
console.log(` - New products: ${diff.summary.newProducts}`);
console.log(` - Removed: ${diff.summary.removedProducts}`);
console.log(` - Sales: ${diff.summary.sales} (${diff.summary.unitsSold} units)`);
console.log(` - Restocks: ${diff.summary.restocks}`);
console.log(` - Price changes: ${diff.summary.priceChanges}`);
console.log(` - Revenue: $${diff.summary.totalRevenue.toFixed(2)}`);
totalSales += diff.summary.sales;
totalRevenue += diff.summary.totalRevenue;
totalNewProducts += diff.summary.newProducts;
totalRemoved += diff.summary.removedProducts;
// Show some example changes
if (diff.changes.length > 0) {
console.log(`\n Examples:`);
const examples = diff.changes.slice(0, 5);
for (const change of examples) {
if (change.changeType === 'sale') {
console.log(` [SALE] ${change.brandName} - ${change.productName} (${change.option})`);
console.log(` Qty: ${change.quantityBefore} → ${change.quantityAfter} (${change.quantityDelta})`);
console.log(` Revenue: $${change.revenue?.toFixed(2)} @ $${change.isSpecial ? change.specialPrice : change.price}`);
} else if (change.changeType === 'new') {
console.log(` [NEW] ${change.brandName} - ${change.productName} (${change.option})`);
console.log(` Qty: ${change.quantityAfter}, Price: $${change.price}`);
} else if (change.changeType === 'removed') {
console.log(` [REMOVED] ${change.brandName} - ${change.productName}`);
} else if (change.changeType === 'restock') {
console.log(` [RESTOCK] ${change.brandName} - ${change.productName} (${change.option})`);
console.log(` Qty: ${change.quantityBefore} → ${change.quantityAfter} (+${change.quantityDelta})`);
}
}
if (diff.changes.length > 5) {
console.log(` ... and ${diff.changes.length - 5} more changes`);
}
}
}
// Summary
console.log('\n' + '='.repeat(70));
console.log('TOTAL SUMMARY');
console.log('='.repeat(70));
console.log(` Payloads compared: ${store112Keys.length - 1}`);
console.log(` Total sales events: ${totalSales}`);
console.log(` Total revenue: $${totalRevenue.toFixed(2)}`);
console.log(` New products added: ${totalNewProducts}`);
console.log(` Products removed: ${totalRemoved}`);
}
main().catch(err => {
console.error('Error:', err.message);
process.exit(1);
});

View File

@@ -0,0 +1,255 @@
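/**
 * Per-SKU quantity timeline across every stored payload for store 112.
 * Builds a snapshot series for each product variant, identifies SKUs whose
 * quantity dropped over the full range, and prints top sellers, estimated
 * revenue, and a per-day sales breakdown.
 */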
import { S3Client, GetObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';
import * as zlib from 'zlib';
import { Readable } from 'stream';
const client = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9002',
credentials: {
accessKeyId: 'cannaiq-app',
secretAccessKey: 'cannaiq-secret',
},
forcePathStyle: true,
});
async function listPayloads(): Promise<string[]> {
const keys: string[] = [];
// List all objects and filter for store 112
const response = await client.send(new ListObjectsV2Command({
Bucket: 'cannaiq',
Prefix: 'payloads/',
MaxKeys: 1000,
}));
for (const obj of response.Contents || []) {
if (obj.Key?.includes('store_112')) {
keys.push(obj.Key);
}
}
return keys.sort(); // Sort by path (chronological)
}
async function fetchPayload(key: string): Promise<any> {
const response = await client.send(new GetObjectCommand({
Bucket: 'cannaiq',
Key: key,
}));
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
let data: Buffer = Buffer.concat(chunks) as Buffer;
if (key.endsWith('.gz')) {
data = await new Promise((resolve, reject) => {
zlib.gunzip(data, (err, result) => {
if (err) reject(err);
else resolve(result);
});
}) as Buffer;
}
return JSON.parse(data.toString('utf8'));
}
function extractTimestamp(key: string): Date {
// Extract timestamp from filename like store_112_1765609042078.json
const match = key.match(/_(\d{13})/);
if (match) {
return new Date(parseInt(match[1]));
}
return new Date();
}
interface SkuSnapshot {
timestamp: Date;
quantity: number;
quantityAvailable: number;
price: number;
specialPrice: number | null;
}
interface SkuTimeline {
productId: string;
productName: string;
brand: string;
option: string;
snapshots: SkuSnapshot[];
}
async function main() {
console.log('Finding Deeply Rooted payloads...\n');
const keys = await listPayloads();
console.log(`Found ${keys.length} payloads:`);
keys.forEach(k => console.log(` ${k}`));
// Load all payloads
const payloads: { key: string; timestamp: Date; products: any[] }[] = [];
for (const key of keys) {
console.log(`\nLoading ${key}...`);
const data = await fetchPayload(key);
const products = Array.isArray(data) ? data : data.products || [];
const timestamp = extractTimestamp(key);
payloads.push({ key, timestamp, products });
console.log(` ${products.length} products, timestamp: ${timestamp.toISOString()}`);
}
// Sort by timestamp
payloads.sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
// Build SKU timelines
const timelines = new Map<string, SkuTimeline>();
for (const { timestamp, products } of payloads) {
for (const p of products) {
const children = p.POSMetaData?.children || [];
const options = p.Options || [];
const prices = p.Prices || [];
const specialPrices = p.recSpecialPrices || [];
const isSpecial = p.special === true;
for (let i = 0; i < children.length; i++) {
const child = children[i];
const skuKey = `${p.id || p._id}_${child.option || options[i] || i}`;
if (!timelines.has(skuKey)) {
timelines.set(skuKey, {
productId: p.id || p._id,
productName: p.Name,
brand: p.brand?.name || p.brandName || 'Unknown',
option: child.option || options[i] || `variant_${i}`,
snapshots: [],
});
}
timelines.get(skuKey)!.snapshots.push({
timestamp,
quantity: child.quantity || 0,
quantityAvailable: child.quantityAvailable || 0,
price: child.price || child.recPrice || prices[i] || 0,
specialPrice: isSpecial ? (specialPrices[i] || null) : null,
});
}
}
}
// Find SKUs with quantity changes
const skusWithChanges: { sku: SkuTimeline; totalSold: number; revenue: number }[] = [];
for (const [key, timeline] of timelines) {
if (timeline.snapshots.length < 2) continue;
const first = timeline.snapshots[0];
const last = timeline.snapshots[timeline.snapshots.length - 1];
const totalSold = first.quantity - last.quantity;
if (totalSold > 0) {
// Use last price (today's price) for revenue
const priceEach = last.specialPrice || last.price;
skusWithChanges.push({
sku: timeline,
totalSold,
revenue: totalSold * priceEach,
});
}
}
// Sort by total sold
skusWithChanges.sort((a, b) => b.totalSold - a.totalSold);
console.log(`\n${'='.repeat(80)}`);
console.log('SKU QUANTITY CHANGES OVER TIME - Deeply Rooted Phoenix');
console.log('='.repeat(80));
console.log(`\nTime range: ${payloads[0].timestamp.toISOString()} to ${payloads[payloads.length - 1].timestamp.toISOString()}`);
console.log(`Total snapshots: ${payloads.length}`);
console.log(`SKUs with sales: ${skusWithChanges.length}`);
// Show top movers with full timeline
console.log(`\n=== TOP 30 SELLERS (with quantity timeline) ===\n`);
skusWithChanges.slice(0, 30).forEach((item, idx) => {
const { sku, totalSold, revenue } = item;
const priceInfo = sku.snapshots[sku.snapshots.length - 1];
const priceStr = priceInfo.specialPrice
? `$${priceInfo.specialPrice} (special)`
: `$${priceInfo.price}`;
console.log(`${idx + 1}. ${sku.brand} - ${sku.productName} (${sku.option})`);
console.log(` Price: ${priceStr} | Sold: ${totalSold} units | Revenue: $${revenue.toFixed(2)}`);
// Show quantity at each snapshot
const qtyLine = sku.snapshots.map((s, i) => {
const date = s.timestamp.toLocaleDateString('en-US', { month: 'short', day: 'numeric' });
const time = s.timestamp.toLocaleTimeString('en-US', { hour: '2-digit', minute: '2-digit' });
return `${date} ${time}: ${s.quantity}`;
}).join(' → ');
console.log(` Qty: ${qtyLine}`);
console.log('');
});
// Summary stats
const totalUnits = skusWithChanges.reduce((sum, s) => sum + s.totalSold, 0);
const totalRevenue = skusWithChanges.reduce((sum, s) => sum + s.revenue, 0);
console.log(`${'='.repeat(80)}`);
console.log('SUMMARY');
console.log('='.repeat(80));
console.log(` Total Units Sold: ${totalUnits}`);
console.log(` Total Revenue: $${totalRevenue.toFixed(2)}`);
console.log(` Unique SKUs Sold: ${skusWithChanges.length}`);
// Daily breakdown
console.log(`\n=== SALES BY DAY ===`);
const dayMap = new Map<string, { units: number; revenue: number }>();
for (let i = 1; i < payloads.length; i++) {
const prev = payloads[i - 1];
const curr = payloads[i];
const day = curr.timestamp.toLocaleDateString('en-US', { weekday: 'short', month: 'short', day: 'numeric' });
if (!dayMap.has(day)) {
dayMap.set(day, { units: 0, revenue: 0 });
}
// Calculate sales between these two snapshots
const prevMap = new Map<string, number>();
for (const p of prev.products) {
for (const child of p.POSMetaData?.children || []) {
prevMap.set(`${p.id}_${child.option}`, child.quantity || 0);
}
}
for (const p of curr.products) {
const specialPrices = p.recSpecialPrices || [];
const isSpecial = p.special === true;
for (let j = 0; j < (p.POSMetaData?.children || []).length; j++) {
const child = p.POSMetaData.children[j];
const key = `${p.id}_${child.option}`;
const prevQty = prevMap.get(key) || 0;
const currQty = child.quantity || 0;
const sold = prevQty - currQty;
if (sold > 0) {
const price = isSpecial ? (specialPrices[j] || child.price) : child.price;
const data = dayMap.get(day)!;
data.units += sold;
data.revenue += sold * (price || 0); // guard against missing prices so totals don't become NaN
}
}
}
}
for (const [day, data] of dayMap) {
console.log(` ${day}: ${data.units} units, $${data.revenue.toFixed(2)}`);
}
}
main().catch(err => console.error('Error:', err.message));

View File

@@ -0,0 +1,589 @@
/**
* SalesAnalyticsService
*
* Market intelligence and sales velocity analytics using materialized views.
* Provides fast queries for dashboards with pre-computed metrics.
*
* Data Sources:
* - mv_daily_sales_estimates: Daily sales from inventory deltas
* - mv_brand_market_share: Brand penetration by state
* - mv_sku_velocity: SKU velocity rankings
* - mv_store_performance: Dispensary performance rankings
* - mv_category_weekly_trends: Weekly category trends
* - mv_product_intelligence: Per-product Hoodie-style metrics
*/
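//
// Usage sketch (illustrative only; assumes the materialized-view migration has
// been applied and that callers use the default singleton export, e.g. inside
// an async route handler):
//
//   import salesAnalytics from './SalesAnalyticsService';
//   const hotSkus = await salesAnalytics.getSkuVelocity({ stateCode: 'AZ', velocityTier: 'hot', limit: 25 });
//   const topBrands = await salesAnalytics.getTopBrands({ stateCode: 'AZ', window: '30d' });
//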
import { pool } from '../../db/pool';
import { TimeWindow, DateRange, getDateRangeFromWindow } from './types';
// ============================================================
// TYPES
// ============================================================
export interface DailySalesEstimate {
dispensary_id: number;
product_id: string;
brand_name: string | null;
category: string | null;
sale_date: string;
avg_price: number | null;
units_sold: number;
units_restocked: number;
revenue_estimate: number;
snapshot_count: number;
}
export interface BrandMarketShare {
brand_name: string;
state_code: string;
stores_carrying: number;
total_stores: number;
penetration_pct: number;
sku_count: number;
in_stock_skus: number;
avg_price: number | null;
}
export interface SkuVelocity {
product_id: string;
brand_name: string | null;
category: string | null;
dispensary_id: number;
dispensary_name: string;
state_code: string;
total_units_30d: number;
total_revenue_30d: number;
days_with_sales: number;
avg_daily_units: number;
avg_price: number | null;
velocity_tier: 'hot' | 'steady' | 'slow' | 'stale';
}
export interface StorePerformance {
dispensary_id: number;
dispensary_name: string;
city: string | null;
state_code: string;
total_revenue_30d: number;
total_units_30d: number;
total_skus: number;
in_stock_skus: number;
unique_brands: number;
unique_categories: number;
avg_price: number | null;
last_updated: string | null;
}
export interface CategoryWeeklyTrend {
category: string;
state_code: string;
week_start: string;
sku_count: number;
store_count: number;
total_units: number;
total_revenue: number;
avg_price: number | null;
}
export interface ProductIntelligence {
dispensary_id: number;
dispensary_name: string;
state_code: string;
city: string | null;
sku: string;
product_name: string | null;
brand: string | null;
category: string | null;
is_in_stock: boolean;
stock_status: string | null;
stock_quantity: number | null;
price: number | null;
first_seen: string | null;
last_seen: string | null;
stock_diff_120: number;
days_since_oos: number | null;
days_until_stock_out: number | null;
avg_daily_units: number | null;
}
export interface ViewRefreshResult {
view_name: string;
rows_affected: number;
}
// ============================================================
// SERVICE CLASS
// ============================================================
export class SalesAnalyticsService {
/**
* Get daily sales estimates with filters
*/
async getDailySalesEstimates(options: {
stateCode?: string;
brandName?: string;
category?: string;
dispensaryId?: number;
dateRange?: DateRange;
limit?: number;
} = {}): Promise<DailySalesEstimate[]> {
const { stateCode, brandName, category, dispensaryId, dateRange, limit = 100 } = options;
const params: (string | number | Date)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`d.state = $${paramIdx++}`);
params.push(stateCode);
}
if (brandName) {
conditions.push(`dse.brand_name ILIKE $${paramIdx++}`);
params.push(`%${brandName}%`);
}
if (category) {
conditions.push(`dse.category = $${paramIdx++}`);
params.push(category);
}
if (dispensaryId) {
conditions.push(`dse.dispensary_id = $${paramIdx++}`);
params.push(dispensaryId);
}
if (dateRange) {
conditions.push(`dse.sale_date >= $${paramIdx++}`);
params.push(dateRange.start);
conditions.push(`dse.sale_date <= $${paramIdx++}`);
params.push(dateRange.end);
}
params.push(limit);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const result = await pool.query(`
SELECT dse.*
FROM mv_daily_sales_estimates dse
JOIN dispensaries d ON d.id = dse.dispensary_id
${whereClause}
ORDER BY dse.sale_date DESC, dse.revenue_estimate DESC
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
dispensary_id: row.dispensary_id,
product_id: row.product_id,
brand_name: row.brand_name,
category: row.category,
sale_date: row.sale_date?.toISOString().split('T')[0] || '',
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
units_sold: parseInt(row.units_sold) || 0,
units_restocked: parseInt(row.units_restocked) || 0,
revenue_estimate: parseFloat(row.revenue_estimate) || 0,
snapshot_count: parseInt(row.snapshot_count) || 0,
}));
}
/**
* Get brand market share by state
*/
async getBrandMarketShare(options: {
stateCode?: string;
brandName?: string;
minPenetration?: number;
limit?: number;
} = {}): Promise<BrandMarketShare[]> {
const { stateCode, brandName, minPenetration = 0, limit = 100 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`state_code = $${paramIdx++}`);
params.push(stateCode);
}
if (brandName) {
conditions.push(`brand_name ILIKE $${paramIdx++}`);
params.push(`%${brandName}%`);
}
if (minPenetration > 0) {
conditions.push(`penetration_pct >= $${paramIdx++}`);
params.push(minPenetration);
}
params.push(limit);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const result = await pool.query(`
SELECT *
FROM mv_brand_market_share
${whereClause}
ORDER BY penetration_pct DESC, stores_carrying DESC
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
brand_name: row.brand_name,
state_code: row.state_code,
stores_carrying: parseInt(row.stores_carrying) || 0,
total_stores: parseInt(row.total_stores) || 0,
penetration_pct: parseFloat(row.penetration_pct) || 0,
sku_count: parseInt(row.sku_count) || 0,
in_stock_skus: parseInt(row.in_stock_skus) || 0,
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
}));
}
/**
* Get SKU velocity rankings
*/
async getSkuVelocity(options: {
stateCode?: string;
brandName?: string;
category?: string;
dispensaryId?: number;
velocityTier?: 'hot' | 'steady' | 'slow' | 'stale';
limit?: number;
} = {}): Promise<SkuVelocity[]> {
const { stateCode, brandName, category, dispensaryId, velocityTier, limit = 100 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`state_code = $${paramIdx++}`);
params.push(stateCode);
}
if (brandName) {
conditions.push(`brand_name ILIKE $${paramIdx++}`);
params.push(`%${brandName}%`);
}
if (category) {
conditions.push(`category = $${paramIdx++}`);
params.push(category);
}
if (dispensaryId) {
conditions.push(`dispensary_id = $${paramIdx++}`);
params.push(dispensaryId);
}
if (velocityTier) {
conditions.push(`velocity_tier = $${paramIdx++}`);
params.push(velocityTier);
}
params.push(limit);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const result = await pool.query(`
SELECT *
FROM mv_sku_velocity
${whereClause}
ORDER BY total_units_30d DESC
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
product_id: row.product_id,
brand_name: row.brand_name,
category: row.category,
dispensary_id: row.dispensary_id,
dispensary_name: row.dispensary_name,
state_code: row.state_code,
total_units_30d: parseInt(row.total_units_30d) || 0,
total_revenue_30d: parseFloat(row.total_revenue_30d) || 0,
days_with_sales: parseInt(row.days_with_sales) || 0,
avg_daily_units: parseFloat(row.avg_daily_units) || 0,
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
velocity_tier: row.velocity_tier,
}));
}
/**
* Get dispensary performance rankings
*/
async getStorePerformance(options: {
stateCode?: string;
sortBy?: 'revenue' | 'units' | 'brands' | 'skus';
limit?: number;
} = {}): Promise<StorePerformance[]> {
const { stateCode, sortBy = 'revenue', limit = 100 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`state_code = $${paramIdx++}`);
params.push(stateCode);
}
params.push(limit);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const orderByMap: Record<string, string> = {
revenue: 'total_revenue_30d DESC',
units: 'total_units_30d DESC',
brands: 'unique_brands DESC',
skus: 'total_skus DESC',
};
const orderBy = orderByMap[sortBy] || orderByMap.revenue;
const result = await pool.query(`
SELECT *
FROM mv_store_performance
${whereClause}
ORDER BY ${orderBy}
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
dispensary_id: row.dispensary_id,
dispensary_name: row.dispensary_name,
city: row.city,
state_code: row.state_code,
total_revenue_30d: parseFloat(row.total_revenue_30d) || 0,
total_units_30d: parseInt(row.total_units_30d) || 0,
total_skus: parseInt(row.total_skus) || 0,
in_stock_skus: parseInt(row.in_stock_skus) || 0,
unique_brands: parseInt(row.unique_brands) || 0,
unique_categories: parseInt(row.unique_categories) || 0,
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
last_updated: row.last_updated?.toISOString() || null,
}));
}
/**
* Get category weekly trends
*/
async getCategoryTrends(options: {
stateCode?: string;
category?: string;
weeks?: number;
} = {}): Promise<CategoryWeeklyTrend[]> {
const { stateCode, category, weeks = 12 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`state_code = $${paramIdx++}`);
params.push(stateCode);
}
if (category) {
conditions.push(`category = $${paramIdx++}`);
params.push(category);
}
conditions.push(`week_start >= CURRENT_DATE - ($${paramIdx++} * INTERVAL '1 week')`);
params.push(weeks);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const result = await pool.query(`
SELECT *
FROM mv_category_weekly_trends
${whereClause}
ORDER BY week_start DESC, total_revenue DESC
`, params);
return result.rows.map((row: any) => ({
category: row.category,
state_code: row.state_code,
week_start: row.week_start?.toISOString().split('T')[0] || '',
sku_count: parseInt(row.sku_count) || 0,
store_count: parseInt(row.store_count) || 0,
total_units: parseInt(row.total_units) || 0,
total_revenue: parseFloat(row.total_revenue) || 0,
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
}));
}
/**
* Get product intelligence (Hoodie-style per-product metrics)
*/
async getProductIntelligence(options: {
stateCode?: string;
brandName?: string;
category?: string;
dispensaryId?: number;
inStockOnly?: boolean;
lowStock?: boolean; // days_until_stock_out <= 7
recentOOS?: boolean; // days_since_oos <= 7
limit?: number;
} = {}): Promise<ProductIntelligence[]> {
const { stateCode, brandName, category, dispensaryId, inStockOnly, lowStock, recentOOS, limit = 100 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
if (stateCode) {
conditions.push(`state_code = $${paramIdx++}`);
params.push(stateCode);
}
if (brandName) {
conditions.push(`brand ILIKE $${paramIdx++}`);
params.push(`%${brandName}%`);
}
if (category) {
conditions.push(`category = $${paramIdx++}`);
params.push(category);
}
if (dispensaryId) {
conditions.push(`dispensary_id = $${paramIdx++}`);
params.push(dispensaryId);
}
if (inStockOnly) {
conditions.push(`is_in_stock = TRUE`);
}
if (lowStock) {
conditions.push(`days_until_stock_out IS NOT NULL AND days_until_stock_out <= 7`);
}
if (recentOOS) {
conditions.push(`days_since_oos IS NOT NULL AND days_since_oos <= 7`);
}
params.push(limit);
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const result = await pool.query(`
SELECT *
FROM mv_product_intelligence
${whereClause}
ORDER BY
CASE WHEN days_until_stock_out IS NOT NULL THEN 0 ELSE 1 END,
days_until_stock_out ASC NULLS LAST,
stock_quantity DESC
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
dispensary_id: row.dispensary_id,
dispensary_name: row.dispensary_name,
state_code: row.state_code,
city: row.city,
sku: row.sku,
product_name: row.product_name,
brand: row.brand,
category: row.category,
is_in_stock: row.is_in_stock,
stock_status: row.stock_status,
stock_quantity: row.stock_quantity ? parseInt(row.stock_quantity) : null,
price: row.price ? parseFloat(row.price) : null,
first_seen: row.first_seen?.toISOString() || null,
last_seen: row.last_seen?.toISOString() || null,
stock_diff_120: parseInt(row.stock_diff_120) || 0,
days_since_oos: row.days_since_oos ? parseInt(row.days_since_oos) : null,
days_until_stock_out: row.days_until_stock_out ? parseInt(row.days_until_stock_out) : null,
avg_daily_units: row.avg_daily_units ? parseFloat(row.avg_daily_units) : null,
}));
}
/**
* Get top selling brands by revenue
*/
async getTopBrands(options: {
stateCode?: string;
window?: TimeWindow;
limit?: number;
} = {}): Promise<Array<{
brand_name: string;
total_revenue: number;
total_units: number;
store_count: number;
sku_count: number;
avg_price: number | null;
}>> {
const { stateCode, window = '30d', limit = 50 } = options;
const params: (string | number)[] = [];
let paramIdx = 1;
const conditions: string[] = [];
const dateRange = getDateRangeFromWindow(window);
conditions.push(`dse.sale_date >= $${paramIdx++}`);
params.push(dateRange.start.toISOString().split('T')[0]);
if (stateCode) {
conditions.push(`d.state = $${paramIdx++}`);
params.push(stateCode);
}
params.push(limit);
const whereClause = `WHERE ${conditions.join(' AND ')}`;
const result = await pool.query(`
SELECT
dse.brand_name,
SUM(dse.revenue_estimate) AS total_revenue,
SUM(dse.units_sold) AS total_units,
COUNT(DISTINCT dse.dispensary_id) AS store_count,
COUNT(DISTINCT dse.product_id) AS sku_count,
AVG(dse.avg_price) AS avg_price
FROM mv_daily_sales_estimates dse
JOIN dispensaries d ON d.id = dse.dispensary_id
${whereClause}
AND dse.brand_name IS NOT NULL
GROUP BY dse.brand_name
ORDER BY total_revenue DESC
LIMIT $${paramIdx}
`, params);
return result.rows.map((row: any) => ({
brand_name: row.brand_name,
total_revenue: parseFloat(row.total_revenue) || 0,
total_units: parseInt(row.total_units) || 0,
store_count: parseInt(row.store_count) || 0,
sku_count: parseInt(row.sku_count) || 0,
avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
}));
}
/**
* Refresh all materialized views
*/
async refreshViews(): Promise<ViewRefreshResult[]> {
try {
const result = await pool.query('SELECT * FROM refresh_sales_analytics_views()');
return result.rows.map((row: any) => ({
view_name: row.view_name,
rows_affected: parseInt(row.rows_affected) || 0,
}));
} catch (error: any) {
// If function doesn't exist yet (migration not run), return empty
if (error.code === '42883') {
console.warn('[SalesAnalytics] refresh_sales_analytics_views() not found - run migration 121');
return [];
}
throw error;
}
}
/**
* Get view statistics (row counts)
*/
async getViewStats(): Promise<Record<string, number>> {
const views = [
'mv_daily_sales_estimates',
'mv_brand_market_share',
'mv_sku_velocity',
'mv_store_performance',
'mv_category_weekly_trends',
'mv_product_intelligence',
];
const stats: Record<string, number> = {};
for (const view of views) {
try {
const result = await pool.query(`SELECT COUNT(*) FROM ${view}`);
stats[view] = parseInt(result.rows[0].count) || 0;
} catch {
stats[view] = -1; // View doesn't exist yet
}
}
return stats;
}
}
export default new SalesAnalyticsService();

View File

@@ -12,3 +12,4 @@ export { CategoryAnalyticsService } from './CategoryAnalyticsService';
export { StoreAnalyticsService } from './StoreAnalyticsService';
export { StateAnalyticsService } from './StateAnalyticsService';
export { BrandIntelligenceService } from './BrandIntelligenceService';
export { SalesAnalyticsService } from './SalesAnalyticsService';

View File

@@ -1061,7 +1061,9 @@ export function buildEvomiProxyUrl(
let geoDisplay = region;
if (city) {
geoParams += `_city-${city}`;
// Evomi expects city as lowercase with dots for spaces: "El Mirage" -> "el.mirage"
const formattedCity = city.toLowerCase().replace(/\s+/g, '.');
geoParams += `_city-${formattedCity}`;
geoDisplay = `${city}, ${region}`;
}

View File

@@ -0,0 +1,263 @@
/**
* Daily Snapshot Service
*
* Stores a daily benchmark payload for each dispensary.
* The first payload of each day becomes the "benchmark" against which
* all subsequent payloads are compared to detect new products.
*
* Key insight: New products are detected when they appear in the current
* payload but NOT in the daily snapshot (benchmark).
*/
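//
// Illustrative comparison (the actual diffing lives elsewhere; shown only to
// make the "benchmark" idea concrete — currentProducts is an assumed input):
//
//   const benchmark = await getDailySnapshot(dispensaryId);
//   const knownIds = new Set((benchmark?.products ?? []).map((p: any) => p.id || p._id));
//   const newProducts = currentProducts.filter((p: any) => !knownIds.has(p.id || p._id));
//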
import { pool } from '../db/pool';
// ============================================================================
// TYPES
// ============================================================================
export interface DailySnapshot {
id: number;
dispensaryId: number;
snapshotDate: string; // YYYY-MM-DD
products: any[];
productCount: number;
totalSkus: number;
createdAt: Date;
}
// ============================================================================
// DATABASE OPERATIONS
// ============================================================================
/**
* Store a daily snapshot for a dispensary
* Only stores if no snapshot exists for that day yet
*/
export async function storeDailySnapshot(
dispensaryId: number,
products: any[],
date: Date = new Date()
): Promise<{ stored: boolean; isNew: boolean }> {
const snapshotDate = date.toISOString().split('T')[0]; // YYYY-MM-DD
// Count total SKUs (variants)
let totalSkus = 0;
for (const product of products) {
const children = product.POSMetaData?.children || [];
totalSkus += children.length || 1; // At least 1 if no children
}
// Try to insert - will skip if already exists for this day
const result = await pool.query(`
INSERT INTO daily_snapshots (
dispensary_id, snapshot_date, products, product_count, total_skus
) VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (dispensary_id, snapshot_date) DO NOTHING
RETURNING id
`, [
dispensaryId,
snapshotDate,
JSON.stringify(products),
products.length,
totalSkus,
]);
const isNew = result.rows.length > 0;
if (isNew) {
console.log(`[DailySnapshot] Stored new daily snapshot for dispensary ${dispensaryId} on ${snapshotDate} (${products.length} products, ${totalSkus} SKUs)`);
}
return { stored: true, isNew };
}
/**
* Get the daily snapshot for a dispensary
* Returns the benchmark to compare against
*/
export async function getDailySnapshot(
dispensaryId: number,
date: Date = new Date()
): Promise<DailySnapshot | null> {
const snapshotDate = date.toISOString().split('T')[0];
const result = await pool.query(`
SELECT id, dispensary_id, snapshot_date, products, product_count, total_skus, created_at
FROM daily_snapshots
WHERE dispensary_id = $1 AND snapshot_date = $2
`, [dispensaryId, snapshotDate]);
if (result.rows.length === 0) {
return null;
}
const row = result.rows[0];
return {
id: row.id,
dispensaryId: row.dispensary_id,
snapshotDate: row.snapshot_date,
products: row.products,
productCount: row.product_count,
totalSkus: row.total_skus,
createdAt: row.created_at,
};
}
/**
* Get the most recent daily snapshot for a dispensary
* Falls back to yesterday or earlier if today's doesn't exist
*/
export async function getLatestSnapshot(dispensaryId: number): Promise<DailySnapshot | null> {
const result = await pool.query(`
SELECT id, dispensary_id, snapshot_date, products, product_count, total_skus, created_at
FROM daily_snapshots
WHERE dispensary_id = $1
ORDER BY snapshot_date DESC
LIMIT 1
`, [dispensaryId]);
if (result.rows.length === 0) {
return null;
}
const row = result.rows[0];
return {
id: row.id,
dispensaryId: row.dispensary_id,
snapshotDate: row.snapshot_date,
products: row.products,
productCount: row.product_count,
totalSkus: row.total_skus,
createdAt: row.created_at,
};
}
/**
* Check if a daily snapshot exists for a dispensary
*/
export async function hasSnapshot(
dispensaryId: number,
date: Date = new Date()
): Promise<boolean> {
const snapshotDate = date.toISOString().split('T')[0];
const result = await pool.query(`
SELECT 1 FROM daily_snapshots
WHERE dispensary_id = $1 AND snapshot_date = $2
LIMIT 1
`, [dispensaryId, snapshotDate]);
return result.rows.length > 0;
}
/**
* Get snapshot statistics for a dispensary
*/
export async function getSnapshotStats(dispensaryId: number): Promise<{
totalSnapshots: number;
oldestDate: string | null;
newestDate: string | null;
avgProductCount: number;
}> {
const result = await pool.query(`
SELECT
COUNT(*) as total,
MIN(snapshot_date) as oldest,
MAX(snapshot_date) as newest,
AVG(product_count) as avg_products
FROM daily_snapshots
WHERE dispensary_id = $1
`, [dispensaryId]);
const row = result.rows[0];
return {
totalSnapshots: parseInt(row.total) || 0,
oldestDate: row.oldest,
newestDate: row.newest,
avgProductCount: parseFloat(row.avg_products) || 0,
};
}
/**
* Clean up old snapshots (keep last N days)
*/
export async function pruneOldSnapshots(
dispensaryId: number,
keepDays: number = 90
): Promise<number> {
const cutoffDate = new Date();
cutoffDate.setDate(cutoffDate.getDate() - keepDays);
const cutoff = cutoffDate.toISOString().split('T')[0];
const result = await pool.query(`
DELETE FROM daily_snapshots
WHERE dispensary_id = $1 AND snapshot_date < $2
RETURNING id
`, [dispensaryId, cutoff]);
const deleted = result.rows.length;
if (deleted > 0) {
console.log(`[DailySnapshot] Pruned ${deleted} old snapshots for dispensary ${dispensaryId}`);
}
return deleted;
}
/**
* Get the "benchmark" payload for comparison
*
* Strategy:
* 1. If today's snapshot exists, use it
* 2. Otherwise, use the most recent snapshot
* 3. If no snapshots exist, return null (first payload becomes benchmark)
*/
export async function getBenchmarkProducts(
dispensaryId: number
): Promise<any[] | null> {
// First try today
const today = await getDailySnapshot(dispensaryId);
if (today) {
return today.products;
}
// Fall back to most recent
const latest = await getLatestSnapshot(dispensaryId);
if (latest) {
return latest.products;
}
return null;
}
// ============================================================================
// MIGRATION: Create daily_snapshots table if not exists
// ============================================================================
export async function ensureTableExists(): Promise<void> {
await pool.query(`
CREATE TABLE IF NOT EXISTS daily_snapshots (
id BIGSERIAL PRIMARY KEY,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id),
snapshot_date DATE NOT NULL,
products JSONB NOT NULL,
product_count INTEGER,
total_skus INTEGER,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(dispensary_id, snapshot_date)
);
CREATE INDEX IF NOT EXISTS idx_daily_snapshots_lookup
ON daily_snapshots(dispensary_id, snapshot_date DESC);
`);
}
export default {
storeDailySnapshot,
getDailySnapshot,
getLatestSnapshot,
hasSnapshot,
getSnapshotStats,
pruneOldSnapshots,
getBenchmarkProducts,
ensureTableExists,
};
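An illustrative flow for the snapshot service above (example only; the import path is assumed):
// Example only - store today's payload, then fetch the benchmark for diffing
import dailySnapshots from './dailySnapshotService'; // path assumed

async function handlePayload(dispensaryId: number, products: any[]): Promise<void> {
await dailySnapshots.ensureTableExists();
// Only the first payload of the day is stored as the benchmark
const { isNew } = await dailySnapshots.storeDailySnapshot(dispensaryId, products);
// Benchmark = today's snapshot, else the most recent one, else null
const benchmark = await dailySnapshots.getBenchmarkProducts(dispensaryId);
console.log(`[Example] benchmark products: ${benchmark?.length ?? 0}, stored new snapshot: ${isNew}`);
}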

View File

@@ -0,0 +1,333 @@
/**
* Hoodie Analytics Client
*
* Queries Hoodie's Algolia indexes directly (no local sync).
* All data is fetched on-demand from their API.
* Uses Algolia v5 API.
*/
import { algoliasearch, SearchResponse } from 'algoliasearch';
// Hoodie Algolia credentials
const HOODIE_APP_ID = 'O2F6KJTKA2';
const HOODIE_SEARCH_KEY = '25e847f4981049ae7c5081b5ed98a74a';
// Index names
const INDEXES = {
dispensaries: 'all_DISPENSARIES_V2',
products: 'all_PRODUCTS_V2',
brands: 'all_BRANDS_V2',
masterProducts: 'master_products',
locations: 'LOCATIONS',
} as const;
// Types based on Hoodie's schema
export interface HoodieDispensary {
objectID: string;
DISPENSARY_ID: string;
DISPENSARY_NAME: string;
DISPENSARY_COMPANY_NAME: string;
SLUG: string;
STREET_ADDRESS: string;
CITY: string;
STATE: string;
POSTAL_CODE: string;
COUNTRY_CODE: string;
PHONE: string;
EMAIL: string;
WEBSITE: string;
URL: string;
POS_SYSTEM: string;
MENUS_COUNT: number;
MENUS_COUNT_REC: number;
MENUS_COUNT_MED: number;
AVG_DAILY_SALES: number;
ESTIMATED_DAILY_SALES: number;
MEDICAL: boolean;
RECREATIONAL: boolean;
DELIVERY: boolean;
DELIVERY_ENABLED: boolean;
CURBSIDE_PICKUP: boolean;
INSTORE_PICKUP: boolean;
IS_CLOSED: boolean;
RATING: number;
REVIEWS_COUNT: number;
LICENSE_NUMBER: string;
LICENSE_TYPE: string;
BANNER: string;
BANNER_ID: string;
LOGO: string;
DESCRIPTION: string;
TIMEZONE: string;
URBANICITY: string;
MEDIAN_HH_INCOME_1000S: number;
_geoloc: { lat: number; lng: number };
LAST_UPDATED_AT: string;
}
export interface HoodieProduct {
objectID: string;
NAME: string;
BRAND: string;
BRAND_ID: string;
BRAND_COMPANY_NAME: string;
CATEGORY_0: string;
CATEGORY_1: string;
CATEGORY_2: string;
CANNABIS_TYPE: string;
STRAIN: string;
DESCRIPTION: string;
IMG: string;
URL: string;
MENU_SLUG: string;
MASTER_SLUG: string;
MASTERED_STATUS: string;
DISPENSARY_COUNT: number;
IN_STOCK: boolean;
VARIANTS: any[];
CANNABINOIDS: Record<string, number>;
D_STATE: string;
D_CITY: string;
D_BANNER: string;
LAST_SEEN_AT: string;
}
export interface HoodieBrand {
objectID: string;
BRAND_NAME: string;
SLUG: string;
BRAND_DESCRIPTION: string;
BRAND_LOGO_URL: string;
BRAND_URL: string;
PARENT_BRAND: string;
PARENT_COMPANY: string;
STATES: string[];
ACTIVE_VARIANTS: number;
ALL_VARIANTS: number;
LINKEDIN_URL: string;
}
export interface HoodieMasterProduct {
objectID: string;
NAME: string;
BRAND: string;
BRAND_ID: string;
CATEGORY_0: string;
CATEGORY_1: string;
CANNABIS_TYPE: string;
STRAIN: string;
IMG: string;
DISPENSARY_COUNT: number;
VARIANTS: any[];
CANNABINOIDS: Record<string, number>;
}
export interface SearchOptions {
query?: string;
filters?: string;
facetFilters?: string[][];
hitsPerPage?: number;
page?: number;
attributesToRetrieve?: string[];
}
export interface SearchResult<T> {
hits: T[];
nbHits: number;
page: number;
nbPages: number;
hitsPerPage: number;
query: string;
}
// Create Algolia client (v5 API)
const client = algoliasearch(HOODIE_APP_ID, HOODIE_SEARCH_KEY);
class HoodieClient {
// ============================================================
// CORE SEARCH METHOD
// ============================================================
private async search<T>(indexName: string, options: SearchOptions = {}): Promise<SearchResult<T>> {
const result = await client.searchSingleIndex<T>({
indexName,
searchParams: {
query: options.query || '',
filters: options.filters,
facetFilters: options.facetFilters as any,
hitsPerPage: options.hitsPerPage || 20,
page: options.page || 0,
attributesToRetrieve: options.attributesToRetrieve,
},
});
return {
hits: result.hits as T[],
nbHits: result.nbHits || 0,
page: result.page || 0,
nbPages: result.nbPages || 0,
hitsPerPage: result.hitsPerPage || 20,
query: result.query || '',
};
}
// ============================================================
// DISPENSARY QUERIES
// ============================================================
async searchDispensaries(options: SearchOptions = {}): Promise<SearchResult<HoodieDispensary>> {
return this.search<HoodieDispensary>(INDEXES.dispensaries, options);
}
async getDispensaryByName(name: string, state?: string): Promise<HoodieDispensary | null> {
const filters = state ? `STATE:"${state}"` : undefined;
const result = await this.searchDispensaries({ query: name, filters, hitsPerPage: 1 });
return result.hits[0] || null;
}
async getDispensaryBySlug(slug: string): Promise<HoodieDispensary | null> {
const result = await this.searchDispensaries({ filters: `SLUG:"${slug}"`, hitsPerPage: 1 });
return result.hits[0] || null;
}
async getDispensariesByState(state: string, options: SearchOptions = {}): Promise<SearchResult<HoodieDispensary>> {
return this.searchDispensaries({
...options,
filters: `STATE:"${state}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
async getDispensariesByCity(city: string, state: string, options: SearchOptions = {}): Promise<SearchResult<HoodieDispensary>> {
return this.searchDispensaries({
...options,
filters: `STATE:"${state}" AND CITY:"${city}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
async getDispensariesByPOS(posSystem: string, options: SearchOptions = {}): Promise<SearchResult<HoodieDispensary>> {
return this.searchDispensaries({
...options,
filters: `POS_SYSTEM:"${posSystem}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
async getDispensariesByBanner(banner: string, options: SearchOptions = {}): Promise<SearchResult<HoodieDispensary>> {
return this.searchDispensaries({
...options,
filters: `BANNER:"${banner}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
// ============================================================
// PRODUCT QUERIES
// ============================================================
async searchProducts(options: SearchOptions = {}): Promise<SearchResult<HoodieProduct>> {
return this.search<HoodieProduct>(INDEXES.products, options);
}
async getProductsByBrand(brand: string, options: SearchOptions = {}): Promise<SearchResult<HoodieProduct>> {
return this.searchProducts({
...options,
filters: `BRAND:"${brand}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
async getProductsByCategory(category: string, options: SearchOptions = {}): Promise<SearchResult<HoodieProduct>> {
return this.searchProducts({
...options,
filters: `CATEGORY_0:"${category}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
async getProductsByState(state: string, options: SearchOptions = {}): Promise<SearchResult<HoodieProduct>> {
return this.searchProducts({
...options,
filters: `D_STATE:"${state}"${options.filters ? ` AND ${options.filters}` : ''}`,
});
}
// ============================================================
// BRAND QUERIES
// ============================================================
async searchBrands(options: SearchOptions = {}): Promise<SearchResult<HoodieBrand>> {
return this.search<HoodieBrand>(INDEXES.brands, options);
}
async getBrandByName(name: string): Promise<HoodieBrand | null> {
const result = await this.searchBrands({ query: name, hitsPerPage: 1 });
return result.hits[0] || null;
}
async getBrandBySlug(slug: string): Promise<HoodieBrand | null> {
const result = await this.searchBrands({ filters: `SLUG:"${slug}"`, hitsPerPage: 1 });
return result.hits[0] || null;
}
async getBrandsByState(state: string, options: SearchOptions = {}): Promise<SearchResult<HoodieBrand>> {
// STATES is an array, use facetFilters
return this.searchBrands({
...options,
facetFilters: [[`STATES:${state}`], ...(options.facetFilters || [])],
});
}
// ============================================================
// MASTER PRODUCT QUERIES
// ============================================================
async searchMasterProducts(options: SearchOptions = {}): Promise<SearchResult<HoodieMasterProduct>> {
return this.search<HoodieMasterProduct>(INDEXES.masterProducts, options);
}
async getMasterProductByName(name: string, brand?: string): Promise<HoodieMasterProduct | null> {
const filters = brand ? `BRAND:"${brand}"` : undefined;
const result = await this.searchMasterProducts({ query: name, filters, hitsPerPage: 1 });
return result.hits[0] || null;
}
// ============================================================
// STATS / COUNTS
// ============================================================
async getIndexCounts(): Promise<Record<string, number>> {
const [dispensaries, products, brands, masterProducts, locations] = await Promise.all([
this.search(INDEXES.dispensaries, { hitsPerPage: 0 }),
this.search(INDEXES.products, { hitsPerPage: 0 }),
this.search(INDEXES.brands, { hitsPerPage: 0 }),
this.search(INDEXES.masterProducts, { hitsPerPage: 0 }),
this.search(INDEXES.locations, { hitsPerPage: 0 }),
]);
return {
dispensaries: dispensaries.nbHits,
products: products.nbHits,
brands: brands.nbHits,
masterProducts: masterProducts.nbHits,
locations: locations.nbHits,
};
}
async getStateStats(state: string): Promise<{
dispensaries: number;
products: number;
brands: number;
}> {
const [dispensaries, products, brands] = await Promise.all([
this.searchDispensaries({ filters: `STATE:"${state}"`, hitsPerPage: 0 }),
this.searchProducts({ filters: `D_STATE:"${state}"`, hitsPerPage: 0 }),
this.getBrandsByState(state, { hitsPerPage: 0 }),
]);
return {
dispensaries: dispensaries.nbHits,
products: products.nbHits,
brands: brands.nbHits,
};
}
}
// Singleton instance
export const hoodieClient = new HoodieClient();
export default hoodieClient;
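Example queries against the client above (illustrative only; the import path and the POS facet value are assumptions):
// Example only
import hoodieClient from './client'; // path assumed

async function exampleQueries(): Promise<void> {
// State-level counts across dispensaries, products, and brands
const azStats = await hoodieClient.getStateStats('AZ');
console.log('[Example] AZ stats:', azStats);
// Dispensaries on a given POS system in a state
const dutchieStores = await hoodieClient.getDispensariesByState('AZ', {
filters: 'POS_SYSTEM:"dutchie"', // facet value assumed
hitsPerPage: 10,
});
console.log(`[Example] ${dutchieStores.nbHits} matching dispensaries in AZ`);
}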

View File

@@ -0,0 +1,342 @@
/**
* Hoodie Comparison Service
*
* Runs scheduled comparisons between Hoodie and CannaIQ data.
* Stores delta results - raw Hoodie data stays remote (proxy only).
*/
import { pool } from '../../db/pool';
import { hoodieClient, HoodieDispensary, HoodieBrand } from './client';
export interface ComparisonResult {
reportType: 'dispensaries' | 'brands';
state: string;
hoodieTotalCount: number;
cannaiqTotal: number;
inBoth: number;
hoodieOnly: number;
cannaiqOnly: number;
hoodieOnlyItems: any[];
cannaiqOnlyItems: any[];
matchedItems: any[];
durationMs: number;
}
// Normalize name for comparison
const normalize = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
/**
* Compare dispensaries between Hoodie and CannaIQ for a state
*/
export async function compareDispensaries(state: string): Promise<ComparisonResult> {
const startTime = Date.now();
// Fetch all Hoodie dispensaries for state (paginate through all)
const hoodieDisps: HoodieDispensary[] = [];
let page = 0;
const pageSize = 100;
while (true) {
const result = await hoodieClient.getDispensariesByState(state, {
hitsPerPage: pageSize,
page,
});
hoodieDisps.push(...result.hits);
if (result.hits.length < pageSize || hoodieDisps.length >= result.nbHits) {
break;
}
page++;
}
// Fetch CannaIQ dispensaries for state
const cannaiqResult = await pool.query(
`SELECT id, name, city, menu_type, slug, address1, phone
FROM dispensaries
WHERE state = $1`,
[state]
);
const cannaiqDisps = cannaiqResult.rows;
// Build lookup maps
const hoodieByName = new Map(hoodieDisps.map(d => [normalize(d.DISPENSARY_NAME), d]));
const cannaiqByName = new Map(cannaiqDisps.map(d => [normalize(d.name), d]));
const inBoth: any[] = [];
const hoodieOnly: any[] = [];
const cannaiqOnly: any[] = [];
// Find matches and Hoodie-only
for (const [normName, hoodie] of hoodieByName) {
const cannaiq = cannaiqByName.get(normName);
if (cannaiq) {
inBoth.push({
name: hoodie.DISPENSARY_NAME,
city: hoodie.CITY,
hoodie: {
slug: hoodie.SLUG,
pos: hoodie.POS_SYSTEM,
menus: hoodie.MENUS_COUNT,
daily_sales: hoodie.AVG_DAILY_SALES,
rating: hoodie.RATING,
},
cannaiq: {
id: cannaiq.id,
slug: cannaiq.slug,
menu_type: cannaiq.menu_type,
},
});
} else {
hoodieOnly.push({
name: hoodie.DISPENSARY_NAME,
city: hoodie.CITY,
address: hoodie.STREET_ADDRESS,
slug: hoodie.SLUG,
pos: hoodie.POS_SYSTEM,
menus: hoodie.MENUS_COUNT,
daily_sales: hoodie.AVG_DAILY_SALES,
rating: hoodie.RATING,
url: hoodie.URL,
});
}
}
// Find CannaIQ-only
for (const [normName, cannaiq] of cannaiqByName) {
if (!hoodieByName.has(normName)) {
cannaiqOnly.push({
id: cannaiq.id,
name: cannaiq.name,
city: cannaiq.city,
slug: cannaiq.slug,
menu_type: cannaiq.menu_type,
});
}
}
const durationMs = Date.now() - startTime;
return {
reportType: 'dispensaries',
state,
hoodieTotalCount: hoodieDisps.length,
cannaiqTotal: cannaiqDisps.length,
inBoth: inBoth.length,
hoodieOnly: hoodieOnly.length,
cannaiqOnly: cannaiqOnly.length,
hoodieOnlyItems: hoodieOnly,
cannaiqOnlyItems: cannaiqOnly,
matchedItems: inBoth,
durationMs,
};
}
/**
* Compare brands between Hoodie and CannaIQ for a state
*/
export async function compareBrands(state: string): Promise<ComparisonResult> {
const startTime = Date.now();
// Fetch all Hoodie brands for state (paginate through all)
const hoodieBrands: HoodieBrand[] = [];
let page = 0;
const pageSize = 100;
while (true) {
const result = await hoodieClient.getBrandsByState(state, {
hitsPerPage: pageSize,
page,
});
hoodieBrands.push(...result.hits);
if (result.hits.length < pageSize || hoodieBrands.length >= result.nbHits) {
break;
}
page++;
}
// Fetch CannaIQ brands for state (from products)
const cannaiqResult = await pool.query(`
SELECT DISTINCT p.brand_name_raw as name, COUNT(*) as product_count
FROM store_products p
JOIN dispensaries d ON d.id = p.dispensary_id
WHERE d.state = $1 AND p.brand_name_raw IS NOT NULL
GROUP BY p.brand_name_raw
`, [state]);
const cannaiqBrands = cannaiqResult.rows;
// Build lookup maps
const hoodieByName = new Map(hoodieBrands.map(b => [normalize(b.BRAND_NAME), b]));
const cannaiqByName = new Map(cannaiqBrands.map(b => [normalize(b.name), b]));
const inBoth: any[] = [];
const hoodieOnly: any[] = [];
const cannaiqOnly: any[] = [];
// Find matches and Hoodie-only
for (const [normName, hoodie] of hoodieByName) {
const cannaiq = cannaiqByName.get(normName);
if (cannaiq) {
inBoth.push({
name: hoodie.BRAND_NAME,
hoodie: {
slug: hoodie.SLUG,
variants: hoodie.ACTIVE_VARIANTS,
parent: hoodie.PARENT_COMPANY,
logo: hoodie.BRAND_LOGO_URL,
},
cannaiq: {
product_count: cannaiq.product_count,
},
});
} else {
hoodieOnly.push({
name: hoodie.BRAND_NAME,
slug: hoodie.SLUG,
variants: hoodie.ACTIVE_VARIANTS,
parent: hoodie.PARENT_COMPANY,
logo: hoodie.BRAND_LOGO_URL,
url: hoodie.BRAND_URL,
});
}
}
// Find CannaIQ-only
for (const [normName, cannaiq] of cannaiqByName) {
if (!hoodieByName.has(normName)) {
cannaiqOnly.push({
name: cannaiq.name,
product_count: cannaiq.product_count,
});
}
}
const durationMs = Date.now() - startTime;
return {
reportType: 'brands',
state,
hoodieTotalCount: hoodieBrands.length,
cannaiqTotal: cannaiqBrands.length,
inBoth: inBoth.length,
hoodieOnly: hoodieOnly.length,
cannaiqOnly: cannaiqOnly.length,
hoodieOnlyItems: hoodieOnly,
cannaiqOnlyItems: cannaiqOnly,
matchedItems: inBoth,
durationMs,
};
}
/**
* Save comparison result to database
*/
export async function saveComparisonReport(result: ComparisonResult): Promise<number> {
const { rows } = await pool.query(`
INSERT INTO hoodie_comparison_reports (
report_type, state,
hoodie_total, cannaiq_total, in_both, hoodie_only, cannaiq_only,
hoodie_only_items, cannaiq_only_items, matched_items,
duration_ms
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
RETURNING id
`, [
result.reportType,
result.state,
result.hoodieTotalCount,
result.cannaiqTotal,
result.inBoth,
result.hoodieOnly,
result.cannaiqOnly,
JSON.stringify(result.hoodieOnlyItems),
JSON.stringify(result.cannaiqOnlyItems),
JSON.stringify(result.matchedItems),
result.durationMs,
]);
return rows[0].id;
}
/**
* Run comparison and save results
*/
export async function runComparisonReport(
reportType: 'dispensaries' | 'brands',
state: string
): Promise<{ reportId: number; result: ComparisonResult }> {
let result: ComparisonResult;
if (reportType === 'dispensaries') {
result = await compareDispensaries(state);
} else {
result = await compareBrands(state);
}
const reportId = await saveComparisonReport(result);
return { reportId, result };
}
/**
* Run all comparisons for a state (dispensaries + brands)
*/
export async function runAllComparisons(state: string): Promise<{
dispensaries: { reportId: number; result: ComparisonResult };
brands: { reportId: number; result: ComparisonResult };
}> {
const [dispensaries, brands] = await Promise.all([
runComparisonReport('dispensaries', state),
runComparisonReport('brands', state),
]);
return { dispensaries, brands };
}
/**
* Get latest comparison reports
*/
export async function getLatestReports(state?: string): Promise<any[]> {
let query = 'SELECT * FROM v_hoodie_latest_reports';
const params: any[] = [];
if (state) {
query += ' WHERE state = $1';
params.push(state);
}
query += ' ORDER BY report_type, state';
const { rows } = await pool.query(query, params);
return rows;
}
/**
* Get comparison report by ID with full details
*/
export async function getReportById(id: number): Promise<any | null> {
const { rows } = await pool.query(
'SELECT * FROM hoodie_comparison_reports WHERE id = $1',
[id]
);
return rows[0] || null;
}
/**
* Get report history for a type/state
*/
export async function getReportHistory(
reportType: string,
state: string,
limit: number = 30
): Promise<any[]> {
const { rows } = await pool.query(`
SELECT id, report_type, state, hoodie_total, cannaiq_total,
in_both, hoodie_only, cannaiq_only, created_at, duration_ms
FROM hoodie_comparison_reports
WHERE report_type = $1 AND state = $2
ORDER BY created_at DESC
LIMIT $3
`, [reportType, state, limit]);
return rows;
}
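A short sketch of running and persisting both comparisons for a state (illustrative only; the import path is assumed):
// Example only
import { runAllComparisons } from './comparison'; // path assumed

async function runArizonaReports(): Promise<void> {
const { dispensaries, brands } = await runAllComparisons('AZ');
console.log(`[Example] dispensary report #${dispensaries.reportId}: ` +
`${dispensaries.result.inBoth} matched, ${dispensaries.result.hoodieOnly} Hoodie-only`);
console.log(`[Example] brand report #${brands.reportId}: ` +
`${brands.result.inBoth} matched, ${brands.result.cannaiqOnly} CannaIQ-only`);
}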

View File

@@ -38,20 +38,34 @@ export interface WorkerIdentity {
}
export interface IdentityFingerprint {
// Browser & OS
userAgent: string;
browser: string;
browserVersion: string;
os: string;
osVersion: string;
platform: string; // e.g., "Win32", "MacIntel", "Linux x86_64"
// Display & Device
device: 'desktop' | 'mobile' | 'tablet';
screenWidth: number;
screenHeight: number;
colorDepth: number;
pixelRatio: number;
// Hardware
hardwareConcurrency: number; // CPU cores
deviceMemory: number; // RAM in GB
maxTouchPoints: number;
// WebGL
webglVendor: string;
webglRenderer: string;
// Language & Locale
timezone: string;
locale: string;
// Additional anti-detect properties
webglVendor?: string;
webglRenderer?: string;
languages?: string[];
languages: string[];
}
export interface PendingTaskGeo {
@@ -225,15 +239,92 @@ export function generateFingerprint(stateCode: string): IdentityFingerprint {
// Build user agent
const userAgent = buildUserAgent(browser, browserVersion, os, osVersion, device);
// Platform string based on OS
let platform: string;
if (os === 'Windows') {
platform = 'Win32';
} else if (os === 'macOS') {
platform = 'MacIntel';
} else if (os === 'iOS') {
platform = device === 'tablet' ? 'iPad' : 'iPhone';
} else {
platform = 'Linux armv8l'; // Android
}
// Hardware specs based on device type
let hardwareConcurrency: number;
let deviceMemory: number;
let maxTouchPoints: number;
if (device === 'desktop') {
hardwareConcurrency = randomFrom([4, 6, 8, 12, 16]);
deviceMemory = randomFrom([8, 16, 32]);
maxTouchPoints = 0;
} else if (device === 'mobile') {
hardwareConcurrency = randomFrom([4, 6, 8]);
deviceMemory = randomFrom([4, 6, 8]);
maxTouchPoints = randomFrom([5, 10]);
} else {
// tablet
hardwareConcurrency = randomFrom([4, 6, 8]);
deviceMemory = randomFrom([4, 8]);
maxTouchPoints = randomFrom([5, 10]);
}
// WebGL vendor/renderer based on device
let webglVendor: string;
let webglRenderer: string;
if (os === 'Windows') {
webglVendor = 'Google Inc. (NVIDIA)';
webglRenderer = randomFrom([
'ANGLE (NVIDIA, NVIDIA GeForce RTX 3060 Direct3D11 vs_5_0 ps_5_0)',
'ANGLE (NVIDIA, NVIDIA GeForce GTX 1660 Direct3D11 vs_5_0 ps_5_0)',
'ANGLE (Intel, Intel(R) UHD Graphics 630 Direct3D11 vs_5_0 ps_5_0)',
'ANGLE (AMD, AMD Radeon RX 580 Direct3D11 vs_5_0 ps_5_0)',
]);
} else if (os === 'macOS') {
webglVendor = 'Apple Inc.';
webglRenderer = randomFrom([
'Apple M1',
'Apple M2',
'Apple M3',
'AMD Radeon Pro 5500M',
'Intel(R) Iris(TM) Plus Graphics',
]);
} else if (os === 'iOS') {
webglVendor = 'Apple Inc.';
webglRenderer = 'Apple GPU';
} else {
// Android
webglVendor = 'Qualcomm';
webglRenderer = randomFrom([
'Adreno (TM) 730',
'Adreno (TM) 660',
'Mali-G78',
]);
}
// Pixel ratio based on device
const pixelRatio = device === 'desktop' ? randomFrom([1, 1.25, 1.5, 2]) : randomFrom([2, 3]);
return {
userAgent,
browser: `${browser.charAt(0).toUpperCase()}${browser.slice(1)}`,
browserVersion,
os,
osVersion,
platform,
device,
screenWidth: resolution.width,
screenHeight: resolution.height,
colorDepth: 24,
pixelRatio,
hardwareConcurrency,
deviceMemory,
maxTouchPoints,
webglVendor,
webglRenderer,
timezone,
locale: 'en-US',
languages: ['en-US', 'en'],
@@ -331,6 +422,9 @@ export class IdentityPoolService {
/**
* Create a new identity with Evomi proxy
* Generates session ID, gets IP, creates fingerprint
*
* IMPORTANT: If IP already exists in DB, reuse that identity's fingerprint
* This ensures fingerprints are "sticky" per IP for anti-detection consistency
*/
async createIdentity(
workerId: string,
@@ -365,14 +459,49 @@ export class IdentityPoolService {
timeout: 15000,
});
ipAddress = response.data?.ip || null;
console.log(`[IdentityPool] New identity IP: ${ipAddress} (${proxyResult.geo})`);
console.log(`[IdentityPool] Proxy IP: ${ipAddress} (${proxyResult.geo})`);
} catch (err: any) {
console.error(`[IdentityPool] Failed to get IP for new identity: ${err.message}`);
// Still create identity - IP will be detected during preflight
}
// Generate fingerprint
// STICKY FINGERPRINT: Check if IP already exists in DB
// If so, reuse that identity's fingerprint for consistency
if (ipAddress) {
const existingResult = await this.pool.query(`
SELECT * FROM worker_identities
WHERE ip_address = $1::inet
ORDER BY last_used_at DESC NULLS LAST
LIMIT 1
`, [ipAddress]);
if (existingResult.rows[0]) {
const existing = existingResult.rows[0];
console.log(`[IdentityPool] Found existing identity #${existing.id} for IP ${ipAddress} - reusing fingerprint`);
// Update the existing identity to be active with this worker
// Also update session_id so proxy session matches
const updateResult = await this.pool.query(`
UPDATE worker_identities
SET session_id = $1,
is_active = TRUE,
active_worker_id = $2,
last_used_at = NOW(),
total_sessions = total_sessions + 1
WHERE id = $3
RETURNING *
`, [sessionId, workerId, existing.id]);
if (updateResult.rows[0]) {
console.log(`[IdentityPool] Reactivated identity #${existing.id} for ${stateCode} (sticky fingerprint)`);
return this.rowToIdentity(updateResult.rows[0]);
}
}
}
// IP not found in DB - generate new fingerprint and store
const fingerprint = generateFingerprint(stateCode);
console.log(`[IdentityPool] New IP ${ipAddress || 'unknown'} - generated fresh fingerprint`);
// Insert into database
const insertResult = await this.pool.query(`

View File

@@ -1,23 +1,18 @@
/**
* Inventory Snapshots Service
* Inventory Snapshots Service (Delta-Only)
*
* Shared utility for saving lightweight inventory snapshots after each crawl.
* Normalizes fields across all platforms (Dutchie, Jane, Treez) into a
* common format for sales velocity tracking and analytics.
* Only stores snapshots when something CHANGES (quantity, price, status).
* This reduces storage by ~95% while capturing all meaningful events.
*
* Part of Real-Time Inventory Tracking feature.
*
* Field mappings:
* | Field | Dutchie | Jane | Treez |
* |-----------|------------------------|--------------------|------------------|
* | ID | id | product_id | id |
* | Quantity | children.quantityAvailable | max_cart_quantity | availableUnits |
* | Low stock | isBelowThreshold | false | !isAboveThreshold|
* | Price rec | recPrices[0] | bucket_price | customMinPrice |
* | Brand | brand.name | brand | brand |
* | Category | category | kind | category |
* | Name | Name | name | name |
* | Status | Status | (presence=active) | status |
* Change types:
* - sale: quantity decreased (qty_delta < 0)
* - restock: quantity increased (qty_delta > 0)
* - price_change: price changed but quantity same
* - oos: went out of stock (quantity -> 0)
* - back_in_stock: came back in stock (0 -> quantity)
* - new_product: first time seeing this product
*/
import { Pool } from 'pg';
@@ -31,11 +26,37 @@ interface SnapshotRow {
status: string | null;
price_rec: number | null;
price_med: number | null;
price_rec_special: number | null;
price_med_special: number | null;
is_on_special: boolean;
brand_name: string | null;
category: string | null;
product_name: string | null;
}
interface PreviousState {
quantity_available: number | null;
price_rec: number | null;
price_med: number | null;
status: string | null;
captured_at: Date;
}
interface DeltaSnapshot extends SnapshotRow {
prev_quantity: number | null;
prev_price_rec: number | null;
prev_price_med: number | null;
prev_status: string | null;
qty_delta: number | null;
price_delta: number | null;
change_type: string;
effective_price_rec: number | null;
effective_price_med: number | null;
revenue_rec: number | null;
revenue_med: number | null;
hours_since_last: number | null;
}
/**
* Extract a normalized snapshot row from a raw product based on platform.
*/
@@ -46,6 +67,9 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
let status: string | null = null;
let priceRec: number | null = null;
let priceMed: number | null = null;
let priceRecSpecial: number | null = null;
let priceMedSpecial: number | null = null;
let isOnSpecial = false;
let brandName: string | null = null;
let category: string | null = null;
let productName: string | null = null;
@@ -75,6 +99,15 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
const medPrices = product.medicalPrices || product.medPrices || [];
priceMed = medPrices.length > 0 ? parseFloat(medPrices[0]) : null;
// Special/sale prices
if (product.specialPrices && product.specialPrices.length > 0) {
priceRecSpecial = parseFloat(product.specialPrices[0]);
isOnSpecial = true;
} else if (product.discountedPrices && product.discountedPrices.length > 0) {
priceRecSpecial = parseFloat(product.discountedPrices[0]);
isOnSpecial = true;
}
break;
}
@@ -83,20 +116,24 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
productName = product.name;
brandName = product.brand || null;
category = product.kind || null;
status = 'Active'; // Jane products present = active
isBelowThreshold = false; // Jane doesn't expose this
status = 'Active';
isBelowThreshold = false;
// Quantity: max_cart_quantity
quantityAvailable = product.max_cart_quantity ?? null;
// Price: bucket_price or first available weight-based price
priceRec =
product.bucket_price ||
product.price_gram ||
product.price_eighth_ounce ||
product.price_each ||
null;
priceMed = null; // Jane doesn't separate med prices clearly
priceMed = null;
// Jane sale prices
if (product.discounted_price && priceRec && product.discounted_price < priceRec) {
priceRecSpecial = product.discounted_price;
isOnSpecial = true;
}
break;
}
@@ -107,15 +144,17 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
category = product.category || null;
status = product.status || (product.isActive ? 'ACTIVE' : 'INACTIVE');
// Quantity: availableUnits
quantityAvailable = product.availableUnits ?? null;
// Low stock: inverse of isAboveThreshold
isBelowThreshold = product.isAboveThreshold === false;
// Price: customMinPrice
priceRec = product.customMinPrice ?? null;
priceMed = null; // Treez doesn't distinguish med pricing
priceMed = null;
// Treez sale prices
if (product.customOnSaleValue && priceRec && product.customOnSaleValue < priceRec) {
priceRecSpecial = product.customOnSaleValue;
isOnSpecial = true;
}
break;
}
}
@@ -131,6 +170,9 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
status,
price_rec: priceRec,
price_med: priceMed,
price_rec_special: priceRecSpecial,
price_med_special: priceMedSpecial,
is_on_special: isOnSpecial,
brand_name: brandName,
category,
product_name: productName,
@@ -138,61 +180,223 @@ function normalizeProduct(product: any, platform: Platform): SnapshotRow | null
}
/**
* Save inventory snapshots for all products in a crawl result.
*
* Call this after fetching products in any platform handler.
* Uses bulk insert for efficiency.
* Determine if product state changed and calculate deltas
*/
function calculateDelta(
current: SnapshotRow,
previous: PreviousState | null,
now: Date
): DeltaSnapshot | null {
const qtyChanged =
previous?.quantity_available !== current.quantity_available;
const priceRecChanged =
previous?.price_rec !== current.price_rec;
const priceMedChanged =
previous?.price_med !== current.price_med;
const statusChanged =
previous?.status !== current.status;
// No change - skip
if (previous && !qtyChanged && !priceRecChanged && !priceMedChanged && !statusChanged) {
return null;
}
// Calculate qty delta
const prevQty = previous?.quantity_available ?? null;
const currQty = current.quantity_available ?? 0;
const qtyDelta = previous ? currQty - (prevQty ?? 0) : null;
// Calculate price delta
const priceDelta = previous && current.price_rec && previous.price_rec
? current.price_rec - previous.price_rec
: null;
// Determine change type
let changeType = 'new_product';
if (previous) {
if (currQty === 0 && (prevQty ?? 0) > 0) {
changeType = 'oos';
} else if (currQty > 0 && (prevQty ?? 0) === 0) {
changeType = 'back_in_stock';
} else if (qtyDelta !== null && qtyDelta < 0) {
changeType = 'sale';
} else if (qtyDelta !== null && qtyDelta > 0) {
changeType = 'restock';
} else if (priceRecChanged || priceMedChanged) {
changeType = 'price_change';
} else {
changeType = 'status_change';
}
}
// Calculate effective prices (sale price if on special, otherwise regular)
const effectivePriceRec = current.is_on_special && current.price_rec_special
? current.price_rec_special
: current.price_rec;
const effectivePriceMed = current.is_on_special && current.price_med_special
? current.price_med_special
: current.price_med;
// Calculate revenue (only for sales)
let revenueRec: number | null = null;
let revenueMed: number | null = null;
if (changeType === 'sale' && qtyDelta !== null && qtyDelta < 0) {
const unitsSold = Math.abs(qtyDelta);
if (effectivePriceRec) {
revenueRec = unitsSold * effectivePriceRec;
}
if (effectivePriceMed) {
revenueMed = unitsSold * effectivePriceMed;
}
}
// Calculate hours since last snapshot
let hoursSinceLast: number | null = null;
if (previous?.captured_at) {
const msDiff = now.getTime() - previous.captured_at.getTime();
hoursSinceLast = Math.round((msDiff / 3600000) * 100) / 100; // 2 decimal places
}
return {
...current,
prev_quantity: prevQty,
prev_price_rec: previous?.price_rec ?? null,
prev_price_med: previous?.price_med ?? null,
prev_status: previous?.status ?? null,
qty_delta: qtyDelta,
price_delta: priceDelta,
change_type: changeType,
effective_price_rec: effectivePriceRec,
effective_price_med: effectivePriceMed,
revenue_rec: revenueRec,
revenue_med: revenueMed,
hours_since_last: hoursSinceLast,
};
}
/**
* Get the previous snapshot state for a dispensary.
* Returns a map of product_id -> previous state.
*/
export async function getPreviousSnapshots(
pool: Pool,
dispensaryId: number
): Promise<Map<string, PreviousState>> {
const result = await pool.query(
`
SELECT DISTINCT ON (product_id)
product_id,
quantity_available,
price_rec,
price_med,
status,
captured_at
FROM inventory_snapshots
WHERE dispensary_id = $1
ORDER BY product_id, captured_at DESC
`,
[dispensaryId]
);
const map = new Map<string, PreviousState>();
for (const row of result.rows) {
map.set(row.product_id, {
quantity_available: row.quantity_available,
price_rec: row.price_rec ? parseFloat(row.price_rec) : null,
price_med: row.price_med ? parseFloat(row.price_med) : null,
status: row.status,
captured_at: row.captured_at,
});
}
return map;
}
/**
* Save delta-only inventory snapshots.
* Only stores rows where something changed (qty, price, or status).
*
* @param pool - Database connection pool
* @param dispensaryId - The dispensary ID
* @param products - Array of raw products from the platform
* @param platform - The platform type
* @returns Number of snapshots saved
* @returns Object with counts: { total, changed, sales, restocks }
*/
export async function saveInventorySnapshots(
pool: Pool,
dispensaryId: number,
products: any[],
platform: Platform
): Promise<number> {
): Promise<{ total: number; changed: number; sales: number; restocks: number; revenue: number }> {
if (!products || products.length === 0) {
return 0;
return { total: 0, changed: 0, sales: 0, restocks: 0, revenue: 0 };
}
const snapshots: SnapshotRow[] = [];
const now = new Date();
// Get previous state for comparison
const previousStates = await getPreviousSnapshots(pool, dispensaryId);
// Normalize products and calculate deltas
const deltas: DeltaSnapshot[] = [];
let salesCount = 0;
let restockCount = 0;
let totalRevenue = 0;
for (const product of products) {
const row = normalizeProduct(product, platform);
if (row) {
snapshots.push(row);
const normalized = normalizeProduct(product, platform);
if (!normalized) continue;
const previous = previousStates.get(normalized.product_id) || null;
const delta = calculateDelta(normalized, previous, now);
if (delta) {
deltas.push(delta);
if (delta.change_type === 'sale') {
salesCount++;
totalRevenue += (delta.revenue_rec || 0) + (delta.revenue_med || 0);
} else if (delta.change_type === 'restock') {
restockCount++;
}
}
}
if (snapshots.length === 0) {
return 0;
if (deltas.length === 0) {
return { total: products.length, changed: 0, sales: 0, restocks: 0, revenue: 0 };
}
// Bulk insert using VALUES list
// Build parameterized query
// Bulk insert deltas
const values: any[] = [];
const placeholders: string[] = [];
let paramIndex = 1;
for (const s of snapshots) {
for (const d of deltas) {
placeholders.push(
`($${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++})`
`($${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++}, $${paramIndex++})`
);
values.push(
dispensaryId,
s.product_id,
d.product_id,
platform,
s.quantity_available,
s.is_below_threshold,
s.status,
s.price_rec,
s.price_med,
s.brand_name,
s.category,
s.product_name
d.quantity_available,
d.is_below_threshold,
d.status,
d.price_rec,
d.price_med,
d.brand_name,
d.category,
d.product_name,
d.prev_quantity,
d.prev_price_rec,
d.prev_price_med,
d.prev_status,
d.qty_delta,
d.price_delta,
d.change_type,
d.effective_price_rec,
d.effective_price_med,
d.revenue_rec,
d.revenue_med,
d.hours_since_last
);
}
@@ -208,45 +412,71 @@ export async function saveInventorySnapshots(
price_med,
brand_name,
category,
product_name
product_name,
prev_quantity,
prev_price_rec,
prev_price_med,
prev_status,
qty_delta,
price_delta,
change_type,
effective_price_rec,
effective_price_med,
revenue_rec,
revenue_med,
hours_since_last
) VALUES ${placeholders.join(', ')}
`;
await pool.query(query, values);
return snapshots.length;
return {
total: products.length,
changed: deltas.length,
sales: salesCount,
restocks: restockCount,
revenue: Math.round(totalRevenue * 100) / 100,
};
}
/**
* Get the previous snapshot for a dispensary (for delta calculation).
* Returns a map of product_id -> snapshot data.
* Get snapshot statistics for a dispensary
*/
export async function getPreviousSnapshots(
export async function getSnapshotStats(
pool: Pool,
dispensaryId: number
): Promise<Map<string, SnapshotRow>> {
dispensaryId: number,
hours: number = 24
): Promise<{
totalSnapshots: number;
sales: number;
restocks: number;
priceChanges: number;
oosEvents: number;
revenue: number;
}> {
const result = await pool.query(
`
SELECT DISTINCT ON (product_id)
product_id,
quantity_available,
is_below_threshold,
status,
price_rec,
price_med,
brand_name,
category,
product_name
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE change_type = 'sale') as sales,
COUNT(*) FILTER (WHERE change_type = 'restock') as restocks,
COUNT(*) FILTER (WHERE change_type = 'price_change') as price_changes,
COUNT(*) FILTER (WHERE change_type = 'oos') as oos_events,
COALESCE(SUM(revenue_rec), 0) + COALESCE(SUM(revenue_med), 0) as revenue
FROM inventory_snapshots
WHERE dispensary_id = $1
ORDER BY product_id, captured_at DESC
AND captured_at >= NOW() - INTERVAL '1 hour' * $2
`,
[dispensaryId]
[dispensaryId, hours]
);
const map = new Map<string, SnapshotRow>();
for (const row of result.rows) {
map.set(row.product_id, row);
}
return map;
const row = result.rows[0];
return {
totalSnapshots: parseInt(row.total),
sales: parseInt(row.sales),
restocks: parseInt(row.restocks),
priceChanges: parseInt(row.price_changes),
oosEvents: parseInt(row.oos_events),
revenue: parseFloat(row.revenue) || 0,
};
}
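A sketch of calling the delta-only snapshot writer after a crawl (illustrative only; the import path and the 'dutchie' platform literal are assumptions):
// Example only
import { Pool } from 'pg';
import { saveInventorySnapshots, getSnapshotStats } from './inventorySnapshots'; // path assumed

async function afterCrawl(pool: Pool, dispensaryId: number, products: any[]): Promise<void> {
// Only changed products (qty, price, or status) are inserted
const result = await saveInventorySnapshots(pool, dispensaryId, products, 'dutchie');
console.log(`[Example] ${result.changed}/${result.total} products changed, ` +
`${result.sales} sales (~$${result.revenue}), ${result.restocks} restocks`);
// 24-hour rollup of change events for the same store
const stats = await getSnapshotStats(pool, dispensaryId, 24);
console.log('[Example] last 24h:', stats);
}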

View File

@@ -0,0 +1,663 @@
/**
* Inventory Tracker Service
*
* Compares two payloads and detects inventory changes:
* - New products
* - Removed products
* - Quantity changes (sales/restocks)
* - Price changes
* - Cannabinoid changes
* - Effect changes
*
* Key insight: New products are also diffs! Daily payloads become the benchmark,
* and we track everything that's new or changed.
*/
import { pool } from '../db/pool';
import crypto from 'crypto';
// ============================================================================
// TYPES
// ============================================================================
export interface InventoryChange {
dispensaryId: number;
productId: string;
canonicalId?: string;
canonicalSku?: string;
productName: string;
brandName?: string;
option: string;
changeType: 'sale' | 'restock' | 'price_change' | 'new' | 'removed' | 'cannabinoid_change' | 'effect_change';
quantityBefore?: number;
quantityAfter?: number;
quantityDelta?: number;
price?: number;
specialPrice?: number;
isSpecial?: boolean;
revenue?: number;
category?: string;
subcategory?: string;
strainType?: string;
thcContent?: number;
cbdContent?: number;
thcaContent?: number;
cbgContent?: number;
cannabinoids?: any;
effects?: any;
payloadTimestamp: Date;
}
interface VariantData {
productId: string;
productName: string;
brandName?: string;
option: string;
quantity: number;
quantityAvailable?: number;
price: number;
specialPrice?: number;
isSpecial: boolean;
category?: string;
subcategory?: string;
strainType?: string;
thcContent?: number;
cbdContent?: number;
thcaContent?: number;
cbgContent?: number;
cannabinoids?: any;
effects?: any;
canonicalId?: string;
canonicalSku?: string;
}
interface DiffResult {
changes: InventoryChange[];
summary: {
newProducts: number;
removedProducts: number;
sales: number;
restocks: number;
priceChanges: number;
cannabinoidChanges: number;
effectChanges: number;
totalRevenue: number;
unitsSold: number;
};
payloadHash: string;
}
// ============================================================================
// VARIANT EXTRACTION
// ============================================================================
/**
* Extract all variants from a Dutchie payload
* Key is productId + option for unique identification
*/
function extractVariants(products: any[]): Map<string, VariantData> {
const variants = new Map<string, VariantData>();
for (const product of products) {
const children = product.POSMetaData?.children || [];
// Get product-level data
const productName = product.Name || product.name || '';
const brandName = product.brand?.name || product.brandName || '';
const category = product.type || product.category || '';
const subcategory = product.subcategory || '';
const strainType = product.strainType || '';
const isSpecial = product.special === true;
// Extract cannabinoids
let thcContent: number | undefined;
let cbdContent: number | undefined;
let thcaContent: number | undefined;
let cbgContent: number | undefined;
let cannabinoids: any;
if (product.cannabinoidsV2) {
cannabinoids = product.cannabinoidsV2;
for (const c of product.cannabinoidsV2) {
const name = c.cannabinoid?.name?.toLowerCase() || c.name?.toLowerCase() || '';
const value = c.value;
if (name === 'thc' || name === 'thc-d9') thcContent = value;
if (name === 'cbd') cbdContent = value;
if (name === 'thca') thcaContent = value;
if (name === 'cbg' || name === 'cbga') cbgContent = value;
}
} else if (product.THCContent?.range) {
thcContent = product.THCContent.range[0];
}
// Extract effects
const effects = product.effects || undefined;
// Process each variant
for (let i = 0; i < children.length; i++) {
const child = children[i];
const option = child.option || `variant_${i}`;
const key = `${product.id || product._id}_${option}`;
// Get price for this variant
const prices = product.Prices || product.prices || [];
const recPrices = product.recPrices || [];
const recSpecialPrices = product.recSpecialPrices || [];
const price = recPrices[i] || prices[i] || child.price || child.recPrice || 0;
const specialPrice = isSpecial ? (recSpecialPrices[i] || undefined) : undefined;
variants.set(key, {
productId: product.id || product._id,
productName,
brandName,
option,
quantity: child.quantity || 0,
quantityAvailable: child.quantityAvailable,
price,
specialPrice,
isSpecial,
category,
subcategory,
strainType,
thcContent,
cbdContent,
thcaContent,
cbgContent,
cannabinoids,
effects,
canonicalId: child.canonicalID || child.canonicalId,
canonicalSku: child.canonicalSKU || child.canonicalSku,
});
}
// Handle products without children (single variant)
if (children.length === 0) {
const key = `${product.id || product._id}_default`;
const price = product.Prices?.[0] || product.recPrices?.[0] || product.price || 0;
const specialPrice = isSpecial ? (product.recSpecialPrices?.[0] || undefined) : undefined;
variants.set(key, {
productId: product.id || product._id,
productName,
brandName,
option: 'default',
quantity: product.quantity || 0,
price,
specialPrice,
isSpecial,
category,
subcategory,
strainType,
thcContent,
cbdContent,
thcaContent,
cbgContent,
cannabinoids,
effects,
});
}
}
return variants;
}
/**
* Generate a hash of the payload for deduplication
*/
function hashPayload(products: any[]): string {
const content = JSON.stringify(products);
return crypto.createHash('sha256').update(content).digest('hex');
}
/**
* Deep compare objects for changes
*/
function objectsEqual(a: any, b: any): boolean {
if (a === b) return true;
if (a === undefined || b === undefined) return a === b;
if (a === null || b === null) return a === b;
return JSON.stringify(a) === JSON.stringify(b);
}
// ============================================================================
// DIFF CALCULATION
// ============================================================================
/**
* Compare two payloads and detect all changes
*
* @param prevProducts - Previous payload (benchmark)
* @param currProducts - Current payload
* @param dispensaryId - Store ID
* @param payloadTimestamp - When the current payload was captured
*/
export function calculateDiff(
prevProducts: any[],
currProducts: any[],
dispensaryId: number,
payloadTimestamp: Date
): DiffResult {
const changes: InventoryChange[] = [];
const prevVariants = extractVariants(prevProducts);
const currVariants = extractVariants(currProducts);
let newProducts = 0;
let removedProducts = 0;
let sales = 0;
let restocks = 0;
let priceChanges = 0;
let cannabinoidChanges = 0;
let effectChanges = 0;
let totalRevenue = 0;
let unitsSold = 0;
// Check current variants for changes
for (const [key, curr] of currVariants) {
const prev = prevVariants.get(key);
if (!prev) {
// NEW PRODUCT - this is a diff!
newProducts++;
changes.push({
dispensaryId,
productId: curr.productId,
canonicalId: curr.canonicalId,
canonicalSku: curr.canonicalSku,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'new',
quantityAfter: curr.quantity,
price: curr.price,
specialPrice: curr.specialPrice,
isSpecial: curr.isSpecial,
category: curr.category,
subcategory: curr.subcategory,
strainType: curr.strainType,
thcContent: curr.thcContent,
cbdContent: curr.cbdContent,
thcaContent: curr.thcaContent,
cbgContent: curr.cbgContent,
cannabinoids: curr.cannabinoids,
effects: curr.effects,
payloadTimestamp,
});
continue;
}
// Check quantity changes
const qtyDelta = curr.quantity - prev.quantity;
if (qtyDelta < 0) {
// SALE - quantity decreased
sales++;
const effectivePrice = curr.isSpecial && curr.specialPrice ? curr.specialPrice : curr.price;
const saleRevenue = Math.abs(qtyDelta) * effectivePrice;
totalRevenue += saleRevenue;
unitsSold += Math.abs(qtyDelta);
changes.push({
dispensaryId,
productId: curr.productId,
canonicalId: curr.canonicalId,
canonicalSku: curr.canonicalSku,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'sale',
quantityBefore: prev.quantity,
quantityAfter: curr.quantity,
quantityDelta: qtyDelta,
price: curr.price,
specialPrice: curr.specialPrice,
isSpecial: curr.isSpecial,
revenue: saleRevenue,
category: curr.category,
subcategory: curr.subcategory,
strainType: curr.strainType,
thcContent: curr.thcContent,
payloadTimestamp,
});
} else if (qtyDelta > 0) {
// RESTOCK - quantity increased
restocks++;
changes.push({
dispensaryId,
productId: curr.productId,
canonicalId: curr.canonicalId,
canonicalSku: curr.canonicalSku,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'restock',
quantityBefore: prev.quantity,
quantityAfter: curr.quantity,
quantityDelta: qtyDelta,
price: curr.price,
category: curr.category,
payloadTimestamp,
});
}
// Check price changes (separate from quantity)
if (prev.price !== curr.price || prev.specialPrice !== curr.specialPrice) {
priceChanges++;
changes.push({
dispensaryId,
productId: curr.productId,
canonicalId: curr.canonicalId,
canonicalSku: curr.canonicalSku,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'price_change',
price: curr.price,
specialPrice: curr.specialPrice,
isSpecial: curr.isSpecial,
category: curr.category,
payloadTimestamp,
});
}
// Check cannabinoid changes
if (!objectsEqual(prev.cannabinoids, curr.cannabinoids) && curr.cannabinoids) {
cannabinoidChanges++;
changes.push({
dispensaryId,
productId: curr.productId,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'cannabinoid_change',
thcContent: curr.thcContent,
cbdContent: curr.cbdContent,
thcaContent: curr.thcaContent,
cbgContent: curr.cbgContent,
cannabinoids: curr.cannabinoids,
category: curr.category,
payloadTimestamp,
});
}
// Check effect changes
if (!objectsEqual(prev.effects, curr.effects) && curr.effects) {
effectChanges++;
changes.push({
dispensaryId,
productId: curr.productId,
productName: curr.productName,
brandName: curr.brandName,
option: curr.option,
changeType: 'effect_change',
effects: curr.effects,
category: curr.category,
payloadTimestamp,
});
}
}
// Check for removed products
for (const [key, prev] of prevVariants) {
if (!currVariants.has(key)) {
removedProducts++;
changes.push({
dispensaryId,
productId: prev.productId,
canonicalId: prev.canonicalId,
canonicalSku: prev.canonicalSku,
productName: prev.productName,
brandName: prev.brandName,
option: prev.option,
changeType: 'removed',
quantityBefore: prev.quantity,
quantityAfter: 0,
price: prev.price,
category: prev.category,
payloadTimestamp,
});
}
}
return {
changes,
summary: {
newProducts,
removedProducts,
sales,
restocks,
priceChanges,
cannabinoidChanges,
effectChanges,
totalRevenue,
unitsSold,
},
payloadHash: hashPayload(currProducts),
};
}
// ============================================================================
// DATABASE OPERATIONS
// ============================================================================
/**
* Insert inventory changes into the database
*/
export async function insertChanges(changes: InventoryChange[]): Promise<number> {
if (changes.length === 0) return 0;
const client = await pool.connect();
try {
await client.query('BEGIN');
let inserted = 0;
for (const change of changes) {
await client.query(`
INSERT INTO inventory_changes (
dispensary_id, product_id, canonical_id, canonical_sku,
product_name, brand_name, option, change_type,
quantity_before, quantity_after, quantity_delta,
price, special_price, is_special, revenue,
category, subcategory, strain_type,
thc_content, cbd_content, thca_content, cbg_content,
cannabinoids, effects, payload_timestamp
) VALUES (
$1, $2, $3, $4,
$5, $6, $7, $8,
$9, $10, $11,
$12, $13, $14, $15,
$16, $17, $18,
$19, $20, $21, $22,
$23, $24, $25
)
`, [
change.dispensaryId,
change.productId,
change.canonicalId || null,
change.canonicalSku || null,
change.productName,
change.brandName || null,
change.option,
change.changeType,
change.quantityBefore ?? null,
change.quantityAfter ?? null,
change.quantityDelta ?? null,
change.price ?? null,
change.specialPrice ?? null,
change.isSpecial ?? false,
change.revenue ?? null,
change.category || null,
change.subcategory || null,
change.strainType || null,
change.thcContent ?? null,
change.cbdContent ?? null,
change.thcaContent ?? null,
change.cbgContent ?? null,
change.cannabinoids ? JSON.stringify(change.cannabinoids) : null,
change.effects ? JSON.stringify(change.effects) : null,
change.payloadTimestamp,
]);
inserted++;
}
await client.query('COMMIT');
return inserted;
} catch (error) {
await client.query('ROLLBACK');
throw error;
} finally {
client.release();
}
}
/**
* Log a processed payload to prevent duplicate processing
*/
export async function logProcessedPayload(
dispensaryId: number,
payloadHash: string,
payloadTimestamp: Date,
productCount: number,
changesDetected: number,
salesDetected: number,
revenueDetected: number,
previousPayloadHash?: string
): Promise<void> {
await pool.query(`
INSERT INTO payload_processing_log (
dispensary_id, payload_hash, payload_timestamp,
product_count, changes_detected, sales_detected, revenue_detected,
previous_payload_hash
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
ON CONFLICT (dispensary_id, payload_hash) DO NOTHING
`, [
dispensaryId,
payloadHash,
payloadTimestamp,
productCount,
changesDetected,
salesDetected,
revenueDetected,
previousPayloadHash || null,
]);
}
/**
* Check if a payload has already been processed
*/
export async function isPayloadProcessed(
dispensaryId: number,
payloadHash: string
): Promise<boolean> {
const result = await pool.query(`
SELECT 1 FROM payload_processing_log
WHERE dispensary_id = $1 AND payload_hash = $2
LIMIT 1
`, [dispensaryId, payloadHash]);
return result.rows.length > 0;
}
/**
* Get the last processed payload hash for a dispensary
*/
export async function getLastPayloadHash(dispensaryId: number): Promise<string | null> {
const result = await pool.query(`
SELECT payload_hash
FROM payload_processing_log
WHERE dispensary_id = $1
ORDER BY payload_timestamp DESC
LIMIT 1
`, [dispensaryId]);
return result.rows[0]?.payload_hash || null;
}
// ============================================================================
// MAIN PROCESSING FUNCTION
// ============================================================================
/**
* Process a new payload against the previous one
*
* @param dispensaryId - Store ID
* @param currentProducts - Current payload products
* @param previousProducts - Previous payload products (benchmark)
* @param payloadTimestamp - When the payload was captured
*/
export async function processPayload(
dispensaryId: number,
currentProducts: any[],
previousProducts: any[],
payloadTimestamp: Date
): Promise<DiffResult> {
// Calculate diff
const diff = calculateDiff(
previousProducts,
currentProducts,
dispensaryId,
payloadTimestamp
);
// Check if already processed
const alreadyProcessed = await isPayloadProcessed(dispensaryId, diff.payloadHash);
if (alreadyProcessed) {
console.log(`[InventoryTracker] Payload already processed for dispensary ${dispensaryId}`);
return { ...diff, changes: [] }; // Return empty changes if duplicate
}
// Insert changes
if (diff.changes.length > 0) {
const inserted = await insertChanges(diff.changes);
console.log(`[InventoryTracker] Inserted ${inserted} changes for dispensary ${dispensaryId}`);
}
// Get previous hash for linking
const previousHash = await getLastPayloadHash(dispensaryId);
// Log the processed payload
await logProcessedPayload(
dispensaryId,
diff.payloadHash,
payloadTimestamp,
currentProducts.length,
diff.changes.length,
diff.summary.sales,
diff.summary.totalRevenue,
previousHash || undefined
);
console.log(`[InventoryTracker] Processed payload for dispensary ${dispensaryId}:`);
console.log(` - New products: ${diff.summary.newProducts}`);
console.log(` - Removed: ${diff.summary.removedProducts}`);
console.log(`  - Sales: ${diff.summary.sales} (${diff.summary.unitsSold} units, $${diff.summary.totalRevenue.toFixed(2)})`);
console.log(` - Restocks: ${diff.summary.restocks}`);
console.log(` - Price changes: ${diff.summary.priceChanges}`);
return diff;
}
export default {
calculateDiff,
processPayload,
insertChanges,
isPayloadProcessed,
getLastPayloadHash,
};
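// Minimal usage sketch for the module above: how a crawl handler might call
// processPayload. The dispensary id and product arrays are hypothetical placeholders.
export async function exampleInventoryRun(prevProducts: any[], currProducts: any[]): Promise<void> {
  const result = await processPayload(
    123,             // hypothetical dispensary id
    currProducts,    // products from the payload just fetched
    prevProducts,    // benchmark products from the previous payload
    new Date()       // capture timestamp
  );
  console.log(`[Example] sales=${result.summary.sales}, revenue=$${result.summary.totalRevenue.toFixed(2)}`);
}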

View File

@@ -262,9 +262,10 @@ class TaskScheduler {
source: 'high_frequency_schedule',
});
// Add jitter: interval + random(0, 20% of interval)
const jitterMinutes = Math.floor(Math.random() * (store.crawl_interval_minutes * 0.2));
const nextIntervalMinutes = store.crawl_interval_minutes + jitterMinutes;
// Add jitter: interval + random(-3, +3) minutes
const JITTER_MINUTES = 3;
const jitterMinutes = Math.floor(Math.random() * (JITTER_MINUTES * 2 + 1)) - JITTER_MINUTES; // -3..+3 inclusive
const nextIntervalMinutes = Math.max(1, store.crawl_interval_minutes + jitterMinutes);
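// e.g. with a 15-minute base interval the next crawl lands roughly 12-18 minutes out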
// Update next_crawl_at and last_crawl_started_at
await pool.query(`

View File

@@ -110,8 +110,8 @@ export async function detectVisibilityEvents(
`
SELECT
provider_product_id as id,
name,
brand,
name_raw as name,
brand_name_raw as brand,
price_rec as price
FROM store_products
WHERE dispensary_id = $1

View File

@@ -0,0 +1,245 @@
/**
* Wasabi S3 Storage Service
*
* Stores raw crawl payloads to Wasabi S3-compatible storage for long-term archive.
* Payloads can be reprocessed later if analytics logic changes.
*
* Environment variables:
* - WASABI_ACCESS_KEY: Wasabi access key
* - WASABI_SECRET_KEY: Wasabi secret key
* - WASABI_BUCKET: Bucket name (default: cannaiq)
* - WASABI_REGION: Region (default: us-west-2)
* - WASABI_ENDPOINT: Endpoint URL (default: https://s3.us-west-2.wasabisys.com)
*/
import { S3Client, PutObjectCommand, GetObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';
import { Readable } from 'stream';
import * as zlib from 'zlib';
interface WasabiConfig {
accessKey: string;
secretKey: string;
bucket: string;
region: string;
endpoint: string;
}
function getConfig(): WasabiConfig {
return {
accessKey: process.env.WASABI_ACCESS_KEY || '',
secretKey: process.env.WASABI_SECRET_KEY || '',
bucket: process.env.WASABI_BUCKET || 'cannaiq',
region: process.env.WASABI_REGION || 'us-west-2',
endpoint: process.env.WASABI_ENDPOINT || 'https://s3.us-west-2.wasabisys.com',
};
}
let s3Client: S3Client | null = null;
function getClient(): S3Client {
if (s3Client) return s3Client;
const config = getConfig();
if (!config.accessKey || !config.secretKey) {
throw new Error('Wasabi credentials not configured (WASABI_ACCESS_KEY, WASABI_SECRET_KEY)');
}
s3Client = new S3Client({
region: config.region,
endpoint: config.endpoint,
credentials: {
accessKeyId: config.accessKey,
secretAccessKey: config.secretKey,
},
forcePathStyle: true, // Required for Wasabi
});
return s3Client;
}
/**
* Generate storage path for a payload
* Format: payloads/{STATE}/{YYYY-MM-DD}/{dispensary_id}/{platform}_{timestamp}.json.gz
*/
export function getPayloadPath(
dispensaryId: number,
stateCode: string,
platform: string,
timestamp: Date = new Date()
): string {
const date = timestamp.toISOString().split('T')[0]; // YYYY-MM-DD
const ts = timestamp.toISOString().replace(/[:.]/g, '-'); // Safe filename
return `payloads/${stateCode.toUpperCase()}/${date}/${dispensaryId}/${platform}_${ts}.json.gz`;
}
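// Example with hypothetical inputs:
// getPayloadPath(123, 'az', 'dutchie', new Date('2025-12-17T23:15:07.000Z'))
//   => 'payloads/AZ/2025-12-17/123/dutchie_2025-12-17T23-15-07-000Z.json.gz'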
/**
* Store a raw payload to Wasabi
* Compresses with gzip before upload to save space (~70% compression on JSON)
*/
export async function storePayload(
dispensaryId: number,
stateCode: string,
platform: string,
payload: any,
metadata?: Record<string, string>
): Promise<{ path: string; sizeBytes: number; compressedBytes: number }> {
const config = getConfig();
const client = getClient();
const jsonString = JSON.stringify(payload);
const originalSize = Buffer.byteLength(jsonString, 'utf8');
// Compress with gzip
const compressed = await new Promise<Buffer>((resolve, reject) => {
zlib.gzip(jsonString, { level: 9 }, (err, result) => {
if (err) reject(err);
else resolve(result);
});
});
const path = getPayloadPath(dispensaryId, stateCode, platform);
await client.send(new PutObjectCommand({
Bucket: config.bucket,
Key: path,
Body: compressed,
ContentType: 'application/json',
ContentEncoding: 'gzip',
Metadata: {
dispensaryId: String(dispensaryId),
stateCode: stateCode,
platform: platform,
originalSize: String(originalSize),
productCount: String(Array.isArray(payload) ? payload.length : 0),
...metadata,
},
}));
return {
path,
sizeBytes: originalSize,
compressedBytes: compressed.length,
};
}
/**
* Retrieve a payload from Wasabi
*/
export async function getPayload(path: string): Promise<any> {
const config = getConfig();
const client = getClient();
const response = await client.send(new GetObjectCommand({
Bucket: config.bucket,
Key: path,
}));
if (!response.Body) {
throw new Error(`Empty response for ${path}`);
}
// Read stream to buffer
const stream = response.Body as Readable;
const chunks: Buffer[] = [];
for await (const chunk of stream) {
chunks.push(chunk);
}
const compressed = Buffer.concat(chunks);
// Decompress
const decompressed = await new Promise<Buffer>((resolve, reject) => {
zlib.gunzip(compressed, (err, result) => {
if (err) reject(err);
else resolve(result);
});
});
return JSON.parse(decompressed.toString('utf8'));
}
/**
* List payloads for a dispensary on a specific date
*/
export async function listPayloads(
dispensaryId: number,
stateCode: string,
date: string // YYYY-MM-DD
): Promise<string[]> {
const config = getConfig();
const client = getClient();
const prefix = `payloads/${stateCode.toUpperCase()}/${date}/${dispensaryId}/`;
const response = await client.send(new ListObjectsV2Command({
Bucket: config.bucket,
Prefix: prefix,
}));
return (response.Contents || []).map(obj => obj.Key!).filter(Boolean);
}
/**
* Check if Wasabi is configured and accessible
*/
export async function checkConnection(): Promise<{ connected: boolean; error?: string }> {
try {
const config = getConfig();
if (!config.accessKey || !config.secretKey) {
return { connected: false, error: 'Credentials not configured' };
}
const client = getClient();
// Try to list bucket (will fail if credentials or bucket invalid)
await client.send(new ListObjectsV2Command({
Bucket: config.bucket,
MaxKeys: 1,
}));
return { connected: true };
} catch (error: any) {
return { connected: false, error: error.message };
}
}
/**
* Get storage statistics
*/
export async function getStorageStats(
stateCode?: string,
date?: string
): Promise<{ objectCount: number; totalSizeBytes: number }> {
const config = getConfig();
const client = getClient();
let prefix = 'payloads/';
if (stateCode) {
prefix += `${stateCode.toUpperCase()}/`;
if (date) {
prefix += `${date}/`;
}
}
let objectCount = 0;
let totalSizeBytes = 0;
let continuationToken: string | undefined;
do {
const response = await client.send(new ListObjectsV2Command({
Bucket: config.bucket,
Prefix: prefix,
ContinuationToken: continuationToken,
}));
for (const obj of response.Contents || []) {
objectCount++;
totalSizeBytes += obj.Size || 0;
}
continuationToken = response.NextContinuationToken;
} while (continuationToken);
return { objectCount, totalSizeBytes };
}
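// Minimal round-trip sketch using the helpers above. The dispensary id, state code,
// and payload contents are hypothetical placeholders; WASABI_* env vars must be set.
export async function exampleArchiveRoundTrip(): Promise<void> {
  const { connected, error } = await checkConnection();
  if (!connected) {
    console.warn(`[Example] Wasabi unavailable: ${error}`);
    return;
  }
  const stored = await storePayload(123, 'AZ', 'dutchie', [
    { id: 'prod-1', name: 'Example Product', price: 25 },
  ]);
  console.log(`[Example] stored ${stored.path}: ${stored.sizeBytes}B -> ${stored.compressedBytes}B gzipped`);
  const restored = await getPayload(stored.path);
  console.log(`[Example] restored ${Array.isArray(restored) ? restored.length : 0} products`);
}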

View File

@@ -12,6 +12,8 @@
import { pool } from '../db/pool';
import { buildEvomiProxyUrl, getEvomiConfig } from './crawl-rotator';
import { generateFingerprint, IdentityFingerprint } from './identity-pool';
import { runPuppeteerPreflightWithRetry } from './puppeteer-preflight';
export interface ClaimedTask {
task_id: number;
@@ -44,9 +46,15 @@ export interface SessionWithTasks {
session: WorkerSession;
tasks: ClaimedTask[];
proxyUrl: string;
fingerprint: IdentityFingerprint;
qualified: boolean; // True if preflight passed
}
// Random 3-5 tasks per session for natural traffic patterns
function getRandomTaskCount(): number {
return 3 + Math.floor(Math.random() * 3); // 3, 4, or 5
}
const MAX_TASKS_PER_SESSION = 6;
const MAX_IP_ATTEMPTS = 10; // How many IPs to try before giving up
const COOLDOWN_HOURS = 8;
@@ -55,11 +63,14 @@ const COOLDOWN_HOURS = 8;
* This is the main entry point for the new worker flow.
*
* Flow:
* 1. Claim up to 6 tasks for same geo
* 2. Get Evomi proxy for that geo
* 3. Try IPs until we find one that's available
* 4. Lock IP to this worker
* 1. Claim ONE task first (determines geo)
* 2. Get IP matching task's city/state
* 3. Go back to pool, claim more tasks (random 2-4 more) for SAME geo
* 4. Total 3-5 tasks per session
* 5. Return session + tasks + proxy URL
*
* IMPORTANT: Worker does NOT get IP until first task is claimed.
* IP must match task's city/state for geoip consistency.
*/
export async function claimSessionWithTasks(
workerId: string,
@@ -70,21 +81,21 @@ export async function claimSessionWithTasks(
try {
await client.query('BEGIN');
// Step 1: Claim up to 6 tasks for same geo
const { rows: tasks } = await client.query<ClaimedTask>(
// Step 1: Claim ONE task first to determine geo
const { rows: firstTaskRows } = await client.query<ClaimedTask>(
`SELECT * FROM claim_tasks_batch($1, $2, $3)`,
[workerId, MAX_TASKS_PER_SESSION, role || null]
[workerId, 1, role || null]
);
if (tasks.length === 0) {
if (firstTaskRows.length === 0) {
await client.query('ROLLBACK');
console.log(`[WorkerSession] No pending tasks available for ${workerId}`);
return null;
}
// Get geo from first claimed task (all same geo)
const { state_code, city } = tasks[0];
console.log(`[WorkerSession] ${workerId} claimed ${tasks.length} tasks for ${city || 'any'}, ${state_code}`);
const firstTask = firstTaskRows[0];
const { state_code, city } = firstTask;
console.log(`[WorkerSession] ${workerId} claimed first task #${firstTask.task_id} for ${city || 'any'}, ${state_code}`);
// Step 2: Get Evomi proxy for this geo
const evomiConfig = getEvomiConfig();
@@ -96,36 +107,83 @@ export async function claimSessionWithTasks(
// Step 3: Try to get an available IP
let session: WorkerSession | null = null;
let proxyUrl: string | null = null;
let lockedFingerprint: IdentityFingerprint | null = null;
// Fallback chain for city targeting: exact city -> major city -> state only
const STATE_MAJOR_CITIES: Record<string, string> = {
AZ: 'phoenix', CA: 'los.angeles', CO: 'denver', FL: 'miami', IL: 'chicago',
MA: 'boston', MI: 'detroit', NV: 'las.vegas', NJ: 'newark', NY: 'new.york',
OH: 'columbus', OR: 'portland', PA: 'philadelphia', WA: 'seattle',
};
for (let attempt = 0; attempt < MAX_IP_ATTEMPTS; attempt++) {
// Build proxy URL with unique session ID for each attempt
const sessionId = `${workerId}-${Date.now()}-${attempt}`;
const proxyResult = buildEvomiProxyUrl(state_code, sessionId, city || undefined);
if (!proxyResult) {
console.warn(`[WorkerSession] Failed to build proxy URL for ${state_code}`);
continue;
// Try cities in order: exact city -> major city -> state only
const citiesToTry: (string | undefined)[] = [
city || undefined, // Exact dispensary city
STATE_MAJOR_CITIES[state_code.toUpperCase()], // State's major city
undefined // State-only fallback
].filter((c, i, arr) => c !== arr[i - 1]); // Remove duplicates
let testIp: string | null = null;
let usedCity: string | undefined;
let successfulProxyUrl: string | null = null;
for (const tryCity of citiesToTry) {
const proxyResult = buildEvomiProxyUrl(state_code, sessionId, tryCity);
if (!proxyResult) continue;
const result = await getProxyIp(proxyResult.url);
if (result.ip) {
testIp = result.ip;
usedCity = tryCity || undefined;
successfulProxyUrl = proxyResult.url;
if (tryCity !== city) {
console.log(`[WorkerSession] City ${city} not available, using ${tryCity || 'state-only'}`);
}
break;
}
if (!result.error412) break; // Non-412 error, don't try other cities
}
// TODO: Actually make a request through the proxy to get the real IP
// For now, we'll use a placeholder - in production, run a quick IP check
const testIp = await getProxyIp(proxyResult.url);
if (!testIp) {
if (!testIp || !successfulProxyUrl) {
console.warn(`[WorkerSession] Failed to get IP from proxy attempt ${attempt + 1}`);
continue;
}
// Step 4: Try to lock this IP
// Check if this IP already has a fingerprint in DB (1 IP = 1 fingerprint rule)
const existingSession = await client.query(
`SELECT fingerprint_data FROM worker_sessions
WHERE ip_address = $1 AND fingerprint_data IS NOT NULL
ORDER BY created_at DESC LIMIT 1`,
[testIp]
);
let fingerprintData: IdentityFingerprint;
if (existingSession.rows[0]?.fingerprint_data) {
// STICKY FINGERPRINT: Reuse existing fingerprint for this IP
fingerprintData = existingSession.rows[0].fingerprint_data;
console.log(`[WorkerSession] Reusing sticky fingerprint for IP ${testIp}`);
} else {
// NEW IP: Generate fresh fingerprint and record it
fingerprintData = generateFingerprint(state_code);
console.log(`[WorkerSession] NEW IP ${testIp} - generated fingerprint, recording to pool`);
}
// Lock session with fingerprint (DB generates hash if not provided)
const { rows } = await client.query<WorkerSession>(
`SELECT * FROM lock_worker_session($1, $2, $3, $4)`,
[workerId, testIp, state_code, city]
`SELECT * FROM lock_worker_session($1, $2, $3, $4, $5, $6)`,
[workerId, testIp, state_code, city, null, JSON.stringify(fingerprintData)]
);
if (rows[0]?.id) {
session = rows[0];
proxyUrl = proxyResult.url;
console.log(`[WorkerSession] ${workerId} locked IP ${testIp} for ${city || 'any'}, ${state_code}`);
proxyUrl = successfulProxyUrl;
lockedFingerprint = fingerprintData;
console.log(`[WorkerSession] ${workerId} locked IP ${testIp} for ${usedCity || city || 'any'}, ${state_code}`);
break;
}
@@ -140,18 +198,87 @@ export async function claimSessionWithTasks(
return null;
}
// Update session with task count
// =========================================================================
// STEP 4: PREFLIGHT/QUALIFY - Worker must qualify before proceeding
// =========================================================================
// Rules:
// - 1 IP = 1 fingerprint (enforced above via sticky lookup)
// - Verify antidetect is working (timezone/geo matches IP)
// - If preflight fails, release task and session, return null
// =========================================================================
console.log(`[WorkerSession] ${workerId} running preflight to qualify...`);
const preflightResult = await runPuppeteerPreflightWithRetry(
undefined, // No crawl rotator needed, we have custom proxy
1, // 1 retry
proxyUrl, // Use the proxy we just locked
state_code // Target state for geo verification
);
if (!preflightResult.passed) {
// PREFLIGHT FAILED - Worker not qualified
console.error(`[WorkerSession] ${workerId} PREFLIGHT FAILED: ${preflightResult.error}`);
// Release first task back to pool
await client.query(`SELECT release_claimed_tasks($1)`, [workerId]);
// Mark session as unhealthy and retire it
await client.query(`
UPDATE worker_sessions
SET status = 'cooldown',
cooldown_until = NOW() + INTERVAL '1 hour',
worker_id = NULL
WHERE id = $1
`, [session.id]);
await client.query('ROLLBACK');
return null;
}
console.log(`[WorkerSession] ${workerId} QUALIFIED - preflight passed (${preflightResult.responseTimeMs}ms)`);
// Verify IP matches what we expected
if (preflightResult.proxyIp && preflightResult.proxyIp !== session.ip_address) {
console.warn(`[WorkerSession] IP mismatch: expected ${session.ip_address}, got ${preflightResult.proxyIp}`);
}
// Set GOLD BADGE - worker is now qualified
await client.query(
`SELECT set_worker_qualified($1, $2)`,
[workerId, session.id]
);
console.log(`[WorkerSession] ${workerId} awarded GOLD BADGE`);
// =========================================================================
// STEP 5: Now qualified - claim more tasks for SAME geo (random 2-4 more)
// =========================================================================
// Total will be 3-5 tasks (1 first + 2-4 additional)
const additionalTaskCount = 2 + Math.floor(Math.random() * 3); // 2, 3, or 4
const { rows: additionalTasks } = await client.query<ClaimedTask>(
`SELECT * FROM claim_tasks_batch_for_geo($1, $2, $3, $4, $5)`,
[workerId, additionalTaskCount, state_code, city, role || null]
);
// Combine first task + additional tasks
const allTasks = [firstTask, ...additionalTasks];
const totalTasks = allTasks.length;
console.log(`[WorkerSession] ${workerId} claimed ${additionalTasks.length} more tasks, total: ${totalTasks} for ${city || 'any'}, ${state_code}`);
// Update session with total task count
await client.query(
`SELECT session_task_claimed($1, $2)`,
[workerId, tasks.length]
[workerId, totalTasks]
);
await client.query('COMMIT');
return {
session,
tasks,
tasks: allTasks,
proxyUrl,
fingerprint: lockedFingerprint!,
qualified: true, // Preflight passed
};
} catch (err) {
await client.query('ROLLBACK');
@@ -163,8 +290,9 @@ export async function claimSessionWithTasks(
/**
* Get the real IP address from a proxy by making a test request
* Returns { ip, error412 } to distinguish 412 errors (city not available)
*/
async function getProxyIp(proxyUrl: string): Promise<string | null> {
async function getProxyIp(proxyUrl: string): Promise<{ ip: string | null; error412: boolean }> {
try {
// Use a simple IP check service
const { default: axios } = await import('axios');
@@ -177,10 +305,11 @@ async function getProxyIp(proxyUrl: string): Promise<string | null> {
timeout: 10000,
});
return response.data?.ip || null;
return { ip: response.data?.ip || null, error412: false };
} catch (err: any) {
console.warn(`[WorkerSession] IP check failed: ${err.message}`);
return null;
const is412 = err.response?.status === 412;
console.warn(`[WorkerSession] IP check failed: ${err.message}${is412 ? ' (city not available)' : ''}`);
return { ip: null, error412: is412 };
}
}
@@ -285,13 +414,18 @@ export async function isSessionComplete(workerId: string): Promise<boolean> {
/**
* Retire a worker's session (start 8hr cooldown)
* Clears gold badge - worker must re-qualify with new session
*/
export async function retireSession(workerId: string): Promise<boolean> {
const { rows } = await pool.query(
`SELECT retire_worker_session($1) as success`,
[workerId]
);
console.log(`[WorkerSession] ${workerId} session retired, IP in ${COOLDOWN_HOURS}hr cooldown`);
// Clear GOLD BADGE - worker no longer qualified
await pool.query(`SELECT clear_worker_badge($1)`, [workerId]);
console.log(`[WorkerSession] ${workerId} session retired, badge cleared, IP in ${COOLDOWN_HOURS}hr cooldown`);
return rows[0]?.success || false;
}
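// Lifecycle sketch for the session flow above: claim a geo-locked session, work
// through its tasks, then retire the IP into cooldown. The worker id and the task
// execution step are hypothetical placeholders.
export async function exampleSessionLifecycle(workerId: string): Promise<void> {
  const claimed = await claimSessionWithTasks(workerId, undefined); // no role filter
  if (!claimed) return; // no pending tasks, or preflight did not qualify

  console.log(`[Example] session ${claimed.session.id}: ${claimed.tasks.length} tasks via ${claimed.proxyUrl}`);
  for (const task of claimed.tasks) {
    console.log(`[Example] would run task #${task.task_id} with the locked fingerprint`);
  }
  if (await isSessionComplete(workerId)) {
    await retireSession(workerId); // starts the 8hr IP cooldown and clears the badge
  }
}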

View File

@@ -26,6 +26,7 @@ import { saveDailyBaseline } from '../../utils/payload-storage';
import { taskService } from '../task-service';
import { saveInventorySnapshots } from '../../services/inventory-snapshots';
import { detectVisibilityEvents } from '../../services/visibility-events';
import { storePayload as storeWasabiPayload, checkConnection as checkWasabiConnection } from '../../services/wasabi-storage';
// GraphQL hash for FilteredProducts query - MUST match CLAUDE.md
const FILTERED_PRODUCTS_HASH = 'ee29c060826dc41c527e470e9ae502c9b2c169720faa0a9f5d25e1b9a530a4a0';
@@ -367,9 +368,8 @@ export async function handleProductDiscoveryDutchie(ctx: TaskContext): Promise<T
await ctx.heartbeat();
// ============================================================
// STEP 5: Save daily baseline (full payload) if in window
// Daily baselines are saved once per day per store (12:01 AM - 3:00 AM)
// Outside this window, only inventory snapshots are saved (Step 5.5)
// STEP 5: Archive raw payload to Wasabi S3 (long-term storage)
// Every crawl is archived for potential reprocessing
// ============================================================
updateStep('saving', `Saving ${result.products.length} products`);
const rawPayload = {
@@ -381,6 +381,37 @@ export async function handleProductDiscoveryDutchie(ctx: TaskContext): Promise<T
products: result.products,
};
// Archive to Wasabi S3 (if configured)
let wasabiPath: string | null = null;
try {
const wasabiResult = await storeWasabiPayload(
dispensaryId,
dispensary.state || 'XX',
'dutchie',
rawPayload,
{
taskId: String(task.id),
cName,
productCount: String(result.products.length),
}
);
wasabiPath = wasabiResult.path;
const compressionRatio = Math.round((1 - wasabiResult.compressedBytes / wasabiResult.sizeBytes) * 100);
console.log(`[ProductDiscoveryHTTP] Archived to Wasabi: ${wasabiPath} (${(wasabiResult.compressedBytes / 1024).toFixed(1)}KB, ${compressionRatio}% compression)`);
} catch (wasabiErr: any) {
// Wasabi archival is optional - don't fail the task if it fails
if (wasabiErr.message?.includes('not configured')) {
console.log(`[ProductDiscoveryHTTP] Wasabi not configured, skipping archive`);
} else {
console.warn(`[ProductDiscoveryHTTP] Wasabi archive failed: ${wasabiErr.message}`);
}
}
// ============================================================
// STEP 5b: Save daily baseline to PostgreSQL (if in window)
// Daily baselines are saved once per day per store (12:01 AM - 3:00 AM)
// Outside this window, only inventory snapshots are saved (Step 5.5)
// ============================================================
// saveDailyBaseline returns null if outside window or baseline already exists today
const payloadResult = await saveDailyBaseline(
pool,
@@ -395,7 +426,7 @@ export async function handleProductDiscoveryDutchie(ctx: TaskContext): Promise<T
if (payloadResult) {
console.log(`[ProductDiscoveryHTTP] Saved daily baseline #${payloadResult.id} (${(payloadResult.sizeBytes / 1024).toFixed(1)}KB)`);
} else {
console.log(`[ProductDiscoveryHTTP] Skipped full payload save (outside baseline window or already exists)`);
console.log(`[ProductDiscoveryHTTP] Skipped PostgreSQL baseline (outside window or already exists)`);
}
// ============================================================
@@ -459,6 +490,7 @@ export async function handleProductDiscoveryDutchie(ctx: TaskContext): Promise<T
productCount: result.products.length,
sizeBytes: payloadResult?.sizeBytes || 0,
baselineSaved: !!payloadResult,
wasabiPath,
snapshotCount,
eventCount,
};

View File

@@ -19,6 +19,7 @@ import { saveDailyBaseline } from '../../utils/payload-storage';
import { taskService } from '../task-service';
import { saveInventorySnapshots } from '../../services/inventory-snapshots';
import { detectVisibilityEvents } from '../../services/visibility-events';
import { storePayload as storeWasabiPayload } from '../../services/wasabi-storage';
export async function handleProductDiscoveryJane(ctx: TaskContext): Promise<TaskResult> {
const { pool, task, crawlRotator } = ctx;
@@ -36,7 +37,7 @@ export async function handleProductDiscoveryJane(ctx: TaskContext): Promise<Task
try {
// Load dispensary
const dispResult = await pool.query(
`SELECT id, name, menu_url, platform_dispensary_id, menu_type
`SELECT id, name, menu_url, platform_dispensary_id, menu_type, state
FROM dispensaries WHERE id = $1`,
[dispensaryId]
);
@@ -99,7 +100,32 @@ export async function handleProductDiscoveryJane(ctx: TaskContext): Promise<Task
storeId: dispensary.platform_dispensary_id,
};
// Save daily baseline to filesystem (only in 12:01-3:00 AM window, once per day)
// Archive to Wasabi S3 (if configured)
let wasabiPath: string | null = null;
try {
const wasabiResult = await storeWasabiPayload(
dispensaryId,
dispensary.state || 'XX',
'jane',
rawPayload,
{
taskId: String(task.id),
storeId: dispensary.platform_dispensary_id,
productCount: String(result.products.length),
}
);
wasabiPath = wasabiResult.path;
const compressionRatio = Math.round((1 - wasabiResult.compressedBytes / wasabiResult.sizeBytes) * 100);
console.log(`[JaneProductDiscovery] Archived to Wasabi: ${wasabiPath} (${(wasabiResult.compressedBytes / 1024).toFixed(1)}KB, ${compressionRatio}% compression)`);
} catch (wasabiErr: any) {
if (wasabiErr.message?.includes('not configured')) {
console.log(`[JaneProductDiscovery] Wasabi not configured, skipping archive`);
} else {
console.warn(`[JaneProductDiscovery] Wasabi archive failed: ${wasabiErr.message}`);
}
}
// Save daily baseline to PostgreSQL (only in 12:01-3:00 AM window, once per day)
const payloadResult = await saveDailyBaseline(
pool,
dispensaryId,
@@ -113,7 +139,7 @@ export async function handleProductDiscoveryJane(ctx: TaskContext): Promise<Task
if (payloadResult) {
console.log(`[JaneProductDiscovery] Saved daily baseline ${payloadResult.id} (${Math.round(payloadResult.sizeBytes / 1024)}KB)`);
} else {
console.log(`[JaneProductDiscovery] Skipped full payload save (outside baseline window or already exists)`);
console.log(`[JaneProductDiscovery] Skipped PostgreSQL baseline (outside window or already exists)`);
}
// Save inventory snapshots and detect visibility events
@@ -155,6 +181,7 @@ export async function handleProductDiscoveryJane(ctx: TaskContext): Promise<Task
payloadId: payloadResult?.id || null,
payloadSizeKB: payloadResult ? Math.round(payloadResult.sizeBytes / 1024) : 0,
baselineSaved: !!payloadResult,
wasabiPath,
snapshotCount,
eventCount,
storeInfo: result.store ? {

View File

@@ -33,6 +33,7 @@ import { saveDailyBaseline } from '../../utils/payload-storage';
import { taskService } from '../task-service';
import { saveInventorySnapshots } from '../../services/inventory-snapshots';
import { detectVisibilityEvents } from '../../services/visibility-events';
import { storePayload as storeWasabiPayload } from '../../services/wasabi-storage';
export async function handleProductDiscoveryTreez(ctx: TaskContext): Promise<TaskResult> {
const { pool, task, crawlRotator } = ctx;
@@ -50,7 +51,7 @@ export async function handleProductDiscoveryTreez(ctx: TaskContext): Promise<Tas
try {
// Load dispensary
const dispResult = await pool.query(
`SELECT id, name, menu_url, platform_dispensary_id, menu_type, platform
`SELECT id, name, menu_url, platform_dispensary_id, menu_type, platform, state
FROM dispensaries WHERE id = $1`,
[dispensaryId]
);
@@ -116,7 +117,32 @@ export async function handleProductDiscoveryTreez(ctx: TaskContext): Promise<Tas
dispensaryId,
};
// Save daily baseline to filesystem (only in 12:01-3:00 AM window, once per day)
// Archive to Wasabi S3 (if configured)
let wasabiPath: string | null = null;
try {
const wasabiResult = await storeWasabiPayload(
dispensaryId,
dispensary.state || 'XX',
'treez',
rawPayload,
{
taskId: String(task.id),
storeId: result.storeId || 'unknown',
productCount: String(result.products.length),
}
);
wasabiPath = wasabiResult.path;
const compressionRatio = Math.round((1 - wasabiResult.compressedBytes / wasabiResult.sizeBytes) * 100);
console.log(`[TreezProductDiscovery] Archived to Wasabi: ${wasabiPath} (${(wasabiResult.compressedBytes / 1024).toFixed(1)}KB, ${compressionRatio}% compression)`);
} catch (wasabiErr: any) {
if (wasabiErr.message?.includes('not configured')) {
console.log(`[TreezProductDiscovery] Wasabi not configured, skipping archive`);
} else {
console.warn(`[TreezProductDiscovery] Wasabi archive failed: ${wasabiErr.message}`);
}
}
// Save daily baseline to PostgreSQL (only in 12:01-3:00 AM window, once per day)
const payloadResult = await saveDailyBaseline(
pool,
dispensaryId,
@@ -130,7 +156,7 @@ export async function handleProductDiscoveryTreez(ctx: TaskContext): Promise<Tas
if (payloadResult) {
console.log(`[TreezProductDiscovery] Saved daily baseline ${payloadResult.id} (${Math.round(payloadResult.sizeBytes / 1024)}KB)`);
} else {
console.log(`[TreezProductDiscovery] Skipped full payload save (outside baseline window or already exists)`);
console.log(`[TreezProductDiscovery] Skipped PostgreSQL baseline (outside window or already exists)`);
}
// Save inventory snapshots and detect visibility events
@@ -171,6 +197,7 @@ export async function handleProductDiscoveryTreez(ctx: TaskContext): Promise<Tas
payloadId: payloadResult?.id || null,
payloadSizeKB: payloadResult ? Math.round(payloadResult.sizeBytes / 1024) : 0,
baselineSaved: !!payloadResult,
wasabiPath,
snapshotCount,
eventCount,
storeId: result.storeId,

View File

@@ -31,8 +31,10 @@ import {
createStoreProductSnapshots,
downloadProductImages,
} from '../../hydration/canonical-upsert';
import { loadRawPayloadById, getLatestPayload } from '../../utils/payload-storage';
import { loadRawPayloadById, getLatestPayload, getRecentPayloads } from '../../utils/payload-storage';
import { taskService } from '../task-service';
import { processPayload as processInventoryChanges } from '../../services/inventory-tracker';
import { storeDailySnapshot, getBenchmarkProducts } from '../../services/daily-snapshot';
// Platform-aware normalizer registry
const NORMALIZERS: Record<string, BaseNormalizer> = {
@@ -169,6 +171,62 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
await ctx.heartbeat();
// ============================================================
// STEP 2.5: Real-Time Inventory Tracking (Dutchie only for now)
// Compare current payload to benchmark and track all changes
// ============================================================
let inventoryResult: { sales: number; revenue: number; newProducts: number } = {
sales: 0,
revenue: 0,
newProducts: 0,
};
if (detectedPlatform === 'dutchie') {
try {
updateStep('tracking', 'Processing inventory changes');
// Get benchmark products (today's daily snapshot or most recent)
const benchmarkProducts = await getBenchmarkProducts(dispensaryId);
if (benchmarkProducts && benchmarkProducts.length > 0) {
// Process inventory changes against benchmark
const diffResult = await processInventoryChanges(
dispensaryId,
allProducts,
benchmarkProducts,
payloadData.fetchedAt || new Date()
);
inventoryResult = {
sales: diffResult.summary.sales,
revenue: diffResult.summary.totalRevenue,
newProducts: diffResult.summary.newProducts,
};
if (diffResult.changes.length > 0) {
console.log(`[ProductRefresh] Inventory changes detected:`);
console.log(`  - Sales: ${diffResult.summary.sales} (${diffResult.summary.unitsSold} units, $${diffResult.summary.totalRevenue.toFixed(2)})`);
console.log(` - New products: ${diffResult.summary.newProducts}`);
console.log(` - Removed: ${diffResult.summary.removedProducts}`);
console.log(` - Price changes: ${diffResult.summary.priceChanges}`);
}
} else {
console.log(`[ProductRefresh] No benchmark found - this payload becomes the first benchmark`);
}
// Store daily snapshot (first payload of day becomes benchmark)
const snapshotResult = await storeDailySnapshot(dispensaryId, allProducts);
if (snapshotResult.isNew) {
console.log(`[ProductRefresh] Stored new daily benchmark snapshot`);
}
await ctx.heartbeat();
} catch (trackError: any) {
// Inventory tracking errors shouldn't fail the whole refresh
console.warn(`[ProductRefresh] Inventory tracking error (non-fatal): ${trackError.message}`);
}
}
// ============================================================
// STEP 3: Normalize data
// ============================================================
@@ -384,6 +442,12 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
newProducts: upsertResult.new,
updatedProducts: upsertResult.updated,
markedOos: markedOosCount,
// Inventory tracking results
inventoryTracking: {
salesDetected: inventoryResult.sales,
revenueDetected: inventoryResult.revenue,
newProductsTracked: inventoryResult.newProducts,
},
};
} catch (error: unknown) {

View File

@@ -378,6 +378,20 @@ export async function handleStoreDiscoveryDutchie(ctx: TaskContext): Promise<Tas
if (result.isNew) {
totalDiscovered++;
}
// Record status observation for pattern learning
// result.id is the location_id from dutchie_discovery_locations
if (disp.status && result.id) {
try {
await pool.query(
`SELECT record_store_status(NULL, $1, 'discovery', $2)`,
[disp.status, result.id]
);
} catch (statusErr: any) {
// Non-fatal - just log
console.warn(`[StoreDiscoveryHTTP] Failed to record status for ${disp.name}: ${statusErr.message}`);
}
}
}
} catch (err: any) {
console.error(`[StoreDiscoveryHTTP] Upsert error for ${disp.name}:`, err.message);

View File

@@ -194,6 +194,21 @@ class TaskService {
return null;
}
// ENFORCE MAX TASK LIMIT - check before ANY claiming path
const workerCheck = await pool.query(`
SELECT session_task_count, COALESCE(session_max_tasks, 5) as max_tasks
FROM worker_registry
WHERE worker_id = $1
`, [workerId]);
if (workerCheck.rows.length > 0) {
const { session_task_count, max_tasks } = workerCheck.rows[0];
if (session_task_count >= max_tasks) {
console.log(`[TaskService] Worker ${workerId} at max capacity (${session_task_count}/${max_tasks})`);
return null;
}
}
if (role) {
// Role-specific claiming - use the SQL function with preflight capabilities
const result = await pool.query(
@@ -203,7 +218,24 @@ class TaskService {
return (result.rows[0] as WorkerTask) || null;
}
// Role-agnostic claiming - claim ANY pending task matching worker capabilities
// Role-agnostic claiming - MUST still enforce geo session + state matching
// First verify worker has a valid geo session
const geoCheck = await pool.query(`
SELECT current_state,
(geo_session_started_at IS NOT NULL
AND geo_session_started_at > NOW() - INTERVAL '60 minutes') as session_valid
FROM worker_registry
WHERE worker_id = $1
`, [workerId]);
if (geoCheck.rows.length === 0 || !geoCheck.rows[0].session_valid || !geoCheck.rows[0].current_state) {
console.log(`[TaskService] Worker ${workerId} has no valid geo session - cannot claim tasks`);
return null;
}
const workerState = geoCheck.rows[0].current_state;
// Claim task matching worker's state and method capabilities
const result = await pool.query(`
UPDATE worker_tasks
SET
@@ -211,27 +243,39 @@ class TaskService {
worker_id = $1,
claimed_at = NOW()
WHERE id = (
SELECT id FROM worker_tasks
WHERE status = 'pending'
AND (scheduled_for IS NULL OR scheduled_for <= NOW())
SELECT wt.id FROM worker_tasks wt
JOIN dispensaries d ON wt.dispensary_id = d.id
WHERE wt.status = 'pending'
AND (wt.scheduled_for IS NULL OR wt.scheduled_for <= NOW())
-- GEO FILTER: Task's dispensary must match worker's state
AND d.state = $4
-- Method compatibility: worker must have passed the required preflight
AND (
method IS NULL -- No preference, any worker can claim
OR (method = 'curl' AND $2 = TRUE)
OR (method = 'http' AND $3 = TRUE)
wt.method IS NULL -- No preference, any worker can claim
OR (wt.method = 'curl' AND $2 = TRUE)
OR (wt.method = 'http' AND $3 = TRUE)
)
-- Exclude stores that already have an active task
AND (dispensary_id IS NULL OR dispensary_id NOT IN (
AND (wt.dispensary_id IS NULL OR wt.dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY priority DESC, created_at ASC
ORDER BY wt.priority DESC, wt.created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING *
`, [workerId, curlPassed, httpPassed]);
`, [workerId, curlPassed, httpPassed, workerState]);
// Increment session_task_count if task was claimed
if (result.rows[0]) {
await pool.query(`
UPDATE worker_registry
SET session_task_count = session_task_count + 1
WHERE worker_id = $1
`, [workerId]);
}
return (result.rows[0] as WorkerTask) || null;
}
@@ -261,28 +305,24 @@ class TaskService {
}
/**
* Mark a task as completed with verification
* Returns true if completion was verified in DB, false otherwise
* Mark a task as completed and remove from pool
* Completed tasks are deleted - only failed tasks stay in the pool for retry/review
* Returns true if task was successfully deleted
*/
async completeTask(taskId: number, result?: Record<string, unknown>): Promise<boolean> {
await pool.query(
`UPDATE worker_tasks
SET status = 'completed', completed_at = NOW(), result = $2
WHERE id = $1`,
[taskId, result ? JSON.stringify(result) : null]
);
// Verify completion was recorded
const verify = await pool.query(
`SELECT status FROM worker_tasks WHERE id = $1`,
// Delete the completed task from the pool
// Only failed tasks stay in the table for retry/review
const deleteResult = await pool.query(
`DELETE FROM worker_tasks WHERE id = $1 RETURNING id`,
[taskId]
);
if (verify.rows[0]?.status !== 'completed') {
console.error(`[TaskService] Task ${taskId} completion NOT VERIFIED - DB shows status: ${verify.rows[0]?.status}`);
if (deleteResult.rowCount === 0) {
console.error(`[TaskService] Task ${taskId} completion FAILED - task not found or already deleted`);
return false;
}
console.log(`[TaskService] Task ${taskId} completed and removed from pool`);
return true;
}
@@ -351,7 +391,7 @@ class TaskService {
* Hard failures: Auto-retry up to MAX_RETRIES with exponential backoff
*/
async failTask(taskId: number, errorMessage: string): Promise<boolean> {
const MAX_RETRIES = 3;
const MAX_RETRIES = 5;
const isSoft = this.isSoftFailure(errorMessage);
// Get current retry count
@@ -490,7 +530,15 @@ class TaskService {
${poolJoin}
LEFT JOIN worker_registry w ON w.worker_id = t.worker_id
${whereClause}
ORDER BY t.created_at DESC
ORDER BY
CASE t.status
WHEN 'active' THEN 1
WHEN 'pending' THEN 2
WHEN 'failed' THEN 3
WHEN 'completed' THEN 4
ELSE 5
END,
t.created_at DESC
LIMIT ${limit} OFFSET ${offset}`,
params
);
@@ -1001,9 +1049,31 @@ class TaskService {
const claimedAt = task.claimed_at || task.created_at;
switch (task.role) {
case 'product_refresh':
case 'product_discovery': {
// Verify payload was saved to raw_crawl_payloads after task was claimed
// For product_discovery, verify inventory snapshots were saved (always happens)
// Note: raw_crawl_payloads only saved during baseline window, so check snapshots instead
const snapshotResult = await pool.query(
`SELECT COUNT(*)::int as count
FROM inventory_snapshots
WHERE dispensary_id = $1
AND captured_at > $2`,
[task.dispensary_id, claimedAt]
);
const snapshotCount = snapshotResult.rows[0]?.count || 0;
if (snapshotCount === 0) {
return {
verified: false,
reason: `No inventory snapshots found for dispensary ${task.dispensary_id} after ${claimedAt}`
};
}
return { verified: true };
}
case 'product_refresh': {
// For product_refresh, verify payload was saved to raw_crawl_payloads
const payloadResult = await pool.query(
`SELECT id, product_count, fetched_at
FROM raw_crawl_payloads

View File

@@ -131,9 +131,9 @@ const API_BASE_URL = process.env.API_BASE_URL || 'http://localhost:3010';
// Browser tasks (Puppeteer) use ~400MB RAM each. With 2GB pod limit:
// - 3 browsers = ~1.3GB = SAFE
// - 4 browsers = ~1.7GB = RISKY
// - 5+ browsers = OOM CRASH
// - 5 browsers = ~2.0GB = AT LIMIT (monitor memory closely)
// See: docs/WORKER_TASK_ARCHITECTURE.md#browser-task-memory-limits
const MAX_CONCURRENT_TASKS = parseInt(process.env.MAX_CONCURRENT_TASKS || '3');
const MAX_CONCURRENT_TASKS = parseInt(process.env.MAX_CONCURRENT_TASKS || '5');
// When heap memory usage exceeds this threshold (as decimal 0.0-1.0), stop claiming new tasks
// Default 85% - gives headroom before OOM
@@ -161,11 +161,39 @@ const CPU_BACKOFF_THRESHOLD = parseFloat(process.env.CPU_BACKOFF_THRESHOLD || '0
const BACKOFF_DURATION_MS = parseInt(process.env.BACKOFF_DURATION_MS || '10000');
export interface WorkerFingerprint {
// Browser & OS
userAgent?: string;
browser?: string;
browserVersion?: string;
os?: string;
osVersion?: string;
platform?: string; // e.g., "Win32", "MacIntel", "Linux x86_64"
// Display & Device
device?: 'desktop' | 'mobile' | 'tablet';
screenWidth?: number;
screenHeight?: number;
colorDepth?: number;
pixelRatio?: number;
// Hardware
hardwareConcurrency?: number; // CPU cores
deviceMemory?: number; // RAM in GB
maxTouchPoints?: number;
// WebGL
webglVendor?: string;
webglRenderer?: string;
// Language & Locale
timezone?: string;
locale?: string;
languages?: string[];
// Geo (detected/verified)
city?: string;
state?: string;
ip?: string;
locale?: string;
}
export interface TaskContext {
@@ -787,17 +815,29 @@ export class TaskWorker {
const detectedLocation = (this.preflightHttpResult as any).detectedLocation;
console.log(`[TaskWorker] HTTP IP: ${this.preflightHttpResult.proxyIp}, Timezone: ${detectedTimezone || 'unknown'}`);
// Store fingerprint for task execution - CRITICAL for anti-detect consistency
// Add detected geo to existing fingerprint (don't overwrite generated anti-detect config)
if (this.preflightHttpPassed) {
this.storedTimezone = detectedTimezone || null;
this.storedTimezone = detectedTimezone || this.storedTimezone || null;
if (this.storedFingerprint) {
// Merge detected geo into generated fingerprint
this.storedFingerprint.city = detectedLocation?.city;
this.storedFingerprint.state = detectedLocation?.region;
this.storedFingerprint.ip = this.preflightHttpResult.proxyIp;
// Use detected timezone if available
if (detectedTimezone) {
this.storedFingerprint.timezone = detectedTimezone;
}
} else {
// Fallback if no generated fingerprint (shouldn't happen with session flow)
this.storedFingerprint = {
timezone: detectedTimezone,
city: detectedLocation?.city,
state: detectedLocation?.region,
ip: this.preflightHttpResult.proxyIp,
locale: 'en-US', // US proxies use English
locale: 'en-US',
};
console.log(`[TaskWorker] Stored fingerprint: ${JSON.stringify(this.storedFingerprint)}`);
}
console.log(`[TaskWorker] Fingerprint updated with geo: ${detectedLocation?.city}, ${detectedLocation?.region}, IP: ${this.preflightHttpResult.proxyIp}`);
}
}
} catch (err: any) {
@@ -1827,18 +1867,7 @@ export class TaskWorker {
// If no active session, claim new batch of tasks
if (!this.currentSession) {
// Step 1: Initialize stealth
this.setPreflightStep('init', 'Initializing stealth plugins');
if (!this.stealthInitialized) {
const initSuccess = await this.ensureStealthInitialized();
if (!initSuccess) {
this.setPreflightStep('init_failed', 'Stealth init failed');
await this.sleep(30000);
return;
}
}
// Step 2: Claim tasks from pool
// Step 1: Check for tasks FIRST (before using any proxy bandwidth)
this.setPreflightStep('claiming', 'Claiming tasks from pool');
console.log(`[TaskWorker] ${this.friendlyName} claiming new session...`);
const result = await WorkerSession.claimSessionWithTasks(this.workerId, this.role || undefined);
@@ -1849,12 +1878,47 @@ export class TaskWorker {
return;
}
// Step 2: Tasks claimed - NOW initialize stealth (only when we have work to do)
this.setPreflightStep('init', 'Initializing stealth plugins');
if (!this.stealthInitialized) {
// Only init CrawlRotator, skip early preflight since we'll run it with session proxy
await this.initializeStealth();
this.stealthInitialized = true;
}
this.currentSession = result.session;
this.sessionTasks = result.tasks;
this.sessionProxyUrl = result.proxyUrl;
this.geoState = result.session.state_code;
this.geoCity = result.session.city || null;
// Store the GENERATED fingerprint from session (anti-detect config)
if (result.fingerprint) {
this.storedFingerprint = {
userAgent: result.fingerprint.userAgent,
browser: result.fingerprint.browser,
browserVersion: result.fingerprint.browserVersion,
os: result.fingerprint.os,
osVersion: result.fingerprint.osVersion,
platform: result.fingerprint.platform,
device: result.fingerprint.device,
screenWidth: result.fingerprint.screenWidth,
screenHeight: result.fingerprint.screenHeight,
colorDepth: result.fingerprint.colorDepth,
pixelRatio: result.fingerprint.pixelRatio,
hardwareConcurrency: result.fingerprint.hardwareConcurrency,
deviceMemory: result.fingerprint.deviceMemory,
maxTouchPoints: result.fingerprint.maxTouchPoints,
webglVendor: result.fingerprint.webglVendor,
webglRenderer: result.fingerprint.webglRenderer,
timezone: result.fingerprint.timezone,
locale: result.fingerprint.locale,
languages: result.fingerprint.languages,
};
this.storedTimezone = result.fingerprint.timezone;
console.log(`[TaskWorker] ${this.friendlyName} fingerprint: ${result.fingerprint.browser} ${result.fingerprint.browserVersion} on ${result.fingerprint.os}, ${result.fingerprint.device}, ${result.fingerprint.screenWidth}x${result.fingerprint.screenHeight}`);
}
console.log(`[TaskWorker] ${this.friendlyName} new session: ${result.tasks.length} tasks for ${this.geoCity || 'any'}, ${this.geoState} (IP: ${result.session.ip_address})`);
// Step 3: Configure proxy

View File

@@ -417,23 +417,26 @@ export async function listPayloadMetadata(
sizeBytes: number;
sizeBytesRaw: number;
fetchedAt: Date;
dispensary_name: string | null;
city: string | null;
state: string | null;
}>> {
const conditions: string[] = [];
const params: any[] = [];
let paramIndex = 1;
if (options.dispensaryId) {
conditions.push(`dispensary_id = $${paramIndex++}`);
conditions.push(`rcp.dispensary_id = $${paramIndex++}`);
params.push(options.dispensaryId);
}
if (options.startDate) {
conditions.push(`fetched_at >= $${paramIndex++}`);
conditions.push(`rcp.fetched_at >= $${paramIndex++}`);
params.push(options.startDate);
}
if (options.endDate) {
conditions.push(`fetched_at <= $${paramIndex++}`);
conditions.push(`rcp.fetched_at <= $${paramIndex++}`);
params.push(options.endDate);
}
@@ -445,17 +448,21 @@ export async function listPayloadMetadata(
const result = await pool.query(`
SELECT
id,
dispensary_id,
crawl_run_id,
storage_path,
product_count,
size_bytes,
size_bytes_raw,
fetched_at
FROM raw_crawl_payloads
rcp.id,
rcp.dispensary_id,
rcp.crawl_run_id,
rcp.storage_path,
rcp.product_count,
rcp.size_bytes,
rcp.size_bytes_raw,
rcp.fetched_at,
d.name as dispensary_name,
d.city,
d.state
FROM raw_crawl_payloads rcp
LEFT JOIN dispensaries d ON d.id = rcp.dispensary_id
${whereClause}
ORDER BY fetched_at DESC
ORDER BY rcp.fetched_at DESC
LIMIT $${paramIndex++} OFFSET $${paramIndex}
`, params);
@@ -467,7 +474,10 @@ export async function listPayloadMetadata(
productCount: row.product_count,
sizeBytes: row.size_bytes,
sizeBytesRaw: row.size_bytes_raw,
fetchedAt: row.fetched_at
fetchedAt: row.fetched_at,
dispensary_name: row.dispensary_name,
city: row.city,
state: row.state
}));
}

View File

@@ -1,29 +1,36 @@
/**
* Provider Display Names
*
* Maps internal provider identifiers to safe display labels.
* Internal identifiers (menu_type, product_provider, crawler_type) remain unchanged.
* Only the display label shown to users is transformed.
* Maps internal menu_type values to display labels.
* - standalone/embedded → dutchie (both are Dutchie platform)
* - treez → treez
* - jane/iheartjane → jane
*/
export const ProviderDisplayNames: Record<string, string> = {
// All menu providers map to anonymous "Menu Feed" label
dutchie: 'Menu Feed',
treez: 'Menu Feed',
jane: 'Menu Feed',
iheartjane: 'Menu Feed',
blaze: 'Menu Feed',
flowhub: 'Menu Feed',
weedmaps: 'Menu Feed',
leafly: 'Menu Feed',
leaflogix: 'Menu Feed',
tymber: 'Menu Feed',
dispense: 'Menu Feed',
// Dutchie (standalone and embedded are both Dutchie)
dutchie: 'dutchie',
standalone: 'dutchie',
embedded: 'dutchie',
// Other platforms
treez: 'treez',
jane: 'jane',
iheartjane: 'jane',
// Future platforms
blaze: 'blaze',
flowhub: 'flowhub',
weedmaps: 'weedmaps',
leafly: 'leafly',
leaflogix: 'leaflogix',
tymber: 'tymber',
dispense: 'dispense',
// Catch-all
unknown: 'Menu Feed',
default: 'Menu Feed',
'': 'Menu Feed',
unknown: 'unknown',
default: 'unknown',
'': 'unknown',
};
/**

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM registry.spdy.io/library/node:22-slim AS builder
WORKDIR /app
@@ -20,7 +20,7 @@ COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine
FROM registry.spdy.io/library/nginx:alpine
# Copy built assets from builder stage
COPY --from=builder /app/dist /usr/share/nginx/html

View File

@@ -7,8 +7,8 @@
<title>CannaIQ - Cannabis Menu Intelligence Platform</title>
<meta name="description" content="CannaIQ provides real-time cannabis dispensary menu data, product tracking, and analytics for dispensaries across Arizona." />
<meta name="keywords" content="cannabis, dispensary, menu, products, analytics, Arizona" />
<script type="module" crossorigin src="/assets/index-CJhaZjAX.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-BRta4lo8.css">
<script type="module" crossorigin src="/assets/index-DlFYBvaE.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-C0P8dnSa.css">
<link rel="manifest" href="/manifest.webmanifest"></head>
<body>
<div id="root"></div>

View File

@@ -51,6 +51,9 @@ import { ProxyManagement } from './pages/ProxyManagement';
import TasksDashboard from './pages/TasksDashboard';
import { PayloadsDashboard } from './pages/PayloadsDashboard';
import { ScraperOverviewDashboard } from './pages/ScraperOverviewDashboard';
import { HighFrequencyManager } from './pages/HighFrequencyManager';
import { VisibilityEventsDashboard } from './pages/VisibilityEventsDashboard';
import { InventorySnapshotsDashboard } from './pages/InventorySnapshotsDashboard';
import { SeoOrchestrator } from './pages/admin/seo/SeoOrchestrator';
import { StatePage } from './pages/public/StatePage';
import { SeoPage } from './pages/public/SeoPage';
@@ -135,6 +138,10 @@ export default function App() {
<Route path="/payloads" element={<PrivateRoute><PayloadsDashboard /></PrivateRoute>} />
{/* Scraper Overview Dashboard (new primary) */}
<Route path="/scraper/overview" element={<PrivateRoute><ScraperOverviewDashboard /></PrivateRoute>} />
{/* Inventory Tracking routes */}
<Route path="/inventory/high-frequency" element={<PrivateRoute><HighFrequencyManager /></PrivateRoute>} />
<Route path="/inventory/events" element={<PrivateRoute><VisibilityEventsDashboard /></PrivateRoute>} />
<Route path="/inventory/snapshots" element={<PrivateRoute><InventorySnapshotsDashboard /></PrivateRoute>} />
<Route path="*" element={<Navigate to="/dashboard" replace />} />
</Routes>
</BrowserRouter>

View File

@@ -0,0 +1,141 @@
/**
* Event Type Badge Component
*
* Color-coded badge for visibility event types.
* Used in VisibilityEventsDashboard and brand event views.
*/
import React from 'react';
export type EventType = 'oos' | 'back_in_stock' | 'brand_dropped' | 'brand_added' | 'price_change';
interface EventTypeConfig {
label: string;
shortLabel: string;
bgColor: string;
textColor: string;
icon: string;
description: string;
}
const EVENT_TYPE_CONFIG: Record<EventType, EventTypeConfig> = {
oos: {
label: 'Out of Stock',
shortLabel: 'OOS',
bgColor: 'bg-red-600',
textColor: 'text-white',
icon: '!',
description: 'Product went out of stock',
},
back_in_stock: {
label: 'Back in Stock',
shortLabel: 'In Stock',
bgColor: 'bg-green-600',
textColor: 'text-white',
icon: '+',
description: 'Product returned to stock',
},
brand_dropped: {
label: 'Brand Dropped',
shortLabel: 'Dropped',
bgColor: 'bg-orange-600',
textColor: 'text-white',
icon: '-',
description: 'Brand no longer at this store',
},
brand_added: {
label: 'Brand Added',
shortLabel: 'Added',
bgColor: 'bg-blue-600',
textColor: 'text-white',
icon: '+',
description: 'New brand at this store',
},
price_change: {
label: 'Price Change',
shortLabel: 'Price',
bgColor: 'bg-yellow-600',
textColor: 'text-black',
icon: '$',
description: 'Significant price change (>5%)',
},
};
interface EventTypeBadgeProps {
type: EventType;
size?: 'sm' | 'md' | 'lg';
showLabel?: boolean;
showIcon?: boolean;
className?: string;
}
export function EventTypeBadge({
type,
size = 'md',
showLabel = true,
showIcon = true,
className = '',
}: EventTypeBadgeProps) {
const config = EVENT_TYPE_CONFIG[type];
const sizeClasses = {
sm: 'px-1.5 py-0.5 text-xs',
md: 'px-2 py-1 text-sm',
lg: 'px-3 py-1.5 text-base',
};
return (
<span
className={`
inline-flex items-center gap-1 rounded-full font-medium
${config.bgColor} ${config.textColor}
${sizeClasses[size]}
${className}
`}
title={config.description}
>
{showIcon && (
<span className="font-bold">{config.icon}</span>
)}
{showLabel && (
<span>{size === 'sm' ? config.shortLabel : config.label}</span>
)}
</span>
);
}
/**
* Get event type configuration
*/
export function getEventTypeConfig(type: EventType): EventTypeConfig {
return EVENT_TYPE_CONFIG[type];
}
/**
* Get all event types for filtering
*/
export function getAllEventTypes(): { value: EventType; label: string }[] {
return Object.entries(EVENT_TYPE_CONFIG).map(([value, config]) => ({
value: value as EventType,
label: config.label,
}));
}
/**
* Format price change for display
*/
export function formatPriceChange(
previousPrice: number | null,
newPrice: number | null,
pctChange: number | null
): string {
if (previousPrice === null || newPrice === null) return 'N/A';
const diff = newPrice - previousPrice;
const sign = diff > 0 ? '+' : '';
const pct = pctChange !== null ? ` (${sign}${pctChange.toFixed(1)}%)` : '';
return `$${previousPrice.toFixed(2)} -> $${newPrice.toFixed(2)}${pct}`;
}
export default EventTypeBadge;
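// Usage sketch: a small badge next to a formatted price delta. The prices are
// hypothetical; formatPriceChange(20, 18.5, -7.5) renders "$20.00 -> $18.50 (-7.5%)".
export function ExampleEventRow() {
  return (
    <div className="flex items-center gap-2">
      <EventTypeBadge type="price_change" size="sm" />
      <span>{formatPriceChange(20, 18.5, -7.5)}</span>
    </div>
  );
}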

View File

@@ -0,0 +1,93 @@
/**
* Interval Dropdown Component
*
* Dropdown selector for high-frequency crawl intervals.
* Used in HighFrequencyManager and DispensaryDetail pages.
*/
import React from 'react';
interface IntervalOption {
value: number;
label: string;
}
const INTERVAL_OPTIONS: IntervalOption[] = [
{ value: 15, label: '15 minutes' },
{ value: 30, label: '30 minutes' },
{ value: 60, label: '1 hour' },
{ value: 120, label: '2 hours' },
{ value: 240, label: '4 hours' },
];
interface IntervalDropdownProps {
value: number | null;
onChange: (value: number | null) => void;
includeNone?: boolean;
disabled?: boolean;
className?: string;
size?: 'sm' | 'md' | 'lg';
}
export function IntervalDropdown({
value,
onChange,
includeNone = true,
disabled = false,
className = '',
size = 'md',
}: IntervalDropdownProps) {
const sizeClasses = {
sm: 'px-2 py-1 text-sm',
md: 'px-3 py-2 text-base',
lg: 'px-4 py-3 text-lg',
};
return (
<select
value={value ?? ''}
onChange={(e) => {
const val = e.target.value;
onChange(val === '' ? null : parseInt(val, 10));
}}
disabled={disabled}
className={`
${sizeClasses[size]}
bg-gray-800 border border-gray-700 rounded-md text-white
focus:ring-2 focus:ring-blue-500 focus:border-transparent
disabled:opacity-50 disabled:cursor-not-allowed
${className}
`}
>
{includeNone && <option value="">No high-frequency</option>}
{INTERVAL_OPTIONS.map((opt) => (
<option key={opt.value} value={opt.value}>
{opt.label}
</option>
))}
</select>
);
}
/**
* Format interval minutes to human-readable string
*/
export function formatInterval(minutes: number | null): string {
if (minutes === null) return 'Standard';
if (minutes < 60) return `${minutes}m`;
const hours = minutes / 60;
return hours === 1 ? '1 hour' : `${hours} hours`;
}
/**
* Get interval badge color
*/
export function getIntervalColor(minutes: number | null): string {
if (minutes === null) return 'bg-gray-600';
if (minutes <= 15) return 'bg-red-600';
if (minutes <= 30) return 'bg-orange-600';
if (minutes <= 60) return 'bg-yellow-600';
return 'bg-green-600';
}
export default IntervalDropdown;
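// Usage sketch: a controlled dropdown paired with the helpers above. The initial
// value and state wiring are hypothetical placeholders.
export function ExampleIntervalPicker() {
  const [crawlInterval, setCrawlInterval] = React.useState<number | null>(30);
  return (
    <div className="flex items-center gap-2">
      <IntervalDropdown value={crawlInterval} onChange={setCrawlInterval} size="sm" />
      <span className={`px-2 py-0.5 rounded text-white ${getIntervalColor(crawlInterval)}`}>
        {formatInterval(crawlInterval)} {/* e.g. "30m" */}
      </span>
    </div>
  );
}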

View File

@@ -24,7 +24,10 @@ import {
Key,
Bot,
ListChecks,
Database
Database,
Clock,
Bell,
Package
} from 'lucide-react';
interface LayoutProps {
@@ -176,6 +179,12 @@ export function Layout({ children }: LayoutProps) {
<NavLink to="/analytics/clicks" icon={<MousePointerClick className="w-4 h-4" />} label="Click Analytics" isActive={isActive('/analytics/clicks')} />
</NavSection>
<NavSection title="Inventory">
<NavLink to="/inventory/high-frequency" icon={<Clock className="w-4 h-4" />} label="High-Frequency" isActive={isActive('/inventory/high-frequency')} />
<NavLink to="/inventory/events" icon={<Bell className="w-4 h-4" />} label="Visibility Events" isActive={isActive('/inventory/events')} />
<NavLink to="/inventory/snapshots" icon={<Package className="w-4 h-4" />} label="Snapshots" isActive={isActive('/inventory/snapshots')} />
</NavSection>
<NavSection title="Admin">
<NavLink to="/admin/orchestrator" icon={<Activity className="w-4 h-4" />} label="Orchestrator" isActive={isActive('/admin/orchestrator')} />
<NavLink to="/users" icon={<UserCog className="w-4 h-4" />} label="Users" isActive={isActive('/users')} />

View File

@@ -0,0 +1,128 @@
/**
* Stock Status Badge Component
*
* Color-coded badge for product stock status.
* Used in inventory views and product intelligence displays.
*/
import React from 'react';
interface StockStatusBadgeProps {
inStock: boolean;
quantity?: number | null;
daysUntilOOS?: number | null;
size?: 'sm' | 'md' | 'lg';
showQuantity?: boolean;
className?: string;
}
export function StockStatusBadge({
inStock,
quantity,
daysUntilOOS,
size = 'md',
showQuantity = false,
className = '',
}: StockStatusBadgeProps) {
const sizeClasses = {
sm: 'px-1.5 py-0.5 text-xs',
md: 'px-2 py-1 text-sm',
lg: 'px-3 py-1.5 text-base',
};
// Determine badge color based on stock status and days until OOS
let bgColor = 'bg-gray-600';
let textColor = 'text-white';
let label = 'Unknown';
if (!inStock) {
bgColor = 'bg-red-600';
label = 'Out of Stock';
} else if (daysUntilOOS != null && daysUntilOOS <= 3) {
bgColor = 'bg-orange-600';
label = `Low (${daysUntilOOS}d)`;
} else if (daysUntilOOS != null && daysUntilOOS <= 7) {
bgColor = 'bg-yellow-600';
textColor = 'text-black';
label = `Moderate (${daysUntilOOS}d)`;
} else if (inStock) {
bgColor = 'bg-green-600';
label = 'In Stock';
}
// Add quantity if requested
if (showQuantity && quantity !== null && quantity !== undefined) {
label = `${label} (${quantity})`;
}
return (
<span
className={`
inline-flex items-center rounded-full font-medium
${bgColor} ${textColor}
${sizeClasses[size]}
${className}
`}
>
{label}
</span>
);
}
/**
* Days Until Stock Out Indicator
*/
interface DaysUntilOOSProps {
days: number | null;
className?: string;
}
export function DaysUntilOOS({ days, className = '' }: DaysUntilOOSProps) {
if (days === null) {
return (
<span className={`text-gray-500 ${className}`}>-</span>
);
}
let color = 'text-green-500';
if (days <= 3) {
color = 'text-red-500';
} else if (days <= 7) {
color = 'text-yellow-500';
}
return (
<span className={`font-medium ${color} ${className}`}>
{days}d
</span>
);
}
/**
* Stock Diff Indicator
* Shows the change in stock over a period (e.g., 120 days)
*/
interface StockDiffProps {
diff: number;
className?: string;
}
export function StockDiff({ diff, className = '' }: StockDiffProps) {
let color = 'text-gray-500';
let sign = '';
if (diff > 0) {
color = 'text-green-500';
sign = '+';
} else if (diff < 0) {
color = 'text-red-500';
}
return (
<span className={`font-medium ${color} ${className}`}>
{sign}{diff}
</span>
);
}
export default StockStatusBadge;

View File

@@ -0,0 +1,122 @@
/**
* Velocity Tier Badge Component
*
* Color-coded badge for SKU velocity tiers.
* Used in product intelligence and SKU velocity views.
*/
import React from 'react';
export type VelocityTier = 'hot' | 'steady' | 'slow' | 'stale';
interface TierConfig {
label: string;
bgColor: string;
textColor: string;
icon: string;
description: string;
unitsPerDay: string;
}
const TIER_CONFIG: Record<VelocityTier, TierConfig> = {
hot: {
label: 'Hot',
bgColor: 'bg-red-600',
textColor: 'text-white',
icon: '\u{1F525}', // Fire emoji
description: 'High velocity - 5+ units/day',
unitsPerDay: '5+',
},
steady: {
label: 'Steady',
bgColor: 'bg-green-600',
textColor: 'text-white',
icon: '\u{2705}', // Checkmark
description: 'Moderate velocity - 1-5 units/day',
unitsPerDay: '1-5',
},
slow: {
label: 'Slow',
bgColor: 'bg-yellow-600',
textColor: 'text-black',
icon: '\u{1F422}', // Turtle
description: 'Low velocity - 0.1-1 units/day',
unitsPerDay: '0.1-1',
},
stale: {
label: 'Stale',
bgColor: 'bg-gray-600',
textColor: 'text-white',
icon: '\u{1F4A4}', // Zzz
description: 'No movement - <0.1 units/day',
unitsPerDay: '<0.1',
},
};
interface VelocityTierBadgeProps {
tier: VelocityTier;
size?: 'sm' | 'md' | 'lg';
showIcon?: boolean;
className?: string;
}
export function VelocityTierBadge({
tier,
size = 'md',
showIcon = false,
className = '',
}: VelocityTierBadgeProps) {
const config = TIER_CONFIG[tier];
const sizeClasses = {
sm: 'px-1.5 py-0.5 text-xs',
md: 'px-2 py-1 text-sm',
lg: 'px-3 py-1.5 text-base',
};
return (
<span
className={`
inline-flex items-center gap-1 rounded-full font-medium
${config.bgColor} ${config.textColor}
${sizeClasses[size]}
${className}
`}
title={config.description}
>
{showIcon && <span>{config.icon}</span>}
<span>{config.label}</span>
</span>
);
}
/**
* Get tier configuration
*/
export function getTierConfig(tier: VelocityTier): TierConfig {
return TIER_CONFIG[tier];
}
/**
* Get all velocity tiers for filtering
*/
export function getAllVelocityTiers(): { value: VelocityTier; label: string; description: string }[] {
return Object.entries(TIER_CONFIG).map(([value, config]) => ({
value: value as VelocityTier,
label: config.label,
description: config.description,
}));
}
/**
* Determine velocity tier from avg daily units
*/
export function getVelocityTier(avgDailyUnits: number | null): VelocityTier {
if (avgDailyUnits === null) return 'stale';
if (avgDailyUnits >= 5) return 'hot';
if (avgDailyUnits >= 1) return 'steady';
if (avgDailyUnits >= 0.1) return 'slow';
return 'stale';
}
export default VelocityTierBadge;

View File
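The tier thresholds documented above can be checked directly against getVelocityTier; a minimal sketch (import path assumed):

```typescript
// Sketch only: deriving a velocity tier from 30-day unit totals via getVelocityTier above.
// The import path is an assumption.
import { getVelocityTier } from './components/VelocityTierBadge';

const totalUnits30d = 180;                 // e.g. a SKU that moved 180 units in 30 days
const avgDailyUnits = totalUnits30d / 30;  // 6 units/day

console.log(getVelocityTier(avgDailyUnits)); // "hot"    (>= 5 units/day)
console.log(getVelocityTier(2));             // "steady" (1-5 units/day)
console.log(getVelocityTier(0.05));          // "stale"  (< 0.1 units/day)
console.log(getVelocityTier(null));          // "stale"  (no data)
```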

@@ -3231,6 +3231,399 @@ class ApiClient {
};
}>(`/api/payloads/store/${dispensaryId}/diff${query ? '?' + query : ''}`);
}
// ============================================================
// SALES ANALYTICS API (Materialized Views)
// Part of Real-Time Inventory Tracking feature
// ============================================================
async getDailySalesEstimates(params?: {
state?: string;
brand?: string;
category?: string;
dispensary_id?: number;
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.brand) searchParams.append('brand', params.brand);
if (params?.category) searchParams.append('category', params.category);
if (params?.dispensary_id) searchParams.append('dispensary_id', String(params.dispensary_id));
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: DailySalesEstimate[];
count: number;
}>(`/api/sales-analytics/daily-sales${query ? '?' + query : ''}`);
}
async getBrandMarketShare(params?: {
state?: string;
brand?: string;
min_penetration?: number;
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.brand) searchParams.append('brand', params.brand);
if (params?.min_penetration) searchParams.append('min_penetration', String(params.min_penetration));
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: BrandMarketShare[];
count: number;
}>(`/api/sales-analytics/brand-market-share${query ? '?' + query : ''}`);
}
async getSkuVelocity(params?: {
state?: string;
brand?: string;
category?: string;
dispensary_id?: number;
tier?: 'hot' | 'steady' | 'slow' | 'stale';
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.brand) searchParams.append('brand', params.brand);
if (params?.category) searchParams.append('category', params.category);
if (params?.dispensary_id) searchParams.append('dispensary_id', String(params.dispensary_id));
if (params?.tier) searchParams.append('tier', params.tier);
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: SkuVelocity[];
count: number;
}>(`/api/sales-analytics/sku-velocity${query ? '?' + query : ''}`);
}
async getStorePerformance(params?: {
state?: string;
sort_by?: 'revenue' | 'units' | 'brands' | 'skus';
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.sort_by) searchParams.append('sort_by', params.sort_by);
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: StorePerformance[];
count: number;
}>(`/api/sales-analytics/store-performance${query ? '?' + query : ''}`);
}
async getCategoryTrends(params?: {
state?: string;
category?: string;
weeks?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.category) searchParams.append('category', params.category);
if (params?.weeks) searchParams.append('weeks', String(params.weeks));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: CategoryTrend[];
count: number;
}>(`/api/sales-analytics/category-trends${query ? '?' + query : ''}`);
}
async getProductIntelligence(params?: {
state?: string;
brand?: string;
category?: string;
dispensary_id?: number;
in_stock?: boolean;
low_stock?: boolean;
recent_oos?: boolean;
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.brand) searchParams.append('brand', params.brand);
if (params?.category) searchParams.append('category', params.category);
if (params?.dispensary_id) searchParams.append('dispensary_id', String(params.dispensary_id));
if (params?.in_stock !== undefined) searchParams.append('in_stock', String(params.in_stock));
if (params?.low_stock !== undefined) searchParams.append('low_stock', String(params.low_stock));
if (params?.recent_oos !== undefined) searchParams.append('recent_oos', String(params.recent_oos));
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: ProductIntelligence[];
count: number;
}>(`/api/sales-analytics/product-intelligence${query ? '?' + query : ''}`);
}
async getTopBrands(params?: {
state?: string;
window?: '7d' | '30d' | '90d' | '1y' | 'all';
limit?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.window) searchParams.append('window', params.window);
if (params?.limit) searchParams.append('limit', String(params.limit));
const query = searchParams.toString();
return this.request<{
success: boolean;
data: TopBrand[];
count: number;
}>(`/api/sales-analytics/top-brands${query ? '?' + query : ''}`);
}
async refreshSalesAnalytics() {
return this.request<{
success: boolean;
data: Array<{ view_name: string; rows_affected: number }>;
}>('/api/sales-analytics/refresh', {
method: 'POST',
});
}
async getSalesAnalyticsStats() {
return this.request<{
success: boolean;
data: Record<string, number>;
}>('/api/sales-analytics/stats');
}
// ============================================================
// INVENTORY SNAPSHOTS & VISIBILITY EVENTS API
// Part of Real-Time Inventory Tracking feature
// ============================================================
async getInventorySnapshots(params?: {
dispensary_id?: number;
product_id?: string;
limit?: number;
offset?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.dispensary_id) searchParams.append('dispensary_id', String(params.dispensary_id));
if (params?.product_id) searchParams.append('product_id', params.product_id);
if (params?.limit) searchParams.append('limit', String(params.limit));
if (params?.offset) searchParams.append('offset', String(params.offset));
const query = searchParams.toString();
return this.request<{
success: boolean;
snapshots: InventorySnapshot[];
count: number;
}>(`/api/tasks/inventory-snapshots${query ? '?' + query : ''}`);
}
async getVisibilityEvents(params?: {
dispensary_id?: number;
brand?: string;
event_type?: 'oos' | 'back_in_stock' | 'brand_dropped' | 'brand_added' | 'price_change';
limit?: number;
offset?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.dispensary_id) searchParams.append('dispensary_id', String(params.dispensary_id));
if (params?.brand) searchParams.append('brand', params.brand);
if (params?.event_type) searchParams.append('event_type', params.event_type);
if (params?.limit) searchParams.append('limit', String(params.limit));
if (params?.offset) searchParams.append('offset', String(params.offset));
const query = searchParams.toString();
return this.request<{
success: boolean;
events: VisibilityEvent[];
count: number;
}>(`/api/tasks/visibility-events${query ? '?' + query : ''}`);
}
async acknowledgeVisibilityEvent(eventId: number) {
return this.request<{
success: boolean;
message: string;
}>(`/api/tasks/visibility-events/${eventId}/acknowledge`, {
method: 'POST',
});
}
async getBrandVisibilityEvents(brand: string, params?: {
state?: string;
event_type?: string;
limit?: number;
offset?: number;
}) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
if (params?.event_type) searchParams.append('event_type', params.event_type);
if (params?.limit) searchParams.append('limit', String(params.limit));
if (params?.offset) searchParams.append('offset', String(params.offset));
const query = searchParams.toString();
return this.request<{
success: boolean;
events: BrandVisibilityEvent[];
count: number;
}>(`/api/brands/${encodeURIComponent(brand)}/events${query ? '?' + query : ''}`);
}
}
// ============================================================
// SALES ANALYTICS TYPES
// ============================================================
export interface DailySalesEstimate {
dispensary_id: number;
product_id: string;
brand_name: string | null;
category: string | null;
sale_date: string;
avg_price: number | null;
units_sold: number;
units_restocked: number;
revenue_estimate: number;
snapshot_count: number;
}
export interface BrandMarketShare {
brand_name: string;
state_code: string;
stores_carrying: number;
total_stores: number;
penetration_pct: number;
sku_count: number;
in_stock_skus: number;
avg_price: number | null;
calculated_at: string;
}
export interface SkuVelocity {
product_id: string;
brand_name: string | null;
category: string | null;
dispensary_id: number;
dispensary_name: string;
state_code: string;
total_units_30d: number;
total_revenue_30d: number;
days_with_sales: number;
avg_daily_units: number;
avg_price: number | null;
velocity_tier: 'hot' | 'steady' | 'slow' | 'stale';
calculated_at: string;
}
export interface StorePerformance {
dispensary_id: number;
dispensary_name: string;
city: string | null;
state_code: string;
total_revenue_30d: number;
total_units_30d: number;
total_skus: number;
in_stock_skus: number;
unique_brands: number;
unique_categories: number;
avg_price: number | null;
last_updated: string | null;
calculated_at: string;
}
export interface CategoryTrend {
category: string;
state_code: string;
week_start: string;
sku_count: number;
store_count: number;
total_units: number;
total_revenue: number;
avg_price: number | null;
calculated_at: string;
}
export interface ProductIntelligence {
dispensary_id: number;
dispensary_name: string;
state_code: string;
city: string | null;
sku: string;
product_name: string;
brand: string | null;
category: string | null;
is_in_stock: boolean;
stock_status: string | null;
stock_quantity: number | null;
price: number | null;
first_seen: string | null;
last_seen: string | null;
stock_diff_120: number;
days_since_oos: number | null;
days_until_stock_out: number | null;
avg_daily_units: number | null;
calculated_at: string;
}
export interface TopBrand {
brand_name: string;
total_revenue: number;
total_units: number;
store_count: number;
sku_count: number;
avg_price: number | null;
}
// ============================================================
// INVENTORY & VISIBILITY TYPES
// ============================================================
export interface InventorySnapshot {
id: number;
dispensary_id: number;
product_id: string;
platform: 'dutchie' | 'jane' | 'treez';
quantity_available: number | null;
is_below_threshold: boolean;
status: string | null;
price_rec: number | null;
price_med: number | null;
brand_name: string | null;
category: string | null;
product_name: string | null;
captured_at: string;
}
export interface VisibilityEvent {
id: number;
dispensary_id: number;
dispensary_name?: string;
product_id: string | null;
product_name: string | null;
brand_name: string | null;
event_type: 'oos' | 'back_in_stock' | 'brand_dropped' | 'brand_added' | 'price_change';
detected_at: string;
previous_quantity: number | null;
previous_price: number | null;
new_price: number | null;
price_change_pct: number | null;
platform: 'dutchie' | 'jane' | 'treez';
notified: boolean;
acknowledged_at: string | null;
}
export interface BrandVisibilityEvent {
id: number;
dispensary_id: number;
dispensary_name: string;
state_code: string | null;
product_id: string | null;
product_name: string | null;
brand_name: string;
event_type: 'oos' | 'back_in_stock' | 'brand_dropped' | 'brand_added' | 'price_change';
detected_at: string;
previous_price: number | null;
new_price: number | null;
price_change_pct: number | null;
platform: 'dutchie' | 'jane' | 'treez';
}
// Type for task schedules
@@ -3269,6 +3662,8 @@ export interface PayloadMetadata {
sizeBytesRaw: number;
fetchedAt: string;
dispensary_name?: string;
city?: string;
state?: string;
}
// Type for high-frequency (per-store) schedules

View File
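The new sales-analytics client methods all follow the same pattern (optional filters → query string → typed response). A usage sketch, assuming the `api` singleton exported by `lib/api` as in the pages elsewhere in this diff; the state-code and penetration formats are assumptions:

```typescript
// Sketch only: calling the new sales-analytics endpoints from a page component.
// '../lib/api' matches the other pages in this diff; the state code and
// min_penetration formats are assumptions, not confirmed by the API itself.
import { api, SkuVelocity, BrandMarketShare } from '../lib/api';

async function loadArizonaIntel(): Promise<void> {
  // Hot SKUs in Arizona, capped at 25 rows.
  const velocity = await api.getSkuVelocity({ state: 'AZ', tier: 'hot', limit: 25 });
  const hotSkus: SkuVelocity[] = velocity.data;

  // Brands carried by a meaningful share of stores (min_penetration assumed to be a percent).
  const share = await api.getBrandMarketShare({ state: 'AZ', min_penetration: 20 });
  const brands: BrandMarketShare[] = share.data;

  console.log(`${hotSkus.length} hot SKUs, ${brands.length} brands at >= 20% penetration`);
}
```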

@@ -1,32 +1,36 @@
/**
* Provider Display Names
*
* Maps internal provider identifiers to safe display labels.
* Internal identifiers (menu_type, product_provider, crawler_type) remain unchanged.
* Only the display label shown to users is transformed.
*
* IMPORTANT: Raw provider names (dutchie, treez, jane, etc.) must NEVER
* be displayed directly in the UI. Always use this utility.
* Maps internal menu_type values to display labels.
* - standalone/embedded → Dutchie (both are Dutchie platform)
* - treez → Treez
* - jane/iheartjane → Jane
*/
export const ProviderDisplayNames: Record<string, string> = {
// All menu providers map to anonymous "Menu Feed" label
dutchie: 'Menu Feed',
treez: 'Menu Feed',
jane: 'Menu Feed',
iheartjane: 'Menu Feed',
blaze: 'Menu Feed',
flowhub: 'Menu Feed',
weedmaps: 'Menu Feed',
leafly: 'Menu Feed',
leaflogix: 'Menu Feed',
tymber: 'Menu Feed',
dispense: 'Menu Feed',
// Dutchie (standalone and embedded are both Dutchie)
dutchie: 'dutchie',
standalone: 'dutchie',
embedded: 'dutchie',
// Other platforms
treez: 'treez',
jane: 'jane',
iheartjane: 'jane',
// Future platforms
blaze: 'blaze',
flowhub: 'flowhub',
weedmaps: 'weedmaps',
leafly: 'leafly',
leaflogix: 'leaflogix',
tymber: 'tymber',
dispense: 'dispense',
// Catch-all
unknown: 'Menu Feed',
default: 'Menu Feed',
'': 'Menu Feed',
unknown: 'unknown',
default: 'unknown',
'': 'unknown',
};
/**

View File

@@ -0,0 +1,342 @@
/**
* High-Frequency Manager Page
*
* View and manage stores with custom high-frequency crawl intervals.
* Part of Real-Time Inventory Tracking feature.
*/
import { useEffect, useState } from 'react';
import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api, HighFrequencyStore } from '../lib/api';
import { IntervalDropdown, formatInterval, getIntervalColor } from '../components/IntervalDropdown';
import {
Clock,
Store,
RefreshCw,
Plus,
Trash2,
AlertCircle,
CheckCircle,
Search,
TrendingUp,
Package,
} from 'lucide-react';
interface Stats {
totalStores: number;
byInterval: Record<number, number>;
byPlatform: Record<string, number>;
nextDueCount: number;
}
export function HighFrequencyManager() {
const navigate = useNavigate();
const [stores, setStores] = useState<HighFrequencyStore[]>([]);
const [stats, setStats] = useState<Stats | null>(null);
const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState('');
const [updating, setUpdating] = useState<number | null>(null);
const [error, setError] = useState<string | null>(null);
const [success, setSuccess] = useState<string | null>(null);
// Load data
useEffect(() => {
loadData();
}, []);
const loadData = async () => {
try {
setLoading(true);
setError(null);
const data = await api.getHighFrequencySchedules();
setStores(data.stores || []);
setStats(data.stats || null);
} catch (err: any) {
console.error('Failed to load high-frequency schedules:', err);
setError(err.message || 'Failed to load data');
} finally {
setLoading(false);
}
};
const handleIntervalChange = async (dispensaryId: number, intervalMinutes: number | null) => {
try {
setUpdating(dispensaryId);
setError(null);
if (intervalMinutes === null) {
await api.removeHighFrequencyInterval(dispensaryId);
setSuccess('High-frequency scheduling removed');
} else {
await api.setHighFrequencyInterval(dispensaryId, intervalMinutes);
setSuccess(`Interval updated to ${formatInterval(intervalMinutes)}`);
}
// Reload data
await loadData();
// Clear success message after 3 seconds
setTimeout(() => setSuccess(null), 3000);
} catch (err: any) {
console.error('Failed to update interval:', err);
setError(err.message || 'Failed to update interval');
} finally {
setUpdating(null);
}
};
const handleRemove = async (dispensaryId: number) => {
if (!confirm('Remove high-frequency scheduling for this store?')) return;
await handleIntervalChange(dispensaryId, null);
};
// Filter stores by search term
const filteredStores = stores.filter((store) =>
store.name.toLowerCase().includes(searchTerm.toLowerCase())
);
// Format timestamp
const formatTime = (ts: string | null) => {
if (!ts) return '-';
const date = new Date(ts);
return date.toLocaleString();
};
// Format relative time
const formatRelativeTime = (ts: string | null) => {
if (!ts) return '-';
const date = new Date(ts);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMins = Math.floor(diffMs / 60000);
if (diffMins < 1) return 'just now';
if (diffMins < 60) return `${diffMins}m ago`;
const diffHours = Math.floor(diffMins / 60);
if (diffHours < 24) return `${diffHours}h ago`;
const diffDays = Math.floor(diffHours / 24);
return `${diffDays}d ago`;
};
if (loading) {
return (
<Layout>
<div className="text-center py-12">
<div className="inline-block animate-spin rounded-full h-8 w-8 border-4 border-blue-500 border-t-transparent"></div>
<p className="mt-2 text-sm text-gray-400">Loading high-frequency schedules...</p>
</div>
</Layout>
);
}
return (
<Layout>
<div className="max-w-7xl mx-auto px-4 py-6">
{/* Header */}
<div className="flex items-center justify-between mb-6">
<div>
<h1 className="text-2xl font-bold text-white flex items-center gap-2">
<Clock className="h-6 w-6 text-blue-500" />
High-Frequency Manager
</h1>
<p className="text-gray-400 mt-1">
Manage stores with custom crawl intervals for real-time inventory tracking
</p>
</div>
<button
onClick={loadData}
className="flex items-center gap-2 px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded-lg transition-colors"
>
<RefreshCw className="h-4 w-4" />
Refresh
</button>
</div>
{/* Alerts */}
{error && (
<div className="mb-4 p-4 bg-red-900/50 border border-red-700 rounded-lg flex items-center gap-2 text-red-200">
<AlertCircle className="h-5 w-5" />
{error}
</div>
)}
{success && (
<div className="mb-4 p-4 bg-green-900/50 border border-green-700 rounded-lg flex items-center gap-2 text-green-200">
<CheckCircle className="h-5 w-5" />
{success}
</div>
)}
{/* Stats Cards */}
{stats && (
<div className="grid grid-cols-1 md:grid-cols-4 gap-4 mb-6">
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Store className="h-4 w-4" />
Total Stores
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.totalStores}</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Clock className="h-4 w-4" />
Next Due
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.nextDueCount}</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<TrendingUp className="h-4 w-4" />
15m Interval
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.byInterval[15] || 0}</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Package className="h-4 w-4" />
30m Interval
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.byInterval[30] || 0}</div>
</div>
</div>
)}
{/* Search */}
<div className="mb-6">
<div className="relative">
<Search className="absolute left-3 top-1/2 transform -translate-y-1/2 h-5 w-5 text-gray-400" />
<input
type="text"
value={searchTerm}
onChange={(e) => setSearchTerm(e.target.value)}
placeholder="Search stores..."
className="w-full pl-10 pr-4 py-2 bg-gray-800 border border-gray-700 rounded-lg text-white placeholder-gray-400 focus:ring-2 focus:ring-blue-500 focus:border-transparent"
/>
</div>
</div>
{/* Stores Table */}
<div className="bg-gray-800 rounded-lg border border-gray-700 overflow-hidden">
<table className="w-full">
<thead className="bg-gray-900">
<tr>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Store</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Platform</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Interval</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Next Crawl</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Last Crawl</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Changes (24h)</th>
<th className="px-4 py-3 text-right text-sm font-medium text-gray-400">Actions</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-700">
{filteredStores.length === 0 ? (
<tr>
<td colSpan={7} className="px-4 py-8 text-center text-gray-400">
{stores.length === 0
? 'No stores configured for high-frequency crawling'
: 'No stores match your search'}
</td>
</tr>
) : (
filteredStores.map((store) => (
<tr key={store.id} className="hover:bg-gray-700/50">
<td className="px-4 py-3">
<button
onClick={() => navigate(`/dispensaries/${store.id}`)}
className="text-blue-400 hover:text-blue-300 font-medium"
>
{store.name}
</button>
</td>
<td className="px-4 py-3">
<span className="px-2 py-1 bg-gray-700 rounded text-sm text-gray-300">
{store.menu_type}
</span>
</td>
<td className="px-4 py-3">
<IntervalDropdown
value={store.crawl_interval_minutes}
onChange={(val) => handleIntervalChange(store.id, val)}
disabled={updating === store.id}
size="sm"
includeNone={true}
/>
</td>
<td className="px-4 py-3 text-sm text-gray-300">
{store.next_crawl_at ? (
<span title={formatTime(store.next_crawl_at)}>
{formatRelativeTime(store.next_crawl_at)}
</span>
) : (
'-'
)}
</td>
<td className="px-4 py-3 text-sm text-gray-300">
{store.last_crawl_started_at ? (
<span title={formatTime(store.last_crawl_started_at)}>
{formatRelativeTime(store.last_crawl_started_at)}
</span>
) : (
'-'
)}
</td>
<td className="px-4 py-3">
<div className="flex items-center gap-2">
<span
className={`px-2 py-0.5 rounded text-xs ${
store.inventory_changes_24h > 0
? 'bg-blue-600 text-white'
: 'bg-gray-700 text-gray-400'
}`}
>
{store.inventory_changes_24h} inv
</span>
<span
className={`px-2 py-0.5 rounded text-xs ${
store.price_changes_24h > 0
? 'bg-yellow-600 text-white'
: 'bg-gray-700 text-gray-400'
}`}
>
{store.price_changes_24h} price
</span>
</div>
</td>
<td className="px-4 py-3 text-right">
<button
onClick={() => handleRemove(store.id)}
disabled={updating === store.id}
className="p-1 text-red-400 hover:text-red-300 disabled:opacity-50"
title="Remove from high-frequency"
>
<Trash2 className="h-4 w-4" />
</button>
</td>
</tr>
))
)}
</tbody>
</table>
</div>
{/* Info Box */}
<div className="mt-6 p-4 bg-gray-800 border border-gray-700 rounded-lg">
<h3 className="text-sm font-medium text-gray-300 mb-2">About High-Frequency Crawling</h3>
<p className="text-sm text-gray-400">
High-frequency crawling allows you to track inventory changes in near real-time for select stores.
Stores on 15-minute intervals will be crawled 96 times per day, enabling detection of:
</p>
<ul className="mt-2 text-sm text-gray-400 list-disc list-inside space-y-1">
<li>Out-of-stock events</li>
<li>Price changes ({'>'}5% threshold)</li>
<li>Brand drops and additions</li>
<li>Stock level changes for velocity calculations</li>
</ul>
</div>
</div>
</Layout>
);
}
export default HighFrequencyManager;

View File

@@ -0,0 +1,392 @@
/**
* Inventory Snapshots Dashboard
*
* View inventory snapshots captured from high-frequency crawls.
* Part of Real-Time Inventory Tracking feature.
*/
import { useEffect, useState } from 'react';
import { useNavigate, useSearchParams } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api, InventorySnapshot } from '../lib/api';
import {
Database,
RefreshCw,
AlertCircle,
Search,
Store,
Package,
Clock,
TrendingDown,
Filter,
} from 'lucide-react';
interface SnapshotStats {
total_snapshots: string;
stores_tracked: string;
products_tracked: string;
oldest_snapshot: string;
newest_snapshot: string;
snapshots_24h: string;
snapshots_1h: string;
}
export function InventorySnapshotsDashboard() {
const navigate = useNavigate();
const [searchParams] = useSearchParams();
const [snapshots, setSnapshots] = useState<InventorySnapshot[]>([]);
const [stats, setStats] = useState<SnapshotStats | null>(null);
const [loading, setLoading] = useState(true);
const [dispensaryId, setDispensaryId] = useState<string>('');
const [productId, setProductId] = useState<string>('');
const [error, setError] = useState<string | null>(null);
const [page, setPage] = useState(0);
const [hasMore, setHasMore] = useState(true);
const LIMIT = 50;
// Load data
useEffect(() => {
loadData();
loadStats();
}, [page]);
const loadData = async () => {
try {
setLoading(true);
setError(null);
const params: any = {
limit: LIMIT,
offset: page * LIMIT,
};
if (dispensaryId) {
params.dispensary_id = parseInt(dispensaryId, 10);
}
if (productId) {
params.product_id = productId;
}
const data = await api.getInventorySnapshots(params);
setSnapshots(data.snapshots || []);
setHasMore((data.snapshots || []).length === LIMIT);
} catch (err: any) {
console.error('Failed to load snapshots:', err);
setError(err.message || 'Failed to load snapshots');
} finally {
setLoading(false);
}
};
const loadStats = async () => {
try {
const response = await api.get<{ success: boolean; stats: SnapshotStats }>(
'/api/tasks/inventory-snapshots/stats'
);
setStats(response.data.stats);
} catch (err) {
console.error('Failed to load stats:', err);
}
};
const handleSearch = () => {
setPage(0);
loadData();
};
// Format timestamp
const formatTime = (ts: string) => {
const date = new Date(ts);
return date.toLocaleString();
};
const formatRelativeTime = (ts: string) => {
const date = new Date(ts);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMins = Math.floor(diffMs / 60000);
if (diffMins < 1) return 'just now';
if (diffMins < 60) return `${diffMins}m ago`;
const diffHours = Math.floor(diffMins / 60);
if (diffHours < 24) return `${diffHours}h ago`;
const diffDays = Math.floor(diffHours / 24);
return `${diffDays}d ago`;
};
// Get stock status color
const getStockColor = (qty: number | null, isBelowThreshold: boolean) => {
if (qty === null) return 'text-gray-400';
if (qty === 0 || isBelowThreshold) return 'text-red-400';
if (qty < 10) return 'text-yellow-400';
return 'text-green-400';
};
// Get platform badge color
const getPlatformColor = (platform: string) => {
switch (platform) {
case 'dutchie':
return 'bg-green-600';
case 'jane':
return 'bg-blue-600';
case 'treez':
return 'bg-purple-600';
default:
return 'bg-gray-600';
}
};
return (
<Layout>
<div className="max-w-7xl mx-auto px-4 py-6">
{/* Header */}
<div className="flex items-center justify-between mb-6">
<div>
<h1 className="text-2xl font-bold text-white flex items-center gap-2">
<Database className="h-6 w-6 text-purple-500" />
Inventory Snapshots
</h1>
<p className="text-gray-400 mt-1">
View inventory snapshots captured from high-frequency crawls
</p>
</div>
<button
onClick={() => {
loadData();
loadStats();
}}
className="flex items-center gap-2 px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded-lg transition-colors"
>
<RefreshCw className="h-4 w-4" />
Refresh
</button>
</div>
{/* Error */}
{error && (
<div className="mb-4 p-4 bg-red-900/50 border border-red-700 rounded-lg flex items-center gap-2 text-red-200">
<AlertCircle className="h-5 w-5" />
{error}
</div>
)}
{/* Stats Cards */}
{stats && (
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-6">
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Database className="h-4 w-4" />
Total Snapshots
</div>
<div className="text-2xl font-bold text-white mt-1">
{parseInt(stats.total_snapshots).toLocaleString()}
</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Store className="h-4 w-4" />
Stores Tracked
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.stores_tracked}</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Package className="h-4 w-4" />
Products Tracked
</div>
<div className="text-2xl font-bold text-white mt-1">
{parseInt(stats.products_tracked).toLocaleString()}
</div>
</div>
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Clock className="h-4 w-4" />
Last Hour
</div>
<div className="text-2xl font-bold text-white mt-1">
{parseInt(stats.snapshots_1h).toLocaleString()}
</div>
</div>
</div>
)}
{/* Filters */}
<div className="flex flex-wrap gap-4 mb-6">
<div className="flex-1 min-w-[200px]">
<label className="block text-sm text-gray-400 mb-1">Dispensary ID</label>
<input
type="text"
value={dispensaryId}
onChange={(e) => setDispensaryId(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && handleSearch()}
placeholder="Filter by dispensary ID..."
className="w-full px-4 py-2 bg-gray-800 border border-gray-700 rounded-lg text-white placeholder-gray-400 focus:ring-2 focus:ring-blue-500 focus:border-transparent"
/>
</div>
<div className="flex-1 min-w-[200px]">
<label className="block text-sm text-gray-400 mb-1">Product ID</label>
<input
type="text"
value={productId}
onChange={(e) => setProductId(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && handleSearch()}
placeholder="Filter by product ID..."
className="w-full px-4 py-2 bg-gray-800 border border-gray-700 rounded-lg text-white placeholder-gray-400 focus:ring-2 focus:ring-blue-500 focus:border-transparent"
/>
</div>
<div className="flex items-end">
<button
onClick={handleSearch}
className="flex items-center gap-2 px-4 py-2 bg-blue-600 hover:bg-blue-500 text-white rounded-lg transition-colors"
>
<Filter className="h-4 w-4" />
Apply
</button>
</div>
</div>
{/* Snapshots Table */}
<div className="bg-gray-800 rounded-lg border border-gray-700 overflow-hidden">
<div className="overflow-x-auto">
<table className="w-full">
<thead className="bg-gray-900">
<tr>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Time</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Platform</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Store</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Product</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Brand</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Category</th>
<th className="px-4 py-3 text-right text-sm font-medium text-gray-400">Qty</th>
<th className="px-4 py-3 text-right text-sm font-medium text-gray-400">Price</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Status</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-700">
{loading ? (
<tr>
<td colSpan={9} className="px-4 py-8 text-center text-gray-400">
<div className="inline-block animate-spin rounded-full h-6 w-6 border-2 border-blue-500 border-t-transparent mb-2"></div>
<p>Loading snapshots...</p>
</td>
</tr>
) : snapshots.length === 0 ? (
<tr>
<td colSpan={9} className="px-4 py-8 text-center text-gray-400">
No inventory snapshots found
</td>
</tr>
) : (
snapshots.map((snapshot) => (
<tr key={snapshot.id} className="hover:bg-gray-700/50">
<td className="px-4 py-3 text-sm text-gray-300">
<span title={formatTime(snapshot.captured_at)}>
{formatRelativeTime(snapshot.captured_at)}
</span>
</td>
<td className="px-4 py-3">
<span
className={`px-2 py-1 rounded text-xs text-white ${getPlatformColor(
snapshot.platform
)}`}
>
{snapshot.platform}
</span>
</td>
<td className="px-4 py-3">
<button
onClick={() => navigate(`/dispensaries/${snapshot.dispensary_id}`)}
className="text-blue-400 hover:text-blue-300 text-sm"
>
#{snapshot.dispensary_id}
</button>
</td>
<td className="px-4 py-3 text-sm text-white max-w-[200px] truncate">
{snapshot.product_name || snapshot.product_id}
</td>
<td className="px-4 py-3 text-sm text-gray-300">
{snapshot.brand_name || '-'}
</td>
<td className="px-4 py-3 text-sm text-gray-300">
{snapshot.category || '-'}
</td>
<td
className={`px-4 py-3 text-sm text-right font-medium ${getStockColor(
snapshot.quantity_available,
snapshot.is_below_threshold
)}`}
>
{snapshot.quantity_available ?? '-'}
{snapshot.is_below_threshold && (
<TrendingDown className="h-3 w-3 inline ml-1" />
)}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-300">
{snapshot.price_rec ? `$${snapshot.price_rec.toFixed(2)}` : '-'}
</td>
<td className="px-4 py-3">
<span
className={`px-2 py-1 rounded text-xs ${
snapshot.status === 'Active' || snapshot.status === 'ACTIVE'
? 'bg-green-900/50 text-green-400'
: 'bg-gray-700 text-gray-400'
}`}
>
{snapshot.status || 'Unknown'}
</span>
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
{/* Pagination */}
<div className="mt-4 flex justify-between items-center">
<div className="text-sm text-gray-400">
Showing {snapshots.length} snapshots (page {page + 1})
</div>
<div className="flex gap-2">
<button
onClick={() => setPage((p) => Math.max(0, p - 1))}
disabled={page === 0}
className="px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded disabled:opacity-50 disabled:cursor-not-allowed"
>
Previous
</button>
<button
onClick={() => setPage((p) => p + 1)}
disabled={!hasMore}
className="px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded disabled:opacity-50 disabled:cursor-not-allowed"
>
Next
</button>
</div>
</div>
{/* Info Box */}
<div className="mt-6 p-4 bg-gray-800 border border-gray-700 rounded-lg">
<h3 className="text-sm font-medium text-gray-300 mb-2">About Inventory Snapshots</h3>
<p className="text-sm text-gray-400">
Inventory snapshots capture the state of products during each crawl. They include:
</p>
<ul className="mt-2 text-sm text-gray-400 list-disc list-inside space-y-1">
<li>Quantity available (for delta/velocity calculations)</li>
<li>Price (recreational and medical)</li>
<li>Stock status and low-stock indicators</li>
<li>Brand and category information</li>
</ul>
<p className="mt-2 text-sm text-gray-400">
Data is normalized across all platforms (Dutchie, Jane, Treez) into a common format.
</p>
</div>
</div>
</Layout>
);
}
export default InventorySnapshotsDashboard;

View File
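The info box in this dashboard notes that snapshot quantities feed delta/velocity calculations. A sketch of that idea (an illustration of the concept, not the server-side implementation), using the InventorySnapshot type from the API client:

```typescript
// Sketch only: the quantity-delta idea behind units_sold / units_restocked.
// Illustration of the concept, not the server-side implementation.
import { InventorySnapshot } from '../lib/api';

function quantityDelta(prev: InventorySnapshot, next: InventorySnapshot) {
  if (prev.quantity_available == null || next.quantity_available == null) {
    return { unitsSold: 0, unitsRestocked: 0 };
  }
  const diff = next.quantity_available - prev.quantity_available;
  // A drop between consecutive snapshots is treated as units sold; a rise as a restock.
  return {
    unitsSold: diff < 0 ? -diff : 0,
    unitsRestocked: diff > 0 ? diff : 0,
  };
}
```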

@@ -347,10 +347,17 @@ export function PayloadsDashboard() {
</td>
<td className="px-4 py-3">
<div className="flex items-center gap-2">
<Store className="w-4 h-4 text-gray-400" />
<span className="text-sm font-medium truncate max-w-[200px]">
<Store className="w-4 h-4 text-gray-400 flex-shrink-0" />
<div className="min-w-0">
<div className="text-sm font-medium truncate max-w-[200px]">
{payload.dispensary_name || `Store #${payload.dispensaryId}`}
</span>
</div>
{(payload.city || payload.state) && (
<div className="text-xs text-gray-500 truncate">
{payload.city}{payload.city && payload.state ? ', ' : ''}{payload.state}
</div>
)}
</div>
</div>
</td>
<td className="px-4 py-3">

View File

@@ -0,0 +1,435 @@
/**
* Visibility Events Dashboard
*
* View and manage product visibility events (OOS, price changes, brand drops, etc.)
* Part of Real-Time Inventory Tracking feature.
*/
import { useEffect, useState } from 'react';
import { useNavigate, useSearchParams } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api, VisibilityEvent } from '../lib/api';
import { EventTypeBadge, getAllEventTypes, formatPriceChange, EventType } from '../components/EventTypeBadge';
import {
Bell,
RefreshCw,
AlertCircle,
CheckCircle,
Search,
Filter,
Check,
Clock,
Store,
Package,
DollarSign,
Tag,
} from 'lucide-react';
interface EventStats {
total_events: string;
oos_events: string;
back_in_stock_events: string;
brand_dropped_events: string;
brand_added_events: string;
price_change_events: string;
events_24h: string;
acknowledged_events: string;
notified_events: string;
}
export function VisibilityEventsDashboard() {
const navigate = useNavigate();
const [searchParams, setSearchParams] = useSearchParams();
const [events, setEvents] = useState<VisibilityEvent[]>([]);
const [stats, setStats] = useState<EventStats | null>(null);
const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState('');
const [selectedType, setSelectedType] = useState<EventType | ''>('');
const [selectedEvents, setSelectedEvents] = useState<Set<number>>(new Set());
const [acknowledging, setAcknowledging] = useState(false);
const [error, setError] = useState<string | null>(null);
const [success, setSuccess] = useState<string | null>(null);
const [page, setPage] = useState(0);
const [hasMore, setHasMore] = useState(true);
const LIMIT = 50;
// Load data
useEffect(() => {
loadData();
loadStats();
}, [selectedType, page]);
const loadData = async () => {
try {
setLoading(true);
setError(null);
const params: any = {
limit: LIMIT,
offset: page * LIMIT,
};
if (selectedType) {
params.event_type = selectedType;
}
if (searchTerm) {
params.brand = searchTerm;
}
const data = await api.getVisibilityEvents(params);
setEvents(data.events || []);
setHasMore((data.events || []).length === LIMIT);
} catch (err: any) {
console.error('Failed to load visibility events:', err);
setError(err.message || 'Failed to load events');
} finally {
setLoading(false);
}
};
const loadStats = async () => {
try {
const response = await api.get<{ success: boolean; stats: EventStats }>(
'/api/tasks/visibility-events/stats'
);
setStats(response.data.stats);
} catch (err) {
console.error('Failed to load stats:', err);
}
};
const handleAcknowledge = async (eventId: number) => {
try {
setAcknowledging(true);
await api.acknowledgeVisibilityEvent(eventId);
setSuccess('Event acknowledged');
await loadData();
await loadStats();
setTimeout(() => setSuccess(null), 3000);
} catch (err: any) {
setError(err.message || 'Failed to acknowledge event');
} finally {
setAcknowledging(false);
}
};
const handleBulkAcknowledge = async () => {
if (selectedEvents.size === 0) return;
if (!confirm(`Acknowledge ${selectedEvents.size} events?`)) return;
try {
setAcknowledging(true);
await api.post('/api/tasks/visibility-events/acknowledge-bulk', {
event_ids: Array.from(selectedEvents),
});
setSuccess(`${selectedEvents.size} events acknowledged`);
setSelectedEvents(new Set());
await loadData();
await loadStats();
setTimeout(() => setSuccess(null), 3000);
} catch (err: any) {
setError(err.message || 'Failed to acknowledge events');
} finally {
setAcknowledging(false);
}
};
const toggleSelectAll = () => {
if (selectedEvents.size === events.length) {
setSelectedEvents(new Set());
} else {
setSelectedEvents(new Set(events.map((e) => e.id)));
}
};
const toggleSelect = (eventId: number) => {
const newSelected = new Set(selectedEvents);
if (newSelected.has(eventId)) {
newSelected.delete(eventId);
} else {
newSelected.add(eventId);
}
setSelectedEvents(newSelected);
};
// Format timestamp
const formatTime = (ts: string) => {
const date = new Date(ts);
return date.toLocaleString();
};
const formatRelativeTime = (ts: string) => {
const date = new Date(ts);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMins = Math.floor(diffMs / 60000);
if (diffMins < 1) return 'just now';
if (diffMins < 60) return `${diffMins}m ago`;
const diffHours = Math.floor(diffMins / 60);
if (diffHours < 24) return `${diffHours}h ago`;
const diffDays = Math.floor(diffHours / 24);
return `${diffDays}d ago`;
};
const eventTypes = getAllEventTypes();
return (
<Layout>
<div className="max-w-7xl mx-auto px-4 py-6">
{/* Header */}
<div className="flex items-center justify-between mb-6">
<div>
<h1 className="text-2xl font-bold text-white flex items-center gap-2">
<Bell className="h-6 w-6 text-yellow-500" />
Visibility Events
</h1>
<p className="text-gray-400 mt-1">
Track product out-of-stock events, price changes, and brand visibility changes
</p>
</div>
<div className="flex items-center gap-2">
{selectedEvents.size > 0 && (
<button
onClick={handleBulkAcknowledge}
disabled={acknowledging}
className="flex items-center gap-2 px-4 py-2 bg-green-600 hover:bg-green-500 text-white rounded-lg transition-colors disabled:opacity-50"
>
<Check className="h-4 w-4" />
Acknowledge ({selectedEvents.size})
</button>
)}
<button
onClick={() => {
loadData();
loadStats();
}}
className="flex items-center gap-2 px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded-lg transition-colors"
>
<RefreshCw className="h-4 w-4" />
Refresh
</button>
</div>
</div>
{/* Alerts */}
{error && (
<div className="mb-4 p-4 bg-red-900/50 border border-red-700 rounded-lg flex items-center gap-2 text-red-200">
<AlertCircle className="h-5 w-5" />
{error}
</div>
)}
{success && (
<div className="mb-4 p-4 bg-green-900/50 border border-green-700 rounded-lg flex items-center gap-2 text-green-200">
<CheckCircle className="h-5 w-5" />
{success}
</div>
)}
{/* Stats Cards */}
{stats && (
<div className="grid grid-cols-2 md:grid-cols-5 gap-4 mb-6">
<div className="bg-gray-800 rounded-lg p-4 border border-gray-700">
<div className="flex items-center gap-2 text-gray-400 text-sm">
<Clock className="h-4 w-4" />
Last 24h
</div>
<div className="text-2xl font-bold text-white mt-1">{stats.events_24h}</div>
</div>
<div className="bg-red-900/30 rounded-lg p-4 border border-red-700">
<div className="flex items-center gap-2 text-red-400 text-sm">
<Package className="h-4 w-4" />
OOS Events
</div>
<div className="text-2xl font-bold text-red-400 mt-1">{stats.oos_events}</div>
</div>
<div className="bg-green-900/30 rounded-lg p-4 border border-green-700">
<div className="flex items-center gap-2 text-green-400 text-sm">
<Package className="h-4 w-4" />
Back in Stock
</div>
<div className="text-2xl font-bold text-green-400 mt-1">{stats.back_in_stock_events}</div>
</div>
<div className="bg-yellow-900/30 rounded-lg p-4 border border-yellow-700">
<div className="flex items-center gap-2 text-yellow-400 text-sm">
<DollarSign className="h-4 w-4" />
Price Changes
</div>
<div className="text-2xl font-bold text-yellow-400 mt-1">{stats.price_change_events}</div>
</div>
<div className="bg-orange-900/30 rounded-lg p-4 border border-orange-700">
<div className="flex items-center gap-2 text-orange-400 text-sm">
<Tag className="h-4 w-4" />
Brand Drops
</div>
<div className="text-2xl font-bold text-orange-400 mt-1">{stats.brand_dropped_events}</div>
</div>
</div>
)}
{/* Filters */}
<div className="flex flex-wrap gap-4 mb-6">
<div className="flex-1 min-w-[200px]">
<div className="relative">
<Search className="absolute left-3 top-1/2 transform -translate-y-1/2 h-5 w-5 text-gray-400" />
<input
type="text"
value={searchTerm}
onChange={(e) => setSearchTerm(e.target.value)}
onKeyDown={(e) => e.key === 'Enter' && loadData()}
placeholder="Search by brand..."
className="w-full pl-10 pr-4 py-2 bg-gray-800 border border-gray-700 rounded-lg text-white placeholder-gray-400 focus:ring-2 focus:ring-blue-500 focus:border-transparent"
/>
</div>
</div>
<select
value={selectedType}
onChange={(e) => {
setSelectedType(e.target.value as EventType | '');
setPage(0);
}}
className="px-4 py-2 bg-gray-800 border border-gray-700 rounded-lg text-white focus:ring-2 focus:ring-blue-500"
>
<option value="">All Event Types</option>
{eventTypes.map((type) => (
<option key={type.value} value={type.value}>
{type.label}
</option>
))}
</select>
</div>
{/* Events Table */}
<div className="bg-gray-800 rounded-lg border border-gray-700 overflow-hidden">
<table className="w-full">
<thead className="bg-gray-900">
<tr>
<th className="px-4 py-3 text-left">
<input
type="checkbox"
checked={selectedEvents.size === events.length && events.length > 0}
onChange={toggleSelectAll}
className="rounded bg-gray-700 border-gray-600 text-blue-500"
/>
</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Type</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Time</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Store</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Product/Brand</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Details</th>
<th className="px-4 py-3 text-left text-sm font-medium text-gray-400">Status</th>
<th className="px-4 py-3 text-right text-sm font-medium text-gray-400">Actions</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-700">
{loading ? (
<tr>
<td colSpan={8} className="px-4 py-8 text-center text-gray-400">
<div className="inline-block animate-spin rounded-full h-6 w-6 border-2 border-blue-500 border-t-transparent mb-2"></div>
<p>Loading events...</p>
</td>
</tr>
) : events.length === 0 ? (
<tr>
<td colSpan={8} className="px-4 py-8 text-center text-gray-400">
No visibility events found
</td>
</tr>
) : (
events.map((event) => (
<tr key={event.id} className="hover:bg-gray-700/50">
<td className="px-4 py-3">
<input
type="checkbox"
checked={selectedEvents.has(event.id)}
onChange={() => toggleSelect(event.id)}
className="rounded bg-gray-700 border-gray-600 text-blue-500"
/>
</td>
<td className="px-4 py-3">
<EventTypeBadge type={event.event_type} size="sm" />
</td>
<td className="px-4 py-3 text-sm text-gray-300">
<span title={formatTime(event.detected_at)}>
{formatRelativeTime(event.detected_at)}
</span>
</td>
<td className="px-4 py-3">
<button
onClick={() => navigate(`/dispensaries/${event.dispensary_id}`)}
className="text-blue-400 hover:text-blue-300 text-sm"
>
{event.dispensary_name || `Store #${event.dispensary_id}`}
</button>
</td>
<td className="px-4 py-3">
<div className="text-sm text-white">{event.product_name || event.brand_name || '-'}</div>
{event.product_name && event.brand_name && (
<div className="text-xs text-gray-400">{event.brand_name}</div>
)}
</td>
<td className="px-4 py-3 text-sm text-gray-300">
{event.event_type === 'price_change'
? formatPriceChange(event.previous_price, event.new_price, event.price_change_pct)
: '-'}
</td>
<td className="px-4 py-3">
{event.acknowledged_at ? (
<span className="inline-flex items-center gap-1 px-2 py-1 bg-green-900/50 text-green-400 rounded text-xs">
<Check className="h-3 w-3" />
Acknowledged
</span>
) : (
<span className="px-2 py-1 bg-gray-700 text-gray-400 rounded text-xs">
Pending
</span>
)}
</td>
<td className="px-4 py-3 text-right">
{!event.acknowledged_at && (
<button
onClick={() => handleAcknowledge(event.id)}
disabled={acknowledging}
className="px-2 py-1 bg-green-600 hover:bg-green-500 text-white rounded text-xs disabled:opacity-50"
>
Acknowledge
</button>
)}
</td>
</tr>
))
)}
</tbody>
</table>
</div>
{/* Pagination */}
<div className="mt-4 flex justify-between items-center">
<div className="text-sm text-gray-400">
Showing {events.length} events (page {page + 1})
</div>
<div className="flex gap-2">
<button
onClick={() => setPage((p) => Math.max(0, p - 1))}
disabled={page === 0}
className="px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded disabled:opacity-50 disabled:cursor-not-allowed"
>
Previous
</button>
<button
onClick={() => setPage((p) => p + 1)}
disabled={!hasMore}
className="px-4 py-2 bg-gray-700 hover:bg-gray-600 text-white rounded disabled:opacity-50 disabled:cursor-not-allowed"
>
Next
</button>
</div>
</div>
</div>
</Layout>
);
}
export default VisibilityEventsDashboard;

View File

@@ -383,9 +383,10 @@ function PreflightSummary({ worker, poolOpen = true }: { worker: Worker; poolOpe
const fingerprint = worker.fingerprint_data;
const httpError = worker.preflight_http_error;
const httpMs = worker.preflight_http_ms;
// Geo from current_city/state columns, or fallback to fingerprint detected location
const geoState = worker.current_state || fingerprint?.detectedLocation?.region;
const geoCity = worker.current_city || fingerprint?.detectedLocation?.city;
// Show DETECTED proxy location (from fingerprint), not assigned state
// This lets us verify the proxy is geo-targeted correctly
const geoState = fingerprint?.detectedLocation?.region || worker.current_state;
const geoCity = fingerprint?.detectedLocation?.city || worker.current_city;
// Worker is ONLY qualified if http preflight passed AND has geo assigned
const hasGeo = Boolean(geoState);
const isQualified = (worker.is_qualified || httpStatus === 'passed') && hasGeo;
@@ -702,8 +703,9 @@ function WorkerSlot({
const httpIp = worker?.http_ip;
const fingerprint = worker?.fingerprint_data;
const geoState = worker?.current_state || (fingerprint as any)?.detectedLocation?.region;
const geoCity = worker?.current_city || (fingerprint as any)?.detectedLocation?.city;
// Show DETECTED proxy location (from fingerprint), not assigned state
const geoState = (fingerprint as any)?.detectedLocation?.region || worker?.current_state;
const geoCity = (fingerprint as any)?.detectedLocation?.city || worker?.current_city;
const isQualified = worker?.is_qualified;
// Build fingerprint tooltip
@@ -803,7 +805,7 @@ function PodVisualization({
// Get the single worker for this pod (1 worker_registry entry per K8s pod)
const worker = workers[0];
const activeTasks = worker?.active_tasks ?? [];
const maxSlots = worker?.max_concurrent_tasks ?? 3;
const maxSlots = worker?.max_concurrent_tasks ?? 5;
const activeCount = activeTasks.length;
const isBackingOff = worker?.metadata?.is_backing_off;
const isDecommissioning = worker?.decommission_requested;

84
docs/DOCKER_REGISTRY.md Normal file
View File

@@ -0,0 +1,84 @@
# Using the Docker Registry Cache
To avoid Docker Hub rate limits, use our registry cache at `registry.spdy.io` (HTTPS) or `10.100.9.70:5000` (plain HTTP, internal network only).
## For Woodpecker CI (Kaniko builds)
In your `.woodpecker.yml`, use these Kaniko flags:
```yaml
docker-build:
image: gcr.io/kaniko-project/executor:debug
commands:
- /kaniko/executor
--context=/woodpecker/src/...
--dockerfile=Dockerfile
--destination=10.100.9.70:5000/your-image:tag
--registry-mirror=10.100.9.70:5000
--insecure-registry=10.100.9.70:5000
--cache=true
--cache-repo=10.100.9.70:5000/your-image/cache
--cache-ttl=168h
```
**Key points:**
- `--registry-mirror=10.100.9.70:5000` - Pulls base images from local cache
- `--insecure-registry=10.100.9.70:5000` - Allows HTTP (not HTTPS)
- `--cache=true` + `--cache-repo=...` - Caches build layers locally
## Available Base Images
The local registry has these cached:
| Image | Tags |
|-------|------|
| `node` | `20-slim`, `22-slim`, `22-alpine`, `20-alpine` |
| `alpine` | `latest` |
| `nginx` | `alpine` |
| `bitnami/kubectl` | `latest` |
| `gcr.io/kaniko-project/executor` | `debug` |
Need a different image? Add it to the cache using crane:
```bash
kubectl run cache-image --rm -it --restart=Never \
--image=gcr.io/go-containerregistry/crane:latest \
-- copy docker.io/library/IMAGE:TAG 10.100.9.70:5000/library/IMAGE:TAG --insecure
```
## Which Registry URL to Use
| Context | URL | Why |
|---------|-----|-----|
| Kaniko builds (CI) | `10.100.9.70:5000` | Internal HTTP, faster |
| kubectl set image | `registry.spdy.io` | HTTPS, k8s nodes can pull |
| Checking images | Either works | Same backend |
## DO NOT USE
- ~~`--registry-mirror=mirror.gcr.io`~~ - Rate limited by Docker Hub
- ~~Direct pulls from `docker.io`~~ - Rate limited (100 pulls/6hr anonymous)
- ~~`10.100.9.70:5000` in kubectl commands~~ - k8s nodes require HTTPS
## Checking Cached Images
List all cached images:
```bash
curl -s http://10.100.9.70:5000/v2/_catalog | jq
```
List tags for a specific image:
```bash
curl -s http://10.100.9.70:5000/v2/library/node/tags/list | jq
```
## Troubleshooting
### "no such host" or DNS errors
The CI runner can't reach the registry mirror. Make sure you're using `10.100.9.70:5000`, not `mirror.gcr.io`.
### "manifest unknown"
The image/tag isn't cached. Add it using the crane command above.
### HTTP vs HTTPS errors
Always pass `--insecure-registry=10.100.9.70:5000` - the internal endpoint speaks plain HTTP; only `registry.spdy.io` terminates TLS.
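If you see an error along the lines of a plain-HTTP response being given to an HTTPS client, the flag is missing. A quick manual check with crane (a sketch, assuming crane is available locally):
```bash
# --insecure makes crane speak plain HTTP to the internal registry
crane manifest 10.100.9.70:5000/library/alpine:latest --insecure | head -c 200
```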

104
docs/SPDY_INFRASTRUCTURE.md Normal file
View File

@@ -0,0 +1,104 @@
# CannaIQ Infrastructure (spdy.io)
External services used by the spdy.io Kubernetes cluster. **Do not deploy in-cluster containers for these - connect to the existing instances listed below.**
## PostgreSQL
| Setting | Value |
|----------|----------------------|
| Host | 10.100.6.50 |
| Port | 5432 |
| Database | cannaiq |
| Username | cannaiq |
| Password | SpDyCannaIQ2024 |
```bash
# Connection string
DATABASE_URL=postgres://cannaiq:SpDyCannaIQ2024@10.100.6.50:5432/cannaiq
# Test connection
PGPASSWORD='SpDyCannaIQ2024' psql -h 10.100.6.50 -p 5432 -U cannaiq -d cannaiq -c "SELECT 1"
```
## Redis
| Setting | Value |
|----------|----------------|
| Host | 10.100.9.50 |
| Port | 6379 |
| Password | SpDyR3d1s2024! |
```bash
# Connection URL
REDIS_URL=redis://:SpDyR3d1s2024!@10.100.9.50:6379
# Node.js .env
REDIS_HOST=10.100.9.50
REDIS_PORT=6379
REDIS_PASSWORD=SpDyR3d1s2024!
```
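A quick way to confirm Redis connectivity and the password (a sketch; assumes `redis-cli` is available wherever you run it):
```bash
# Expect PONG if the host, port, and password are all correct
redis-cli -h 10.100.9.50 -p 6379 -a 'SpDyR3d1s2024!' ping
```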
## MinIO (S3-Compatible Storage)
| Setting | Value |
|----------------|------------------|
| Endpoint | 10.100.9.80:9000 |
| Console | 10.100.9.80:9001 |
| Region | us-east-1 |
| Use Path Style | true |
### CannaIQ Bucket
| Setting | Value |
|------------|----------------|
| Bucket | cannaiq |
| Access Key | cannaiq-app |
| Secret Key | cannaiq-secret |
```bash
# Node.js .env
MINIO_ENDPOINT=10.100.9.80
MINIO_PORT=9000
MINIO_ACCESS_KEY=cannaiq-app
MINIO_SECRET_KEY=cannaiq-secret
MINIO_BUCKET=cannaiq
MINIO_USE_SSL=false
```
### Cannabrands Bucket
| Setting | Value |
|------------|------------------------------------------|
| Bucket | cannabrands |
| Access Key | cannabrands-app |
| Secret Key | cdbdcd0c7b6f3994d4ab09f68eaff98665df234f |
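To sanity-check bucket access, the MinIO client (`mc`) can be pointed at the endpoint (a sketch; assumes `mc` is installed, and the alias name `spdy` is arbitrary):
```bash
# Register the endpoint (plain HTTP, path-style), then list the cannaiq bucket
mc alias set spdy http://10.100.9.80:9000 cannaiq-app cannaiq-secret
mc ls spdy/cannaiq
```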
## Kubernetes Secrets
Create secrets in the `cannaiq` namespace:
```bash
# Database
kubectl create secret generic db-credentials -n cannaiq \
  --from-literal=DATABASE_URL='postgres://cannaiq:SpDyCannaIQ2024@10.100.6.50:5432/cannaiq'

# Redis
kubectl create secret generic redis-credentials -n cannaiq \
  --from-literal=REDIS_URL='redis://:SpDyR3d1s2024!@10.100.9.50:6379'

# MinIO
kubectl create secret generic minio-credentials -n cannaiq \
  --from-literal=MINIO_ACCESS_KEY='cannaiq-app' \
  --from-literal=MINIO_SECRET_KEY='cannaiq-secret'
```
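Existing workloads can pick these up without hand-editing manifests, for example (a sketch; `deployment/your-app` is a placeholder):
```bash
# Inject every key from a secret as environment variables on a deployment
kubectl set env deployment/your-app -n cannaiq --from=secret/db-credentials
kubectl set env deployment/your-app -n cannaiq --from=secret/minio-credentials
```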
## Network
All services are on the `10.100.x.x` internal network:
| Service | IP | Port |
|------------|--------------|------|
| PostgreSQL | 10.100.6.50 | 5432 |
| Redis | 10.100.9.50 | 6379 |
| MinIO | 10.100.9.80 | 9000 |
| Registry | 10.100.9.70 | 5000 |
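A quick reachability sweep from inside the cluster (a sketch; uses the cached busybox image, and the pod name `netcheck` is arbitrary):
```bash
# nc -z exits 0 when the TCP port accepts a connection
kubectl run netcheck --rm -it --restart=Never -n cannaiq \
  --image=registry.spdy.io/library/busybox:latest -- sh -c '
    for hp in 10.100.6.50:5432 10.100.9.50:6379 10.100.9.80:9000 10.100.9.70:5000; do
      nc -z -w 2 "${hp%:*}" "${hp#*:}" && echo "$hp OK" || echo "$hp UNREACHABLE"
    done'
```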

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM registry.spdy.io/library/node:22-slim AS builder
WORKDIR /app
@@ -20,7 +20,7 @@ COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine
FROM registry.spdy.io/library/nginx:alpine
# Copy built assets from builder stage (CRA outputs to /build)
COPY --from=builder /app/build /usr/share/nginx/html

View File

@@ -1,5 +1,5 @@
# Find a Gram Backend - FastAPI
FROM python:3.11-slim
FROM registry.spdy.io/library/python:3.11-slim
WORKDIR /app

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM registry.spdy.io/library/node:22-slim AS builder
WORKDIR /app
@@ -25,7 +25,7 @@ COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine
FROM registry.spdy.io/library/nginx:alpine
# Copy built assets from builder stage (CRA outputs to /build)
COPY --from=builder /app/build /usr/share/nginx/html

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM registry.spdy.io/library/node:20-slim AS builder
WORKDIR /app
@@ -19,7 +19,7 @@ ENV VITE_API_URL=https://cannaiq.co
RUN npm run build
# Production stage
FROM nginx:alpine
FROM registry.spdy.io/library/nginx:alpine
# Copy built assets from builder stage
COPY --from=builder /app/dist /usr/share/nginx/html

View File

@@ -1,76 +0,0 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: cannaiq
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: cannaiq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: scraper-secrets
                  key: POSTGRES_USER
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: scraper-secrets
                  key: POSTGRES_PASSWORD
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef:
                  name: scraper-secrets
                  key: POSTGRES_DB
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: cannaiq
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432

View File

@@ -1,66 +0,0 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
  namespace: cannaiq
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: cannaiq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          resources:
            requests:
              memory: "64Mi"
              cpu: "50m"
            limits:
              memory: "256Mi"
              cpu: "200m"
          volumeMounts:
            - name: redis-data
              mountPath: /data
          command:
            - redis-server
            - --appendonly
            - "yes"
            - --maxmemory
            - "200mb"
            - --maxmemory-policy
            - allkeys-lru
      volumes:
        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-data
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: cannaiq
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379

View File

@@ -0,0 +1,56 @@
# Daily job to sync base images from Docker Hub to local registry
# Runs at 3 AM daily to refresh the cache before rate limits reset
apiVersion: batch/v1
kind: CronJob
metadata:
  name: registry-sync
  namespace: woodpecker
spec:
  schedule: "0 3 * * *"  # 3 AM daily
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: sync
              image: gcr.io/go-containerregistry/crane:latest
              command:
                - /bin/sh
                - -c
                - |
                  set -e
                  echo "=== Registry Sync: $(date) ==="
                  REGISTRY="registry.spdy.io"
                  # Base images to cache (source of truth for all K8s deployments)
                  # Add new images here - all deployments should use registry.spdy.io/library/*
                  IMAGES="
                    library/busybox:latest
                    library/node:20-slim
                    library/node:22-slim
                    library/node:22
                    library/node:22-alpine
                    library/node:20-alpine
                    library/alpine:latest
                    library/nginx:alpine
                    library/python:3.11-slim
                    bitnami/kubectl:latest
                  "
                  for img in $IMAGES; do
                    echo "Syncing docker.io/$img -> $REGISTRY/$img"
                    crane copy "docker.io/$img" "$REGISTRY/$img" || echo "WARN: Failed $img"
                  done
                  echo "=== Sync complete ==="
              resources:
                limits:
                  memory: "256Mi"
                  cpu: "200m"
                requests:
                  memory: "128Mi"
                  cpu: "100m"

View File

@@ -16,6 +16,12 @@ spec:
# Each pod runs up to MAX_CONCURRENT_TASKS browsers (~400MB each)
# Scale pods for throughput, not concurrent tasks per pod
replicas: 8
# CRITICAL: Prevent pod count from EVER exceeding 8 during rollouts
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0 # Never create extra pods
maxUnavailable: 1 # Roll out 1 at a time
selector:
matchLabels:
app: scraper-worker
@@ -33,7 +39,7 @@ spec:
args: ["dist/tasks/task-worker.js"]
envFrom:
- configMapRef:
name: scraper-config
name: cannaiq-config
- secretRef:
name: scraper-secrets
env:
@@ -51,11 +57,17 @@ spec:
# 3 browsers × ~400MB = ~1.3GB (safe for 2GB pod limit)
- name: MAX_CONCURRENT_TASKS
value: "3"
# Task Pool System (geo-based pools)
# Correct flow: check pools → claim pool → get proxy → preflight → pull tasks
- name: USE_TASK_POOLS
# Session Pool: CORRECT FLOW - claim tasks first, then get IP
# 1. Worker claims tasks (no IP yet)
# 2. Get city/state from task
# 3. Get IP matching that city/state
# 4. Execute tasks with that IP
# 5. Retire IP (8hr cooldown)
- name: USE_SESSION_POOL
value: "true"
# Disable legacy identity pool
# Disable legacy modes (wrong flow - get IP before tasks)
- name: USE_TASK_POOLS
value: "false"
- name: USE_IDENTITY_POOL
value: "false"
resources:

View File

@@ -1,20 +1,10 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: scraper-images-pvc
namespace: cannaiq
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: scraper
namespace: cannaiq
labels:
app: scraper
spec:
replicas: 1
selector:
@@ -25,27 +15,22 @@ spec:
labels:
app: scraper
spec:
serviceAccountName: scraper-sa
imagePullSecrets:
- name: regcred
- name: gitea-registry
containers:
- name: scraper
image: git.spdy.io/creationshop/cannaiq:latest
image: registry.spdy.io/cannaiq/backend:latest
imagePullPolicy: Always
ports:
- containerPort: 3010
- containerPort: 3000
envFrom:
- configMapRef:
name: scraper-config
- secretRef:
name: scraper-secrets
volumeMounts:
- name: images-storage
mountPath: /app/public/images
name: cannaiq-config
# Liveness probe: restarts pod if it becomes unresponsive
livenessProbe:
httpGet:
path: /health
port: 3010
port: 3000
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 10
@@ -54,7 +39,7 @@ spec:
readinessProbe:
httpGet:
path: /health
port: 3010
port: 3000
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
@@ -64,9 +49,5 @@ spec:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
memory: "2Gi"
cpu: "1000m"
volumes:
- name: images-storage
persistentVolumeClaim:
claimName: scraper-images-pvc

Some files were not shown because too many files have changed in this diff.