- Add SSR config extraction for Treez sites (BEST Dispensary)
- Increase MAX_RETRIES from 3 to 5 for task failures
- Update task list ordering: active > pending > failed > completed
- Show detected proxy location in worker dashboard (from fingerprint)
- Hardcode 'dutchie' menu_type in promotion.ts (remove deriveMenuType)
- Update provider display to show actual provider names
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
raw_crawl_payloads only saved during baseline window (12:01-3:00 AM),
but inventory_snapshots are always saved. This caused product_discovery
tasks to fail verification outside the baseline window.
Changed name -> name_raw and brand -> brand_name_raw to match
store_products table schema.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tags assigned per store:
- must_win: High-revenue store with room to grow SKUs
- at_risk: High OOS% (losing shelf presence)
- top_performer: High sales + good inventory management
- growth: Above-average velocity
- low_inventory: Low days on hand
Configurable via query params:
- ?must_win_max_skus=5
- ?at_risk_oos_pct=30
- ?top_performer_max_oos=15
- ?low_inventory_days=7
Response includes tag_thresholds showing applied values.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ?margin_pct query param (default 50% industry standard)
- Returns margin_pct and margin_est per store
- Includes margin_pct_assumed in response metadata
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive brand-level analytics endpoints at /api/brands:
Brand Discovery:
- GET /api/brands - List all brands with summary metrics
- GET /api/brands/search - Search brands by name
- GET /api/brands/top - Top brands by distribution
Brand Overview:
- GET /api/brands/:brand - Full brand intelligence dashboard
- GET /api/brands/:brand/analytics - Alias for overview
Sales & Velocity:
- GET /api/brands/:brand/sales - Sales data (4wk, daily avg)
- GET /api/brands/:brand/velocity - Units/day by SKU
- GET /api/brands/:brand/trends - Weekly sales trends
Inventory & Stock:
- GET /api/brands/:brand/inventory - Current stock levels
- GET /api/brands/:brand/oos - Out-of-stock products
- GET /api/brands/:brand/low-stock - Products below threshold
Pricing:
- GET /api/brands/:brand/pricing - Current prices
- GET /api/brands/:brand/price-history - Price changes over time
Distribution:
- GET /api/brands/:brand/distribution - Store count, market coverage
- GET /api/brands/:brand/stores - Stores carrying brand
- GET /api/brands/:brand/gaps - Whitespace opportunities
Events & Alerts:
- GET /api/brands/:brand/events - Visibility events
- POST /api/brands/:brand/events/:id/ack - Acknowledge alert
Products:
- GET /api/brands/:brand/products - All SKUs with metrics
- GET /api/brands/:brand/products/:sku - Single product deep dive
All endpoints support ?state=XX, ?days=N, and ?category=X filters.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add interval_minutes column to TaskSchedule interface
- Prefer interval_minutes over interval_hours when calculating next_run_at
- Add jitter (0-20% of interval) for sub-hour schedules to prevent detection
- Update getSchedules() to include interval_minutes and dispensary_name
- Update updateSchedule() to allow setting interval_minutes
- Add migration 121 for interval_minutes column
Part of Real-Time Inventory Tracking feature.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace saveRawPayload with saveDailyBaseline in all handlers
- Full payloads only saved once per day per store during window
- Inventory snapshots still saved every crawl (lightweight tracking)
- Add last_baseline_at column to dispensaries table
- Show baseline status in Per-Store Schedules dashboard
- Display baseline window info (12:01 AM - 3:00 AM) in UI
Reduces storage ~95% for high-frequency stores while maintaining
full audit capability via daily baselines.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update buildProductQuery() to use match_all by default
- Captures hidden, below-threshold, and out-of-stock products
- Add extractPrimaryImage() and extractImages() to normalizer
- Add product_refresh_treez handler for platform-specific refresh
- Add product_refresh_treez to TaskRole type
Best Dispensary: 228 → 271 products (+43)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add PlatformBadge component showing D=Dutchie, J=Jane, T=Treez
- Include platform field in worker-registry API response
- Fix null running_seconds displaying as "nulls"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1. task-worker.ts: Pass full result object to completeTask instead of
non-existent result.data property (was causing {} to be stored)
2. WorkersDashboard.tsx: Handle null running_seconds in formatSecondsToTime
(was displaying "nulls" due to JS type coercion)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add task completion verification with DB and output layers
- Add reconciliation loop to sync worker memory with DB state
- Implement IP-per-store-per-platform conflict detection
- Add task ID hash to MinIO payload filenames for traceability
- Fix schedule edit modal with dispensary info in API responses
- Add task ID display after dispensary name in worker dashboard
- Add migrations for proxy_ip and source tracking columns
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Set includeEnterpriseSpecials: true to get BOGO/sale deal names
- Set Status: 'All' to capture both Active and Inactive (sold out) products
- Make schedules query backward-compatible for missing pool_id column
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tasks were being created for dispensaries with crawl_enabled=false
(duplicates, deprecated stores). Added EXISTS check to filter only
crawl_enabled=true stores before creating tasks.
This prevents errors like:
"Dispensary 207 not found or not crawl_enabled"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tasks Dashboard:
- Add clickable Pool Open/Paused toggle button in header
- Add sortable columns (ID, Role, Store, Status, Worker, Duration, Created)
- Show menu_type and pool badges under Store column
- Add Pool column to Schedules table
- Filter stores by platform in Create Task modal
Workers Dashboard:
- Redesign pod visualization to show 3 worker slots per pod
- Each slot shows preflight checklist (Overload? Terminating? Pool Query?)
- Once qualified, shows City/State, Proxy IP, Antidetect status
- Hover shows full fingerprint data (browser, platform, bot detection)
Backend:
- Add menu_type to listTasks query
- Add pool_id/pool_name to schedules query with task_pools JOIN
- Migration 114: Add pool_id column to task_schedules table
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Auto-migrate fails early, so task_pools may not exist yet.
Check table existence before including pool columns/joins.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Workers now follow the correct flow:
1. Check what pools have pending tasks
2. Claim a pool (e.g., Phoenix AZ)
3. Get Evomi proxy for that geo
4. Run preflight with geo proxy
5. Pull tasks from pool (up to 6 stores)
6. Execute tasks
7. Release pool when exhausted (6 stores visited)
Task pools group dispensaries by metro area (100mi radius):
- Phoenix AZ, Tucson AZ
- Los Angeles CA, San Francisco CA, San Diego CA, Sacramento CA
- Denver CO, Chicago IL, Boston MA, Detroit MI
- Las Vegas NV, Reno NV, Newark NJ, New York NY
- Oklahoma City OK, Tulsa OK, Portland OR, Seattle WA
Benefits:
- Workers know geo BEFORE getting proxy (no more "No geo assigned")
- IP diversity within metro area (Phoenix worker can use Tempe IP)
- Simpler worker logic - just match pool geo
- Pre-organized tasks, not grouped at claim time
New files:
- migrations/113_task_pools.sql - schema, seed data, functions
- src/services/task-pool.ts - TypeScript service
Env vars:
- USE_TASK_POOLS=true (new system)
- USE_IDENTITY_POOL=false (disabled)
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Workers were showing "No geo assigned" on dashboard because geo info
was set internally but never reported to worker_registry after
identity pool claim.
Now updates current_state and current_city columns when identity
is claimed, so dashboard shows correct geo assignment.
Also documents CI/CD batching rule to minimize build time.
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Removes session parameter from Evomi proxy URL so each request
gets a different IP. Prevents all workers from sharing same IP.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Try ip-api.com first, then ipapi.co as fallback
- If both fail, use state coords from targetState param
- Prevents workers from getting stuck in preflight loop
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace hardcoded state coords with IP geolocation lookup via ip-api.com
- Browser timezone and geolocation now match actual proxy IP location
- City-level proxy targeting already in place via Evomi _city- parameter
- Add browser-factory.ts shared utility for antidetect setup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use denormalized d.product_count instead of JOIN to store_products
- Remove expensive per-product aggregations (avg_price, brand counts, stock)
- Query now runs in <100ms instead of 7.5s
The massive JOIN between dispensaries and store_products was causing
the slow load. State metrics now use pre-computed product_count column.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New worker flow (enabled via USE_SESSION_POOL=true):
1. Worker claims up to 6 tasks for same geo (atomically marked claimed)
2. Gets Evomi proxy for that geo
3. Checks IP availability (not in use, not in 8hr cooldown)
4. Locks IP exclusively to this worker
5. Runs preflight with locked IP
6. Executes tasks (3 concurrent)
7. After 6 tasks, retires session (8hr IP cooldown)
8. Repeats with new IP
Key files:
- migrations/112_worker_session_pool.sql: Session table + atomic claiming
- services/worker-session.ts: Session lifecycle management
- tasks/task-worker.ts: sessionPoolMainLoop() with new flow
- services/crawl-rotator.ts: setFixedProxy() for session locking
Failed tasks return to pending for retry by another worker.
No two workers can share same IP simultaneously.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Workers without geo now show orange "NO GEO" badge instead of gold qualified
- Orange ring + X badge on avatar when preflight OK but no geo
- Gold ring + checkmark only when fully qualified (preflight + geo)
- Add VenetianMask icon for antidetect status indicator
- Lock K8s replica count at exactly 8 pods in CLAUDE.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- pipeline.ts: Replace correlated subquery with d.product_count
- consumer-favorites.ts: Replace correlated subquery with d.product_count
Correlated subqueries were causing N+1 query patterns. Using the
denormalized column is O(1) lookup per row.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use dispensaries.product_count instead of correlated subquery
- Add 1 minute in-memory cache for /dashboard/activity
- Reduces query time from ~30s to <100ms
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Backend:
- Add active_tasks array to GET /worker-registry/workers response
- Include task details: role, dispensary, running_seconds, progress
Frontend:
- Show nested task list under each worker with duration
- Display empty slots when worker has capacity
- Update pod visualization to show 3 task slot nodes
- Active slots pulse blue, empty slots gray
- Hover for task details (dispensary, duration, progress)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix buildEvomiProxyUrl to use passed session ID from identity pool
instead of truncating to worker+region (causing same IP for all workers)
- Add task pool gate feature with database-backed state
- Add /tasks/pool/toggle endpoint and UI toggle button
- Fix isTaskPoolPaused() missing await in claimTask
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create trusted_origins table for DB-backed origin management
- Add API routes for CRUD operations on trusted origins
- Add tabbed interface on /users page with Users and Trusted Origins tabs
- Seeds default trusted origins (cannaiq.co, findadispo.com, findagram.co, etc.)
- Fix TypeScript error in WorkersDashboard fingerprint type
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rewrites Treez platform client to use CDP (Chrome DevTools Protocol)
interception instead of DOM scraping. Key changes:
- Uses Puppeteer Stealth plugin to bypass headless detection
- Intercepts Elasticsearch API responses via CDP Network.responseReceived
- Captures full product data including inventory levels (availableUnits)
- Adds comprehensive TypeScript types for all Treez data structures
- Updates queries.ts with automatic session management
- Fixes product-discovery-treez handler for new API shape
Tested with Best Dispensary: 142 products across 10 categories captured
with inventory data, pricing, and lab results.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add worker_identities table and metro_areas for city groupings
- Create IdentityPoolService for claiming/releasing identities
- Each identity used for 3-5 tasks, then 2-3 hour cooldown
- Integrate with task-worker via USE_IDENTITY_POOL feature flag
- Update puppeteer-preflight to accept custom proxy URLs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Was using rpc.evomi.com which doesn't exist.
Residential proxies use rp.evomi.com:1000
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The preflight was only checking DB proxy pool which was empty.
Now checks Evomi API first (when configured), then falls back to DB.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Platform badge now shows green (emerald) for dutchie even when null/undefined
- State badge shows "ALL" (uppercase) with indigo color when no state specified
- Remove "(HTTP transport)" from store discovery description
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Check Evomi API availability before waiting for DB proxies
- If EVOMI_USER/EVOMI_PASS configured, proceed immediately
- Only fall back to DB proxy polling if Evomi not configured
- Added clear comments explaining proxy initialization order
This fixes workers getting stuck waiting for DB proxies when
Evomi API is available for on-demand geo-targeted proxies.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add WorkerFingerprint interface with timezone, city, state, ip, locale
- Store fingerprint in TaskWorker after preflight passes
- Pass fingerprint through TaskContext to handlers
- Apply timezone via CDP and locale via Accept-Language header
- Ensures browser fingerprint matches proxy IP location
This fixes anti-detect detection where timezone/locale mismatch
with proxy IP was getting blocked by Cloudflare.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add fetchProductsByStoreIdDirect() for reliable Algolia product fetching
- Update product-discovery-jane to use direct Algolia instead of network interception
- Fix product-refresh handler to support both Dutchie and Jane payloads
- Handle both `products` (Dutchie) and `hits` (Jane) formats
- Use platform-appropriate raw_json structure for normalizers
- Fix consecutive_misses tracking to use correct provider
- Extract product IDs correctly (Dutchie _id vs Jane product_id)
- Add store discovery deduplication (prefer REC over MED at same location)
- Add storeTypes field to DiscoveredStore interface
- Add scripts: run-jane-store-discovery.ts, run-jane-product-discovery.ts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Workers are now geo-locked to a specific state for their session:
- Session = 60 minutes OR 7 store visits (whichever comes first)
- Workers ONLY claim tasks matching their assigned state
- State assignment prioritizes: most pending tasks, fewest workers
Changes:
- Migration 108: geo session columns, claim_task with geo filter,
assign_worker_geo(), check_worker_geo_session(), worker_state_capacity view
- task-worker.ts: ensureGeoSession() method before task claiming
- worker-registry.ts: /state-capacity and /geo-sessions API endpoints
- WorkersDashboard: Show qualified icon + geo state in Preflight column
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Platform isolation:
- Rename handlers to {task}-{platform}.ts convention
- Deprecate -curl variants (now _deprecated-*)
- Platform-based routing in task-worker.ts
- Add Jane platform handlers and client
Evomi geo-targeting:
- Add dynamic proxy URL builder with state/city targeting
- Session stickiness per worker per state (30 min)
- Fallback to static proxy table when API unavailable
- Add proxy tracking columns to worker_tasks
Proxy management:
- New /proxies admin page for visibility
- Track proxy_ip, proxy_geo, proxy_source per task
- Show active sessions and task history
Validation filtering:
- Filter by validated stores (platform_dispensary_id + menu_url)
- Mark incomplete stores as deprecated
- Update all dashboard/stats queries
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename 'store_discovery_dutchie' to 'Store Discovery' (platform badge via platform field)
- Add self-healing: scan for stores missing payloads and queue product_discovery
- Catches stores added before chaining was implemented
- Limits to 50 stores per run to avoid overwhelming the system
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>