Commit Graph

51 Commits

Author SHA1 Message Date
Kelly
e826a4dd3e fix: Add consecutive_failures column to migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:20:59 -07:00
Kelly
95c23fcdff fix: Add consecutive_successes column to dispensaries table
Required for stage checkpoint tracking in task handlers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:15:59 -07:00
Kelly
271faf0f00 perf: Optimize dashboard queries for faster load times
- Use pg_stat for approximate product count (instant vs full scan)
- LIMIT on DISTINCT queries for brand/category counts
- Single combined query (reduces round trips)
- Add index on store_product_snapshots.captured_at
- Add index on worker_tasks.worker_id and created_at

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:09:02 -07:00
Kelly
54f59c6082 fix(analytics): Fix market-summary store count and add search indexes
- market-summary now counts from store_products table (not product_variants)
- Added trigram indexes for fast ILIKE product searches

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:35:17 -07:00
Kelly
1d6211db19 perf: Add store_intelligence_cache for fast /intelligence/stores
- Remove costly correlated subquery (snapshot_count) from /stores endpoint
- Add migration 092 for store_intelligence_cache table
- Update analytics_refresh to populate cache with pre-computed metrics
- Add /intelligence/stores/cached endpoint using cache table

Performance: O(n*m) → O(1) for snapshot counts, ~10x faster response

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-13 00:13:41 -07:00
Kelly
e62f927218 feat: Auto-retry failed proxies after cooldown period
- Add last_failed_at column to track failure time
- Failed proxies auto-retry after 4 hours (configurable)
- Proxies permanently failed after 10 failures
- Add /retry-stats and /reenable-failed API endpoints
- markProxySuccess() re-enables recovered proxies

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:08:44 -07:00
Kelly
675f42841e feat: Import 500 Evomi residential proxies
- Update unique constraint to include username/password for session-based proxies
- All proxies imported as inactive (run Test All to verify and activate)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:05:21 -07:00
Kelly
2d489e068b fix: Correct mv_state_metrics to use brand_name_raw
- Changed unique_brands from COUNT(brand_id) to COUNT(brand_name_raw)
- brand_id is often NULL, brand_name_raw has actual data
- AZ now correctly shows 462 brands (was 144)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 23:50:53 -07:00
Kelly
470097eb19 fix: Intelligence stores endpoint and UI consistency
- Fix stores endpoint to only show stores with actual products (INNER JOIN + HAVING)
- Update badge colors to match Workers/Tasks dashboard style
- Use emerald/amber/red/gray color scheme consistently
- Chain badge now uses purple (bg-purple-100)
- Add migration 092 to fix Trulieve store URLs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 23:37:28 -07:00
Kelly
55b26e9153 feat: Auto-healing entry_point_discovery with browser-first transport
- Rewrote entry_point_discovery with auto-healing scheme:
  1. Check dutchie_discovery_locations for existing platform_location_id
  2. Browser-based GraphQL with 5x network retries
  3. Mark as needs_investigation on hard failure
- Browser (Puppeteer) is now DEFAULT transport - curl only when explicit
- Added migration 091 for tracking columns:
  - last_store_discovery_at: When store_discovery updated record
  - last_payload_at: When last product payload was saved
- Updated CODEBASE_MAP.md with transport rules documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-12 22:55:21 -07:00
Kelly
c62f8cbf06 feat: Parallelized store discovery, modification tracking, and task deduplication
Store Discovery Parallelization:
- Add store_discovery_state handler for per-state parallel discovery
- Add POST /api/tasks/batch/store-discovery endpoint
- 8 workers can now process states in parallel (~30-45 min vs 3+ hours)

Modification Tracking (Migration 090):
- Add last_modified_at, last_modified_by_task, last_modified_task_id to dispensaries
- Add same columns to store_products
- Update all handlers to set tracking info on modifications

Stale Task Recovery:
- Add periodic stale cleanup every 10 minutes (worker-0 only)
- Prevents orphaned tasks from blocking queue after worker crashes

Task Deduplication:
- createStaggeredTasks now skips if pending/active task exists for same role
- Skips if same role completed within last 4 hours
- API responses include skipped count

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-12 22:15:04 -07:00
kelly
fd4a9b1434 Merge pull request 'feat(scheduler): Immutable schedules and HTTP-only pipeline' (#61) from feat/proxy-reload-and-bulk-import into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/61
2025-12-12 16:37:16 +00:00
Kelly
832ef1cf83 feat(scheduler): Immutable schedules and HTTP-only pipeline
## Changes
- **Migration 089**: Add is_immutable and method columns to task_schedules
  - Per-state product_discovery schedules (4h default)
  - Store discovery weekly (168h)
  - All schedules use HTTP transport (Puppeteer/browser)
- **Task Scheduler**: HTTP-only product discovery with per-state scheduling
  - Each state has its own immutable schedule
  - Schedules can be edited (interval/priority) but not deleted
- **TasksDashboard UI**: Full immutability support
  - Lock icon for immutable schedules
  - State and Method columns in schedules table
  - Disabled delete for immutable, restricted edit fields
- **Store Discovery HTTP**: Auto-queue product_discovery for new stores
- **Migration 088**: Discovery payloads storage schema

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 09:24:08 -07:00
kelly
65b96d9cb9 Merge pull request 'feat(workers): Add proxy reload, staggered tasks, and bulk proxy import' (#58) from feat/proxy-reload-and-bulk-import into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/58
2025-12-12 09:11:23 +00:00
Kelly
f82eed4dc3 feat(workers): Add proxy reload, staggered tasks, and bulk proxy import
- Periodic proxy reload: Workers now reload proxies every 60s to pick up changes
- Staggered task scheduling: New API endpoints for creating tasks with delays
- Bulk proxy import: Script supports multiple URL formats including host:port:user:pass
- Proxy URL column: Migration 086 adds proxy_url for non-standard formats

Key changes:
- crawl-rotator.ts: Added reloadIfStale(), isStale(), setReloadInterval()
- task-worker.ts: Calls reloadIfStale() in main loop
- task-service.ts: Added createStaggeredTasks() and createAZStoreTasks()
- tasks.ts: Added POST /batch/staggered and /batch/az-stores endpoints
- import-proxies.ts: New script for bulk proxy import
- CLAUDE.md: Documented staggered task workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 01:53:15 -07:00
kelly
d86190912f Merge pull request 'feat(api): Add payload query API and trusted origins management' (#51) from feat/query-api-and-trusted-origins into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/51
2025-12-12 07:49:54 +00:00
Kelly
ce081effd4 feat(workers): Add preflight schema and StatefulSet
- Migration 085: Add curl_ip, http_ip, fingerprint_data, preflight_status,
  preflight_at columns to worker_registry
- StatefulSet manifest for 8 persistent workers with OnDelete update strategy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 23:45:04 -07:00
Kelly
daab0ae9b2 feat(api): Add payload query API and trusted origins management
Query API:
- GET /api/payloads/store/:id/query - Filter products with flexible params
  (brand, category, price_min/max, thc_min/max, search, sort, pagination)
- GET /api/payloads/store/:id/aggregate - Group by brand/category with metrics
  (count, avg_price, min_price, max_price, avg_thc, in_stock_count)
- Documentation at docs/QUERY_API.md

Trusted Origins Admin:
- GET/POST/PUT/DELETE /api/admin/trusted-origins - Manage auth bypass list
- Trusted IPs, domains, and regex patterns stored in DB
- 5-minute cache with invalidation on admin updates
- Fallback to hardcoded defaults if DB unavailable
- Migration 085 creates table with seed data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 23:28:05 -07:00
Kelly
cdab71a1ee feat(workers): Add dual-transport preflight system
Workers now run both curl and http (Puppeteer) preflights on startup:
- curl-preflight.ts: Tests axios + proxy via httpbin.org
- puppeteer-preflight.ts: Tests browser + StealthPlugin via fingerprint.com
  (with amiunique.org fallback)
- Migration 084: Adds preflight columns to worker_registry and method
  column to worker_tasks
- Workers report preflight status, IP, fingerprint, and response time
- Tasks can require specific transport method (curl/http)
- Dashboard shows Transport column with preflight status badges

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 22:47:52 -07:00
Kelly
c84ef0396b feat(tasks): Add proxy_test task handler and discovery run tracking
- Add proxy_test task handler that fetches IP via proxy to verify connectivity
- Add discovery_runs migration (083) for tracking store discovery progress
- Register proxy_test in task service and worker

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 21:07:58 -07:00
Kelly
8e2f07c941 feat(workers): Concurrent task processing with resource-based backoff
Workers can now process multiple tasks concurrently (default: 3 max).
Self-regulate based on resource usage - back off at 85% memory or 90% CPU.

Backend changes:
- TaskWorker handles concurrent tasks using async Maps
- Resource monitoring (memory %, CPU %) with backoff logic
- Heartbeat reports active_task_count, max_concurrent_tasks, resource stats
- Decommission support via worker_commands table

Frontend changes:
- Workers Dashboard shows tasks per worker (N/M format)
- Resource badges with color-coded thresholds
- Pod visualization with clickable selection
- Decommission controls per worker

New env vars:
- MAX_CONCURRENT_TASKS (default: 3)
- MEMORY_BACKOFF_THRESHOLD (default: 0.85)
- CPU_BACKOFF_THRESHOLD (default: 0.90)
- BACKOFF_DURATION_MS (default: 10000)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 11:47:24 -07:00
Kelly
cf06f4a8c0 feat(worker): Listen for proxy_added notifications
- Workers now use PostgreSQL LISTEN/NOTIFY to wake up immediately when proxies are added
- Added trigger on proxies table to NOTIFY 'proxy_added' when active proxy inserted/updated
- Falls back to 30s polling if LISTEN fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:58:00 -07:00
Kelly
4949b22457 feat(tasks): Refactor task workflow with payload/refresh separation
Major changes:
- Split crawl into payload_fetch (API → disk) and product_refresh (disk → DB)
- Add task chaining: store_discovery → product_discovery → payload_fetch → product_refresh
- Add payload storage utilities for gzipped JSON on filesystem
- Add /api/payloads endpoints for payload access and diffing
- Add DB-driven TaskScheduler with schedule persistence
- Track newDispensaryIds through discovery promotion for chaining
- Add stealth improvements: HTTP fingerprinting, proxy rotation enhancements
- Add Workers dashboard K8s scaling controls

New files:
- src/tasks/handlers/payload-fetch.ts - Fetches from API, saves to disk
- src/services/task-scheduler.ts - DB-driven schedule management
- src/utils/payload-storage.ts - Payload save/load utilities
- src/routes/payloads.ts - Payload API endpoints
- src/services/http-fingerprint.ts - Browser fingerprint generation
- docs/TASK_WORKFLOW_2024-12-10.md - Complete workflow documentation

Migrations:
- 078: Proxy consecutive 403 tracking
- 079: task_schedules table
- 080: raw_crawl_payloads table
- 081: payload column and last_fetch_at

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:15:35 -07:00
Kelly
56cc171287 feat: Stealth worker system with mandatory proxy rotation
## Worker System
- Role-agnostic workers that can handle any task type
- Pod-based architecture with StatefulSet (5-15 pods, 5 workers each)
- Custom pod names (Aethelgard, Xylos, Kryll, etc.)
- Worker registry with friendly names and resource monitoring
- Hub-and-spoke visualization on JobQueue page

## Stealth & Anti-Detection (REQUIRED)
- Proxies are MANDATORY - workers fail to start without active proxies
- CrawlRotator initializes on worker startup
- Loads proxies from `proxies` table
- Auto-rotates proxy + fingerprint on 403 errors
- 12 browser fingerprints (Chrome, Firefox, Safari, Edge)
- Locale/timezone matching for geographic consistency

## Task System
- Renamed product_resync → product_refresh
- Task chaining: store_discovery → entry_point → product_discovery
- Priority-based claiming with FOR UPDATE SKIP LOCKED
- Heartbeat and stale task recovery

## UI Updates
- JobQueue: Pod visualization, resource monitoring on hover
- WorkersDashboard: Simplified worker list
- Removed unused filters from task list

## Other
- IP2Location service for visitor analytics
- Findagram consumer features scaffolding
- Documentation updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 00:44:59 -07:00
Kelly
0295637ed6 fix: Public API column mappings and OOS detection
- Fix store_products column references (name_raw, brand_name_raw, category_raw)
- Fix v_product_snapshots column references (crawled_at, *_cents pricing)
- Fix dispensaries column references (zipcode, logo_image, remove hours/amenities)
- Add services and license_type to dispensary API response
- Add consecutive_misses OOS tracking to product-resync handler
- Add migration 075 for consecutive_misses column
- Add CRAWL_PIPELINE.md documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 20:44:53 -07:00
Kelly
89c262ee20 feat(tasks): Add unified task-based worker architecture
Replace fragmented job systems (job_schedules, dispensary_crawl_jobs, SyncOrchestrator)
with a single unified task queue:

- Add worker_tasks table with atomic task claiming via SELECT FOR UPDATE SKIP LOCKED
- Add TaskService for CRUD, claiming, and capacity metrics
- Add TaskWorker with role-based handlers (resync, discovery, analytics)
- Add /api/tasks endpoints for management and migration from legacy systems
- Add TasksDashboard UI and integrate task counts into main dashboard
- Add comprehensive documentation

Task roles: store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh

Run workers with: WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 16:27:03 -07:00
Kelly
91efd1d03d feat(images): Add local image storage with on-demand resizing
- Store product images locally with hierarchy: /images/products/<state>/<store>/<brand>/<product>/
- Add /img/* proxy endpoint for on-demand resizing via Sharp
- Implement per-product image checking to skip existing downloads
- Fix pathToUrl() to correctly generate /images/... URLs
- Add frontend getImageUrl() helper with preset sizes (thumb, medium, large)
- Update all product pages to use optimized image URLs
- Add stealth session support for Dutchie GraphQL crawls
- Include test scripts for crawl and image verification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:04:50 -07:00
Kelly
2f483b3084 feat: SEO template library, discovery pipeline, and orchestrator enhancements
## SEO Template Library
- Add complete template library with 7 page types (state, city, category, brand, product, search, regeneration)
- Add Template Library tab in SEO Orchestrator with accordion-based editors
- Add template preview, validation, and variable injection engine
- Add API endpoints: /api/seo/templates, preview, validate, generate, regenerate

## Discovery Pipeline
- Add promotion.ts for discovery location validation and promotion
- Add discover-all-states.ts script for multi-state discovery
- Add promotion log migration (067)
- Enhance discovery routes and types

## Orchestrator & Admin
- Add crawl_enabled filter to stores page
- Add API permissions page
- Add job queue management
- Add price analytics routes
- Add markets and intelligence routes
- Enhance dashboard and worker monitoring

## Infrastructure
- Add migrations for worker definitions, SEO settings, field alignment
- Add canonical pipeline for scraper v2
- Update hydration and sync orchestrator
- Enhance multi-state query service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 00:05:34 -07:00
Kelly
b7cfec0770 feat: AZ dispensary harmonization with Dutchie source of truth
Major changes:
- Add harmonize-az-dispensaries.ts script to sync dispensaries with Dutchie API
- Add migration 057 for crawl_enabled and dutchie_verified fields
- Remove legacy dutchie-az module (replaced by platforms/dutchie)
- Clean up deprecated crawlers, scrapers, and orchestrator code
- Update location-discovery to not fallback to slug when ID is missing
- Add crawl-rotator service for proxy rotation
- Add types/index.ts for shared type definitions
- Add woodpecker-agent k8s manifest

Harmonization script:
- Queries ConsumerDispensaries API for all 32 AZ cities
- Matches dispensaries by platform_dispensary_id (not slug)
- Updates existing records with full Dutchie data
- Creates new records for unmatched Dutchie dispensaries
- Disables dispensaries not found in Dutchie

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 10:19:49 -07:00
Kelly
bf4ceaf09e fix: Make all migrations idempotent
- 008: Add IF NOT EXISTS to ALTER TABLE ADD COLUMN
- 011: Add IF NOT EXISTS to CREATE TABLE and INDEX
- 012: Add IF NOT EXISTS, DROP TRIGGER IF EXISTS
- 013: Add ON CONFLICT (azdhs_id) DO NOTHING
- 014: Add IF NOT EXISTS to ALTER TABLE ADD COLUMN

All migrations can now be safely re-run without errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 23:48:35 -07:00
Kelly
f7081838cf feat: Add AI settings to database (provider/model configurable via UI)
- ai_provider and ai_model stored in settings table
- Editable via /settings page in admin UI
- API keys remain in env vars for security
- Falls back to env vars if settings not in DB

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 23:18:52 -07:00
Kelly
3bc0effa33 feat: Responsive admin UI, SEO pages, and click analytics
## Responsive Admin UI
- Layout.tsx: Mobile sidebar drawer with hamburger menu
- Dashboard.tsx: 2-col grid on mobile, responsive stats cards
- OrchestratorDashboard.tsx: Responsive table with hidden columns
- PagesTab.tsx: Responsive filters and table

## SEO Pages
- New /admin/seo section with state landing pages
- SEO page generation and management
- State page content with dispensary/product counts

## Click Analytics
- Product click tracking infrastructure
- Click analytics dashboard

## Other Changes
- Consumer features scaffolding (alerts, deals, favorites)
- Health panel component
- Workers dashboard improvements
- Legacy DutchieAZ pages removed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:48:21 -07:00
Kelly
b4a2fb7d03 feat: Add v2 architecture with multi-state support and orchestrator services
Major additions:
- Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare
- Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator
- Discovery system: dutchie discovery service, geo validation, city seeding scripts
- Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages
- Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram)
- Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata

Frontend pages added:
- Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores
- StateHeatmap, CrossStateCompare, SyncInfoPanel

Components added:
- StateSelector, OrchestratorTraceModal, WorkflowStepper

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:30:57 -07:00
Kelly
8ac64ba077 feat(cannaiq): Add Workers Dashboard and visibility tracking
Workers Dashboard:
- New /workers route with two-pane layout
- Workers table showing Alice, Henry, Bella, Oscar with role badges
- Run history with visibility stats (lost/restored counts)
- "Run Now" action to trigger workers immediately

Migrations:
- 057: Add visibility tracking columns (visibility_lost, visibility_lost_at, visibility_restored_at)
- 058: Add ID resolution columns for Henry worker
- 059: Add job queue columns (max_retries, retry_count, worker_id, locked_at, locked_by)

Backend fixes:
- Add httpStatus to CrawlResult interface for error classification
- Fix pool.ts typing for event listener
- Update completeJob to accept visibility stats in metadata

Frontend fixes:
- Fix NationalDashboard crash with safe formatMoney helper
- Fix OrchestratorDashboard/Stores StoreInfo type mismatches
- Add workerName/workerRole to getDutchieAZSchedules API type

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:04:12 -07:00
Kelly
a0f8d3911c feat: Add Findagram and FindADispo consumer frontends
- Add findagram.co React frontend with product search, brands, categories
- Add findadispo.com React frontend with dispensary locator
- Wire findagram to backend /api/az/* endpoints
- Update category/brand links to route to /products with filters
- Add k8s manifests for both frontends
- Add multi-domain user support migrations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 16:10:15 -07:00
Kelly
66e07b2009 fix(monitor): remove non-existent worker columns from job_run_logs query
The job_run_logs table tracks scheduled job orchestration, not individual
worker jobs. Worker info (worker_id, worker_hostname) belongs on
dispensary_crawl_jobs, not job_run_logs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 18:45:05 -07:00
Kelly
04b5c3bd09 Add CLAUDE guidelines for consolidated pipeline 2025-12-02 13:28:23 -07:00
Kelly
9219d8a77a Add brand history view to dutchie GraphQL migration 2025-12-02 11:34:01 -07:00
Kelly
acecb2b8ee Add dispensary_crawl_status view to consolidated schema 2025-12-02 11:31:39 -07:00
Kelly
ca80f30c8a Fix dashboard views for consolidated schema 2025-12-02 11:28:46 -07:00
Kelly
08b1994d0e Add AZ dashboard views to consolidated schema 2025-12-02 11:15:33 -07:00
Kelly
e345707db2 Add store selection to API permissions
- Add store_id and store_name columns to wp_dutchie_api_permissions
- Backend: Add /stores endpoint, require store_id when creating permissions
- Frontend: Add store selector dropdown to API Permissions form
- WordPress plugin v1.3.0: Remove store_id from shortcodes (store is tied to token)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 13:59:01 -07:00
Kelly
199b6a8a23 Remove incorrect migration 029, add snapshot architecture, improve scraper
- Delete migration 029 that was incorrectly creating duplicate dispensaries
- Add migration 028 for snapshot architecture
- Improve downloader with proxy/UA rotation
- Update scraper monitor and tools pages
- Various scraper improvements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 08:52:54 -07:00
Kelly
855b9bfd16 Add migration to link Dutchie stores to dispensaries
Migration 029: Creates dispensary records for stores with dutchie_url
that don't have a dispensary_id yet, then links them.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 08:22:37 -07:00
Kelly
c9947eb1de Add brand history tracking for new/dropped brands detection
- Add first_seen_at/last_seen_at columns to brands table
- Create brand_history table for event tracking (added, dropped, returned)
- Add dispensary_brand_stats view for dashboard

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 00:18:47 -07:00
Kelly
9de0d709b2 Update admin panel to use unified dispensaries table
- Add migration 026 to update dispensary_crawl_status view with new fields
- Update dashboard API to use dispensaries table (not stores)
- Show current inventory counts (products seen in last 7 days)
- Update ScraperSchedule UI to show provider_type correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 00:13:41 -07:00
Kelly
9d8972aa86 Fix category-crawler-jobs store lookup query
- Fix column name from s.dutchie_plus_url to s.dutchie_url
- Add availability tracking and product freshness APIs
- Add crawl script for sequential dispensary processing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-01 00:07:00 -07:00
Kelly
1fb8f84929 Add crawl fields directly to dispensaries table
Migration 025: Dispensaries ARE stores - add crawl metadata fields
(menu_url, provider_type, scrape_enabled, crawl_status, etc.)
directly to dispensaries table instead of maintaining separate stores table.

- Copies menu_url from 22 existing stores to their dispensaries
- Migrates products from store_id to dispensary_id
- Detects provider_type from menu_url domain
- Adds indexes for crawl scheduling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 23:52:48 -07:00
Kelly
5306c3f4ca Switch scheduler UI to dispensary-based API
- Add migrations 021-023 for dispensary_crawl_schedule tables and views
- Add dispensary-orchestrator service and bootstrap-discovery script
- Update schedule routes with dispensary endpoints (/api/schedule/dispensaries)
- Switch frontend scheduler to use canonical dispensaries table (182 AZDHS entries)
- Add dispensary schedule API methods to frontend api.ts
- Remove "Unmapped" badge logic - all dispensaries now linked properly
- Add proper URL linking to dispensary detail pages (/dispensaries/:state/:city/:slug)
- Update Jobs table to show dispensary_name

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 18:54:52 -07:00
Kelly
3861a31a3b Add crawler scheduler, orchestrator, and multi-category intelligence
- Add scheduler UI with store schedules, job queue, and global settings
- Add store crawl orchestrator for intelligent crawl workflow
- Add multi-category intelligence detection (product, specials, brands, metadata)
- Add CrawlerLogger for structured JSON logging
- Add migrations for scheduler tables and dispensary linking
- Add dispensary → scheduler navigation link
- Support production/sandbox crawler modes per provider

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 09:29:15 -07:00