cannaiq

Author	SHA1	Message	Date
Kelly	3582c2e9e2	fix(k8s): Use external Postgres/Redis/MinIO services - Update secrets.yaml with correct MinIO credentials - Add Redis connection details - Remove postgres.yaml (use external 10.100.6.50) - Remove redis.yaml (use external 10.100.9.50)	2025-12-15 19:03:05 -07:00
Kelly	a8360c7260	feat: Migrate to spdy.io infrastructure - Namespace: dispensary-scraper → cannaiq - Registry: code.cannabrands.app → git.spdy.io - Database: External PostgreSQL at 10.100.6.50 - MinIO: Internal at 10.100.9.80:9000 - CI: ci.spdy.io 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 06:40:48 -07:00
Kelly	1861e18396	feat(workers): Implement geo-based task pools Workers now follow the correct flow: 1. Check what pools have pending tasks 2. Claim a pool (e.g., Phoenix AZ) 3. Get Evomi proxy for that geo 4. Run preflight with geo proxy 5. Pull tasks from pool (up to 6 stores) 6. Execute tasks 7. Release pool when exhausted (6 stores visited) Task pools group dispensaries by metro area (100mi radius): - Phoenix AZ, Tucson AZ - Los Angeles CA, San Francisco CA, San Diego CA, Sacramento CA - Denver CO, Chicago IL, Boston MA, Detroit MI - Las Vegas NV, Reno NV, Newark NJ, New York NY - Oklahoma City OK, Tulsa OK, Portland OR, Seattle WA Benefits: - Workers know geo BEFORE getting proxy (no more "No geo assigned") - IP diversity within metro area (Phoenix worker can use Tempe IP) - Simpler worker logic - just match pool geo - Pre-organized tasks, not grouped at claim time New files: - migrations/113_task_pools.sql - schema, seed data, functions - src/services/task-pool.ts - TypeScript service Env vars: - USE_TASK_POOLS=true (new system) - USE_IDENTITY_POOL=false (disabled) 🤖 Generated with [Claude Code](https://claude.ai/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-12-14 01:41:52 -07:00
Kelly	63023a4061	feat: Worker improvements and Run Now duplicate prevention - Fix Run Now to prevent duplicate task creation - Add loading state to Run Now button in UI - Return early when no stores need refresh - Worker dashboard improvements - Browser pooling architecture updates - K8s worker config updates (8 replicas, 3 concurrent tasks)	2025-12-12 20:11:31 -07:00
Kelly	a35976b9e9	chore: Clean up deprecated code and docs - Move deprecated directories to src/_deprecated/: - hydration/ (old pipeline approach) - scraper-v2/ (old Puppeteer scraper) - canonical-hydration/ (merged into tasks) - Unused services: availability, crawler-logger, geolocation, etc - Unused utils: age-gate-playwright, HomepageValidator, stealthBrowser - Archive outdated docs to docs/_archive/: - ANALYTICS_RUNBOOK.md - ANALYTICS_V2_EXAMPLES.md - BRAND_INTELLIGENCE_API.md - CRAWL_PIPELINE.md - TASK_WORKFLOW_2024-12-10.md - WORKER_TASK_ARCHITECTURE.md - ORGANIC_SCRAPING_GUIDE.md - Add docs/CODEBASE_MAP.md as single source of truth - Add warning files to deprecated/archived directories - Slim down CLAUDE.md to essential rules only 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 22:17:40 -07:00
Kelly	e918234928	feat(ci): Add npm cache volume for faster typechecks - Create PVC for shared npm cache across CI jobs - Configure Woodpecker agent to allow npm-cache volume mount - Update typecheck steps to use shared cache directory - First run populates cache, subsequent runs are ~3-4x faster 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 19:27:32 -07:00
Kelly	fdce5e0302	fix(workers): Fix false memory backoff and add backing-off color coding - Fix memory calculation to use max-old-space-size (1500MB) instead of V8's dynamic heapTotal. This prevents false 95%+ readings when idle. - Add yellow color for backing-off workers in pod visualization - Update legend and tooltips with backing-off status - Remove pool toggle from TasksDashboard (moved to Workers page) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 19:11:42 -07:00
Kelly	8e2f07c941	feat(workers): Concurrent task processing with resource-based backoff Workers can now process multiple tasks concurrently (default: 3 max). Self-regulate based on resource usage - back off at 85% memory or 90% CPU. Backend changes: - TaskWorker handles concurrent tasks using async Maps - Resource monitoring (memory %, CPU %) with backoff logic - Heartbeat reports active_task_count, max_concurrent_tasks, resource stats - Decommission support via worker_commands table Frontend changes: - Workers Dashboard shows tasks per worker (N/M format) - Resource badges with color-coded thresholds - Pod visualization with clickable selection - Decommission controls per worker New env vars: - MAX_CONCURRENT_TASKS (default: 3) - MEMORY_BACKOFF_THRESHOLD (default: 0.85) - CPU_BACKOFF_THRESHOLD (default: 0.90) - BACKOFF_DURATION_MS (default: 10000) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 11:47:24 -07:00
Kelly	a880c41d89	feat: Add password confirmation for worker scaling + RBAC - Add /api/auth/verify-password endpoint for re-authentication - Add PasswordConfirmModal component for sensitive actions - Worker scaling (+/-) now requires password confirmation - Add RBAC (ServiceAccount, Role, RoleBinding) for scraper pod - Scraper pod can now read/scale worker deployment via k8s API 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 09:16:27 -07:00
Kelly	a252a7fefd	feat(tasks): 25 workers, pool starts paused by default - Increase worker replicas from 5 to 25 - Task pool now starts PAUSED on deploy, admin must click Start Pool - Prevents workers from grabbing tasks before system is ready 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-11 01:19:02 -07:00
Kelly	a2fa21f65c	fix(worker): Wait for proxies instead of crashing on startup - Task worker now waits up to 60 minutes for active proxies - Retries every 30 seconds with clear logging - Updated K8s scraper-worker.yaml with Deployment definition - Deployment uses task-worker.js entrypoint with correct liveness probe 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-10 22:55:04 -07:00
Kelly	56cc171287	feat: Stealth worker system with mandatory proxy rotation ## Worker System - Role-agnostic workers that can handle any task type - Pod-based architecture with StatefulSet (5-15 pods, 5 workers each) - Custom pod names (Aethelgard, Xylos, Kryll, etc.) - Worker registry with friendly names and resource monitoring - Hub-and-spoke visualization on JobQueue page ## Stealth & Anti-Detection (REQUIRED) - Proxies are MANDATORY - workers fail to start without active proxies - CrawlRotator initializes on worker startup - Loads proxies from `proxies` table - Auto-rotates proxy + fingerprint on 403 errors - 12 browser fingerprints (Chrome, Firefox, Safari, Edge) - Locale/timezone matching for geographic consistency ## Task System - Renamed product_resync → product_refresh - Task chaining: store_discovery → entry_point → product_discovery - Priority-based claiming with FOR UPDATE SKIP LOCKED - Heartbeat and stale task recovery ## UI Updates - JobQueue: Pod visualization, resource monitoring on hover - WorkersDashboard: Simplified worker list - Removed unused filters from task list ## Other - IP2Location service for visitor analytics - Findagram consumer features scaffolding - Documentation updates 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-10 00:44:59 -07:00
Kelly	7f9cf559cf	fix(k8s): Update worker deployment to use v2 hydration worker The old dutchie-az/services/worker.js no longer exists. Workers now use the hydration pipeline at dist/scripts/run-hydration.js with --loop mode. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-09 15:01:18 -07:00
Kelly	ca07606b05	feat(k8s): Add Redis deployment for production - Add k8s/redis.yaml with Redis 7 Alpine deployment - Add REDIS_HOST and REDIS_PORT to configmap - Redis configured with 200MB max memory and LRU eviction - 1GB persistent volume for data persistence 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-09 11:40:11 -07:00
Kelly	5415cac2f3	feat(seo): Add SEO tables to migration and ingress config - Add seo_pages and seo_page_contents tables to migrate.ts for automatic creation on deployment - Update Home.tsx with minor formatting - Add ingress configuration updates 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-08 12:58:38 -07:00
Kelly	b7cfec0770	feat: AZ dispensary harmonization with Dutchie source of truth Major changes: - Add harmonize-az-dispensaries.ts script to sync dispensaries with Dutchie API - Add migration 057 for crawl_enabled and dutchie_verified fields - Remove legacy dutchie-az module (replaced by platforms/dutchie) - Clean up deprecated crawlers, scrapers, and orchestrator code - Update location-discovery to not fallback to slug when ID is missing - Add crawl-rotator service for proxy rotation - Add types/index.ts for shared type definitions - Add woodpecker-agent k8s manifest Harmonization script: - Queries ConsumerDispensaries API for all 32 AZ cities - Matches dispensaries by platform_dispensary_id (not slug) - Updates existing records with full Dutchie data - Creates new records for unmatched Dutchie dispensaries - Disables dispensaries not found in Dutchie 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-08 10:19:49 -07:00
Kelly	3bc0effa33	feat: Responsive admin UI, SEO pages, and click analytics ## Responsive Admin UI - Layout.tsx: Mobile sidebar drawer with hamburger menu - Dashboard.tsx: 2-col grid on mobile, responsive stats cards - OrchestratorDashboard.tsx: Responsive table with hidden columns - PagesTab.tsx: Responsive filters and table ## SEO Pages - New /admin/seo section with state landing pages - SEO page generation and management - State page content with dispensary/product counts ## Click Analytics - Product click tracking infrastructure - Click analytics dashboard ## Other Changes - Consumer features scaffolding (alerts, deals, favorites) - Health panel component - Workers dashboard improvements - Legacy DutchieAZ pages removed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-07 22:48:21 -07:00
Kelly	b4a2fb7d03	feat: Add v2 architecture with multi-state support and orchestrator services Major additions: - Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare - Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator - Discovery system: dutchie discovery service, geo validation, city seeding scripts - Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages - Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram) - Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata Frontend pages added: - Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores - StateHeatmap, CrossStateCompare, SyncInfoPanel Components added: - StateSelector, OrchestratorTraceModal, WorkflowStepper 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-07 11:30:57 -07:00
Kelly	a0f8d3911c	feat: Add Findagram and FindADispo consumer frontends - Add findagram.co React frontend with product search, brands, categories - Add findadispo.com React frontend with dispensary locator - Wire findagram to backend /api/az/* endpoints - Update category/brand links to route to /products with filters - Add k8s manifests for both frontends - Add multi-domain user support migrations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-12-05 16:10:15 -07:00
Kelly	66e07b2009	fix(monitor): remove non-existent worker columns from job_run_logs query The job_run_logs table tracks scheduled job orchestration, not individual worker jobs. Worker info (worker_id, worker_hostname) belongs on dispensary_crawl_jobs, not job_run_logs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-12-03 18:45:05 -07:00
Kelly	8b4292fbb2	Add local product detail page with Dutchie comparison - Add ProductDetail page for viewing products locally - Add Dutchie and Details buttons to product cards in Products and StoreDetail pages - Add Last Updated display showing data freshness - Add parallel scrape scripts and routes - Add K8s deployment configurations - Add frontend Dockerfile with nginx 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-30 06:34:38 -07:00
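
Commit 1861e18396 describes a claim, pull, release flow for geo-based task pools. The sketch below is one way that flow could look with in-memory data. It is not the contents of src/services/task-pool.ts or migrations/113_task_pools.sql, which this log does not show. All type and function names (`TaskPool`, `claimPool`, `pullTasks`, `releasePool`) are hypothetical; only the pool names and the limit of 6 stores per claim come from the commit message. The proxy and preflight steps (3 and 4) are left out.

```typescript
// Hypothetical in-memory model of a geo task pool; the real system stores
// pools in the database, which this sketch does not attempt to reproduce.
interface TaskPool {
  name: string;              // metro area, e.g. "Phoenix AZ"
  pendingStoreIds: number[]; // stores still waiting to be visited
  claimedBy: string | null;  // worker id currently holding the pool
}

// From the commit message: a worker visits up to 6 stores per claimed pool.
const MAX_STORES_PER_CLAIM = 6;

// Steps 1-2: find an unclaimed pool that has pending tasks and claim it.
function claimPool(pools: TaskPool[], workerId: string): TaskPool | null {
  const pool = pools.find(
    (p) => p.claimedBy === null && p.pendingStoreIds.length > 0
  );
  if (pool === undefined) {
    return null;
  }
  pool.claimedBy = workerId;
  return pool;
}

// Step 5: pull up to MAX_STORES_PER_CLAIM store ids from the claimed pool.
function pullTasks(pool: TaskPool, workerId: string): number[] {
  if (pool.claimedBy !== workerId) {
    throw new Error(`worker ${workerId} does not hold pool ${pool.name}`);
  }
  return pool.pendingStoreIds.splice(0, MAX_STORES_PER_CLAIM);
}

// Step 7: release the pool so another worker can claim what remains.
function releasePool(pool: TaskPool, workerId: string): void {
  if (pool.claimedBy === workerId) {
    pool.claimedBy = null;
  }
}
```

Because the worker claims a pool before it does anything else, it knows the geo up front, which is the property the commit message gives as the reason for the change ("Workers know geo BEFORE getting proxy").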

21 Commits
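
Commits 8e2f07c941 and fdce5e0302 describe resource-based backoff: a worker stops taking new tasks at 85% memory or 90% CPU, and memory is measured against the configured heap limit (1500MB) rather than V8's dynamic heapTotal. The sketch below shows that check as pure functions. The names (`BackoffConfig`, `ResourceSample`, `memoryFraction`, `shouldBackOff`, `canClaimTask`) are hypothetical and are not taken from the TaskWorker source. Only the default values come from the commit messages. Treating the thresholds as inclusive (`>=`) is an assumption.

```typescript
// Hypothetical config shape; the defaults mirror the env vars listed in
// commit 8e2f07c941 and the 1500MB heap limit from commit fdce5e0302.
interface BackoffConfig {
  maxConcurrentTasks: number; // MAX_CONCURRENT_TASKS
  memoryThreshold: number;    // MEMORY_BACKOFF_THRESHOLD
  cpuThreshold: number;       // CPU_BACKOFF_THRESHOLD
  backoffDurationMs: number;  // BACKOFF_DURATION_MS
  heapLimitMb: number;        // value passed to --max-old-space-size
}

const DEFAULT_BACKOFF: BackoffConfig = {
  maxConcurrentTasks: 3,
  memoryThreshold: 0.85,
  cpuThreshold: 0.9,
  backoffDurationMs: 10000,
  heapLimitMb: 1500,
};

interface ResourceSample {
  heapUsedBytes: number; // e.g. process.memoryUsage().heapUsed in Node.js
  cpuFraction: number;   // 0..1, however the worker measures CPU
}

const BYTES_PER_MB = 1024 * 1024;

// Divide by the fixed heap limit, not heapTotal: heapTotal grows and shrinks
// with usage, so heapUsed / heapTotal can read 95%+ on an idle process.
function memoryFraction(sample: ResourceSample, config: BackoffConfig): number {
  return sample.heapUsedBytes / (config.heapLimitMb * BYTES_PER_MB);
}

// Back off when either resource crosses its threshold (inclusive: assumption).
function shouldBackOff(
  sample: ResourceSample,
  config: BackoffConfig = DEFAULT_BACKOFF
): boolean {
  return (
    memoryFraction(sample, config) >= config.memoryThreshold ||
    sample.cpuFraction >= config.cpuThreshold
  );
}

// A worker may claim another task only if it has a free slot and is not
// currently over a resource threshold.
function canClaimTask(
  activeTasks: number,
  sample: ResourceSample,
  config: BackoffConfig = DEFAULT_BACKOFF
): boolean {
  return activeTasks < config.maxConcurrentTasks && !shouldBackOff(sample, config);
}
```

A worker loop built on this would presumably wait `backoffDurationMs` before sampling again whenever `shouldBackOff` returns true. The commit messages do not spell out that loop, so it is not sketched here.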