- CrawlRotator.getProxyUrl() now converts non-standard format (http://host:port:user:pass) to standard format (http://user:pass@host:port)
- Simplify puppeteer preflight to only use ipify.org for IP verification (much lighter than fingerprint.com)
- Remove heavy anti-detect site tests from preflight - not needed, trust stealth plugin
- Fixes 503 errors when using session-based residential proxies
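
The format conversion described above can be sketched as follows. This is a minimal illustration, not the actual CrawlRotator.getProxyUrl() code: the function name `normalizeProxyUrl` is hypothetical, and it assumes the non-standard form never contains an `@`.

```typescript
// Hypothetical helper; the real CrawlRotator.getProxyUrl() may differ.
function normalizeProxyUrl(raw: string): string {
  // Assumption: an '@' means credentials are already in the standard position.
  if (raw.includes("@")) return raw;

  // Match scheme://host:port:user:pass (the password may itself contain ':').
  const match = /^(https?):\/\/([^:\/]+):(\d+):([^:]+):(.+)$/.exec(raw);
  if (!match) return raw; // not the host:port:user:pass shape, leave untouched

  const [, scheme, host, port, user, pass] = match;
  // Percent-encode credentials so special characters cannot break the URL.
  return `${scheme}://${encodeURIComponent(user)}:${encodeURIComponent(pass)}@${host}:${port}`;
}
```

URLs that already carry credentials before the `@`, or that have no credentials at all, pass through unchanged.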
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add schedule CRUD endpoints to /api/tasks/schedules
- Add Schedules section to TasksDashboard with edit/delete/bulk actions
- Deprecate job_schedules table (entries disabled in DB)
- Mark CrawlSchedulePage as deprecated (removed from menu)
- Add deprecation comments to legacy schedule methods in api.ts
- Add migration comments to workers.ts explaining consolidation
Key changes:
- Schedule management now at /admin/tasks instead of /admin/schedule
- task_schedules uses interval_hours (simpler than base_interval_minutes + jitter)
- All schedule routes placed before /:id to avoid Express route conflicts
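
Why the route order matters can be shown with a simplified first-match model. This is a sketch, not Express itself: the `matches` and `resolve` helpers and the handler names are hypothetical, and only literal segments and `:param` segments are modelled.

```typescript
// Simplified first-match model of route resolution (not Express itself).
type Route = { pattern: string; handler: string };

// A ':param' segment matches any single path segment; others must match literally.
function matches(pattern: string, path: string): boolean {
  const p = pattern.split("/").filter(Boolean);
  const s = path.split("/").filter(Boolean);
  if (p.length !== s.length) return false;
  return p.every((seg, i) => seg.startsWith(":") || seg === s[i]);
}

// The first registered route that matches wins.
function resolve(routes: readonly Route[], path: string): string | undefined {
  return routes.find((r) => matches(r.pattern, path))?.handler;
}

// Registering '/:id' first lets it swallow '/schedules'.
const wrongOrder: Route[] = [
  { pattern: "/:id", handler: "getTask" },
  { pattern: "/schedules", handler: "listSchedules" },
];

// Registering the literal route first keeps both reachable.
const rightOrder: Route[] = [
  { pattern: "/schedules", handler: "listSchedules" },
  { pattern: "/:id", handler: "getTask" },
];
```

With `wrongOrder`, a request for `/schedules` resolves to the `/:id` handler; with `rightOrder`, it reaches the schedules handler.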
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix update_worker_preflight call to use correct parameter order:
  (worker_id, transport, status, ip, response_ms, error, fingerprint)
- Add proxyIp to both curl and http preflight reports
- Add fingerprint JSONB with timezone, location, and bot detection data
- Log HTTP IP and timezone after preflight completes
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add IP geolocation lookup via ip-api.com to get timezone from proxy IP
- Use ipify.org API for reliable proxy IP detection (replaces unreliable fingerprint.com scraping)
- Set browser timezone via CDP Emulation.setTimezoneOverride to match proxy location
- Add detectedTimezone and detectedLocation to preflight result
- Add /api/worker-registry/preflight-test endpoint for smoke testing
Fixes timezone mismatch where browser showed America/Phoenix while proxy was in America/New_York
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Exposes curl_ip, http_ip, preflight_status, preflight_at, and fingerprint_data
in the /api/worker-registry/workers response.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Moved hydration module back from _deprecated (needed for product_refresh)
- Restored product_refresh handler for processing stored payloads
- Restored geolocation service for findadispo/findagram
- Stubbed system routes that depend on deprecated SyncOrchestrator
- Removed crawler-sandbox route (deprecated)
- Fixed all TypeScript compilation errors
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Workers now run both curl and http (Puppeteer) preflights on startup:
- curl-preflight.ts: Tests axios + proxy via httpbin.org
- puppeteer-preflight.ts: Tests browser + StealthPlugin via fingerprint.com
  (with amiunique.org fallback)
- Migration 084: Adds preflight columns to worker_registry and method
  column to worker_tasks
- Workers report preflight status, IP, fingerprint, and response time
- Tasks can require specific transport method (curl/http)
- Dashboard shows Transport column with preflight status badges
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove cache_from/cache_to from CI (the plugin has a bug that splits values on commas)
- Add preflight() method to CrawlRotator - tests proxy + anti-detect
- Add pre-task preflight check - workers MUST pass before executing
- Add releaseTask() to release tasks back to pending on preflight fail
- Rename proxy_test task to whoami for clarity
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add proxy_test task handler that fetches IP via proxy to verify connectivity
- Add discovery_runs migration (083) for tracking store discovery progress
- Register proxy_test in task service and worker
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Re-registration no longer overwrites pod_name with K8s name
- New workers get fantasy name (Aethelgard, Xylos, etc.) as pod_name
- Document worker naming convention in CLAUDE.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix memory calculation to use max-old-space-size (1500MB) instead of
  V8's dynamic heapTotal. This prevents false 95%+ readings when idle.
- Add yellow color for backing-off workers in pod visualization
- Update legend and tooltips with backing-off status
- Remove pool toggle from TasksDashboard (moved to Workers page)
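
The memory calculation fix can be illustrated with a small sketch. The function names are hypothetical and the 1500 MB limit is taken from this message. The idea is to divide heap usage by the fixed limit rather than by V8's heapTotal, which grows and shrinks with usage.

```typescript
// Limit from the commit message; assumed to match --max-old-space-size.
const MAX_OLD_SPACE_MB = 1500;

// Usage relative to the fixed heap limit (hypothetical helper).
function memoryPercent(heapUsedBytes: number, limitMb: number = MAX_OLD_SPACE_MB): number {
  const limitBytes = limitMb * 1024 * 1024;
  return Math.min(100, (heapUsedBytes / limitBytes) * 100);
}

// The previous style of reading, shown for comparison: usage relative to
// heapTotal, which V8 keeps only slightly above heapUsed when idle.
function heapTotalPercent(heapUsedBytes: number, heapTotalBytes: number): number {
  return (heapUsedBytes / heapTotalBytes) * 100;
}
```

For an idle worker with roughly 95 MB used of a 100 MB heapTotal, the heapTotal-based reading is about 95%, while the limit-based reading is about 6%. In Node, the input would come from `process.memoryUsage().heapUsed`.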
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Workers can now process multiple tasks concurrently (default: 3 max).
Workers self-regulate based on resource usage, backing off at 85% memory or 90% CPU.
Backend changes:
- TaskWorker handles concurrent tasks using async Maps
- Resource monitoring (memory %, CPU %) with backoff logic
- Heartbeat reports active_task_count, max_concurrent_tasks, resource stats
- Decommission support via worker_commands table
Frontend changes:
- Workers Dashboard shows tasks per worker (N/M format)
- Resource badges with color-coded thresholds
- Pod visualization with clickable selection
- Decommission controls per worker
New env vars:
- MAX_CONCURRENT_TASKS (default: 3)
- MEMORY_BACKOFF_THRESHOLD (default: 0.85)
- CPU_BACKOFF_THRESHOLD (default: 0.90)
- BACKOFF_DURATION_MS (default: 10000)
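
A minimal sketch of how these settings could drive the claim decision. The env var names and defaults come from the list above. The `loadConfig` and `canClaimTask` helpers are hypothetical, and treating "at the threshold" as a backoff (>=) is an assumption.

```typescript
interface BackoffConfig {
  maxConcurrentTasks: number;
  memoryBackoffThreshold: number; // fraction, 0..1
  cpuBackoffThreshold: number;    // fraction, 0..1
  backoffDurationMs: number;
}

// Read the four env vars, falling back to the documented defaults
// when a value is missing, empty, or not a finite number.
function loadConfig(env: Record<string, string | undefined>): BackoffConfig {
  const num = (key: string, fallback: number): number => {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") return fallback;
    const parsed = Number(raw);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    maxConcurrentTasks: num("MAX_CONCURRENT_TASKS", 3),
    memoryBackoffThreshold: num("MEMORY_BACKOFF_THRESHOLD", 0.85),
    cpuBackoffThreshold: num("CPU_BACKOFF_THRESHOLD", 0.9),
    backoffDurationMs: num("BACKOFF_DURATION_MS", 10000),
  };
}

// A worker may claim another task only if it has a free slot and is
// below both resource thresholds (assumption: reaching a threshold backs off).
function canClaimTask(
  activeTasks: number,
  memoryFraction: number,
  cpuFraction: number,
  cfg: BackoffConfig,
): boolean {
  if (activeTasks >= cfg.maxConcurrentTasks) return false;
  if (memoryFraction >= cfg.memoryBackoffThreshold) return false;
  if (cpuFraction >= cfg.cpuBackoffThreshold) return false;
  return true;
}
```

In a Node worker, `loadConfig(process.env)` would supply the configuration. When `canClaimTask` returns false for resource reasons, the worker would wait `backoffDurationMs` before checking again.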
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add /api/auth/verify-password endpoint for re-authentication
- Add PasswordConfirmModal component for sensitive actions
- Worker scaling (+/-) now requires password confirmation
- Add RBAC (ServiceAccount, Role, RoleBinding) for scraper pod
- Scraper pod can now read/scale worker deployment via k8s API
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Increase worker replicas from 5 to 25
- Task pool now starts PAUSED on deploy; an admin must click Start Pool
- Prevents workers from grabbing tasks before system is ready
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add task-pool-state.ts for shared pause/resume state
- Add /api/tasks/pool/status, pause, resume endpoints
- Add Start/Stop Pool toggle button to TasksDashboard
- Spinner stops when pool is closed
- Fix is_active column name in store-discovery.ts
- Fix missing active column in task-service.ts claimTask
- Auto-refresh every 15 seconds
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change 'active' to 'is_active' in states table query (store-discovery.ts)
- Remove non-existent 'active' column check from worker_tasks query (task-service.ts)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add DispensarySchedule page showing crawl history and upcoming schedule
- Add /dispensaries/:state/:city/:slug/schedule route
- Add API endpoint for store crawl history
- Update View Schedule link to use dispensary-specific route
- Remove colored badges from DispensaryDetail product table (plain text)
- Make Details button ghost style in product table
- Add "Sort by States" option to IntelligenceBrands
- Remove status filter dropdown from Dispensaries page
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The "Active States" metric was arbitrary and confusing. Changed to
show total states count - all states in the system regardless of
whether they have data or not.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Active states should count states with actual product data, not just
states where crawling is enabled. A state can have historical data
even if crawling is currently disabled.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change active_states to count states with crawl_enabled=true dispensaries
- Filter all national summary queries by crawl_enabled=true
- Remove unused StateBadge from National Dashboard header
- StateBadge was showing "Arizona" with no way to change it
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add state filter to Intelligence Brands API and frontend
- Add state filter to Intelligence Pricing API and frontend
- Add state filter to Intelligence Stores API and frontend
- Fix null safety issues with toLocaleString() calls
- Update backend /stores endpoint to return skuCount, snapshotCount, chainName
- Add overall stats to pricing endpoint
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Install curl in Docker container for Dutchie HTTP requests
- Add 'active' column to worker_tasks (default false) to prevent
  accidental task execution on startup
- Update task-service to only claim tasks where active=true
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The states dropdown was showing count of ALL dispensaries instead of
just crawl-enabled ones. Now correctly filters to match the actual
stores that would be displayed.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Workers now use PostgreSQL LISTEN/NOTIFY to wake up immediately when proxies are added
- Added trigger on proxies table to NOTIFY 'proxy_added' when active proxy inserted/updated
- Falls back to 30s polling if LISTEN fails
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Task worker now waits up to 60 minutes for active proxies
- Retries every 30 seconds with clear logging
- Updated K8s scraper-worker.yaml with Deployment definition
- Deployment uses task-worker.js entrypoint with correct liveness probe
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add authMiddleware to analytics-v2.ts to require authentication
- Add permanent rule #6 to CLAUDE.md: "ALL API ROUTES REQUIRE AUTHENTICATION"
- Add forbidden action #19: "Creating API routes without authMiddleware"
- Document authentication flow and trusted origins
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a request carries a Bearer token, use that user's actual identity
instead of falling back to internal@system. This ensures logged-in
users see their real email in the admin UI.
Order of auth:
1. If Bearer token provided → use JWT/API token (real user identity)
2. If no token → check trusted origins (for API access like WordPress)
3. Otherwise → 401 unauthorized
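
The order above can be sketched as a pure decision function. The names `resolveAuth` and `AuthResult` are hypothetical, token verification is left out, and the `isTrusted` callback stands in for the real trusted-origin check.

```typescript
type AuthResult =
  | { kind: "user"; token: string }            // step 1: real user identity
  | { kind: "trusted-origin"; origin: string } // step 2: e.g. WordPress
  | { kind: "unauthorized"; status: 401 };     // step 3

function resolveAuth(
  authorizationHeader: string | undefined,
  origin: string | undefined,
  isTrusted: (origin: string) => boolean,
): AuthResult {
  // 1. A Bearer token always wins, even when the origin is trusted.
  //    (A real middleware would verify the JWT/API token here.)
  const bearer = /^Bearer\s+(\S+)$/i.exec(authorizationHeader ?? "");
  if (bearer) return { kind: "user", token: bearer[1] };

  // 2. No token: fall back to the trusted-origin check.
  if (origin !== undefined && isTrusted(origin)) {
    return { kind: "trusted-origin", origin };
  }

  // 3. Neither applies.
  return { kind: "unauthorized", status: 401 };
}
```

Checking the token first is what keeps a logged-in admin on a trusted domain from being reported as the internal system user.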
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add *.cannaiq.co and *.cannabrands.app to trusted domains list.
Updated isTrustedDomain() to recognize *.domain.com as a wildcard
that matches the base domain and any subdomain.
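
A minimal sketch of the wildcard rule, assuming hostnames are compared case-insensitively. The signature shown here (hostname plus pattern list) is an assumption and may not match the real isTrustedDomain().

```typescript
// '*.example.com' matches 'example.com' and any subdomain of it;
// a pattern without the '*.' prefix must match exactly.
function matchesTrustedPattern(hostname: string, pattern: string): boolean {
  const host = hostname.toLowerCase();
  const pat = pattern.toLowerCase();
  if (!pat.startsWith("*.")) return host === pat;

  const base = pat.slice(2);
  // The leading dot prevents 'evilcannaiq.co' from matching 'cannaiq.co'.
  return host === base || host.endsWith("." + base);
}

// Hypothetical signature for illustration only.
function isTrustedDomain(hostname: string, patterns: readonly string[]): boolean {
  return patterns.some((p) => matchesTrustedPattern(hostname, p));
}
```

With the patterns `*.cannaiq.co` and `*.cannabrands.app`, both `cannaiq.co` and `app.cannaiq.co` are trusted, while look-alikes such as `evilcannaiq.co` or `cannaiq.co.evil.com` are not.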
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add *.cannaiq.co and *.cannabrands.app patterns to both:
- auth/middleware.ts (admin routes)
- public-api.ts (consumer /api/v1/* routes)
This allows any subdomain of these domains to access the API without
requiring an API key.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Harmonize backend version with WordPress plugin version so admin UI displays correct version.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix mapBrandForUI to use correct 'brand' field from API response
- Add null check in Brands.jsx filter to prevent crash on undefined names
- Fix BrandPenetrationService sps.brand_name -> sps.brand_name_raw
- Remove missing logo192.png and logo512.png from PWA manifest
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The store_products table doesn't have a state_id column - must join
through dispensaries to get state info. Also fixed column references
to use brand_name_raw and category_raw.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
npm ci can fail when package-lock.json has minor mismatches with
package.json. npm install is more forgiving and appropriate for
Docker builds where determinism is less critical than reliability.
Co-Authored-By: Claude <noreply@anthropic.com>
- Add brand promotional history endpoint (GET /api/analytics/v2/brand/:name/promotions)
  - Tracks when products go on special, duration, discounts, quantity sold estimates
  - Aggregates by category with frequency metrics (weekly/monthly)
- Add quantity changes endpoint (GET /api/analytics/v2/store/:id/quantity-changes)
  - Filter by direction (increase/decrease/all) for sales vs restock estimation
- Fix canonical-upsert to populate stock_quantity and total_quantity_available
- Add API key edit functionality in admin UI
  - Edit allowed domains and IPs
  - Display domains in list view
Co-Authored-By: Claude <noreply@anthropic.com>
CI/CD:
- Fix build_args format in woodpecker CI (comma-separated, not YAML list)
- This fixes "unknown" SHA/version showing on remote deployments
Backend schema-tolerant fixes (graceful fallbacks when tables missing):
- users.ts: Check which columns exist before querying
- worker-registry.ts: Return empty result if table doesn't exist
- task-service.ts: Add tableExists() helper, handle missing tables/views
- proxies.ts: Return totalProxies in test-all response
Frontend fixes:
- Proxies: Use total from response for accurate progress display
- SEO PagesTab: Dim Generate button when no AI provider active
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Dockerfile: Add COPY migrations ./migrations so auto-migrate works on remote
- intelligence.ts: Fix FILTER clause placement in aggregate functions
  - FILTER must be inside AVG(), not wrapping ROUND()
  - Remove redundant FILTER on MIN (already filtered by WHERE)
  - Remove unsupported FILTER on PERCENTILE_CONT
These fixes resolve:
- "Failed to get task counts" (worker_tasks table missing)
- "FILTER specified but round is not an aggregate function" errors
- /national page "column m.state does not exist" (mv_state_metrics missing)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix 403 handler to rotate BOTH proxy and fingerprint (was only fingerprint)
- Add auto-retry logic to task service (retry up to max_retries before failing)
- Add error tooltip on task status badge showing retry count and error message
- Add DELETE /api/tasks/:id endpoint (only for non-running tasks)
- Add delete button to JobQueue task table
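
The auto-retry rule can be sketched as a small state transition. The field and function names are hypothetical, and the sketch assumes a failed task returns to pending until its retry count exceeds max_retries.

```typescript
interface RetryState {
  retryCount: number; // retries already used
  maxRetries: number; // corresponds to max_retries
}

interface FailureOutcome {
  status: "pending" | "failed";
  retryCount: number;
  errorMessage: string; // kept for display, e.g. in the status badge tooltip
}

// Decide what happens to a task after one failed attempt.
function onTaskFailure(task: RetryState, errorMessage: string): FailureOutcome {
  const retryCount = task.retryCount + 1;
  // Assumption: the task is retried while the count stays within max_retries.
  const status = retryCount <= task.maxRetries ? "pending" : "failed";
  return { status, retryCount, errorMessage };
}
```

With max_retries set to 3, a task fails permanently on its fourth failed attempt. The returned retry count and error message are what the tooltip would show.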
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add delete<T>() method to ApiClient for WorkersDashboard cleanup
- Add npm cache volume for faster npm ci
- Add TypeScript incremental builds with tsBuildInfoFile cache
- Should significantly speed up repeated CI runs
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add auto-migration system that runs SQL files from migrations/ on server startup
- Track applied migrations in schema_migrations table
- Show proxy exit location in Workers dashboard
- Add "Cleanup Stale" button to remove old workers
- Add remove button for individual workers
- Include proxy location (city, state, country) in worker heartbeats
- Update Proxy interface with location fields
- Re-enable bulk proxy import without ON CONFLICT
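
The selection step of the auto-migration system can be sketched as follows. This assumes migration files carry a numeric prefix and that schema_migrations stores applied file names. The function name and the example file names are hypothetical.

```typescript
// Numeric prefix of a migration file name; files without one sort last.
function migrationNumber(name: string): number {
  const n = parseInt(name, 10);
  return Number.isNaN(n) ? Number.MAX_SAFE_INTEGER : n;
}

// Return the .sql files that have not been applied yet, in run order.
function pendingMigrations(
  files: readonly string[],
  applied: ReadonlySet<string>, // names recorded in schema_migrations
): string[] {
  return files
    .filter((f) => f.endsWith(".sql") && !applied.has(f))
    .sort(
      (a, b) =>
        migrationNumber(a) - migrationNumber(b) ||
        (a < b ? -1 : a > b ? 1 : 0), // tie-break by name for stability
    );
}
```

On startup, the server would run each returned file in order and record its name in schema_migrations, so a restart skips anything already applied.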
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
## Worker System
- Role-agnostic workers that can handle any task type
- Pod-based architecture with StatefulSet (5-15 pods, 5 workers each)
- Custom pod names (Aethelgard, Xylos, Kryll, etc.)
- Worker registry with friendly names and resource monitoring
- Hub-and-spoke visualization on JobQueue page
## Stealth & Anti-Detection (REQUIRED)
- Proxies are MANDATORY - workers fail to start without active proxies
- CrawlRotator initializes on worker startup
- Loads proxies from `proxies` table
- Auto-rotates proxy + fingerprint on 403 errors
- 12 browser fingerprints (Chrome, Firefox, Safari, Edge)
- Locale/timezone matching for geographic consistency
## Task System
- Renamed product_resync → product_refresh
- Task chaining: store_discovery → entry_point → product_discovery
- Priority-based claiming with FOR UPDATE SKIP LOCKED
- Heartbeat and stale task recovery
## UI Updates
- JobQueue: Pod visualization, resource monitoring on hover
- WorkersDashboard: Simplified worker list
- Removed unused filters from task list
## Other
- IP2Location service for visitor analytics
- Findagram consumer features scaffolding
- Documentation updates
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>