Compare commits

85 Commits

Author SHA1 Message Date
Kelly
5a929e9803 refactor(admin): Consolidate JobQueue into TasksDashboard
- Move Create Task modal from JobQueue to TasksDashboard
- Add pagination to TasksDashboard (25 tasks per page)
- Add delete action for failed/completed/pending tasks
- Remove JobQueue page and route
- Rename nav item from "Task Queue" to "Tasks"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 17:41:21 -07:00
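
A minimal sketch of the 25-per-page pagination that commit 5a929e9803 describes. The message does not say whether paging happens in the client or in the API, so this assumes a plain in-memory slice; `paginate` and `TASKS_PER_PAGE` are hypothetical names, not identifiers from the repository.

```typescript
// Hypothetical helper: the page size matches the commit message (25 tasks per page).
const TASKS_PER_PAGE = 25;

interface Page<T> {
  page: number;       // 1-based page actually returned (after clamping)
  totalPages: number; // always at least 1, even for an empty list
  items: T[];
}

function paginate<T>(items: readonly T[], page: number, pageSize: number = TASKS_PER_PAGE): Page<T> {
  const totalPages = Math.max(1, Math.ceil(items.length / pageSize));
  // Clamp out-of-range or non-numeric page requests instead of returning an empty page.
  const requested = Number.isFinite(page) ? Math.floor(page) : 1;
  const current = Math.min(Math.max(1, requested), totalPages);
  const start = (current - 1) * pageSize;
  return { page: current, totalPages, items: items.slice(start, start + pageSize) };
}
```

Clamping the page number is a design choice made here so that deleting the last task on the final page cannot leave the dashboard on an empty page.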
Kelly
9944031eea ci: Add worker resilience check to deploy step
If workers are scaled to 0, CI will now automatically scale them to 5
before updating the image. This prevents workers from being stuck at 0
if manually scaled down for maintenance.

The check only scales up if replicas=0, so it won't interfere with
normal deployments or HPA scaling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 16:40:53 -07:00
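
The rule in commit 9944031eea (scale up only when replicas are exactly 0) can be sketched as a pure function. The real check lives in the CI deploy step and is not shown on this page; `replicasToRestore` is a hypothetical name used only for illustration.

```typescript
// Hypothetical helper; the value 5 comes from the commit message.
const DEFAULT_WORKER_REPLICAS = 5;

/** Returns the replica count to set before the image update, or null to leave the deployment alone. */
function replicasToRestore(currentReplicas: number): number | null {
  // Only a deployment scaled to exactly 0 is touched, so any non-zero
  // count (manual or HPA-managed) is left as-is.
  return currentReplicas === 0 ? DEFAULT_WORKER_REPLICAS : null;
}
```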
kelly
2babaa7136 Merge pull request 'ci: Remove explicit migration step from deploy' (#41) from fix/ci-remove-migration-step into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/41
2025-12-11 23:11:48 +00:00
Kelly
90567511dd ci: Remove explicit migration step from deploy
Auto-migrate runs at server startup and handles migration errors gracefully.
The explicit kubectl exec migration step was failing due to trigger
already existing (schema_migrations table out of sync).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 15:56:47 -07:00
kelly
beb16ad0cb Merge pull request 'ci: Run migrations via kubectl exec instead of separate step' (#40) from fix/ci-migrate-via-kubectl into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/40
2025-12-11 22:38:21 +00:00
Kelly
fc7fc5ea85 ci: Run migrations via kubectl exec instead of separate step
Removes the migrate step that required db_* secrets (which CI can't
access since postgres is cluster-internal). Instead, run migrations
via kubectl exec on the deployed scraper pod, which already has DB
access via its env vars.

Deploy order:
1. Deploy scraper image
2. Wait for rollout
3. Run migrations via kubectl exec
4. Deploy remaining services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 15:16:12 -07:00
kelly
ab8956b14b Merge pull request 'fix: Revert CI event array syntax to single value' (#39) from fix/ci-event-syntax into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/39
2025-12-11 22:10:03 +00:00
Kelly
1d9c90641f fix: Revert event array syntax to single value
The [push, manual] array syntax broke CI config parsing.
Reverting to event: push which is known to work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 14:48:34 -07:00
kelly
6126b907f2 Merge pull request 'ci: Support manual pipeline events for deploy' (#38) from ci/support-manual-pipelines into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/38
2025-12-11 21:12:26 +00:00
Kelly
cc93d2d483 ci: Support manual pipeline events for deploy
Allow deploy steps to run on both push and manual events.
This enables triggering deploys via `woodpecker-cli pipeline create`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 13:54:04 -07:00
kelly
7642c17ec0 Merge pull request 'ci: Fix pipeline config path' (#37) from fix/ci-config-path into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/37
2025-12-11 20:01:30 +00:00
Kelly
cb60dcf352 ci: Add root-level woodpecker config for CI detection
2025-12-11 12:37:35 -07:00
kelly
5ffe05d519 Merge pull request 'feat: Concurrent task processing with resource-based backoff' (#36) from feat/worker-scaling into master
2025-12-11 18:49:01 +00:00
Kelly
8e2f07c941 feat(workers): Concurrent task processing with resource-based backoff
Workers can now process multiple tasks concurrently (default: 3 max).
Self-regulate based on resource usage - back off at 85% memory or 90% CPU.

Backend changes:
- TaskWorker handles concurrent tasks using async Maps
- Resource monitoring (memory %, CPU %) with backoff logic
- Heartbeat reports active_task_count, max_concurrent_tasks, resource stats
- Decommission support via worker_commands table

Frontend changes:
- Workers Dashboard shows tasks per worker (N/M format)
- Resource badges with color-coded thresholds
- Pod visualization with clickable selection
- Decommission controls per worker

New env vars:
- MAX_CONCURRENT_TASKS (default: 3)
- MEMORY_BACKOFF_THRESHOLD (default: 0.85)
- CPU_BACKOFF_THRESHOLD (default: 0.90)
- BACKOFF_DURATION_MS (default: 10000)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 11:47:24 -07:00
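
The claim decision described in commit 8e2f07c941 might look roughly like the sketch below. The defaults mirror the env vars listed in the message; the type and function names are illustrative, and treating the thresholds as inclusive (`>=`) is an assumption, since the message only says "back off at 85% memory or 90% CPU".

```typescript
// Illustrative types; field comments name the env var each default comes from.
interface BackoffConfig {
  maxConcurrentTasks: number;     // MAX_CONCURRENT_TASKS
  memoryBackoffThreshold: number; // MEMORY_BACKOFF_THRESHOLD (fraction, 0..1)
  cpuBackoffThreshold: number;    // CPU_BACKOFF_THRESHOLD (fraction, 0..1)
  backoffDurationMs: number;      // BACKOFF_DURATION_MS
}

const DEFAULT_BACKOFF_CONFIG: BackoffConfig = {
  maxConcurrentTasks: 3,
  memoryBackoffThreshold: 0.85,
  cpuBackoffThreshold: 0.9,
  backoffDurationMs: 10_000,
};

interface ResourceSample {
  memoryFraction: number; // used memory / available memory
  cpuFraction: number;    // CPU utilisation as a fraction
}

// Assumption: reaching either threshold triggers a backoff.
function shouldBackOff(sample: ResourceSample, config: BackoffConfig = DEFAULT_BACKOFF_CONFIG): boolean {
  return (
    sample.memoryFraction >= config.memoryBackoffThreshold ||
    sample.cpuFraction >= config.cpuBackoffThreshold
  );
}

// A worker claims another task only if it has a free slot and is not under resource pressure.
function canClaimTask(
  activeTaskCount: number,
  sample: ResourceSample,
  config: BackoffConfig = DEFAULT_BACKOFF_CONFIG,
): boolean {
  return activeTaskCount < config.maxConcurrentTasks && !shouldBackOff(sample, config);
}
```

In this sketch a worker that sees `shouldBackOff` return true would presumably wait `backoffDurationMs` before sampling again; that loop is omitted here.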
kelly
0b6e615075 Merge pull request 'fix: Use React Router Link to prevent scroll reset' (#35) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/35
2025-12-11 17:14:25 +00:00
Kelly
be251c6fb3 fix: Use React Router Link for nav to prevent scroll reset
Changed sidebar NavLink from <a> to <Link> for client-side navigation.
This prevents full page reload and scroll position reset.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:59:34 -07:00
kelly
efb1e89e33 Merge pull request 'fix: Show only git SHA in header' (#34) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/34
2025-12-11 16:56:04 +00:00
Kelly
529c447413 refactor: Move worker scaling to Workers page with password confirmation
- Worker scaling controls now on /workers page only (removed from /tasks)
- Password confirmation required before scaling
- Show only git SHA in header (removed version number)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:41:58 -07:00
Kelly
1eaf95c06b fix: Show only git SHA in header, remove version number
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:35:41 -07:00
kelly
138ed17d8b Merge pull request 'feat: Worker scaling from admin UI with password confirmation' (#33) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/33
2025-12-11 16:29:04 +00:00
Kelly
a880c41d89 feat: Add password confirmation for worker scaling + RBAC
- Add /api/auth/verify-password endpoint for re-authentication
- Add PasswordConfirmModal component for sensitive actions
- Worker scaling (+/-) now requires password confirmation
- Add RBAC (ServiceAccount, Role, RoleBinding) for scraper pod
- Scraper pod can now read/scale worker deployment via k8s API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:16:27 -07:00
Kelly
2a9ae61dce fix(ui): Make Tasks Dashboard header sticky
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:04:21 -07:00
kelly
1f21911fa1 Merge pull request 'feat(admin): Worker scaling controls via k8s API' (#32) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/32
2025-12-11 16:01:25 +00:00
Kelly
6f0a58f5d2 fix(k8s): Correct API call signatures for k8s client v1.4
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 08:47:27 -07:00
Kelly
8206dce821 feat(admin): Worker scaling controls via k8s API
- Add /api/k8s/workers endpoint to get deployment status
- Add /api/k8s/workers/scale endpoint to scale replicas (0-50)
- Add worker scaling UI to Tasks Dashboard (+/- 5 workers)
- Shows ready/desired replica count
- Uses in-cluster config in k8s, kubeconfig locally

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 08:24:32 -07:00
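
A sketch of the step-and-bounds arithmetic behind the +/- 5 scaling buttons in commit 8206dce821. Whether the actual `/api/k8s/workers/scale` endpoint clamps out-of-range requests or rejects them is not stated; clamping is assumed here, and `nextReplicaCount` is a hypothetical name.

```typescript
// Bounds and step size taken from the commit message (replicas 0-50, +/- 5 workers).
const MIN_REPLICAS = 0;
const MAX_REPLICAS = 50;
const SCALE_STEP = 5;

function nextReplicaCount(current: number, direction: "up" | "down"): number {
  const delta = direction === "up" ? SCALE_STEP : -SCALE_STEP;
  // Assumption: clamp to the allowed range rather than returning an error.
  return Math.min(MAX_REPLICAS, Math.max(MIN_REPLICAS, current + delta));
}
```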
kelly
ced1afaa8a Merge pull request 'fix(ci): CI config fix + 25 workers + pool starts paused' (#31) from fix/ci-filename into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/31
2025-12-11 08:47:26 +00:00
Kelly
d6c602c567 fix(ui): Remove Cleanup Stale button from Workers page
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:29:25 -07:00
Kelly
a252a7fefd feat(tasks): 25 workers, pool starts paused by default
- Increase worker replicas from 5 to 25
- Task pool now starts PAUSED on deploy, admin must click Start Pool
- Prevents workers from grabbing tasks before system is ready

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:19:02 -07:00
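
The "starts paused" behaviour in commit a252a7fefd can be illustrated with a small in-memory state holder. This is a sketch only: the repository's `task-pool-state.ts` is not shown on this page, and `TaskPoolState` and `tryClaim` are hypothetical names.

```typescript
// Hypothetical shared pool state; defaults to paused so that nothing is claimed
// until an admin explicitly starts the pool.
class TaskPoolState {
  private paused: boolean;

  constructor(startPaused: boolean = true) {
    this.paused = startPaused;
  }

  pause(): void {
    this.paused = true;
  }

  resume(): void {
    this.paused = false;
  }

  isPaused(): boolean {
    return this.paused;
  }
}

// Workers go through this gate; the claim callback never runs while the pool is paused.
function tryClaim<T>(pool: TaskPoolState, claim: () => T): T | null {
  return pool.isPaused() ? null : claim();
}
```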
Kelly
83b06c21cc fix(tasks): Stop spinner in status cards when pool is paused
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:14:25 -07:00
kelly
f5214da54c Merge pull request 'fix(ci): Fix Woodpecker config - remove invalid top-level when' (#30) from fix/ci-filename into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/30
2025-12-11 08:07:16 +00:00
Kelly
e3d4dd0127 fix(ci): Fix Woodpecker config - remove invalid top-level when
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 00:45:08 -07:00
kelly
d0ee0d72f5 Merge pull request 'feat(tasks): Add task pool start/stop toggle' (#29) from feat/task-pool-toggle into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/29
2025-12-11 07:21:02 +00:00
kelly
521f0550cd Merge pull request 'feat: Admin UI cleanup and dispensary schedule page' (#28) from fix/worker-proxy-wait into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/28
2025-12-11 07:08:03 +00:00
Kelly
8a09691e91 feat(tasks): Add task pool start/stop toggle
- Add task-pool-state.ts for shared pause/resume state
- Add /api/tasks/pool/status, pause, resume endpoints
- Add Start/Stop Pool toggle button to TasksDashboard
- Spinner stops when pool is closed
- Fix is_active column name in store-discovery.ts
- Fix missing active column in task-service.ts claimTask
- Auto-refresh every 15 seconds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 00:07:14 -07:00
Kelly
459ad7d9c9 fix(tasks): Fix missing column errors in task queries
- Change 'active' to 'is_active' in states table query (store-discovery.ts)
- Remove non-existent 'active' column check from worker_tasks query (task-service.ts)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:54:28 -07:00
Kelly
d102d27731 feat(admin): Dispensary schedule page and UI cleanup
- Add DispensarySchedule page showing crawl history and upcoming schedule
- Add /dispensaries/:state/:city/:slug/schedule route
- Add API endpoint for store crawl history
- Update View Schedule link to use dispensary-specific route
- Remove colored badges from DispensaryDetail product table (plain text)
- Make Details button ghost style in product table
- Add "Sort by States" option to IntelligenceBrands
- Remove status filter dropdown from Dispensaries page

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:50:47 -07:00
kelly
01810c40a1 Merge pull request 'fix(worker): Wait for proxies instead of crashing' (#27) from fix/worker-proxy-wait into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/27
2025-12-11 06:43:29 +00:00
Kelly
b7d33e1cbf fix(admin): Clean up store detail and intelligence pages
- Remove Update dropdown from DispensaryDetail page
- Remove Crawl Now button from StoreDetailPage
- Change "Last Crawl" to "Last Updated" on both detail pages
- Tone down emerald colors on StoreDetailPage (use gray borders/tabs)
- Simplify THC/CBD/Stock badges to plain text
- Remove duplicate state dropdown from IntelligenceStores filters
- Make store rows clickable in IntelligenceStores

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:42:50 -07:00
Kelly
5b34b5a78c fix(admin): Consistent navigation across Intelligence pages
- Add state selector dropdown to all three Intelligence pages (Brands, Stores, Pricing)
- Use consistent emerald-styled page navigation badges with current page highlighted
- Remove Refresh buttons from all Intelligence pages
- Update chart styling to use emerald gradient bars (matching Pricing page)
- Load all available states from orchestrator API instead of extracting from local data
- Fix z-index and styling on state dropdown for better visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:32:57 -07:00
Kelly
c091d2316b fix(dashboard): Remove refresh button and HealthPanel
- Removed refresh button and refreshing state from Dashboard
- Removed HealthPanel component (deploy status auto-refresh)
- Simplified header layout

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:28:18 -07:00
Kelly
e8862b8a8b fix(national): Remove Refresh Metrics button
Removed unused refresh button and related state/handlers from
National Dashboard.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:27:01 -07:00
Kelly
1b46ab699d fix(national): Show all states count, not filtered "active" states
The "Active States" metric was arbitrary and confusing. Changed to
show total states count - all states in the system regardless of
whether they have data or not.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:25:50 -07:00
Kelly
ac1995f63f fix(pricing): Simplify category chart to prevent overflow
- Replace complex price range bars with simple horizontal bars
- Use overflow-hidden to prevent bars extending beyond container
- Calculate bar width as percentage of max avg price
- Limit to top 12 categories for cleaner display
- Fixed-width labels prevent layout shift

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:24:31 -07:00
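
The bar-width calculation from commit ac1995f63f, sketched under two assumptions: the "top 12" are chosen by average price (the message does not say which ordering is used), and widths are expressed as a 0-100 percentage. All names here are illustrative.

```typescript
interface CategoryPrice {
  category: string;
  avgPrice: number;
}

interface CategoryBar extends CategoryPrice {
  widthPercent: number; // 0..100, relative to the highest average price shown
}

const MAX_CATEGORIES = 12;

function toCategoryBars(rows: readonly CategoryPrice[]): CategoryBar[] {
  // Assumption: rank by average price, highest first, then keep the top 12.
  const top = [...rows].sort((a, b) => b.avgPrice - a.avgPrice).slice(0, MAX_CATEGORIES);
  const maxAvg = top.reduce((max, row) => Math.max(max, row.avgPrice), 0);
  return top.map((row) => ({
    ...row,
    // Guard against division by zero when every average is 0 or the list is empty.
    widthPercent: maxAvg > 0 ? (row.avgPrice / maxAvg) * 100 : 0,
  }));
}
```

Because every width is a fraction of the maximum, no bar can exceed 100% of its container, which is the overflow the commit set out to prevent.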
Kelly
de93669652 fix(national): Count active states by product data, not crawl status
Active states should count states with actual product data, not just
states where crawling is enabled. A state can have historical data
even if crawling is currently disabled.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:23:20 -07:00
Kelly
dffc124920 fix(national): Fix active states count and remove StateBadge
- Change active_states to count states with crawl_enabled=true dispensaries
- Filter all national summary queries by crawl_enabled=true
- Remove unused StateBadge from National Dashboard header
- StateBadge was showing "Arizona" with no way to change it

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:22:19 -07:00
Kelly
932ceb0287 feat(intelligence): Add state filter to all Intelligence pages
- Add state filter to Intelligence Brands API and frontend
- Add state filter to Intelligence Pricing API and frontend
- Add state filter to Intelligence Stores API and frontend
- Fix null safety issues with toLocaleString() calls
- Update backend /stores endpoint to return skuCount, snapshotCount, chainName
- Add overall stats to pricing endpoint

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:19:54 -07:00
Kelly
824d48fd85 fix: Add curl to Docker, add active flag to worker_tasks
- Install curl in Docker container for Dutchie HTTP requests
- Add 'active' column to worker_tasks (default false) to prevent
  accidental task execution on startup
- Update task-service to only claim tasks where active=true

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:12:09 -07:00
Kelly
47fdab0382 fix: Filter orchestrator states by crawl_enabled
The states dropdown was showing count of ALL dispensaries instead of
just crawl-enabled ones. Now correctly filters to match the actual
stores that would be displayed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:09:04 -07:00
Kelly
ed7ddc6375 ci: Add database migration step to deploy pipeline
Migrations now run automatically before deployments:
1. Build new Docker image
2. Run migrations using the new image
3. Deploy to Kubernetes

Requires new secrets: db_host, db_port, db_name, db_user, db_pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:07:25 -07:00
Kelly
cf06f4a8c0 feat(worker): Listen for proxy_added notifications
- Workers now use PostgreSQL LISTEN/NOTIFY to wake up immediately when proxies are added
- Added trigger on proxies table to NOTIFY 'proxy_added' when active proxy inserted/updated
- Falls back to 30s polling if LISTEN fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:58:00 -07:00
Kelly
a2fa21f65c fix(worker): Wait for proxies instead of crashing on startup
- Task worker now waits up to 60 minutes for active proxies
- Retries every 30 seconds with clear logging
- Updated K8s scraper-worker.yaml with Deployment definition
- Deployment uses task-worker.js entrypoint with correct liveness probe

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:55:04 -07:00
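
The wait-and-retry loop described in commit a2fa21f65c could be sketched as follows. The 60-minute limit and 30-second interval come from the message; the function name, the injected `countActiveProxies` callback, and the injectable `sleep` and `log` are assumptions made so the sketch can run without a database.

```typescript
interface WaitOptions {
  timeoutMs: number;
  intervalMs: number;
  sleep: (ms: number) => Promise<void>;
  log: (message: string) => void;
}

const DEFAULT_WAIT_OPTIONS: WaitOptions = {
  timeoutMs: 60 * 60 * 1000, // wait up to 60 minutes
  intervalMs: 30 * 1000,     // retry every 30 seconds
  sleep: (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  log: (message) => console.log(message),
};

/** Resolves to true once at least one active proxy exists, or false after the timeout. */
async function waitForActiveProxies(
  countActiveProxies: () => Promise<number>,
  options: Partial<WaitOptions> = {},
): Promise<boolean> {
  const { timeoutMs, intervalMs, sleep, log } = { ...DEFAULT_WAIT_OPTIONS, ...options };
  let waitedMs = 0;
  while (true) {
    if ((await countActiveProxies()) > 0) return true;
    if (waitedMs >= timeoutMs) return false;
    log(`No active proxies; retrying in ${intervalMs / 1000}s (waited ${waitedMs / 1000}s so far)`);
    await sleep(intervalMs);
    waitedMs += intervalMs;
  }
}
```

Returning false instead of throwing leaves the caller to decide whether to exit; what the real worker does after 60 minutes is not stated in the commit message.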
kelly
61e915968f Merge pull request 'feat(tasks): Refactor task workflow with payload/refresh separation' (#26) from feat/task-workflow-refactor into master
2025-12-11 05:24:11 +00:00
Kelly
4949b22457 feat(tasks): Refactor task workflow with payload/refresh separation
Major changes:
- Split crawl into payload_fetch (API → disk) and product_refresh (disk → DB)
- Add task chaining: store_discovery → product_discovery → payload_fetch → product_refresh
- Add payload storage utilities for gzipped JSON on filesystem
- Add /api/payloads endpoints for payload access and diffing
- Add DB-driven TaskScheduler with schedule persistence
- Track newDispensaryIds through discovery promotion for chaining
- Add stealth improvements: HTTP fingerprinting, proxy rotation enhancements
- Add Workers dashboard K8s scaling controls

New files:
- src/tasks/handlers/payload-fetch.ts - Fetches from API, saves to disk
- src/services/task-scheduler.ts - DB-driven schedule management
- src/utils/payload-storage.ts - Payload save/load utilities
- src/routes/payloads.ts - Payload API endpoints
- src/services/http-fingerprint.ts - Browser fingerprint generation
- docs/TASK_WORKFLOW_2024-12-10.md - Complete workflow documentation

Migrations:
- 078: Proxy consecutive 403 tracking
- 079: task_schedules table
- 080: raw_crawl_payloads table
- 081: payload column and last_fetch_at

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:15:35 -07:00
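
The task chain named in commit 4949b22457 (store_discovery → product_discovery → payload_fetch → product_refresh) can be modelled as an ordered list with a lookup for the follow-up task. This is an illustrative sketch; the `TaskRole` type and `nextTaskRole` function are hypothetical and do not reflect how the repository actually wires up chaining.

```typescript
type TaskRole = "store_discovery" | "product_discovery" | "payload_fetch" | "product_refresh";

// Order taken from the commit message.
const TASK_CHAIN: readonly TaskRole[] = [
  "store_discovery",
  "product_discovery",
  "payload_fetch",
  "product_refresh",
];

/** Returns the task to enqueue after `completed`, or null at the end of the chain. */
function nextTaskRole(completed: TaskRole): TaskRole | null {
  const index = TASK_CHAIN.indexOf(completed);
  if (index === -1) return null;
  return TASK_CHAIN[index + 1] ?? null;
}
```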
Kelly
1fb0eb94c2 security: Add authMiddleware to analytics-v2 routes
- Add authMiddleware to analytics-v2.ts to require authentication
- Add permanent rule #6 to CLAUDE.md: "ALL API ROUTES REQUIRE AUTHENTICATION"
- Add forbidden action #19: "Creating API routes without authMiddleware"
- Document authentication flow and trusted origins

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 19:01:44 -07:00
Kelly
9aefb554bc fix: Correct Analytics V2 SQL queries for schema alignment
- Fix JOIN path: store_products -> dispensaries -> states (was incorrectly joining sp.state_id which doesn't exist)
- Fix column names to use *_raw suffixes (category_raw, brand_name_raw, name_raw)
- Fix row mappings to read correct column names from query results
- Add ::timestamp casts for interval arithmetic in StoreAnalyticsService

All Analytics V2 endpoints now work correctly:
- /state/legal-breakdown
- /state/recreational
- /category/all
- /category/rec-vs-med
- /state/:code/summary
- /store/:id/summary

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 18:52:57 -07:00
kelly
a4338669a9 Merge pull request 'fix(auth): Prioritize JWT token over trusted origin bypass' (#24) from fix/auth-token-priority into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/24
2025-12-11 01:34:10 +00:00
Kelly
1fa9ea496c fix(auth): Prioritize JWT token over trusted origin bypass
When a user logs in and has a Bearer token, use their actual identity
instead of falling back to internal@system. This ensures logged-in
users see their real email in the admin UI.

Order of auth:
1. If Bearer token provided → use JWT/API token (real user identity)
2. If no token → check trusted origins (for API access like WordPress)
3. Otherwise → 401 unauthorized

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 18:21:50 -07:00
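
The auth order listed in commit 1fa9ea496c, sketched as a pure function. The `internal@system` identity comes from the message; the request shape, the injected `verifyToken` and `isTrustedOrigin` callbacks, and the choice to return 401 for an invalid token (rather than falling back to the origin check) are assumptions.

```typescript
interface AuthRequest {
  authorizationHeader?: string;
  origin?: string;
}

type AuthResult = { status: 200; identity: string } | { status: 401 };

function resolveAuth(
  req: AuthRequest,
  verifyToken: (token: string) => string | null, // returns the user's email, or null if invalid
  isTrustedOrigin: (origin: string) => boolean,
): AuthResult {
  // 1. A Bearer token, when present, always decides the outcome.
  const match = /^Bearer\s+(.+)$/i.exec(req.authorizationHeader ?? "");
  if (match) {
    const email = verifyToken(match[1]);
    // Assumption: an invalid token is rejected outright, with no fallback to step 2.
    return email ? { status: 200, identity: email } : { status: 401 };
  }
  // 2. No token: fall back to the trusted-origin bypass.
  if (req.origin && isTrustedOrigin(req.origin)) {
    return { status: 200, identity: "internal@system" };
  }
  // 3. Otherwise unauthorized.
  return { status: 401 };
}
```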
kelly
31756a2233 Merge pull request 'chore: Add WordPress plugin v1.6.0 download files' (#23) from chore/wordpress-plugin-downloads into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/23
2025-12-11 00:40:53 +00:00
Kelly
166583621b chore: Add WordPress plugin v1.6.0 download files
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 17:23:25 -07:00
kelly
ca952c4674 Merge pull request 'fix(ci): Use YAML map format for docker-buildx build_args' (#21) from fix/ci-build-args-format into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/21
2025-12-10 23:54:33 +00:00
kelly
4054778b6c Merge pull request 'feat: Add wildcard support for trusted domains' (#20) from fix/trusted-origins-wildcards into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/20
2025-12-10 23:54:11 +00:00
Kelly
56a5f00015 fix(ci): Use YAML map format for docker-buildx build_args
The woodpeckerci/plugin-docker-buildx plugin expects build_args as a
YAML map (key: value), not a list. This was causing build args to not
be passed to the Docker build, resulting in unknown git SHA and build
info in the deployed application.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 16:42:05 -07:00
Kelly
a96d50c481 docs(wordpress): Add deprecation comments for legacy shortcode/migration code
Clarifies that crawlsy_* and dutchie_* shortcodes are deprecated aliases
for backward compatibility only. New implementations should use cannaiq_*.

Also documents the token migration logic that preserves old API tokens.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 16:24:56 -07:00
kelly
4806212f46 Merge pull request 'fix(ci): Use YAML list format for docker-buildx build_args' (#18) from fix/ci-build-args into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/18
2025-12-10 22:29:41 +00:00
kelly
2486f3c6b2 Merge pull request 'feat(analytics): Add Brand Intelligence API endpoint' (#19) from feat/brand-intelligence-api into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/19
2025-12-10 22:29:26 +00:00
Kelly
f25bebf6ee feat: Add wildcard support for trusted domains
Add *.cannaiq.co and *.cannabrands.app to trusted domains list.
Updated isTrustedDomain() to recognize *.domain.com as wildcard
that matches the base domain and any subdomain.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 15:29:23 -07:00
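
A sketch of the wildcard rule from commit f25bebf6ee, where `*.domain.com` matches the base domain and any subdomain. The function name `isTrustedDomain` appears in the message, but this signature and body are assumptions, not the repository's code.

```typescript
function isTrustedDomain(hostname: string, trustedPatterns: readonly string[]): boolean {
  const host = hostname.toLowerCase();
  return trustedPatterns.some((pattern) => {
    const p = pattern.toLowerCase();
    if (p.startsWith("*.")) {
      const base = p.slice(2);
      // Match the base domain itself, or any host ending in ".<base>".
      // The leading dot stops look-alikes such as "evilcannaiq.co" from matching.
      return host === base || host.endsWith("." + base);
    }
    return host === p;
  });
}
```

For example, with the patterns `["*.cannaiq.co", "*.cannabrands.app"]` both `cannaiq.co` and `code.cannabrands.app` match, while `cannaiq.co.example.com` does not.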
Kelly
22dad6d0fc feat: Add wildcard trusted origins for cannaiq.co and cannabrands.app
Add *.cannaiq.co and *.cannabrands.app patterns to both:
- auth/middleware.ts (admin routes)
- public-api.ts (consumer /api/v1/* routes)

This allows any subdomain of these domains to access the API without
requiring an API key.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 15:25:04 -07:00
Kelly
03eab66d35 chore: Bump backend version to 1.6.0
Harmonize backend version with WordPress plugin version so admin UI displays correct version.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 15:06:42 -07:00
Kelly
97b1ab23d8 fix(ci): Use YAML list format for docker-buildx build_args
The woodpecker docker-buildx plugin expects build_args as a YAML list,
not a comma-separated string. The previous format resulted in all args
being passed as a single malformed arg with "*=" prefix.

This fix ensures APP_GIT_SHA, APP_BUILD_TIME, etc. are properly passed
to the Dockerfile so the /api/version endpoint returns correct values.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 14:56:18 -07:00
Kelly
9fff0ba430 feat(analytics): Add Brand Intelligence API endpoint
New endpoint: GET /api/analytics/v2/brand/:name/intelligence

Returns comprehensive brand analytics payload including:
- Performance snapshot (active SKUs, revenue, stores, market share)
- Alerts (lost stores, delisted SKUs, competitor takeovers)
- SKU performance (velocity, status, stock levels)
- Retail footprint (penetration by region, whitespace opportunities)
- Competitive landscape (price positioning, head-to-head comparisons)
- Inventory health (days of stock, risk levels, overstock alerts)
- Promotion effectiveness (baseline vs promo velocity, lift, ROI)

Supports time windows (7d/30d/90d), state filtering, and category filtering.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 14:53:35 -07:00
kelly
7d3e91b2e6 Merge pull request 'feat(wordpress): Add new Elementor widgets and dynamic selectors v1.6.0' (#17) from feat/wordpress-widgets into master
2025-12-10 20:41:44 +00:00
Kelly
74957a9ec5 feat(wordpress): Add new Elementor widgets and dynamic selectors v1.6.0
New Widgets:
- Brand Grid: Display brands in a grid with product counts
- Category List: Show categories in grid/list/pills layouts
- Specials Grid: Display products on sale with discount badges

Enhanced Product Grid Widget:
- Dynamic category dropdown (fetches from API)
- Dynamic brand dropdown (fetches from API)
- "On Special Only" toggle filter

New Plugin Methods:
- fetch_categories() - Get categories from API
- fetch_brands() - Get brands from API
- fetch_specials() - Get products on sale
- get_category_options() - Cached options for Elementor
- get_brand_options() - Cached options for Elementor

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 13:41:17 -07:00
kelly
2d035c46cf Merge pull request 'fix: Findagram brands page crash and PWA icon errors' (#16) from fix/findagram-brands-crash into master
2025-12-10 20:11:40 +00:00
Kelly
53445fe72a fix: Findagram brands page crash and PWA icon errors
- Fix mapBrandForUI to use correct 'brand' field from API response
- Add null check in Brands.jsx filter to prevent crash on undefined names
- Fix BrandPenetrationService sps.brand_name -> sps.brand_name_raw
- Remove missing logo192.png and logo512.png from PWA manifest

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 13:06:23 -07:00
kelly
37cc8956c5 Merge pull request 'fix: Join states through dispensaries in BrandPenetrationService' (#15) from feat/ci-auto-merge into master
2025-12-10 19:36:06 +00:00
Kelly
197c82f921 fix: Join states through dispensaries in BrandPenetrationService
The store_products table doesn't have a state_id column - must join
through dispensaries to get state info. Also fixed column references
to use brand_name_raw and category_raw.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 12:18:10 -07:00
kelly
2c52493a9c Merge pull request 'fix(docker): Use npm install instead of npm ci for reliability' (#14) from feat/ci-auto-merge into master
2025-12-10 18:44:21 +00:00
Kelly
2ee2ba6b8c fix(docker): Use npm install instead of npm ci for reliability
npm ci can fail when package-lock.json has minor mismatches with
package.json. npm install is more forgiving and appropriate for
Docker builds where determinism is less critical than reliability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-10 11:28:29 -07:00
kelly
bafcf1694a Merge pull request 'feat(analytics): Brand promotional history + specials fix + API key editing' (#13) from feat/ci-auto-merge into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/13
2025-12-10 18:12:59 +00:00
Kelly
95792aab15 feat(analytics): Brand promotional history + specials fix + API key editing
- Add brand promotional history endpoint (GET /api/analytics/v2/brand/:name/promotions)
  - Tracks when products go on special, duration, discounts, quantity sold estimates
  - Aggregates by category with frequency metrics (weekly/monthly)
- Add quantity changes endpoint (GET /api/analytics/v2/store/:id/quantity-changes)
  - Filter by direction (increase/decrease/all) for sales vs restock estimation
- Fix canonical-upsert to populate stock_quantity and total_quantity_available
- Add API key edit functionality in admin UI
  - Edit allowed domains and IPs
  - Display domains in list view

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-10 10:59:03 -07:00
kelly
38ae2c3a3e Merge pull request 'feat/ci-auto-merge' (#12) from feat/ci-auto-merge into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/12
2025-12-10 17:26:21 +00:00
Kelly
249d3c1b7f fix: Build args format for version info + schema-tolerant routes
CI/CD:
- Fix build_args format in woodpecker CI (comma-separated, not YAML list)
- This fixes "unknown" SHA/version showing on remote deployments

Backend schema-tolerant fixes (graceful fallbacks when tables missing):
- users.ts: Check which columns exist before querying
- worker-registry.ts: Return empty result if table doesn't exist
- task-service.ts: Add tableExists() helper, handle missing tables/views
- proxies.ts: Return totalProxies in test-all response

Frontend fixes:
- Proxies: Use total from response for accurate progress display
- SEO PagesTab: Dim Generate button when no AI provider active

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 09:53:21 -07:00
Kelly
9647f94f89 fix: Copy migrations folder to Docker image + fix SQL FILTER syntax
- Dockerfile: Add COPY migrations ./migrations so auto-migrate works on remote
- intelligence.ts: Fix FILTER clause placement in aggregate functions
  - FILTER must be inside AVG(), not wrapping ROUND()
  - Remove redundant FILTER on MIN (already filtered by WHERE)
  - Remove unsupported FILTER on PERCENTILE_CONT

These fixes resolve:
- "Failed to get task counts" (worker_tasks table missing)
- "FILTER specified but round is not an aggregate function" errors
- /national page "column m.state does not exist" (mv_state_metrics missing)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 09:38:05 -07:00
Kelly
afc288d2cf feat(ci): Auto-merge PRs after all type checks pass
Uses Gitea API to merge PR automatically when all typecheck jobs succeed.
Requires gitea_token secret in Woodpecker.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 09:27:26 -07:00
kelly
df01ce6aad Merge pull request 'feat: Auto-migrations on startup, worker exit location, proxy improvements' (#11) from feat/auto-migrations into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/11
2025-12-10 16:07:17 +00:00
103 changed files with 11902 additions and 2785 deletions


@@ -1,6 +1,3 @@
-when:
-  - event: [push, pull_request]
 steps:
   # ===========================================
   # PR VALIDATION: Parallel type checks (PRs only)
@@ -45,6 +42,31 @@ steps:
     when:
       event: pull_request
+  # ===========================================
+  # AUTO-MERGE: Merge PR after all checks pass
+  # ===========================================
+  auto-merge:
+    image: alpine:latest
+    environment:
+      GITEA_TOKEN:
+        from_secret: gitea_token
+    commands:
+      - apk add --no-cache curl
+      - |
+        echo "Merging PR #${CI_COMMIT_PULL_REQUEST}..."
+        curl -s -X POST \
+          -H "Authorization: token $GITEA_TOKEN" \
+          -H "Content-Type: application/json" \
+          -d '{"Do":"merge"}' \
+          "https://code.cannabrands.app/api/v1/repos/Creationshop/dispensary-scraper/pulls/${CI_COMMIT_PULL_REQUEST}/merge"
+    depends_on:
+      - typecheck-backend
+      - typecheck-cannaiq
+      - typecheck-findadispo
+      - typecheck-findagram
+    when:
+      event: pull_request
   # ===========================================
   # MASTER DEPLOY: Parallel Docker builds
   # ===========================================
@@ -65,10 +87,10 @@ steps:
       platforms: linux/amd64
       provenance: false
       build_args:
-        - APP_BUILD_VERSION=${CI_COMMIT_SHA}
-        - APP_GIT_SHA=${CI_COMMIT_SHA}
-        - APP_BUILD_TIME=${CI_PIPELINE_CREATED}
-        - CONTAINER_IMAGE_TAG=${CI_COMMIT_SHA:0:8}
+        APP_BUILD_VERSION: ${CI_COMMIT_SHA:0:8}
+        APP_GIT_SHA: ${CI_COMMIT_SHA}
+        APP_BUILD_TIME: ${CI_PIPELINE_CREATED}
+        CONTAINER_IMAGE_TAG: ${CI_COMMIT_SHA:0:8}
     depends_on: []
     when:
       branch: master
@@ -138,7 +160,7 @@ steps:
       event: push
   # ===========================================
-  # STAGE 3: Deploy (after Docker builds)
+  # STAGE 3: Deploy and Run Migrations
   # ===========================================
   deploy:
     image: bitnami/kubectl:latest
@@ -149,12 +171,17 @@ steps:
       - mkdir -p ~/.kube
       - echo "$KUBECONFIG_CONTENT" | tr -d '[:space:]' | base64 -d > ~/.kube/config
       - chmod 600 ~/.kube/config
+      # Deploy backend first
       - kubectl set image deployment/scraper scraper=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
+      - kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
+      # Note: Migrations run automatically at startup via auto-migrate
+      # Deploy remaining services
+      # Resilience: ensure workers are scaled up if at 0
+      - REPLICAS=$(kubectl get deployment scraper-worker -n dispensary-scraper -o jsonpath='{.spec.replicas}'); if [ "$REPLICAS" = "0" ]; then echo "Scaling workers from 0 to 5"; kubectl scale deployment/scraper-worker --replicas=5 -n dispensary-scraper; fi
       - kubectl set image deployment/scraper-worker worker=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
       - kubectl set image deployment/cannaiq-frontend cannaiq-frontend=code.cannabrands.app/creationshop/cannaiq-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
       - kubectl set image deployment/findadispo-frontend findadispo-frontend=code.cannabrands.app/creationshop/findadispo-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
       - kubectl set image deployment/findagram-frontend findagram-frontend=code.cannabrands.app/creationshop/findagram-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
-      - kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
       - kubectl rollout status deployment/cannaiq-frontend -n dispensary-scraper --timeout=120s
     depends_on:
       - docker-backend

193
.woodpecker/ci.yml Normal file

@@ -0,0 +1,193 @@
steps:
  # ===========================================
  # PR VALIDATION: Parallel type checks (PRs only)
  # ===========================================
  typecheck-backend:
    image: code.cannabrands.app/creationshop/node:20
    commands:
      - cd backend
      - npm ci --prefer-offline
      - npx tsc --noEmit
    depends_on: []
    when:
      event: pull_request
  typecheck-cannaiq:
    image: code.cannabrands.app/creationshop/node:20
    commands:
      - cd cannaiq
      - npm ci --prefer-offline
      - npx tsc --noEmit
    depends_on: []
    when:
      event: pull_request
  typecheck-findadispo:
    image: code.cannabrands.app/creationshop/node:20
    commands:
      - cd findadispo/frontend
      - npm ci --prefer-offline
      - npx tsc --noEmit 2>/dev/null || true
    depends_on: []
    when:
      event: pull_request
  typecheck-findagram:
    image: code.cannabrands.app/creationshop/node:20
    commands:
      - cd findagram/frontend
      - npm ci --prefer-offline
      - npx tsc --noEmit 2>/dev/null || true
    depends_on: []
    when:
      event: pull_request
  # ===========================================
  # AUTO-MERGE: Merge PR after all checks pass
  # ===========================================
  auto-merge:
    image: alpine:latest
    environment:
      GITEA_TOKEN:
        from_secret: gitea_token
    commands:
      - apk add --no-cache curl
      - |
        echo "Merging PR #${CI_COMMIT_PULL_REQUEST}..."
        curl -s -X POST \
          -H "Authorization: token $GITEA_TOKEN" \
          -H "Content-Type: application/json" \
          -d '{"Do":"merge"}' \
          "https://code.cannabrands.app/api/v1/repos/Creationshop/dispensary-scraper/pulls/${CI_COMMIT_PULL_REQUEST}/merge"
    depends_on:
      - typecheck-backend
      - typecheck-cannaiq
      - typecheck-findadispo
      - typecheck-findagram
    when:
      event: pull_request
  # ===========================================
  # MASTER DEPLOY: Parallel Docker builds
  # ===========================================
  docker-backend:
    image: woodpeckerci/plugin-docker-buildx
    settings:
      registry: code.cannabrands.app
      repo: code.cannabrands.app/creationshop/dispensary-scraper
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      dockerfile: backend/Dockerfile
      context: backend
      username:
        from_secret: registry_username
      password:
        from_secret: registry_password
      platforms: linux/amd64
      provenance: false
      build_args:
        APP_BUILD_VERSION: ${CI_COMMIT_SHA:0:8}
        APP_GIT_SHA: ${CI_COMMIT_SHA}
        APP_BUILD_TIME: ${CI_PIPELINE_CREATED}
        CONTAINER_IMAGE_TAG: ${CI_COMMIT_SHA:0:8}
    depends_on: []
    when:
      branch: master
      event: push
  docker-cannaiq:
    image: woodpeckerci/plugin-docker-buildx
    settings:
      registry: code.cannabrands.app
      repo: code.cannabrands.app/creationshop/cannaiq-frontend
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      dockerfile: cannaiq/Dockerfile
      context: cannaiq
      username:
        from_secret: registry_username
      password:
        from_secret: registry_password
      platforms: linux/amd64
      provenance: false
    depends_on: []
    when:
      branch: master
      event: push
  docker-findadispo:
    image: woodpeckerci/plugin-docker-buildx
    settings:
      registry: code.cannabrands.app
      repo: code.cannabrands.app/creationshop/findadispo-frontend
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      dockerfile: findadispo/frontend/Dockerfile
      context: findadispo/frontend
      username:
        from_secret: registry_username
      password:
        from_secret: registry_password
      platforms: linux/amd64
      provenance: false
    depends_on: []
    when:
      branch: master
      event: push
  docker-findagram:
    image: woodpeckerci/plugin-docker-buildx
    settings:
      registry: code.cannabrands.app
      repo: code.cannabrands.app/creationshop/findagram-frontend
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      dockerfile: findagram/frontend/Dockerfile
      context: findagram/frontend
      username:
        from_secret: registry_username
      password:
        from_secret: registry_password
      platforms: linux/amd64
      provenance: false
    depends_on: []
    when:
      branch: master
      event: push
  # ===========================================
  # STAGE 3: Deploy and Run Migrations
  # ===========================================
  deploy:
    image: bitnami/kubectl:latest
    environment:
      KUBECONFIG_CONTENT:
        from_secret: kubeconfig_data
    commands:
      - mkdir -p ~/.kube
      - echo "$KUBECONFIG_CONTENT" | tr -d '[:space:]' | base64 -d > ~/.kube/config
      - chmod 600 ~/.kube/config
      # Deploy backend first
      - kubectl set image deployment/scraper scraper=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
      - kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
      # Note: Migrations run automatically at startup via auto-migrate
      # Deploy remaining services
      # Resilience: ensure workers are scaled up if at 0
      - REPLICAS=$(kubectl get deployment scraper-worker -n dispensary-scraper -o jsonpath='{.spec.replicas}'); if [ "$REPLICAS" = "0" ]; then echo "Scaling workers from 0 to 5"; kubectl scale deployment/scraper-worker --replicas=5 -n dispensary-scraper; fi
      - kubectl set image deployment/scraper-worker worker=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
      - kubectl set image deployment/cannaiq-frontend cannaiq-frontend=code.cannabrands.app/creationshop/cannaiq-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
      - kubectl set image deployment/findadispo-frontend findadispo-frontend=code.cannabrands.app/creationshop/findadispo-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
      - kubectl set image deployment/findagram-frontend findagram-frontend=code.cannabrands.app/creationshop/findagram-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
      - kubectl rollout status deployment/cannaiq-frontend -n dispensary-scraper --timeout=120s
    depends_on:
      - docker-backend
      - docker-cannaiq
      - docker-findadispo
      - docker-findagram
    when:
      branch: master
      event: push


@@ -119,7 +119,42 @@ npx tsx src/db/migrate.ts
 - Importing it at runtime causes startup crashes if env vars aren't perfect
 - `pool.ts` uses lazy initialization - only validates when first query is made
-### 6. LOCAL DEVELOPMENT BY DEFAULT
+### 6. ALL API ROUTES REQUIRE AUTHENTICATION — NO EXCEPTIONS
+**Every API router MUST apply `authMiddleware` at the router level.**
+```typescript
+import { authMiddleware } from '../auth/middleware';
+const router = Router();
+router.use(authMiddleware); // REQUIRED - first line after router creation
+```
+**Authentication flow (see `src/auth/middleware.ts`):**
+1. Check Bearer token (JWT or API token) → grant access if valid
+2. Check trusted origins (cannaiq.co, findadispo.com, localhost, etc.) → grant access
+3. Check trusted IPs (127.0.0.1, ::1, internal pod IPs) → grant access
+4. **Return 401 Unauthorized** if none of the above
+**NEVER create API routes without auth middleware:**
+- No "public" endpoints that bypass authentication
+- No "read-only" exceptions
+- No "analytics-only" exceptions
+- If an endpoint exists under `/api/*`, it MUST be protected
+**When creating new route files:**
+1. Import `authMiddleware` from `../auth/middleware`
+2. Add `router.use(authMiddleware)` immediately after creating the router
+3. Document security requirements in file header comments
+**Trusted origins (defined in middleware):**
+- `https://cannaiq.co`
+- `https://findadispo.com`
+- `https://findagram.co`
+- `*.cannabrands.app` domains
+- `localhost:*` for development
+### 7. LOCAL DEVELOPMENT BY DEFAULT
 **Quick Start:**
 ```bash
@@ -452,6 +487,7 @@ const result = await pool.query(`
 16. **Running `lsof -ti:PORT | xargs kill`** or similar process-killing commands
 17. **Using hardcoded database names** in code or comments
 18. **Creating or connecting to a second database**
+19. **Creating API routes without authMiddleware** (all `/api/*` routes MUST be protected)
 ---
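The authentication flow added in rule 6 above (token, then trusted origin, then trusted IP, else 401) can be sketched as a pure decision function. This is only an illustration under stated assumptions, not the contents of `src/auth/middleware.ts`: the `AuthRequest` shape, the `isValidToken` callback, and the regex allow-lists are hypothetical, and internal pod IPs are left out because the diff does not list them.

```typescript
// Hypothetical sketch of the documented check order; names are not from the repo.
interface AuthRequest {
  authorization?: string; // raw Authorization header
  origin?: string;        // raw Origin header
  ip: string;             // remote address
}

type AuthDecision =
  | { allowed: true; via: 'token' | 'origin' | 'ip' }
  | { allowed: false; status: 401 };

// Assumed allow-lists, modelled on the "Trusted origins" list in the rule text.
const TRUSTED_ORIGINS: RegExp[] = [
  /^https:\/\/cannaiq\.co$/,
  /^https:\/\/findadispo\.com$/,
  /^https:\/\/findagram\.co$/,
  /^https:\/\/([a-z0-9-]+\.)+cannabrands\.app$/,
  /^https?:\/\/localhost(:\d+)?$/,
];
const TRUSTED_IPS = new Set<string>(['127.0.0.1', '::1']);

function authorize(
  req: AuthRequest,
  isValidToken: (token: string) => boolean, // stand-in for JWT / API-token validation
): AuthDecision {
  // 1. Bearer token
  const token = /^Bearer\s+(.+)$/.exec(req.authorization ?? '')?.[1];
  if (token !== undefined && isValidToken(token)) {
    return { allowed: true, via: 'token' };
  }
  // 2. Trusted origin (anchored patterns, so look-alike hosts do not match)
  const origin = req.origin;
  if (origin !== undefined && TRUSTED_ORIGINS.some((re) => re.test(origin))) {
    return { allowed: true, via: 'origin' };
  }
  // 3. Trusted IP
  if (TRUSTED_IPS.has(req.ip)) {
    return { allowed: true, via: 'ip' };
  }
  // 4. Nothing matched: reject
  return { allowed: false, status: 401 };
}
```

Keeping the decision separate from the Express plumbing makes the order of checks easy to unit test; a real middleware would call something like this and either invoke `next()` or send the 401.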


@@ -5,7 +5,7 @@ FROM code.cannabrands.app/creationshop/node:20-slim AS builder
 WORKDIR /app
 COPY package*.json ./
-RUN npm ci
+RUN npm install
 COPY . .
 RUN npm run build
@@ -25,8 +25,9 @@ ENV APP_GIT_SHA=${APP_GIT_SHA}
 ENV APP_BUILD_TIME=${APP_BUILD_TIME}
 ENV CONTAINER_IMAGE_TAG=${CONTAINER_IMAGE_TAG}
-# Install Chromium dependencies
+# Install Chromium dependencies and curl for HTTP requests
 RUN apt-get update && apt-get install -y \
+    curl \
     chromium \
     fonts-liberation \
     libnss3 \
@@ -43,10 +44,13 @@ ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
 WORKDIR /app
 COPY package*.json ./
-RUN npm ci --omit=dev
+RUN npm install --omit=dev
 COPY --from=builder /app/dist ./dist
+# Copy migrations for auto-migrate on startup
+COPY migrations ./migrations
 # Create local images directory for when MinIO is not configured
 RUN mkdir -p /app/public/images/products


@@ -0,0 +1,394 @@
# Brand Intelligence API
## Endpoint
```
GET /api/analytics/v2/brand/:name/intelligence
```
## Query Parameters
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `window` | `7d\|30d\|90d` | `30d` | Time window for trend calculations |
| `state` | string | - | Filter by state code (e.g., `AZ`) |
| `category` | string | - | Filter by category (e.g., `Flower`) |
## Response Payload Schema
```typescript
interface BrandIntelligenceResult {
  brand_name: string;
  window: '7d' | '30d' | '90d';
  generated_at: string; // ISO timestamp when data was computed
  performance_snapshot: PerformanceSnapshot;
  alerts: Alerts;
  sku_performance: SkuPerformance[];
  retail_footprint: RetailFootprint;
  competitive_landscape: CompetitiveLandscape;
  inventory_health: InventoryHealth;
  promo_performance: PromoPerformance;
}
```
---
## Section 1: Performance Snapshot
Summary cards with key brand metrics.
```typescript
interface PerformanceSnapshot {
  active_skus: number;               // Total products in catalog
  total_revenue_30d: number | null;  // Estimated from qty × price
  total_stores: number;              // Active retail partners
  new_stores_30d: number;            // New distribution in window
  market_share: number | null;       // % of category SKUs
  avg_wholesale_price: number | null;
  price_position: 'premium' | 'value' | 'competitive';
}
```
**UI Label Mapping:**
| Field | User-Facing Label | Helper Text |
|-------|-------------------|-------------|
| `active_skus` | Active Products | X total in catalog |
| `total_revenue_30d` | Monthly Revenue | Estimated from sales |
| `total_stores` | Retail Distribution | Active retail partners |
| `new_stores_30d` | New Opportunities | X new in last 30 days |
| `market_share` | Category Position | % of category |
| `avg_wholesale_price` | Avg Wholesale | Per unit |
| `price_position` | Pricing Tier | Premium/Value/Market Rate |
---
## Section 2: Alerts
Issues requiring attention.
```typescript
interface Alerts {
  lost_stores_30d_count: number;
  lost_skus_30d_count: number;
  competitor_takeover_count: number;
  avg_oos_duration_days: number | null;
  avg_reorder_lag_days: number | null;
  items: AlertItem[];
}

interface AlertItem {
  type: 'lost_store' | 'delisted_sku' | 'shelf_loss' | 'extended_oos';
  severity: 'critical' | 'warning';
  store_name?: string;
  product_name?: string;
  competitor_brand?: string;
  days_since?: number;
  state_code?: string;
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `lost_stores_30d_count` | Accounts at Risk |
| `lost_skus_30d_count` | Delisted SKUs |
| `competitor_takeover_count` | Shelf Losses |
| `avg_oos_duration_days` | Avg Stockout Length |
| `avg_reorder_lag_days` | Avg Restock Time |
| `severity: critical` | Urgent |
| `severity: warning` | Watch |
---
## Section 3: SKU Performance (Product Velocity)
How fast each SKU sells.
```typescript
interface SkuPerformance {
  store_product_id: number;
  product_name: string;
  category: string | null;
  daily_velocity: number; // Units/day estimate
  velocity_status: 'hot' | 'steady' | 'slow' | 'stale';
  retail_price: number | null;
  on_sale: boolean;
  stores_carrying: number;
  stock_status: 'in_stock' | 'low_stock' | 'out_of_stock';
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `daily_velocity` | Daily Rate |
| `velocity_status` | Momentum |
| `velocity_status: hot` | Hot |
| `velocity_status: steady` | Steady |
| `velocity_status: slow` | Slow |
| `velocity_status: stale` | Stale |
| `retail_price` | Retail Price |
| `on_sale` | Promo (badge) |
**Velocity Thresholds:**
- `hot`: >= 5 units/day
- `steady`: >= 1 unit/day
- `slow`: >= 0.1 units/day
- `stale`: < 0.1 units/day
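The thresholds above map directly onto a small classifier. The function name `classifyVelocity` is hypothetical and the backend may compute `velocity_status` differently; this sketch only restates the listed cut-offs.

```typescript
type VelocityStatus = 'hot' | 'steady' | 'slow' | 'stale';

// Classifies an estimated units/day figure using the documented thresholds.
function classifyVelocity(unitsPerDay: number): VelocityStatus {
  if (unitsPerDay >= 5) return 'hot';
  if (unitsPerDay >= 1) return 'steady';
  if (unitsPerDay >= 0.1) return 'slow';
  return 'stale';
}
```

Checking the highest threshold first keeps each branch a single comparison, and the lower bounds are inclusive, matching the `>=` in the list.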
---
## Section 4: Retail Footprint
Store placement and coverage.
```typescript
interface RetailFootprint {
  total_stores: number;
  in_stock_count: number;
  out_of_stock_count: number;
  penetration_by_region: RegionPenetration[];
  whitespace_stores: WhitespaceStore[];
}

interface RegionPenetration {
  state_code: string;
  store_count: number;
  percent_reached: number; // % of state's dispensaries
  in_stock: number;
  out_of_stock: number;
}

interface WhitespaceStore {
  store_id: number;
  store_name: string;
  state_code: string;
  city: string | null;
  category_fit: number; // How many competing brands they carry
  competitor_brands: string[];
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `penetration_by_region` | Market Coverage by Region |
| `percent_reached` | X% reached |
| `in_stock` | X stocked |
| `out_of_stock` | X out |
| `whitespace_stores` | Expansion Opportunities |
| `category_fit` | X fit |
---
## Section 5: Competitive Landscape
Market positioning vs competitors.
```typescript
interface CompetitiveLandscape {
  brand_price_position: 'premium' | 'value' | 'competitive';
  market_share_trend: MarketSharePoint[];
  competitors: Competitor[];
  head_to_head_skus: HeadToHead[];
}

interface MarketSharePoint {
  date: string;
  share_percent: number;
}

interface Competitor {
  brand_name: string;
  store_overlap_percent: number;
  price_position: 'premium' | 'value' | 'competitive';
  avg_price: number | null;
  sku_count: number;
}

interface HeadToHead {
  product_name: string;
  brand_price: number;
  competitor_brand: string;
  competitor_price: number;
  price_diff_percent: number;
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `price_position: premium` | Premium Tier |
| `price_position: value` | Value Leader |
| `price_position: competitive` | Market Rate |
| `market_share_trend` | Share of Shelf Trend |
| `head_to_head_skus` | Price Comparison |
| `store_overlap_percent` | X% store overlap |
---
## Section 6: Inventory Health
Stock projections and risk levels.
```typescript
interface InventoryHealth {
  critical_count: number;     // <7 days stock
  warning_count: number;      // 7-14 days stock
  healthy_count: number;      // 14-90 days stock
  overstocked_count: number;  // >90 days stock
  skus: InventorySku[];
  overstock_alert: OverstockItem[];
}

interface InventorySku {
  store_product_id: number;
  product_name: string;
  store_name: string;
  days_of_stock: number | null;
  risk_level: 'critical' | 'elevated' | 'moderate' | 'healthy';
  current_quantity: number | null;
  daily_sell_rate: number | null;
}

interface OverstockItem {
  product_name: string;
  store_name: string;
  excess_units: number;
  days_of_stock: number;
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `risk_level: critical` | Reorder Now |
| `risk_level: elevated` | Low Stock |
| `risk_level: moderate` | Monitor |
| `risk_level: healthy` | Healthy |
| `critical_count` | Urgent (<7 days) |
| `warning_count` | Low (7-14 days) |
| `overstocked_count` | Excess (>90 days) |
| `days_of_stock` | X days remaining |
| `overstock_alert` | Overstock Alert |
| `excess_units` | X excess units |
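To make the day ranges concrete, here is a hedged sketch of how a count bucket could be derived from `days_of_stock`. Two things are assumptions rather than documented behavior: that `days_of_stock` equals `current_quantity / daily_sell_rate`, and that the boundaries fall as `<7`, `7` to `<14`, `14` to `90`, and `>90` (the ranges in the schema comments overlap at 14). The bucket names follow the `*_count` fields; how they map onto `risk_level` is not specified here.

```typescript
type StockBucket = 'critical' | 'warning' | 'healthy' | 'overstocked';

// Assumed formula: remaining units divided by units sold per day.
function daysOfStock(currentQuantity: number | null, dailySellRate: number | null): number | null {
  if (currentQuantity === null || dailySellRate === null || dailySellRate <= 0) return null;
  return currentQuantity / dailySellRate;
}

// Assumed boundary handling for the documented ranges.
function stockBucket(days: number | null): StockBucket | null {
  if (days === null) return null; // unknown sell rate or quantity
  if (days < 7) return 'critical';
  if (days < 14) return 'warning';
  if (days <= 90) return 'healthy';
  return 'overstocked';
}
```

Returning `null` instead of a bucket when inputs are missing mirrors the nullable `days_of_stock` field, so the UI can show an empty state rather than a misleading "Healthy".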
---
## Section 7: Promotion Effectiveness
How promotions impact sales.
```typescript
interface PromoPerformance {
  avg_baseline_velocity: number | null;
  avg_promo_velocity: number | null;
  avg_velocity_lift: number | null;     // % increase during promo
  avg_efficiency_score: number | null;  // ROI proxy
  promotions: Promotion[];
}

interface Promotion {
  product_name: string;
  store_name: string;
  status: 'active' | 'scheduled' | 'ended';
  start_date: string;
  end_date: string | null;
  regular_price: number;
  promo_price: number;
  discount_percent: number;
  baseline_velocity: number | null;
  promo_velocity: number | null;
  velocity_lift: number | null;
  efficiency_score: number | null;
}
```
**UI Label Mapping:**
| Field | User-Facing Label |
|-------|-------------------|
| `avg_baseline_velocity` | Normal Rate |
| `avg_promo_velocity` | During Promos |
| `avg_velocity_lift` | Avg Sales Lift |
| `avg_efficiency_score` | ROI Score |
| `velocity_lift` | Sales Lift |
| `efficiency_score` | ROI Score |
| `status: active` | Live |
| `status: scheduled` | Scheduled |
| `status: ended` | Ended |
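The schema describes `velocity_lift` as the percentage increase during a promotion. One plausible reading, shown here as an assumption rather than the backend's actual formula, is the relative change from baseline to promo velocity. The function name `velocityLiftPercent` is hypothetical.

```typescript
// Assumed formula: percentage change of promo velocity relative to baseline.
function velocityLiftPercent(baseline: number | null, promo: number | null): number | null {
  // Without a positive baseline the ratio is undefined, so report null.
  if (baseline === null || promo === null || baseline <= 0) return null;
  return ((promo - baseline) / baseline) * 100;
}
```

Under this reading a promotion that moves a SKU from 2 to 3 units/day has a lift of 50, and a negative value means the SKU sold slower during the promotion.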
---
## Example Queries
### Get full payload
```javascript
const response = await fetch('/api/analytics/v2/brand/Wyld/intelligence?window=30d');
if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
const data = await response.json();
```
### Extract summary cards (flattened)
```javascript
const { performance_snapshot: ps, alerts } = data;
const summaryCards = {
  activeProducts: ps.active_skus,
  monthlyRevenue: ps.total_revenue_30d,
  retailDistribution: ps.total_stores,
  newOpportunities: ps.new_stores_30d,
  categoryPosition: ps.market_share,
  avgWholesale: ps.avg_wholesale_price,
  pricingTier: ps.price_position,
  accountsAtRisk: alerts.lost_stores_30d_count,
  delistedSkus: alerts.lost_skus_30d_count,
  shelfLosses: alerts.competitor_takeover_count,
};
```
### Get top 10 fastest selling SKUs
```javascript
const topSkus = data.sku_performance
  .filter(sku => sku.velocity_status === 'hot' || sku.velocity_status === 'steady')
  .sort((a, b) => b.daily_velocity - a.daily_velocity)
  .slice(0, 10);
```
### Get critical inventory alerts only
```javascript
const criticalInventory = data.inventory_health.skus
  .filter(sku => sku.risk_level === 'critical');
```
### Get states with <50% penetration
```javascript
const underPenetrated = data.retail_footprint.penetration_by_region
  .filter(region => region.percent_reached < 50)
  .sort((a, b) => a.percent_reached - b.percent_reached);
```
### Get active promotions with positive lift
```javascript
const effectivePromos = data.promo_performance.promotions
  // velocity_lift can be null, so exclude those rows explicitly
  .filter(p => p.status === 'active' && p.velocity_lift !== null && p.velocity_lift > 0)
  .sort((a, b) => b.velocity_lift - a.velocity_lift);
```
### Build chart data for market share trend
```javascript
const chartData = data.competitive_landscape.market_share_trend.map(point => ({
  x: new Date(point.date),
  y: point.share_percent,
}));
```
---
## Notes for Frontend Implementation
1. **All fields are snake_case** - transform to camelCase if needed
2. **Null values are possible** - handle gracefully in UI
3. **Arrays may be empty** - show appropriate empty states
4. **Timestamps are ISO format** - parse with `new Date()`
5. **Percentages are already computed** - no need to multiply by 100
6. **The `window` parameter affects trend calculations** - 7d/30d/90d
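For note 1, a recursive key transform is one way to convert the payload to camelCase. The helper names `snakeToCamel` and `camelizeKeys` are hypothetical, and the sketch assumes plain JSON data (objects, arrays, primitives), which is what `response.json()` returns.

```typescript
// Converts a single snake_case key, e.g. 'total_revenue_30d' -> 'totalRevenue30d'.
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Recursively rewrites object keys; arrays and primitives (including null) pass through.
function camelizeKeys(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => camelizeKeys(item));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[snakeToCamel(key)] = camelizeKeys(source[key]);
    }
    return out;
  }
  return value;
}
```

Because `null` is handled before the object branch, nullable fields such as `total_revenue_30d` survive the transform unchanged, which keeps note 2 easy to honor downstream.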


@@ -500,17 +500,18 @@ CREATE TABLE proxies (
 Proxies are mandatory. There is no environment variable to disable them. Workers will refuse to start without active proxies in the database.
-### Fingerprints Available
+### User-Agent Generation
-The client includes 6 browser fingerprints:
-- Chrome 131 on Windows
-- Chrome 131 on macOS
-- Chrome 120 on Windows
-- Firefox 133 on Windows
-- Safari 17.2 on macOS
-- Edge 131 on Windows
+See `workflow-12102025.md` for full specification.
-Each includes proper `sec-ch-ua`, `sec-ch-ua-platform`, and `sec-ch-ua-mobile` headers.
+**Summary:**
+- Uses `intoli/user-agents` library (daily-updated market share data)
+- Device distribution: Mobile 62%, Desktop 36%, Tablet 2%
+- Browser whitelist: Chrome, Safari, Edge, Firefox only
+- UA sticks until IP rotates (403 or manual rotation)
+- Failure = alert admin + stop crawl (no fallback)
+Each fingerprint includes proper `sec-ch-ua`, `sec-ch-ua-platform`, and `sec-ch-ua-mobile` headers.
 ---


@@ -0,0 +1,584 @@
# Task Workflow Documentation
**Date: 2024-12-10**
This document describes the complete task/job processing architecture after the 2024-12-10 rewrite.
---
## Complete Architecture
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ KUBERNETES CLUSTER │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ API SERVER POD (scraper) │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌────────────────────────────────────────┐ │ │
│ │ │ Express API │ │ TaskScheduler │ │ │
│ │ │ │ │ (src/services/task-scheduler.ts) │ │ │
│ │ │ /api/job-queue │ │ │ │ │
│ │ │ /api/tasks │ │ • Polls every 60s │ │ │
│ │ │ /api/schedules │ │ • Checks task_schedules table │ │ │
│ │ └────────┬─────────┘ │ • SELECT FOR UPDATE SKIP LOCKED │ │ │
│ │ │ │ • Generates tasks when due │ │ │
│ │ │ └──────────────────┬─────────────────────┘ │ │
│ │ │ │ │ │
│ └────────────┼──────────────────────────────────┼──────────────────────────┘ │
│ │ │ │
│ │ ┌────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ POSTGRESQL DATABASE │ │
│ │ │ │
│ │ ┌─────────────────────┐ ┌─────────────────────┐ │ │
│ │ │ task_schedules │ │ worker_tasks │ │ │
│ │ │ │ │ │ │ │
│ │ │ • product_refresh │───────►│ • pending tasks │ │ │
│ │ │ • store_discovery │ create │ • claimed tasks │ │ │
│ │ │ • analytics_refresh │ tasks │ • running tasks │ │ │
│ │ │ │ │ • completed tasks │ │ │
│ │ │ next_run_at │ │ │ │ │
│ │ │ last_run_at │ │ role, dispensary_id │ │ │
│ │ │ interval_hours │ │ priority, status │ │ │
│ │ └─────────────────────┘ └──────────┬──────────┘ │ │
│ │ │ │ │
│ └─────────────────────────────────────────────┼────────────────────────────┘ │
│ │ │
│ ┌──────────────────────┘ │
│ │ Workers poll for tasks │
│ │ (SELECT FOR UPDATE SKIP LOCKED) │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ WORKER PODS (StatefulSet: scraper-worker) │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Worker 0 │ │ Worker 1 │ │ Worker 2 │ │ Worker N │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ task-worker │ │ task-worker │ │ task-worker │ │ task-worker │ │ │
│ │ │ .ts │ │ .ts │ │ .ts │ │ .ts │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────┘
```
---
## Startup Sequence
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ API SERVER STARTUP │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. Express app initializes │
│ │ │
│ ▼ │
│ 2. runAutoMigrations() │
│ • Runs pending migrations (including 079_task_schedules.sql) │
│ │ │
│ ▼ │
│ 3. initializeMinio() / initializeImageStorage() │
│ │ │
│ ▼ │
│ 4. cleanupOrphanedJobs() │
│ │ │
│ ▼ │
│ 5. taskScheduler.start() ◄─── NEW (per TASK_WORKFLOW_2024-12-10.md) │
│ │ │
│ ├── Recover stale tasks (workers that died) │
│ ├── Ensure default schedules exist in task_schedules │
│ ├── Check and run any due schedules immediately │
│ └── Start 60-second poll interval │
│ │ │
│ ▼ │
│ 6. app.listen(PORT) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ WORKER POD STARTUP │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. K8s starts pod from StatefulSet │
│ │ │
│ ▼ │
│ 2. TaskWorker.constructor() │
│ • Create DB pool │
│ • Create CrawlRotator │
│ │ │
│ ▼ │
│ 3. initializeStealth() │
│ • Load proxies from DB (REQUIRED - fails if none) │
│ • Wire rotator to Dutchie client │
│ │ │
│ ▼ │
│ 4. register() with API │
│ • Optional - continues if fails │
│ │ │
│ ▼ │
│ 5. startRegistryHeartbeat() every 30s │
│ │ │
│ ▼ │
│ 6. processNextTask() loop │
│ │ │
│ ├── Poll for pending task (FOR UPDATE SKIP LOCKED) │
│ ├── Claim task atomically │
│ ├── Execute handler (product_refresh, store_discovery, etc.) │
│ ├── Mark complete/failed │
│ ├── Chain next task if applicable │
│ └── Loop │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
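The "poll, then claim atomically" step of the worker loop can be sketched as a single statement that selects one pending row with `FOR UPDATE SKIP LOCKED` and updates it in the same query. This is an assumption about how `task-worker.ts` might do it, not a copy of it: the `Queryable` interface, `claimNextTask`, and the exact SQL are hypothetical, and only column and status names that appear in this document are used.

```typescript
// Minimal query interface (hypothetical); a node-postgres Pool has a compatible shape.
interface Queryable {
  query(text: string, params?: unknown[]): Promise<{ rows: Record<string, unknown>[] }>;
}

interface ClaimedTask {
  id: number;
  role: string;
  dispensary_id: number | null;
}

// Assumed claim statement: lock one eligible row, skip rows other workers hold.
const CLAIM_SQL = `
  UPDATE worker_tasks
  SET status = 'claimed', worker_id = $1
  WHERE id = (
    SELECT id FROM worker_tasks
    WHERE status = 'pending'
      AND (scheduled_for IS NULL OR scheduled_for <= NOW())
    ORDER BY priority DESC, id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
  )
  RETURNING id, role, dispensary_id`;

async function claimNextTask(db: Queryable, workerId: string): Promise<ClaimedTask | null> {
  const { rows } = await db.query(CLAIM_SQL, [workerId]);
  const row = rows[0];
  if (!row) return null; // queue empty, or every pending row is locked by another worker
  const dispensaryId = row['dispensary_id'];
  return {
    id: Number(row['id']),
    role: String(row['role']),
    dispensary_id: dispensaryId === null || dispensaryId === undefined ? null : Number(dispensaryId),
  };
}
```

Doing the select and the update in one statement means two workers polling at the same moment cannot both receive the same task; the loser of the race simply sees the next row or nothing.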
---
## Schedule Flow
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ SCHEDULER POLL (every 60 seconds) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ BEGIN TRANSACTION │
│ │ │
│ ▼ │
│ SELECT * FROM task_schedules │
│ WHERE enabled = true AND next_run_at <= NOW() │
│ FOR UPDATE SKIP LOCKED ◄─── Prevents duplicate execution across replicas │
│ │ │
│ ▼ │
│ For each due schedule: │
│ │ │
│ ├── product_refresh_all │
│ │ └─► Query dispensaries needing crawl │
│ │ └─► Create product_refresh tasks in worker_tasks │
│ │ │
│ ├── store_discovery_dutchie │
│ │ └─► Create single store_discovery task │
│ │ │
│ └── analytics_refresh │
│ └─► Create single analytics_refresh task │
│ │ │
│ ▼ │
│ UPDATE task_schedules SET │
│ last_run_at = NOW(), │
│ next_run_at = NOW() + interval_hours │
│ │ │
│ ▼ │
│ COMMIT │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
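The due check and the `next_run_at` update in the poll above can be sketched as two pure functions. This is a minimal sketch, not the code in `task-scheduler.ts`: the `Schedule` shape and the names `isDue` and `markRun` are hypothetical, and the row locking done by `FOR UPDATE SKIP LOCKED` is not modeled here.

```typescript
// Hypothetical in-memory shape mirroring a few task_schedules columns.
interface Schedule {
  name: string;
  enabled: boolean;
  intervalHours: number;
  lastRunAt: Date | null;
  nextRunAt: Date | null;
}

const HOUR_MS = 60 * 60 * 1000;

// Mirrors: WHERE enabled = true AND next_run_at <= NOW().
// A NULL next_run_at never matches, as in SQL.
function isDue(s: Schedule, now: Date): boolean {
  return s.enabled && s.nextRunAt !== null && s.nextRunAt.getTime() <= now.getTime();
}

// Mirrors the UPDATE step: last_run_at = now, next_run_at = now + interval.
function markRun(s: Schedule, now: Date): Schedule {
  return {
    ...s,
    lastRunAt: now,
    nextRunAt: new Date(now.getTime() + s.intervalHours * HOUR_MS),
  };
}
```

Because the next run is computed from the time of execution rather than from the previous `next_run_at`, a schedule that fires late simply shifts forward instead of trying to catch up.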
---
## Task Lifecycle
```
┌──────────┐
│ SCHEDULE │
│ DUE │
└────┬─────┘
┌──────────────┐ claim ┌──────────────┐ start ┌──────────────┐
│ PENDING │────────────►│ CLAIMED │────────────►│ RUNNING │
└──────────────┘ └──────────────┘ └──────┬───────┘
▲ │
│ ┌──────────────┼──────────────┐
│ retry │ │ │
│ (if retries < max) ▼ ▼ ▼
│ ┌──────────┐ ┌──────────┐ ┌──────────┐
└──────────────────────────────────│ FAILED │ │ COMPLETED│ │ STALE │
└──────────┘ └──────────┘ └────┬─────┘
recover_stale_tasks()
┌──────────┐
│ PENDING │
└──────────┘
```
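The lifecycle above can be read as a small state machine. The sketch below is illustrative only: the event names and the `transition` function are hypothetical, `stale` is treated as a status for clarity (the real system may detect it from `last_heartbeat_at` instead), and it assumes stale recovery does not increment `retry_count`.

```typescript
type TaskStatus = "pending" | "claimed" | "running" | "completed" | "failed" | "stale";
type TaskEvent = "claim" | "start" | "complete" | "fail" | "heartbeat_timeout" | "recover" | "retry";

interface TaskState {
  status: TaskStatus;
  retryCount: number;
  maxRetries: number;
}

// Returns the next state, or throws if the diagram has no such edge.
function transition(task: TaskState, event: TaskEvent): TaskState {
  const { status } = task;
  if (status === "pending" && event === "claim") return { ...task, status: "claimed" };
  if (status === "claimed" && event === "start") return { ...task, status: "running" };
  if (status === "running" && event === "complete") return { ...task, status: "completed" };
  if (status === "running" && event === "fail") return { ...task, status: "failed" };
  if (status === "running" && event === "heartbeat_timeout") return { ...task, status: "stale" };
  if (status === "stale" && event === "recover") return { ...task, status: "pending" };
  // Retry is only allowed while retries < max, as in the diagram.
  if (status === "failed" && event === "retry" && task.retryCount < task.maxRetries) {
    return { ...task, status: "pending", retryCount: task.retryCount + 1 };
  }
  throw new Error(`Invalid transition: ${status} + ${event}`);
}
```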
---
## Database Tables
### task_schedules (NEW - migration 079)
Stores schedule definitions. Survives restarts.
```sql
CREATE TABLE task_schedules (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL UNIQUE,
role VARCHAR(50) NOT NULL, -- product_refresh, store_discovery, etc.
enabled BOOLEAN DEFAULT TRUE,
interval_hours INTEGER NOT NULL, -- How often to run
priority INTEGER DEFAULT 0, -- Task priority when created
state_code VARCHAR(2), -- Optional filter
last_run_at TIMESTAMPTZ, -- When it last ran
next_run_at TIMESTAMPTZ, -- When it's due next
last_task_count INTEGER, -- Tasks created last run
last_error TEXT -- Error message if failed
);
```
### worker_tasks (migration 074)
The task queue. Workers pull from here.
```sql
CREATE TABLE worker_tasks (
id SERIAL PRIMARY KEY,
role task_role NOT NULL, -- What type of work
dispensary_id INTEGER, -- Which store (if applicable)
platform VARCHAR(50), -- Which platform
status task_status DEFAULT 'pending',
priority INTEGER DEFAULT 0, -- Higher = process first
scheduled_for TIMESTAMP, -- Don't process before this time
worker_id VARCHAR(100), -- Which worker claimed it
claimed_at TIMESTAMP,
started_at TIMESTAMP,
completed_at TIMESTAMP,
last_heartbeat_at TIMESTAMP, -- For stale detection
result JSONB,
error_message TEXT,
retry_count INTEGER DEFAULT 0,
max_retries INTEGER DEFAULT 3
);
```
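How `status`, `priority`, and `scheduled_for` interact when a worker picks its next task can be sketched in memory. This is an assumption-laden illustration, not the worker's actual claim query: `QueuedTask` and `pickNextTask` are hypothetical names, and breaking priority ties by lowest `id` is an assumption.

```typescript
// Hypothetical in-memory view of a few worker_tasks columns.
interface QueuedTask {
  id: number;
  status: string;
  priority: number;
  scheduledFor: Date | null;
}

// Picks the highest-priority pending task whose scheduled_for has passed
// (or is unset). Ties fall back to the lowest id (assumed ordering).
function pickNextTask(tasks: QueuedTask[], now: Date): QueuedTask | undefined {
  return tasks
    .filter(
      (t) =>
        t.status === "pending" &&
        (t.scheduledFor === null || t.scheduledFor.getTime() <= now.getTime())
    )
    .sort((a, b) => b.priority - a.priority || a.id - b.id)[0];
}
```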
---
## Default Schedules
| Name | Role | Interval | Priority | Description |
|------|------|----------|----------|-------------|
| `payload_fetch_all` | payload_fetch | 4 hours | 0 | Fetch payloads from Dutchie API (chains to product_refresh) |
| `store_discovery_dutchie` | store_discovery | 24 hours | 5 | Find new Dutchie stores |
| `analytics_refresh` | analytics_refresh | 6 hours | 0 | Refresh materialized views |
---
## Task Roles
| Role | Description | Creates Tasks For |
|------|-------------|-------------------|
| `payload_fetch` | **NEW** - Fetch from Dutchie API, save to disk | Each dispensary needing crawl |
| `product_refresh` | **CHANGED** - Read local payload, normalize, upsert to DB | Chained from payload_fetch |
| `store_discovery` | Find new dispensaries, returns newStoreIds[] | Single task per platform |
| `entry_point_discovery` | **DEPRECATED** - Resolve platform IDs | No longer used |
| `product_discovery` | Initial product fetch for new stores | Chained from store_discovery |
| `analytics_refresh` | Refresh materialized views | Single global task |
### Payload/Refresh Separation (2024-12-10)
The crawl workflow is now split into two phases:
```
payload_fetch (scheduled every 4h)
└─► Hit Dutchie GraphQL API
└─► Save raw JSON to /storage/payloads/{year}/{month}/{day}/store_{id}_{ts}.json.gz
└─► Record metadata in raw_crawl_payloads table
└─► Queue product_refresh task with payload_id
product_refresh (chained from payload_fetch)
└─► Load payload from filesystem (NOT from API)
└─► Normalize via DutchieNormalizer
└─► Upsert to store_products
└─► Create snapshots
└─► Track missing products
└─► Download images
```
**Benefits:**
- **Retry-friendly**: If normalize fails, re-run product_refresh without re-crawling
- **Replay-able**: Run product_refresh against any historical payload
- **Faster refreshes**: Local file read vs network call
- **Historical diffs**: Compare payloads to see what changed between crawls
- **Less API pressure**: Only payload_fetch hits Dutchie
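The storage path pattern `/storage/payloads/{year}/{month}/{day}/store_{id}_{ts}.json.gz` can be built with a small helper. This is a sketch under assumptions: the function name `payloadPath` is hypothetical, and it assumes UTC dates, zero-padded month and day, and a Unix timestamp in seconds. The real `payload-storage.ts` may differ on any of these.

```typescript
// Builds the payload file path for a store and fetch time.
// Assumes UTC, zero-padded month/day, and a Unix timestamp in seconds.
function payloadPath(dispensaryId: number, fetchedAt: Date): string {
  const year = fetchedAt.getUTCFullYear();
  const month = String(fetchedAt.getUTCMonth() + 1).padStart(2, "0");
  const day = String(fetchedAt.getUTCDate()).padStart(2, "0");
  const ts = Math.floor(fetchedAt.getTime() / 1000);
  return `/storage/payloads/${year}/${month}/${day}/store_${dispensaryId}_${ts}.json.gz`;
}
```

Partitioning by date keeps directories small and makes it straightforward to locate or prune payloads for a given day.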
---
## Task Chaining
Tasks automatically queue follow-up tasks upon successful completion. This creates two main flows:
### Discovery Flow (New Stores)
When `store_discovery` finds new dispensaries, they automatically get their initial product data:
```
store_discovery
└─► Discovers new locations via Dutchie GraphQL
└─► Auto-promotes valid locations to dispensaries table
└─► Collects newDispensaryIds[] from promotions
└─► Returns { newStoreIds: [...] } in result
chainNextTask() detects newStoreIds
└─► Creates product_discovery task for each new store
product_discovery
└─► Calls handlePayloadFetch() internally
└─► payload_fetch hits Dutchie API
└─► Saves raw JSON to /storage/payloads/
└─► Queues product_refresh task with payload_id
product_refresh
└─► Loads payload from filesystem
└─► Normalizes and upserts to store_products
└─► Creates snapshots, downloads images
```
**Complete Discovery Chain:**
```
store_discovery → product_discovery → payload_fetch → product_refresh
(internal call) (queues next)
```
### Scheduled Flow (Existing Stores)
For existing stores, `payload_fetch_all` schedule runs every 4 hours:
```
TaskScheduler (every 60s)
└─► Checks task_schedules for due schedules
└─► payload_fetch_all is due
└─► Generates payload_fetch task for each dispensary
payload_fetch
└─► Hits Dutchie GraphQL API
└─► Saves raw JSON to /storage/payloads/
└─► Queues product_refresh task with payload_id
product_refresh
└─► Loads payload from filesystem (NOT API)
└─► Normalizes via DutchieNormalizer
└─► Upserts to store_products
└─► Creates snapshots
```
**Complete Scheduled Chain:**
```
payload_fetch → product_refresh
(queues) (reads local)
```
### Chaining Implementation
Task chaining is handled in three places:
1. **Internal chaining (handler calls handler):**
- `product_discovery` calls `handlePayloadFetch()` directly
2. **External chaining (chainNextTask() in task-service.ts):**
- Called after task completion
- `store_discovery` → queues `product_discovery` for each newStoreId
3. **Queue-based chaining (taskService.createTask):**
- `payload_fetch` queues `product_refresh` with `payload: { payload_id }`
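The chaining rules can be summarized as a function from a completed task to the follow-up tasks it produces. This is a simplified sketch: `followUpTasks` and the types below are hypothetical, and it folds the external and queue-based paths into one function for readability, whereas the document describes them as living in `chainNextTask()` and the `payload_fetch` handler respectively.

```typescript
type TaskRole =
  | "store_discovery"
  | "product_discovery"
  | "payload_fetch"
  | "product_refresh"
  | "analytics_refresh";

// Hypothetical shapes; the real task rows carry more fields.
interface CompletedTask {
  role: TaskRole;
  dispensaryId?: number;
  result?: { newStoreIds?: number[]; payloadId?: number };
}

interface NewTask {
  role: TaskRole;
  dispensaryId: number;
  payload?: { payload_id: number };
}

function followUpTasks(task: CompletedTask): NewTask[] {
  // store_discovery -> one product_discovery task per newly found store.
  if (task.role === "store_discovery") {
    const ids = task.result?.newStoreIds ?? [];
    return ids.map((id): NewTask => ({ role: "product_discovery", dispensaryId: id }));
  }
  // payload_fetch -> product_refresh carrying the saved payload_id.
  const payloadId = task.result?.payloadId;
  if (task.role === "payload_fetch" && task.dispensaryId !== undefined && payloadId !== undefined) {
    return [
      { role: "product_refresh", dispensaryId: task.dispensaryId, payload: { payload_id: payloadId } },
    ];
  }
  return [];
}
```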
---
## Payload API Endpoints
Raw crawl payloads can be accessed via the Payloads API:
| Endpoint | Method | Description |
|----------|--------|-------------|
| `GET /api/payloads` | GET | List payload metadata (paginated) |
| `GET /api/payloads/:id` | GET | Get payload metadata by ID |
| `GET /api/payloads/:id/data` | GET | Get full payload JSON (decompressed) |
| `GET /api/payloads/store/:dispensaryId` | GET | List payloads for a store |
| `GET /api/payloads/store/:dispensaryId/latest` | GET | Get latest payload for a store |
| `GET /api/payloads/store/:dispensaryId/diff` | GET | Diff two payloads for changes |
### Payload Diff Response
The diff endpoint returns:
```json
{
"success": true,
"from": { "id": 123, "fetchedAt": "...", "productCount": 100 },
"to": { "id": 456, "fetchedAt": "...", "productCount": 105 },
"diff": {
"added": 10,
"removed": 5,
"priceChanges": 8,
"stockChanges": 12
},
"details": {
"added": [...],
"removed": [...],
"priceChanges": [...],
"stockChanges": [...]
}
}
```
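One way a diff with this response shape could be computed is sketched below. The product fields (`id`, `price`, `inStock`) and the function name `diffPayloads` are assumptions for illustration; real Dutchie payloads have a richer structure and the actual endpoint may compare different fields.

```typescript
// Hypothetical, simplified product record extracted from a payload.
interface ProductSnapshot {
  id: string;
  price: number;
  inStock: boolean;
}

function diffPayloads(from: ProductSnapshot[], to: ProductSnapshot[]) {
  const before = new Map<string, ProductSnapshot>(
    from.map((p): [string, ProductSnapshot] => [p.id, p])
  );
  const after = new Map<string, ProductSnapshot>(
    to.map((p): [string, ProductSnapshot] => [p.id, p])
  );

  const added = to.filter((p) => !before.has(p.id));
  const removed = from.filter((p) => !after.has(p.id));
  const priceChanges: { id: string; from: number; to: number }[] = [];
  const stockChanges: { id: string; from: boolean; to: boolean }[] = [];

  // Compare only products present in both payloads.
  for (const p of to) {
    const prev = before.get(p.id);
    if (prev === undefined) continue;
    if (prev.price !== p.price) priceChanges.push({ id: p.id, from: prev.price, to: p.price });
    if (prev.inStock !== p.inStock) stockChanges.push({ id: p.id, from: prev.inStock, to: p.inStock });
  }

  return {
    diff: {
      added: added.length,
      removed: removed.length,
      priceChanges: priceChanges.length,
      stockChanges: stockChanges.length,
    },
    details: { added, removed, priceChanges, stockChanges },
  };
}
```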
---
## API Endpoints
### Schedules (NEW)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `GET /api/schedules` | GET | List all schedules |
| `PUT /api/schedules/:id` | PUT | Update schedule |
| `POST /api/schedules/:id/trigger` | POST | Run schedule immediately |
### Task Creation (rewired 2024-12-10)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `POST /api/job-queue/enqueue` | POST | Create single task |
| `POST /api/job-queue/enqueue-batch` | POST | Create batch tasks |
| `POST /api/job-queue/enqueue-state` | POST | Create tasks for state |
| `POST /api/tasks` | POST | Direct task creation |
### Task Management
| Endpoint | Method | Description |
|----------|--------|-------------|
| `GET /api/tasks` | GET | List tasks |
| `GET /api/tasks/:id` | GET | Get single task |
| `GET /api/tasks/counts` | GET | Task counts by status |
| `POST /api/tasks/recover-stale` | POST | Recover stale tasks |
---
## Key Files
| File | Purpose |
|------|---------|
| `src/services/task-scheduler.ts` | **NEW** - DB-driven scheduler |
| `src/tasks/task-worker.ts` | Worker that processes tasks |
| `src/tasks/task-service.ts` | Task CRUD operations |
| `src/tasks/handlers/payload-fetch.ts` | **NEW** - Fetches from API, saves to disk |
| `src/tasks/handlers/product-refresh.ts` | **CHANGED** - Reads from disk, processes to DB |
| `src/utils/payload-storage.ts` | **NEW** - Payload save/load utilities |
| `src/routes/tasks.ts` | Task API endpoints |
| `src/routes/job-queue.ts` | Job Queue UI endpoints (rewired) |
| `migrations/079_task_schedules.sql` | Schedule table |
| `migrations/080_raw_crawl_payloads.sql` | Payload metadata table |
| `migrations/081_payload_fetch_columns.sql` | payload, last_fetch_at columns |
| `migrations/074_worker_task_queue.sql` | Task queue table |
---
## Legacy Code (DEPRECATED)
| File | Status | Replacement |
|------|--------|-------------|
| `src/services/scheduler.ts` | DEPRECATED | `task-scheduler.ts` |
| `dispensary_crawl_jobs` table | ORPHANED | `worker_tasks` |
| `job_schedules` table | LEGACY | `task_schedules` |
---
## Dashboard Integration
Both pages remain wired to the dashboard:
| Page | Data Source | Actions |
|------|-------------|---------|
| **Job Queue** | `worker_tasks`, `task_schedules` | Create tasks, view schedules |
| **Task Queue** | `worker_tasks` | View tasks, recover stale |
---
## Multi-Replica Safety
The scheduler uses `SELECT FOR UPDATE SKIP LOCKED` to ensure:
1. **Only one replica** executes a schedule at a time
2. **No duplicate tasks** created
3. **Survives pod restarts** - state in DB, not memory
4. **Self-healing** - recovers stale tasks on startup
```sql
-- This query is atomic across all API server replicas
SELECT * FROM task_schedules
WHERE enabled = true AND next_run_at <= NOW()
FOR UPDATE SKIP LOCKED
```
---
## Worker Scaling (K8s)
Workers run as a StatefulSet in Kubernetes. You can scale from the admin UI or CLI.
### From Admin UI
The Workers page (`/admin/workers`) provides:
- Current replica count display
- Scale up/down buttons
- Target replica input
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `GET /api/workers/k8s/replicas` | GET | Get current/desired replica counts |
| `POST /api/workers/k8s/scale` | POST | Scale to N replicas (body: `{ replicas: N }`) |
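A client call to the scale endpoint could be prepared as follows. This is a sketch only: `buildScaleRequest` is a hypothetical helper, the base URL is a placeholder, and any authentication headers the API requires are not shown because the document does not describe them.

```typescript
interface ScaleRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Validates the replica count and builds the request for
// POST /api/workers/k8s/scale with body { replicas: N }.
function buildScaleRequest(baseUrl: string, replicas: number): ScaleRequest {
  if (!Number.isInteger(replicas) || replicas < 0) {
    throw new RangeError(`replicas must be a non-negative integer, got ${replicas}`);
  }
  return {
    url: `${baseUrl}/api/workers/k8s/scale`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ replicas }),
  };
}
```

The returned object can be passed to an HTTP client of your choice, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.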
### From CLI
```bash
# View current replicas
kubectl get statefulset scraper-worker -n dispensary-scraper
# Scale to 10 workers
kubectl scale statefulset scraper-worker -n dispensary-scraper --replicas=10
# Scale down to 3 workers
kubectl scale statefulset scraper-worker -n dispensary-scraper --replicas=3
```
### Configuration
Environment variables for the API server:
| Variable | Default | Description |
|----------|---------|-------------|
| `K8S_NAMESPACE` | `dispensary-scraper` | Kubernetes namespace |
| `K8S_WORKER_STATEFULSET` | `scraper-worker` | StatefulSet name |
### RBAC Requirements
The API server pod needs these K8s permissions:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: worker-scaler
namespace: dispensary-scraper
rules:
- apiGroups: ["apps"]
resources: ["statefulsets"]
verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: scraper-worker-scaler
namespace: dispensary-scraper
subjects:
- kind: ServiceAccount
name: default
namespace: dispensary-scraper
roleRef:
kind: Role
name: worker-scaler
apiGroup: rbac.authorization.k8s.io
```


@@ -362,6 +362,148 @@ SET status = 'pending', retry_count = retry_count + 1
WHERE status = 'failed' AND retry_count < max_retries;
```
## Concurrent Task Processing (Added 2024-12)
Workers can now process multiple tasks concurrently within a single worker instance. This improves throughput by utilizing async I/O efficiently.
### Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Pod (K8s) │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ TaskWorker │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ (concurrent)│ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ │ Resource Monitor │ │
│ │ ├── Memory: 65% (threshold: 85%) │ │
│ │ ├── CPU: 45% (threshold: 90%) │ │
│ │ └── Status: Normal │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MAX_CONCURRENT_TASKS` | 3 | Maximum tasks a worker will run concurrently |
| `MEMORY_BACKOFF_THRESHOLD` | 0.85 | Back off when heap memory exceeds 85% |
| `CPU_BACKOFF_THRESHOLD` | 0.90 | Back off when CPU exceeds 90% |
| `BACKOFF_DURATION_MS` | 10000 | How long to wait when backing off (10s) |
### How It Works
1. **Main Loop**: Worker continuously tries to fill up to `MAX_CONCURRENT_TASKS`
2. **Resource Monitoring**: Before claiming a new task, worker checks memory and CPU
3. **Backoff**: If resources exceed thresholds, worker pauses and stops claiming new tasks
4. **Concurrent Execution**: Each task runs as its own `Promise`, so tasks interleave on async I/O and do not block each other
5. **Graceful Shutdown**: On SIGTERM/decommission, worker stops claiming but waits for active tasks
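The claiming decision in steps 1, 3, and 5, and the non-blocking launch in step 4, can be sketched as below. The names `LoopState`, `slotsToFill`, and `launch` are hypothetical and this is not the actual `mainLoop()` in `task-worker.ts`; it only illustrates the control flow under the rules listed above.

```typescript
interface LoopState {
  activeTasks: number;
  maxConcurrent: number; // MAX_CONCURRENT_TASKS
  backingOff: boolean;
  shuttingDown: boolean;
}

// How many new tasks the worker may claim on this iteration.
function slotsToFill(s: LoopState): number {
  if (s.shuttingDown || s.backingOff) return 0; // keep running tasks, claim nothing new
  return Math.max(0, s.maxConcurrent - s.activeTasks);
}

// Starts a task without awaiting it. The promise removes itself from the
// active set once it settles, which frees a slot for the next claim.
function launch(active: Set<Promise<void>>, run: () => Promise<void>): void {
  const p: Promise<void> = run()
    .catch(() => undefined) // failures are assumed to be recorded by the task itself
    .then(() => {
      active.delete(p);
    });
  active.add(p);
}
```

On shutdown, a worker following this pattern would stop calling `launch` and wait on `Promise.all([...active])` before exiting.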
### Resource Monitoring
```typescript
// ResourceStats interface
interface ResourceStats {
memoryPercent: number; // Current heap usage as decimal (0.0-1.0)
memoryMb: number; // Current heap used in MB
memoryTotalMb: number; // Total heap available in MB
cpuPercent: number; // CPU usage as percentage (0-100)
isBackingOff: boolean; // True if worker is in backoff state
backoffReason: string; // Why the worker is backing off
}
```
### Heartbeat Data
Workers report the following in their heartbeat:
```json
{
"worker_id": "worker-abc123",
"current_task_id": 456,
"current_task_ids": [456, 457, 458],
"active_task_count": 3,
"max_concurrent_tasks": 3,
"status": "active",
"resources": {
"memory_mb": 256,
"memory_total_mb": 512,
"memory_rss_mb": 320,
"memory_percent": 50,
"cpu_user_ms": 12500,
"cpu_system_ms": 3200,
"cpu_percent": 45,
"is_backing_off": false,
"backoff_reason": null
}
}
```
### Backoff Behavior
When resources exceed thresholds:
1. Worker logs the backoff reason:
```
[TaskWorker] MyWorker backing off: Memory at 87.3% (threshold: 85%)
```
2. Worker stops claiming new tasks but continues existing tasks
3. After `BACKOFF_DURATION_MS`, worker rechecks resources
4. When resources return to normal:
```
[TaskWorker] MyWorker resuming normal operation
```
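The threshold check behind this behavior can be sketched as a pure function. This is a hedged illustration, not the actual `shouldBackOff()` method: `backoffReason` is a hypothetical name, memory is taken as a fraction (0.0-1.0) and CPU as a percentage (0-100) per the `ResourceStats` comments, and the CPU message format is an assumption modeled on the memory log line.

```typescript
interface Thresholds {
  memory: number; // MEMORY_BACKOFF_THRESHOLD, e.g. 0.85
  cpu: number; // CPU_BACKOFF_THRESHOLD, e.g. 0.90
}

interface Usage {
  memoryPercent: number; // fraction, 0.0-1.0
  cpuPercent: number; // percentage, 0-100
}

// Returns a human-readable reason when a threshold is exceeded, else null.
function backoffReason(u: Usage, t: Thresholds): string | null {
  if (u.memoryPercent > t.memory) {
    return `Memory at ${(u.memoryPercent * 100).toFixed(1)}% (threshold: ${Math.round(t.memory * 100)}%)`;
  }
  if (u.cpuPercent > t.cpu * 100) {
    return `CPU at ${u.cpuPercent.toFixed(1)}% (threshold: ${Math.round(t.cpu * 100)}%)`;
  }
  return null;
}
```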
### UI Display
The Workers Dashboard shows:
- **Tasks Column**: `2/3 tasks` (active/max concurrent)
- **Resources Column**: Memory % and CPU % with color coding
- Green: < 50%
- Yellow: 50-74%
- Amber: 75-89%
- Red: 90%+
- **Backing Off**: Orange warning badge when worker is in backoff state
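The color bands above map to a simple threshold function. This sketch uses a hypothetical `resourceColor` helper; it assumes fractional values such as 74.5 fall into the lower band, which the dashboard component may handle differently.

```typescript
type ResourceColor = "green" | "yellow" | "amber" | "red";

// Maps a usage percentage (0-100) to the color bands listed above.
function resourceColor(percent: number): ResourceColor {
  if (percent >= 90) return "red";
  if (percent >= 75) return "amber";
  if (percent >= 50) return "yellow";
  return "green";
}
```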
### Task Count Badge Details
```
┌─────────────────────────────────────────────┐
│ Worker: "MyWorker" │
│ Tasks: 2/3 tasks #456, #457 │
│ Resources: 🧠 65% 💻 45% │
│ Status: ● Active │
└─────────────────────────────────────────────┘
```
### Best Practices
1. **Start Conservative**: Use `MAX_CONCURRENT_TASKS=3` initially
2. **Monitor Resources**: Watch for frequent backoffs in logs
3. **Tune Per Workload**: I/O-bound tasks benefit from higher concurrency
4. **Scale Horizontally**: Add more pods rather than cranking concurrency too high
### Code References
| File | Purpose |
|------|---------|
| `src/tasks/task-worker.ts:68-71` | Concurrency environment variables |
| `src/tasks/task-worker.ts:104-111` | ResourceStats interface |
| `src/tasks/task-worker.ts:149-179` | getResourceStats() method |
| `src/tasks/task-worker.ts:184-196` | shouldBackOff() method |
| `src/tasks/task-worker.ts:462-516` | mainLoop() with concurrent claiming |
| `src/routes/worker-registry.ts:148-195` | Heartbeat endpoint handling |
| `cannaiq/src/pages/WorkersDashboard.tsx:233-305` | UI components for resources |
## Monitoring
### Logs


@@ -0,0 +1,27 @@
-- Migration: Worker Commands Table
-- Purpose: Store commands for workers (decommission, etc.)
-- Workers poll this table after each task to check for commands
CREATE TABLE IF NOT EXISTS worker_commands (
id SERIAL PRIMARY KEY,
worker_id TEXT NOT NULL,
command TEXT NOT NULL, -- 'decommission', 'pause', 'resume'
reason TEXT,
issued_by TEXT,
issued_at TIMESTAMPTZ DEFAULT NOW(),
acknowledged_at TIMESTAMPTZ,
executed_at TIMESTAMPTZ,
status TEXT DEFAULT 'pending' -- 'pending', 'acknowledged', 'executed', 'cancelled'
);
-- Index for worker lookups
CREATE INDEX IF NOT EXISTS idx_worker_commands_worker_id ON worker_commands(worker_id);
CREATE INDEX IF NOT EXISTS idx_worker_commands_pending ON worker_commands(worker_id, status) WHERE status = 'pending';
-- Add decommission_requested column to worker_registry for quick checks
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_requested BOOLEAN DEFAULT FALSE;
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_reason TEXT;
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_requested_at TIMESTAMPTZ;
-- Comment
COMMENT ON TABLE worker_commands IS 'Commands issued to workers (decommission after task, pause, etc.)';


@@ -0,0 +1,8 @@
-- Migration 078: Add consecutive_403_count to proxies table
-- Per workflow-12102025.md: Track consecutive 403s per proxy
-- After 3 consecutive 403s with different fingerprints → disable proxy
ALTER TABLE proxies ADD COLUMN IF NOT EXISTS consecutive_403_count INTEGER DEFAULT 0;
-- Add comment explaining the column
COMMENT ON COLUMN proxies.consecutive_403_count IS 'Tracks consecutive 403 blocks. Reset to 0 on success. Proxy disabled at 3.';


@@ -0,0 +1,49 @@
-- Migration 079: Task Schedules for Database-Driven Scheduler
-- Per TASK_WORKFLOW_2024-12-10.md: Replaces node-cron with DB-driven scheduling
--
-- 2024-12-10: Created for reliable, multi-replica-safe task scheduling
-- task_schedules: Stores schedule definitions and state
CREATE TABLE IF NOT EXISTS task_schedules (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL UNIQUE,
role VARCHAR(50) NOT NULL, -- TaskRole: product_refresh, store_discovery, etc.
description TEXT,
-- Schedule configuration
enabled BOOLEAN DEFAULT TRUE,
interval_hours INTEGER NOT NULL DEFAULT 4,
priority INTEGER DEFAULT 0,
-- Optional scope filters
state_code VARCHAR(2), -- NULL = all states
platform VARCHAR(50), -- NULL = all platforms
-- Execution state (updated by scheduler)
last_run_at TIMESTAMPTZ,
next_run_at TIMESTAMPTZ,
last_task_count INTEGER DEFAULT 0,
last_error TEXT,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Indexes for scheduler queries
CREATE INDEX IF NOT EXISTS idx_task_schedules_enabled ON task_schedules(enabled) WHERE enabled = TRUE;
CREATE INDEX IF NOT EXISTS idx_task_schedules_next_run ON task_schedules(next_run_at) WHERE enabled = TRUE;
-- Insert default schedules
INSERT INTO task_schedules (name, role, interval_hours, priority, description, next_run_at)
VALUES
('product_refresh_all', 'product_refresh', 4, 0, 'Generate product refresh tasks for all crawl-enabled stores every 4 hours', NOW()),
('store_discovery_dutchie', 'store_discovery', 24, 5, 'Discover new Dutchie stores daily', NOW()),
('analytics_refresh', 'analytics_refresh', 6, 0, 'Refresh analytics materialized views every 6 hours', NOW())
ON CONFLICT (name) DO NOTHING;
-- Comment for documentation
COMMENT ON TABLE task_schedules IS 'Database-driven task scheduler configuration. Per TASK_WORKFLOW_2024-12-10.md:
- Schedules persist in DB (survive restarts)
- Uses SELECT FOR UPDATE SKIP LOCKED for multi-replica safety
- Scheduler polls every 60s and executes due schedules
- Creates tasks in worker_tasks for task-worker.ts to process';


@@ -0,0 +1,58 @@
-- Migration 080: Raw Crawl Payloads Metadata Table
-- Per TASK_WORKFLOW_2024-12-10.md: Store full GraphQL payloads for historical analysis
--
-- Design Pattern: Metadata/Payload Separation
-- - Metadata (this table): Small, indexed, queryable
-- - Payload (filesystem): Gzipped JSON at storage_path
--
-- Benefits:
-- - Compare any two crawls to see what changed
-- - Replay/re-normalize historical data if logic changes
-- - Debug issues by seeing exactly what the API returned
-- - DB stays small, backups stay fast
--
-- Storage location: /storage/payloads/{year}/{month}/{day}/store_{id}_{timestamp}.json.gz
-- Compression: ~90% reduction (1.5MB -> 150KB per crawl)
CREATE TABLE IF NOT EXISTS raw_crawl_payloads (
id SERIAL PRIMARY KEY,
-- Links to crawl tracking
crawl_run_id INTEGER REFERENCES crawl_runs(id) ON DELETE SET NULL,
dispensary_id INTEGER NOT NULL REFERENCES dispensaries(id) ON DELETE CASCADE,
-- File location (gzipped JSON)
storage_path TEXT NOT NULL,
-- Metadata for quick queries without loading file
product_count INTEGER NOT NULL DEFAULT 0,
size_bytes INTEGER, -- Compressed size
size_bytes_raw INTEGER, -- Uncompressed size
-- Timestamps
fetched_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- Optional: checksum for integrity verification
checksum_sha256 VARCHAR(64)
);
-- Indexes for common queries
CREATE INDEX IF NOT EXISTS idx_raw_crawl_payloads_dispensary
ON raw_crawl_payloads(dispensary_id);
CREATE INDEX IF NOT EXISTS idx_raw_crawl_payloads_dispensary_fetched
ON raw_crawl_payloads(dispensary_id, fetched_at DESC);
CREATE INDEX IF NOT EXISTS idx_raw_crawl_payloads_fetched
ON raw_crawl_payloads(fetched_at DESC);
CREATE INDEX IF NOT EXISTS idx_raw_crawl_payloads_crawl_run
ON raw_crawl_payloads(crawl_run_id)
WHERE crawl_run_id IS NOT NULL;
-- Comments
COMMENT ON TABLE raw_crawl_payloads IS 'Metadata for raw GraphQL payloads stored on filesystem. Per TASK_WORKFLOW_2024-12-10.md: Full payloads enable historical diffs and replay.';
COMMENT ON COLUMN raw_crawl_payloads.storage_path IS 'Path to gzipped JSON file, e.g. /storage/payloads/2024/12/10/store_123_1702234567.json.gz';
COMMENT ON COLUMN raw_crawl_payloads.size_bytes IS 'Compressed file size in bytes';
COMMENT ON COLUMN raw_crawl_payloads.size_bytes_raw IS 'Uncompressed payload size in bytes';


@@ -0,0 +1,37 @@
-- Migration 081: Payload Fetch Columns
-- Per TASK_WORKFLOW_2024-12-10.md: Separates API fetch from data processing
--
-- New architecture:
-- - payload_fetch: Hits Dutchie API, saves raw payload to disk
-- - product_refresh: Reads local payload, normalizes, upserts to DB
--
-- This migration adds:
-- 1. payload column to worker_tasks (for task chaining data)
-- 2. processed_at column to raw_crawl_payloads (track when payload was processed)
-- 3. last_fetch_at column to dispensaries (track when last payload was fetched)
-- Add payload column to worker_tasks for task chaining
-- Used by payload_fetch to pass payload_id to product_refresh
ALTER TABLE worker_tasks
ADD COLUMN IF NOT EXISTS payload JSONB DEFAULT NULL;
COMMENT ON COLUMN worker_tasks.payload IS 'Per TASK_WORKFLOW_2024-12-10.md: Task chaining data (e.g., payload_id from payload_fetch to product_refresh)';
-- Add processed_at to raw_crawl_payloads
-- Tracks when the payload was processed by product_refresh
ALTER TABLE raw_crawl_payloads
ADD COLUMN IF NOT EXISTS processed_at TIMESTAMPTZ DEFAULT NULL;
COMMENT ON COLUMN raw_crawl_payloads.processed_at IS 'When this payload was processed by product_refresh handler';
-- Index for finding unprocessed payloads
CREATE INDEX IF NOT EXISTS idx_raw_crawl_payloads_unprocessed
ON raw_crawl_payloads(dispensary_id, fetched_at DESC)
WHERE processed_at IS NULL;
-- Add last_fetch_at to dispensaries
-- Tracks when the last payload was fetched (separate from last_crawl_at which is when processing completed)
ALTER TABLE dispensaries
ADD COLUMN IF NOT EXISTS last_fetch_at TIMESTAMPTZ DEFAULT NULL;
COMMENT ON COLUMN dispensaries.last_fetch_at IS 'Per TASK_WORKFLOW_2024-12-10.md: When last payload was fetched from API (separate from last_crawl_at which is when processing completed)';


@@ -0,0 +1,27 @@
-- Migration: 082_proxy_notification_trigger
-- Date: 2024-12-11
-- Description: Add PostgreSQL NOTIFY trigger to alert workers when proxies are added
-- Create function to notify workers when active proxy is added/activated
CREATE OR REPLACE FUNCTION notify_proxy_added()
RETURNS TRIGGER AS $$
BEGIN
-- Only notify if proxy is active
IF NEW.active = true THEN
PERFORM pg_notify('proxy_added', NEW.id::text);
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Drop existing trigger if any
DROP TRIGGER IF EXISTS proxy_added_trigger ON proxies;
-- Create trigger on insert and update of active column
CREATE TRIGGER proxy_added_trigger
AFTER INSERT OR UPDATE OF active ON proxies
FOR EACH ROW
EXECUTE FUNCTION notify_proxy_added();
COMMENT ON FUNCTION notify_proxy_added() IS
'Sends PostgreSQL NOTIFY to proxy_added channel when an active proxy is added or activated. Workers LISTEN on this channel to wake up immediately.';


@@ -1,6 +1,6 @@
{
"name": "dutchie-menus-backend",
"version": "1.5.1", "version": "1.6.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
@@ -46,6 +46,97 @@
"resolved": "https://registry.npmjs.org/@ioredis/commands/-/commands-1.4.0.tgz",
"integrity": "sha512-aFT2yemJJo+TZCmieA7qnYGQooOS7QfNmYrzGtsYd3g9j5iDP8AimYYAesf79ohjbLG12XxC4nG5DyEnC88AsQ=="
},
"node_modules/@jsep-plugin/assignment": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/@jsep-plugin/assignment/-/assignment-1.3.0.tgz",
"integrity": "sha512-VVgV+CXrhbMI3aSusQyclHkenWSAm95WaiKrMxRFam3JSUiIaQjoMIw2sEs/OX4XifnqeQUN4DYbJjlA8EfktQ==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@jsep-plugin/regex": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@jsep-plugin/regex/-/regex-1.0.4.tgz",
"integrity": "sha512-q7qL4Mgjs1vByCaTnDFcBnV9HS7GVPJX5vyVoCgZHNSC9rjwIlmbXG5sUuorR5ndfHAIlJ8pVStxvjXHbNvtUg==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@kubernetes/client-node": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/@kubernetes/client-node/-/client-node-1.4.0.tgz",
"integrity": "sha512-Zge3YvF7DJi264dU1b3wb/GmzR99JhUpqTvp+VGHfwZT+g7EOOYNScDJNZwXy9cszyIGPIs0VHr+kk8e95qqrA==",
"dependencies": {
"@types/js-yaml": "^4.0.1",
"@types/node": "^24.0.0",
"@types/node-fetch": "^2.6.13",
"@types/stream-buffers": "^3.0.3",
"form-data": "^4.0.0",
"hpagent": "^1.2.0",
"isomorphic-ws": "^5.0.0",
"js-yaml": "^4.1.0",
"jsonpath-plus": "^10.3.0",
"node-fetch": "^2.7.0",
"openid-client": "^6.1.3",
"rfc4648": "^1.3.0",
"socks-proxy-agent": "^8.0.4",
"stream-buffers": "^3.0.2",
"tar-fs": "^3.0.9",
"ws": "^8.18.2"
}
},
"node_modules/@kubernetes/client-node/node_modules/@types/node": {
"version": "24.10.3",
"resolved": "https://registry.npmjs.org/@types/node/-/node-24.10.3.tgz",
"integrity": "sha512-gqkrWUsS8hcm0r44yn7/xZeV1ERva/nLgrLxFRUGb7aoNMIJfZJ3AC261zDQuOAKC7MiXai1WCpYc48jAHoShQ==",
"dependencies": {
"undici-types": "~7.16.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/tar-fs": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/tar-fs/-/tar-fs-3.1.1.tgz",
"integrity": "sha512-LZA0oaPOc2fVo82Txf3gw+AkEd38szODlptMYejQUhndHMLQ9M059uXR+AfS7DNo0NpINvSqDsvyaCrBVkptWg==",
"dependencies": {
"pump": "^3.0.0",
"tar-stream": "^3.1.5"
},
"optionalDependencies": {
"bare-fs": "^4.0.1",
"bare-path": "^3.0.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/undici-types": {
"version": "7.16.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.16.0.tgz",
"integrity": "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="
},
"node_modules/@kubernetes/client-node/node_modules/ws": {
"version": "8.18.3",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.18.3.tgz",
"integrity": "sha512-PEIGCY5tSlUt50cqyMXfCzX+oOPqN0vuGqWzbcJ2xvnkzkq46oOpz7dQaTDBdfICb4N14+GARUDw2XV2N4tvzg==",
"engines": {
"node": ">=10.0.0"
},
"peerDependencies": {
"bufferutil": "^4.0.1",
"utf-8-validate": ">=5.0.2"
},
"peerDependenciesMeta": {
"bufferutil": {
"optional": true
},
"utf-8-validate": {
"optional": true
}
}
},
"node_modules/@mapbox/node-pre-gyp": {
"version": "1.0.11",
"resolved": "https://registry.npmjs.org/@mapbox/node-pre-gyp/-/node-pre-gyp-1.0.11.tgz",
@@ -251,6 +342,11 @@
"integrity": "sha512-r8Tayk8HJnX0FztbZN7oVqGccWgw98T/0neJphO91KkmOzug1KkofZURD4UaD5uH8AqcFLfdPErnBod0u71/qg==",
"dev": true
},
"node_modules/@types/js-yaml": {
"version": "4.0.9",
"resolved": "https://registry.npmjs.org/@types/js-yaml/-/js-yaml-4.0.9.tgz",
"integrity": "sha512-k4MGaQl5TGo/iipqb2UDG2UwjXziSWkh0uysQelTlJpX1qGlpUZYm8PnO4DxG1qBomtJUdYJ6qR6xdIah10JLg=="
},
"node_modules/@types/jsonwebtoken": {
"version": "9.0.10",
"resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.10.tgz",
@@ -276,7 +372,6 @@
"version": "20.19.25",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz",
"integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==",
"devOptional": true,
"dependencies": {
"undici-types": "~6.21.0"
}
@@ -287,6 +382,15 @@
"integrity": "sha512-0ikrnug3/IyneSHqCBeslAhlK2aBfYek1fGo4bP4QnZPmiqSGRK+Oy7ZMisLWkesffJvQ1cqAcBnJC+8+nxIAg==",
"dev": true
},
"node_modules/@types/node-fetch": {
"version": "2.6.13",
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.13.tgz",
"integrity": "sha512-QGpRVpzSaUs30JBSGPjOg4Uveu384erbHBoT1zeONvyCfwQxIkUshLAOqN/k9EjGviPRmWTTe6aH2qySWKTVSw==",
"dependencies": {
"@types/node": "*",
"form-data": "^4.0.4"
}
},
"node_modules/@types/pg": {
"version": "8.15.6",
"resolved": "https://registry.npmjs.org/@types/pg/-/pg-8.15.6.tgz",
@@ -340,6 +444,14 @@
"@types/node": "*"
}
},
"node_modules/@types/stream-buffers": {
"version": "3.0.8",
"resolved": "https://registry.npmjs.org/@types/stream-buffers/-/stream-buffers-3.0.8.tgz",
"integrity": "sha512-J+7VaHKNvlNPJPEJXX/fKa9DZtR/xPMwuIbe+yNOwp1YB+ApUOBv2aUpEoBJEi8nJgbgs1x8e73ttg0r1rSUdw==",
"dependencies": {
"@types/node": "*"
}
},
"node_modules/@types/uuid": {
"version": "9.0.8",
"resolved": "https://registry.npmjs.org/@types/uuid/-/uuid-9.0.8.tgz",
@@ -520,6 +632,78 @@
} }
} }
}, },
"node_modules/bare-fs": {
"version": "4.5.2",
"resolved": "https://registry.npmjs.org/bare-fs/-/bare-fs-4.5.2.tgz",
"integrity": "sha512-veTnRzkb6aPHOvSKIOy60KzURfBdUflr5VReI+NSaPL6xf+XLdONQgZgpYvUuZLVQ8dCqxpBAudaOM1+KpAUxw==",
"optional": true,
"dependencies": {
"bare-events": "^2.5.4",
"bare-path": "^3.0.0",
"bare-stream": "^2.6.4",
"bare-url": "^2.2.2",
"fast-fifo": "^1.3.2"
},
"engines": {
"bare": ">=1.16.0"
},
"peerDependencies": {
"bare-buffer": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
}
}
},
"node_modules/bare-os": {
"version": "3.6.2",
"resolved": "https://registry.npmjs.org/bare-os/-/bare-os-3.6.2.tgz",
"integrity": "sha512-T+V1+1srU2qYNBmJCXZkUY5vQ0B4FSlL3QDROnKQYOqeiQR8UbjNHlPa+TIbM4cuidiN9GaTaOZgSEgsvPbh5A==",
"optional": true,
"engines": {
"bare": ">=1.14.0"
}
},
"node_modules/bare-path": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/bare-path/-/bare-path-3.0.0.tgz",
"integrity": "sha512-tyfW2cQcB5NN8Saijrhqn0Zh7AnFNsnczRcuWODH0eYAXBsJ5gVxAUuNr7tsHSC6IZ77cA0SitzT+s47kot8Mw==",
"optional": true,
"dependencies": {
"bare-os": "^3.0.1"
}
},
"node_modules/bare-stream": {
"version": "2.7.0",
"resolved": "https://registry.npmjs.org/bare-stream/-/bare-stream-2.7.0.tgz",
"integrity": "sha512-oyXQNicV1y8nc2aKffH+BUHFRXmx6VrPzlnaEvMhram0nPBrKcEdcyBg5r08D0i8VxngHFAiVyn1QKXpSG0B8A==",
"optional": true,
"dependencies": {
"streamx": "^2.21.0"
},
"peerDependencies": {
"bare-buffer": "*",
"bare-events": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
},
"bare-events": {
"optional": true
}
}
},
"node_modules/bare-url": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/bare-url/-/bare-url-2.3.2.tgz",
"integrity": "sha512-ZMq4gd9ngV5aTMa5p9+UfY0b3skwhHELaDkhEHetMdX0LRkW9kzaym4oo/Eh+Ghm0CCDuMTsRIGM/ytUc1ZYmw==",
"optional": true,
"dependencies": {
"bare-path": "^3.0.0"
}
},
"node_modules/base64-js": { "node_modules/base64-js": {
"version": "1.5.1", "version": "1.5.1",
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz", "resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
@@ -2019,6 +2203,14 @@
"node": ">=16.0.0" "node": ">=16.0.0"
} }
}, },
"node_modules/hpagent": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/hpagent/-/hpagent-1.2.0.tgz",
"integrity": "sha512-A91dYTeIB6NoXG+PxTQpCCDDnfHsW9kc06Lvpu1TEe9gnd6ZFeiBoRO9JvzEv6xK7EX97/dUE8g/vBMTqTS3CA==",
"engines": {
"node": ">=14"
}
},
"node_modules/htmlparser2": { "node_modules/htmlparser2": {
"version": "10.0.0", "version": "10.0.0",
"resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz", "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz",
@@ -2382,6 +2574,22 @@
"node": ">=0.10.0" "node": ">=0.10.0"
} }
}, },
"node_modules/isomorphic-ws": {
"version": "5.0.0",
"resolved": "https://registry.npmjs.org/isomorphic-ws/-/isomorphic-ws-5.0.0.tgz",
"integrity": "sha512-muId7Zzn9ywDsyXgTIafTry2sV3nySZeUDe6YedVd1Hvuuep5AsIlqK+XefWpYTyJG5e503F2xIuT2lcU6rCSw==",
"peerDependencies": {
"ws": "*"
}
},
"node_modules/jose": {
"version": "6.1.3",
"resolved": "https://registry.npmjs.org/jose/-/jose-6.1.3.tgz",
"integrity": "sha512-0TpaTfihd4QMNwrz/ob2Bp7X04yuxJkjRGi4aKmOqwhov54i6u79oCv7T+C7lo70MKH6BesI3vscD1yb/yzKXQ==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/js-tokens": { "node_modules/js-tokens": {
"version": "4.0.0", "version": "4.0.0",
"resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz", "resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz",
@@ -2398,6 +2606,14 @@
"js-yaml": "bin/js-yaml.js" "js-yaml": "bin/js-yaml.js"
} }
}, },
"node_modules/jsep": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/jsep/-/jsep-1.4.0.tgz",
"integrity": "sha512-B7qPcEVE3NVkmSJbaYxvv4cHkVW7DQsZz13pUMrfS8z8Q/BuShN+gcTXrUlPiGqM2/t/EEaI030bpxMqY8gMlw==",
"engines": {
"node": ">= 10.16.0"
}
},
"node_modules/json-parse-even-better-errors": { "node_modules/json-parse-even-better-errors": {
"version": "2.3.1", "version": "2.3.1",
"resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz", "resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz",
@@ -2419,6 +2635,23 @@
"graceful-fs": "^4.1.6" "graceful-fs": "^4.1.6"
} }
}, },
"node_modules/jsonpath-plus": {
"version": "10.3.0",
"resolved": "https://registry.npmjs.org/jsonpath-plus/-/jsonpath-plus-10.3.0.tgz",
"integrity": "sha512-8TNmfeTCk2Le33A3vRRwtuworG/L5RrgMvdjhKZxvyShO+mBu2fP50OWUjRLNtvw344DdDarFh9buFAZs5ujeA==",
"dependencies": {
"@jsep-plugin/assignment": "^1.3.0",
"@jsep-plugin/regex": "^1.0.4",
"jsep": "^1.4.0"
},
"bin": {
"jsonpath": "bin/jsonpath-cli.js",
"jsonpath-plus": "bin/jsonpath-cli.js"
},
"engines": {
"node": ">=18.0.0"
}
},
"node_modules/jsonwebtoken": { "node_modules/jsonwebtoken": {
"version": "9.0.2", "version": "9.0.2",
"resolved": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.2.tgz", "resolved": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.2.tgz",
@@ -2493,6 +2726,11 @@
"resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz", "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
"integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==" "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg=="
}, },
"node_modules/lodash.clonedeep": {
"version": "4.5.0",
"resolved": "https://registry.npmjs.org/lodash.clonedeep/-/lodash.clonedeep-4.5.0.tgz",
"integrity": "sha512-H5ZhCF25riFd9uB5UCkVKo61m3S/xZk1x4wA6yp/L3RFP6Z/eHH1ymQcGLo7J3GMPfm0V/7m1tryHuGVxpqEBQ=="
},
"node_modules/lodash.defaults": { "node_modules/lodash.defaults": {
"version": "4.2.0", "version": "4.2.0",
"resolved": "https://registry.npmjs.org/lodash.defaults/-/lodash.defaults-4.2.0.tgz", "resolved": "https://registry.npmjs.org/lodash.defaults/-/lodash.defaults-4.2.0.tgz",
@@ -2942,6 +3180,14 @@
"url": "https://github.com/fb55/nth-check?sponsor=1" "url": "https://github.com/fb55/nth-check?sponsor=1"
} }
}, },
"node_modules/oauth4webapi": {
"version": "3.8.3",
"resolved": "https://registry.npmjs.org/oauth4webapi/-/oauth4webapi-3.8.3.tgz",
"integrity": "sha512-pQ5BsX3QRTgnt5HxgHwgunIRaDXBdkT23tf8dfzmtTIL2LTpdmxgbpbBm0VgFWAIDlezQvQCTgnVIUmHupXHxw==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/object-assign": { "node_modules/object-assign": {
"version": "4.1.1", "version": "4.1.1",
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz", "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
@@ -2980,6 +3226,18 @@
"wrappy": "1" "wrappy": "1"
} }
}, },
"node_modules/openid-client": {
"version": "6.8.1",
"resolved": "https://registry.npmjs.org/openid-client/-/openid-client-6.8.1.tgz",
"integrity": "sha512-VoYT6enBo6Vj2j3Q5Ec0AezS+9YGzQo1f5Xc42lreMGlfP4ljiXPKVDvCADh+XHCV/bqPu/wWSiCVXbJKvrODw==",
"dependencies": {
"jose": "^6.1.0",
"oauth4webapi": "^3.8.2"
},
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/pac-proxy-agent": { "node_modules/pac-proxy-agent": {
"version": "7.2.0", "version": "7.2.0",
"resolved": "https://registry.npmjs.org/pac-proxy-agent/-/pac-proxy-agent-7.2.0.tgz", "resolved": "https://registry.npmjs.org/pac-proxy-agent/-/pac-proxy-agent-7.2.0.tgz",
@@ -3883,6 +4141,11 @@
"url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1" "url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1"
} }
}, },
"node_modules/rfc4648": {
"version": "1.5.4",
"resolved": "https://registry.npmjs.org/rfc4648/-/rfc4648-1.5.4.tgz",
"integrity": "sha512-rRg/6Lb+IGfJqO05HZkN50UtY7K/JhxJag1kP23+zyMfrvoB0B7RWv06MbOzoc79RgCdNTiUaNsTT1AJZ7Z+cg=="
},
"node_modules/rimraf": { "node_modules/rimraf": {
"version": "3.0.2", "version": "3.0.2",
"resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz", "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz",
@@ -4313,6 +4576,14 @@
"node": ">= 0.8" "node": ">= 0.8"
} }
}, },
"node_modules/stream-buffers": {
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/stream-buffers/-/stream-buffers-3.0.3.tgz",
"integrity": "sha512-pqMqwQCso0PBJt2PQmDO0cFj0lyqmiwOMiMSkVtRokl7e+ZTRYgDHKnuZNbqjiJXgsg4nuqtD/zxuo9KqTp0Yw==",
"engines": {
"node": ">= 0.10.0"
}
},
"node_modules/streamx": { "node_modules/streamx": {
"version": "2.23.0", "version": "2.23.0",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.23.0.tgz", "resolved": "https://registry.npmjs.org/streamx/-/streamx-2.23.0.tgz",
@@ -4532,8 +4803,7 @@
"node_modules/undici-types": { "node_modules/undici-types": {
"version": "6.21.0", "version": "6.21.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz", "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
"integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==", "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ=="
"devOptional": true
}, },
"node_modules/universalify": { "node_modules/universalify": {
"version": "2.0.1", "version": "2.0.1",
@@ -4556,6 +4826,14 @@
"resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz", "resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz",
"integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg==" "integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg=="
}, },
"node_modules/user-agents": {
"version": "1.1.669",
"resolved": "https://registry.npmjs.org/user-agents/-/user-agents-1.1.669.tgz",
"integrity": "sha512-pbIzG+AOqCaIpySKJ4IAm1l0VyE4jMnK4y1thV8lm8PYxI+7X5uWcppOK7zY79TCKKTAnJH3/4gaVIZHsjrmJA==",
"dependencies": {
"lodash.clonedeep": "^4.5.0"
}
},
"node_modules/util": { "node_modules/util": {
"version": "0.12.5", "version": "0.12.5",
"resolved": "https://registry.npmjs.org/util/-/util-0.12.5.tgz", "resolved": "https://registry.npmjs.org/util/-/util-0.12.5.tgz",


@@ -1,13 +1,14 @@
{ {
"name": "dutchie-menus-backend", "name": "dutchie-menus-backend",
"version": "1.5.1", "version": "1.6.0",
"lockfileVersion": 3, "lockfileVersion": 3,
"requires": true, "requires": true,
"packages": { "packages": {
"": { "": {
"name": "dutchie-menus-backend", "name": "dutchie-menus-backend",
"version": "1.5.1", "version": "1.6.0",
"dependencies": { "dependencies": {
"@kubernetes/client-node": "^1.4.0",
"@types/bcryptjs": "^3.0.0", "@types/bcryptjs": "^3.0.0",
"axios": "^1.6.2", "axios": "^1.6.2",
"bcrypt": "^5.1.1", "bcrypt": "^5.1.1",
@@ -34,6 +35,7 @@
"puppeteer-extra-plugin-stealth": "^2.11.2", "puppeteer-extra-plugin-stealth": "^2.11.2",
"sharp": "^0.32.0", "sharp": "^0.32.0",
"socks-proxy-agent": "^8.0.2", "socks-proxy-agent": "^8.0.2",
"user-agents": "^1.1.669",
"uuid": "^9.0.1", "uuid": "^9.0.1",
"zod": "^3.22.4" "zod": "^3.22.4"
}, },
@@ -492,6 +494,97 @@
"resolved": "https://registry.npmjs.org/@ioredis/commands/-/commands-1.4.0.tgz", "resolved": "https://registry.npmjs.org/@ioredis/commands/-/commands-1.4.0.tgz",
"integrity": "sha512-aFT2yemJJo+TZCmieA7qnYGQooOS7QfNmYrzGtsYd3g9j5iDP8AimYYAesf79ohjbLG12XxC4nG5DyEnC88AsQ==" "integrity": "sha512-aFT2yemJJo+TZCmieA7qnYGQooOS7QfNmYrzGtsYd3g9j5iDP8AimYYAesf79ohjbLG12XxC4nG5DyEnC88AsQ=="
}, },
"node_modules/@jsep-plugin/assignment": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/@jsep-plugin/assignment/-/assignment-1.3.0.tgz",
"integrity": "sha512-VVgV+CXrhbMI3aSusQyclHkenWSAm95WaiKrMxRFam3JSUiIaQjoMIw2sEs/OX4XifnqeQUN4DYbJjlA8EfktQ==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@jsep-plugin/regex": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@jsep-plugin/regex/-/regex-1.0.4.tgz",
"integrity": "sha512-q7qL4Mgjs1vByCaTnDFcBnV9HS7GVPJX5vyVoCgZHNSC9rjwIlmbXG5sUuorR5ndfHAIlJ8pVStxvjXHbNvtUg==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@kubernetes/client-node": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/@kubernetes/client-node/-/client-node-1.4.0.tgz",
"integrity": "sha512-Zge3YvF7DJi264dU1b3wb/GmzR99JhUpqTvp+VGHfwZT+g7EOOYNScDJNZwXy9cszyIGPIs0VHr+kk8e95qqrA==",
"dependencies": {
"@types/js-yaml": "^4.0.1",
"@types/node": "^24.0.0",
"@types/node-fetch": "^2.6.13",
"@types/stream-buffers": "^3.0.3",
"form-data": "^4.0.0",
"hpagent": "^1.2.0",
"isomorphic-ws": "^5.0.0",
"js-yaml": "^4.1.0",
"jsonpath-plus": "^10.3.0",
"node-fetch": "^2.7.0",
"openid-client": "^6.1.3",
"rfc4648": "^1.3.0",
"socks-proxy-agent": "^8.0.4",
"stream-buffers": "^3.0.2",
"tar-fs": "^3.0.9",
"ws": "^8.18.2"
}
},
"node_modules/@kubernetes/client-node/node_modules/@types/node": {
"version": "24.10.3",
"resolved": "https://registry.npmjs.org/@types/node/-/node-24.10.3.tgz",
"integrity": "sha512-gqkrWUsS8hcm0r44yn7/xZeV1ERva/nLgrLxFRUGb7aoNMIJfZJ3AC261zDQuOAKC7MiXai1WCpYc48jAHoShQ==",
"dependencies": {
"undici-types": "~7.16.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/tar-fs": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/tar-fs/-/tar-fs-3.1.1.tgz",
"integrity": "sha512-LZA0oaPOc2fVo82Txf3gw+AkEd38szODlptMYejQUhndHMLQ9M059uXR+AfS7DNo0NpINvSqDsvyaCrBVkptWg==",
"dependencies": {
"pump": "^3.0.0",
"tar-stream": "^3.1.5"
},
"optionalDependencies": {
"bare-fs": "^4.0.1",
"bare-path": "^3.0.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/undici-types": {
"version": "7.16.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.16.0.tgz",
"integrity": "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="
},
"node_modules/@kubernetes/client-node/node_modules/ws": {
"version": "8.18.3",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.18.3.tgz",
"integrity": "sha512-PEIGCY5tSlUt50cqyMXfCzX+oOPqN0vuGqWzbcJ2xvnkzkq46oOpz7dQaTDBdfICb4N14+GARUDw2XV2N4tvzg==",
"engines": {
"node": ">=10.0.0"
},
"peerDependencies": {
"bufferutil": "^4.0.1",
"utf-8-validate": ">=5.0.2"
},
"peerDependenciesMeta": {
"bufferutil": {
"optional": true
},
"utf-8-validate": {
"optional": true
}
}
},
"node_modules/@mapbox/node-pre-gyp": { "node_modules/@mapbox/node-pre-gyp": {
"version": "1.0.11", "version": "1.0.11",
"resolved": "https://registry.npmjs.org/@mapbox/node-pre-gyp/-/node-pre-gyp-1.0.11.tgz", "resolved": "https://registry.npmjs.org/@mapbox/node-pre-gyp/-/node-pre-gyp-1.0.11.tgz",
@@ -757,6 +850,11 @@
"integrity": "sha512-r8Tayk8HJnX0FztbZN7oVqGccWgw98T/0neJphO91KkmOzug1KkofZURD4UaD5uH8AqcFLfdPErnBod0u71/qg==", "integrity": "sha512-r8Tayk8HJnX0FztbZN7oVqGccWgw98T/0neJphO91KkmOzug1KkofZURD4UaD5uH8AqcFLfdPErnBod0u71/qg==",
"dev": true "dev": true
}, },
"node_modules/@types/js-yaml": {
"version": "4.0.9",
"resolved": "https://registry.npmjs.org/@types/js-yaml/-/js-yaml-4.0.9.tgz",
"integrity": "sha512-k4MGaQl5TGo/iipqb2UDG2UwjXziSWkh0uysQelTlJpX1qGlpUZYm8PnO4DxG1qBomtJUdYJ6qR6xdIah10JLg=="
},
"node_modules/@types/jsonwebtoken": { "node_modules/@types/jsonwebtoken": {
"version": "9.0.10", "version": "9.0.10",
"resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.10.tgz", "resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.10.tgz",
@@ -782,7 +880,6 @@
"version": "20.19.25", "version": "20.19.25",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz", "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz",
"integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==", "integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==",
"devOptional": true,
"dependencies": { "dependencies": {
"undici-types": "~6.21.0" "undici-types": "~6.21.0"
} }
@@ -793,6 +890,15 @@
"integrity": "sha512-0ikrnug3/IyneSHqCBeslAhlK2aBfYek1fGo4bP4QnZPmiqSGRK+Oy7ZMisLWkesffJvQ1cqAcBnJC+8+nxIAg==", "integrity": "sha512-0ikrnug3/IyneSHqCBeslAhlK2aBfYek1fGo4bP4QnZPmiqSGRK+Oy7ZMisLWkesffJvQ1cqAcBnJC+8+nxIAg==",
"dev": true "dev": true
}, },
"node_modules/@types/node-fetch": {
"version": "2.6.13",
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.13.tgz",
"integrity": "sha512-QGpRVpzSaUs30JBSGPjOg4Uveu384erbHBoT1zeONvyCfwQxIkUshLAOqN/k9EjGviPRmWTTe6aH2qySWKTVSw==",
"dependencies": {
"@types/node": "*",
"form-data": "^4.0.4"
}
},
"node_modules/@types/pg": { "node_modules/@types/pg": {
"version": "8.15.6", "version": "8.15.6",
"resolved": "https://registry.npmjs.org/@types/pg/-/pg-8.15.6.tgz", "resolved": "https://registry.npmjs.org/@types/pg/-/pg-8.15.6.tgz",
@@ -846,6 +952,14 @@
"@types/node": "*" "@types/node": "*"
} }
}, },
"node_modules/@types/stream-buffers": {
"version": "3.0.8",
"resolved": "https://registry.npmjs.org/@types/stream-buffers/-/stream-buffers-3.0.8.tgz",
"integrity": "sha512-J+7VaHKNvlNPJPEJXX/fKa9DZtR/xPMwuIbe+yNOwp1YB+ApUOBv2aUpEoBJEi8nJgbgs1x8e73ttg0r1rSUdw==",
"dependencies": {
"@types/node": "*"
}
},
"node_modules/@types/uuid": { "node_modules/@types/uuid": {
"version": "9.0.8", "version": "9.0.8",
"resolved": "https://registry.npmjs.org/@types/uuid/-/uuid-9.0.8.tgz", "resolved": "https://registry.npmjs.org/@types/uuid/-/uuid-9.0.8.tgz",
@@ -1026,6 +1140,78 @@
} }
} }
}, },
"node_modules/bare-fs": {
"version": "4.5.2",
"resolved": "https://registry.npmjs.org/bare-fs/-/bare-fs-4.5.2.tgz",
"integrity": "sha512-veTnRzkb6aPHOvSKIOy60KzURfBdUflr5VReI+NSaPL6xf+XLdONQgZgpYvUuZLVQ8dCqxpBAudaOM1+KpAUxw==",
"optional": true,
"dependencies": {
"bare-events": "^2.5.4",
"bare-path": "^3.0.0",
"bare-stream": "^2.6.4",
"bare-url": "^2.2.2",
"fast-fifo": "^1.3.2"
},
"engines": {
"bare": ">=1.16.0"
},
"peerDependencies": {
"bare-buffer": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
}
}
},
"node_modules/bare-os": {
"version": "3.6.2",
"resolved": "https://registry.npmjs.org/bare-os/-/bare-os-3.6.2.tgz",
"integrity": "sha512-T+V1+1srU2qYNBmJCXZkUY5vQ0B4FSlL3QDROnKQYOqeiQR8UbjNHlPa+TIbM4cuidiN9GaTaOZgSEgsvPbh5A==",
"optional": true,
"engines": {
"bare": ">=1.14.0"
}
},
"node_modules/bare-path": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/bare-path/-/bare-path-3.0.0.tgz",
"integrity": "sha512-tyfW2cQcB5NN8Saijrhqn0Zh7AnFNsnczRcuWODH0eYAXBsJ5gVxAUuNr7tsHSC6IZ77cA0SitzT+s47kot8Mw==",
"optional": true,
"dependencies": {
"bare-os": "^3.0.1"
}
},
"node_modules/bare-stream": {
"version": "2.7.0",
"resolved": "https://registry.npmjs.org/bare-stream/-/bare-stream-2.7.0.tgz",
"integrity": "sha512-oyXQNicV1y8nc2aKffH+BUHFRXmx6VrPzlnaEvMhram0nPBrKcEdcyBg5r08D0i8VxngHFAiVyn1QKXpSG0B8A==",
"optional": true,
"dependencies": {
"streamx": "^2.21.0"
},
"peerDependencies": {
"bare-buffer": "*",
"bare-events": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
},
"bare-events": {
"optional": true
}
}
},
"node_modules/bare-url": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/bare-url/-/bare-url-2.3.2.tgz",
"integrity": "sha512-ZMq4gd9ngV5aTMa5p9+UfY0b3skwhHELaDkhEHetMdX0LRkW9kzaym4oo/Eh+Ghm0CCDuMTsRIGM/ytUc1ZYmw==",
"optional": true,
"dependencies": {
"bare-path": "^3.0.0"
}
},
"node_modules/base64-js": { "node_modules/base64-js": {
"version": "1.5.1", "version": "1.5.1",
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz", "resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
@@ -2539,6 +2725,14 @@
"node": ">=16.0.0" "node": ">=16.0.0"
} }
}, },
"node_modules/hpagent": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/hpagent/-/hpagent-1.2.0.tgz",
"integrity": "sha512-A91dYTeIB6NoXG+PxTQpCCDDnfHsW9kc06Lvpu1TEe9gnd6ZFeiBoRO9JvzEv6xK7EX97/dUE8g/vBMTqTS3CA==",
"engines": {
"node": ">=14"
}
},
"node_modules/htmlparser2": { "node_modules/htmlparser2": {
"version": "10.0.0", "version": "10.0.0",
"resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz", "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz",
@@ -2902,6 +3096,22 @@
"node": ">=0.10.0" "node": ">=0.10.0"
} }
}, },
"node_modules/isomorphic-ws": {
"version": "5.0.0",
"resolved": "https://registry.npmjs.org/isomorphic-ws/-/isomorphic-ws-5.0.0.tgz",
"integrity": "sha512-muId7Zzn9ywDsyXgTIafTry2sV3nySZeUDe6YedVd1Hvuuep5AsIlqK+XefWpYTyJG5e503F2xIuT2lcU6rCSw==",
"peerDependencies": {
"ws": "*"
}
},
"node_modules/jose": {
"version": "6.1.3",
"resolved": "https://registry.npmjs.org/jose/-/jose-6.1.3.tgz",
"integrity": "sha512-0TpaTfihd4QMNwrz/ob2Bp7X04yuxJkjRGi4aKmOqwhov54i6u79oCv7T+C7lo70MKH6BesI3vscD1yb/yzKXQ==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/js-tokens": { "node_modules/js-tokens": {
"version": "4.0.0", "version": "4.0.0",
"resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz", "resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz",
@@ -2918,6 +3128,14 @@
"js-yaml": "bin/js-yaml.js" "js-yaml": "bin/js-yaml.js"
} }
}, },
"node_modules/jsep": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/jsep/-/jsep-1.4.0.tgz",
"integrity": "sha512-B7qPcEVE3NVkmSJbaYxvv4cHkVW7DQsZz13pUMrfS8z8Q/BuShN+gcTXrUlPiGqM2/t/EEaI030bpxMqY8gMlw==",
"engines": {
"node": ">= 10.16.0"
}
},
"node_modules/json-parse-even-better-errors": { "node_modules/json-parse-even-better-errors": {
"version": "2.3.1", "version": "2.3.1",
"resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz", "resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz",
@@ -2939,6 +3157,23 @@
"graceful-fs": "^4.1.6" "graceful-fs": "^4.1.6"
} }
}, },
"node_modules/jsonpath-plus": {
"version": "10.3.0",
"resolved": "https://registry.npmjs.org/jsonpath-plus/-/jsonpath-plus-10.3.0.tgz",
"integrity": "sha512-8TNmfeTCk2Le33A3vRRwtuworG/L5RrgMvdjhKZxvyShO+mBu2fP50OWUjRLNtvw344DdDarFh9buFAZs5ujeA==",
"dependencies": {
"@jsep-plugin/assignment": "^1.3.0",
"@jsep-plugin/regex": "^1.0.4",
"jsep": "^1.4.0"
},
"bin": {
"jsonpath": "bin/jsonpath-cli.js",
"jsonpath-plus": "bin/jsonpath-cli.js"
},
"engines": {
"node": ">=18.0.0"
}
},
"node_modules/jsonwebtoken": { "node_modules/jsonwebtoken": {
"version": "9.0.2", "version": "9.0.2",
"resolved": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.2.tgz", "resolved": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.2.tgz",
@@ -3013,6 +3248,11 @@
"resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz", "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
"integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==" "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg=="
}, },
"node_modules/lodash.clonedeep": {
"version": "4.5.0",
"resolved": "https://registry.npmjs.org/lodash.clonedeep/-/lodash.clonedeep-4.5.0.tgz",
"integrity": "sha512-H5ZhCF25riFd9uB5UCkVKo61m3S/xZk1x4wA6yp/L3RFP6Z/eHH1ymQcGLo7J3GMPfm0V/7m1tryHuGVxpqEBQ=="
},
"node_modules/lodash.defaults": { "node_modules/lodash.defaults": {
"version": "4.2.0", "version": "4.2.0",
"resolved": "https://registry.npmjs.org/lodash.defaults/-/lodash.defaults-4.2.0.tgz", "resolved": "https://registry.npmjs.org/lodash.defaults/-/lodash.defaults-4.2.0.tgz",
@@ -3462,6 +3702,14 @@
"url": "https://github.com/fb55/nth-check?sponsor=1" "url": "https://github.com/fb55/nth-check?sponsor=1"
} }
}, },
"node_modules/oauth4webapi": {
"version": "3.8.3",
"resolved": "https://registry.npmjs.org/oauth4webapi/-/oauth4webapi-3.8.3.tgz",
"integrity": "sha512-pQ5BsX3QRTgnt5HxgHwgunIRaDXBdkT23tf8dfzmtTIL2LTpdmxgbpbBm0VgFWAIDlezQvQCTgnVIUmHupXHxw==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/object-assign": { "node_modules/object-assign": {
"version": "4.1.1", "version": "4.1.1",
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz", "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
@@ -3500,6 +3748,18 @@
"wrappy": "1" "wrappy": "1"
} }
}, },
"node_modules/openid-client": {
"version": "6.8.1",
"resolved": "https://registry.npmjs.org/openid-client/-/openid-client-6.8.1.tgz",
"integrity": "sha512-VoYT6enBo6Vj2j3Q5Ec0AezS+9YGzQo1f5Xc42lreMGlfP4ljiXPKVDvCADh+XHCV/bqPu/wWSiCVXbJKvrODw==",
"dependencies": {
"jose": "^6.1.0",
"oauth4webapi": "^3.8.2"
},
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/pac-proxy-agent": { "node_modules/pac-proxy-agent": {
"version": "7.2.0", "version": "7.2.0",
"resolved": "https://registry.npmjs.org/pac-proxy-agent/-/pac-proxy-agent-7.2.0.tgz", "resolved": "https://registry.npmjs.org/pac-proxy-agent/-/pac-proxy-agent-7.2.0.tgz",
@@ -4416,6 +4676,11 @@
"url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1" "url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1"
} }
}, },
"node_modules/rfc4648": {
"version": "1.5.4",
"resolved": "https://registry.npmjs.org/rfc4648/-/rfc4648-1.5.4.tgz",
"integrity": "sha512-rRg/6Lb+IGfJqO05HZkN50UtY7K/JhxJag1kP23+zyMfrvoB0B7RWv06MbOzoc79RgCdNTiUaNsTT1AJZ7Z+cg=="
},
"node_modules/rimraf": { "node_modules/rimraf": {
"version": "3.0.2", "version": "3.0.2",
"resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz", "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz",
@@ -4846,6 +5111,14 @@
"node": ">= 0.8" "node": ">= 0.8"
} }
}, },
"node_modules/stream-buffers": {
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/stream-buffers/-/stream-buffers-3.0.3.tgz",
"integrity": "sha512-pqMqwQCso0PBJt2PQmDO0cFj0lyqmiwOMiMSkVtRokl7e+ZTRYgDHKnuZNbqjiJXgsg4nuqtD/zxuo9KqTp0Yw==",
"engines": {
"node": ">= 0.10.0"
}
},
"node_modules/streamx": { "node_modules/streamx": {
"version": "2.23.0", "version": "2.23.0",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.23.0.tgz", "resolved": "https://registry.npmjs.org/streamx/-/streamx-2.23.0.tgz",
@@ -5065,8 +5338,7 @@
"node_modules/undici-types": { "node_modules/undici-types": {
"version": "6.21.0", "version": "6.21.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz", "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
"integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==", "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ=="
"devOptional": true
}, },
"node_modules/universalify": { "node_modules/universalify": {
"version": "2.0.1", "version": "2.0.1",
@@ -5089,6 +5361,14 @@
"resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz", "resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz",
"integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg==" "integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg=="
}, },
"node_modules/user-agents": {
"version": "1.1.669",
"resolved": "https://registry.npmjs.org/user-agents/-/user-agents-1.1.669.tgz",
"integrity": "sha512-pbIzG+AOqCaIpySKJ4IAm1l0VyE4jMnK4y1thV8lm8PYxI+7X5uWcppOK7zY79TCKKTAnJH3/4gaVIZHsjrmJA==",
"dependencies": {
"lodash.clonedeep": "^4.5.0"
}
},
"node_modules/util": { "node_modules/util": {
"version": "0.12.5", "version": "0.12.5",
"resolved": "https://registry.npmjs.org/util/-/util-0.12.5.tgz", "resolved": "https://registry.npmjs.org/util/-/util-0.12.5.tgz",


@@ -1,6 +1,6 @@
{ {
"name": "dutchie-menus-backend", "name": "dutchie-menus-backend",
"version": "1.5.1", "version": "1.6.0",
"description": "Backend API for Dutchie Menus scraper and management", "description": "Backend API for Dutchie Menus scraper and management",
"main": "dist/index.js", "main": "dist/index.js",
"scripts": { "scripts": {
@@ -22,6 +22,7 @@
"seed:dt:cities:bulk": "tsx src/scripts/seed-dt-cities-bulk.ts" "seed:dt:cities:bulk": "tsx src/scripts/seed-dt-cities-bulk.ts"
}, },
"dependencies": { "dependencies": {
"@kubernetes/client-node": "^1.4.0",
"@types/bcryptjs": "^3.0.0", "@types/bcryptjs": "^3.0.0",
"axios": "^1.6.2", "axios": "^1.6.2",
"bcrypt": "^5.1.1", "bcrypt": "^5.1.1",
@@ -48,6 +49,7 @@
"puppeteer-extra-plugin-stealth": "^2.11.2", "puppeteer-extra-plugin-stealth": "^2.11.2",
"sharp": "^0.32.0", "sharp": "^0.32.0",
"socks-proxy-agent": "^8.0.2", "socks-proxy-agent": "^8.0.2",
"user-agents": "^1.1.669",
"uuid": "^9.0.1", "uuid": "^9.0.1",
"zod": "^3.22.4" "zod": "^3.22.4"
}, },

Binary file not shown.


@@ -0,0 +1 @@
cannaiq-menus-1.6.0.zip


@@ -32,6 +32,7 @@ const TRUSTED_ORIGINS = [
// Pattern-based trusted origins (wildcards) // Pattern-based trusted origins (wildcards)
const TRUSTED_ORIGIN_PATTERNS = [ const TRUSTED_ORIGIN_PATTERNS = [
/^https:\/\/.*\.cannabrands\.app$/, // *.cannabrands.app /^https:\/\/.*\.cannabrands\.app$/, // *.cannabrands.app
/^https:\/\/.*\.cannaiq\.co$/, // *.cannaiq.co
]; ];
// Trusted IPs for internal pod-to-pod communication // Trusted IPs for internal pod-to-pod communication
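Note on the hunk above: a minimal sketch of what the two wildcard patterns accept, assuming they are tested against the request's `Origin` value. The helper name `matchesTrustedPattern` is hypothetical and does not appear in the diff; only the two regular expressions are taken from it.

```typescript
// Patterns copied from the hunk above.
const TRUSTED_ORIGIN_PATTERNS: RegExp[] = [
  /^https:\/\/.*\.cannabrands\.app$/, // *.cannabrands.app
  /^https:\/\/.*\.cannaiq\.co$/,      // *.cannaiq.co
];

// Hypothetical helper: true when any pattern matches the whole origin string.
function matchesTrustedPattern(origin: string): boolean {
  return TRUSTED_ORIGIN_PATTERNS.some((pattern) => pattern.test(origin));
}

// Subdomains over HTTPS match.
console.log(matchesTrustedPattern('https://app.cannaiq.co'));
// The bare apex domain does not match, because the pattern requires a dot
// before "cannaiq".
console.log(matchesTrustedPattern('https://cannaiq.co'));
// Plain HTTP and look-alike suffixes do not match either.
console.log(matchesTrustedPattern('http://app.cannaiq.co'));
console.log(matchesTrustedPattern('https://app.cannaiq.co.example.com'));
```

If the apex domain is meant to be trusted, it would need to be covered separately, for instance by the exact-match `TRUSTED_ORIGINS` list, whose contents are not shown in this hunk.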
@@ -152,22 +153,10 @@ export async function authenticateUser(email: string, password: string): Promise
} }
export async function authMiddleware(req: AuthRequest, res: Response, next: NextFunction) { export async function authMiddleware(req: AuthRequest, res: Response, next: NextFunction) {
// Allow trusted origins/IPs to bypass auth (internal services, same-origin)
if (isTrustedRequest(req)) {
req.user = {
id: 0,
email: 'internal@system',
role: 'internal'
};
return next();
}
const authHeader = req.headers.authorization; const authHeader = req.headers.authorization;
if (!authHeader || !authHeader.startsWith('Bearer ')) { // If a Bearer token is provided, always try to use it first (logged-in user)
return res.status(401).json({ error: 'No token provided' }); if (authHeader && authHeader.startsWith('Bearer ')) {
}
const token = authHeader.substring(7); const token = authHeader.substring(7);
// Try JWT first // Try JWT first
@@ -186,56 +175,44 @@ export async function authMiddleware(req: AuthRequest, res: Response, next: Next
WHERE token = $1 WHERE token = $1
`, [token]); `, [token]);
if (result.rows.length === 0) { if (result.rows.length > 0) {
const apiToken = result.rows[0];
if (!apiToken.active) {
return res.status(401).json({ error: 'API token is inactive' });
}
if (apiToken.expires_at && new Date(apiToken.expires_at) < new Date()) {
return res.status(401).json({ error: 'API token has expired' });
}
req.user = {
id: 0,
email: `api:${apiToken.name}`,
role: 'api_token'
};
req.apiToken = apiToken;
return next();
}
} catch (err) {
console.error('API token lookup error:', err);
}
// Token provided but invalid
return res.status(401).json({ error: 'Invalid token' }); return res.status(401).json({ error: 'Invalid token' });
} }
const apiToken = result.rows[0]; // No token provided - check trusted origins for API access (WordPress, etc.)
if (isTrustedRequest(req)) {
// Check if token is active
if (!apiToken.active) {
return res.status(401).json({ error: 'Token is disabled' });
}
// Check if token is expired
if (apiToken.expires_at && new Date(apiToken.expires_at) < new Date()) {
return res.status(401).json({ error: 'Token has expired' });
}
// Check allowed endpoints
if (apiToken.allowed_endpoints && apiToken.allowed_endpoints.length > 0) {
const isAllowed = apiToken.allowed_endpoints.some((pattern: string) => {
// Simple wildcard matching
const regex = new RegExp('^' + pattern.replace('*', '.*') + '$');
return regex.test(req.path);
});
if (!isAllowed) {
return res.status(403).json({ error: 'Endpoint not allowed for this token' });
}
}
// Set API token on request for tracking
req.apiToken = {
id: apiToken.id,
name: apiToken.name,
rate_limit: apiToken.rate_limit
};
// Set a generic user for compatibility with existing code
req.user = { req.user = {
id: apiToken.id, id: 0,
email: `api-token-${apiToken.id}@system`, email: 'internal@system',
role: 'api' role: 'internal'
}; };
return next();
next();
} catch (error) {
console.error('Error verifying API token:', error);
return res.status(500).json({ error: 'Authentication failed' });
} }
return res.status(401).json({ error: 'No token provided' });
} }
/** /**
* Require specific role(s) to access endpoint. * Require specific role(s) to access endpoint.
* *
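
Reviewer sketch of the decision order the new `authMiddleware` appears to follow: a Bearer token, when present, is always validated first and never falls through to the trusted-origin path; only a request with no token is checked against trusted origins. This is a framework-free model, not the project's code. `resolveAuth`, `AuthDeps`, and the helper signatures are hypothetical names introduced for illustration, and the separate "inactive" and "expired" API-token responses in the diff are collapsed into a single failure here.

```typescript
// Hypothetical model of the new decision order; names are illustrative only.
type AuthUser = { id: number; email: string; role: string };

type AuthOutcome =
  | { ok: true; user: AuthUser }
  | { ok: false; status: 401; error: string };

interface AuthDeps {
  // Assumed helpers: each returns a user for a valid token, otherwise null.
  verifyJwt(token: string): AuthUser | null;
  lookupApiToken(token: string): AuthUser | null;
  isTrustedRequest(): boolean;
}

function resolveAuth(authHeader: string | undefined, deps: AuthDeps): AuthOutcome {
  if (authHeader && authHeader.startsWith('Bearer ')) {
    const token = authHeader.substring(7);
    // JWT is tried first, then the API token table.
    const user = deps.verifyJwt(token) ?? deps.lookupApiToken(token);
    if (user) return { ok: true, user };
    // A token was supplied but did not validate: do not fall back to trusted origins.
    return { ok: false, status: 401, error: 'Invalid token' };
  }
  // No token at all: trusted origins get a synthetic internal user.
  if (deps.isTrustedRequest()) {
    return { ok: true, user: { id: 0, email: 'internal@system', role: 'internal' } };
  }
  return { ok: false, status: 401, error: 'No token provided' };
}
```

One consequence of this ordering is that a bad token sent from a trusted origin is rejected rather than silently downgraded to the internal user.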


@@ -172,6 +172,9 @@ export async function runFullDiscovery(
console.log(`Errors: ${totalErrors}`); console.log(`Errors: ${totalErrors}`);
} }
// Per TASK_WORKFLOW_2024-12-10.md: Track new dispensary IDs for task chaining
let newDispensaryIds: number[] = [];
// Step 4: Auto-validate and promote discovered locations // Step 4: Auto-validate and promote discovered locations
if (!dryRun && totalLocationsUpserted > 0) { if (!dryRun && totalLocationsUpserted > 0) {
console.log('\n[Discovery] Step 4: Auto-promoting discovered locations...'); console.log('\n[Discovery] Step 4: Auto-promoting discovered locations...');
@@ -180,6 +183,13 @@ export async function runFullDiscovery(
console.log(` Created: ${promotionResult.created} new dispensaries`); console.log(` Created: ${promotionResult.created} new dispensaries`);
console.log(` Updated: ${promotionResult.updated} existing dispensaries`); console.log(` Updated: ${promotionResult.updated} existing dispensaries`);
console.log(` Rejected: ${promotionResult.rejected} (validation failed)`); console.log(` Rejected: ${promotionResult.rejected} (validation failed)`);
// Per TASK_WORKFLOW_2024-12-10.md: Capture new IDs for task chaining
newDispensaryIds = promotionResult.newDispensaryIds;
if (newDispensaryIds.length > 0) {
console.log(` New store IDs for crawl: [${newDispensaryIds.join(', ')}]`);
}
if (promotionResult.rejectedRecords.length > 0) { if (promotionResult.rejectedRecords.length > 0) {
console.log(` Rejection reasons:`); console.log(` Rejection reasons:`);
promotionResult.rejectedRecords.slice(0, 5).forEach(r => { promotionResult.rejectedRecords.slice(0, 5).forEach(r => {
@@ -214,6 +224,8 @@ export async function runFullDiscovery(
totalLocationsFound, totalLocationsFound,
totalLocationsUpserted, totalLocationsUpserted,
durationMs, durationMs,
// Per TASK_WORKFLOW_2024-12-10.md: Return new IDs for task chaining
newDispensaryIds,
}; };
} }


@@ -127,6 +127,8 @@ export interface PromotionSummary {
errors: string[]; errors: string[];
}>; }>;
durationMs: number; durationMs: number;
// Per TASK_WORKFLOW_2024-12-10.md: Track new dispensary IDs for task chaining
newDispensaryIds: number[];
} }
/** /**
@@ -469,6 +471,8 @@ export async function promoteDiscoveredLocations(
const results: PromotionResult[] = []; const results: PromotionResult[] = [];
const rejectedRecords: PromotionSummary['rejectedRecords'] = []; const rejectedRecords: PromotionSummary['rejectedRecords'] = [];
// Per TASK_WORKFLOW_2024-12-10.md: Track new dispensary IDs for task chaining
const newDispensaryIds: number[] = [];
let created = 0; let created = 0;
let updated = 0; let updated = 0;
let skipped = 0; let skipped = 0;
@@ -525,6 +529,8 @@ export async function promoteDiscoveredLocations(
if (promotionResult.action === 'created') { if (promotionResult.action === 'created') {
created++; created++;
// Per TASK_WORKFLOW_2024-12-10.md: Track new IDs for task chaining
newDispensaryIds.push(promotionResult.dispensaryId);
} else { } else {
updated++; updated++;
} }
@@ -548,6 +554,8 @@ export async function promoteDiscoveredLocations(
results, results,
rejectedRecords, rejectedRecords,
durationMs: Date.now() - startTime, durationMs: Date.now() - startTime,
// Per TASK_WORKFLOW_2024-12-10.md: Return new IDs for task chaining
newDispensaryIds,
}; };
} }
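
Reviewer sketch of the ID tracking added above: only results whose action is `'created'` contribute an ID, and because `FullDiscoveryResult.newDispensaryIds` is declared optional, a consumer probably wants a default. The type and function names below (`PromotionOutcome`, `collectNewDispensaryIds`, `idsToCrawl`) are hypothetical and stand in for the project's real types.

```typescript
// Hypothetical shapes mirroring the fields visible in the diff.
interface PromotionOutcome {
  action: 'created' | 'updated';
  dispensaryId: number;
}

interface DiscoveryResultLike {
  newDispensaryIds?: number[];
}

// Keep only the IDs of dispensaries that were newly created.
function collectNewDispensaryIds(outcomes: PromotionOutcome[]): number[] {
  return outcomes
    .filter((o) => o.action === 'created')
    .map((o) => o.dispensaryId);
}

// The field is optional on the result type, so default to an empty list.
function idsToCrawl(result: DiscoveryResultLike): number[] {
  return result.newDispensaryIds ?? [];
}
```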


@@ -211,6 +211,8 @@ export interface FullDiscoveryResult {
totalLocationsFound: number; totalLocationsFound: number;
totalLocationsUpserted: number; totalLocationsUpserted: number;
durationMs: number; durationMs: number;
// Per TASK_WORKFLOW_2024-12-10.md: Track new dispensary IDs for task chaining
newDispensaryIds?: number[];
} }
// ============================================================ // ============================================================


@@ -90,7 +90,7 @@ export async function upsertStoreProducts(
name_raw, brand_name_raw, category_raw, subcategory_raw, name_raw, brand_name_raw, category_raw, subcategory_raw,
price_rec, price_med, price_rec_special, price_med_special, price_rec, price_med, price_rec_special, price_med_special,
is_on_special, discount_percent, is_on_special, discount_percent,
is_in_stock, stock_status, is_in_stock, stock_status, stock_quantity, total_quantity_available,
thc_percent, cbd_percent, thc_percent, cbd_percent,
image_url, image_url,
first_seen_at, last_seen_at, updated_at first_seen_at, last_seen_at, updated_at
@@ -99,9 +99,9 @@ export async function upsertStoreProducts(
$5, $6, $7, $8, $5, $6, $7, $8,
$9, $10, $11, $12, $9, $10, $11, $12,
$13, $14, $13, $14,
$15, $16, $15, $16, $17, $17,
$17, $18, $18, $19,
$19, $20,
NOW(), NOW(), NOW() NOW(), NOW(), NOW()
) )
ON CONFLICT (dispensary_id, provider, provider_product_id) ON CONFLICT (dispensary_id, provider, provider_product_id)
@@ -118,6 +118,8 @@ export async function upsertStoreProducts(
discount_percent = EXCLUDED.discount_percent, discount_percent = EXCLUDED.discount_percent,
is_in_stock = EXCLUDED.is_in_stock, is_in_stock = EXCLUDED.is_in_stock,
stock_status = EXCLUDED.stock_status, stock_status = EXCLUDED.stock_status,
stock_quantity = EXCLUDED.stock_quantity,
total_quantity_available = EXCLUDED.total_quantity_available,
thc_percent = EXCLUDED.thc_percent, thc_percent = EXCLUDED.thc_percent,
cbd_percent = EXCLUDED.cbd_percent, cbd_percent = EXCLUDED.cbd_percent,
image_url = EXCLUDED.image_url, image_url = EXCLUDED.image_url,
@@ -141,6 +143,7 @@ export async function upsertStoreProducts(
productPricing?.discountPercent, productPricing?.discountPercent,
productAvailability?.inStock ?? true, productAvailability?.inStock ?? true,
productAvailability?.stockStatus || 'unknown', productAvailability?.stockStatus || 'unknown',
productAvailability?.quantity ?? null, // stock_quantity and total_quantity_available
// Clamp THC/CBD to valid percentage range (0-100) - some products report mg as % // Clamp THC/CBD to valid percentage range (0-100) - some products report mg as %
product.thcPercent !== null && product.thcPercent <= 100 ? product.thcPercent : null, product.thcPercent !== null && product.thcPercent <= 100 ? product.thcPercent : null,
product.cbdPercent !== null && product.cbdPercent <= 100 ? product.cbdPercent : null, product.cbdPercent !== null && product.cbdPercent <= 100 ? product.cbdPercent : null,
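
Reviewer sketch related to the THC/CBD handling above. The comment describes a 0-100 range, while the expression in the diff only tests the upper bound, so a negative value would pass through. A minimal helper that enforces both bounds could look like the following; `percentOrNull` is a hypothetical name, and rejecting negative and non-finite values is an assumption about the intended behavior, not something the diff states.

```typescript
// Hypothetical helper: returns the value only when it is a finite number in [0, 100].
function percentOrNull(value: number | null | undefined): number | null {
  if (value === null || value === undefined || !Number.isFinite(value)) {
    return null;
  }
  return value >= 0 && value <= 100 ? value : null;
}
```

Like the original expression, this discards out-of-range values rather than clamping them to the nearest bound, since a reading such as 250 is more likely milligrams than a percentage.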


@@ -6,6 +6,8 @@ import { initializeMinio, isMinioEnabled } from './utils/minio';
import { initializeImageStorage } from './utils/image-storage'; import { initializeImageStorage } from './utils/image-storage';
import { logger } from './services/logger'; import { logger } from './services/logger';
import { cleanupOrphanedJobs } from './services/proxyTestQueue'; import { cleanupOrphanedJobs } from './services/proxyTestQueue';
// Per TASK_WORKFLOW_2024-12-10.md: Database-driven task scheduler
import { taskScheduler } from './services/task-scheduler';
import { runAutoMigrations } from './db/auto-migrate'; import { runAutoMigrations } from './db/auto-migrate';
import { getPool } from './db/pool'; import { getPool } from './db/pool';
import healthRoutes from './routes/health'; import healthRoutes from './routes/health';
@@ -142,6 +144,9 @@ import seoRoutes from './routes/seo';
import priceAnalyticsRoutes from './routes/price-analytics'; import priceAnalyticsRoutes from './routes/price-analytics';
import tasksRoutes from './routes/tasks'; import tasksRoutes from './routes/tasks';
import workerRegistryRoutes from './routes/worker-registry'; import workerRegistryRoutes from './routes/worker-registry';
// Per TASK_WORKFLOW_2024-12-10.md: Raw payload access API
import payloadsRoutes from './routes/payloads';
import k8sRoutes from './routes/k8s';
// Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com) // Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com)
// These domains can access the API without authentication // These domains can access the API without authentication
@@ -222,6 +227,14 @@ console.log('[Tasks] Routes registered at /api/tasks');
app.use('/api/worker-registry', workerRegistryRoutes); app.use('/api/worker-registry', workerRegistryRoutes);
console.log('[WorkerRegistry] Routes registered at /api/worker-registry'); console.log('[WorkerRegistry] Routes registered at /api/worker-registry');
// Per TASK_WORKFLOW_2024-12-10.md: Raw payload access API
app.use('/api/payloads', payloadsRoutes);
console.log('[Payloads] Routes registered at /api/payloads');
// K8s control routes - worker scaling from admin UI
app.use('/api/k8s', k8sRoutes);
console.log('[K8s] Routes registered at /api/k8s');
// Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation // Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation
try { try {
const analyticsV2Router = createAnalyticsV2Router(getPool()); const analyticsV2Router = createAnalyticsV2Router(getPool());
@@ -326,6 +339,17 @@ async function startServer() {
// Clean up any orphaned proxy test jobs from previous server runs // Clean up any orphaned proxy test jobs from previous server runs
await cleanupOrphanedJobs(); await cleanupOrphanedJobs();
// Per TASK_WORKFLOW_2024-12-10.md: Start database-driven task scheduler
// This replaces node-cron - schedules are stored in DB and survive restarts
// Uses SELECT FOR UPDATE SKIP LOCKED for multi-replica safety
try {
await taskScheduler.start();
logger.info('system', 'Task scheduler started');
} catch (err: any) {
// Non-fatal - scheduler can recover on next poll
logger.warn('system', `Task scheduler startup warning: ${err.message}`);
}
app.listen(PORT, () => { app.listen(PORT, () => {
logger.info('system', `Server running on port ${PORT}`); logger.info('system', `Server running on port ${PORT}`);
console.log(`🚀 Server running on port ${PORT}`); console.log(`🚀 Server running on port ${PORT}`);
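
Reviewer sketch of the `SELECT FOR UPDATE SKIP LOCKED` pattern the startup comment refers to. The diff does not show the scheduler's schema or queries, so everything here is assumed: the table `task_schedules` and its columns `id`, `enabled`, and `next_run_at` are hypothetical, and `buildClaimDueSchedulesQuery` is an illustrative name. The block only builds the parameterized query text; it would need to run inside a transaction for the row locks to be held while the schedules are processed.

```typescript
interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds a query that claims up to `limit` due schedules.
// Rows already locked by another replica are skipped instead of waited on.
function buildClaimDueSchedulesQuery(limit: number): SqlQuery {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError('limit must be a positive integer');
  }
  const text = `
    SELECT id
    FROM task_schedules          -- hypothetical table name
    WHERE enabled = true         -- hypothetical column
      AND next_run_at <= NOW()   -- hypothetical column
    ORDER BY next_run_at
    LIMIT $1
    FOR UPDATE SKIP LOCKED
  `;
  return { text, values: [limit] };
}
```

With this shape, two replicas polling at the same moment each receive a disjoint set of rows, which is the multi-replica safety property the comment describes.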


@@ -5,8 +5,8 @@ import { Request, Response, NextFunction } from 'express';
* These are our own frontends that should have unrestricted access. * These are our own frontends that should have unrestricted access.
*/ */
const TRUSTED_DOMAINS = [ const TRUSTED_DOMAINS = [
'cannaiq.co', '*.cannaiq.co',
'www.cannaiq.co', '*.cannabrands.app',
'findagram.co', 'findagram.co',
'www.findagram.co', 'www.findagram.co',
'findadispo.com', 'findadispo.com',
@@ -32,6 +32,24 @@ function extractDomain(header: string): string | null {
} }
} }
/**
* Checks if a domain matches any trusted domain (supports *.domain.com wildcards)
*/
function isTrustedDomain(domain: string): boolean {
for (const trusted of TRUSTED_DOMAINS) {
if (trusted.startsWith('*.')) {
// Wildcard: *.example.com matches example.com and any subdomain
const baseDomain = trusted.slice(2);
if (domain === baseDomain || domain.endsWith('.' + baseDomain)) {
return true;
}
} else if (domain === trusted) {
return true;
}
}
return false;
}
/** /**
* Checks if the request comes from a trusted domain * Checks if the request comes from a trusted domain
*/ */
@@ -42,7 +60,7 @@ function isRequestFromTrustedDomain(req: Request): boolean {
// Check Origin header first (preferred for CORS requests) // Check Origin header first (preferred for CORS requests)
if (origin) { if (origin) {
const domain = extractDomain(origin); const domain = extractDomain(origin);
if (domain && TRUSTED_DOMAINS.includes(domain)) { if (domain && isTrustedDomain(domain)) {
return true; return true;
} }
} }
@@ -50,7 +68,7 @@ function isRequestFromTrustedDomain(req: Request): boolean {
// Fallback to Referer header // Fallback to Referer header
if (referer) { if (referer) {
const domain = extractDomain(referer); const domain = extractDomain(referer);
if (domain && TRUSTED_DOMAINS.includes(domain)) { if (domain && isTrustedDomain(domain)) {
return true; return true;
} }
} }
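
Reviewer sketch of the wildcard matching introduced above, written as a standalone function so the edge cases are easy to check. It follows the logic in the diff: a `*.example.com` entry matches the bare domain and any subdomain, while a plain entry must match exactly. The list passed in the usage below contains only the entries visible in this hunk (the real list continues past it), and `matchesTrustedDomain` is a hypothetical name.

```typescript
// Returns true when `domain` matches an entry in `trusted`.
// Entries starting with "*." match the base domain and all of its subdomains.
function matchesTrustedDomain(domain: string, trusted: readonly string[]): boolean {
  for (const entry of trusted) {
    if (entry.startsWith('*.')) {
      const base = entry.slice(2);
      // The leading dot in the suffix check prevents "evilexample.com" from matching.
      if (domain === base || domain.endsWith('.' + base)) {
        return true;
      }
    } else if (domain === entry) {
      return true;
    }
  }
  return false;
}

// Entries visible in the diff; the full list is not shown there.
const VISIBLE_TRUSTED = ['*.cannaiq.co', '*.cannabrands.app', 'findagram.co', 'www.findagram.co', 'findadispo.com'];

console.log(matchesTrustedDomain('admin.cannaiq.co', VISIBLE_TRUSTED));
console.log(matchesTrustedDomain('evilcannaiq.co', VISIBLE_TRUSTED));
```

Because `Origin` and `Referer` are set by the client, this check identifies browser traffic from the project's own frontends but is not a substitute for authentication against non-browser callers.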


@@ -702,12 +702,10 @@ export class StateQueryService {
async getNationalSummary(): Promise<NationalSummary> { async getNationalSummary(): Promise<NationalSummary> {
const stateMetrics = await this.getAllStateMetrics(); const stateMetrics = await this.getAllStateMetrics();
// Get all states count and aggregate metrics
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
COUNT(DISTINCT s.code) AS total_states, COUNT(DISTINCT s.code) AS total_states,
COUNT(DISTINCT CASE WHEN EXISTS (
SELECT 1 FROM dispensaries d WHERE d.state = s.code AND d.menu_type IS NOT NULL
) THEN s.code END) AS active_states,
(SELECT COUNT(*) FROM dispensaries WHERE state IS NOT NULL) AS total_stores, (SELECT COUNT(*) FROM dispensaries WHERE state IS NOT NULL) AS total_stores,
(SELECT COUNT(*) FROM store_products sp (SELECT COUNT(*) FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id JOIN dispensaries d ON sp.dispensary_id = d.id
@@ -725,7 +723,7 @@ export class StateQueryService {
return { return {
totalStates: parseInt(data.total_states), totalStates: parseInt(data.total_states),
activeStates: parseInt(data.active_states), activeStates: parseInt(data.total_states), // Same as totalStates - all states shown
totalStores: parseInt(data.total_stores), totalStores: parseInt(data.total_stores),
totalProducts: parseInt(data.total_products), totalProducts: parseInt(data.total_products),
totalBrands: parseInt(data.total_brands), totalBrands: parseInt(data.total_brands),


@@ -5,22 +5,35 @@
* *
* DO NOT MODIFY THIS FILE WITHOUT EXPLICIT AUTHORIZATION. * DO NOT MODIFY THIS FILE WITHOUT EXPLICIT AUTHORIZATION.
* *
* This is the canonical HTTP client for all Dutchie communication. * Updated: 2025-12-10 per workflow-12102025.md
* All Dutchie workers (Alice, Bella, etc.) MUST use this client. *
* KEY BEHAVIORS (per workflow-12102025.md):
* 1. startSession() gets identity from PROXY LOCATION, not task params
* 2. On 403: immediately get new IP + new fingerprint, then retry
* 3. After 3 consecutive 403s on same proxy → disable it (burned)
* 4. Language is always English (en-US)
* *
* IMPLEMENTATION: * IMPLEMENTATION:
* - Uses curl via child_process.execSync (bypasses TLS fingerprinting) * - Uses curl via child_process.execSync (bypasses TLS fingerprinting)
* - NO Puppeteer, NO axios, NO fetch * - NO Puppeteer, NO axios, NO fetch
* - Fingerprint rotation on 403 * - Uses intoli/user-agents via CrawlRotator for realistic fingerprints
* - Residential IP compatible * - Residential IP compatible
* *
* USAGE: * USAGE:
* import { curlPost, curlGet, executeGraphQL } from '@dutchie/client'; * import { curlPost, curlGet, executeGraphQL, startSession } from '@dutchie/client';
* *
* ============================================================ * ============================================================
*/ */
import { execSync } from 'child_process'; import { execSync } from 'child_process';
import {
buildOrderedHeaders,
buildRefererFromMenuUrl,
getCurlBinary,
isCurlImpersonateAvailable,
HeaderContext,
BrowserType,
} from '../../services/http-fingerprint';
// ============================================================ // ============================================================
// TYPES // TYPES
@@ -32,6 +45,8 @@ export interface CurlResponse {
error?: string; error?: string;
} }
// Per workflow-12102025.md: fingerprint comes from CrawlRotator's BrowserFingerprint
// We keep a simplified interface here for header building
export interface Fingerprint { export interface Fingerprint {
userAgent: string; userAgent: string;
acceptLanguage: string; acceptLanguage: string;
@@ -57,15 +72,13 @@ export const DUTCHIE_CONFIG = {
// ============================================================ // ============================================================
// PROXY SUPPORT // PROXY SUPPORT
// ============================================================ // Per workflow-12102025.md:
// Integrates with the CrawlRotator system from proxy-rotator.ts // - On 403: recordBlock() → increment consecutive_403_count
// On 403 errors: // - After 3 consecutive 403s → proxy disabled
// 1. Record failure on current proxy // - Immediately rotate to new IP + new fingerprint on 403
// 2. Rotate to next proxy
// 3. Retry with new proxy
// ============================================================ // ============================================================
import type { CrawlRotator, Proxy } from '../../services/crawl-rotator'; import type { CrawlRotator, BrowserFingerprint } from '../../services/crawl-rotator';
let currentProxy: string | null = null; let currentProxy: string | null = null;
let crawlRotator: CrawlRotator | null = null; let crawlRotator: CrawlRotator | null = null;
@@ -92,13 +105,12 @@ export function getProxy(): string | null {
/** /**
* Set CrawlRotator for proxy rotation on 403s * Set CrawlRotator for proxy rotation on 403s
* This enables automatic proxy rotation when blocked * Per workflow-12102025.md: enables automatic rotation when blocked
*/ */
export function setCrawlRotator(rotator: CrawlRotator | null): void { export function setCrawlRotator(rotator: CrawlRotator | null): void {
crawlRotator = rotator; crawlRotator = rotator;
if (rotator) { if (rotator) {
console.log('[Dutchie Client] CrawlRotator attached - proxy rotation enabled'); console.log('[Dutchie Client] CrawlRotator attached - proxy rotation enabled');
// Set initial proxy from rotator
const proxy = rotator.proxy.getCurrent(); const proxy = rotator.proxy.getCurrent();
if (proxy) { if (proxy) {
currentProxy = rotator.proxy.getProxyUrl(proxy); currentProxy = rotator.proxy.getProxyUrl(proxy);
@@ -115,30 +127,41 @@ export function getCrawlRotator(): CrawlRotator | null {
} }
/** /**
* Rotate to next proxy (called on 403) * Handle 403 block - per workflow-12102025.md:
* 1. Record block on current proxy (increments consecutive_403_count)
* 2. Immediately rotate to new proxy (new IP)
* 3. Rotate fingerprint
* Returns false if no more proxies available
*/ */
async function rotateProxyOn403(error?: string): Promise<boolean> { async function handle403Block(): Promise<boolean> {
if (!crawlRotator) { if (!crawlRotator) {
console.warn('[Dutchie Client] No CrawlRotator - cannot handle 403');
return false; return false;
} }
// Record failure on current proxy // Per workflow-12102025.md: record block (tracks consecutive 403s)
await crawlRotator.recordFailure(error || '403 Forbidden'); const wasDisabled = await crawlRotator.recordBlock();
if (wasDisabled) {
console.log('[Dutchie Client] Current proxy was disabled (3 consecutive 403s)');
}
// Per workflow-12102025.md: immediately get new IP + new fingerprint
const { proxy: nextProxy, fingerprint } = crawlRotator.rotateBoth();
// Rotate to next proxy
const nextProxy = crawlRotator.rotateProxy();
if (nextProxy) { if (nextProxy) {
currentProxy = crawlRotator.proxy.getProxyUrl(nextProxy); currentProxy = crawlRotator.proxy.getProxyUrl(nextProxy);
console.log(`[Dutchie Client] Rotated proxy: ${currentProxy.replace(/:[^:@]+@/, ':***@')}`); console.log(`[Dutchie Client] Rotated to new proxy: ${currentProxy.replace(/:[^:@]+@/, ':***@')}`);
console.log(`[Dutchie Client] New fingerprint: ${fingerprint.userAgent.slice(0, 50)}...`);
return true; return true;
} }
console.warn('[Dutchie Client] No more proxies available'); console.error('[Dutchie Client] No more proxies available!');
return false; return false;
} }
/** /**
* Record success on current proxy * Record success on current proxy
* Per workflow-12102025.md: resets consecutive_403_count
*/ */
async function recordProxySuccess(responseTimeMs?: number): Promise<void> { async function recordProxySuccess(responseTimeMs?: number): Promise<void> {
if (crawlRotator) { if (crawlRotator) {
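
Reviewer sketch of the block-tracking behavior the comments above describe: each 403 increments a per-proxy consecutive counter, a success resets it, and the third consecutive 403 disables the proxy. `CrawlRotator.recordBlock()` is not shown in this diff, so this is an in-memory model under that description, not its implementation; `ProxyBlockTracker` and its methods are hypothetical names.

```typescript
// In-memory model of "disable after N consecutive 403s".
class ProxyBlockTracker {
  private readonly maxConsecutive: number;
  private readonly counts = new Map<string, number>();
  private readonly disabled = new Set<string>();

  constructor(maxConsecutive = 3) {
    this.maxConsecutive = maxConsecutive;
  }

  // Records a 403 for the proxy. Returns true when this block disables it.
  recordBlock(proxyId: string): boolean {
    const next = (this.counts.get(proxyId) ?? 0) + 1;
    this.counts.set(proxyId, next);
    if (next >= this.maxConsecutive) {
      this.disabled.add(proxyId);
      return true;
    }
    return false;
  }

  // A successful response resets the consecutive counter.
  recordSuccess(proxyId: string): void {
    this.counts.set(proxyId, 0);
  }

  isDisabled(proxyId: string): boolean {
    return this.disabled.has(proxyId);
  }
}
```

The reset on success is what makes the counter "consecutive": two blocks, one success, then two more blocks leave the proxy enabled.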
@@ -162,163 +185,69 @@ export const GRAPHQL_HASHES = {
GetAllCitiesByState: 'ae547a0466ace5a48f91e55bf6699eacd87e3a42841560f0c0eabed5a0a920e6', GetAllCitiesByState: 'ae547a0466ace5a48f91e55bf6699eacd87e3a42841560f0c0eabed5a0a920e6',
}; };
// ============================================================
// FINGERPRINTS - Browser profiles for anti-detect
// ============================================================
const FINGERPRINTS: Fingerprint[] = [
// Chrome Windows (latest) - typical residential user, use first
{
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
acceptLanguage: 'en-US,en;q=0.9',
secChUa: '"Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"',
secChUaPlatform: '"Windows"',
secChUaMobile: '?0',
},
// Chrome Mac (latest)
{
userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
acceptLanguage: 'en-US,en;q=0.9',
secChUa: '"Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"',
secChUaPlatform: '"macOS"',
secChUaMobile: '?0',
},
// Chrome Windows (120)
{
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
acceptLanguage: 'en-US,en;q=0.9',
secChUa: '"Chromium";v="120", "Google Chrome";v="120", "Not-A.Brand";v="99"',
secChUaPlatform: '"Windows"',
secChUaMobile: '?0',
},
// Firefox Windows
{
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0',
acceptLanguage: 'en-US,en;q=0.5',
},
// Safari Mac
{
userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15',
acceptLanguage: 'en-US,en;q=0.9',
},
// Edge Windows
{
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.0.0',
acceptLanguage: 'en-US,en;q=0.9',
secChUa: '"Microsoft Edge";v="131", "Chromium";v="131", "Not_A Brand";v="24"',
secChUaPlatform: '"Windows"',
secChUaMobile: '?0',
},
];
let currentFingerprintIndex = 0;
// Forward declaration for session (actual CrawlSession interface defined later)
let currentSession: {
sessionId: string;
fingerprint: Fingerprint;
proxyUrl: string | null;
stateCode?: string;
timezone?: string;
startedAt: Date;
} | null = null;
/**
* Get current fingerprint - returns session fingerprint if active, otherwise default
*/
export function getFingerprint(): Fingerprint {
// Use session fingerprint if a session is active
if (currentSession) {
return currentSession.fingerprint;
}
return FINGERPRINTS[currentFingerprintIndex];
}
export function rotateFingerprint(): Fingerprint {
currentFingerprintIndex = (currentFingerprintIndex + 1) % FINGERPRINTS.length;
const fp = FINGERPRINTS[currentFingerprintIndex];
console.log(`[Dutchie Client] Rotated to fingerprint: ${fp.userAgent.slice(0, 50)}...`);
return fp;
}
export function resetFingerprint(): void {
currentFingerprintIndex = 0;
}
/**
* Get a random fingerprint from the pool
*/
export function getRandomFingerprint(): Fingerprint {
const index = Math.floor(Math.random() * FINGERPRINTS.length);
return FINGERPRINTS[index];
}
// ============================================================ // ============================================================
// SESSION MANAGEMENT // SESSION MANAGEMENT
// Per-session fingerprint rotation for stealth // Per workflow-12102025.md:
// - Session identity comes from PROXY LOCATION
// - NOT from task params (no stateCode/timezone params)
// - Language is always English
// ============================================================ // ============================================================
export interface CrawlSession { export interface CrawlSession {
sessionId: string; sessionId: string;
fingerprint: Fingerprint; fingerprint: BrowserFingerprint;
proxyUrl: string | null; proxyUrl: string | null;
stateCode?: string; proxyTimezone?: string;
timezone?: string; proxyState?: string;
startedAt: Date; startedAt: Date;
// Per workflow-12102025.md: Dynamic Referer per dispensary
menuUrl?: string;
referer: string;
} }
// Note: currentSession variable declared earlier in file for proper scoping let currentSession: CrawlSession | null = null;
/** /**
* Timezone to Accept-Language mapping * Start a new crawl session
* US timezones all use en-US but this can be extended for international *
* Per workflow-12102025.md:
* - NO state/timezone params - identity comes from proxy location
* - Gets fingerprint from CrawlRotator (uses intoli/user-agents)
* - Language is always English (en-US)
* - Dynamic Referer per dispensary (from menuUrl)
*
* @param menuUrl - The dispensary's menu URL for dynamic Referer header
*/ */
const TIMEZONE_TO_LOCALE: Record<string, string> = { export function startSession(menuUrl?: string): CrawlSession {
'America/Phoenix': 'en-US,en;q=0.9', if (!crawlRotator) {
'America/Los_Angeles': 'en-US,en;q=0.9', throw new Error('[Dutchie Client] Cannot start session without CrawlRotator');
'America/Denver': 'en-US,en;q=0.9', }
'America/Chicago': 'en-US,en;q=0.9',
'America/New_York': 'en-US,en;q=0.9',
'America/Detroit': 'en-US,en;q=0.9',
'America/Anchorage': 'en-US,en;q=0.9',
'Pacific/Honolulu': 'en-US,en;q=0.9',
};
/** // Per workflow-12102025.md: get identity from proxy location
* Get Accept-Language header for a given timezone const proxyLocation = crawlRotator.getProxyLocation();
*/ const fingerprint = crawlRotator.userAgent.getCurrent();
export function getLocaleForTimezone(timezone?: string): string {
if (!timezone) return 'en-US,en;q=0.9';
return TIMEZONE_TO_LOCALE[timezone] || 'en-US,en;q=0.9';
}
/** // Per workflow-12102025.md: Dynamic Referer per dispensary
* Start a new crawl session with a random fingerprint const referer = buildRefererFromMenuUrl(menuUrl);
* Call this before crawling a store to get a fresh identity
*/
export function startSession(stateCode?: string, timezone?: string): CrawlSession {
const baseFp = getRandomFingerprint();
// Override Accept-Language based on timezone for geographic consistency
const fingerprint: Fingerprint = {
...baseFp,
acceptLanguage: getLocaleForTimezone(timezone),
};
currentSession = { currentSession = {
sessionId: `session_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`, sessionId: `session_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
fingerprint, fingerprint,
proxyUrl: currentProxy, proxyUrl: currentProxy,
stateCode, proxyTimezone: proxyLocation?.timezone,
timezone, proxyState: proxyLocation?.state,
startedAt: new Date(), startedAt: new Date(),
menuUrl,
referer,
}; };
console.log(`[Dutchie Client] Started session ${currentSession.sessionId}`); console.log(`[Dutchie Client] Started session ${currentSession.sessionId}`);
console.log(`[Dutchie Client] Fingerprint: ${fingerprint.userAgent.slice(0, 50)}...`); console.log(`[Dutchie Client] Browser: ${fingerprint.browserName} (${fingerprint.deviceCategory})`);
console.log(`[Dutchie Client] Accept-Language: ${fingerprint.acceptLanguage}`); console.log(`[Dutchie Client] DNT: ${fingerprint.httpFingerprint.hasDNT ? 'enabled' : 'disabled'}`);
if (timezone) { console.log(`[Dutchie Client] TLS: ${fingerprint.httpFingerprint.curlImpersonateBinary}`);
console.log(`[Dutchie Client] Timezone: ${timezone}`); console.log(`[Dutchie Client] Referer: ${referer}`);
if (proxyLocation?.timezone) {
console.log(`[Dutchie Client] Proxy: ${proxyLocation.state || 'unknown'} (${proxyLocation.timezone})`);
} }
return currentSession; return currentSession;
@@ -347,48 +276,80 @@ export function getCurrentSession(): CrawlSession | null {
// ============================================================ // ============================================================
/** /**
* Build headers for Dutchie requests * Per workflow-12102025.md: Build headers using HTTP fingerprint system
* Returns headers in browser-specific order with all natural variations
*/ */
export function buildHeaders(refererPath: string, fingerprint?: Fingerprint): Record<string, string> { export function buildHeaders(isPost: boolean, contentLength?: number): { headers: Record<string, string>; orderedHeaders: string[] } {
const fp = fingerprint || getFingerprint(); if (!currentSession || !crawlRotator) {
const refererUrl = `https://dutchie.com${refererPath}`; throw new Error('[Dutchie Client] Cannot build headers without active session');
const headers: Record<string, string> = {
'accept': 'application/json, text/plain, */*',
'accept-language': fp.acceptLanguage,
'content-type': 'application/json',
'origin': 'https://dutchie.com',
'referer': refererUrl,
'user-agent': fp.userAgent,
'apollographql-client-name': 'Marketplace (production)',
};
if (fp.secChUa) {
headers['sec-ch-ua'] = fp.secChUa;
headers['sec-ch-ua-mobile'] = fp.secChUaMobile || '?0';
headers['sec-ch-ua-platform'] = fp.secChUaPlatform || '"Windows"';
headers['sec-fetch-dest'] = 'empty';
headers['sec-fetch-mode'] = 'cors';
headers['sec-fetch-site'] = 'same-site';
} }
return headers; const fp = currentSession.fingerprint;
const httpFp = fp.httpFingerprint;
// Per workflow-12102025.md: Build context for ordered headers
const context: HeaderContext = {
userAgent: fp.userAgent,
secChUa: fp.secChUa,
secChUaPlatform: fp.secChUaPlatform,
secChUaMobile: fp.secChUaMobile,
referer: currentSession.referer,
isPost,
contentLength,
};
// Per workflow-12102025.md: Get ordered headers from HTTP fingerprint service
return buildOrderedHeaders(httpFp, context);
} }
/** /**
* Execute HTTP POST using curl (bypasses TLS fingerprinting) * Per workflow-12102025.md: Get curl binary for current session's browser
* Uses curl-impersonate for TLS fingerprint matching
*/ */
export function curlPost(url: string, body: any, headers: Record<string, string>, timeout = 30000): CurlResponse { function getCurlBinaryForSession(): string {
const filteredHeaders = Object.entries(headers) if (!currentSession) {
.filter(([k]) => k.toLowerCase() !== 'accept-encoding') return 'curl'; // Fallback to standard curl
.map(([k, v]) => `-H '${k}: ${v}'`) }
const browserType = currentSession.fingerprint.browserName as BrowserType;
// Per workflow-12102025.md: Check if curl-impersonate is available
if (isCurlImpersonateAvailable(browserType)) {
return getCurlBinary(browserType);
}
// Fallback to standard curl with warning
console.warn(`[Dutchie Client] curl-impersonate not available for ${browserType}, using standard curl`);
return 'curl';
}
/**
* Per workflow-12102025.md: Execute HTTP POST using curl/curl-impersonate
* - Uses browser-specific TLS fingerprint via curl-impersonate
* - Headers sent in browser-specific order
* - Dynamic Referer per dispensary
*/
export function curlPost(url: string, body: any, timeout = 30000): CurlResponse {
const bodyJson = JSON.stringify(body);
// Per workflow-12102025.md: Build ordered headers for POST request
const { headers, orderedHeaders } = buildHeaders(true, bodyJson.length);
// Per workflow-12102025.md: Build header args in browser-specific order
const headerArgs = orderedHeaders
.filter(h => h !== 'Host' && h !== 'Content-Length') // curl handles these
.map(h => `-H '${h}: ${headers[h]}'`)
.join(' '); .join(' ');
const bodyJson = JSON.stringify(body).replace(/'/g, "'\\''"); const bodyEscaped = bodyJson.replace(/'/g, "'\\''");
const timeoutSec = Math.ceil(timeout / 1000); const timeoutSec = Math.ceil(timeout / 1000);
const separator = '___HTTP_STATUS___'; const separator = '___HTTP_STATUS___';
const proxyArg = getProxyArg(); const proxyArg = getProxyArg();
const cmd = `curl -s --compressed ${proxyArg} -w '${separator}%{http_code}' --max-time ${timeoutSec} ${filteredHeaders} -d '${bodyJson}' '${url}'`;
// Per workflow-12102025.md: Use curl-impersonate for TLS fingerprint matching
const curlBinary = getCurlBinaryForSession();
const cmd = `${curlBinary} -s --compressed ${proxyArg} -w '${separator}%{http_code}' --max-time ${timeoutSec} ${headerArgs} -d '${bodyEscaped}' '${url}'`;
try { try {
const output = execSync(cmd, { const output = execSync(cmd, {
@@ -427,19 +388,29 @@ export function curlPost(url: string, body: any, headers: Record<string, string>
} }
/** /**
* Execute HTTP GET using curl (bypasses TLS fingerprinting) * Per workflow-12102025.md: Execute HTTP GET using curl/curl-impersonate
* Returns HTML or JSON depending on response content-type * - Uses browser-specific TLS fingerprint via curl-impersonate
* - Headers sent in browser-specific order
* - Dynamic Referer per dispensary
*/ */
export function curlGet(url: string, headers: Record<string, string>, timeout = 30000): CurlResponse { export function curlGet(url: string, timeout = 30000): CurlResponse {
const filteredHeaders = Object.entries(headers) // Per workflow-12102025.md: Build ordered headers for GET request
.filter(([k]) => k.toLowerCase() !== 'accept-encoding') const { headers, orderedHeaders } = buildHeaders(false);
.map(([k, v]) => `-H '${k}: ${v}'`)
// Per workflow-12102025.md: Build header args in browser-specific order
const headerArgs = orderedHeaders
.filter(h => h !== 'Host' && h !== 'Content-Length') // curl handles these
.map(h => `-H '${h}: ${headers[h]}'`)
.join(' '); .join(' ');
const timeoutSec = Math.ceil(timeout / 1000); const timeoutSec = Math.ceil(timeout / 1000);
const separator = '___HTTP_STATUS___'; const separator = '___HTTP_STATUS___';
const proxyArg = getProxyArg(); const proxyArg = getProxyArg();
const cmd = `curl -s --compressed ${proxyArg} -w '${separator}%{http_code}' --max-time ${timeoutSec} ${filteredHeaders} '${url}'`;
// Per workflow-12102025.md: Use curl-impersonate for TLS fingerprint matching
const curlBinary = getCurlBinaryForSession();
const cmd = `${curlBinary} -s --compressed ${proxyArg} -w '${separator}%{http_code}' --max-time ${timeoutSec} ${headerArgs} '${url}'`;
try { try {
const output = execSync(cmd, { const output = execSync(cmd, {
@@ -459,7 +430,6 @@ export function curlGet(url: string, headers: Record<string, string>, timeout =
const responseBody = output.slice(0, separatorIndex); const responseBody = output.slice(0, separatorIndex);
const statusCode = parseInt(output.slice(separatorIndex + separator.length).trim(), 10); const statusCode = parseInt(output.slice(separatorIndex + separator.length).trim(), 10);
// Try to parse as JSON, otherwise return as string (HTML)
try { try {
return { status: statusCode, data: JSON.parse(responseBody) }; return { status: statusCode, data: JSON.parse(responseBody) };
} catch { } catch {
@@ -476,16 +446,22 @@ export function curlGet(url: string, headers: Record<string, string>, timeout =
// ============================================================ // ============================================================
// GRAPHQL EXECUTION // GRAPHQL EXECUTION
// Per workflow-12102025.md:
// - On 403: immediately rotate IP + fingerprint (no delay first)
// - Then retry
// ============================================================ // ============================================================
export interface ExecuteGraphQLOptions { export interface ExecuteGraphQLOptions {
maxRetries?: number; maxRetries?: number;
retryOn403?: boolean; retryOn403?: boolean;
cName?: string; // Optional - used for Referer header, defaults to 'cities' cName?: string;
} }
/** /**
* Execute GraphQL query with curl (bypasses TLS fingerprinting) * Per workflow-12102025.md: Execute GraphQL query with curl/curl-impersonate
* - Uses browser-specific TLS fingerprint
* - Headers in browser-specific order
* - On 403: immediately rotate IP + fingerprint, then retry
*/ */
export async function executeGraphQL( export async function executeGraphQL(
operationName: string, operationName: string,
@@ -493,7 +469,12 @@ export async function executeGraphQL(
hash: string, hash: string,
options: ExecuteGraphQLOptions options: ExecuteGraphQLOptions
): Promise<any> { ): Promise<any> {
const { maxRetries = 3, retryOn403 = true, cName = 'cities' } = options; const { maxRetries = 3, retryOn403 = true } = options;
// Per workflow-12102025.md: Session must be active for requests
if (!currentSession) {
throw new Error('[Dutchie Client] Cannot execute GraphQL without active session - call startSession() first');
}
const body = { const body = {
operationName, operationName,
@@ -507,14 +488,14 @@ export async function executeGraphQL(
let attempt = 0; let attempt = 0;
while (attempt <= maxRetries) { while (attempt <= maxRetries) {
const fingerprint = getFingerprint();
const headers = buildHeaders(`/embedded-menu/${cName}`, fingerprint);
console.log(`[Dutchie Client] curl POST ${operationName} (attempt ${attempt + 1}/${maxRetries + 1})`); console.log(`[Dutchie Client] curl POST ${operationName} (attempt ${attempt + 1}/${maxRetries + 1})`);
const response = curlPost(DUTCHIE_CONFIG.graphqlEndpoint, body, headers, DUTCHIE_CONFIG.timeout); const startTime = Date.now();
// Per workflow-12102025.md: curlPost now uses ordered headers and curl-impersonate
const response = curlPost(DUTCHIE_CONFIG.graphqlEndpoint, body, DUTCHIE_CONFIG.timeout);
const responseTime = Date.now() - startTime;
console.log(`[Dutchie Client] Response status: ${response.status}`); console.log(`[Dutchie Client] Response status: ${response.status} (${responseTime}ms)`);
if (response.error) { if (response.error) {
console.error(`[Dutchie Client] curl error: ${response.error}`); console.error(`[Dutchie Client] curl error: ${response.error}`);
@@ -527,6 +508,9 @@ export async function executeGraphQL(
} }
if (response.status === 200) { if (response.status === 200) {
// Per workflow-12102025.md: success resets consecutive 403 count
await recordProxySuccess(responseTime);
if (response.data?.errors?.length > 0) { if (response.data?.errors?.length > 0) {
console.warn(`[Dutchie Client] GraphQL errors: ${JSON.stringify(response.data.errors[0])}`); console.warn(`[Dutchie Client] GraphQL errors: ${JSON.stringify(response.data.errors[0])}`);
} }
@@ -534,11 +518,20 @@ export async function executeGraphQL(
} }
if (response.status === 403 && retryOn403) { if (response.status === 403 && retryOn403) {
console.warn(`[Dutchie Client] 403 blocked - rotating proxy and fingerprint...`); // Per workflow-12102025.md: immediately rotate IP + fingerprint
await rotateProxyOn403('403 Forbidden on GraphQL'); console.warn(`[Dutchie Client] 403 blocked - immediately rotating proxy + fingerprint...`);
rotateFingerprint(); const hasMoreProxies = await handle403Block();
if (!hasMoreProxies) {
throw new Error('All proxies exhausted - no more IPs available');
}
// Per workflow-12102025.md: Update session referer after rotation
currentSession.referer = buildRefererFromMenuUrl(currentSession.menuUrl);
attempt++; attempt++;
await sleep(1000 * attempt); // Per workflow-12102025.md: small backoff after rotation
await sleep(500);
continue; continue;
} }
@@ -567,8 +560,10 @@ export interface FetchPageOptions {
} }
/** /**
* Fetch HTML page from Dutchie (for city pages, dispensary pages, etc.) * Per workflow-12102025.md: Fetch HTML page from Dutchie
* Returns raw HTML string * - Uses browser-specific TLS fingerprint
* - Headers in browser-specific order
* - Same 403 handling as GraphQL
*/ */
export async function fetchPage( export async function fetchPage(
path: string, path: string,
@@ -577,32 +572,22 @@ export async function fetchPage(
const { maxRetries = 3, retryOn403 = true } = options; const { maxRetries = 3, retryOn403 = true } = options;
const url = `${DUTCHIE_CONFIG.baseUrl}${path}`; const url = `${DUTCHIE_CONFIG.baseUrl}${path}`;
// Per workflow-12102025.md: Session must be active for requests
if (!currentSession) {
throw new Error('[Dutchie Client] Cannot fetch page without active session - call startSession() first');
}
let attempt = 0; let attempt = 0;
while (attempt <= maxRetries) { while (attempt <= maxRetries) {
const fingerprint = getFingerprint(); // Per workflow-12102025.md: curlGet now uses ordered headers and curl-impersonate
const headers: Record<string, string> = {
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'accept-language': fingerprint.acceptLanguage,
'user-agent': fingerprint.userAgent,
};
if (fingerprint.secChUa) {
headers['sec-ch-ua'] = fingerprint.secChUa;
headers['sec-ch-ua-mobile'] = fingerprint.secChUaMobile || '?0';
headers['sec-ch-ua-platform'] = fingerprint.secChUaPlatform || '"Windows"';
headers['sec-fetch-dest'] = 'document';
headers['sec-fetch-mode'] = 'navigate';
headers['sec-fetch-site'] = 'none';
headers['sec-fetch-user'] = '?1';
headers['upgrade-insecure-requests'] = '1';
}
console.log(`[Dutchie Client] curl GET ${path} (attempt ${attempt + 1}/${maxRetries + 1})`); console.log(`[Dutchie Client] curl GET ${path} (attempt ${attempt + 1}/${maxRetries + 1})`);
const response = curlGet(url, headers, DUTCHIE_CONFIG.timeout); const startTime = Date.now();
const response = curlGet(url, DUTCHIE_CONFIG.timeout);
const responseTime = Date.now() - startTime;
console.log(`[Dutchie Client] Response status: ${response.status}`); console.log(`[Dutchie Client] Response status: ${response.status} (${responseTime}ms)`);
if (response.error) { if (response.error) {
console.error(`[Dutchie Client] curl error: ${response.error}`); console.error(`[Dutchie Client] curl error: ${response.error}`);
@@ -614,15 +599,26 @@ export async function fetchPage(
} }
if (response.status === 200) { if (response.status === 200) {
// Per workflow-12102025.md: success resets consecutive 403 count
await recordProxySuccess(responseTime);
return { html: response.data, status: response.status }; return { html: response.data, status: response.status };
} }
if (response.status === 403 && retryOn403) { if (response.status === 403 && retryOn403) {
console.warn(`[Dutchie Client] 403 blocked - rotating proxy and fingerprint...`); // Per workflow-12102025.md: immediately rotate IP + fingerprint
await rotateProxyOn403('403 Forbidden on page fetch'); console.warn(`[Dutchie Client] 403 blocked - immediately rotating proxy + fingerprint...`);
rotateFingerprint(); const hasMoreProxies = await handle403Block();
if (!hasMoreProxies) {
throw new Error('All proxies exhausted - no more IPs available');
}
// Per workflow-12102025.md: Update session after rotation
currentSession.referer = buildRefererFromMenuUrl(currentSession.menuUrl);
attempt++; attempt++;
await sleep(1000 * attempt); // Per workflow-12102025.md: small backoff after rotation
await sleep(500);
continue; continue;
} }
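
Review note: the new `curlPost` escapes single quotes in the request body (`bodyEscaped`), but header values and the URL are interpolated into the shell command unescaped. A minimal sketch of applying the same escaping to every argument follows. `shellQuote` and `buildHeaderArgs` are hypothetical helper names, not part of this change, and the sketch assumes a POSIX shell.

```typescript
// Hypothetical helper (not in the diff): wrap a value in single quotes for a
// POSIX shell, escaping embedded single quotes the same way the diff already
// does for the POST body.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build the -H arguments in the given order, skipping
// the two headers the diff leaves to curl (Host and Content-Length).
function buildHeaderArgs(
  headers: Record<string, string>,
  orderedHeaders: string[],
): string {
  return orderedHeaders
    .filter((h) => h !== 'Host' && h !== 'Content-Length')
    .map((h) => '-H ' + shellQuote(`${h}: ${headers[h]}`))
    .join(' ');
}

const exampleArgs = buildHeaderArgs(
  { Host: 'example.invalid', 'User-Agent': "O'Browser", Accept: '*/*' },
  ['Host', 'User-Agent', 'Accept'],
);
console.log(exampleArgs);
```

Passing an argument array to `execFileSync` from `node:child_process` instead of building a command string would avoid the shell entirely. Whether that fits the curl-impersonate wrapper binaries used here is not something this diff shows.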


@@ -6,22 +6,17 @@
*/ */
export { export {
// HTTP Client // HTTP Client (per workflow-12102025.md: uses curl-impersonate + ordered headers)
curlPost, curlPost,
curlGet, curlGet,
executeGraphQL, executeGraphQL,
fetchPage, fetchPage,
extractNextData, extractNextData,
// Headers & Fingerprints // Headers (per workflow-12102025.md: browser-specific ordering)
buildHeaders, buildHeaders,
getFingerprint,
rotateFingerprint,
resetFingerprint,
getRandomFingerprint,
getLocaleForTimezone,
// Session Management (per-store fingerprint rotation) // Session Management (per workflow-12102025.md: menuUrl for dynamic Referer)
startSession, startSession,
endSession, endSession,
getCurrentSession, getCurrentSession,


@@ -7,15 +7,23 @@
* Routes are prefixed with /api/analytics/v2 * Routes are prefixed with /api/analytics/v2
* *
* Phase 3: Analytics Engine + Rec/Med by State * Phase 3: Analytics Engine + Rec/Med by State
*
* SECURITY: All routes require authentication via authMiddleware.
* Access is granted to:
* - Trusted origins (cannaiq.co, findadispo.com, etc.)
* - Trusted IPs (localhost, internal pods)
* - Valid JWT or API tokens
*/ */
import { Router, Request, Response } from 'express'; import { Router, Request, Response } from 'express';
import { Pool } from 'pg'; import { Pool } from 'pg';
import { authMiddleware } from '../auth/middleware';
import { PriceAnalyticsService } from '../services/analytics/PriceAnalyticsService'; import { PriceAnalyticsService } from '../services/analytics/PriceAnalyticsService';
import { BrandPenetrationService } from '../services/analytics/BrandPenetrationService'; import { BrandPenetrationService } from '../services/analytics/BrandPenetrationService';
import { CategoryAnalyticsService } from '../services/analytics/CategoryAnalyticsService'; import { CategoryAnalyticsService } from '../services/analytics/CategoryAnalyticsService';
import { StoreAnalyticsService } from '../services/analytics/StoreAnalyticsService'; import { StoreAnalyticsService } from '../services/analytics/StoreAnalyticsService';
import { StateAnalyticsService } from '../services/analytics/StateAnalyticsService'; import { StateAnalyticsService } from '../services/analytics/StateAnalyticsService';
import { BrandIntelligenceService } from '../services/analytics/BrandIntelligenceService';
import { TimeWindow, LegalType } from '../services/analytics/types'; import { TimeWindow, LegalType } from '../services/analytics/types';
function parseTimeWindow(window?: string): TimeWindow { function parseTimeWindow(window?: string): TimeWindow {
@@ -35,12 +43,17 @@ function parseLegalType(legalType?: string): LegalType {
export function createAnalyticsV2Router(pool: Pool): Router { export function createAnalyticsV2Router(pool: Pool): Router {
const router = Router(); const router = Router();
// SECURITY: Apply auth middleware to ALL routes
// This gate ensures only authenticated requests can access analytics data
router.use(authMiddleware);
// Initialize services // Initialize services
const priceService = new PriceAnalyticsService(pool); const priceService = new PriceAnalyticsService(pool);
const brandService = new BrandPenetrationService(pool); const brandService = new BrandPenetrationService(pool);
const categoryService = new CategoryAnalyticsService(pool); const categoryService = new CategoryAnalyticsService(pool);
const storeService = new StoreAnalyticsService(pool); const storeService = new StoreAnalyticsService(pool);
const stateService = new StateAnalyticsService(pool); const stateService = new StateAnalyticsService(pool);
const brandIntelligenceService = new BrandIntelligenceService(pool);
// ============================================================ // ============================================================
// PRICE ANALYTICS // PRICE ANALYTICS
@@ -231,6 +244,76 @@ export function createAnalyticsV2Router(pool: Pool): Router {
} }
}); });
/**
* GET /brand/:name/promotions
* Get brand promotional history - tracks specials, discounts, duration, and sales estimates
*
* Query params:
* - window: 7d|30d|90d (default: 90d)
* - state: state code filter (e.g., AZ)
* - category: category filter (e.g., Flower)
*/
router.get('/brand/:name/promotions', async (req: Request, res: Response) => {
try {
const brandName = decodeURIComponent(req.params.name);
const window = parseTimeWindow(req.query.window as string) || '90d';
const stateCode = req.query.state as string | undefined;
const category = req.query.category as string | undefined;
const result = await brandService.getBrandPromotionalHistory(brandName, {
window,
stateCode,
category,
});
res.json(result);
} catch (error) {
console.error('[AnalyticsV2] Brand promotions error:', error);
res.status(500).json({ error: 'Failed to fetch brand promotional history' });
}
});
/**
* GET /brand/:name/intelligence
* Get comprehensive B2B brand intelligence dashboard data
*
* Returns all brand metrics in a single unified response:
* - Performance Snapshot (active SKUs, revenue, stores, market share)
* - Alerts/Slippage (lost stores, delisted SKUs, competitor takeovers)
* - Product Velocity (daily rates, velocity status)
* - Retail Footprint (penetration, whitespace opportunities)
* - Competitive Landscape (price position, market share trend)
* - Inventory Health (days of stock, risk levels)
* - Promotion Effectiveness (baseline vs promo velocity, ROI)
*
* Query params:
* - window: 7d|30d|90d (default: 30d)
* - state: state code filter (e.g., AZ)
* - category: category filter (e.g., Flower)
*/
router.get('/brand/:name/intelligence', async (req: Request, res: Response) => {
try {
const brandName = decodeURIComponent(req.params.name);
const window = parseTimeWindow(req.query.window as string);
const stateCode = req.query.state as string | undefined;
const category = req.query.category as string | undefined;
const result = await brandIntelligenceService.getBrandIntelligence(brandName, {
window,
stateCode,
category,
});
if (!result) {
return res.status(404).json({ error: 'Brand not found' });
}
res.json(result);
} catch (error) {
console.error('[AnalyticsV2] Brand intelligence error:', error);
res.status(500).json({ error: 'Failed to fetch brand intelligence' });
}
});
// ============================================================ // ============================================================
// CATEGORY ANALYTICS // CATEGORY ANALYTICS
// ============================================================ // ============================================================
@@ -400,6 +483,31 @@ export function createAnalyticsV2Router(pool: Pool): Router {
} }
}); });
/**
* GET /store/:id/quantity-changes
* Get quantity changes for a store (increases/decreases)
* Useful for estimating sales (decreases) or restocks (increases)
*
* Query params:
* - window: 7d|30d|90d (default: 7d)
* - direction: increase|decrease|all (default: all)
* - limit: number (default: 100)
*/
router.get('/store/:id/quantity-changes', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.id);
const window = parseTimeWindow(req.query.window as string);
const direction = (req.query.direction as 'increase' | 'decrease' | 'all') || 'all';
const limit = req.query.limit ? parseInt(req.query.limit as string) : 100;
const result = await storeService.getQuantityChanges(dispensaryId, { window, direction, limit });
res.json(result);
} catch (error) {
console.error('[AnalyticsV2] Store quantity changes error:', error);
res.status(500).json({ error: 'Failed to fetch store quantity changes' });
}
});
/** /**
* GET /store/:id/inventory * GET /store/:id/inventory
* Get store inventory composition * Get store inventory composition
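
Review note: the new `/store/:id/quantity-changes` handler casts `req.query.direction` to the union type without checking it, and calls `parseInt` on `limit` with no radix or `NaN` handling. A minimal sketch of stricter parsing follows. The helper names and the upper bound of 1000 are assumptions for illustration; the diff only specifies the default of 100.

```typescript
type Direction = 'increase' | 'decrease' | 'all';

// Hypothetical helper: accept only the two documented directions and fall
// back to 'all' (the documented default) for anything else.
function parseDirection(raw: unknown): Direction {
  if (raw === 'increase') return 'increase';
  if (raw === 'decrease') return 'decrease';
  return 'all';
}

// Hypothetical helper: parse a positive integer limit with an explicit radix.
// The fallback of 100 matches the route's documented default; the cap of 1000
// is an assumed value, not taken from the diff.
function parseLimit(raw: unknown, fallback = 100, max = 1000): number {
  const n = typeof raw === 'string' ? Number.parseInt(raw, 10) : Number.NaN;
  if (!Number.isFinite(n) || n < 1) return fallback;
  return Math.min(n, max);
}

console.log(parseDirection('decrease'), parseLimit('25'));
```

In the handler these would replace the inline cast and `parseInt` call, for example `parseDirection(req.query.direction)` and `parseLimit(req.query.limit)`.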


@@ -47,4 +47,27 @@ router.post('/refresh', authMiddleware, async (req: AuthRequest, res) => {
res.json({ token }); res.json({ token });
}); });
// Verify password for sensitive actions (requires current user to be authenticated)
router.post('/verify-password', authMiddleware, async (req: AuthRequest, res) => {
try {
const { password } = req.body;
if (!password) {
return res.status(400).json({ error: 'Password required' });
}
// Re-authenticate the current user with the provided password
const user = await authenticateUser(req.user!.email, password);
if (!user) {
return res.status(401).json({ error: 'Invalid password', verified: false });
}
res.json({ verified: true });
} catch (error) {
console.error('Password verification error:', error);
res.status(500).json({ error: 'Internal server error' });
}
});
export default router; export default router;


@@ -14,35 +14,56 @@ router.use(authMiddleware);
/** /**
* GET /api/admin/intelligence/brands * GET /api/admin/intelligence/brands
* List all brands with state presence, store counts, and pricing * List all brands with state presence, store counts, and pricing
* Query params:
* - state: Filter by state (e.g., "AZ")
* - limit: Max results (default 500)
* - offset: Pagination offset
*/ */
router.get('/brands', async (req: Request, res: Response) => { router.get('/brands', async (req: Request, res: Response) => {
try { try {
const { limit = '500', offset = '0' } = req.query; const { limit = '500', offset = '0', state } = req.query;
const limitNum = Math.min(parseInt(limit as string, 10), 1000); const limitNum = Math.min(parseInt(limit as string, 10), 1000);
const offsetNum = parseInt(offset as string, 10); const offsetNum = parseInt(offset as string, 10);
// Build WHERE clause based on state filter
let stateFilter = '';
const params: any[] = [limitNum, offsetNum];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $3';
params.push(state);
}
const { rows } = await pool.query(` const { rows } = await pool.query(`
SELECT SELECT
sp.brand_name_raw as brand_name, sp.brand_name_raw as brand_name,
array_agg(DISTINCT d.state) FILTER (WHERE d.state IS NOT NULL) as states, array_agg(DISTINCT d.state) FILTER (WHERE d.state IS NOT NULL) as states,
COUNT(DISTINCT d.id) as store_count, COUNT(DISTINCT d.id) as store_count,
COUNT(DISTINCT sp.id) as sku_count, COUNT(DISTINCT sp.id) as sku_count,
ROUND(AVG(sp.price_rec)::numeric, 2) FILTER (WHERE sp.price_rec > 0) as avg_price_rec, ROUND(AVG(sp.price_rec) FILTER (WHERE sp.price_rec > 0)::numeric, 2) as avg_price_rec,
ROUND(AVG(sp.price_med)::numeric, 2) FILTER (WHERE sp.price_med > 0) as avg_price_med ROUND(AVG(sp.price_med) FILTER (WHERE sp.price_med > 0)::numeric, 2) as avg_price_med
FROM store_products sp FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.brand_name_raw IS NOT NULL AND sp.brand_name_raw != '' WHERE sp.brand_name_raw IS NOT NULL AND sp.brand_name_raw != ''
${stateFilter}
GROUP BY sp.brand_name_raw GROUP BY sp.brand_name_raw
ORDER BY store_count DESC, sku_count DESC ORDER BY store_count DESC, sku_count DESC
LIMIT $1 OFFSET $2 LIMIT $1 OFFSET $2
`, [limitNum, offsetNum]); `, params);
// Get total count // Get total count with same state filter
const countParams: any[] = [];
let countStateFilter = '';
if (state && state !== 'all') {
countStateFilter = 'AND d.state = $1';
countParams.push(state);
}
const { rows: countRows } = await pool.query(` const { rows: countRows } = await pool.query(`
SELECT COUNT(DISTINCT brand_name_raw) as total SELECT COUNT(DISTINCT sp.brand_name_raw) as total
FROM store_products FROM store_products sp
WHERE brand_name_raw IS NOT NULL AND brand_name_raw != '' JOIN dispensaries d ON sp.dispensary_id = d.id
`); WHERE sp.brand_name_raw IS NOT NULL AND sp.brand_name_raw != ''
${countStateFilter}
`, countParams);
res.json({ res.json({
brands: rows.map((r: any) => ({ brands: rows.map((r: any) => ({
@@ -147,29 +168,63 @@ router.get('/brands/:brandName/penetration', async (req: Request, res: Response)
/** /**
* GET /api/admin/intelligence/pricing * GET /api/admin/intelligence/pricing
* Get pricing analytics by category * Get pricing analytics by category
* Query params:
* - state: Filter by state (e.g., "AZ")
*/ */
router.get('/pricing', async (req: Request, res: Response) => { router.get('/pricing', async (req: Request, res: Response) => {
try { try {
const { rows: categoryRows } = await pool.query(` const { state } = req.query;
// Build WHERE clause based on state filter
let stateFilter = '';
const categoryParams: any[] = [];
const stateQueryParams: any[] = [];
const overallParams: any[] = [];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $1';
categoryParams.push(state);
overallParams.push(state);
}
// Category pricing with optional state filter
const categoryQuery = state && state !== 'all'
? `
SELECT SELECT
sp.category_raw as category, sp.category_raw as category,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price, ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) FILTER (WHERE sp.price_rec > 0) as min_price, MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price, MAX(sp.price_rec) as max_price,
ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) as median_price,
FILTER (WHERE sp.price_rec > 0) as median_price, COUNT(*) as product_count
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0 ${stateFilter}
GROUP BY sp.category_raw
ORDER BY product_count DESC
`
: `
SELECT
sp.category_raw as category,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) as median_price,
COUNT(*) as product_count COUNT(*) as product_count
FROM store_products sp FROM store_products sp
WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0 WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0
GROUP BY sp.category_raw GROUP BY sp.category_raw
ORDER BY product_count DESC ORDER BY product_count DESC
`); `;
const { rows: categoryRows } = await pool.query(categoryQuery, categoryParams);
// State pricing
const { rows: stateRows } = await pool.query(` const { rows: stateRows } = await pool.query(`
SELECT SELECT
d.state, d.state,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price, ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) FILTER (WHERE sp.price_rec > 0) as min_price, MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price, MAX(sp.price_rec) as max_price,
COUNT(DISTINCT sp.id) as product_count COUNT(DISTINCT sp.id) as product_count
FROM store_products sp FROM store_products sp
@@ -179,6 +234,31 @@ router.get('/pricing', async (req: Request, res: Response) => {
ORDER BY avg_price DESC ORDER BY avg_price DESC
`); `);
// Overall stats with optional state filter
const overallQuery = state && state !== 'all'
? `
SELECT
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
COUNT(*) as total_products
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.price_rec > 0 ${stateFilter}
`
: `
SELECT
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
COUNT(*) as total_products
FROM store_products sp
WHERE sp.price_rec > 0
`;
const { rows: overallRows } = await pool.query(overallQuery, overallParams);
const overall = overallRows[0];
res.json({ res.json({
byCategory: categoryRows.map((r: any) => ({ byCategory: categoryRows.map((r: any) => ({
category: r.category, category: r.category,
@@ -195,6 +275,12 @@ router.get('/pricing', async (req: Request, res: Response) => {
maxPrice: r.max_price ? parseFloat(r.max_price) : null, maxPrice: r.max_price ? parseFloat(r.max_price) : null,
productCount: parseInt(r.product_count, 10), productCount: parseInt(r.product_count, 10),
})), })),
overall: {
avgPrice: overall?.avg_price ? parseFloat(overall.avg_price) : null,
minPrice: overall?.min_price ? parseFloat(overall.min_price) : null,
maxPrice: overall?.max_price ? parseFloat(overall.max_price) : null,
totalProducts: parseInt(overall?.total_products || '0', 10),
},
}); });
} catch (error: any) { } catch (error: any) {
console.error('[Intelligence] Error fetching pricing:', error.message); console.error('[Intelligence] Error fetching pricing:', error.message);
@@ -205,9 +291,23 @@ router.get('/pricing', async (req: Request, res: Response) => {
/** /**
* GET /api/admin/intelligence/stores * GET /api/admin/intelligence/stores
* Get store intelligence summary * Get store intelligence summary
* Query params:
* - state: Filter by state (e.g., "AZ")
* - limit: Max results (default 200)
*/ */
router.get('/stores', async (req: Request, res: Response) => { router.get('/stores', async (req: Request, res: Response) => {
try { try {
const { state, limit = '200' } = req.query;
const limitNum = Math.min(parseInt(limit as string, 10), 500);
// Build WHERE clause based on state filter
let stateFilter = '';
const params: any[] = [limitNum];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $2';
params.push(state);
}
const { rows: storeRows } = await pool.query(` const { rows: storeRows } = await pool.query(`
SELECT SELECT
d.id, d.id,
@@ -217,17 +317,22 @@ router.get('/stores', async (req: Request, res: Response) => {
d.state, d.state,
d.menu_type, d.menu_type,
d.crawl_enabled, d.crawl_enabled,
COUNT(DISTINCT sp.id) as product_count, c.name as chain_name,
COUNT(DISTINCT sp.id) as sku_count,
COUNT(DISTINCT sp.brand_name_raw) as brand_count, COUNT(DISTINCT sp.brand_name_raw) as brand_count,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price, ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MAX(sp.updated_at) as last_product_update MAX(sp.updated_at) as last_crawl,
(SELECT COUNT(*) FROM store_product_snapshots sps
WHERE sps.store_product_id IN (SELECT id FROM store_products WHERE dispensary_id = d.id)) as snapshot_count
FROM dispensaries d FROM dispensaries d
LEFT JOIN store_products sp ON sp.dispensary_id = d.id LEFT JOIN store_products sp ON sp.dispensary_id = d.id
WHERE d.state IS NOT NULL LEFT JOIN chains c ON d.chain_id = c.id
GROUP BY d.id, d.name, d.dba_name, d.city, d.state, d.menu_type, d.crawl_enabled WHERE d.state IS NOT NULL AND d.crawl_enabled = true
ORDER BY product_count DESC ${stateFilter}
LIMIT 200 GROUP BY d.id, d.name, d.dba_name, d.city, d.state, d.menu_type, d.crawl_enabled, c.name
`); ORDER BY sku_count DESC
LIMIT $1
`, params);
res.json({ res.json({
stores: storeRows.map((r: any) => ({ stores: storeRows.map((r: any) => ({
@@ -238,10 +343,13 @@ router.get('/stores', async (req: Request, res: Response) => {
state: r.state, state: r.state,
menuType: r.menu_type, menuType: r.menu_type,
crawlEnabled: r.crawl_enabled, crawlEnabled: r.crawl_enabled,
productCount: parseInt(r.product_count || '0', 10), chainName: r.chain_name || null,
skuCount: parseInt(r.sku_count || '0', 10),
snapshotCount: parseInt(r.snapshot_count || '0', 10),
brandCount: parseInt(r.brand_count || '0', 10), brandCount: parseInt(r.brand_count || '0', 10),
avgPrice: r.avg_price ? parseFloat(r.avg_price) : null, avgPrice: r.avg_price ? parseFloat(r.avg_price) : null,
lastProductUpdate: r.last_product_update, lastCrawl: r.last_crawl,
crawlFrequencyHours: 4, // Default crawl frequency
})), })),
total: storeRows.length, total: storeRows.length,
}); });
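
Review note: the state filter in these routes hard-codes the placeholder index per query (`$3` for `/brands`, `$1` for the count query, `$2` for `/stores`), so each index has to be kept in step with its `params` array by hand. A minimal sketch of deriving the index from the array length follows. `appendStateFilter` is a hypothetical helper name, not part of this change.

```typescript
// Hypothetical helper: push the state onto the params array and return a SQL
// fragment whose placeholder number matches the value's position.
// Returns an empty string when no filter applies, mirroring the
// `state && state !== 'all'` check in the diff.
function appendStateFilter(state: unknown, params: unknown[]): string {
  if (typeof state !== 'string' || state === '' || state === 'all') {
    return '';
  }
  params.push(state);
  return `AND d.state = $${params.length}`;
}

// Mirrors the /brands list query: limit and offset occupy $1 and $2.
const listParams: unknown[] = [500, 0];
const listFilter = appendStateFilter('AZ', listParams);

// Mirrors the count query: no other parameters, so the state becomes $1.
const countParams: unknown[] = [];
const countFilter = appendStateFilter('AZ', countParams);

console.log(listFilter, countFilter);
```

The value is still passed as a bound parameter, as in the diff; only the placeholder number is computed.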


@@ -543,6 +543,9 @@ router.post('/bulk-priority', async (req: Request, res: Response) => {
/** /**
* POST /api/job-queue/enqueue - Add a new job to the queue * POST /api/job-queue/enqueue - Add a new job to the queue
*
* 2024-12-10: Rewired to use worker_tasks via taskService.
* Legacy dispensary_crawl_jobs code commented out below.
*/ */
router.post('/enqueue', async (req: Request, res: Response) => { router.post('/enqueue', async (req: Request, res: Response) => {
try { try {
@@ -552,6 +555,59 @@ router.post('/enqueue', async (req: Request, res: Response) => {
return res.status(400).json({ success: false, error: 'dispensary_id is required' }); return res.status(400).json({ success: false, error: 'dispensary_id is required' });
} }
// 2024-12-10: Map legacy job_type to new task role
const roleMap: Record<string, string> = {
'dutchie_product_crawl': 'product_refresh',
'menu_detection': 'entry_point_discovery',
'menu_detection_single': 'entry_point_discovery',
'product_discovery': 'product_discovery',
'store_discovery': 'store_discovery',
};
const role = roleMap[job_type] || 'product_refresh';
// 2024-12-10: Use taskService to create task in worker_tasks table
const { taskService } = await import('../tasks/task-service');
// Check if task already pending for this dispensary
const existingTasks = await taskService.listTasks({
dispensary_id,
role: role as any,
status: ['pending', 'claimed', 'running'],
limit: 1,
});
if (existingTasks.length > 0) {
return res.json({
success: true,
task_id: existingTasks[0].id,
message: 'Task already queued'
});
}
const task = await taskService.createTask({
role: role as any,
dispensary_id,
priority,
});
res.json({ success: true, task_id: task.id, message: 'Task enqueued' });
} catch (error: any) {
console.error('[JobQueue] Error enqueuing task:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/*
* LEGACY CODE - 2024-12-10: Commented out, was using orphaned dispensary_crawl_jobs table
*
router.post('/enqueue', async (req: Request, res: Response) => {
try {
const { dispensary_id, job_type = 'dutchie_product_crawl', priority = 0 } = req.body;
if (!dispensary_id) {
return res.status(400).json({ success: false, error: 'dispensary_id is required' });
}
// Check if job already pending for this dispensary // Check if job already pending for this dispensary
const existing = await pool.query(` const existing = await pool.query(`
SELECT id FROM dispensary_crawl_jobs SELECT id FROM dispensary_crawl_jobs
@@ -585,6 +641,7 @@ router.post('/enqueue', async (req: Request, res: Response) => {
res.status(500).json({ success: false, error: error.message }); res.status(500).json({ success: false, error: error.message });
} }
}); });
*/
/** /**
* POST /api/job-queue/pause - Pause queue processing * POST /api/job-queue/pause - Pause queue processing
@@ -612,6 +669,8 @@ router.get('/paused', async (_req: Request, res: Response) => {
/** /**
* POST /api/job-queue/enqueue-batch - Queue multiple dispensaries at once * POST /api/job-queue/enqueue-batch - Queue multiple dispensaries at once
* Body: { dispensary_ids: number[], job_type?: string, priority?: number } * Body: { dispensary_ids: number[], job_type?: string, priority?: number }
*
* 2024-12-10: Rewired to use worker_tasks via taskService.
*/ */
router.post('/enqueue-batch', async (req: Request, res: Response) => { router.post('/enqueue-batch', async (req: Request, res: Response) => {
try { try {
@@ -625,35 +684,30 @@ router.post('/enqueue-batch', async (req: Request, res: Response) => {
return res.status(400).json({ success: false, error: 'Maximum 500 dispensaries per batch' }); return res.status(400).json({ success: false, error: 'Maximum 500 dispensaries per batch' });
} }
// Insert jobs, skipping duplicates // 2024-12-10: Map legacy job_type to new task role
const { rows } = await pool.query(` const roleMap: Record<string, string> = {
INSERT INTO dispensary_crawl_jobs (dispensary_id, job_type, priority, trigger_type, status, created_at) 'dutchie_product_crawl': 'product_refresh',
SELECT 'menu_detection': 'entry_point_discovery',
d.id, 'product_discovery': 'product_discovery',
$2::text, };
$3::integer, const role = roleMap[job_type] || 'product_refresh';
'api_batch',
'pending', // 2024-12-10: Use taskService to create tasks in worker_tasks table
NOW() const { taskService } = await import('../tasks/task-service');
FROM dispensaries d
WHERE d.id = ANY($1::int[]) const tasks = dispensary_ids.map(dispensary_id => ({
AND d.crawl_enabled = true role: role as any,
AND d.platform_dispensary_id IS NOT NULL dispensary_id,
AND NOT EXISTS ( priority,
SELECT 1 FROM dispensary_crawl_jobs cj }));
WHERE cj.dispensary_id = d.id
AND cj.job_type = $2::text const createdCount = await taskService.createTasks(tasks);
AND cj.status IN ('pending', 'running')
)
RETURNING id, dispensary_id
`, [dispensary_ids, job_type, priority]);
res.json({ res.json({
success: true, success: true,
queued: rows.length, queued: createdCount,
requested: dispensary_ids.length, requested: dispensary_ids.length,
job_ids: rows.map(r => r.id), message: `Queued ${createdCount} of ${dispensary_ids.length} dispensaries`
message: `Queued ${rows.length} of ${dispensary_ids.length} dispensaries`
}); });
} catch (error: any) { } catch (error: any) {
console.error('[JobQueue] Error batch enqueuing:', error); console.error('[JobQueue] Error batch enqueuing:', error);
@@ -664,6 +718,8 @@ router.post('/enqueue-batch', async (req: Request, res: Response) => {
/** /**
* POST /api/job-queue/enqueue-state - Queue all crawl-enabled dispensaries for a state * POST /api/job-queue/enqueue-state - Queue all crawl-enabled dispensaries for a state
* Body: { state_code: string, job_type?: string, priority?: number, limit?: number } * Body: { state_code: string, job_type?: string, priority?: number, limit?: number }
*
* 2024-12-10: Rewired to use worker_tasks via taskService.
*/ */
router.post('/enqueue-state', async (req: Request, res: Response) => { router.post('/enqueue-state', async (req: Request, res: Response) => {
try { try {
@@ -673,52 +729,55 @@ router.post('/enqueue-state', async (req: Request, res: Response) => {
return res.status(400).json({ success: false, error: 'state_code is required (e.g., "AZ")' }); return res.status(400).json({ success: false, error: 'state_code is required (e.g., "AZ")' });
} }
// Get state_id and queue jobs // 2024-12-10: Map legacy job_type to new task role
const { rows } = await pool.query(` const roleMap: Record<string, string> = {
WITH target_state AS ( 'dutchie_product_crawl': 'product_refresh',
SELECT id FROM states WHERE code = $1 'menu_detection': 'entry_point_discovery',
) 'product_discovery': 'product_discovery',
INSERT INTO dispensary_crawl_jobs (dispensary_id, job_type, priority, trigger_type, status, created_at) };
SELECT const role = roleMap[job_type] || 'product_refresh';
d.id,
$2::text, // Get dispensary IDs for the state
$3::integer, const dispensaryResult = await pool.query(`
'api_state', SELECT d.id
'pending', FROM dispensaries d
NOW() JOIN states s ON s.id = d.state_id
FROM dispensaries d, target_state WHERE s.code = $1
WHERE d.state_id = target_state.id
AND d.crawl_enabled = true AND d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL AND d.platform_dispensary_id IS NOT NULL
AND NOT EXISTS ( LIMIT $2
SELECT 1 FROM dispensary_crawl_jobs cj `, [state_code.toUpperCase(), limit]);
WHERE cj.dispensary_id = d.id
AND cj.job_type = $2::text const dispensary_ids = dispensaryResult.rows.map((r: any) => r.id);
AND cj.status IN ('pending', 'running')
) // 2024-12-10: Use taskService to create tasks in worker_tasks table
LIMIT $4::integer const { taskService } = await import('../tasks/task-service');
RETURNING id, dispensary_id
`, [state_code.toUpperCase(), job_type, priority, limit]); const tasks = dispensary_ids.map((dispensary_id: number) => ({
role: role as any,
dispensary_id,
priority,
}));
const createdCount = await taskService.createTasks(tasks);
// Get total available count // Get total available count
const countResult = await pool.query(` const countResult = await pool.query(`
WITH target_state AS (
SELECT id FROM states WHERE code = $1
)
SELECT COUNT(*) as total SELECT COUNT(*) as total
FROM dispensaries d, target_state FROM dispensaries d
WHERE d.state_id = target_state.id JOIN states s ON s.id = d.state_id
WHERE s.code = $1
AND d.crawl_enabled = true AND d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL AND d.platform_dispensary_id IS NOT NULL
`, [state_code.toUpperCase()]); `, [state_code.toUpperCase()]);
res.json({ res.json({
success: true, success: true,
queued: rows.length, queued: createdCount,
total_available: parseInt(countResult.rows[0].total), total_available: parseInt(countResult.rows[0].total),
state: state_code.toUpperCase(), state: state_code.toUpperCase(),
job_type, role,
message: `Queued ${rows.length} dispensaries for ${state_code.toUpperCase()}` message: `Queued ${createdCount} dispensaries for ${state_code.toUpperCase()}`
}); });
} catch (error: any) { } catch (error: any) {
console.error('[JobQueue] Error enqueuing state:', error); console.error('[JobQueue] Error enqueuing state:', error);

backend/src/routes/k8s.ts

@@ -0,0 +1,140 @@
/**
* Kubernetes Control Routes
*
* Provides admin UI control over k8s resources like worker scaling.
* Uses in-cluster config when running in k8s, or kubeconfig locally.
*/
import { Router, Request, Response } from 'express';
import * as k8s from '@kubernetes/client-node';
const router = Router();
// K8s client setup - lazy initialization
let appsApi: k8s.AppsV1Api | null = null;
let k8sError: string | null = null;
function getK8sClient(): k8s.AppsV1Api | null {
if (appsApi) return appsApi;
if (k8sError) return null;
try {
const kc = new k8s.KubeConfig();
// Try in-cluster config first (when running in k8s)
try {
kc.loadFromCluster();
console.log('[K8s] Loaded in-cluster config');
} catch {
// Fall back to default kubeconfig (local dev)
try {
kc.loadFromDefault();
console.log('[K8s] Loaded default kubeconfig');
} catch (e) {
k8sError = 'No k8s config available';
console.log('[K8s] No config available - k8s routes disabled');
return null;
}
}
appsApi = kc.makeApiClient(k8s.AppsV1Api);
return appsApi;
} catch (e: any) {
k8sError = e.message;
console.error('[K8s] Failed to initialize client:', e.message);
return null;
}
}
const NAMESPACE = process.env.K8S_NAMESPACE || 'dispensary-scraper';
const WORKER_DEPLOYMENT = 'scraper-worker';
/**
* GET /api/k8s/workers
* Get current worker deployment status
*/
router.get('/workers', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.json({
success: true,
available: false,
error: k8sError || 'K8s not available',
replicas: 0,
readyReplicas: 0,
});
}
try {
const deployment = await client.readNamespacedDeployment({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
});
res.json({
success: true,
available: true,
replicas: deployment.spec?.replicas || 0,
readyReplicas: deployment.status?.readyReplicas || 0,
availableReplicas: deployment.status?.availableReplicas || 0,
updatedReplicas: deployment.status?.updatedReplicas || 0,
});
} catch (e: any) {
console.error('[K8s] Error getting deployment:', e.message);
res.status(500).json({
success: false,
error: e.message,
});
}
});
/**
* POST /api/k8s/workers/scale
* Scale worker deployment
* Body: { replicas: number }
*/
router.post('/workers/scale', async (req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: k8sError || 'K8s not available',
});
}
const { replicas } = req.body;
if (typeof replicas !== 'number' || !Number.isInteger(replicas) || replicas < 0 || replicas > 50) {
return res.status(400).json({
success: false,
error: 'replicas must be an integer between 0 and 50',
});
}
try {
// Patch the deployment to set replicas
await client.patchNamespacedDeploymentScale({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
body: { spec: { replicas } },
});
console.log(`[K8s] Scaled ${WORKER_DEPLOYMENT} to ${replicas} replicas`);
res.json({
success: true,
replicas,
message: `Scaled to ${replicas} workers`,
});
} catch (e: any) {
console.error('[K8s] Error scaling deployment:', e.message);
res.status(500).json({
success: false,
error: e.message,
});
}
});
export default router;
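Review note: the bounds check in `POST /api/k8s/workers/scale` could be pulled into a small pure function so it can be unit tested without a cluster. This is a sketch under the assumption that the 0 to 50 range from the route is the intended limit. `MAX_REPLICAS` and `parseReplicaCount` are hypothetical names.

```typescript
// Hypothetical helper; mirrors the 0..50 bound used by the scale route.
const MAX_REPLICAS = 50;

// Returns the replica count when it is a whole number in range, otherwise null.
function parseReplicaCount(value: unknown): number | null {
  if (typeof value !== 'number' || !Number.isInteger(value)) return null;
  if (value < 0 || value > MAX_REPLICAS) return null;
  return value;
}
```

Rejecting non-integers up front keeps fractional values such as `2.5` from reaching the Kubernetes API, where the scale subresource expects an integer.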


@@ -291,6 +291,107 @@ router.get('/stores/:id/summary', async (req: Request, res: Response) => {
} }
}); });
/**
* GET /api/markets/stores/:id/crawl-history
* Get crawl history for a specific store
*/
router.get('/stores/:id/crawl-history', async (req: Request, res: Response) => {
try {
const { id } = req.params;
const { limit = '50' } = req.query;
const dispensaryId = parseInt(id, 10);
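// Fall back to the default of 50 when limit is missing, non-numeric, or below 1
const parsedLimit = parseInt(limit as string, 10);
const limitNum = Number.isFinite(parsedLimit) && parsedLimit > 0 ? Math.min(parsedLimit, 100) : 50;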
// Get crawl history from crawl_orchestration_traces
const { rows: historyRows } = await pool.query(`
SELECT
id,
run_id,
profile_key,
crawler_module,
state_at_start,
state_at_end,
total_steps,
duration_ms,
success,
error_message,
products_found,
started_at,
completed_at
FROM crawl_orchestration_traces
WHERE dispensary_id = $1
ORDER BY started_at DESC
LIMIT $2
`, [dispensaryId, limitNum]);
// Get next scheduled crawl if available
const { rows: scheduleRows } = await pool.query(`
SELECT
js.id as schedule_id,
js.job_name,
js.enabled,
js.base_interval_minutes,
js.jitter_minutes,
js.next_run_at,
js.last_run_at,
js.last_status
FROM job_schedules js
WHERE js.enabled = true
AND js.job_config->>'dispensaryId' = $1::text
ORDER BY js.next_run_at
LIMIT 1
`, [dispensaryId.toString()]);
// Get dispensary info for slug
const { rows: dispRows } = await pool.query(`
SELECT
id,
name,
dba_name,
slug,
state,
city,
menu_type,
platform_dispensary_id,
last_menu_scrape
FROM dispensaries
WHERE id = $1
`, [dispensaryId]);
res.json({
dispensary: dispRows[0] || null,
history: historyRows.map(row => ({
id: row.id,
runId: row.run_id,
profileKey: row.profile_key,
crawlerModule: row.crawler_module,
stateAtStart: row.state_at_start,
stateAtEnd: row.state_at_end,
totalSteps: row.total_steps,
durationMs: row.duration_ms,
success: row.success,
errorMessage: row.error_message,
productsFound: row.products_found,
startedAt: row.started_at?.toISOString() || null,
completedAt: row.completed_at?.toISOString() || null,
})),
nextSchedule: scheduleRows[0] ? {
scheduleId: scheduleRows[0].schedule_id,
jobName: scheduleRows[0].job_name,
enabled: scheduleRows[0].enabled,
baseIntervalMinutes: scheduleRows[0].base_interval_minutes,
jitterMinutes: scheduleRows[0].jitter_minutes,
nextRunAt: scheduleRows[0].next_run_at?.toISOString() || null,
lastRunAt: scheduleRows[0].last_run_at?.toISOString() || null,
lastStatus: scheduleRows[0].last_status,
} : null,
});
} catch (error: any) {
console.error('[Markets] Error fetching crawl history:', error.message);
res.status(500).json({ error: error.message });
}
});
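Review note: several handlers in this diff parse a `limit` query parameter with `parseInt` and `Math.min`. A non-numeric value yields `NaN`, and a negative value passes through to `LIMIT`. One possible shared helper is sketched below; `parseLimit` is a hypothetical name, and the fallback and cap are passed in rather than assumed.

```typescript
// Hypothetical helper: parse a query-string limit with a fallback and an upper cap.
function parseLimit(raw: unknown, fallback: number, max: number): number {
  const parsed = typeof raw === 'string' ? parseInt(raw, 10) : NaN;
  // Missing, non-numeric, zero, and negative inputs all use the fallback.
  if (!Number.isFinite(parsed) || parsed < 1) return fallback;
  return Math.min(parsed, max);
}
```

In the crawl-history route this would be called as `parseLimit(req.query.limit, 50, 100)`.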
/** /**
* GET /api/markets/stores/:id/products * GET /api/markets/stores/:id/products
* Get products for a store with filtering and pagination * Get products for a store with filtering and pagination


@@ -78,14 +78,14 @@ router.get('/metrics', async (_req: Request, res: Response) => {
/** /**
* GET /api/admin/orchestrator/states * GET /api/admin/orchestrator/states
* Returns array of states with at least one known dispensary * Returns array of states with at least one crawl-enabled dispensary
*/ */
router.get('/states', async (_req: Request, res: Response) => { router.get('/states', async (_req: Request, res: Response) => {
try { try {
const { rows } = await pool.query(` const { rows } = await pool.query(`
SELECT DISTINCT state, COUNT(*) as store_count SELECT DISTINCT state, COUNT(*) as store_count
FROM dispensaries FROM dispensaries
WHERE state IS NOT NULL WHERE state IS NOT NULL AND crawl_enabled = true
GROUP BY state GROUP BY state
ORDER BY state ORDER BY state
`); `);


@@ -0,0 +1,334 @@
/**
* Payload Routes
*
* Per TASK_WORKFLOW_2024-12-10.md: API access to raw crawl payloads.
*
* Endpoints:
* - GET /api/payloads - List payload metadata (paginated)
* - GET /api/payloads/:id - Get payload metadata by ID
* - GET /api/payloads/:id/data - Get full payload JSON
* - GET /api/payloads/store/:dispensaryId - List payloads for a store
* - GET /api/payloads/store/:dispensaryId/latest - Get latest payload for a store
* - GET /api/payloads/store/:dispensaryId/diff - Diff two payloads
*/
import { Router, Request, Response } from 'express';
import { getPool } from '../db/pool';
import {
loadRawPayloadById,
getLatestPayload,
getRecentPayloads,
listPayloadMetadata,
} from '../utils/payload-storage';
import { Pool } from 'pg';
const router = Router();
// Get pool instance for queries
const getDbPool = (): Pool => getPool() as unknown as Pool;
/**
* GET /api/payloads
* List payload metadata (paginated)
*/
router.get('/', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const limit = Math.min(parseInt(req.query.limit as string) || 50, 100);
const offset = parseInt(req.query.offset as string) || 0;
const dispensaryId = req.query.dispensary_id ? parseInt(req.query.dispensary_id as string) : undefined;
const payloads = await listPayloadMetadata(pool, {
dispensaryId,
limit,
offset,
});
res.json({
success: true,
payloads,
pagination: { limit, offset },
});
} catch (error: any) {
console.error('[Payloads] List error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/payloads/:id
* Get payload metadata by ID
*/
router.get('/:id', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const id = parseInt(req.params.id);
const result = await pool.query(`
SELECT
p.id,
p.dispensary_id,
p.crawl_run_id,
p.storage_path,
p.product_count,
p.size_bytes,
p.size_bytes_raw,
p.fetched_at,
p.processed_at,
p.checksum_sha256,
d.name as dispensary_name
FROM raw_crawl_payloads p
LEFT JOIN dispensaries d ON d.id = p.dispensary_id
WHERE p.id = $1
`, [id]);
if (result.rows.length === 0) {
return res.status(404).json({ success: false, error: 'Payload not found' });
}
res.json({
success: true,
payload: result.rows[0],
});
} catch (error: any) {
console.error('[Payloads] Get error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/payloads/:id/data
* Get full payload JSON (decompressed from disk)
*/
router.get('/:id/data', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const id = parseInt(req.params.id);
const result = await loadRawPayloadById(pool, id);
if (!result) {
return res.status(404).json({ success: false, error: 'Payload not found' });
}
res.json({
success: true,
metadata: result.metadata,
data: result.payload,
});
} catch (error: any) {
console.error('[Payloads] Get data error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/payloads/store/:dispensaryId
* List payloads for a specific store
*/
router.get('/store/:dispensaryId', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const dispensaryId = parseInt(req.params.dispensaryId);
const limit = Math.min(parseInt(req.query.limit as string) || 20, 100);
const offset = parseInt(req.query.offset as string) || 0;
const payloads = await listPayloadMetadata(pool, {
dispensaryId,
limit,
offset,
});
res.json({
success: true,
dispensaryId,
payloads,
pagination: { limit, offset },
});
} catch (error: any) {
console.error('[Payloads] Store list error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/payloads/store/:dispensaryId/latest
* Get the latest payload for a store (with full data)
*/
router.get('/store/:dispensaryId/latest', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const dispensaryId = parseInt(req.params.dispensaryId);
const result = await getLatestPayload(pool, dispensaryId);
if (!result) {
return res.status(404).json({
success: false,
error: `No payloads found for dispensary ${dispensaryId}`,
});
}
res.json({
success: true,
metadata: result.metadata,
data: result.payload,
});
} catch (error: any) {
console.error('[Payloads] Latest error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/payloads/store/:dispensaryId/diff
* Compare two payloads for a store
*
* Query params (optional):
* - from: payload ID (older)
* - to: payload ID (newer)
* If either is omitted, the two most recent payloads for the store are compared.
*/
router.get('/store/:dispensaryId/diff', async (req: Request, res: Response) => {
try {
const pool = getDbPool();
const dispensaryId = parseInt(req.params.dispensaryId);
const fromId = req.query.from ? parseInt(req.query.from as string) : undefined;
const toId = req.query.to ? parseInt(req.query.to as string) : undefined;
let fromPayload: any;
let toPayload: any;
if (fromId && toId) {
// Load specific payloads
const [from, to] = await Promise.all([
loadRawPayloadById(pool, fromId),
loadRawPayloadById(pool, toId),
]);
fromPayload = from;
toPayload = to;
} else {
// Load two most recent
const recent = await getRecentPayloads(pool, dispensaryId, 2);
if (recent.length < 2) {
return res.status(400).json({
success: false,
error: 'Need at least 2 payloads to diff. Only found ' + recent.length,
});
}
toPayload = recent[0]; // Most recent
fromPayload = recent[1]; // Previous
}
if (!fromPayload || !toPayload) {
return res.status(404).json({ success: false, error: 'One or both payloads not found' });
}
// Build product maps by ID
const fromProducts = new Map<string, any>();
const toProducts = new Map<string, any>();
for (const p of fromPayload.payload.products || []) {
const id = p._id || p.id;
if (id) fromProducts.set(id, p);
}
for (const p of toPayload.payload.products || []) {
const id = p._id || p.id;
if (id) toProducts.set(id, p);
}
// Find differences
const added: any[] = [];
const removed: any[] = [];
const priceChanges: any[] = [];
const stockChanges: any[] = [];
// Products in "to" but not in "from" = added
for (const [id, product] of toProducts) {
if (!fromProducts.has(id)) {
added.push({
id,
name: product.name,
brand: product.brand?.name,
price: product.Prices?.[0]?.price,
});
}
}
// Products in "from" but not in "to" = removed
for (const [id, product] of fromProducts) {
if (!toProducts.has(id)) {
removed.push({
id,
name: product.name,
brand: product.brand?.name,
price: product.Prices?.[0]?.price,
});
}
}
// Products in both - check for changes
for (const [id, toProduct] of toProducts) {
const fromProduct = fromProducts.get(id);
if (!fromProduct) continue;
const fromPrice = fromProduct.Prices?.[0]?.price;
const toPrice = toProduct.Prices?.[0]?.price;
if (fromPrice !== toPrice) {
priceChanges.push({
id,
name: toProduct.name,
brand: toProduct.brand?.name,
oldPrice: fromPrice,
newPrice: toPrice,
change: typeof toPrice === 'number' && typeof fromPrice === 'number' ? toPrice - fromPrice : null,
});
}
const fromStock = fromProduct.Status || fromProduct.status;
const toStock = toProduct.Status || toProduct.status;
if (fromStock !== toStock) {
stockChanges.push({
id,
name: toProduct.name,
brand: toProduct.brand?.name,
oldStatus: fromStock,
newStatus: toStock,
});
}
}
res.json({
success: true,
from: {
id: fromPayload.metadata.id,
fetchedAt: fromPayload.metadata.fetchedAt,
productCount: fromPayload.metadata.productCount,
},
to: {
id: toPayload.metadata.id,
fetchedAt: toPayload.metadata.fetchedAt,
productCount: toPayload.metadata.productCount,
},
diff: {
added: added.length,
removed: removed.length,
priceChanges: priceChanges.length,
stockChanges: stockChanges.length,
},
details: {
added,
removed,
priceChanges,
stockChanges,
},
});
} catch (error: any) {
console.error('[Payloads] Diff error:', error.message);
res.status(500).json({ success: false, error: error.message });
}
});
export default router;
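Review note: the comparison logic in the `/diff` handler does not depend on Express or the database, so it could live in a pure function. The sketch below covers only the added, removed, and price-change cases. It assumes the payload shape the handler already reads (`_id` or `id`, and `Prices[0].price`). `RawProduct`, `PriceChange`, `ProductDiff`, `indexById`, and `diffProducts` are hypothetical names.

```typescript
// Hypothetical types; field names mirror what the /diff handler reads.
interface RawProduct {
  _id?: string;
  id?: string;
  name?: string;
  Prices?: Array<{ price?: number }>;
}

interface PriceChange {
  id: string;
  oldPrice: number | undefined;
  newPrice: number | undefined;
  change: number | null;
}

interface ProductDiff {
  added: string[];
  removed: string[];
  priceChanges: PriceChange[];
}

function indexById(products: RawProduct[]): Map<string, RawProduct> {
  const byId = new Map<string, RawProduct>();
  for (const p of products) {
    const id = p._id ?? p.id;
    if (id) byId.set(id, p);
  }
  return byId;
}

function diffProducts(from: RawProduct[], to: RawProduct[]): ProductDiff {
  const fromById = indexById(from);
  const toById = indexById(to);
  const diff: ProductDiff = { added: [], removed: [], priceChanges: [] };

  toById.forEach((next, id) => {
    const prev = fromById.get(id);
    if (!prev) {
      diff.added.push(id); // present only in the newer payload
      return;
    }
    const oldPrice = prev.Prices?.[0]?.price;
    const newPrice = next.Prices?.[0]?.price;
    if (oldPrice !== newPrice) {
      // Compare types rather than truthiness so a price of 0 still yields a delta.
      const change =
        typeof oldPrice === 'number' && typeof newPrice === 'number'
          ? newPrice - oldPrice
          : null;
      diff.priceChanges.push({ id, oldPrice, newPrice, change });
    }
  });

  fromById.forEach((_prev, id) => {
    if (!toById.has(id)) diff.removed.push(id); // present only in the older payload
  });

  return diff;
}
```

Stock-status changes could be added the same way by comparing `Status`/`status` inside the first loop.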


@@ -183,8 +183,8 @@ router.post('/test-all', requireRole('superadmin', 'admin'), async (req, res) =>
return res.status(400).json({ error: 'Concurrency must be between 1 and 50' }); return res.status(400).json({ error: 'Concurrency must be between 1 and 50' });
} }
const jobId = await createProxyTestJob(mode, concurrency); const { jobId, totalProxies } = await createProxyTestJob(mode, concurrency);
res.json({ jobId, mode, concurrency, message: `Proxy test job started (mode: ${mode}, concurrency: ${concurrency})` }); res.json({ jobId, total: totalProxies, mode, concurrency, message: `Proxy test job started (mode: ${mode}, concurrency: ${concurrency})` });
} catch (error: any) { } catch (error: any) {
console.error('Error starting proxy test job:', error); console.error('Error starting proxy test job:', error);
res.status(500).json({ error: error.message || 'Failed to start proxy test job' }); res.status(500).json({ error: error.message || 'Failed to start proxy test job' });
@@ -195,8 +195,8 @@ router.post('/test-all', requireRole('superadmin', 'admin'), async (req, res) =>
router.post('/test-failed', requireRole('superadmin', 'admin'), async (req, res) => { router.post('/test-failed', requireRole('superadmin', 'admin'), async (req, res) => {
try { try {
const concurrency = parseInt(req.query.concurrency as string) || 10; const concurrency = parseInt(req.query.concurrency as string) || 10;
const jobId = await createProxyTestJob('failed', concurrency); const { jobId, totalProxies } = await createProxyTestJob('failed', concurrency);
res.json({ jobId, mode: 'failed', concurrency, message: 'Retesting failed proxies...' }); res.json({ jobId, total: totalProxies, mode: 'failed', concurrency, message: 'Retesting failed proxies...' });
} catch (error: any) { } catch (error: any) {
console.error('Error starting failed proxy test:', error); console.error('Error starting failed proxy test:', error);
res.status(500).json({ error: error.message || 'Failed to start proxy test job' }); res.status(500).json({ error: error.message || 'Failed to start proxy test job' });


@@ -130,6 +130,12 @@ const CONSUMER_TRUSTED_ORIGINS = [
'http://localhost:3002', 'http://localhost:3002',
]; ];
// Trusted origin patterns: HTTPS only, matching the apex domain or a single subdomain label (domain.com or sub.domain.com)
const CONSUMER_TRUSTED_PATTERNS = [
/^https:\/\/([a-z0-9-]+\.)?cannaiq\.co$/,
/^https:\/\/([a-z0-9-]+\.)?cannabrands\.app$/,
];
// Trusted IPs for local development (bypass API key auth) // Trusted IPs for local development (bypass API key auth)
const TRUSTED_IPS = ['127.0.0.1', '::1', '::ffff:127.0.0.1']; const TRUSTED_IPS = ['127.0.0.1', '::1', '::ffff:127.0.0.1'];
@@ -150,9 +156,18 @@ function isConsumerTrustedRequest(req: Request): boolean {
return true; return true;
} }
const origin = req.headers.origin; const origin = req.headers.origin;
if (origin && CONSUMER_TRUSTED_ORIGINS.includes(origin)) { if (origin) {
// Check exact matches
if (CONSUMER_TRUSTED_ORIGINS.includes(origin)) {
return true; return true;
} }
// Check wildcard patterns
for (const pattern of CONSUMER_TRUSTED_PATTERNS) {
if (pattern.test(origin)) {
return true;
}
}
}
const referer = req.headers.referer; const referer = req.headers.referer;
if (referer) { if (referer) {
for (const trusted of CONSUMER_TRUSTED_ORIGINS) { for (const trusted of CONSUMER_TRUSTED_ORIGINS) {
@@ -160,6 +175,18 @@ function isConsumerTrustedRequest(req: Request): boolean {
return true; return true;
} }
} }
// Check wildcard patterns against referer origin
try {
const refererUrl = new URL(referer);
const refererOrigin = refererUrl.origin;
for (const pattern of CONSUMER_TRUSTED_PATTERNS) {
if (pattern.test(refererOrigin)) {
return true;
}
}
} catch {
// Invalid referer URL, ignore
}
} }
return false; return false;
} }
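Review note: the two patterns can be exercised in isolation to confirm what they accept. The sketch below copies the regexes from this diff and normalizes the input through `URL` first, the same way the referer branch does. `TRUSTED_PATTERNS` and `matchesTrustedPattern` are hypothetical names.

```typescript
// Patterns copied from CONSUMER_TRUSTED_PATTERNS in this diff.
const TRUSTED_PATTERNS: RegExp[] = [
  /^https:\/\/([a-z0-9-]+\.)?cannaiq\.co$/,
  /^https:\/\/([a-z0-9-]+\.)?cannabrands\.app$/,
];

// Hypothetical helper: accepts an Origin or Referer value and tests its origin.
function matchesTrustedPattern(originOrReferer: string): boolean {
  let origin: string;
  try {
    origin = new URL(originOrReferer).origin; // drops path, query, and fragment
  } catch {
    return false; // not a parseable URL
  }
  return TRUSTED_PATTERNS.some((pattern) => pattern.test(origin));
}
```

Because the optional group allows only one label before the domain, a nested host such as `a.b.cannaiq.co` is rejected, as are plain `http` origins and look-alike hosts such as `cannaiq.co.evil.com`.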


@@ -13,6 +13,12 @@ import {
TaskFilter, TaskFilter,
} from '../tasks/task-service'; } from '../tasks/task-service';
import { pool } from '../db/pool'; import { pool } from '../db/pool';
import {
isTaskPoolPaused,
pauseTaskPool,
resumeTaskPool,
getTaskPoolStatus,
} from '../tasks/task-pool-state';
const router = Router(); const router = Router();
@@ -592,4 +598,42 @@ router.post('/migration/full-migrate', async (req: Request, res: Response) => {
} }
}); });
/**
* GET /api/tasks/pool/status
* Check if task pool is paused
*/
router.get('/pool/status', async (_req: Request, res: Response) => {
const status = getTaskPoolStatus();
res.json({
success: true,
...status,
});
});
/**
* POST /api/tasks/pool/pause
* Pause the task pool - workers won't pick up new tasks
*/
router.post('/pool/pause', async (_req: Request, res: Response) => {
pauseTaskPool();
res.json({
success: true,
paused: true,
message: 'Task pool paused - workers will not pick up new tasks',
});
});
/**
* POST /api/tasks/pool/resume
* Resume the task pool - workers will pick up tasks again
*/
router.post('/pool/resume', async (_req: Request, res: Response) => {
resumeTaskPool();
res.json({
success: true,
paused: false,
message: 'Task pool resumed - workers will pick up new tasks',
});
});
export default router; export default router;


@@ -14,23 +14,36 @@ router.get('/', async (req: AuthRequest, res) => {
try { try {
const { search, domain } = req.query; const { search, domain } = req.query;
let query = ` // Check which columns exist (schema-tolerant)
SELECT id, email, role, first_name, last_name, phone, domain, created_at, updated_at const columnsResult = await pool.query(`
FROM users SELECT column_name FROM information_schema.columns
WHERE 1=1 WHERE table_name = 'users' AND column_name IN ('first_name', 'last_name', 'phone', 'domain')
`; `);
const existingColumns = new Set(columnsResult.rows.map((r: any) => r.column_name));
// Build column list based on what exists
const selectCols = ['id', 'email', 'role', 'created_at', 'updated_at'];
if (existingColumns.has('first_name')) selectCols.push('first_name');
if (existingColumns.has('last_name')) selectCols.push('last_name');
if (existingColumns.has('phone')) selectCols.push('phone');
if (existingColumns.has('domain')) selectCols.push('domain');
let query = `SELECT ${selectCols.join(', ')} FROM users WHERE 1=1`;
const params: any[] = []; const params: any[] = [];
let paramIndex = 1; let paramIndex = 1;
// Search by email, first_name, or last_name // Search by email (and optionally first_name, last_name if they exist)
if (search && typeof search === 'string') { if (search && typeof search === 'string') {
query += ` AND (email ILIKE $${paramIndex} OR first_name ILIKE $${paramIndex} OR last_name ILIKE $${paramIndex})`; const searchClauses = ['email ILIKE $' + paramIndex];
if (existingColumns.has('first_name')) searchClauses.push('first_name ILIKE $' + paramIndex);
if (existingColumns.has('last_name')) searchClauses.push('last_name ILIKE $' + paramIndex);
query += ` AND (${searchClauses.join(' OR ')})`;
params.push(`%${search}%`); params.push(`%${search}%`);
paramIndex++; paramIndex++;
} }
// Filter by domain // Filter by domain (if column exists)
if (domain && typeof domain === 'string') { if (domain && typeof domain === 'string' && existingColumns.has('domain')) {
query += ` AND domain = $${paramIndex}`; query += ` AND domain = $${paramIndex}`;
params.push(domain); params.push(domain);
paramIndex++; paramIndex++;
@@ -50,8 +63,22 @@ router.get('/', async (req: AuthRequest, res) => {
router.get('/:id', async (req: AuthRequest, res) => { router.get('/:id', async (req: AuthRequest, res) => {
try { try {
const { id } = req.params; const { id } = req.params;
// Check which columns exist (schema-tolerant)
const columnsResult = await pool.query(`
SELECT column_name FROM information_schema.columns
WHERE table_name = 'users' AND column_name IN ('first_name', 'last_name', 'phone', 'domain')
`);
const existingColumns = new Set(columnsResult.rows.map((r: any) => r.column_name));
const selectCols = ['id', 'email', 'role', 'created_at', 'updated_at'];
if (existingColumns.has('first_name')) selectCols.push('first_name');
if (existingColumns.has('last_name')) selectCols.push('last_name');
if (existingColumns.has('phone')) selectCols.push('phone');
if (existingColumns.has('domain')) selectCols.push('domain');
const result = await pool.query(` const result = await pool.query(`
SELECT id, email, role, first_name, last_name, phone, domain, created_at, updated_at SELECT ${selectCols.join(', ')}
FROM users FROM users
WHERE id = $1 WHERE id = $1
`, [id]); `, [id]);
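Review note: the column-probing block is duplicated between the list and single-user handlers. A minimal sketch of shared helpers follows, assuming the same base and optional columns that this diff uses. `BASE_COLUMNS`, `OPTIONAL_COLUMNS`, `buildUserSelect`, and `buildSearchClause` are hypothetical names.

```typescript
// Column names taken from the handlers in this diff; helper names are illustrative.
const BASE_COLUMNS = ['id', 'email', 'role', 'created_at', 'updated_at'];
const OPTIONAL_COLUMNS = ['first_name', 'last_name', 'phone', 'domain'];

// Returns the SELECT column list: base columns plus whichever optional ones exist.
function buildUserSelect(existing: Set<string>): string[] {
  return BASE_COLUMNS.concat(OPTIONAL_COLUMNS.filter((c) => existing.has(c)));
}

// Returns the parenthesized ILIKE clause, reusing one positional parameter.
function buildSearchClause(existing: Set<string>, paramIndex: number): string {
  const searchable = ['email'].concat(
    ['first_name', 'last_name'].filter((c) => existing.has(c)),
  );
  const clauses = searchable.map((c) => `${c} ILIKE $${paramIndex}`);
  return `(${clauses.join(' OR ')})`;
}
```

Only names from the fixed allow-lists are ever interpolated into the SQL text, so the user-supplied search term still travels as a bound parameter. The `information_schema` lookup could also be cached, since both handlers currently run it on every request.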


@@ -138,17 +138,36 @@ router.post('/register', async (req: Request, res: Response) => {
* *
* Body: * Body:
* - worker_id: string (required) * - worker_id: string (required)
* - current_task_id: number (optional) - task currently being processed * - current_task_id: number (optional) - task currently being processed (primary task)
* - current_task_ids: number[] (optional) - all tasks currently being processed (concurrent)
* - active_task_count: number (optional) - number of tasks currently running
* - max_concurrent_tasks: number (optional) - max concurrent tasks this worker can handle
* - status: string (optional) - 'active', 'idle' * - status: string (optional) - 'active', 'idle'
* - resources: object (optional) - memory_mb, cpu_user_ms, cpu_system_ms, etc.
*/ */
router.post('/heartbeat', async (req: Request, res: Response) => { router.post('/heartbeat', async (req: Request, res: Response) => {
try { try {
const { worker_id, current_task_id, status = 'active', resources } = req.body; const {
worker_id,
current_task_id,
current_task_ids,
active_task_count,
max_concurrent_tasks,
status = 'active',
resources
} = req.body;
if (!worker_id) { if (!worker_id) {
return res.status(400).json({ success: false, error: 'worker_id is required' }); return res.status(400).json({ success: false, error: 'worker_id is required' });
} }
// Build metadata object with all the new fields
const metadata: Record<string, unknown> = {};
if (resources) Object.assign(metadata, resources);
if (current_task_ids) metadata.current_task_ids = current_task_ids;
if (active_task_count !== undefined) metadata.active_task_count = active_task_count;
if (max_concurrent_tasks !== undefined) metadata.max_concurrent_tasks = max_concurrent_tasks;
// Store resources in metadata jsonb column // Store resources in metadata jsonb column
const { rows } = await pool.query(` const { rows } = await pool.query(`
UPDATE worker_registry UPDATE worker_registry
@@ -159,7 +178,7 @@ router.post('/heartbeat', async (req: Request, res: Response) => {
updated_at = NOW() updated_at = NOW()
WHERE worker_id = $3 WHERE worker_id = $3
RETURNING id, friendly_name, status RETURNING id, friendly_name, status
`, [current_task_id || null, status, worker_id, resources ? JSON.stringify(resources) : null]); `, [current_task_id || null, status, worker_id, Object.keys(metadata).length > 0 ? JSON.stringify(metadata) : null]);
if (rows.length === 0) { if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found - please register first' }); return res.status(404).json({ success: false, error: 'Worker not found - please register first' });
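
The metadata assembly in the heartbeat handler above can be sketched as a pure function, which makes the "store NULL when nothing was sent" behaviour easier to check in isolation. This is a minimal sketch under assumptions: `HeartbeatBody` and `buildHeartbeatMetadata` are hypothetical names that do not appear in the diff, and only the fields visible in the hunk are modelled.

```typescript
// Hypothetical request-body shape; only the fields read by the hunk above.
interface HeartbeatBody {
  resources?: Record<string, unknown>;
  current_task_ids?: number[];
  active_task_count?: number;
  max_concurrent_tasks?: number;
}

// Mirrors the handler: merge resources first, then add the concurrency
// fields, and return null when no metadata was supplied at all.
function buildHeartbeatMetadata(body: HeartbeatBody): string | null {
  const metadata: Record<string, unknown> = {};
  if (body.resources) Object.assign(metadata, body.resources);
  if (body.current_task_ids) metadata.current_task_ids = body.current_task_ids;
  // Compare against undefined so that a legitimate count of 0 is kept.
  if (body.active_task_count !== undefined) {
    metadata.active_task_count = body.active_task_count;
  }
  if (body.max_concurrent_tasks !== undefined) {
    metadata.max_concurrent_tasks = body.max_concurrent_tasks;
  }
  return Object.keys(metadata).length > 0 ? JSON.stringify(metadata) : null;
}
```

One design point worth noting: because `resources` is merged first, a key in `resources` named `active_task_count` would be overwritten by the top-level field of the same name.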
@@ -273,6 +292,29 @@ router.post('/deregister', async (req: Request, res: Response) => {
*/ */
router.get('/workers', async (req: Request, res: Response) => { router.get('/workers', async (req: Request, res: Response) => {
try { try {
// Check if worker_registry table exists
const tableCheck = await pool.query(`
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_name = 'worker_registry'
) as exists
`);
if (!tableCheck.rows[0].exists) {
// Return empty result if table doesn't exist yet
return res.json({
success: true,
workers: [],
summary: {
active_count: 0,
idle_count: 0,
offline_count: 0,
total_count: 0,
active_roles: 0
}
});
}
const { status, role, include_terminated = 'false' } = req.query; const { status, role, include_terminated = 'false' } = req.query;
let whereClause = include_terminated === 'true' ? 'WHERE 1=1' : "WHERE status != 'terminated'"; let whereClause = include_terminated === 'true' ? 'WHERE 1=1' : "WHERE status != 'terminated'";
@@ -307,12 +349,21 @@ router.get('/workers', async (req: Request, res: Response) => {
tasks_completed, tasks_completed,
tasks_failed, tasks_failed,
current_task_id, current_task_id,
-- Concurrent task fields from metadata
(metadata->>'current_task_ids')::jsonb as current_task_ids,
(metadata->>'active_task_count')::int as active_task_count,
(metadata->>'max_concurrent_tasks')::int as max_concurrent_tasks,
-- Decommission fields
COALESCE(decommission_requested, false) as decommission_requested,
decommission_reason,
-- Full metadata for resources
metadata, metadata,
EXTRACT(EPOCH FROM (NOW() - last_heartbeat_at)) as seconds_since_heartbeat, EXTRACT(EPOCH FROM (NOW() - last_heartbeat_at)) as seconds_since_heartbeat,
CASE CASE
WHEN status = 'offline' OR status = 'terminated' THEN status WHEN status = 'offline' OR status = 'terminated' THEN status
WHEN last_heartbeat_at < NOW() - INTERVAL '2 minutes' THEN 'stale' WHEN last_heartbeat_at < NOW() - INTERVAL '2 minutes' THEN 'stale'
WHEN current_task_id IS NOT NULL THEN 'busy' WHEN current_task_id IS NOT NULL THEN 'busy'
WHEN (metadata->>'active_task_count')::int > 0 THEN 'busy'
ELSE 'ready' ELSE 'ready'
END as health_status, END as health_status,
created_at created_at
@@ -649,4 +700,163 @@ router.get('/capacity', async (_req: Request, res: Response) => {
} }
}); });
// ============================================================
// WORKER LIFECYCLE MANAGEMENT
// ============================================================
/**
* POST /api/worker-registry/workers/:workerId/decommission
* Request graceful decommission of a worker (will stop after current task)
*/
router.post('/workers/:workerId/decommission', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
const { reason, issued_by } = req.body;
// Update worker_registry to flag for decommission
const result = await pool.query(
`UPDATE worker_registry
SET decommission_requested = true,
decommission_reason = $2,
decommission_requested_at = NOW()
WHERE worker_id = $1
RETURNING friendly_name, status, current_task_id`,
[workerId, reason || 'Manual decommission from admin']
);
if (result.rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
const worker = result.rows[0];
// Also log to worker_commands for audit trail
await pool.query(
`INSERT INTO worker_commands (worker_id, command, reason, issued_by)
VALUES ($1, 'decommission', $2, $3)
ON CONFLICT DO NOTHING`,
[workerId, reason || 'Manual decommission', issued_by || 'admin']
).catch(() => {
// Table might not exist yet - ignore
});
res.json({
success: true,
message: worker.current_task_id
? `Worker ${worker.friendly_name} will stop after completing task #${worker.current_task_id}`
: `Worker ${worker.friendly_name} will stop on next poll`,
worker: {
friendly_name: worker.friendly_name,
status: worker.status,
current_task_id: worker.current_task_id,
decommission_requested: true
}
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/workers/:workerId/cancel-decommission
* Cancel a pending decommission request
*/
router.post('/workers/:workerId/cancel-decommission', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
const result = await pool.query(
`UPDATE worker_registry
SET decommission_requested = false,
decommission_reason = NULL,
decommission_requested_at = NULL
WHERE worker_id = $1
RETURNING friendly_name`,
[workerId]
);
if (result.rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
res.json({
success: true,
message: `Decommission cancelled for ${result.rows[0].friendly_name}`
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/spawn
* Spawn a new worker in the current pod (only works in multi-worker-per-pod mode)
* For now, this is a placeholder - actual spawning requires the pod supervisor
*/
router.post('/spawn', async (req: Request, res: Response) => {
try {
const { pod_name, role } = req.body;
// For now, we can't actually spawn workers from the API
// This would require a supervisor process in each pod that listens for spawn commands
// Instead, return instructions for how to scale
res.json({
success: false,
error: 'Direct worker spawning not yet implemented',
instructions: 'To add workers, scale the K8s deployment: kubectl scale deployment/scraper-worker --replicas=N'
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/worker-registry/pods
* Get workers grouped by pod
*/
router.get('/pods', async (_req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
COALESCE(pod_name, 'Unknown') as pod_name,
COUNT(*) as worker_count,
COUNT(*) FILTER (WHERE current_task_id IS NOT NULL) as busy_count,
COUNT(*) FILTER (WHERE current_task_id IS NULL) as idle_count,
SUM(tasks_completed) as total_completed,
SUM(tasks_failed) as total_failed,
SUM((metadata->>'memory_rss_mb')::int) as total_memory_mb,
array_agg(json_build_object(
'worker_id', worker_id,
'friendly_name', friendly_name,
'status', status,
'current_task_id', current_task_id,
'tasks_completed', tasks_completed,
'tasks_failed', tasks_failed,
'decommission_requested', COALESCE(decommission_requested, false),
'last_heartbeat_at', last_heartbeat_at
)) as workers
FROM worker_registry
WHERE status NOT IN ('offline', 'terminated')
GROUP BY pod_name
ORDER BY pod_name
`);
res.json({
success: true,
pods: rows.map(row => ({
pod_name: row.pod_name,
worker_count: parseInt(row.worker_count),
busy_count: parseInt(row.busy_count),
idle_count: parseInt(row.idle_count),
total_completed: parseInt(row.total_completed) || 0,
total_failed: parseInt(row.total_failed) || 0,
total_memory_mb: parseInt(row.total_memory_mb) || 0,
workers: row.workers
}))
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
export default router; export default router;
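
The `health_status` CASE expression in the `/workers` query above evaluates its branches in order, so a stale heartbeat wins over a busy worker. A minimal TypeScript sketch of the same precedence, assuming a hypothetical `WorkerRow` shape and a non-null heartbeat timestamp (the SQL version falls through when `last_heartbeat_at` is NULL):

```typescript
// Hypothetical row shape: only the columns the CASE expression reads.
interface WorkerRow {
  status: string;
  lastHeartbeatAt: Date;
  currentTaskId: number | null;
  activeTaskCount: number | null; // (metadata->>'active_task_count')::int
}

type HealthStatus = 'offline' | 'terminated' | 'stale' | 'busy' | 'ready';

const STALE_AFTER_MS = 2 * 60 * 1000; // INTERVAL '2 minutes'

function deriveHealthStatus(row: WorkerRow, now: Date = new Date()): HealthStatus {
  // 1. Explicit terminal states are reported as-is.
  if (row.status === 'offline') return 'offline';
  if (row.status === 'terminated') return 'terminated';
  // 2. A heartbeat older than two minutes marks the worker stale.
  if (now.getTime() - row.lastHeartbeatAt.getTime() > STALE_AFTER_MS) return 'stale';
  // 3. Either a primary task or a positive concurrent count means busy.
  if (row.currentTaskId !== null) return 'busy';
  if ((row.activeTaskCount ?? 0) > 0) return 'busy';
  return 'ready';
}
```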

@@ -17,13 +17,234 @@
* GET /api/monitor/jobs - Get recent job history * GET /api/monitor/jobs - Get recent job history
* GET /api/monitor/active-jobs - Get currently running jobs * GET /api/monitor/active-jobs - Get currently running jobs
* GET /api/monitor/summary - Get monitoring summary * GET /api/monitor/summary - Get monitoring summary
*
* K8s Scaling (added 2024-12-10):
* GET /api/workers/k8s/replicas - Get current replica count
* POST /api/workers/k8s/scale - Scale worker replicas up/down
*/ */
import { Router, Request, Response } from 'express'; import { Router, Request, Response } from 'express';
import { pool } from '../db/pool'; import { pool } from '../db/pool';
import * as k8s from '@kubernetes/client-node';
const router = Router(); const router = Router();
// ============================================================
// K8S SCALING CONFIGURATION (added 2024-12-10)
// Per TASK_WORKFLOW_2024-12-10.md: Admin can scale workers from UI
// ============================================================
const K8S_NAMESPACE = process.env.K8S_NAMESPACE || 'dispensary-scraper';
const K8S_DEPLOYMENT_NAME = process.env.K8S_WORKER_DEPLOYMENT || 'scraper-worker';
// Initialize K8s client - uses in-cluster config when running in K8s,
// or kubeconfig when running locally
let k8sAppsApi: k8s.AppsV1Api | null = null;
function getK8sClient(): k8s.AppsV1Api | null {
if (k8sAppsApi) return k8sAppsApi;
try {
const kc = new k8s.KubeConfig();
// Try in-cluster config first (when running as a pod)
// Falls back to default kubeconfig (~/.kube/config) for local dev
try {
kc.loadFromCluster();
} catch {
kc.loadFromDefault();
}
k8sAppsApi = kc.makeApiClient(k8s.AppsV1Api);
return k8sAppsApi;
} catch (err: any) {
console.warn('[Workers] K8s client not available:', err.message);
return null;
}
}
// ============================================================
// K8S SCALING ROUTES (added 2024-12-10)
// Per TASK_WORKFLOW_2024-12-10.md: Admin can scale workers from UI
// ============================================================
/**
* GET /api/workers/k8s/replicas - Get current worker replica count
* Returns current and desired replica counts from the Deployment
*/
router.get('/k8s/replicas', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: 'K8s client not available (not running in cluster or no kubeconfig)',
replicas: null,
});
}
try {
const response = await client.readNamespacedDeployment({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const deployment = response;
res.json({
success: true,
replicas: {
current: deployment.status?.readyReplicas || 0,
desired: deployment.spec?.replicas || 0,
available: deployment.status?.availableReplicas || 0,
updated: deployment.status?.updatedReplicas || 0,
},
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
console.error('[Workers] K8s replicas error:', err.body?.message || err.message);
res.status(500).json({
success: false,
error: err.body?.message || err.message,
});
}
});
/**
* POST /api/workers/k8s/scale - Scale worker replicas
* Body: { replicas: number } - desired replica count (0-20)
*/
router.post('/k8s/scale', async (req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: 'K8s client not available (not running in cluster or no kubeconfig)',
});
}
const { replicas } = req.body;
// Validate replica count
if (typeof replicas !== 'number' || replicas < 0 || replicas > 20) {
return res.status(400).json({
success: false,
error: 'replicas must be a number between 0 and 20',
});
}
try {
// Get current state first
const currentResponse = await client.readNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const currentReplicas = currentResponse.spec?.replicas || 0;
// Update scale using replaceNamespacedDeploymentScale
await client.replaceNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
body: {
apiVersion: 'autoscaling/v1',
kind: 'Scale',
metadata: {
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
},
spec: {
replicas: replicas,
},
},
});
console.log(`[Workers] Scaled ${K8S_DEPLOYMENT_NAME} from ${currentReplicas} to ${replicas} replicas`);
res.json({
success: true,
message: `Scaled from ${currentReplicas} to ${replicas} replicas`,
previous: currentReplicas,
desired: replicas,
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
console.error('[Workers] K8s scale error:', err.body?.message || err.message);
res.status(500).json({
success: false,
error: err.body?.message || err.message,
});
}
});
/**
* POST /api/workers/k8s/scale-up - Scale up worker replicas by 1
* Convenience endpoint for adding a single worker
*/
router.post('/k8s/scale-up', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: 'K8s client not available (not running in cluster or no kubeconfig)',
});
}
try {
// Get current replica count
const currentResponse = await client.readNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const currentReplicas = currentResponse.spec?.replicas || 0;
const newReplicas = currentReplicas + 1;
// Cap at 20 replicas
if (newReplicas > 20) {
return res.status(400).json({
success: false,
error: 'Maximum replica count (20) reached',
});
}
// Scale up by 1
await client.replaceNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
body: {
apiVersion: 'autoscaling/v1',
kind: 'Scale',
metadata: {
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
},
spec: {
replicas: newReplicas,
},
},
});
console.log(`[Workers] Scaled up ${K8S_DEPLOYMENT_NAME} from ${currentReplicas} to ${newReplicas} replicas`);
res.json({
success: true,
message: `Added worker (${currentReplicas} → ${newReplicas} replicas)`,
previous: currentReplicas,
desired: newReplicas,
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
console.error('[Workers] K8s scale-up error:', err.body?.message || err.message);
res.status(500).json({
success: false,
error: err.body?.message || err.message,
});
}
});
// ============================================================ // ============================================================
// STATIC ROUTES (must come before parameterized routes) // STATIC ROUTES (must come before parameterized routes)
// ============================================================ // ============================================================
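
The `/k8s/scale` route above accepts any `number` between 0 and 20, which would also let a fractional value such as `2.5` through to the Kubernetes API. A stricter check could look like the sketch below. `validateReplicas`, `ReplicaCheck`, and `MAX_REPLICAS` are hypothetical names introduced for illustration and are not part of the diff.

```typescript
const MAX_REPLICAS = 20; // same cap as the routes above

type ReplicaCheck =
  | { ok: true; replicas: number }
  | { ok: false; error: string };

// Accepts only whole numbers in [0, max]; everything else is rejected
// with a message suitable for a 400 response body.
function validateReplicas(value: unknown, max: number = MAX_REPLICAS): ReplicaCheck {
  if (typeof value !== 'number' || !Number.isInteger(value)) {
    return { ok: false, error: 'replicas must be an integer' };
  }
  if (value < 0 || value > max) {
    return { ok: false, error: `replicas must be between 0 and ${max}` };
  }
  return { ok: true, replicas: value };
}
```

Sharing one helper between `/k8s/scale` and `/k8s/scale-up` would also keep the upper bound defined in a single place.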

@@ -16,10 +16,11 @@ import {
executeGraphQL, executeGraphQL,
startSession, startSession,
endSession, endSession,
getFingerprint, setCrawlRotator,
GRAPHQL_HASHES, GRAPHQL_HASHES,
DUTCHIE_CONFIG, DUTCHIE_CONFIG,
} from '../platforms/dutchie'; } from '../platforms/dutchie';
import { CrawlRotator } from '../services/crawl-rotator';
dotenv.config(); dotenv.config();
@@ -108,19 +109,27 @@ async function main() {
// ============================================================ // ============================================================
// STEP 2: Start stealth session // STEP 2: Start stealth session
// Per workflow-12102025.md: Initialize CrawlRotator and start session with menuUrl
// ============================================================ // ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐'); console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 2: Start Stealth Session │'); console.log('│ STEP 2: Start Stealth Session │');
console.log('└─────────────────────────────────────────────────────────────┘'); console.log('└─────────────────────────────────────────────────────────────┘');
// Use Arizona timezone for this store // Per workflow-12102025.md: Initialize CrawlRotator (required for sessions)
const session = startSession(disp.state || 'AZ', 'America/Phoenix'); const rotator = new CrawlRotator();
setCrawlRotator(rotator);
const fp = getFingerprint(); // Per workflow-12102025.md: startSession takes menuUrl for dynamic Referer
const session = startSession(disp.menu_url);
const fp = session.fingerprint;
console.log(` Session ID: ${session.sessionId}`); console.log(` Session ID: ${session.sessionId}`);
console.log(` Browser: ${fp.browserName} (${fp.deviceCategory})`);
console.log(` User-Agent: ${fp.userAgent.slice(0, 60)}...`); console.log(` User-Agent: ${fp.userAgent.slice(0, 60)}...`);
console.log(` Accept-Language: ${fp.acceptLanguage}`); console.log(` Accept-Language: ${fp.acceptLanguage}`);
console.log(` Sec-CH-UA: ${fp.secChUa || '(not set)'}`); console.log(` Referer: ${session.referer}`);
console.log(` DNT: ${fp.httpFingerprint.hasDNT ? 'enabled' : 'disabled'}`);
console.log(` TLS: ${fp.httpFingerprint.curlImpersonateBinary}`);
console.log(''); console.log('');
// ============================================================ // ============================================================

@@ -1,10 +1,10 @@
/** /**
* Test script for stealth session management * Test script for stealth session management
* *
* Tests: * Per workflow-12102025.md:
* 1. Per-session fingerprint rotation * - Tests HTTP fingerprinting (browser-specific headers + ordering)
* 2. Geographic consistency (timezone → Accept-Language) * - Tests UA generation (device distribution, browser filtering)
* 3. Proxy location loading from database * - Tests dynamic Referer per dispensary
* *
* Usage: * Usage:
* npx tsx src/scripts/test-stealth-session.ts * npx tsx src/scripts/test-stealth-session.ts
@@ -14,104 +14,142 @@ import {
startSession, startSession,
endSession, endSession,
getCurrentSession, getCurrentSession,
getFingerprint,
getRandomFingerprint,
getLocaleForTimezone,
buildHeaders, buildHeaders,
setCrawlRotator,
} from '../platforms/dutchie'; } from '../platforms/dutchie';
import { CrawlRotator } from '../services/crawl-rotator';
import {
generateHTTPFingerprint,
buildRefererFromMenuUrl,
BrowserType,
} from '../services/http-fingerprint';
console.log('='.repeat(60)); console.log('='.repeat(60));
console.log('STEALTH SESSION TEST'); console.log('STEALTH SESSION TEST (per workflow-12102025.md)');
console.log('='.repeat(60)); console.log('='.repeat(60));
// Test 1: Timezone to Locale mapping // Initialize CrawlRotator (required for sessions)
console.log('\n[Test 1] Timezone to Locale Mapping:'); console.log('\n[Setup] Initializing CrawlRotator...');
const testTimezones = [ const rotator = new CrawlRotator();
'America/Phoenix', setCrawlRotator(rotator);
'America/Los_Angeles', console.log(' CrawlRotator initialized');
'America/New_York',
'America/Chicago', // Test 1: HTTP Fingerprint Generation
console.log('\n[Test 1] HTTP Fingerprint Generation:');
const browsers: BrowserType[] = ['Chrome', 'Firefox', 'Safari', 'Edge'];
for (const browser of browsers) {
const httpFp = generateHTTPFingerprint(browser);
console.log(` ${browser}:`);
console.log(` TLS binary: ${httpFp.curlImpersonateBinary}`);
console.log(` DNT: ${httpFp.hasDNT ? 'enabled' : 'disabled'}`);
console.log(` Header order: ${httpFp.headerOrder.slice(0, 5).join(', ')}...`);
}
// Test 2: Dynamic Referer from menu URLs
console.log('\n[Test 2] Dynamic Referer from Menu URLs:');
const testUrls = [
'https://dutchie.com/embedded-menu/harvest-of-tempe',
'https://dutchie.com/dispensary/zen-leaf-mesa',
'/embedded-menu/deeply-rooted',
'/dispensary/curaleaf-phoenix',
null,
undefined, undefined,
'Invalid/Timezone',
]; ];
for (const tz of testTimezones) { for (const url of testUrls) {
const locale = getLocaleForTimezone(tz); const referer = buildRefererFromMenuUrl(url);
console.log(` ${tz || '(undefined)'} → ${locale}`); console.log(` ${url || '(null/undefined)'}`);
console.log(` → ${referer}`);
} }
// Test 2: Random fingerprint selection // Test 3: Session with Dynamic Referer
console.log('\n[Test 2] Random Fingerprint Selection (5 samples):'); console.log('\n[Test 3] Session with Dynamic Referer:');
for (let i = 0; i < 5; i++) { const testMenuUrl = 'https://dutchie.com/dispensary/harvest-of-tempe';
const fp = getRandomFingerprint(); console.log(` Starting session with menuUrl: ${testMenuUrl}`);
console.log(` ${i + 1}. ${fp.userAgent.slice(0, 60)}...`);
}
// Test 3: Session Management const session1 = startSession(testMenuUrl);
console.log('\n[Test 3] Session Management:');
// Before session - should use default fingerprint
console.log(' Before session:');
const beforeFp = getFingerprint();
console.log(` getFingerprint(): ${beforeFp.userAgent.slice(0, 50)}...`);
console.log(` getCurrentSession(): ${getCurrentSession()}`);
// Start session with Arizona timezone
console.log('\n Starting session (AZ, America/Phoenix):');
const session1 = startSession('AZ', 'America/Phoenix');
console.log(` Session ID: ${session1.sessionId}`); console.log(` Session ID: ${session1.sessionId}`);
console.log(` Fingerprint UA: ${session1.fingerprint.userAgent.slice(0, 50)}...`); console.log(` Browser: ${session1.fingerprint.browserName}`);
console.log(` Accept-Language: ${session1.fingerprint.acceptLanguage}`); console.log(` Device: ${session1.fingerprint.deviceCategory}`);
console.log(` Timezone: ${session1.timezone}`); console.log(` Referer: ${session1.referer}`);
console.log(` DNT: ${session1.fingerprint.httpFingerprint.hasDNT ? 'enabled' : 'disabled'}`);
console.log(` TLS: ${session1.fingerprint.httpFingerprint.curlImpersonateBinary}`);
// During session - should use session fingerprint // Test 4: Build Headers (browser-specific order)
console.log('\n During session:'); console.log('\n[Test 4] Build Headers (browser-specific order):');
const duringFp = getFingerprint(); const { headers, orderedHeaders } = buildHeaders(true, 1000);
console.log(` getFingerprint(): ${duringFp.userAgent.slice(0, 50)}...`); console.log(` Headers built for ${session1.fingerprint.browserName}:`);
console.log(` Same as session? ${duringFp.userAgent === session1.fingerprint.userAgent}`); console.log(` Order: ${orderedHeaders.join(' → ')}`);
console.log(` Sample headers:`);
console.log(` User-Agent: ${headers['User-Agent']?.slice(0, 50)}...`);
console.log(` Accept: ${headers['Accept']}`);
console.log(` Accept-Language: ${headers['Accept-Language']}`);
console.log(` Referer: ${headers['Referer']}`);
if (headers['sec-ch-ua']) {
console.log(` sec-ch-ua: ${headers['sec-ch-ua']}`);
}
if (headers['DNT']) {
console.log(` DNT: ${headers['DNT']}`);
}
// Test buildHeaders with session
console.log('\n buildHeaders() during session:');
const headers = buildHeaders('/embedded-menu/test-store');
console.log(` User-Agent: ${headers['user-agent'].slice(0, 50)}...`);
console.log(` Accept-Language: ${headers['accept-language']}`);
console.log(` Origin: ${headers['origin']}`);
console.log(` Referer: ${headers['referer']}`);
// End session
console.log('\n Ending session:');
endSession(); endSession();
console.log(` getCurrentSession(): ${getCurrentSession()}`);
// Test 4: Multiple sessions should have different fingerprints // Test 5: Multiple Sessions (UA variety)
console.log('\n[Test 4] Multiple Sessions (fingerprint variety):'); console.log('\n[Test 5] Multiple Sessions (UA & fingerprint variety):');
const fingerprints: string[] = []; const sessions: {
browser: string;
device: string;
hasDNT: boolean;
}[] = [];
for (let i = 0; i < 10; i++) { for (let i = 0; i < 10; i++) {
const session = startSession('CA', 'America/Los_Angeles'); const session = startSession(`/dispensary/store-${i}`);
fingerprints.push(session.fingerprint.userAgent); sessions.push({
browser: session.fingerprint.browserName,
device: session.fingerprint.deviceCategory,
hasDNT: session.fingerprint.httpFingerprint.hasDNT,
});
endSession(); endSession();
} }
const uniqueCount = new Set(fingerprints).size; // Count distribution
console.log(` 10 sessions created, ${uniqueCount} unique fingerprints`); const browserCounts: Record<string, number> = {};
console.log(` Variety: ${uniqueCount >= 3 ? '✅ Good' : '⚠️ Low - may need more fingerprint options'}`); const deviceCounts: Record<string, number> = {};
let dntCount = 0;
// Test 5: Geographic consistency check for (const s of sessions) {
console.log('\n[Test 5] Geographic Consistency:'); browserCounts[s.browser] = (browserCounts[s.browser] || 0) + 1;
const geoTests = [ deviceCounts[s.device] = (deviceCounts[s.device] || 0) + 1;
{ state: 'AZ', tz: 'America/Phoenix' }, if (s.hasDNT) dntCount++;
{ state: 'CA', tz: 'America/Los_Angeles' }, }
{ state: 'NY', tz: 'America/New_York' },
{ state: 'IL', tz: 'America/Chicago' },
];
for (const { state, tz } of geoTests) { console.log(` 10 sessions created:`);
const session = startSession(state, tz); console.log(` Browsers: ${JSON.stringify(browserCounts)}`);
const consistent = session.fingerprint.acceptLanguage.includes('en-US'); console.log(` Devices: ${JSON.stringify(deviceCounts)}`);
console.log(` ${state} (${tz}): Accept-Language=${session.fingerprint.acceptLanguage} ${consistent ? '✅' : '❌'}`); console.log(` DNT enabled: ${dntCount}/10 (expected ~30%)`);
// Test 6: Device distribution check (per workflow-12102025.md: 62/36/2)
console.log('\n[Test 6] Device Distribution (larger sample):');
const deviceSamples: string[] = [];
for (let i = 0; i < 100; i++) {
const session = startSession();
deviceSamples.push(session.fingerprint.deviceCategory);
endSession(); endSession();
} }
const mobileCount = deviceSamples.filter(d => d === 'mobile').length;
const desktopCount = deviceSamples.filter(d => d === 'desktop').length;
const tabletCount = deviceSamples.filter(d => d === 'tablet').length;
console.log(` 100 sessions (expected: 62% mobile, 36% desktop, 2% tablet):`);
console.log(` Mobile: ${mobileCount}%`);
console.log(` Desktop: ${desktopCount}%`);
console.log(` Tablet: ${tabletCount}%`);
console.log(` Distribution: ${Math.abs(mobileCount - 62) < 15 && Math.abs(desktopCount - 36) < 15 ? '✅ Reasonable' : '⚠️ Off target'}`);
console.log('\n' + '='.repeat(60)); console.log('\n' + '='.repeat(60));
console.log('TEST COMPLETE'); console.log('TEST COMPLETE');
console.log('='.repeat(60)); console.log('='.repeat(60));
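
The distribution check in Test 6 above hard-codes the 62/36 targets and a 15-point tolerance inline. The same comparison can be sketched as a small reusable helper. `withinTolerance` is a hypothetical name, and the tolerance is expressed in percentage points exactly as in the script.

```typescript
// Returns true when every expected category's observed share is within
// `tolerancePct` percentage points of its target share.
function withinTolerance(
  counts: Record<string, number>,
  expectedPct: Record<string, number>,
  tolerancePct: number
): boolean {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  if (total === 0) return false; // no samples, nothing to compare
  return Object.entries(expectedPct).every(([key, pct]) => {
    const actualPct = ((counts[key] ?? 0) / total) * 100;
    return Math.abs(actualPct - pct) < tolerancePct;
  });
}
```

With 100 samples a 15-point band is wide enough that random variation will rarely trip it; a tighter band would need a larger sample to avoid flaky results.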

File diff suppressed because it is too large

@@ -26,6 +26,8 @@ import {
PenetrationDataPoint, PenetrationDataPoint,
BrandMarketPosition, BrandMarketPosition,
BrandRecVsMedFootprint, BrandRecVsMedFootprint,
BrandPromotionalSummary,
BrandPromotionalEvent,
} from './types'; } from './types';
export class BrandPenetrationService { export class BrandPenetrationService {
@@ -44,16 +46,17 @@ export class BrandPenetrationService {
// Get current brand presence // Get current brand presence
const currentResult = await this.pool.query(` const currentResult = await this.pool.query(`
SELECT SELECT
sp.brand_name, sp.brand_name_raw AS brand_name,
COUNT(DISTINCT sp.dispensary_id) AS total_dispensaries, COUNT(DISTINCT sp.dispensary_id) AS total_dispensaries,
COUNT(*) AS total_skus, COUNT(*) AS total_skus,
ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus_per_dispensary, ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus_per_dispensary,
ARRAY_AGG(DISTINCT s.code) FILTER (WHERE s.code IS NOT NULL) AS states_present ARRAY_AGG(DISTINCT s.code) FILTER (WHERE s.code IS NOT NULL) AS states_present
FROM store_products sp FROM store_products sp
LEFT JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name = $1 LEFT JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY sp.brand_name GROUP BY sp.brand_name_raw
`, [brandName]); `, [brandName]);
if (currentResult.rows.length === 0) { if (currentResult.rows.length === 0) {
@@ -72,7 +75,7 @@ export class BrandPenetrationService {
DATE(sps.captured_at) AS date, DATE(sps.captured_at) AS date,
COUNT(DISTINCT sps.dispensary_id) AS dispensary_count COUNT(DISTINCT sps.dispensary_id) AS dispensary_count
FROM store_product_snapshots sps FROM store_product_snapshots sps
WHERE sps.brand_name = $1 WHERE sps.brand_name_raw = $1
AND sps.captured_at >= $2 AND sps.captured_at >= $2
AND sps.captured_at <= $3 AND sps.captured_at <= $3
AND sps.is_in_stock = TRUE AND sps.is_in_stock = TRUE
@@ -123,8 +126,9 @@ export class BrandPenetrationService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
COUNT(*) AS sku_count COUNT(*) AS sku_count
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name = $1 JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal
), ),
@@ -133,7 +137,8 @@ export class BrandPenetrationService {
s.code AS state_code, s.code AS state_code,
COUNT(DISTINCT sp.dispensary_id) AS total_dispensaries COUNT(DISTINCT sp.dispensary_id) AS total_dispensaries
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE sp.is_in_stock = TRUE WHERE sp.is_in_stock = TRUE
GROUP BY s.code GROUP BY s.code
) )
@@ -169,7 +174,7 @@ export class BrandPenetrationService {
let filters = ''; let filters = '';
if (options.category) { if (options.category) {
filters += ` AND sp.category = $${paramIdx}`; filters += ` AND sp.category_raw = $${paramIdx}`;
params.push(options.category); params.push(options.category);
paramIdx++; paramIdx++;
} }
@@ -183,31 +188,33 @@ export class BrandPenetrationService {
const result = await this.pool.query(` const result = await this.pool.query(`
WITH brand_metrics AS ( WITH brand_metrics AS (
SELECT SELECT
sp.brand_name, sp.brand_name_raw AS brand_name,
sp.category, sp.category_raw AS category,
s.code AS state_code, s.code AS state_code,
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
AVG(sp.price_rec) AS avg_price AVG(sp.price_rec) AS avg_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name = $1 JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
${filters} ${filters}
GROUP BY sp.brand_name, sp.category, s.code GROUP BY sp.brand_name_raw, sp.category_raw, s.code
), ),
category_totals AS ( category_totals AS (
SELECT SELECT
sp.category, sp.category_raw AS category,
s.code AS state_code, s.code AS state_code,
COUNT(*) AS total_skus, COUNT(*) AS total_skus,
AVG(sp.price_rec) AS category_avg_price AVG(sp.price_rec) AS category_avg_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE sp.is_in_stock = TRUE WHERE sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
GROUP BY sp.category, s.code GROUP BY sp.category_raw, s.code
) )
SELECT SELECT
bm.*, bm.*,
@@ -243,8 +250,9 @@ export class BrandPenetrationService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name = $1 JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND s.recreational_legal = TRUE AND s.recreational_legal = TRUE
), ),
@@ -255,8 +263,9 @@ export class BrandPenetrationService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus ROUND(COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT sp.dispensary_id), 0), 2) AS avg_skus
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name = $1 JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND s.medical_legal = TRUE AND s.medical_legal = TRUE
AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL) AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL)
@@ -311,23 +320,24 @@ export class BrandPenetrationService {
} }
if (category) { if (category) {
filters += ` AND sp.category = $${paramIdx}`; filters += ` AND sp.category_raw = $${paramIdx}`;
params.push(category); params.push(category);
paramIdx++; paramIdx++;
} }
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
sp.brand_name, sp.brand_name_raw AS brand_name,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT s.code) AS state_count COUNT(DISTINCT s.code) AS state_count
FROM store_products sp FROM store_products sp
LEFT JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.brand_name IS NOT NULL LEFT JOIN states s ON s.id = d.state_id
WHERE sp.brand_name_raw IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
${filters} ${filters}
GROUP BY sp.brand_name GROUP BY sp.brand_name_raw
ORDER BY dispensary_count DESC, sku_count DESC ORDER BY dispensary_count DESC, sku_count DESC
LIMIT $1 LIMIT $1
`, params); `, params);
@@ -358,23 +368,23 @@ export class BrandPenetrationService {
const result = await this.pool.query(` const result = await this.pool.query(`
WITH start_counts AS ( WITH start_counts AS (
SELECT SELECT
brand_name, brand_name_raw AS brand_name,
COUNT(DISTINCT dispensary_id) AS dispensary_count COUNT(DISTINCT dispensary_id) AS dispensary_count
FROM store_product_snapshots FROM store_product_snapshots
WHERE captured_at >= $1 AND captured_at < $1 + INTERVAL '1 day' WHERE captured_at >= $1 AND captured_at < $1 + INTERVAL '1 day'
AND brand_name IS NOT NULL AND brand_name_raw IS NOT NULL
AND is_in_stock = TRUE AND is_in_stock = TRUE
GROUP BY brand_name GROUP BY brand_name_raw
), ),
end_counts AS ( end_counts AS (
SELECT SELECT
brand_name, brand_name_raw AS brand_name,
COUNT(DISTINCT dispensary_id) AS dispensary_count COUNT(DISTINCT dispensary_id) AS dispensary_count
FROM store_product_snapshots FROM store_product_snapshots
WHERE captured_at >= $2 - INTERVAL '1 day' AND captured_at <= $2 WHERE captured_at >= $2 - INTERVAL '1 day' AND captured_at <= $2
AND brand_name IS NOT NULL AND brand_name_raw IS NOT NULL
AND is_in_stock = TRUE AND is_in_stock = TRUE
GROUP BY brand_name GROUP BY brand_name_raw
) )
SELECT SELECT
COALESCE(sc.brand_name, ec.brand_name) AS brand_name, COALESCE(sc.brand_name, ec.brand_name) AS brand_name,
@@ -401,6 +411,225 @@ export class BrandPenetrationService {
change_percent: row.change_percent ? parseFloat(row.change_percent) : 0, change_percent: row.change_percent ? parseFloat(row.change_percent) : 0,
})); }));
} }
/**
* Get a brand's promotional history.
*
* Tracks when products went on special, for how long, at what discount,
* and the estimated quantity sold during each promotion.
*/
async getBrandPromotionalHistory(
brandName: string,
options: { window?: TimeWindow; customRange?: DateRange; stateCode?: string; category?: string } = {}
): Promise<BrandPromotionalSummary> {
const { window = '90d', customRange, stateCode, category } = options;
const { start, end } = getDateRangeFromWindow(window, customRange);
// Build filters
const params: any[] = [brandName, start, end];
let paramIdx = 4;
let filters = '';
if (stateCode) {
filters += ` AND s.code = $${paramIdx}`;
params.push(stateCode);
paramIdx++;
}
if (category) {
filters += ` AND sp.category_raw = $${paramIdx}`;
params.push(category);
paramIdx++;
}
// Find promotional events by detecting when is_on_special transitions to TRUE
// and tracking until it transitions back to FALSE
const eventsResult = await this.pool.query(`
WITH snapshot_with_lag AS (
SELECT
sps.id,
sps.store_product_id,
sps.dispensary_id,
sps.brand_name_raw,
sps.name_raw,
sps.category_raw,
sps.is_on_special,
sps.price_rec,
sps.price_rec_special,
sps.stock_quantity,
sps.captured_at,
LAG(sps.is_on_special) OVER (
PARTITION BY sps.store_product_id
ORDER BY sps.captured_at
) AS prev_is_on_special,
LAG(sps.stock_quantity) OVER (
PARTITION BY sps.store_product_id
ORDER BY sps.captured_at
) AS prev_stock_quantity
FROM store_product_snapshots sps
JOIN store_products sp ON sp.id = sps.store_product_id
JOIN dispensaries dd ON dd.id = sp.dispensary_id
LEFT JOIN states s ON s.id = dd.state_id
WHERE sps.brand_name_raw = $1
AND sps.captured_at >= $2
AND sps.captured_at <= $3
${filters}
),
special_starts AS (
-- Find when specials START (transition from not-on-special to on-special)
SELECT
store_product_id,
dispensary_id,
name_raw,
category_raw,
captured_at AS special_start,
price_rec AS regular_price,
price_rec_special AS special_price,
stock_quantity AS quantity_at_start
FROM snapshot_with_lag
WHERE is_on_special = TRUE
AND (prev_is_on_special = FALSE OR prev_is_on_special IS NULL)
AND price_rec_special IS NOT NULL
AND price_rec IS NOT NULL
),
special_ends AS (
-- Find when specials END (transition from on-special to not-on-special)
SELECT
store_product_id,
captured_at AS special_end,
prev_stock_quantity AS quantity_at_end
FROM snapshot_with_lag
WHERE is_on_special = FALSE
AND prev_is_on_special = TRUE
),
matched_events AS (
SELECT
ss.store_product_id,
ss.dispensary_id,
ss.name_raw AS product_name,
ss.category_raw AS category,
ss.special_start,
se.special_end,
ss.regular_price,
ss.special_price,
ss.quantity_at_start,
COALESCE(se.quantity_at_end, ss.quantity_at_start) AS quantity_at_end
FROM special_starts ss
LEFT JOIN special_ends se ON se.store_product_id = ss.store_product_id
AND se.special_end > ss.special_start
AND se.special_end = (
SELECT MIN(se2.special_end)
FROM special_ends se2
WHERE se2.store_product_id = ss.store_product_id
AND se2.special_end > ss.special_start
)
)
SELECT
me.store_product_id,
me.dispensary_id,
d.name AS dispensary_name,
s.code AS state_code,
me.product_name,
me.category,
me.special_start,
me.special_end,
EXTRACT(DAY FROM COALESCE(me.special_end, NOW()) - me.special_start)::INT AS duration_days,
me.regular_price,
me.special_price,
ROUND(((me.regular_price - me.special_price) / NULLIF(me.regular_price, 0)) * 100, 1) AS discount_percent,
me.quantity_at_start,
me.quantity_at_end,
GREATEST(0, COALESCE(me.quantity_at_start, 0) - COALESCE(me.quantity_at_end, 0)) AS quantity_sold_estimate
FROM matched_events me
JOIN dispensaries d ON d.id = me.dispensary_id
LEFT JOIN states s ON s.id = d.state_id
ORDER BY me.special_start DESC
`, params);
const events: BrandPromotionalEvent[] = eventsResult.rows.map((row: any) => ({
product_name: row.product_name,
store_product_id: parseInt(row.store_product_id),
dispensary_id: parseInt(row.dispensary_id),
dispensary_name: row.dispensary_name,
state_code: row.state_code || 'Unknown',
category: row.category,
special_start: row.special_start.toISOString().split('T')[0],
special_end: row.special_end ? row.special_end.toISOString().split('T')[0] : null,
// Compare against null explicitly so that legitimate zero values
// (same-day specials, zero stock, zero units sold) are not reported as null.
duration_days: row.duration_days != null ? parseInt(row.duration_days) : null,
regular_price: parseFloat(row.regular_price) || 0,
special_price: parseFloat(row.special_price) || 0,
discount_percent: parseFloat(row.discount_percent) || 0,
quantity_at_start: row.quantity_at_start != null ? parseInt(row.quantity_at_start) : null,
quantity_at_end: row.quantity_at_end != null ? parseInt(row.quantity_at_end) : null,
quantity_sold_estimate: row.quantity_sold_estimate != null ? parseInt(row.quantity_sold_estimate) : null,
}));
// Calculate summary stats
const totalEvents = events.length;
const uniqueProducts = new Set(events.map(e => e.store_product_id)).size;
const uniqueDispensaries = new Set(events.map(e => e.dispensary_id)).size;
const uniqueStates = [...new Set(events.map(e => e.state_code))];
const avgDiscount = totalEvents > 0
? events.reduce((sum, e) => sum + e.discount_percent, 0) / totalEvents
: 0;
const durations = events.filter(e => e.duration_days !== null).map(e => e.duration_days!);
const avgDuration = durations.length > 0
? durations.reduce((sum, d) => sum + d, 0) / durations.length
: null;
const totalQuantitySold = events
.filter(e => e.quantity_sold_estimate !== null)
.reduce((sum, e) => sum + (e.quantity_sold_estimate || 0), 0);
// Calculate frequency
const windowDays = Math.ceil((end.getTime() - start.getTime()) / (1000 * 60 * 60 * 24));
const weeklyAvg = windowDays > 0 ? (totalEvents / windowDays) * 7 : 0;
const monthlyAvg = windowDays > 0 ? (totalEvents / windowDays) * 30 : 0;
// Group by category
const categoryMap = new Map<string, { count: number; discounts: number[]; quantity: number }>();
for (const event of events) {
const cat = event.category || 'Uncategorized';
if (!categoryMap.has(cat)) {
categoryMap.set(cat, { count: 0, discounts: [], quantity: 0 });
}
const entry = categoryMap.get(cat)!;
entry.count++;
entry.discounts.push(event.discount_percent);
if (event.quantity_sold_estimate !== null) {
entry.quantity += event.quantity_sold_estimate;
}
}
const byCategory = Array.from(categoryMap.entries()).map(([category, data]) => ({
category,
event_count: data.count,
avg_discount_percent: data.discounts.length > 0
? Math.round((data.discounts.reduce((a, b) => a + b, 0) / data.discounts.length) * 10) / 10
: 0,
quantity_sold_estimate: data.quantity > 0 ? data.quantity : null,
})).sort((a, b) => b.event_count - a.event_count);
return {
brand_name: brandName,
window,
total_promotional_events: totalEvents,
total_products_on_special: uniqueProducts,
total_dispensaries_with_specials: uniqueDispensaries,
states_with_specials: uniqueStates,
avg_discount_percent: Math.round(avgDiscount * 10) / 10,
avg_duration_days: avgDuration !== null ? Math.round(avgDuration * 10) / 10 : null,
total_quantity_sold_estimate: totalQuantitySold > 0 ? totalQuantitySold : null,
promotional_frequency: {
weekly_avg: Math.round(weeklyAvg * 10) / 10,
monthly_avg: Math.round(monthlyAvg * 10) / 10,
},
by_category: byCategory,
events,
};
}
} }
export default BrandPenetrationService; export default BrandPenetrationService;
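
Review note on `getBrandPromotionalHistory`: the `LAG(...)`-based transition detection in the query can be sketched in plain TypeScript for a single product. This is a minimal illustration under stated assumptions, not the service's code: the `Snapshot` and `PromoEvent` shapes and the `detectPromoEvents` helper are hypothetical names, the input is assumed to hold one product's snapshots only, and the price checks (`price_rec` and `price_rec_special` non-null) are left out for brevity.

```typescript
// Hypothetical shapes for illustration; they are not types from the service.
interface Snapshot {
  capturedAt: Date;
  isOnSpecial: boolean;
  stockQuantity: number | null;
}

interface PromoEvent {
  start: Date;
  end: Date | null; // null while the special is still running
  quantityAtStart: number | null;
  quantityAtEnd: number | null;
  quantitySoldEstimate: number;
}

// Walks one product's snapshots in time order (the ordering the LAG()
// window uses) and pairs each special start with the next end.
function detectPromoEvents(snapshots: Snapshot[]): PromoEvent[] {
  const ordered = [...snapshots].sort(
    (a, b) => a.capturedAt.getTime() - b.capturedAt.getTime()
  );
  const events: PromoEvent[] = [];
  let current: PromoEvent | null = null;
  let prev: Snapshot | null = null;

  for (const snap of ordered) {
    const wasOnSpecial: boolean = prev !== null && prev.isOnSpecial;

    if (snap.isOnSpecial && !wasOnSpecial) {
      // Start: the previous snapshot was off special, or there was none.
      current = {
        start: snap.capturedAt,
        end: null,
        quantityAtStart: snap.stockQuantity,
        // Until an end is seen, fall back to the starting quantity,
        // like COALESCE(quantity_at_end, quantity_at_start) in the query.
        quantityAtEnd: snap.stockQuantity,
        quantitySoldEstimate: 0,
      };
      events.push(current);
    } else if (!snap.isOnSpecial && wasOnSpecial && current !== null && prev !== null) {
      // End: stock on the last on-special snapshot is the closing quantity.
      current.end = snap.capturedAt;
      current.quantityAtEnd = prev.stockQuantity ?? current.quantityAtStart;
      current.quantitySoldEstimate = Math.max(
        0,
        (current.quantityAtStart ?? 0) - (current.quantityAtEnd ?? 0)
      );
      current = null;
    }
    prev = snap;
  }
  return events;
}
```

Calling `detectPromoEvents` with a product's snapshots returns one entry per special; an entry whose `end` is `null` corresponds to a promotion that was still active at the last snapshot, which the query treats as running until `NOW()` when it computes `duration_days`.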


@@ -43,14 +43,14 @@ export class CategoryAnalyticsService {
// Get current category metrics // Get current category metrics
const currentResult = await this.pool.query(` const currentResult = await this.pool.query(`
SELECT SELECT
sp.category, sp.category_raw,
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
AVG(sp.price_rec) AS avg_price AVG(sp.price_rec) AS avg_price
FROM store_products sp FROM store_products sp
WHERE sp.category = $1 WHERE sp.category_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY sp.category GROUP BY sp.category_raw
`, [category]); `, [category]);
if (currentResult.rows.length === 0) { if (currentResult.rows.length === 0) {
@@ -70,7 +70,7 @@ export class CategoryAnalyticsService {
COUNT(DISTINCT sps.dispensary_id) AS dispensary_count, COUNT(DISTINCT sps.dispensary_id) AS dispensary_count,
AVG(sps.price_rec) AS avg_price AVG(sps.price_rec) AS avg_price
FROM store_product_snapshots sps FROM store_product_snapshots sps
WHERE sps.category = $1 WHERE sps.category_raw = $1
AND sps.captured_at >= $2 AND sps.captured_at >= $2
AND sps.captured_at <= $3 AND sps.captured_at <= $3
AND sps.is_in_stock = TRUE AND sps.is_in_stock = TRUE
@@ -111,8 +111,9 @@ export class CategoryAnalyticsService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
AVG(sp.price_rec) AS avg_price AVG(sp.price_rec) AS avg_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.category = $1 JOIN states s ON s.id = d.state_id
WHERE sp.category_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY s.code, s.name, s.recreational_legal GROUP BY s.code, s.name, s.recreational_legal
ORDER BY sku_count DESC ORDER BY sku_count DESC
@@ -154,24 +155,25 @@ export class CategoryAnalyticsService {
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
sp.category, sp.category_raw,
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
COUNT(DISTINCT sp.brand_name) AS brand_count, COUNT(DISTINCT sp.brand_name_raw) AS brand_count,
AVG(sp.price_rec) AS avg_price, AVG(sp.price_rec) AS avg_price,
COUNT(DISTINCT s.code) AS state_count COUNT(DISTINCT s.code) AS state_count
FROM store_products sp FROM store_products sp
LEFT JOIN states s ON s.id = sp.state_id LEFT JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.category IS NOT NULL LEFT JOIN states s ON s.id = d.state_id
WHERE sp.category_raw IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
${stateFilter} ${stateFilter}
GROUP BY sp.category GROUP BY sp.category_raw
ORDER BY sku_count DESC ORDER BY sku_count DESC
LIMIT $1 LIMIT $1
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
sku_count: parseInt(row.sku_count), sku_count: parseInt(row.sku_count),
dispensary_count: parseInt(row.dispensary_count), dispensary_count: parseInt(row.dispensary_count),
brand_count: parseInt(row.brand_count), brand_count: parseInt(row.brand_count),
@@ -188,14 +190,14 @@ export class CategoryAnalyticsService {
let categoryFilter = ''; let categoryFilter = '';
if (category) { if (category) {
categoryFilter = 'WHERE sp.category = $1'; categoryFilter = 'WHERE sp.category_raw = $1';
params.push(category); params.push(category);
} }
const result = await this.pool.query(` const result = await this.pool.query(`
WITH category_stats AS ( WITH category_stats AS (
SELECT SELECT
sp.category, sp.category_raw,
CASE WHEN s.recreational_legal = TRUE THEN 'recreational' ELSE 'medical_only' END AS legal_type, CASE WHEN s.recreational_legal = TRUE THEN 'recreational' ELSE 'medical_only' END AS legal_type,
COUNT(DISTINCT s.code) AS state_count, COUNT(DISTINCT s.code) AS state_count,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
@@ -203,13 +205,14 @@ export class CategoryAnalyticsService {
AVG(sp.price_rec) AS avg_price, AVG(sp.price_rec) AS avg_price,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
${categoryFilter} ${categoryFilter}
${category ? 'AND' : 'WHERE'} sp.category IS NOT NULL ${category ? 'AND' : 'WHERE'} sp.category_raw IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE) AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE)
GROUP BY sp.category, CASE WHEN s.recreational_legal = TRUE THEN 'recreational' ELSE 'medical_only' END GROUP BY sp.category_raw, CASE WHEN s.recreational_legal = TRUE THEN 'recreational' ELSE 'medical_only' END
), ),
rec_stats AS ( rec_stats AS (
SELECT * FROM category_stats WHERE legal_type = 'recreational' SELECT * FROM category_stats WHERE legal_type = 'recreational'
@@ -218,7 +221,7 @@ export class CategoryAnalyticsService {
SELECT * FROM category_stats WHERE legal_type = 'medical_only' SELECT * FROM category_stats WHERE legal_type = 'medical_only'
) )
SELECT SELECT
COALESCE(r.category, m.category) AS category, COALESCE(r.category_raw, m.category_raw) AS category,
r.state_count AS rec_state_count, r.state_count AS rec_state_count,
r.dispensary_count AS rec_dispensary_count, r.dispensary_count AS rec_dispensary_count,
r.sku_count AS rec_sku_count, r.sku_count AS rec_sku_count,
@@ -235,7 +238,7 @@ export class CategoryAnalyticsService {
ELSE NULL ELSE NULL
END AS price_diff_percent END AS price_diff_percent
FROM rec_stats r FROM rec_stats r
FULL OUTER JOIN med_stats m ON r.category = m.category FULL OUTER JOIN med_stats m ON r.category_raw = m.category_raw
ORDER BY COALESCE(r.sku_count, 0) + COALESCE(m.sku_count, 0) DESC ORDER BY COALESCE(r.sku_count, 0) + COALESCE(m.sku_count, 0) DESC
`, params); `, params);
@@ -282,7 +285,7 @@ export class CategoryAnalyticsService {
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT sps.dispensary_id) AS dispensary_count COUNT(DISTINCT sps.dispensary_id) AS dispensary_count
FROM store_product_snapshots sps FROM store_product_snapshots sps
WHERE sps.category = $1 WHERE sps.category_raw = $1
AND sps.captured_at >= $2 AND sps.captured_at >= $2
AND sps.captured_at <= $3 AND sps.captured_at <= $3
AND sps.is_in_stock = TRUE AND sps.is_in_stock = TRUE
@@ -335,31 +338,33 @@ export class CategoryAnalyticsService {
WITH category_total AS ( WITH category_total AS (
SELECT COUNT(*) AS total SELECT COUNT(*) AS total
FROM store_products sp FROM store_products sp
LEFT JOIN states s ON s.id = sp.state_id LEFT JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.category = $1 LEFT JOIN states s ON s.id = d.state_id
WHERE sp.category_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.brand_name IS NOT NULL AND sp.brand_name_raw IS NOT NULL
${stateFilter} ${stateFilter}
) )
SELECT SELECT
sp.brand_name, sp.brand_name_raw,
COUNT(*) AS sku_count, COUNT(*) AS sku_count,
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count, COUNT(DISTINCT sp.dispensary_id) AS dispensary_count,
AVG(sp.price_rec) AS avg_price, AVG(sp.price_rec) AS avg_price,
ROUND(COUNT(*)::NUMERIC * 100 / NULLIF((SELECT total FROM category_total), 0), 2) AS category_share_percent ROUND(COUNT(*)::NUMERIC * 100 / NULLIF((SELECT total FROM category_total), 0), 2) AS category_share_percent
FROM store_products sp FROM store_products sp
LEFT JOIN states s ON s.id = sp.state_id LEFT JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE sp.category = $1 LEFT JOIN states s ON s.id = d.state_id
WHERE sp.category_raw = $1
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.brand_name IS NOT NULL AND sp.brand_name_raw IS NOT NULL
${stateFilter} ${stateFilter}
GROUP BY sp.brand_name GROUP BY sp.brand_name_raw
ORDER BY sku_count DESC ORDER BY sku_count DESC
LIMIT $2 LIMIT $2
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
brand_name: row.brand_name, brand_name: row.brand_name_raw,
sku_count: parseInt(row.sku_count), sku_count: parseInt(row.sku_count),
dispensary_count: parseInt(row.dispensary_count), dispensary_count: parseInt(row.dispensary_count),
avg_price: row.avg_price ? parseFloat(row.avg_price) : null, avg_price: row.avg_price ? parseFloat(row.avg_price) : null,
@@ -421,7 +426,7 @@ export class CategoryAnalyticsService {
`, [start, end, limit]); `, [start, end, limit]);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
start_sku_count: parseInt(row.start_sku_count), start_sku_count: parseInt(row.start_sku_count),
end_sku_count: parseInt(row.end_sku_count), end_sku_count: parseInt(row.end_sku_count),
growth: parseInt(row.growth), growth: parseInt(row.growth),


@@ -43,9 +43,9 @@ export class PriceAnalyticsService {
const productResult = await this.pool.query(` const productResult = await this.pool.query(`
SELECT SELECT
sp.id, sp.id,
sp.name, sp.name_raw,
sp.brand_name, sp.brand_name_raw,
sp.category, sp.category_raw,
sp.dispensary_id, sp.dispensary_id,
sp.price_rec, sp.price_rec,
sp.price_med, sp.price_med,
@@ -53,7 +53,7 @@ export class PriceAnalyticsService {
s.code AS state_code s.code AS state_code
FROM store_products sp FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id JOIN dispensaries d ON d.id = sp.dispensary_id
LEFT JOIN states s ON s.id = sp.state_id LEFT JOIN states s ON s.id = d.state_id
WHERE sp.id = $1 WHERE sp.id = $1
`, [storeProductId]); `, [storeProductId]);
@@ -133,7 +133,7 @@ export class PriceAnalyticsService {
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
sp.category, sp.category_raw,
s.code AS state_code, s.code AS state_code,
s.name AS state_name, s.name AS state_name,
CASE CASE
@@ -148,18 +148,18 @@ export class PriceAnalyticsService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count COUNT(DISTINCT sp.dispensary_id) AS dispensary_count
FROM store_products sp FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = sp.state_id JOIN states s ON s.id = d.state_id
WHERE sp.category = $1 WHERE sp.category_raw = $1
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE) AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE)
${stateFilter} ${stateFilter}
GROUP BY sp.category, s.code, s.name, s.recreational_legal GROUP BY sp.category_raw, s.code, s.name, s.recreational_legal
ORDER BY state_code ORDER BY state_code
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
state_code: row.state_code, state_code: row.state_code,
state_name: row.state_name, state_name: row.state_name,
legal_type: row.legal_type, legal_type: row.legal_type,
@@ -189,7 +189,7 @@ export class PriceAnalyticsService {
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
sp.brand_name AS category, sp.brand_name_raw AS category,
s.code AS state_code, s.code AS state_code,
s.name AS state_name, s.name AS state_name,
CASE CASE
@@ -204,18 +204,18 @@ export class PriceAnalyticsService {
COUNT(DISTINCT sp.dispensary_id) AS dispensary_count COUNT(DISTINCT sp.dispensary_id) AS dispensary_count
FROM store_products sp FROM store_products sp
JOIN dispensaries d ON d.id = sp.dispensary_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = sp.state_id JOIN states s ON s.id = d.state_id
WHERE sp.brand_name = $1 WHERE sp.brand_name_raw = $1
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE) AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE)
${stateFilter} ${stateFilter}
GROUP BY sp.brand_name, s.code, s.name, s.recreational_legal GROUP BY sp.brand_name_raw, s.code, s.name, s.recreational_legal
ORDER BY state_code ORDER BY state_code
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category,
state_code: row.state_code, state_code: row.state_code,
state_name: row.state_name, state_name: row.state_name,
legal_type: row.legal_type, legal_type: row.legal_type,
@@ -254,7 +254,7 @@ export class PriceAnalyticsService {
} }
if (category) { if (category) {
filters += ` AND sp.category = $${paramIdx}`; filters += ` AND sp.category_raw = $${paramIdx}`;
params.push(category); params.push(category);
paramIdx++; paramIdx++;
} }
@@ -288,15 +288,16 @@ export class PriceAnalyticsService {
) )
SELECT SELECT
v.store_product_id, v.store_product_id,
sp.name AS product_name, sp.name_raw AS product_name,
sp.brand_name, sp.brand_name_raw,
v.change_count, v.change_count,
v.avg_change_pct, v.avg_change_pct,
v.max_change_pct, v.max_change_pct,
v.last_change_at v.last_change_at
FROM volatility v FROM volatility v
JOIN store_products sp ON sp.id = v.store_product_id JOIN store_products sp ON sp.id = v.store_product_id
LEFT JOIN states s ON s.id = sp.state_id LEFT JOIN dispensaries d ON d.id = sp.dispensary_id
LEFT JOIN states s ON s.id = d.state_id
WHERE 1=1 ${filters} WHERE 1=1 ${filters}
ORDER BY v.change_count DESC, v.avg_change_pct DESC ORDER BY v.change_count DESC, v.avg_change_pct DESC
LIMIT $3 LIMIT $3
@@ -305,7 +306,7 @@ export class PriceAnalyticsService {
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
store_product_id: row.store_product_id, store_product_id: row.store_product_id,
product_name: row.product_name, product_name: row.product_name,
brand_name: row.brand_name, brand_name: row.brand_name_raw,
change_count: parseInt(row.change_count), change_count: parseInt(row.change_count),
avg_change_percent: row.avg_change_pct ? parseFloat(row.avg_change_pct) : 0, avg_change_percent: row.avg_change_pct ? parseFloat(row.avg_change_pct) : 0,
max_change_percent: row.max_change_pct ? parseFloat(row.max_change_pct) : 0, max_change_percent: row.max_change_pct ? parseFloat(row.max_change_pct) : 0,
@@ -327,13 +328,13 @@ export class PriceAnalyticsService {
let categoryFilter = ''; let categoryFilter = '';
if (category) { if (category) {
categoryFilter = 'WHERE sp.category = $1'; categoryFilter = 'WHERE sp.category_raw = $1';
params.push(category); params.push(category);
} }
const result = await this.pool.query(` const result = await this.pool.query(`
SELECT SELECT
sp.category, sp.category_raw,
AVG(sp.price_rec) FILTER (WHERE s.recreational_legal = TRUE) AS rec_avg, AVG(sp.price_rec) FILTER (WHERE s.recreational_legal = TRUE) AS rec_avg,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)
FILTER (WHERE s.recreational_legal = TRUE) AS rec_median, FILTER (WHERE s.recreational_legal = TRUE) AS rec_median,
@@ -343,17 +344,18 @@ export class PriceAnalyticsService {
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)
FILTER (WHERE s.medical_legal = TRUE AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL)) AS med_median FILTER (WHERE s.medical_legal = TRUE AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL)) AS med_median
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
${categoryFilter} ${categoryFilter}
${category ? 'AND' : 'WHERE'} sp.price_rec IS NOT NULL ${category ? 'AND' : 'WHERE'} sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
GROUP BY sp.category GROUP BY sp.category_raw
ORDER BY sp.category ORDER BY sp.category_raw
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
rec_avg: row.rec_avg ? parseFloat(row.rec_avg) : null, rec_avg: row.rec_avg ? parseFloat(row.rec_avg) : null,
rec_median: row.rec_median ? parseFloat(row.rec_median) : null, rec_median: row.rec_median ? parseFloat(row.rec_median) : null,
med_avg: row.med_avg ? parseFloat(row.med_avg) : null, med_avg: row.med_avg ? parseFloat(row.med_avg) : null,


@@ -108,14 +108,14 @@ export class StateAnalyticsService {
SELECT SELECT
COUNT(DISTINCT d.id) AS dispensary_count, COUNT(DISTINCT d.id) AS dispensary_count,
COUNT(DISTINCT sp.id) AS product_count, COUNT(DISTINCT sp.id) AS product_count,
COUNT(DISTINCT sp.brand_name) FILTER (WHERE sp.brand_name IS NOT NULL) AS brand_count, COUNT(DISTINCT sp.brand_name_raw) FILTER (WHERE sp.brand_name_raw IS NOT NULL) AS brand_count,
COUNT(DISTINCT sp.category) FILTER (WHERE sp.category IS NOT NULL) AS category_count, COUNT(DISTINCT sp.category_raw) FILTER (WHERE sp.category_raw IS NOT NULL) AS category_count,
COUNT(sps.id) AS snapshot_count, COUNT(sps.id) AS snapshot_count,
MAX(sps.captured_at) AS last_crawl_at MAX(sps.captured_at) AS last_crawl_at
FROM states s FROM states s
LEFT JOIN dispensaries d ON d.state_id = s.id LEFT JOIN dispensaries d ON d.state_id = s.id
LEFT JOIN store_products sp ON sp.state_id = s.id AND sp.is_in_stock = TRUE LEFT JOIN store_products sp ON sp.dispensary_id = d.id AND sp.is_in_stock = TRUE
LEFT JOIN store_product_snapshots sps ON sps.state_id = s.id LEFT JOIN store_product_snapshots sps ON sps.dispensary_id = d.id
WHERE s.code = $1 WHERE s.code = $1
`, [stateCode]); `, [stateCode]);
@@ -129,7 +129,8 @@ export class StateAnalyticsService {
MIN(price_rec) AS min_price, MIN(price_rec) AS min_price,
MAX(price_rec) AS max_price MAX(price_rec) AS max_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE s.code = $1 WHERE s.code = $1
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
@@ -140,14 +141,15 @@ export class StateAnalyticsService {
// Get top categories // Get top categories
const topCategoriesResult = await this.pool.query(` const topCategoriesResult = await this.pool.query(`
SELECT SELECT
sp.category, sp.category_raw,
COUNT(*) AS count COUNT(*) AS count
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE s.code = $1 WHERE s.code = $1
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY sp.category GROUP BY sp.category_raw
ORDER BY count DESC ORDER BY count DESC
LIMIT 10 LIMIT 10
`, [stateCode]); `, [stateCode]);
@@ -155,14 +157,15 @@ export class StateAnalyticsService {
// Get top brands // Get top brands
const topBrandsResult = await this.pool.query(` const topBrandsResult = await this.pool.query(`
SELECT SELECT
sp.brand_name AS brand, sp.brand_name_raw AS brand,
COUNT(*) AS count COUNT(*) AS count
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE s.code = $1 WHERE s.code = $1
AND sp.brand_name IS NOT NULL AND sp.brand_name_raw IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
GROUP BY sp.brand_name GROUP BY sp.brand_name_raw
ORDER BY count DESC ORDER BY count DESC
LIMIT 10 LIMIT 10
`, [stateCode]); `, [stateCode]);
@@ -191,7 +194,7 @@ export class StateAnalyticsService {
max_price: pricing.max_price ? parseFloat(pricing.max_price) : null, max_price: pricing.max_price ? parseFloat(pricing.max_price) : null,
}, },
top_categories: topCategoriesResult.rows.map((row: any) => ({ top_categories: topCategoriesResult.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
count: parseInt(row.count), count: parseInt(row.count),
})), })),
top_brands: topBrandsResult.rows.map((row: any) => ({ top_brands: topBrandsResult.rows.map((row: any) => ({
@@ -215,8 +218,8 @@ export class StateAnalyticsService {
COUNT(sps.id) AS snapshot_count COUNT(sps.id) AS snapshot_count
FROM states s FROM states s
LEFT JOIN dispensaries d ON d.state_id = s.id LEFT JOIN dispensaries d ON d.state_id = s.id
LEFT JOIN store_products sp ON sp.state_id = s.id AND sp.is_in_stock = TRUE LEFT JOIN store_products sp ON sp.dispensary_id = d.id AND sp.is_in_stock = TRUE
LEFT JOIN store_product_snapshots sps ON sps.state_id = s.id LEFT JOIN store_product_snapshots sps ON sps.dispensary_id = d.id
WHERE s.recreational_legal = TRUE WHERE s.recreational_legal = TRUE
GROUP BY s.code, s.name GROUP BY s.code, s.name
ORDER BY dispensary_count DESC ORDER BY dispensary_count DESC
@@ -232,8 +235,8 @@ export class StateAnalyticsService {
COUNT(sps.id) AS snapshot_count COUNT(sps.id) AS snapshot_count
FROM states s FROM states s
LEFT JOIN dispensaries d ON d.state_id = s.id LEFT JOIN dispensaries d ON d.state_id = s.id
LEFT JOIN store_products sp ON sp.state_id = s.id AND sp.is_in_stock = TRUE LEFT JOIN store_products sp ON sp.dispensary_id = d.id AND sp.is_in_stock = TRUE
LEFT JOIN store_product_snapshots sps ON sps.state_id = s.id LEFT JOIN store_product_snapshots sps ON sps.dispensary_id = d.id
WHERE s.medical_legal = TRUE WHERE s.medical_legal = TRUE
AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL) AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL)
GROUP BY s.code, s.name GROUP BY s.code, s.name
@@ -295,46 +298,48 @@ export class StateAnalyticsService {
let groupBy = 'NULL'; let groupBy = 'NULL';
if (category) { if (category) {
categoryFilter = 'AND sp.category = $1'; categoryFilter = 'AND sp.category_raw = $1';
params.push(category); params.push(category);
groupBy = 'sp.category'; groupBy = 'sp.category_raw';
} else { } else {
groupBy = 'sp.category'; groupBy = 'sp.category_raw';
} }
const result = await this.pool.query(` const result = await this.pool.query(`
WITH rec_prices AS ( WITH rec_prices AS (
SELECT SELECT
${category ? 'sp.category' : 'sp.category'}, ${category ? 'sp.category_raw' : 'sp.category_raw'},
COUNT(DISTINCT s.code) AS state_count, COUNT(DISTINCT s.code) AS state_count,
COUNT(*) AS product_count, COUNT(*) AS product_count,
AVG(sp.price_rec) AS avg_price, AVG(sp.price_rec) AS avg_price,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE s.recreational_legal = TRUE WHERE s.recreational_legal = TRUE
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
${categoryFilter} ${categoryFilter}
GROUP BY sp.category GROUP BY sp.category_raw
), ),
med_prices AS ( med_prices AS (
SELECT SELECT
${category ? 'sp.category' : 'sp.category'}, ${category ? 'sp.category_raw' : 'sp.category_raw'},
COUNT(DISTINCT s.code) AS state_count, COUNT(DISTINCT s.code) AS state_count,
COUNT(*) AS product_count, COUNT(*) AS product_count,
AVG(sp.price_rec) AS avg_price, AVG(sp.price_rec) AS avg_price,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price
FROM store_products sp FROM store_products sp
JOIN states s ON s.id = sp.state_id JOIN dispensaries d ON d.id = sp.dispensary_id
JOIN states s ON s.id = d.state_id
WHERE s.medical_legal = TRUE WHERE s.medical_legal = TRUE
AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL) AND (s.recreational_legal = FALSE OR s.recreational_legal IS NULL)
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
${categoryFilter} ${categoryFilter}
GROUP BY sp.category GROUP BY sp.category_raw
) )
SELECT SELECT
COALESCE(r.category, m.category) AS category, COALESCE(r.category, m.category) AS category,
@@ -357,7 +362,7 @@ export class StateAnalyticsService {
`, params); `, params);
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
recreational: { recreational: {
state_count: parseInt(row.rec_state_count) || 0, state_count: parseInt(row.rec_state_count) || 0,
product_count: parseInt(row.rec_product_count) || 0, product_count: parseInt(row.rec_product_count) || 0,
@@ -395,12 +400,12 @@ export class StateAnalyticsService {
COALESCE(s.medical_legal, FALSE) AS medical_legal, COALESCE(s.medical_legal, FALSE) AS medical_legal,
COUNT(DISTINCT d.id) AS dispensary_count, COUNT(DISTINCT d.id) AS dispensary_count,
COUNT(DISTINCT sp.id) AS product_count, COUNT(DISTINCT sp.id) AS product_count,
COUNT(DISTINCT sp.brand_name) FILTER (WHERE sp.brand_name IS NOT NULL) AS brand_count, COUNT(DISTINCT sp.brand_name_raw) FILTER (WHERE sp.brand_name_raw IS NOT NULL) AS brand_count,
MAX(sps.captured_at) AS last_crawl_at MAX(sps.captured_at) AS last_crawl_at
FROM states s FROM states s
LEFT JOIN dispensaries d ON d.state_id = s.id LEFT JOIN dispensaries d ON d.state_id = s.id
LEFT JOIN store_products sp ON sp.state_id = s.id AND sp.is_in_stock = TRUE LEFT JOIN store_products sp ON sp.dispensary_id = d.id AND sp.is_in_stock = TRUE
LEFT JOIN store_product_snapshots sps ON sps.state_id = s.id LEFT JOIN store_product_snapshots sps ON sps.dispensary_id = d.id
GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal
ORDER BY dispensary_count DESC, s.name ORDER BY dispensary_count DESC, s.name
`); `);
@@ -451,8 +456,8 @@ export class StateAnalyticsService {
END AS gap_reason END AS gap_reason
FROM states s FROM states s
LEFT JOIN dispensaries d ON d.state_id = s.id LEFT JOIN dispensaries d ON d.state_id = s.id
LEFT JOIN store_products sp ON sp.state_id = s.id AND sp.is_in_stock = TRUE LEFT JOIN store_products sp ON sp.dispensary_id = d.id AND sp.is_in_stock = TRUE
LEFT JOIN store_product_snapshots sps ON sps.state_id = s.id LEFT JOIN store_product_snapshots sps ON sps.dispensary_id = d.id
WHERE s.recreational_legal = TRUE OR s.medical_legal = TRUE WHERE s.recreational_legal = TRUE OR s.medical_legal = TRUE
GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal GROUP BY s.code, s.name, s.recreational_legal, s.medical_legal
HAVING COUNT(DISTINCT d.id) = 0 HAVING COUNT(DISTINCT d.id) = 0
@@ -499,7 +504,8 @@ export class StateAnalyticsService {
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec) AS median_price,
COUNT(*) AS product_count COUNT(*) AS product_count
FROM states s FROM states s
JOIN store_products sp ON sp.state_id = s.id JOIN dispensaries d ON d.state_id = s.id
JOIN store_products sp ON sp.dispensary_id = d.id
WHERE sp.price_rec IS NOT NULL WHERE sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE) AND (s.recreational_legal = TRUE OR s.medical_legal = TRUE)
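The hunks above stop reading `state_id` from `store_products` and instead reach `states` through `dispensaries`. As a rough in-memory illustration of that join path (not the service's code), the sketch below resolves a state's top brands from plain arrays. The interfaces and the `topBrandsForState` helper are hypothetical names introduced here; only the column names come from the diff.

```typescript
// Hypothetical row shapes; only the column names are taken from the diff.
interface State { id: number; code: string }
interface Dispensary { id: number; state_id: number }
interface StoreProduct {
  dispensary_id: number;
  brand_name_raw: string | null;
  is_in_stock: boolean;
}

// In-memory equivalent of:
//   store_products sp JOIN dispensaries d ON d.id = sp.dispensary_id
//                     JOIN states s ON s.id = d.state_id
function topBrandsForState(
  stateCode: string,
  states: State[],
  dispensaries: Dispensary[],
  products: StoreProduct[],
  limit = 10,
): Array<{ brand: string; count: number }> {
  const state = states.find((s) => s.code === stateCode);
  if (!state) return [];

  // Dispensaries are the only link between a product and its state.
  const dispensaryIds = new Set(
    dispensaries.filter((d) => d.state_id === state.id).map((d) => d.id),
  );

  const counts = new Map<string, number>();
  for (const p of products) {
    if (!dispensaryIds.has(p.dispensary_id)) continue;
    // Mirrors: brand_name_raw IS NOT NULL AND is_in_stock = TRUE
    if (p.brand_name_raw === null || !p.is_in_stock) continue;
    counts.set(p.brand_name_raw, (counts.get(p.brand_name_raw) ?? 0) + 1);
  }

  // Mirrors: GROUP BY brand_name_raw ORDER BY count DESC LIMIT n
  return Array.from(counts.entries())
    .map(([brand, count]) => ({ brand, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
}
```

Note that the result key is `brand`, matching the `AS brand` alias in the query, not the underlying `brand_name_raw` column.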


@@ -89,22 +89,22 @@ export class StoreAnalyticsService {
// Get brands added/dropped // Get brands added/dropped
const brandsResult = await this.pool.query(` const brandsResult = await this.pool.query(`
WITH start_brands AS ( WITH start_brands AS (
SELECT DISTINCT brand_name SELECT DISTINCT brand_name_raw
FROM store_product_snapshots FROM store_product_snapshots
WHERE dispensary_id = $1 WHERE dispensary_id = $1
AND captured_at >= $2 AND captured_at < $2 + INTERVAL '1 day' AND captured_at >= $2::timestamp AND captured_at < $2::timestamp + INTERVAL '1 day'
AND brand_name IS NOT NULL AND brand_name_raw IS NOT NULL
), ),
end_brands AS ( end_brands AS (
SELECT DISTINCT brand_name SELECT DISTINCT brand_name_raw
FROM store_product_snapshots FROM store_product_snapshots
WHERE dispensary_id = $1 WHERE dispensary_id = $1
AND captured_at >= $3 - INTERVAL '1 day' AND captured_at <= $3 AND captured_at >= $3::timestamp - INTERVAL '1 day' AND captured_at <= $3::timestamp
AND brand_name IS NOT NULL AND brand_name_raw IS NOT NULL
) )
SELECT SELECT
ARRAY(SELECT brand_name FROM end_brands EXCEPT SELECT brand_name FROM start_brands) AS added, ARRAY(SELECT brand_name_raw FROM end_brands EXCEPT SELECT brand_name_raw FROM start_brands) AS added,
ARRAY(SELECT brand_name FROM start_brands EXCEPT SELECT brand_name FROM end_brands) AS dropped ARRAY(SELECT brand_name_raw FROM start_brands EXCEPT SELECT brand_name_raw FROM end_brands) AS dropped
`, [dispensaryId, start, end]); `, [dispensaryId, start, end]);
const brands = brandsResult.rows[0] || { added: [], dropped: [] }; const brands = brandsResult.rows[0] || { added: [], dropped: [] };
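The `added` / `dropped` arrays in the query above are two `EXCEPT` set differences over the distinct, non-null brand names seen at the start and end of the window. A minimal sketch of the same logic on plain arrays, assuming a hypothetical `diffBrands` helper that is not part of the service:

```typescript
// Hypothetical helper: in-memory equivalent of the two EXCEPT subqueries.
function diffBrands(
  startBrands: Array<string | null>,
  endBrands: Array<string | null>,
): { added: string[]; dropped: string[] } {
  // Mirrors: SELECT DISTINCT brand_name_raw ... WHERE brand_name_raw IS NOT NULL
  const start = new Set(startBrands.filter((b): b is string => b !== null));
  const end = new Set(endBrands.filter((b): b is string => b !== null));

  return {
    // end_brands EXCEPT start_brands
    added: Array.from(end).filter((b) => !start.has(b)),
    // start_brands EXCEPT end_brands
    dropped: Array.from(start).filter((b) => !end.has(b)),
  };
}
```

Because the comparison is on the raw brand string, two spellings of the same brand would show up as one brand dropped and another added.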
@@ -184,9 +184,9 @@ export class StoreAnalyticsService {
-- Products added -- Products added
SELECT SELECT
sp.id AS store_product_id, sp.id AS store_product_id,
sp.name AS product_name, sp.name_raw AS product_name,
sp.brand_name, sp.brand_name_raw,
sp.category, sp.category_raw,
'added' AS event_type, 'added' AS event_type,
sp.first_seen_at AS event_date, sp.first_seen_at AS event_date,
NULL::TEXT AS old_value, NULL::TEXT AS old_value,
@@ -201,9 +201,9 @@ export class StoreAnalyticsService {
-- Stock in/out from snapshots -- Stock in/out from snapshots
SELECT SELECT
sps.store_product_id, sps.store_product_id,
sp.name AS product_name, sp.name_raw AS product_name,
sp.brand_name, sp.brand_name_raw,
sp.category, sp.category_raw,
CASE CASE
WHEN sps.is_in_stock = TRUE AND LAG(sps.is_in_stock) OVER w = FALSE THEN 'stock_in' WHEN sps.is_in_stock = TRUE AND LAG(sps.is_in_stock) OVER w = FALSE THEN 'stock_in'
WHEN sps.is_in_stock = FALSE AND LAG(sps.is_in_stock) OVER w = TRUE THEN 'stock_out' WHEN sps.is_in_stock = FALSE AND LAG(sps.is_in_stock) OVER w = TRUE THEN 'stock_out'
@@ -224,9 +224,9 @@ export class StoreAnalyticsService {
-- Price changes from snapshots -- Price changes from snapshots
SELECT SELECT
sps.store_product_id, sps.store_product_id,
sp.name AS product_name, sp.name_raw AS product_name,
sp.brand_name, sp.brand_name_raw,
sp.category, sp.category_raw,
'price_change' AS event_type, 'price_change' AS event_type,
sps.captured_at AS event_date, sps.captured_at AS event_date,
LAG(sps.price_rec::TEXT) OVER w AS old_value, LAG(sps.price_rec::TEXT) OVER w AS old_value,
@@ -250,8 +250,8 @@ export class StoreAnalyticsService {
return result.rows.map((row: any) => ({ return result.rows.map((row: any) => ({
store_product_id: row.store_product_id, store_product_id: row.store_product_id,
product_name: row.product_name, product_name: row.product_name,
brand_name: row.brand_name, brand_name: row.brand_name_raw,
category: row.category, category: row.category_raw,
event_type: row.event_type, event_type: row.event_type,
event_date: row.event_date ? row.event_date.toISOString() : null, event_date: row.event_date ? row.event_date.toISOString() : null,
old_value: row.old_value, old_value: row.old_value,
@@ -259,6 +259,122 @@ export class StoreAnalyticsService {
})); }));
} }
/**
* Get stock quantity changes for a store between consecutive snapshots.
* Decreases approximate sales; increases approximate restocks.
*
* @param options.direction - 'decrease' for likely sales, 'increase' for restocks, 'all' for both
*/
async getQuantityChanges(
dispensaryId: number,
options: {
window?: TimeWindow;
customRange?: DateRange;
direction?: 'increase' | 'decrease' | 'all';
limit?: number;
} = {}
): Promise<{
dispensary_id: number;
window: TimeWindow;
direction: string;
total_changes: number;
total_units_decreased: number;
total_units_increased: number;
changes: Array<{
store_product_id: number;
product_name: string;
brand_name: string | null;
category: string | null;
old_quantity: number;
new_quantity: number;
quantity_delta: number;
direction: 'increase' | 'decrease';
captured_at: string;
}>;
}> {
const { window = '7d', customRange, direction = 'all', limit = 100 } = options;
const { start, end } = getDateRangeFromWindow(window, customRange);
// Build direction filter
let directionFilter = '';
if (direction === 'decrease') {
directionFilter = 'AND qty_delta < 0';
} else if (direction === 'increase') {
directionFilter = 'AND qty_delta > 0';
}
const result = await this.pool.query(`
WITH qty_changes AS (
SELECT
sps.store_product_id,
sp.name_raw AS product_name,
sp.brand_name_raw AS brand_name,
sp.category_raw AS category,
LAG(sps.stock_quantity) OVER w AS old_quantity,
sps.stock_quantity AS new_quantity,
sps.stock_quantity - LAG(sps.stock_quantity) OVER w AS qty_delta,
sps.captured_at
FROM store_product_snapshots sps
JOIN store_products sp ON sp.id = sps.store_product_id
WHERE sps.dispensary_id = $1
AND sps.captured_at >= $2
AND sps.captured_at <= $3
AND sps.stock_quantity IS NOT NULL
WINDOW w AS (PARTITION BY sps.store_product_id ORDER BY sps.captured_at)
)
SELECT *
FROM qty_changes
WHERE old_quantity IS NOT NULL
AND qty_delta != 0
${directionFilter}
ORDER BY captured_at DESC
LIMIT $4
`, [dispensaryId, start, end, limit]);
// Calculate totals
const totalsResult = await this.pool.query(`
WITH qty_changes AS (
SELECT
sps.stock_quantity - LAG(sps.stock_quantity) OVER w AS qty_delta
FROM store_product_snapshots sps
WHERE sps.dispensary_id = $1
AND sps.captured_at >= $2
AND sps.captured_at <= $3
AND sps.stock_quantity IS NOT NULL
AND sps.store_product_id IS NOT NULL
WINDOW w AS (PARTITION BY sps.store_product_id ORDER BY sps.captured_at)
)
SELECT
COUNT(*) FILTER (WHERE qty_delta != 0) AS total_changes,
COALESCE(SUM(ABS(qty_delta)) FILTER (WHERE qty_delta < 0), 0) AS units_decreased,
COALESCE(SUM(qty_delta) FILTER (WHERE qty_delta > 0), 0) AS units_increased
FROM qty_changes
WHERE qty_delta IS NOT NULL
`, [dispensaryId, start, end]);
const totals = totalsResult.rows[0] || {};
return {
dispensary_id: dispensaryId,
window,
direction,
total_changes: parseInt(totals.total_changes) || 0,
total_units_decreased: parseInt(totals.units_decreased) || 0,
total_units_increased: parseInt(totals.units_increased) || 0,
changes: result.rows.map((row: any) => ({
store_product_id: row.store_product_id,
product_name: row.product_name,
brand_name: row.brand_name,
category: row.category,
old_quantity: row.old_quantity,
new_quantity: row.new_quantity,
quantity_delta: row.qty_delta,
direction: row.qty_delta > 0 ? 'increase' : 'decrease',
captured_at: row.captured_at?.toISOString() || null,
})),
};
}
/** /**
* Get store inventory composition (categories and brands breakdown) * Get store inventory composition (categories and brands breakdown)
*/ */
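`getQuantityChanges` relies on `LAG(...) OVER (PARTITION BY store_product_id ORDER BY captured_at)` to compare each snapshot with the previous one for the same product. The sketch below reproduces that window logic in memory, under the assumption that snapshots arrive as plain objects; `Snapshot`, `computeQuantityChanges`, and `summarizeChanges` are illustrative names, not the service's API. Unlike the query, it returns changes in ascending time order and applies no `LIMIT`.

```typescript
// Hypothetical snapshot shape; field names follow the columns used in the query.
interface Snapshot {
  store_product_id: number;
  stock_quantity: number | null;
  captured_at: Date;
}

interface QuantityChange {
  store_product_id: number;
  old_quantity: number;
  new_quantity: number;
  quantity_delta: number;
  direction: 'increase' | 'decrease';
  captured_at: string;
}

function computeQuantityChanges(snapshots: Snapshot[]): QuantityChange[] {
  // Mirrors: stock_quantity IS NOT NULL, ordered by captured_at
  const ordered = snapshots
    .filter((s) => s.stock_quantity !== null)
    .sort((a, b) => a.captured_at.getTime() - b.captured_at.getTime());

  // Last seen quantity per product plays the role of LAG() within a partition.
  const lastQuantity = new Map<number, number>();
  const changes: QuantityChange[] = [];

  for (const snap of ordered) {
    const newQuantity = snap.stock_quantity as number; // nulls were filtered out above
    const oldQuantity = lastQuantity.get(snap.store_product_id);
    lastQuantity.set(snap.store_product_id, newQuantity);

    if (oldQuantity === undefined) continue; // first snapshot: LAG is NULL
    const delta = newQuantity - oldQuantity;
    if (delta === 0) continue; // mirrors: qty_delta != 0

    changes.push({
      store_product_id: snap.store_product_id,
      old_quantity: oldQuantity,
      new_quantity: newQuantity,
      quantity_delta: delta,
      direction: delta > 0 ? 'increase' : 'decrease',
      captured_at: snap.captured_at.toISOString(),
    });
  }
  return changes;
}

// Mirrors the totals query: units decreased are summed as absolute values.
function summarizeChanges(changes: QuantityChange[]) {
  let decreased = 0;
  let increased = 0;
  for (const c of changes) {
    if (c.quantity_delta < 0) decreased += -c.quantity_delta;
    else increased += c.quantity_delta;
  }
  return {
    total_changes: changes.length,
    total_units_decreased: decreased,
    total_units_increased: increased,
  };
}
```

A decrease is only an estimate of sales: a manual inventory correction produces the same delta.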
@@ -299,14 +415,14 @@ export class StoreAnalyticsService {
// Get top brands // Get top brands
const brandsResult = await this.pool.query(` const brandsResult = await this.pool.query(`
SELECT SELECT
brand_name AS brand, brand_name_raw AS brand,
COUNT(*) AS count, COUNT(*) AS count,
ROUND(COUNT(*)::NUMERIC * 100 / NULLIF($2, 0), 2) AS percent ROUND(COUNT(*)::NUMERIC * 100 / NULLIF($2, 0), 2) AS percent
FROM store_products FROM store_products
WHERE dispensary_id = $1 WHERE dispensary_id = $1
AND brand_name IS NOT NULL AND brand_name_raw IS NOT NULL
AND is_in_stock = TRUE AND is_in_stock = TRUE
GROUP BY brand_name GROUP BY brand_name_raw
ORDER BY count DESC ORDER BY count DESC
LIMIT 20 LIMIT 20
`, [dispensaryId, totalProducts]); `, [dispensaryId, totalProducts]);
@@ -316,7 +432,7 @@ export class StoreAnalyticsService {
in_stock_count: parseInt(totals.in_stock) || 0, in_stock_count: parseInt(totals.in_stock) || 0,
out_of_stock_count: parseInt(totals.out_of_stock) || 0, out_of_stock_count: parseInt(totals.out_of_stock) || 0,
categories: categoriesResult.rows.map((row: any) => ({ categories: categoriesResult.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
count: parseInt(row.count), count: parseInt(row.count),
percent: parseFloat(row.percent) || 0, percent: parseFloat(row.percent) || 0,
})), })),
@@ -458,23 +574,24 @@ export class StoreAnalyticsService {
), ),
market_prices AS ( market_prices AS (
SELECT SELECT
sp.category, sp.category_raw,
AVG(sp.price_rec) AS market_avg AVG(sp.price_rec) AS market_avg
FROM store_products sp FROM store_products sp
WHERE sp.state_id = $2 JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE d.state_id = $2
AND sp.price_rec IS NOT NULL AND sp.price_rec IS NOT NULL
AND sp.is_in_stock = TRUE AND sp.is_in_stock = TRUE
AND sp.category IS NOT NULL AND sp.category_raw IS NOT NULL
GROUP BY sp.category GROUP BY sp.category_raw
) )
SELECT SELECT
sp.category, sp.category_raw,
sp.store_avg AS store_avg_price, sp.store_avg AS store_avg_price,
mp.market_avg AS market_avg_price, mp.market_avg AS market_avg_price,
ROUND(((sp.store_avg - mp.market_avg) / NULLIF(mp.market_avg, 0) * 100)::NUMERIC, 2) AS price_vs_market_percent, ROUND(((sp.store_avg - mp.market_avg) / NULLIF(mp.market_avg, 0) * 100)::NUMERIC, 2) AS price_vs_market_percent,
sp.product_count sp.product_count
FROM store_prices sp FROM store_prices sp
LEFT JOIN market_prices mp ON mp.category = sp.category LEFT JOIN market_prices mp ON mp.category = sp.category_raw
ORDER BY sp.product_count DESC ORDER BY sp.product_count DESC
`, [dispensaryId, dispensary.state_id]); `, [dispensaryId, dispensary.state_id]);
@@ -486,9 +603,10 @@ export class StoreAnalyticsService {
WHERE dispensary_id = $1 AND price_rec IS NOT NULL AND is_in_stock = TRUE WHERE dispensary_id = $1 AND price_rec IS NOT NULL AND is_in_stock = TRUE
), ),
market_avg AS ( market_avg AS (
SELECT AVG(price_rec) AS avg SELECT AVG(sp.price_rec) AS avg
FROM store_products FROM store_products sp
WHERE state_id = $2 AND price_rec IS NOT NULL AND is_in_stock = TRUE JOIN dispensaries d ON d.id = sp.dispensary_id
WHERE d.state_id = $2 AND sp.price_rec IS NOT NULL AND sp.is_in_stock = TRUE
) )
SELECT SELECT
ROUND(((sa.avg - ma.avg) / NULLIF(ma.avg, 0) * 100)::NUMERIC, 2) AS price_vs_market ROUND(((sa.avg - ma.avg) / NULLIF(ma.avg, 0) * 100)::NUMERIC, 2) AS price_vs_market
@@ -499,7 +617,7 @@ export class StoreAnalyticsService {
dispensary_id: dispensaryId, dispensary_id: dispensaryId,
dispensary_name: dispensary.name, dispensary_name: dispensary.name,
categories: result.rows.map((row: any) => ({ categories: result.rows.map((row: any) => ({
category: row.category, category: row.category_raw,
store_avg_price: parseFloat(row.store_avg_price), store_avg_price: parseFloat(row.store_avg_price),
market_avg_price: row.market_avg_price ? parseFloat(row.market_avg_price) : 0, market_avg_price: row.market_avg_price ? parseFloat(row.market_avg_price) : 0,
price_vs_market_percent: row.price_vs_market_percent ? parseFloat(row.price_vs_market_percent) : 0, price_vs_market_percent: row.price_vs_market_percent ? parseFloat(row.price_vs_market_percent) : 0,
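The `price_vs_market_percent` expression in these queries is `(store_avg - market_avg) / NULLIF(market_avg, 0) * 100`, rounded to two decimals. A small sketch of the same arithmetic, using a hypothetical `priceVsMarketPercent` function, makes the divide-by-zero guard explicit:

```typescript
// Hypothetical helper mirroring the SQL expression:
//   ROUND(((store_avg - market_avg) / NULLIF(market_avg, 0) * 100)::NUMERIC, 2)
function priceVsMarketPercent(storeAvg: number, marketAvg: number | null): number | null {
  // NULLIF(market_avg, 0) yields NULL for a zero divisor, so the result is NULL
  // rather than a division error. A missing market average (LEFT JOIN miss) is NULL too.
  if (marketAvg === null || marketAvg === 0) return null;

  const percent = ((storeAvg - marketAvg) / marketAvg) * 100;
  // Two-decimal rounding. Math.round may differ from Postgres ROUND on exact
  // negative halves, which is acceptable for an illustration.
  return Math.round(percent * 100) / 100;
}
```

In the row mapping above, a null result is reported as `0`, so a reading of 0% can mean either "priced at market" or "no market data for this category".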


@@ -11,3 +11,4 @@ export { BrandPenetrationService } from './BrandPenetrationService';
export { CategoryAnalyticsService } from './CategoryAnalyticsService'; export { CategoryAnalyticsService } from './CategoryAnalyticsService';
export { StoreAnalyticsService } from './StoreAnalyticsService'; export { StoreAnalyticsService } from './StoreAnalyticsService';
export { StateAnalyticsService } from './StateAnalyticsService'; export { StateAnalyticsService } from './StateAnalyticsService';
export { BrandIntelligenceService } from './BrandIntelligenceService';


@@ -322,3 +322,48 @@ export interface RecVsMedPriceComparison {
}; };
price_diff_percent: number | null; price_diff_percent: number | null;
} }
// ============================================================
// BRAND PROMOTIONAL ANALYTICS TYPES
// ============================================================
export interface BrandPromotionalEvent {
product_name: string;
store_product_id: number;
dispensary_id: number;
dispensary_name: string;
state_code: string;
category: string | null;
special_start: string; // ISO date when special started
special_end: string | null; // ISO date when special ended (null if ongoing)
duration_days: number | null;
regular_price: number;
special_price: number;
discount_percent: number;
quantity_at_start: number | null;
quantity_at_end: number | null;
quantity_sold_estimate: number | null; // quantity_at_start - quantity_at_end
}
export interface BrandPromotionalSummary {
brand_name: string;
window: TimeWindow;
total_promotional_events: number;
total_products_on_special: number;
total_dispensaries_with_specials: number;
states_with_specials: string[];
avg_discount_percent: number;
avg_duration_days: number | null;
total_quantity_sold_estimate: number | null;
promotional_frequency: {
weekly_avg: number;
monthly_avg: number;
};
by_category: Array<{
category: string;
event_count: number;
avg_discount_percent: number;
quantity_sold_estimate: number | null;
}>;
events: BrandPromotionalEvent[];
}
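The `BrandPromotionalEvent` type carries three derived fields: `discount_percent`, `duration_days`, and `quantity_sold_estimate`. Only the last one is defined in the diff (start quantity minus end quantity). The sketch below shows one plausible way to derive all three; the discount and duration formulas, the `PromoObservation` shape, and the `derivePromoMetrics` name are assumptions made for illustration, not the service's implementation.

```typescript
// Hypothetical input: the raw observations an event would be built from.
interface PromoObservation {
  special_start: string; // ISO date
  special_end: string | null; // null while the special is ongoing
  regular_price: number;
  special_price: number;
  quantity_at_start: number | null;
  quantity_at_end: number | null;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function derivePromoMetrics(obs: PromoObservation) {
  // Assumed formula: percentage off the regular price, two decimals.
  const discount_percent = obs.regular_price > 0
    ? Math.round(((obs.regular_price - obs.special_price) / obs.regular_price) * 10000) / 100
    : 0;

  // Assumed formula: whole days between start and end; null if still running.
  const duration_days = obs.special_end === null
    ? null
    : Math.round((Date.parse(obs.special_end) - Date.parse(obs.special_start)) / MS_PER_DAY);

  // From the type's own comment: quantity_at_start - quantity_at_end.
  const quantity_sold_estimate =
    obs.quantity_at_start !== null && obs.quantity_at_end !== null
      ? obs.quantity_at_start - obs.quantity_at_end
      : null;

  return { discount_percent, duration_days, quantity_sold_estimate };
}
```

The estimate goes negative if the store restocks during the special, so a consumer may want to treat negative values as "unknown" rather than as sales.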


@@ -1,49 +1,53 @@
/** /**
* Crawl Rotator - Proxy & User Agent Rotation for Crawlers * Crawl Rotator - Proxy & User Agent Rotation for Crawlers
* *
* Manages rotation of proxies and user agents to avoid blocks. * Updated: 2025-12-10 per workflow-12102025.md
* Used by platform-specific crawlers (Dutchie, Jane, etc.) *
* KEY BEHAVIORS (per workflow-12102025.md):
* 1. Task determines WHAT work to do, proxy determines SESSION IDENTITY
* 2. Proxy location (timezone) sets Accept-Language headers (always English)
* 3. On 403: immediately get new IP, new fingerprint, retry
* 4. After 3 consecutive 403s on same proxy with different fingerprints → disable proxy
*
* USER-AGENT GENERATION (per workflow-12102025.md):
* - Device distribution: Mobile 62%, Desktop 36%, Tablet 2%
* - Browser whitelist: Chrome, Safari, Edge, Firefox only
* - UA sticks until IP rotates
* - Failure = alert admin + stop crawl (no fallback)
*
* Uses intoli/user-agents for realistic UA generation with daily-updated data.
* *
* Canonical location: src/services/crawl-rotator.ts * Canonical location: src/services/crawl-rotator.ts
*/ */
import { Pool } from 'pg'; import { Pool } from 'pg';
import UserAgent from 'user-agents';
import {
HTTPFingerprint,
generateHTTPFingerprint,
BrowserType,
} from './http-fingerprint';
// ============================================================ // ============================================================
// USER AGENT CONFIGURATION // UA CONSTANTS (per workflow-12102025.md)
// ============================================================ // ============================================================
/** /**
* Modern browser user agents (Chrome, Firefox, Safari, Edge on various platforms) * Per workflow-12102025.md: Device category distribution (hardcoded)
* Updated: 2024 * Mobile: 62%, Desktop: 36%, Tablet: 2%
*/ */
export const USER_AGENTS = [ const DEVICE_WEIGHTS = {
// Chrome on Windows mobile: 62,
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36', desktop: 36,
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36', tablet: 2,
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36', } as const;
// Chrome on macOS /**
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36', * Per workflow-12102025.md: Browser whitelist
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36', * Only Chrome (67%), Safari (20%), Edge (6%), Firefox (3%)
* Samsung Internet, Opera, and other niche browsers are filtered out
// Firefox on Windows */
'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0', const ALLOWED_BROWSERS = ['Chrome', 'Safari', 'Edge', 'Firefox'] as const;
'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0',
// Firefox on macOS
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:121.0) Gecko/20100101 Firefox/121.0',
// Safari on macOS
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15',
// Edge on Windows
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0',
// Chrome on Linux
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];
// ============================================================ // ============================================================
// PROXY TYPES // PROXY TYPES
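The `DEVICE_WEIGHTS` table introduced above implies a weighted random pick of device category. The rotator's actual roll is not shown in this part of the diff, so the following is only a sketch of one common way to do it. `rollDeviceCategory` and its injectable `random` parameter are hypothetical and exist here mainly to make the behaviour testable.

```typescript
// Weights copied from the diff: Mobile 62%, Desktop 36%, Tablet 2%.
const DEVICE_WEIGHTS = {
  mobile: 62,
  desktop: 36,
  tablet: 2,
} as const;

type DeviceCategory = keyof typeof DEVICE_WEIGHTS;

// Hypothetical helper: cumulative-weight selection.
// `random` must return a value in [0, 1), like Math.random.
function rollDeviceCategory(random: () => number = Math.random): DeviceCategory {
  const categories = Object.keys(DEVICE_WEIGHTS) as DeviceCategory[];
  const total = categories.reduce((sum, c) => sum + DEVICE_WEIGHTS[c], 0);

  // Walk the categories, subtracting each weight until the roll falls inside one.
  let roll = random() * total;
  for (const category of categories) {
    if (roll < DEVICE_WEIGHTS[category]) return category;
    roll -= DEVICE_WEIGHTS[category];
  }
  // Only reachable through floating-point edge cases.
  return categories[categories.length - 1];
}
```

Summing the weights at runtime, rather than assuming they add up to 100, keeps the pick correct if the distribution is later adjusted.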
@@ -61,8 +65,13 @@ export interface Proxy {
failureCount: number; failureCount: number;
successCount: number; successCount: number;
avgResponseTimeMs: number | null; avgResponseTimeMs: number | null;
maxConnections: number; // Number of concurrent connections allowed (for rotating proxies) maxConnections: number;
// Location info (if known) /**
* Per workflow-12102025.md: Track consecutive 403s with different fingerprints.
* After 3 consecutive 403s → disable proxy (it's burned).
*/
consecutive403Count: number;
// Location info - determines session headers per workflow-12102025.md
city?: string; city?: string;
state?: string; state?: string;
country?: string; country?: string;
@@ -77,6 +86,40 @@ export interface ProxyStats {
avgSuccessRate: number; avgSuccessRate: number;
} }
// ============================================================
// FINGERPRINT TYPE
// Per workflow-12102025.md: Full browser fingerprint from user-agents
// ============================================================
export interface BrowserFingerprint {
userAgent: string;
platform: string;
screenWidth: number;
screenHeight: number;
viewportWidth: number;
viewportHeight: number;
deviceCategory: string;
browserName: string; // Per workflow-12102025.md: for session logging
// Derived headers for anti-detect
acceptLanguage: string;
secChUa?: string;
secChUaPlatform?: string;
secChUaMobile?: string;
// Per workflow-12102025.md: HTTP Fingerprinting section
httpFingerprint: HTTPFingerprint;
}
/**
* Per workflow-12102025.md: Session log entry for debugging blocked sessions
*/
export interface UASessionLog {
deviceCategory: string;
browserName: string;
userAgent: string;
proxyIp: string | null;
sessionStartedAt: Date;
}
// ============================================================ // ============================================================
// PROXY ROTATOR CLASS // PROXY ROTATOR CLASS
// ============================================================ // ============================================================
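The `BrowserFingerprint` interface above keeps the derived request headers (`acceptLanguage` and the optional `secChUa*` fields) next to the user agent. As a hedged sketch of how such a fingerprint might be turned into request headers, the hypothetical `buildSessionHeaders` below uses only a subset of the interface and leaves out `httpFingerprint`, whose shape is not shown in this diff.

```typescript
// Subset of BrowserFingerprint needed for headers (hypothetical local type).
interface FingerprintHeaderFields {
  userAgent: string;
  acceptLanguage: string;
  secChUa?: string;
  secChUaPlatform?: string;
  secChUaMobile?: string;
}

function buildSessionHeaders(fp: FingerprintHeaderFields): Record<string, string> {
  const headers: Record<string, string> = {
    'User-Agent': fp.userAgent,
    'Accept-Language': fp.acceptLanguage,
  };

  // Client hints are sent by Chromium-based browsers (Chrome, Edge) but not by
  // Safari or Firefox, so they are added only when the fingerprint provides them.
  if (fp.secChUa !== undefined) headers['Sec-CH-UA'] = fp.secChUa;
  if (fp.secChUaPlatform !== undefined) headers['Sec-CH-UA-Platform'] = fp.secChUaPlatform;
  if (fp.secChUaMobile !== undefined) headers['Sec-CH-UA-Mobile'] = fp.secChUaMobile;

  return headers;
}
```

Sending client hints alongside a Safari or Firefox user agent would be an inconsistency a detector can pick up, which is presumably why the fields are optional.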
@@ -91,9 +134,6 @@ export class ProxyRotator {
this.pool = pool || null; this.pool = pool || null;
} }
/**
* Initialize with database pool
*/
setPool(pool: Pool): void { setPool(pool: Pool): void {
this.pool = pool; this.pool = pool;
} }
@@ -122,6 +162,7 @@ export class ProxyRotator {
0 as "successCount", 0 as "successCount",
response_time_ms as "avgResponseTimeMs", response_time_ms as "avgResponseTimeMs",
COALESCE(max_connections, 1) as "maxConnections", COALESCE(max_connections, 1) as "maxConnections",
COALESCE(consecutive_403_count, 0) as "consecutive403Count",
city, city,
state, state,
country, country,
@@ -134,11 +175,9 @@ export class ProxyRotator {
this.proxies = result.rows; this.proxies = result.rows;
// Calculate total concurrent capacity
const totalCapacity = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0); const totalCapacity = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0);
console.log(`[ProxyRotator] Loaded ${this.proxies.length} active proxies (${totalCapacity} max concurrent connections)`); console.log(`[ProxyRotator] Loaded ${this.proxies.length} active proxies (${totalCapacity} max concurrent connections)`);
} catch (error) { } catch (error) {
// Table might not exist - that's okay
console.warn(`[ProxyRotator] Could not load proxies: ${error}`); console.warn(`[ProxyRotator] Could not load proxies: ${error}`);
this.proxies = []; this.proxies = [];
} }
@@ -150,7 +189,6 @@ export class ProxyRotator {
getNext(): Proxy | null { getNext(): Proxy | null {
if (this.proxies.length === 0) return null; if (this.proxies.length === 0) return null;
// Round-robin rotation
this.currentIndex = (this.currentIndex + 1) % this.proxies.length; this.currentIndex = (this.currentIndex + 1) % this.proxies.length;
this.lastRotation = new Date(); this.lastRotation = new Date();
@@ -185,23 +223,68 @@ export class ProxyRotator {
} }
/** /**
* Mark proxy as failed (temporarily remove from rotation) * Mark proxy as blocked (403 received)
* Per workflow-12102025.md:
* - Increment consecutive_403_count
* - After 3 consecutive 403s with different fingerprints → disable proxy
* - This is separate from general failures (timeouts, etc.)
*/ */
async markFailed(proxyId: number, error?: string): Promise<void> { async markBlocked(proxyId: number): Promise<boolean> {
// Update in-memory
const proxy = this.proxies.find(p => p.id === proxyId); const proxy = this.proxies.find(p => p.id === proxyId);
if (proxy) { let shouldDisable = false;
proxy.failureCount++;
// Deactivate if too many failures if (proxy) {
if (proxy.failureCount >= 5) { proxy.consecutive403Count++;
// Per workflow-12102025.md: 3 consecutive 403s → proxy is burned
if (proxy.consecutive403Count >= 3) {
proxy.isActive = false; proxy.isActive = false;
this.proxies = this.proxies.filter(p => p.id !== proxyId); this.proxies = this.proxies.filter(p => p.id !== proxyId);
console.log(`[ProxyRotator] Proxy ${proxyId} deactivated after ${proxy.failureCount} failures`); console.log(`[ProxyRotator] Proxy ${proxyId} DISABLED after ${proxy.consecutive403Count} consecutive 403s (burned)`);
shouldDisable = true;
} else {
console.log(`[ProxyRotator] Proxy ${proxyId} blocked (403 #${proxy.consecutive403Count}/3)`);
} }
} }
// Update database // Update database
if (this.pool) {
try {
await this.pool.query(`
UPDATE proxies
SET
consecutive_403_count = COALESCE(consecutive_403_count, 0) + 1,
last_failure_at = NOW(),
test_result = '403 Forbidden',
active = CASE WHEN COALESCE(consecutive_403_count, 0) >= 2 THEN false ELSE active END,
updated_at = NOW()
WHERE id = $1
`, [proxyId]);
} catch (err) {
console.error(`[ProxyRotator] Failed to update proxy ${proxyId}:`, err);
}
}
return shouldDisable;
}
/**
* Mark proxy as failed (general error - timeout, connection error, etc.)
* Separate from 403 blocking per workflow-12102025.md
*/
async markFailed(proxyId: number, error?: string): Promise<void> {
const proxy = this.proxies.find(p => p.id === proxyId);
if (proxy) {
proxy.failureCount++;
// Deactivate if too many general failures
if (proxy.failureCount >= 5) {
proxy.isActive = false;
this.proxies = this.proxies.filter(p => p.id !== proxyId);
console.log(`[ProxyRotator] Proxy ${proxyId} deactivated after ${proxy.failureCount} general failures`);
}
}
if (this.pool) { if (this.pool) {
try { try {
await this.pool.query(` await this.pool.query(`
@@ -220,23 +303,22 @@ export class ProxyRotator {
} }
/** /**
* Mark proxy as successful * Mark proxy as successful - resets consecutive 403 count
* Per workflow-12102025.md: successful request clears the 403 counter
*/ */
async markSuccess(proxyId: number, responseTimeMs?: number): Promise<void> { async markSuccess(proxyId: number, responseTimeMs?: number): Promise<void> {
// Update in-memory
const proxy = this.proxies.find(p => p.id === proxyId); const proxy = this.proxies.find(p => p.id === proxyId);
if (proxy) { if (proxy) {
proxy.successCount++; proxy.successCount++;
proxy.consecutive403Count = 0; // Reset on success per workflow-12102025.md
proxy.lastUsedAt = new Date(); proxy.lastUsedAt = new Date();
if (responseTimeMs !== undefined) { if (responseTimeMs !== undefined) {
// Rolling average
proxy.avgResponseTimeMs = proxy.avgResponseTimeMs proxy.avgResponseTimeMs = proxy.avgResponseTimeMs
? (proxy.avgResponseTimeMs * 0.8) + (responseTimeMs * 0.2) ? (proxy.avgResponseTimeMs * 0.8) + (responseTimeMs * 0.2)
: responseTimeMs; : responseTimeMs;
} }
} }
// Update database
if (this.pool) { if (this.pool) {
try { try {
await this.pool.query(` await this.pool.query(`
@@ -244,6 +326,7 @@ export class ProxyRotator {
SET SET
last_tested_at = NOW(), last_tested_at = NOW(),
test_result = 'success', test_result = 'success',
consecutive_403_count = 0,
response_time_ms = CASE response_time_ms = CASE
WHEN response_time_ms IS NULL THEN $2 WHEN response_time_ms IS NULL THEN $2
ELSE (response_time_ms * 0.8 + $2 * 0.2)::integer ELSE (response_time_ms * 0.8 + $2 * 0.2)::integer
@@ -272,8 +355,8 @@ export class ProxyRotator {
*/ */
getStats(): ProxyStats { getStats(): ProxyStats {
const totalProxies = this.proxies.length; const totalProxies = this.proxies.length;
const activeProxies = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0); // Total concurrent capacity const activeProxies = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0);
const blockedProxies = this.proxies.filter(p => p.failureCount >= 5).length; const blockedProxies = this.proxies.filter(p => p.failureCount >= 5 || p.consecutive403Count >= 3).length;
const successRates = this.proxies const successRates = this.proxies
.filter(p => p.successCount + p.failureCount > 0) .filter(p => p.successCount + p.failureCount > 0)
@@ -285,15 +368,12 @@ export class ProxyRotator {
return { return {
totalProxies, totalProxies,
activeProxies, // Total concurrent capacity across all proxies activeProxies,
blockedProxies, blockedProxies,
avgSuccessRate, avgSuccessRate,
}; };
} }
/**
* Check if proxy pool has available proxies
*/
hasAvailableProxies(): boolean { hasAvailableProxies(): boolean {
return this.proxies.length > 0; return this.proxies.length > 0;
} }
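Review note: the in-memory bookkeeping in `markSuccess` can be illustrated in isolation. This is a minimal sketch, not the project's code. `ProxyHealth`, `applySuccess`, `applyBlock`, and `BLOCK_THRESHOLD` are hypothetical names; the 0.8/0.2 weights and the threshold of 3 come from the diff (the rolling average in `markSuccess` and the `consecutive403Count >= 3` check in `getStats`).

```typescript
// Hypothetical stand-in for the in-memory proxy record; not the project's Proxy type.
interface ProxyHealth {
  successCount: number;
  consecutive403Count: number;
  avgResponseTimeMs?: number;
}

// Threshold assumed from the `consecutive403Count >= 3` check in getStats().
const BLOCK_THRESHOLD = 3;

// Mirrors markSuccess(): bump successes, clear the 403 streak, and fold the
// new sample into an exponentially weighted average (80% old, 20% new).
function applySuccess(p: ProxyHealth, responseTimeMs?: number): void {
  p.successCount++;
  p.consecutive403Count = 0;
  if (responseTimeMs !== undefined) {
    p.avgResponseTimeMs = p.avgResponseTimeMs
      ? p.avgResponseTimeMs * 0.8 + responseTimeMs * 0.2
      : responseTimeMs;
  }
}

// Assumed shape of the 403 path: returns true once the streak reaches the threshold.
function applyBlock(p: ProxyHealth): boolean {
  p.consecutive403Count++;
  return p.consecutive403Count >= BLOCK_THRESHOLD;
}

const proxy: ProxyHealth = { successCount: 0, consecutive403Count: 0 };
applySuccess(proxy, 100); // first sample seeds the average
applySuccess(proxy, 200); // 100 * 0.8 + 200 * 0.2
applyBlock(proxy);
applyBlock(proxy);
applySuccess(proxy);      // a success clears the streak
console.log(proxy);
```

Because the truthiness check treats an average of `0` the same as "no average yet", a 0 ms sample would reseed the average; that matches the diff and is unlikely to matter for real response times.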
@@ -301,53 +381,194 @@ export class ProxyRotator {
// ============================================================ // ============================================================
// USER AGENT ROTATOR CLASS // USER AGENT ROTATOR CLASS
// Per workflow-12102025.md: Uses intoli/user-agents for realistic fingerprints
// ============================================================ // ============================================================
export class UserAgentRotator { export class UserAgentRotator {
private userAgents: string[]; private currentFingerprint: BrowserFingerprint | null = null;
private currentIndex: number = 0; private sessionLog: UASessionLog | null = null;
private lastRotation: Date = new Date();
constructor(userAgents: string[] = USER_AGENTS) { constructor() {
this.userAgents = userAgents; // Per workflow-12102025.md: Initialize with first fingerprint
// Start at random index to avoid patterns this.rotate();
this.currentIndex = Math.floor(Math.random() * userAgents.length);
} }
/** /**
* Get next user agent in rotation * Per workflow-12102025.md: Roll device category based on distribution
* Mobile: 62%, Desktop: 36%, Tablet: 2%
*/ */
getNext(): string { private rollDeviceCategory(): 'mobile' | 'desktop' | 'tablet' {
this.currentIndex = (this.currentIndex + 1) % this.userAgents.length; const roll = Math.random() * 100;
this.lastRotation = new Date(); if (roll < DEVICE_WEIGHTS.mobile) {
return this.userAgents[this.currentIndex]; return 'mobile';
} else if (roll < DEVICE_WEIGHTS.mobile + DEVICE_WEIGHTS.desktop) {
return 'desktop';
} else {
return 'tablet';
}
} }
/** /**
* Get current user agent without rotating * Per workflow-12102025.md: Extract browser name from UA string
*/ */
getCurrent(): string { private extractBrowserName(userAgent: string): string {
return this.userAgents[this.currentIndex]; if (userAgent.includes('Edg/')) return 'Edge';
if (userAgent.includes('Firefox/')) return 'Firefox';
if (userAgent.includes('Safari/') && !userAgent.includes('Chrome/')) return 'Safari';
if (userAgent.includes('Chrome/')) return 'Chrome';
return 'Unknown';
} }
/** /**
* Get a random user agent * Per workflow-12102025.md: Check if browser is in whitelist
*/ */
getRandom(): string { private isAllowedBrowser(userAgent: string): boolean {
const index = Math.floor(Math.random() * this.userAgents.length); const browserName = this.extractBrowserName(userAgent);
return this.userAgents[index]; return ALLOWED_BROWSERS.includes(browserName as typeof ALLOWED_BROWSERS[number]);
} }
/** /**
* Get total available user agents * Generate a new random fingerprint
* Per workflow-12102025.md:
* - Roll device category (62/36/2)
* - Filter to top 4 browsers only
* - Failure = alert admin + stop (no fallback)
*/ */
rotate(proxyIp?: string): BrowserFingerprint {
// Per workflow-12102025.md: Roll device category
const deviceCategory = this.rollDeviceCategory();
// Per workflow-12102025.md: Generate UA filtered to device category
const generator = new UserAgent({ deviceCategory });
// Per workflow-12102025.md: Try to get an allowed browser (max 50 attempts)
let ua: ReturnType<typeof generator>;
let attempts = 0;
const maxAttempts = 50;
do {
ua = generator();
attempts++;
} while (!this.isAllowedBrowser(ua.data.userAgent) && attempts < maxAttempts);
// Per workflow-12102025.md: If we can't get allowed browser, this is a failure
if (!this.isAllowedBrowser(ua.data.userAgent)) {
const errorMsg = `[UserAgentRotator] CRITICAL: Failed to generate allowed browser after ${maxAttempts} attempts. Device: ${deviceCategory}. Last UA: ${ua.data.userAgent}`;
console.error(errorMsg);
// Per workflow-12102025.md: Alert admin + stop crawl
// TODO: Post alert to admin dashboard
throw new Error(errorMsg);
}
const data = ua.data;
const browserName = this.extractBrowserName(data.userAgent);
// Build sec-ch-ua headers from user agent string
const secChUa = this.buildSecChUa(data.userAgent, deviceCategory);
// Per workflow-12102025.md: HTTP Fingerprinting - generate full HTTP fingerprint
const httpFingerprint = generateHTTPFingerprint(browserName as BrowserType);
this.currentFingerprint = {
userAgent: data.userAgent,
platform: data.platform,
screenWidth: data.screenWidth,
screenHeight: data.screenHeight,
viewportWidth: data.viewportWidth,
viewportHeight: data.viewportHeight,
deviceCategory: data.deviceCategory,
browserName, // Per workflow-12102025.md: for session logging
// Per workflow-12102025.md: always English
acceptLanguage: 'en-US,en;q=0.9',
...secChUa,
// Per workflow-12102025.md: HTTP Fingerprinting section
httpFingerprint,
};
// Per workflow-12102025.md: Log session data
this.sessionLog = {
deviceCategory,
browserName,
userAgent: data.userAgent,
proxyIp: proxyIp || null,
sessionStartedAt: new Date(),
};
console.log(`[UserAgentRotator] New fingerprint: device=${deviceCategory}, browser=${browserName}, UA=${data.userAgent.slice(0, 50)}...`);
return this.currentFingerprint;
}
/**
* Get current fingerprint without rotating
*/
getCurrent(): BrowserFingerprint {
if (!this.currentFingerprint) {
return this.rotate();
}
return this.currentFingerprint;
}
/**
* Get a random fingerprint (rotates and returns)
*/
getRandom(proxyIp?: string): BrowserFingerprint {
return this.rotate(proxyIp);
}
/**
* Per workflow-12102025.md: Get session log for debugging
*/
getSessionLog(): UASessionLog | null {
return this.sessionLog;
}
/**
* Build sec-ch-ua headers from user agent string
* Per workflow-12102025.md: Include mobile indicator based on device category
*/
private buildSecChUa(userAgent: string, deviceCategory: string): { secChUa?: string; secChUaPlatform?: string; secChUaMobile?: string } {
const isMobile = deviceCategory === 'mobile' || deviceCategory === 'tablet';
// Extract Chrome version if present
const chromeMatch = userAgent.match(/Chrome\/(\d+)/);
const edgeMatch = userAgent.match(/Edg\/(\d+)/);
if (edgeMatch) {
const version = edgeMatch[1];
return {
secChUa: `"Microsoft Edge";v="${version}", "Chromium";v="${version}", "Not_A Brand";v="24"`,
secChUaPlatform: userAgent.includes('Windows') ? '"Windows"' : userAgent.includes('Android') ? '"Android"' : '"macOS"',
secChUaMobile: isMobile ? '?1' : '?0',
};
}
if (chromeMatch) {
const version = chromeMatch[1];
let platform = '"Linux"';
// Check Android and iOS before 'Mac': iPhone/iPad UAs contain "like Mac OS X"
if (userAgent.includes('Windows')) platform = '"Windows"';
else if (userAgent.includes('Android')) platform = '"Android"';
else if (userAgent.includes('iPhone') || userAgent.includes('iPad')) platform = '"iOS"';
else if (userAgent.includes('Mac')) platform = '"macOS"';
return {
secChUa: `"Google Chrome";v="${version}", "Chromium";v="${version}", "Not_A Brand";v="24"`,
secChUaPlatform: platform,
secChUaMobile: isMobile ? '?1' : '?0',
};
}
// Firefox/Safari don't send sec-ch-ua
return {};
}
getCount(): number { getCount(): number {
return this.userAgents.length; return 1; // user-agents generates dynamically
} }
} }
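Review note: the cumulative-threshold roll in `rollDeviceCategory` is easier to verify when the random source is injectable. The sketch below assumes the 62/36/2 weights stated in the doc comment; the real `DEVICE_WEIGHTS` constant is defined outside this excerpt, and `rollDevice` and `tally` are hypothetical names.

```typescript
type DeviceCategory = 'mobile' | 'desktop' | 'tablet';

// Weights as stated in the diff comment (62/36/2); this local copy is an assumption.
const WEIGHTS: Record<DeviceCategory, number> = { mobile: 62, desktop: 36, tablet: 2 };

// Same cumulative-threshold walk as rollDeviceCategory(), with the random
// source injected so the behaviour can be checked deterministically.
function rollDevice(random: () => number = Math.random): DeviceCategory {
  const roll = random() * 100;
  if (roll < WEIGHTS.mobile) return 'mobile';
  if (roll < WEIGHTS.mobile + WEIGHTS.desktop) return 'desktop';
  return 'tablet';
}

// Tally a batch of rolls to eyeball the resulting distribution.
function tally(n: number): Record<DeviceCategory, number> {
  const counts: Record<DeviceCategory, number> = { mobile: 0, desktop: 0, tablet: 0 };
  for (let i = 0; i < n; i++) counts[rollDevice()]++;
  return counts;
}

console.log(tally(10_000));
```

Injecting the random source is only a testing convenience; the method in the diff calls `Math.random()` directly.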
// ============================================================ // ============================================================
// COMBINED ROTATOR (for convenience) // COMBINED ROTATOR
// Per workflow-12102025.md: Coordinates proxy + fingerprint rotation
// ============================================================ // ============================================================
export class CrawlRotator { export class CrawlRotator {
@@ -359,49 +580,51 @@ export class CrawlRotator {
this.userAgent = new UserAgentRotator(); this.userAgent = new UserAgentRotator();
} }
/**
* Initialize rotator (load proxies from DB)
*/
async initialize(): Promise<void> { async initialize(): Promise<void> {
await this.proxy.loadProxies(); await this.proxy.loadProxies();
} }
/** /**
* Rotate proxy only * Rotate proxy only (get new IP)
*/ */
rotateProxy(): Proxy | null { rotateProxy(): Proxy | null {
return this.proxy.getNext(); return this.proxy.getNext();
} }
/** /**
* Rotate user agent only * Rotate fingerprint only (new UA, screen size, etc.)
*/ */
rotateUserAgent(): string { rotateFingerprint(): BrowserFingerprint {
return this.userAgent.getNext(); return this.userAgent.rotate();
} }
/** /**
* Rotate both proxy and user agent * Rotate both proxy and fingerprint
* Per workflow-12102025.md: called on 403 for fresh identity
* Passes proxy IP to UA rotation for session logging
*/ */
rotateBoth(): { proxy: Proxy | null; userAgent: string } { rotateBoth(): { proxy: Proxy | null; fingerprint: BrowserFingerprint } {
const proxy = this.proxy.getNext();
const proxyIp = proxy ? proxy.host : undefined;
return { return {
proxy: this.proxy.getNext(), proxy,
userAgent: this.userAgent.getNext(), fingerprint: this.userAgent.rotate(proxyIp),
}; };
} }
/** /**
* Get current proxy and user agent without rotating * Get current proxy and fingerprint without rotating
*/ */
getCurrent(): { proxy: Proxy | null; userAgent: string } { getCurrent(): { proxy: Proxy | null; fingerprint: BrowserFingerprint } {
return { return {
proxy: this.proxy.getCurrent(), proxy: this.proxy.getCurrent(),
userAgent: this.userAgent.getCurrent(), fingerprint: this.userAgent.getCurrent(),
}; };
} }
/** /**
* Record success for current proxy * Record success for current proxy
* Per workflow-12102025.md: resets consecutive 403 count
*/ */
async recordSuccess(responseTimeMs?: number): Promise<void> { async recordSuccess(responseTimeMs?: number): Promise<void> {
const current = this.proxy.getCurrent(); const current = this.proxy.getCurrent();
@@ -411,7 +634,20 @@ export class CrawlRotator {
} }
/** /**
* Record failure for current proxy * Record 403 block for current proxy
* Per workflow-12102025.md: increments consecutive_403_count, disables after 3
* Returns true if proxy was disabled
*/
async recordBlock(): Promise<boolean> {
const current = this.proxy.getCurrent();
if (current) {
return await this.proxy.markBlocked(current.id);
}
return false;
}
/**
* Record general failure (not 403)
*/ */
async recordFailure(error?: string): Promise<void> { async recordFailure(error?: string): Promise<void> {
const current = this.proxy.getCurrent(); const current = this.proxy.getCurrent();
@@ -421,14 +657,13 @@ export class CrawlRotator {
} }
/** /**
* Get current proxy location info (for reporting) * Get current proxy location info
* Note: For rotating proxies (like IPRoyal), the actual exit location varies per request * Per workflow-12102025.md: proxy location determines session headers
*/ */
getProxyLocation(): { city?: string; state?: string; country?: string; timezone?: string; isRotating: boolean } | null { getProxyLocation(): { city?: string; state?: string; country?: string; timezone?: string; isRotating: boolean } | null {
const current = this.proxy.getCurrent(); const current = this.proxy.getCurrent();
if (!current) return null; if (!current) return null;
// Check if this is a rotating proxy (max_connections > 1 usually indicates rotating)
const isRotating = current.maxConnections > 1; const isRotating = current.maxConnections > 1;
return { return {
@@ -439,6 +674,15 @@ export class CrawlRotator {
isRotating isRotating
}; };
} }
/**
* Get timezone from current proxy
* Per workflow-12102025.md: used for Accept-Language header
*/
getProxyTimezone(): string | undefined {
const current = this.proxy.getCurrent();
return current?.timezone;
}
} }
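Review note: a possible caller-side loop for the new `recordBlock` / `rotateBoth` / `recordSuccess` methods, as a sketch only. `RotatorLike`, `Attempt`, and `crawlWithRotation` are hypothetical; the three method signatures are copied from the diff, and treating every non-403 status as success is a simplification.

```typescript
// Hypothetical minimal view of CrawlRotator: only the three methods used here.
// The rotated identity is left opaque.
interface RotatorLike {
  rotateBoth(): unknown;
  recordBlock(): Promise<boolean>;
  recordSuccess(responseTimeMs?: number): Promise<void>;
}

// Hypothetical request callback: resolves to an HTTP status code.
type Attempt = () => Promise<number>;

// Retry loop: a 403 records a block and swaps proxy + fingerprint before the
// next attempt; any other status is treated as success for this sketch.
async function crawlWithRotation(
  rotator: RotatorLike,
  attempt: Attempt,
  maxAttempts = 3,
): Promise<{ ok: boolean; attempts: number; disabledProxies: number }> {
  let disabledProxies = 0;
  for (let i = 1; i <= maxAttempts; i++) {
    const startedAt = Date.now();
    const status = await attempt();
    if (status !== 403) {
      await rotator.recordSuccess(Date.now() - startedAt);
      return { ok: true, attempts: i, disabledProxies };
    }
    // recordBlock() resolves to true when the proxy was disabled.
    if (await rotator.recordBlock()) disabledProxies++;
    rotator.rotateBoth(); // fresh identity for the next attempt
  }
  return { ok: false, attempts: maxAttempts, disabledProxies };
}
```

Calling `recordBlock()` before `rotateBoth()` matters: both act on the current proxy, so rotating first would charge the 403 to the wrong proxy.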
// ============================================================ // ============================================================


@@ -0,0 +1,315 @@
/**
* HTTP Fingerprinting Service
*
* Per workflow-12102025.md - HTTP Fingerprinting section:
* - Full header set per browser type
* - Browser-specific header ordering
* - Natural randomization (DNT, Accept quality)
* - Dynamic Referer per dispensary
*
* Canonical location: src/services/http-fingerprint.ts
*/
// ============================================================
// TYPES
// ============================================================
export type BrowserType = 'Chrome' | 'Firefox' | 'Safari' | 'Edge';
/**
* Per workflow-12102025.md: Full HTTP fingerprint for a session
*/
export interface HTTPFingerprint {
browserType: BrowserType;
headers: Record<string, string>;
headerOrder: string[];
curlImpersonateBinary: string;
hasDNT: boolean;
}
/**
* Per workflow-12102025.md: Context for building headers
*/
export interface HeaderContext {
userAgent: string;
secChUa?: string;
secChUaPlatform?: string;
secChUaMobile?: string;
referer: string;
isPost: boolean;
contentLength?: number;
}
// ============================================================
// CONSTANTS (per workflow-12102025.md)
// ============================================================
/**
* Per workflow-12102025.md: DNT header distribution (~30% of users)
*/
const DNT_PROBABILITY = 0.30;
/**
* Per workflow-12102025.md: Accept header variations for natural traffic
*/
const ACCEPT_VARIATIONS = [
'application/json, text/plain, */*',
'application/json,text/plain,*/*',
'*/*',
];
/**
* Per workflow-12102025.md: Accept-Language variations
*/
const ACCEPT_LANGUAGE_VARIATIONS = [
'en-US,en;q=0.9',
'en-US,en;q=0.8',
'en-US;q=0.9,en;q=0.8',
];
/**
* Per workflow-12102025.md: curl-impersonate binaries per browser
*/
const CURL_IMPERSONATE_BINARIES: Record<BrowserType, string> = {
Chrome: 'curl_chrome131',
Edge: 'curl_chrome131', // Edge uses Chromium
Firefox: 'curl_ff133',
Safari: 'curl_safari17',
};
// ============================================================
// HEADER ORDERING (per workflow-12102025.md)
// ============================================================
/**
* Per workflow-12102025.md: Chrome header order for GraphQL requests
*/
const CHROME_HEADER_ORDER = [
'Host',
'Connection',
'Content-Length',
'sec-ch-ua',
'DNT',
'sec-ch-ua-mobile',
'User-Agent',
'sec-ch-ua-platform',
'Content-Type',
'Accept',
'Origin',
'sec-fetch-site',
'sec-fetch-mode',
'sec-fetch-dest',
'Referer',
'Accept-Encoding',
'Accept-Language',
];
/**
* Per workflow-12102025.md: Firefox header order for GraphQL requests
*/
const FIREFOX_HEADER_ORDER = [
'Host',
'User-Agent',
'Accept',
'Accept-Language',
'Accept-Encoding',
'Content-Type',
'Content-Length',
'Origin',
'DNT',
'Connection',
'Referer',
'sec-fetch-dest',
'sec-fetch-mode',
'sec-fetch-site',
];
/**
* Per workflow-12102025.md: Safari header order for GraphQL requests
*/
const SAFARI_HEADER_ORDER = [
'Host',
'Connection',
'Content-Length',
'Accept',
'User-Agent',
'Content-Type',
'Origin',
'Referer',
'Accept-Encoding',
'Accept-Language',
];
/**
* Per workflow-12102025.md: Header order lookup per browser; Edge reuses the Chrome order (Chromium-based)
*/
const HEADER_ORDERS: Record<BrowserType, string[]> = {
Chrome: CHROME_HEADER_ORDER,
Edge: CHROME_HEADER_ORDER,
Firefox: FIREFOX_HEADER_ORDER,
Safari: SAFARI_HEADER_ORDER,
};
// ============================================================
// FINGERPRINT GENERATION
// ============================================================
/**
* Per workflow-12102025.md: Generate HTTP fingerprint for a session
* Randomization is done once per session for consistency
*/
export function generateHTTPFingerprint(browserType: BrowserType): HTTPFingerprint {
// Per workflow-12102025.md: DNT randomized per session (~30%)
const hasDNT = Math.random() < DNT_PROBABILITY;
return {
browserType,
headers: {}, // Built dynamically per request
headerOrder: HEADER_ORDERS[browserType],
curlImpersonateBinary: CURL_IMPERSONATE_BINARIES[browserType],
hasDNT,
};
}
/**
* Per workflow-12102025.md: Build complete headers for a request
* Returns headers in browser-specific order
*/
export function buildOrderedHeaders(
fingerprint: HTTPFingerprint,
context: HeaderContext
): { headers: Record<string, string>; orderedHeaders: string[] } {
const { browserType, hasDNT, headerOrder } = fingerprint;
const { userAgent, secChUa, secChUaPlatform, secChUaMobile, referer, isPost, contentLength } = context;
// Per workflow-12102025.md: Natural randomization for Accept
const accept = ACCEPT_VARIATIONS[Math.floor(Math.random() * ACCEPT_VARIATIONS.length)];
const acceptLanguage = ACCEPT_LANGUAGE_VARIATIONS[Math.floor(Math.random() * ACCEPT_LANGUAGE_VARIATIONS.length)];
// Build all possible headers
const allHeaders: Record<string, string> = {
'Connection': 'keep-alive',
'User-Agent': userAgent,
'Accept': accept,
'Accept-Language': acceptLanguage,
'Accept-Encoding': 'gzip, deflate, br',
};
// Per workflow-12102025.md: POST-only headers
if (isPost) {
allHeaders['Content-Type'] = 'application/json';
allHeaders['Origin'] = 'https://dutchie.com';
if (contentLength !== undefined) {
allHeaders['Content-Length'] = String(contentLength);
}
}
// Per workflow-12102025.md: Dynamic Referer per dispensary
allHeaders['Referer'] = referer;
// Per workflow-12102025.md: DNT randomized per session
if (hasDNT) {
allHeaders['DNT'] = '1';
}
// Per workflow-12102025.md: Chromium-only headers (Chrome, Edge)
if (browserType === 'Chrome' || browserType === 'Edge') {
if (secChUa) allHeaders['sec-ch-ua'] = secChUa;
if (secChUaMobile) allHeaders['sec-ch-ua-mobile'] = secChUaMobile;
if (secChUaPlatform) allHeaders['sec-ch-ua-platform'] = secChUaPlatform;
allHeaders['sec-fetch-site'] = 'same-origin';
allHeaders['sec-fetch-mode'] = 'cors';
allHeaders['sec-fetch-dest'] = 'empty';
}
// Per workflow-12102025.md: Firefox has sec-fetch but no sec-ch
if (browserType === 'Firefox') {
allHeaders['sec-fetch-site'] = 'same-origin';
allHeaders['sec-fetch-mode'] = 'cors';
allHeaders['sec-fetch-dest'] = 'empty';
}
// Per workflow-12102025.md: Safari has no sec-* headers
// Filter to only headers that exist and order them
const orderedHeaders: string[] = [];
const headers: Record<string, string> = {};
for (const headerName of headerOrder) {
if (allHeaders[headerName]) {
orderedHeaders.push(headerName);
headers[headerName] = allHeaders[headerName];
}
}
return { headers, orderedHeaders };
}
/**
* Per workflow-12102025.md: Build curl command arguments for headers
* Headers are added in browser-specific order
*/
export function buildCurlHeaderArgs(
fingerprint: HTTPFingerprint,
context: HeaderContext
): string[] {
const { headers, orderedHeaders } = buildOrderedHeaders(fingerprint, context);
const args: string[] = [];
for (const headerName of orderedHeaders) {
// Skip Host and Content-Length - curl handles these
if (headerName === 'Host' || headerName === 'Content-Length') continue;
args.push('-H', `${headerName}: ${headers[headerName]}`);
}
return args;
}
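Review note: the ordering step shared by `buildOrderedHeaders` and `buildCurlHeaderArgs` reduces to "walk the browser's order list, keep the names that have a value". The sketch below shows that in isolation. `orderHeaders`, `toCurlArgs`, the shortened order list, and the `ExampleAgent/1.0` value are all illustrative assumptions.

```typescript
// Hypothetical helper: emit headers in a browser-specific order, dropping
// names that have no value. Mirrors the final loop of buildOrderedHeaders().
function orderHeaders(
  order: readonly string[],
  values: Record<string, string>,
): Array<[string, string]> {
  const out: Array<[string, string]> = [];
  for (const name of order) {
    const value = values[name];
    if (value) out.push([name, value]);
  }
  return out;
}

// Turn ordered pairs into curl "-H" arguments, skipping the headers curl sets itself.
function toCurlArgs(pairs: Array<[string, string]>): string[] {
  const args: string[] = [];
  for (const [name, value] of pairs) {
    if (name === 'Host' || name === 'Content-Length') continue;
    args.push('-H', `${name}: ${value}`);
  }
  return args;
}

// Shortened order list for illustration; the full lists are in the diff.
const firefoxLike = ['Host', 'User-Agent', 'Accept', 'Content-Length', 'DNT', 'Referer'];
const args = toCurlArgs(orderHeaders(firefoxLike, {
  Referer: 'https://dutchie.com/',
  Accept: '*/*',
  'User-Agent': 'ExampleAgent/1.0', // placeholder value
  'Content-Length': '42',
}));
console.log(args.join(' '));
```

The insertion order of the input object is irrelevant; only the order list decides the output, which is why a header missing from a browser's list (for example `sec-ch-ua` for Firefox) is silently dropped.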
/**
* Per workflow-12102025.md: Extract Referer from dispensary menu_url
*/
export function buildRefererFromMenuUrl(menuUrl: string | null | undefined): string {
if (!menuUrl) {
return 'https://dutchie.com/';
}
// Extract slug from menu_url
// Formats: /embedded-menu/<slug> or /dispensary/<slug> or full URL
let slug: string | null = null;
const embeddedMatch = menuUrl.match(/\/embedded-menu\/([^/?]+)/);
const dispensaryMatch = menuUrl.match(/\/dispensary\/([^/?]+)/);
if (embeddedMatch) {
slug = embeddedMatch[1];
} else if (dispensaryMatch) {
slug = dispensaryMatch[1];
}
if (slug) {
return `https://dutchie.com/dispensary/${slug}`;
}
return 'https://dutchie.com/';
}
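Review note: a condensed variant of the slug extraction, useful for checking which inputs fall back to the bare origin. `refererFor` is a hypothetical name and the slugs are made-up examples. It uses a single alternation, so it takes the first matching path segment instead of preferring `/embedded-menu/` as the function in the diff does.

```typescript
// Condensed variant of buildRefererFromMenuUrl() using one alternation.
// Assumption: inputs contain at most one of the two path prefixes.
function refererFor(menuUrl: string | null | undefined): string {
  const fallback = 'https://dutchie.com/';
  if (!menuUrl) return fallback;
  const match = menuUrl.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
  return match ? `https://dutchie.com/dispensary/${match[1]}` : fallback;
}

const samples: Array<string | null> = [
  '/embedded-menu/example-store',
  'https://dutchie.com/dispensary/example-store?menuType=rec',
  'https://example.com/menu',
  null,
];
for (const s of samples) console.log(s, '->', refererFor(s));
```

As in the diff, the character class stops at `/` and `?` only, so a `#fragment` directly after the slug would be carried into the Referer.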
/**
* Per workflow-12102025.md: Get curl-impersonate binary for browser
*/
export function getCurlBinary(browserType: BrowserType): string {
return CURL_IMPERSONATE_BINARIES[browserType];
}
/**
* Per workflow-12102025.md: Check if curl-impersonate is available
*/
export function isCurlImpersonateAvailable(browserType: BrowserType): boolean {
const binary = CURL_IMPERSONATE_BINARIES[browserType];
try {
const { execSync } = require('child_process');
execSync(`which ${binary}`, { stdio: 'ignore' });
return true;
} catch {
return false;
}
}


@@ -39,7 +39,12 @@ export async function cleanupOrphanedJobs(): Promise<void> {
export type ProxyTestMode = 'all' | 'failed' | 'inactive'; export type ProxyTestMode = 'all' | 'failed' | 'inactive';
export async function createProxyTestJob(mode: ProxyTestMode = 'all', concurrency: number = DEFAULT_CONCURRENCY): Promise<number> { export interface CreateJobResult {
jobId: number;
totalProxies: number;
}
export async function createProxyTestJob(mode: ProxyTestMode = 'all', concurrency: number = DEFAULT_CONCURRENCY): Promise<CreateJobResult> {
// Check for existing running jobs first // Check for existing running jobs first
const existingJob = await getActiveProxyTestJob(); const existingJob = await getActiveProxyTestJob();
if (existingJob) { if (existingJob) {
@@ -79,7 +84,7 @@ export async function createProxyTestJob(mode: ProxyTestMode = 'all', concurrenc
console.error(`❌ Proxy test job ${jobId} failed:`, err); console.error(`❌ Proxy test job ${jobId} failed:`, err);
}); });
return jobId; return { jobId, totalProxies };
} }
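Review note: the return type changes from `number` to `CreateJobResult`, so existing callers that use the result as a job ID need updating. A minimal sketch of the caller side follows. `createProxyTestJobStub` and `startTest` are hypothetical, and the values are placeholders.

```typescript
// Result shape introduced by the diff.
interface CreateJobResult {
  jobId: number;
  totalProxies: number;
}

// Stand-in for createProxyTestJob(); the real function talks to the database.
async function createProxyTestJobStub(): Promise<CreateJobResult> {
  return { jobId: 1, totalProxies: 25 }; // placeholder values
}

// Hypothetical caller: destructure the result instead of treating it as a number.
async function startTest(): Promise<string> {
  const { jobId, totalProxies } = await createProxyTestJobStub();
  return `Job ${jobId} started for ${totalProxies} proxies`;
}

startTest().then((message) => console.log(message));
```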
export async function getProxyTestJob(jobId: number): Promise<ProxyTestJob | null> { export async function getProxyTestJob(jobId: number): Promise<ProxyTestJob | null> {


@@ -1,116 +1,38 @@
import cron from 'node-cron'; /**
import { pool } from '../db/pool'; * LEGACY SCHEDULER - DEPRECATED 2024-12-10
import { scrapeStore, scrapeCategory } from '../scraper-v2'; *
* DO NOT USE THIS FILE.
let scheduledJobs: cron.ScheduledTask[] = []; *
* Per TASK_WORKFLOW_2024-12-10.md:
async function getSettings(): Promise<{ * This node-cron scheduler has been replaced by the database-driven
scrapeIntervalHours: number; * task scheduler in src/services/task-scheduler.ts
scrapeSpecialsTime: string; *
}> { * The new scheduler:
const result = await pool.query(` * - Stores schedules in PostgreSQL (survives restarts)
SELECT key, value FROM settings * - Uses SELECT FOR UPDATE SKIP LOCKED (multi-replica safe)
WHERE key IN ('scrape_interval_hours', 'scrape_specials_time') * - Creates tasks in worker_tasks table (processed by task-worker.ts)
`); *
* This file is kept for reference only. All exports are no-ops.
const settings: Record<string, string> = {}; * Legacy code has been removed - see git history for original implementation.
result.rows.forEach((row: { key: string; value: string }) => { */
settings[row.key] = row.value;
});
return {
scrapeIntervalHours: parseInt(settings.scrape_interval_hours || '4'),
scrapeSpecialsTime: settings.scrape_specials_time || '00:01'
};
}
async function scrapeAllStores(): Promise<void> {
console.log('🔄 Starting scheduled scrape for all stores...');
const result = await pool.query(`
SELECT id, name FROM stores WHERE active = true AND scrape_enabled = true
`);
for (const store of result.rows) {
try {
console.log(`Scraping store: ${store.name}`);
await scrapeStore(store.id);
} catch (error) {
console.error(`Failed to scrape store ${store.name}:`, error);
}
}
console.log('✅ Scheduled scrape completed');
}
async function scrapeSpecials(): Promise<void> {
console.log('🌟 Starting scheduled specials scrape...');
const result = await pool.query(`
SELECT s.id, s.name, c.id as category_id
FROM stores s
JOIN categories c ON c.store_id = s.id
WHERE s.active = true AND s.scrape_enabled = true
AND c.slug = 'specials' AND c.scrape_enabled = true
`);
for (const row of result.rows) {
try {
console.log(`Scraping specials for: ${row.name}`);
await scrapeCategory(row.id, row.category_id);
} catch (error) {
console.error(`Failed to scrape specials for ${row.name}:`, error);
}
}
console.log('✅ Specials scrape completed');
}
// 2024-12-10: All functions are now no-ops
export async function startScheduler(): Promise<void> { export async function startScheduler(): Promise<void> {
// Stop any existing jobs console.warn('[DEPRECATED] startScheduler() called - use taskScheduler from task-scheduler.ts instead');
stopScheduler();
const settings = await getSettings();
// Schedule regular store scrapes (every N hours)
const scrapeIntervalCron = `0 */${settings.scrapeIntervalHours} * * *`;
const storeJob = cron.schedule(scrapeIntervalCron, scrapeAllStores);
scheduledJobs.push(storeJob);
console.log(`📅 Scheduled store scraping: every ${settings.scrapeIntervalHours} hours`);
// Schedule specials scraping (daily at specified time)
const [hours, minutes] = settings.scrapeSpecialsTime.split(':');
const specialsCron = `${minutes} ${hours} * * *`;
const specialsJob = cron.schedule(specialsCron, scrapeSpecials);
scheduledJobs.push(specialsJob);
console.log(`📅 Scheduled specials scraping: daily at ${settings.scrapeSpecialsTime}`);
// Initial scrape on startup (after 10 seconds)
setTimeout(() => {
console.log('🚀 Running initial scrape...');
scrapeAllStores().catch(console.error);
}, 10000);
} }
export function stopScheduler(): void { export function stopScheduler(): void {
scheduledJobs.forEach(job => job.stop()); console.warn('[DEPRECATED] stopScheduler() called - use taskScheduler from task-scheduler.ts instead');
scheduledJobs = [];
console.log('🛑 Scheduler stopped');
} }
export async function restartScheduler(): Promise<void> { export async function restartScheduler(): Promise<void> {
console.log('🔄 Restarting scheduler...'); console.warn('[DEPRECATED] restartScheduler() called - use taskScheduler from task-scheduler.ts instead');
stopScheduler();
await startScheduler();
} }
// Manual trigger functions for admin export async function triggerStoreScrape(_storeId: number): Promise<void> {
export async function triggerStoreScrape(storeId: number): Promise<void> { console.warn('[DEPRECATED] triggerStoreScrape() called - use taskService.createTask() instead');
console.log(`🔧 Manual scrape triggered for store ID: ${storeId}`);
await scrapeStore(storeId);
} }
export async function triggerAllStoresScrape(): Promise<void> { export async function triggerAllStoresScrape(): Promise<void> {
console.log('🔧 Manual scrape triggered for all stores'); console.warn('[DEPRECATED] triggerAllStoresScrape() called - use taskScheduler.triggerSchedule() instead');
await scrapeAllStores();
} }


@@ -0,0 +1,375 @@
/**
* Database-Driven Task Scheduler
*
* Per TASK_WORKFLOW_2024-12-10.md:
* - Schedules stored in DB (survives restarts)
* - Uses SELECT FOR UPDATE SKIP LOCKED to prevent duplicate execution across replicas
* - Polls every 60s to check if schedules are due
* - Generates tasks into worker_tasks table for task-worker.ts to process
*
* 2024-12-10: Created to replace legacy node-cron scheduler
*/
import { pool } from '../db/pool';
import { taskService, TaskRole } from '../tasks/task-service';
// Per TASK_WORKFLOW_2024-12-10.md: Poll interval for checking schedules
const POLL_INTERVAL_MS = 60_000; // 60 seconds
interface TaskSchedule {
id: number;
name: string;
role: TaskRole;
enabled: boolean;
interval_hours: number;
last_run_at: Date | null;
next_run_at: Date | null;
state_code: string | null;
priority: number;
}
class TaskScheduler {
private pollTimer: NodeJS.Timeout | null = null;
private isRunning = false;
/**
* Start the scheduler
* Per TASK_WORKFLOW_2024-12-10.md: Called on API server startup
*/
async start(): Promise<void> {
if (this.isRunning) {
console.log('[TaskScheduler] Already running');
return;
}
console.log('[TaskScheduler] Starting database-driven scheduler...');
this.isRunning = true;
// Per TASK_WORKFLOW_2024-12-10.md: On startup, recover stale tasks
try {
const recovered = await taskService.recoverStaleTasks(10);
if (recovered > 0) {
console.log(`[TaskScheduler] Recovered ${recovered} stale tasks from dead workers`);
}
} catch (err: any) {
console.error('[TaskScheduler] Failed to recover stale tasks:', err.message);
}
// Per TASK_WORKFLOW_2024-12-10.md: Ensure default schedules exist
await this.ensureDefaultSchedules();
// Per TASK_WORKFLOW_2024-12-10.md: Check immediately on startup
await this.checkAndRunDueSchedules();
// Per TASK_WORKFLOW_2024-12-10.md: Then poll every 60 seconds
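this.pollTimer = setInterval(() => {
// Catch here as well: pool.connect() runs outside the try block in checkAndRunDueSchedules()
this.checkAndRunDueSchedules().catch((err: any) => {
console.error('[TaskScheduler] Poll failed:', err.message);
});
}, POLL_INTERVAL_MS);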
console.log('[TaskScheduler] Started - polling every 60s');
}
/**
* Stop the scheduler
*/
stop(): void {
if (this.pollTimer) {
clearInterval(this.pollTimer);
this.pollTimer = null;
}
this.isRunning = false;
console.log('[TaskScheduler] Stopped');
}
/**
* Ensure default schedules exist in the database
* Per TASK_WORKFLOW_2024-12-10.md: Creates schedules if they don't exist
*/
private async ensureDefaultSchedules(): Promise<void> {
// Per TASK_WORKFLOW_2024-12-10.md: Default schedules for task generation
// NOTE: payload_fetch replaces direct product_refresh - it chains to product_refresh
const defaults = [
{
name: 'payload_fetch_all',
role: 'payload_fetch' as TaskRole,
interval_hours: 4,
priority: 0,
description: 'Fetch payloads from Dutchie API for all crawl-enabled stores every 4 hours. Chains to product_refresh.',
},
{
name: 'store_discovery_dutchie',
role: 'store_discovery' as TaskRole,
interval_hours: 24,
priority: 5,
description: 'Discover new Dutchie stores daily',
},
{
name: 'analytics_refresh',
role: 'analytics_refresh' as TaskRole,
interval_hours: 6,
priority: 0,
description: 'Refresh analytics materialized views every 6 hours',
},
];
for (const sched of defaults) {
try {
await pool.query(`
INSERT INTO task_schedules (name, role, interval_hours, priority, description, enabled, next_run_at)
VALUES ($1, $2, $3, $4, $5, true, NOW())
ON CONFLICT (name) DO NOTHING
`, [sched.name, sched.role, sched.interval_hours, sched.priority, sched.description]);
} catch (err: any) {
// Table may not exist yet - will be created by migration
if (!err.message.includes('does not exist')) {
console.error(`[TaskScheduler] Failed to create default schedule ${sched.name}:`, err.message);
}
}
}
}
/**
* Check for and run any due schedules
* Per TASK_WORKFLOW_2024-12-10.md: Uses SELECT FOR UPDATE SKIP LOCKED to prevent duplicates
*/
private async checkAndRunDueSchedules(): Promise<void> {
const client = await pool.connect();
try {
await client.query('BEGIN');
// Per TASK_WORKFLOW_2024-12-10.md: Atomic claim of due schedules
const result = await client.query<TaskSchedule>(`
SELECT *
FROM task_schedules
WHERE enabled = true
AND (next_run_at IS NULL OR next_run_at <= NOW())
FOR UPDATE SKIP LOCKED
`);
for (const schedule of result.rows) {
console.log(`[TaskScheduler] Running schedule: ${schedule.name} (${schedule.role})`);
try {
const tasksCreated = await this.executeSchedule(schedule);
console.log(`[TaskScheduler] Schedule ${schedule.name} created ${tasksCreated} tasks`);
// Per TASK_WORKFLOW_2024-12-10.md: Update last_run_at and calculate next_run_at
await client.query(`
UPDATE task_schedules
SET
last_run_at = NOW(),
next_run_at = NOW() + ($1 || ' hours')::interval,
last_task_count = $2,
updated_at = NOW()
WHERE id = $3
`, [schedule.interval_hours, tasksCreated, schedule.id]);
} catch (err: any) {
console.error(`[TaskScheduler] Schedule ${schedule.name} failed:`, err.message);
// Still update next_run_at to prevent infinite retry loop
await client.query(`
UPDATE task_schedules
SET
next_run_at = NOW() + ($1 || ' hours')::interval,
last_error = $2,
updated_at = NOW()
WHERE id = $3
`, [schedule.interval_hours, err.message, schedule.id]);
}
}
await client.query('COMMIT');
} catch (err: any) {
await client.query('ROLLBACK');
console.error('[TaskScheduler] Failed to check schedules:', err.message);
} finally {
client.release();
}
}
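Review note: the due check and the `next_run_at` arithmetic can be modelled without a database. The sketch below mirrors the `WHERE` clause and the `NOW() + interval` update. `ScheduleRow`, `isDue`, and `advance` are hypothetical names, and the row locking (`FOR UPDATE SKIP LOCKED`) is not modelled.

```typescript
// Hypothetical in-memory view of a task_schedules row.
interface ScheduleRow {
  enabled: boolean;
  intervalHours: number;
  nextRunAt: Date | null;
}

// Same predicate as the WHERE clause: enabled, and never scheduled or already due.
function isDue(row: ScheduleRow, now: Date): boolean {
  return row.enabled && (row.nextRunAt === null || row.nextRunAt.getTime() <= now.getTime());
}

// Same arithmetic as NOW() + interval_hours: the next run is measured from the
// moment the schedule was processed, not from the previous next_run_at.
function advance(row: ScheduleRow, now: Date): ScheduleRow {
  return { ...row, nextRunAt: new Date(now.getTime() + row.intervalHours * 3_600_000) };
}

const now = new Date('2025-01-01T00:00:00Z');
const fresh: ScheduleRow = { enabled: true, intervalHours: 4, nextRunAt: null };
const advanced = advance(fresh, now);
console.log(isDue(fresh, now), advanced.nextRunAt?.toISOString());
```

Since the next run is computed from the processing time, each cycle can slip by up to one poll interval plus the time spent creating tasks. Anchoring on the previous `next_run_at` would avoid that drift at the cost of catch-up runs after downtime.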
/**
* Execute a schedule and create tasks
* Per TASK_WORKFLOW_2024-12-10.md: Different logic per role
*/
private async executeSchedule(schedule: TaskSchedule): Promise<number> {
switch (schedule.role) {
case 'payload_fetch':
// Per TASK_WORKFLOW_2024-12-10.md: payload_fetch replaces direct product_refresh
return this.generatePayloadFetchTasks(schedule);
case 'product_refresh':
// Legacy - kept for manual triggers, but scheduled crawls use payload_fetch
return this.generatePayloadFetchTasks(schedule);
case 'store_discovery':
return this.generateStoreDiscoveryTasks(schedule);
case 'analytics_refresh':
return this.generateAnalyticsRefreshTasks(schedule);
default:
console.warn(`[TaskScheduler] Unknown role: ${schedule.role}`);
return 0;
}
}
/**
* Generate payload_fetch tasks for stores that need crawling
* Per TASK_WORKFLOW_2024-12-10.md: payload_fetch hits API, saves to disk, chains to product_refresh
*/
private async generatePayloadFetchTasks(schedule: TaskSchedule): Promise<number> {
// Per TASK_WORKFLOW_2024-12-10.md: Find stores needing refresh
const result = await pool.query(`
SELECT d.id
FROM dispensaries d
WHERE d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
-- No pending/running payload_fetch or product_refresh task already
AND NOT EXISTS (
SELECT 1 FROM worker_tasks t
WHERE t.dispensary_id = d.id
AND t.role IN ('payload_fetch', 'product_refresh')
AND t.status IN ('pending', 'claimed', 'running')
)
-- Never fetched OR last fetch > interval ago
AND (
d.last_fetch_at IS NULL
OR d.last_fetch_at < NOW() - ($1 || ' hours')::interval
)
${schedule.state_code ? 'AND d.state_id = (SELECT id FROM states WHERE code = $2)' : ''}
`, schedule.state_code ? [schedule.interval_hours, schedule.state_code] : [schedule.interval_hours]);
const dispensaryIds = result.rows.map((r: { id: number }) => r.id);
if (dispensaryIds.length === 0) {
return 0;
}
// Per TASK_WORKFLOW_2024-12-10.md: Create payload_fetch tasks (they chain to product_refresh)
const tasks = dispensaryIds.map((id: number) => ({
role: 'payload_fetch' as TaskRole,
dispensary_id: id,
priority: schedule.priority,
}));
return taskService.createTasks(tasks);
}
/**
* Generate store_discovery tasks
* Per TASK_WORKFLOW_2024-12-10.md: One task per platform
*/
private async generateStoreDiscoveryTasks(schedule: TaskSchedule): Promise<number> {
// Check if discovery task already pending
const existing = await taskService.listTasks({
role: 'store_discovery',
status: ['pending', 'claimed', 'running'],
limit: 1,
});
if (existing.length > 0) {
console.log('[TaskScheduler] Store discovery task already pending, skipping');
return 0;
}
await taskService.createTask({
role: 'store_discovery',
platform: 'dutchie',
priority: schedule.priority,
});
return 1;
}
/**
* Generate analytics_refresh tasks
* Per TASK_WORKFLOW_2024-12-10.md: Single task to refresh all MVs
*/
private async generateAnalyticsRefreshTasks(schedule: TaskSchedule): Promise<number> {
// Check if analytics task already pending
const existing = await taskService.listTasks({
role: 'analytics_refresh',
status: ['pending', 'claimed', 'running'],
limit: 1,
});
if (existing.length > 0) {
console.log('[TaskScheduler] Analytics refresh task already pending, skipping');
return 0;
}
await taskService.createTask({
role: 'analytics_refresh',
priority: schedule.priority,
});
return 1;
}
/**
* Get all schedules for dashboard display
*/
async getSchedules(): Promise<TaskSchedule[]> {
try {
const result = await pool.query(`
SELECT * FROM task_schedules ORDER BY name
`);
return result.rows as TaskSchedule[];
} catch {
return [];
}
}
/**
* Update a schedule
*/
async updateSchedule(id: number, updates: Partial<TaskSchedule>): Promise<void> {
const setClauses: string[] = [];
const values: any[] = [];
let paramIndex = 1;
if (updates.enabled !== undefined) {
setClauses.push(`enabled = $${paramIndex++}`);
values.push(updates.enabled);
}
if (updates.interval_hours !== undefined) {
setClauses.push(`interval_hours = $${paramIndex++}`);
values.push(updates.interval_hours);
}
if (updates.priority !== undefined) {
setClauses.push(`priority = $${paramIndex++}`);
values.push(updates.priority);
}
if (setClauses.length === 0) return;
setClauses.push('updated_at = NOW()');
values.push(id);
await pool.query(`
UPDATE task_schedules
SET ${setClauses.join(', ')}
WHERE id = $${paramIndex}
`, values);
}
/**
* Trigger a schedule to run immediately
*/
async triggerSchedule(id: number): Promise<number> {
const result = await pool.query(`
SELECT * FROM task_schedules WHERE id = $1
`, [id]);
if (result.rows.length === 0) {
throw new Error(`Schedule ${id} not found`);
}
return this.executeSchedule(result.rows[0] as TaskSchedule);
}
}
// Per TASK_WORKFLOW_2024-12-10.md: Singleton instance
export const taskScheduler = new TaskScheduler();
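
Reviewer sketch, not part of the commit: `updateSchedule` above builds its `SET` clause by hand, one `if` per column. The same idea can be isolated into a pure function that is easy to unit-test without a database. `ScheduleUpdates`, `BuiltQuery`, `UPDATABLE` and `buildScheduleUpdate` are hypothetical names introduced for illustration only; the column names are taken from the diff.

```typescript
// Hypothetical, trimmed-down shape; the real TaskSchedule has more columns.
interface ScheduleUpdates {
  enabled?: boolean;
  interval_hours?: number;
  priority?: number;
}

interface BuiltQuery {
  text: string;
  values: unknown[];
}

// Whitelist of updatable columns, in a fixed order so placeholders are stable.
const UPDATABLE = ['enabled', 'interval_hours', 'priority'] as const;

function buildScheduleUpdate(id: number, updates: ScheduleUpdates): BuiltQuery | null {
  const setClauses: string[] = [];
  const values: unknown[] = [];

  for (const column of UPDATABLE) {
    const value = updates[column];
    if (value !== undefined) {
      values.push(value);
      // Placeholder index equals the number of values pushed so far.
      setClauses.push(`${column} = $${values.length}`);
    }
  }

  // Nothing to change: mirror the early return in updateSchedule.
  if (setClauses.length === 0) return null;

  setClauses.push('updated_at = NOW()');
  values.push(id);

  return {
    text: `UPDATE task_schedules SET ${setClauses.join(', ')} WHERE id = $${values.length}`,
    values,
  };
}
```

Only whitelisted column names ever reach the SQL text; every caller-supplied value travels as a bound parameter, which is the same property the hand-written version relies on.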


@@ -94,7 +94,8 @@ export async function handleEntryPointDiscovery(ctx: TaskContext): Promise<TaskR
 // ============================================================
 // STEP 3: Start stealth session
 // ============================================================
-const session = startSession(dispensary.state || 'AZ', 'America/Phoenix');
+// Per workflow-12102025.md: session identity comes from proxy location, not task params
+const session = startSession();
 console.log(`[EntryPointDiscovery] Session started: ${session.sessionId}`);
 try {


@@ -0,0 +1,221 @@
/**
* Payload Fetch Handler
*
* Per TASK_WORKFLOW_2024-12-10.md: Separates API fetch from data processing.
*
* This handler ONLY:
* 1. Hits Dutchie GraphQL API
* 2. Saves raw payload to filesystem (gzipped)
* 3. Records metadata in raw_crawl_payloads table
* 4. Queues a product_refresh task to process the payload
*
* Benefits of separation:
* - Retry-friendly: If normalize fails, re-run refresh without re-crawling
* - Faster refreshes: Local file read vs network call
* - Replay-able: Run refresh against any historical payload
* - Less API pressure: Only this role hits Dutchie
*/
import { TaskContext, TaskResult } from '../task-worker';
import {
executeGraphQL,
startSession,
endSession,
GRAPHQL_HASHES,
DUTCHIE_CONFIG,
} from '../../platforms/dutchie';
import { saveRawPayload } from '../../utils/payload-storage';
import { taskService } from '../task-service';
export async function handlePayloadFetch(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const dispensaryId = task.dispensary_id;
if (!dispensaryId) {
return { success: false, error: 'No dispensary_id specified for payload_fetch task' };
}
try {
// ============================================================
// STEP 1: Load dispensary info
// ============================================================
const dispResult = await pool.query(`
SELECT
id, name, platform_dispensary_id, menu_url, menu_type, city, state
FROM dispensaries
WHERE id = $1 AND crawl_enabled = true
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
return { success: false, error: `Dispensary ${dispensaryId} not found or not crawl_enabled` };
}
const dispensary = dispResult.rows[0];
const platformId = dispensary.platform_dispensary_id;
if (!platformId) {
return { success: false, error: `Dispensary ${dispensaryId} has no platform_dispensary_id` };
}
// Extract cName from menu_url
const cNameMatch = dispensary.menu_url?.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
const cName = cNameMatch ? cNameMatch[1] : 'dispensary';
console.log(`[PayloadFetch] Starting fetch for ${dispensary.name} (ID: ${dispensaryId})`);
console.log(`[PayloadFetch] Platform ID: ${platformId}, cName: ${cName}`);
// ============================================================
// STEP 2: Start stealth session
// ============================================================
const session = startSession();
console.log(`[PayloadFetch] Session started: ${session.sessionId}`);
await ctx.heartbeat();
// ============================================================
// STEP 3: Fetch products via GraphQL (Status: 'All')
// ============================================================
const allProducts: any[] = [];
let page = 0;
let totalCount = 0;
const perPage = DUTCHIE_CONFIG.perPage;
const maxPages = DUTCHIE_CONFIG.maxPages;
try {
while (page < maxPages) {
const variables = {
includeEnterpriseSpecials: false,
productsFilter: {
dispensaryId: platformId,
pricingType: 'rec',
Status: 'All',
types: [],
useCache: false,
isDefaultSort: true,
sortBy: 'popularSortIdx',
sortDirection: 1,
bypassOnlineThresholds: true,
isKioskMenu: false,
removeProductsBelowOptionThresholds: false,
},
page,
perPage,
};
console.log(`[PayloadFetch] Fetching page ${page + 1}...`);
const result = await executeGraphQL(
'FilteredProducts',
variables,
GRAPHQL_HASHES.FilteredProducts,
{ cName, maxRetries: 3 }
);
const data = result?.data?.filteredProducts;
if (!data || !data.products) {
if (page === 0) {
throw new Error('No product data returned from GraphQL');
}
break;
}
const products = data.products;
allProducts.push(...products);
if (page === 0) {
totalCount = data.queryInfo?.totalCount || products.length;
console.log(`[PayloadFetch] Total products reported: ${totalCount}`);
}
if (allProducts.length >= totalCount || products.length < perPage) {
break;
}
page++;
if (page < maxPages) {
await new Promise(r => setTimeout(r, DUTCHIE_CONFIG.pageDelayMs));
}
if (page % 5 === 0) {
await ctx.heartbeat();
}
}
console.log(`[PayloadFetch] Fetched ${allProducts.length} products in ${page + 1} pages`);
} finally {
endSession();
}
if (allProducts.length === 0) {
return {
success: false,
error: 'No products returned from GraphQL',
productsProcessed: 0,
};
}
await ctx.heartbeat();
// ============================================================
// STEP 4: Save raw payload to filesystem
// Per TASK_WORKFLOW_2024-12-10.md: Metadata/Payload separation
// ============================================================
const rawPayload = {
dispensaryId,
platformId,
cName,
fetchedAt: new Date().toISOString(),
productCount: allProducts.length,
products: allProducts,
};
const payloadResult = await saveRawPayload(
pool,
dispensaryId,
rawPayload,
null, // crawl_run_id - not using crawl_runs in new system
allProducts.length
);
console.log(`[PayloadFetch] Saved payload #${payloadResult.id} (${(payloadResult.sizeBytes / 1024).toFixed(1)}KB)`);
// ============================================================
// STEP 5: Update dispensary last_fetch_at
// ============================================================
await pool.query(`
UPDATE dispensaries
SET last_fetch_at = NOW()
WHERE id = $1
`, [dispensaryId]);
// ============================================================
// STEP 6: Queue product_refresh task to process the payload
// Per TASK_WORKFLOW_2024-12-10.md: Task chaining
// ============================================================
await taskService.createTask({
role: 'product_refresh',
dispensary_id: dispensaryId,
priority: task.priority || 0,
payload: { payload_id: payloadResult.id },
});
console.log(`[PayloadFetch] Queued product_refresh task for payload #${payloadResult.id}`);
return {
success: true,
payloadId: payloadResult.id,
productCount: allProducts.length,
sizeBytes: payloadResult.sizeBytes,
};
} catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error';
console.error(`[PayloadFetch] Error for dispensary ${dispensaryId}:`, errorMessage);
return {
success: false,
error: errorMessage,
};
}
}
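
Reviewer sketch, not part of the commit: the paging loop in STEP 3 stops on three conditions (no data, all reported products collected, or a short page) and is capped by `maxPages`. A minimal standalone version of that control flow, assuming a hypothetical `fetchPage` callback in place of the `executeGraphQL` call, could look like this. `Page`, `FetchPage`, `isLastPage` and `fetchAllPages` are illustrative names, and the per-page delay and heartbeat are left out.

```typescript
interface Page<T> {
  products: T[];
  totalCount?: number;
}

// Hypothetical page source; stands in for the executeGraphQL call in the handler.
type FetchPage<T> = (page: number, perPage: number) => Promise<Page<T> | null>;

// True when the loop should stop after the page that was just appended.
function isLastPage(collected: number, totalCount: number, pageSize: number, perPage: number): boolean {
  return collected >= totalCount || pageSize < perPage;
}

async function fetchAllPages<T>(
  fetchPage: FetchPage<T>,
  perPage: number,
  maxPages: number,
): Promise<{ items: T[]; pagesFetched: number }> {
  const items: T[] = [];
  let totalCount = 0;
  let pagesFetched = 0;

  for (let page = 0; page < maxPages; page++) {
    const data = await fetchPage(page, perPage);
    if (!data) {
      // An empty first page is an error; an empty later page just ends the loop.
      if (page === 0) throw new Error('No product data returned');
      break;
    }

    pagesFetched++;
    items.push(...data.products);

    // The total is only read from the first response, as in the handler.
    if (page === 0) totalCount = data.totalCount || data.products.length;

    if (isLastPage(items.length, totalCount, data.products.length, perPage)) break;
  }

  return { items, pagesFetched };
}
```

Counting pages as they arrive, rather than deriving the figure from the loop index, keeps the logged page count exact even when the loop ends because `maxPages` was reached.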


@@ -1,16 +1,31 @@
 /**
 * Product Discovery Handler
 *
-* Initial product fetch for stores that have 0 products.
-* Same logic as product_resync, but for initial discovery.
+* Per TASK_WORKFLOW_2024-12-10.md: Initial product fetch for newly discovered stores.
+*
+* Flow:
+* 1. Triggered after store_discovery promotes a new dispensary
+* 2. Chains to payload_fetch to get initial product data
+* 3. payload_fetch chains to product_refresh for DB upsert
+*
+* Chaining:
+* store_discovery → (newStoreIds) → product_discovery → payload_fetch → product_refresh
 */
 import { TaskContext, TaskResult } from '../task-worker';
-import { handleProductRefresh } from './product-refresh';
+import { handlePayloadFetch } from './payload-fetch';
 export async function handleProductDiscovery(ctx: TaskContext): Promise<TaskResult> {
-// Product discovery is essentially the same as refresh for the first time
-// The main difference is in when this task is triggered (new store vs scheduled)
-console.log(`[ProductDiscovery] Starting initial product fetch for dispensary ${ctx.task.dispensary_id}`);
-return handleProductRefresh(ctx);
+const { task } = ctx;
+const dispensaryId = task.dispensary_id;
+if (!dispensaryId) {
+return { success: false, error: 'No dispensary_id provided' };
+}
+console.log(`[ProductDiscovery] Starting initial product discovery for dispensary ${dispensaryId}`);
+// Per TASK_WORKFLOW_2024-12-10.md: Chain to payload_fetch for API → disk
+// payload_fetch will then chain to product_refresh for disk → DB
+return handlePayloadFetch(ctx);
 }
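
Reviewer sketch, not part of the commit: the "Chaining" line in the header comment describes a fixed sequence of roles. One way to make that sequence explicit and checkable is a lookup table from each role to its successor. `ChainRole`, `NEXT_ROLE` and `chainFrom` are hypothetical names; the real `TaskRole` type has more members (for example `analytics_refresh`) that do not take part in this chain.

```typescript
// Illustrative subset of roles that participate in the discovery chain.
type ChainRole = 'store_discovery' | 'product_discovery' | 'payload_fetch' | 'product_refresh';

// Which role follows which, per the "Chaining" line in the header comment.
const NEXT_ROLE: Record<ChainRole, ChainRole | null> = {
  store_discovery: 'product_discovery',
  product_discovery: 'payload_fetch',
  payload_fetch: 'product_refresh',
  product_refresh: null, // end of the chain
};

// Walk the table from a starting role to the end of the chain.
function chainFrom(start: ChainRole): ChainRole[] {
  const chain: ChainRole[] = [];
  let current: ChainRole | null = start;
  while (current !== null) {
    chain.push(current);
    current = NEXT_ROLE[current];
  }
  return chain;
}
```

Note that in the diff above the `product_discovery` step calls `handlePayloadFetch` directly rather than queueing a separate task, so the table describes the logical order of work, not necessarily one queued task per step.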


@@ -1,33 +1,32 @@
 /**
 * Product Refresh Handler
 *
-* Re-crawls a store to capture price/stock changes using the GraphQL pipeline.
+* Per TASK_WORKFLOW_2024-12-10.md: Processes a locally-stored payload.
+*
+* This handler reads from the filesystem (NOT the Dutchie API).
+* The payload_fetch handler is responsible for API calls.
 *
 * Flow:
-* 1. Load dispensary info from database
-* 2. Start stealth session (fingerprint + optional proxy)
-* 3. Fetch products via GraphQL (Status: 'All')
-* 4. Normalize data via DutchieNormalizer
-* 5. Upsert to store_products and store_product_snapshots
-* 6. Track missing products (increment consecutive_misses, mark OOS at 3)
-* 7. Download new product images
-* 8. End session
+* 1. Load payload from filesystem (by payload_id or latest for dispensary)
+* 2. Normalize data via DutchieNormalizer
+* 3. Upsert to store_products and store_product_snapshots
+* 4. Track missing products (increment consecutive_misses, mark OOS at 3)
+* 5. Download new product images
+*
+* Benefits of separation:
+* - Retry-friendly: If this fails, re-run without re-crawling
+* - Replay-able: Run against any historical payload
+* - Faster: Local file read vs network call
 */
 import { TaskContext, TaskResult } from '../task-worker';
-import {
-executeGraphQL,
-startSession,
-endSession,
-GRAPHQL_HASHES,
-DUTCHIE_CONFIG,
-} from '../../platforms/dutchie';
 import { DutchieNormalizer } from '../../hydration/normalizers/dutchie';
 import {
 upsertStoreProducts,
 createStoreProductSnapshots,
 downloadProductImages,
 } from '../../hydration/canonical-upsert';
+import { loadRawPayloadById, getLatestPayload } from '../../utils/payload-storage';
 const normalizer = new DutchieNormalizer();
@@ -47,129 +46,76 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 SELECT
 id, name, platform_dispensary_id, menu_url, menu_type, city, state
 FROM dispensaries
-WHERE id = $1 AND crawl_enabled = true
+WHERE id = $1
 `, [dispensaryId]);
 if (dispResult.rows.length === 0) {
-return { success: false, error: `Dispensary ${dispensaryId} not found or not crawl_enabled` };
+return { success: false, error: `Dispensary ${dispensaryId} not found` };
 }
 const dispensary = dispResult.rows[0];
-const platformId = dispensary.platform_dispensary_id;
-if (!platformId) {
-return { success: false, error: `Dispensary ${dispensaryId} has no platform_dispensary_id` };
-}
-// Extract cName from menu_url
+// Extract cName from menu_url for image storage context
 const cNameMatch = dispensary.menu_url?.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
 const cName = cNameMatch ? cNameMatch[1] : 'dispensary';
-console.log(`[ProductResync] Starting crawl for ${dispensary.name} (ID: ${dispensaryId})`);
-console.log(`[ProductResync] Platform ID: ${platformId}, cName: ${cName}`);
-// ============================================================
-// STEP 2: Start stealth session
-// ============================================================
-const session = startSession(dispensary.state || 'AZ', 'America/Phoenix');
-console.log(`[ProductResync] Session started: ${session.sessionId}`);
+console.log(`[ProductRefresh] Starting refresh for ${dispensary.name} (ID: ${dispensaryId})`);
 await ctx.heartbeat();
 // ============================================================
-// STEP 3: Fetch products via GraphQL (Status: 'All')
+// STEP 2: Load payload from filesystem
+// Per TASK_WORKFLOW_2024-12-10.md: Read local payload, not API
 // ============================================================
-const allProducts: any[] = [];
-let page = 0;
-let totalCount = 0;
-const perPage = DUTCHIE_CONFIG.perPage;
-const maxPages = DUTCHIE_CONFIG.maxPages;
-try {
-while (page < maxPages) {
-const variables = {
-includeEnterpriseSpecials: false,
-productsFilter: {
-dispensaryId: platformId,
-pricingType: 'rec',
-Status: 'All',
-types: [],
-useCache: false,
-isDefaultSort: true,
-sortBy: 'popularSortIdx',
-sortDirection: 1,
-bypassOnlineThresholds: true,
-isKioskMenu: false,
-removeProductsBelowOptionThresholds: false,
-},
-page,
-perPage,
-};
-console.log(`[ProductResync] Fetching page ${page + 1}...`);
-const result = await executeGraphQL(
-'FilteredProducts',
-variables,
-GRAPHQL_HASHES.FilteredProducts,
-{ cName, maxRetries: 3 }
-);
-const data = result?.data?.filteredProducts;
-if (!data || !data.products) {
-if (page === 0) {
-throw new Error('No product data returned from GraphQL');
+let payloadData: any;
+let payloadId: number;
+// Check if specific payload_id was provided (from task chaining)
+const taskPayload = task.payload as { payload_id?: number } | null;
+if (taskPayload?.payload_id) {
+// Load specific payload (from payload_fetch chaining)
+const result = await loadRawPayloadById(pool, taskPayload.payload_id);
+if (!result) {
+return { success: false, error: `Payload ${taskPayload.payload_id} not found` };
 }
-break;
+payloadData = result.payload;
+payloadId = result.metadata.id;
+console.log(`[ProductRefresh] Loaded specific payload #${payloadId}`);
+} else {
+// Load latest payload for this dispensary
+const result = await getLatestPayload(pool, dispensaryId);
+if (!result) {
+return { success: false, error: `No payload found for dispensary ${dispensaryId}` };
+}
+payloadData = result.payload;
+payloadId = result.metadata.id;
+console.log(`[ProductRefresh] Loaded latest payload #${payloadId} (${result.metadata.fetchedAt})`);
 }
-const products = data.products;
-allProducts.push(...products);
-if (page === 0) {
-totalCount = data.queryInfo?.totalCount || products.length;
-console.log(`[ProductResync] Total products reported: ${totalCount}`);
-}
-if (allProducts.length >= totalCount || products.length < perPage) {
-break;
-}
-page++;
-if (page < maxPages) {
-await new Promise(r => setTimeout(r, DUTCHIE_CONFIG.pageDelayMs));
-}
-if (page % 5 === 0) {
-await ctx.heartbeat();
-}
-}
-console.log(`[ProductResync] Fetched ${allProducts.length} products in ${page + 1} pages`);
-} finally {
-endSession();
-}
+const allProducts = payloadData.products || [];
 if (allProducts.length === 0) {
 return {
 success: false,
-error: 'No products returned from GraphQL',
+error: 'Payload contains no products',
+payloadId,
 productsProcessed: 0,
 };
 }
+console.log(`[ProductRefresh] Processing ${allProducts.length} products from payload #${payloadId}`);
 await ctx.heartbeat();
 // ============================================================
-// STEP 4: Normalize data
+// STEP 3: Normalize data
 // ============================================================
-console.log(`[ProductResync] Normalizing ${allProducts.length} products...`);
+console.log(`[ProductRefresh] Normalizing ${allProducts.length} products...`);
 // Build RawPayload for the normalizer
 const rawPayload = {
-id: `resync-${dispensaryId}-${Date.now()}`,
+id: `refresh-${dispensaryId}-${Date.now()}`,
 dispensary_id: dispensaryId,
 crawl_run_id: null,
 platform: 'dutchie',
@@ -189,25 +135,26 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 const normalizationResult = normalizer.normalize(rawPayload);
 if (normalizationResult.errors.length > 0) {
-console.warn(`[ProductResync] Normalization warnings: ${normalizationResult.errors.map(e => e.message).join(', ')}`);
+console.warn(`[ProductRefresh] Normalization warnings: ${normalizationResult.errors.map(e => e.message).join(', ')}`);
 }
 if (normalizationResult.products.length === 0) {
 return {
 success: false,
 error: 'Normalization produced no products',
+payloadId,
 productsProcessed: 0,
 };
 }
-console.log(`[ProductResync] Normalized ${normalizationResult.products.length} products`);
+console.log(`[ProductRefresh] Normalized ${normalizationResult.products.length} products`);
 await ctx.heartbeat();
 // ============================================================
-// STEP 5: Upsert to canonical tables
+// STEP 4: Upsert to canonical tables
 // ============================================================
-console.log(`[ProductResync] Upserting to store_products...`);
+console.log(`[ProductRefresh] Upserting to store_products...`);
 const upsertResult = await upsertStoreProducts(
 pool,
@@ -216,12 +163,12 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 normalizationResult.availability
 );
-console.log(`[ProductResync] Upserted: ${upsertResult.upserted} (${upsertResult.new} new, ${upsertResult.updated} updated)`);
+console.log(`[ProductRefresh] Upserted: ${upsertResult.upserted} (${upsertResult.new} new, ${upsertResult.updated} updated)`);
 await ctx.heartbeat();
 // Create snapshots
-console.log(`[ProductResync] Creating snapshots...`);
+console.log(`[ProductRefresh] Creating snapshots...`);
 const snapshotsResult = await createStoreProductSnapshots(
 pool,
@@ -232,12 +179,12 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 null // No crawl_run_id in new system
 );
-console.log(`[ProductResync] Created ${snapshotsResult.created} snapshots`);
+console.log(`[ProductRefresh] Created ${snapshotsResult.created} snapshots`);
 await ctx.heartbeat();
 // ============================================================
-// STEP 6: Track missing products (consecutive_misses logic)
+// STEP 5: Track missing products (consecutive_misses logic)
 // - Products in feed: reset consecutive_misses to 0
 // - Products not in feed: increment consecutive_misses
 // - At 3 consecutive misses: mark as OOS
@@ -270,7 +217,7 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 const incrementedCount = incrementResult.rowCount || 0;
 if (incrementedCount > 0) {
-console.log(`[ProductResync] Incremented consecutive_misses for ${incrementedCount} products`);
+console.log(`[ProductRefresh] Incremented consecutive_misses for ${incrementedCount} products`);
 }
 // Mark as OOS any products that hit 3 consecutive misses
@@ -286,16 +233,16 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 const markedOosCount = oosResult.rowCount || 0;
 if (markedOosCount > 0) {
-console.log(`[ProductResync] Marked ${markedOosCount} products as OOS (3+ consecutive misses)`);
+console.log(`[ProductRefresh] Marked ${markedOosCount} products as OOS (3+ consecutive misses)`);
 }
 await ctx.heartbeat();
 // ============================================================
-// STEP 7: Download images for new products
+// STEP 6: Download images for new products
 // ============================================================
 if (upsertResult.productsNeedingImages.length > 0) {
-console.log(`[ProductResync] Downloading images for ${upsertResult.productsNeedingImages.length} products...`);
+console.log(`[ProductRefresh] Downloading images for ${upsertResult.productsNeedingImages.length} products...`);
 try {
 const dispensaryContext = {
@@ -309,12 +256,12 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 );
 } catch (imgError: any) {
 // Image download errors shouldn't fail the whole task
-console.warn(`[ProductResync] Image download error (non-fatal): ${imgError.message}`);
+console.warn(`[ProductRefresh] Image download error (non-fatal): ${imgError.message}`);
 }
 }
 // ============================================================
-// STEP 8: Update dispensary last_crawl_at
+// STEP 7: Update dispensary last_crawl_at
 // ============================================================
 await pool.query(`
 UPDATE dispensaries
@@ -322,10 +269,20 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 WHERE id = $1
 `, [dispensaryId]);
-console.log(`[ProductResync] Completed ${dispensary.name}`);
+// ============================================================
+// STEP 8: Mark payload as processed
+// ============================================================
+await pool.query(`
+UPDATE raw_crawl_payloads
+SET processed_at = NOW()
+WHERE id = $1
+`, [payloadId]);
+console.log(`[ProductRefresh] Completed ${dispensary.name}`);
 return {
 success: true,
+payloadId,
 productsProcessed: normalizationResult.products.length,
 snapshotsCreated: snapshotsResult.created,
 newProducts: upsertResult.new,
@@ -335,7 +292,7 @@ export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult
 } catch (error: unknown) {
 const errorMessage = error instanceof Error ? error.message : 'Unknown error';
-console.error(`[ProductResync] Error for dispensary ${dispensaryId}:`, errorMessage);
+console.error(`[ProductRefresh] Error for dispensary ${dispensaryId}:`, errorMessage);
 return {
 success: false,
 error: errorMessage,


@@ -1,8 +1,16 @@
/** /**
* Store Discovery Handler * Store Discovery Handler
* *
* Discovers new stores by crawling location APIs and adding them * Per TASK_WORKFLOW_2024-12-10.md: Discovers new stores and returns their IDs for task chaining.
* to discovery_locations table. *
* Flow:
* 1. For each active state, run Dutchie discovery
* 2. Discover locations via GraphQL
* 3. Auto-promote valid locations to dispensaries table
* 4. Return newStoreIds[] for chaining to payload_fetch
*
* Chaining:
* store_discovery → (returns newStoreIds) → payload_fetch → product_refresh
*/ */
import { TaskContext, TaskResult } from '../task-worker'; import { TaskContext, TaskResult } from '../task-worker';
@@ -10,23 +18,25 @@ import { discoverState } from '../../discovery';
export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult> { export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx; const { pool, task } = ctx;
const platform = task.platform || 'default'; const platform = task.platform || 'dutchie';
console.log(`[StoreDiscovery] Starting discovery for platform: ${platform}`); console.log(`[StoreDiscovery] Starting discovery for platform: ${platform}`);
try { try {
// Get states to discover // Get states to discover
const statesResult = await pool.query(` const statesResult = await pool.query(`
SELECT code FROM states WHERE active = true ORDER BY code SELECT code FROM states WHERE is_active = true ORDER BY code
`); `);
const stateCodes = statesResult.rows.map(r => r.code); const stateCodes = statesResult.rows.map(r => r.code);
if (stateCodes.length === 0) { if (stateCodes.length === 0) {
return { success: true, storesDiscovered: 0, message: 'No active states to discover' }; return { success: true, storesDiscovered: 0, newStoreIds: [], message: 'No active states to discover' };
} }
let totalDiscovered = 0; let totalDiscovered = 0;
let totalPromoted = 0; let totalPromoted = 0;
// Per TASK_WORKFLOW_2024-12-10.md: Collect all new store IDs for task chaining
const allNewStoreIds: number[] = [];
// Run discovery for each state // Run discovery for each state
for (const stateCode of stateCodes) { for (const stateCode of stateCodes) {
@@ -39,6 +49,13 @@ export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult
const result = await discoverState(pool, stateCode); const result = await discoverState(pool, stateCode);
totalDiscovered += result.totalLocationsFound || 0; totalDiscovered += result.totalLocationsFound || 0;
totalPromoted += result.totalLocationsUpserted || 0; totalPromoted += result.totalLocationsUpserted || 0;
// Per TASK_WORKFLOW_2024-12-10.md: Collect new IDs for chaining
if (result.newDispensaryIds && result.newDispensaryIds.length > 0) {
allNewStoreIds.push(...result.newDispensaryIds);
console.log(`[StoreDiscovery] ${stateCode}: ${result.newDispensaryIds.length} new stores`);
}
console.log(`[StoreDiscovery] ${stateCode}: found ${result.totalLocationsFound}, upserted ${result.totalLocationsUpserted}`); console.log(`[StoreDiscovery] ${stateCode}: found ${result.totalLocationsFound}, upserted ${result.totalLocationsUpserted}`);
} catch (error: unknown) { } catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error'; const errorMessage = error instanceof Error ? error.message : 'Unknown error';
@@ -47,13 +64,15 @@ export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult
} }
} }
console.log(`[StoreDiscovery] Complete: ${totalDiscovered} discovered, ${totalPromoted} promoted`); console.log(`[StoreDiscovery] Complete: ${totalDiscovered} discovered, ${totalPromoted} promoted, ${allNewStoreIds.length} new stores`);
return { return {
success: true, success: true,
storesDiscovered: totalDiscovered, storesDiscovered: totalDiscovered,
storesPromoted: totalPromoted, storesPromoted: totalPromoted,
statesProcessed: stateCodes.length, statesProcessed: stateCodes.length,
// Per TASK_WORKFLOW_2024-12-10.md: Return new IDs for task chaining
newStoreIds: allNewStoreIds,
}; };
} catch (error: unknown) { } catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error'; const errorMessage = error instanceof Error ? error.message : 'Unknown error';
@@ -61,6 +80,7 @@ export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult
return { return {
success: false, success: false,
error: errorMessage, error: errorMessage,
newStoreIds: [],
}; };
} }
} }
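
Reviewer sketch for the hunk above: the per-state aggregation of new store IDs, shown in isolation. This is a minimal sketch and not the handler itself. `StateResult` and `collectNewStoreIds` are hypothetical names; only the `newDispensaryIds` field and the "skip empty results" check are taken from the diff.

```typescript
// Hypothetical result shape; only newDispensaryIds mirrors the diff above.
interface StateResult {
  totalLocationsFound?: number;
  totalLocationsUpserted?: number;
  newDispensaryIds?: number[];
}

// Collect new store IDs across all per-state results, skipping states
// that reported none (field missing or empty array).
function collectNewStoreIds(results: StateResult[]): number[] {
  const all: number[] = [];
  for (const r of results) {
    if (r.newDispensaryIds && r.newDispensaryIds.length > 0) {
      all.push(...r.newDispensaryIds);
    }
  }
  return all;
}
```

The returned array plays the role of `newStoreIds` in the task result, which is what the chaining step reads after a `store_discovery` task completes.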


@@ -0,0 +1,37 @@
/**
* Task Pool State
*
* Shared state for task pool pause/resume functionality.
* This is kept separate to avoid circular dependencies between
* task-service.ts and routes/tasks.ts.
*
* State is in-memory and resets on server restart.
* By default, the pool is PAUSED (closed) - admin must explicitly start it.
* This prevents workers from immediately grabbing tasks on deploy before
* the system is ready.
*/
let taskPoolPaused = true;
export function isTaskPoolPaused(): boolean {
return taskPoolPaused;
}
export function pauseTaskPool(): void {
taskPoolPaused = true;
console.log('[TaskPool] Task pool PAUSED - workers will not pick up new tasks');
}
export function resumeTaskPool(): void {
taskPoolPaused = false;
console.log('[TaskPool] Task pool RESUMED - workers can pick up tasks');
}
export function getTaskPoolStatus(): { paused: boolean; message: string } {
return {
paused: taskPoolPaused,
message: taskPoolPaused
? 'Task pool is paused - workers will not pick up new tasks'
: 'Task pool is open - workers are picking up tasks',
};
}
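
Reviewer sketch for the new file above: how a paused-by-default flag gates claiming. This is a minimal sketch under stated assumptions. `paused`, `resume`, and `claim` are hypothetical stand-ins, and an in-memory array replaces the database that the real `claimTask` queries.

```typescript
// Hypothetical stand-in for the module-level flag; starts paused, as in the diff.
let paused = true;

function resume(): void {
  paused = false;
}

// Hypothetical claim function: a queue of task IDs stands in for worker_tasks.
function claim(queue: number[]): number | null {
  if (paused) {
    return null; // gate: nothing is claimed while the pool is paused
  }
  return queue.shift() ?? null; // otherwise take the oldest task ID, if any
}
```

Because the flag is module state, each Node.js process holds its own copy. If workers run in a separate process from the API server that exposes pause/resume, a resume in the API process would not change the value a worker sees; this sketch assumes both live in one process.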


@@ -9,12 +9,28 @@
*/ */
import { pool } from '../db/pool'; import { pool } from '../db/pool';
import { isTaskPoolPaused } from './task-pool-state';
// Helper to check if a table exists
async function tableExists(tableName: string): Promise<boolean> {
const result = await pool.query(`
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_name = $1
) as exists
`, [tableName]);
return result.rows[0].exists;
}
// Per TASK_WORKFLOW_2024-12-10.md: Task roles
// payload_fetch: Hits Dutchie API, saves raw payload to filesystem
// product_refresh: Reads local payload, normalizes, upserts to DB
export type TaskRole = export type TaskRole =
| 'store_discovery' | 'store_discovery'
| 'entry_point_discovery' | 'entry_point_discovery'
| 'product_discovery' | 'product_discovery'
| 'product_refresh' | 'payload_fetch' // NEW: Fetches from API, saves to disk
| 'product_refresh' // CHANGED: Now reads from local payload
| 'analytics_refresh'; | 'analytics_refresh';
export type TaskStatus = export type TaskStatus =
@@ -44,6 +60,7 @@ export interface WorkerTask {
error_message: string | null; error_message: string | null;
retry_count: number; retry_count: number;
max_retries: number; max_retries: number;
payload: Record<string, unknown> | null; // Per TASK_WORKFLOW_2024-12-10.md: Task chaining data
created_at: Date; created_at: Date;
updated_at: Date; updated_at: Date;
} }
@@ -54,6 +71,7 @@ export interface CreateTaskParams {
platform?: string; platform?: string;
priority?: number; priority?: number;
scheduled_for?: Date; scheduled_for?: Date;
payload?: Record<string, unknown>; // Per TASK_WORKFLOW_2024-12-10.md: For task chaining data
} }
export interface CapacityMetrics { export interface CapacityMetrics {
@@ -85,8 +103,8 @@ class TaskService {
*/ */
async createTask(params: CreateTaskParams): Promise<WorkerTask> { async createTask(params: CreateTaskParams): Promise<WorkerTask> {
const result = await pool.query( const result = await pool.query(
`INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for) `INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for, payload)
VALUES ($1, $2, $3, $4, $5) VALUES ($1, $2, $3, $4, $5, $6)
RETURNING *`, RETURNING *`,
[ [
params.role, params.role,
@@ -94,6 +112,7 @@ class TaskService {
params.platform ?? null, params.platform ?? null,
params.priority ?? 0, params.priority ?? 0,
params.scheduled_for ?? null, params.scheduled_for ?? null,
params.payload ? JSON.stringify(params.payload) : null,
] ]
); );
return result.rows[0] as WorkerTask; return result.rows[0] as WorkerTask;
@@ -131,8 +150,14 @@ class TaskService {
/** /**
* Claim a task atomically for a worker * Claim a task atomically for a worker
* If role is null, claims ANY available task (role-agnostic worker) * If role is null, claims ANY available task (role-agnostic worker)
* Returns null if task pool is paused.
*/ */
async claimTask(role: TaskRole | null, workerId: string): Promise<WorkerTask | null> { async claimTask(role: TaskRole | null, workerId: string): Promise<WorkerTask | null> {
// Check if task pool is paused - don't claim any tasks
if (isTaskPoolPaused()) {
return null;
}
if (role) { if (role) {
// Role-specific claiming - use the SQL function // Role-specific claiming - use the SQL function
const result = await pool.query( const result = await pool.query(
@@ -270,6 +295,11 @@ class TaskService {
* List tasks with filters * List tasks with filters
*/ */
async listTasks(filter: TaskFilter = {}): Promise<WorkerTask[]> { async listTasks(filter: TaskFilter = {}): Promise<WorkerTask[]> {
// Return empty list if table doesn't exist
if (!await tableExists('worker_tasks')) {
return [];
}
const conditions: string[] = []; const conditions: string[] = [];
const params: (string | number | string[])[] = []; const params: (string | number | string[])[] = [];
let paramIndex = 1; let paramIndex = 1;
@@ -323,21 +353,41 @@ class TaskService {
* Get capacity metrics for all roles * Get capacity metrics for all roles
*/ */
async getCapacityMetrics(): Promise<CapacityMetrics[]> { async getCapacityMetrics(): Promise<CapacityMetrics[]> {
// Return empty metrics if worker_tasks table doesn't exist
if (!await tableExists('worker_tasks')) {
return [];
}
try {
const result = await pool.query( const result = await pool.query(
`SELECT * FROM v_worker_capacity` `SELECT * FROM v_worker_capacity`
); );
return result.rows as CapacityMetrics[]; return result.rows as CapacityMetrics[];
} catch {
// View may not exist
return [];
}
} }
/** /**
* Get capacity metrics for a specific role * Get capacity metrics for a specific role
*/ */
async getRoleCapacity(role: TaskRole): Promise<CapacityMetrics | null> { async getRoleCapacity(role: TaskRole): Promise<CapacityMetrics | null> {
// Return null if worker_tasks table doesn't exist
if (!await tableExists('worker_tasks')) {
return null;
}
try {
const result = await pool.query( const result = await pool.query(
`SELECT * FROM v_worker_capacity WHERE role = $1`, `SELECT * FROM v_worker_capacity WHERE role = $1`,
[role] [role]
); );
return (result.rows[0] as CapacityMetrics) || null; return (result.rows[0] as CapacityMetrics) || null;
} catch {
// View may not exist
return null;
}
} }
/** /**
@@ -365,6 +415,17 @@ class TaskService {
/** /**
* Chain next task after completion * Chain next task after completion
* Called automatically when a task completes successfully * Called automatically when a task completes successfully
*
* Per TASK_WORKFLOW_2024-12-10.md: Task chaining flow:
*
* Discovery flow (new stores):
* store_discovery → product_discovery → payload_fetch → product_refresh
*
* Scheduled flow (existing stores):
* payload_fetch → product_refresh
*
* Note: entry_point_discovery is deprecated since platform_dispensary_id
* is now resolved during store promotion.
*/ */
async chainNextTask(completedTask: WorkerTask): Promise<WorkerTask | null> { async chainNextTask(completedTask: WorkerTask): Promise<WorkerTask | null> {
if (completedTask.status !== 'completed') { if (completedTask.status !== 'completed') {
@@ -373,12 +434,14 @@ class TaskService {
switch (completedTask.role) { switch (completedTask.role) {
case 'store_discovery': { case 'store_discovery': {
// New stores discovered -> create entry_point_discovery tasks // Per TASK_WORKFLOW_2024-12-10.md: New stores discovered -> create product_discovery tasks
// Skip entry_point_discovery since platform_dispensary_id is set during promotion
const newStoreIds = (completedTask.result as { newStoreIds?: number[] })?.newStoreIds; const newStoreIds = (completedTask.result as { newStoreIds?: number[] })?.newStoreIds;
if (newStoreIds && newStoreIds.length > 0) { if (newStoreIds && newStoreIds.length > 0) {
console.log(`[TaskService] Chaining ${newStoreIds.length} product_discovery tasks for new stores`);
for (const storeId of newStoreIds) { for (const storeId of newStoreIds) {
await this.createTask({ await this.createTask({
role: 'entry_point_discovery', role: 'product_discovery',
dispensary_id: storeId, dispensary_id: storeId,
platform: completedTask.platform ?? undefined, platform: completedTask.platform ?? undefined,
priority: 10, // High priority for new stores priority: 10, // High priority for new stores
@@ -389,7 +452,8 @@ class TaskService {
} }
case 'entry_point_discovery': { case 'entry_point_discovery': {
// Entry point resolved -> create product_discovery task // DEPRECATED: Entry point resolution now happens during store promotion
// Kept for backward compatibility with any in-flight tasks
const success = (completedTask.result as { success?: boolean })?.success; const success = (completedTask.result as { success?: boolean })?.success;
if (success && completedTask.dispensary_id) { if (success && completedTask.dispensary_id) {
return this.createTask({ return this.createTask({
@@ -403,8 +467,15 @@ class TaskService {
} }
case 'product_discovery': { case 'product_discovery': {
// Product discovery done -> store is now ready for regular resync // Per TASK_WORKFLOW_2024-12-10.md: Product discovery chains internally to payload_fetch
// No immediate chaining needed; will be picked up by daily batch generation // No external chaining needed - handleProductDiscovery calls handlePayloadFetch directly
break;
}
case 'payload_fetch': {
// Per TASK_WORKFLOW_2024-12-10.md: payload_fetch chains to product_refresh
// This is handled internally by the payload_fetch handler via taskService.createTask
// No external chaining needed here
break; break;
} }
} }
@@ -463,12 +534,6 @@ class TaskService {
* Get task counts by status for dashboard * Get task counts by status for dashboard
*/ */
async getTaskCounts(): Promise<Record<TaskStatus, number>> { async getTaskCounts(): Promise<Record<TaskStatus, number>> {
const result = await pool.query(
`SELECT status, COUNT(*) as count
FROM worker_tasks
GROUP BY status`
);
const counts: Record<TaskStatus, number> = { const counts: Record<TaskStatus, number> = {
pending: 0, pending: 0,
claimed: 0, claimed: 0,
@@ -478,6 +543,17 @@ class TaskService {
stale: 0, stale: 0,
}; };
// Return empty counts if table doesn't exist
if (!await tableExists('worker_tasks')) {
return counts;
}
const result = await pool.query(
`SELECT status, COUNT(*) as count
FROM worker_tasks
GROUP BY status`
);
for (const row of result.rows) { for (const row of result.rows) {
const typedRow = row as { status: TaskStatus; count: string }; const typedRow = row as { status: TaskStatus; count: string };
counts[typedRow.status] = parseInt(typedRow.count, 10); counts[typedRow.status] = parseInt(typedRow.count, 10);
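
Reviewer sketch for the chaining comments above: the documented flow written as a lookup table. This is an illustration of the logical order only. `NEXT_ROLE` and `chainFrom` are hypothetical names, and in the diff the hops after `product_discovery` are performed inside the handlers rather than in `chainNextTask`. The `entry_point_discovery` entry assumes the deprecated path still leads to `product_discovery`, as the old comment states.

```typescript
type TaskRole =
  | 'store_discovery'
  | 'entry_point_discovery'
  | 'product_discovery'
  | 'payload_fetch'
  | 'product_refresh'
  | 'analytics_refresh';

// Next role in the documented flow, or null when the chain ends.
const NEXT_ROLE: Record<TaskRole, TaskRole | null> = {
  store_discovery: 'product_discovery',
  entry_point_discovery: 'product_discovery', // deprecated; assumed from the old comment
  product_discovery: 'payload_fetch',
  payload_fetch: 'product_refresh',
  product_refresh: null,
  analytics_refresh: null,
};

// Walk the table from a starting role and return the full chain.
function chainFrom(start: TaskRole): TaskRole[] {
  const chain: TaskRole[] = [start];
  let next = NEXT_ROLE[start];
  while (next !== null) {
    chain.push(next);
    next = NEXT_ROLE[next];
  }
  return chain;
}
```

`chainFrom('store_discovery')` walks the discovery flow for new stores, while `chainFrom('payload_fetch')` gives the shorter scheduled flow for existing stores.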


@@ -52,6 +52,8 @@ import { CrawlRotator } from '../services/crawl-rotator';
import { setCrawlRotator } from '../platforms/dutchie'; import { setCrawlRotator } from '../platforms/dutchie';
// Task handlers by role // Task handlers by role
// Per TASK_WORKFLOW_2024-12-10.md: payload_fetch and product_refresh are now separate
import { handlePayloadFetch } from './handlers/payload-fetch';
import { handleProductRefresh } from './handlers/product-refresh'; import { handleProductRefresh } from './handlers/product-refresh';
import { handleProductDiscovery } from './handlers/product-discovery'; import { handleProductDiscovery } from './handlers/product-discovery';
import { handleStoreDiscovery } from './handlers/store-discovery'; import { handleStoreDiscovery } from './handlers/store-discovery';
@@ -62,6 +64,33 @@ const POLL_INTERVAL_MS = parseInt(process.env.POLL_INTERVAL_MS || '5000');
const HEARTBEAT_INTERVAL_MS = parseInt(process.env.HEARTBEAT_INTERVAL_MS || '30000'); const HEARTBEAT_INTERVAL_MS = parseInt(process.env.HEARTBEAT_INTERVAL_MS || '30000');
const API_BASE_URL = process.env.API_BASE_URL || 'http://localhost:3010'; const API_BASE_URL = process.env.API_BASE_URL || 'http://localhost:3010';
// =============================================================================
// CONCURRENT TASK PROCESSING SETTINGS
// =============================================================================
// Workers can process multiple tasks simultaneously using async I/O.
// This improves throughput for I/O-bound tasks (network calls, DB queries).
//
// Resource thresholds trigger "backoff" - the worker stops claiming new tasks
// but continues processing existing ones until resources return to normal.
//
// See: docs/WORKER_TASK_ARCHITECTURE.md#concurrent-task-processing
// =============================================================================
// Maximum number of tasks this worker will run concurrently
// Tune based on workload: I/O-bound tasks benefit from higher concurrency
const MAX_CONCURRENT_TASKS = parseInt(process.env.MAX_CONCURRENT_TASKS || '3');
// When heap memory usage exceeds this threshold (as decimal 0.0-1.0), stop claiming new tasks
// Default 85% - gives headroom before OOM
const MEMORY_BACKOFF_THRESHOLD = parseFloat(process.env.MEMORY_BACKOFF_THRESHOLD || '0.85');
// When CPU usage exceeds this threshold (as decimal 0.0-1.0), stop claiming new tasks
// Default 90% - allows some burst capacity
const CPU_BACKOFF_THRESHOLD = parseFloat(process.env.CPU_BACKOFF_THRESHOLD || '0.90');
// How long to wait (ms) when in backoff state before rechecking resources
const BACKOFF_DURATION_MS = parseInt(process.env.BACKOFF_DURATION_MS || '10000');
export interface TaskContext { export interface TaskContext {
pool: Pool; pool: Pool;
workerId: string; workerId: string;
@@ -80,14 +109,37 @@ export interface TaskResult {
type TaskHandler = (ctx: TaskContext) => Promise<TaskResult>; type TaskHandler = (ctx: TaskContext) => Promise<TaskResult>;
// Per TASK_WORKFLOW_2024-12-10.md: Handler registry
// payload_fetch: Fetches from Dutchie API, saves to disk, chains to product_refresh
// product_refresh: Reads local payload, normalizes, upserts to DB
const TASK_HANDLERS: Record<TaskRole, TaskHandler> = { const TASK_HANDLERS: Record<TaskRole, TaskHandler> = {
product_refresh: handleProductRefresh, payload_fetch: handlePayloadFetch, // NEW: API fetch -> disk
product_refresh: handleProductRefresh, // CHANGED: disk -> DB
product_discovery: handleProductDiscovery, product_discovery: handleProductDiscovery,
store_discovery: handleStoreDiscovery, store_discovery: handleStoreDiscovery,
entry_point_discovery: handleEntryPointDiscovery, entry_point_discovery: handleEntryPointDiscovery,
analytics_refresh: handleAnalyticsRefresh, analytics_refresh: handleAnalyticsRefresh,
}; };
/**
* Resource usage stats reported to the registry and used for backoff decisions.
* These values are included in worker heartbeats and displayed in the UI.
*/
interface ResourceStats {
/** Current heap memory usage as decimal (0.0 to 1.0) */
memoryPercent: number;
/** Current heap used in MB */
memoryMb: number;
/** Total heap available in MB */
memoryTotalMb: number;
/** CPU usage percentage since last check (0 to 100) */
cpuPercent: number;
/** True if worker is currently in backoff state */
isBackingOff: boolean;
/** Reason for backoff (e.g., "Memory at 87.3% (threshold: 85%)") */
backoffReason: string | null;
}
export class TaskWorker { export class TaskWorker {
private pool: Pool; private pool: Pool;
private workerId: string; private workerId: string;
@@ -96,37 +148,186 @@ export class TaskWorker {
private isRunning: boolean = false; private isRunning: boolean = false;
private heartbeatInterval: NodeJS.Timeout | null = null; private heartbeatInterval: NodeJS.Timeout | null = null;
private registryHeartbeatInterval: NodeJS.Timeout | null = null; private registryHeartbeatInterval: NodeJS.Timeout | null = null;
private currentTask: WorkerTask | null = null;
private crawlRotator: CrawlRotator; private crawlRotator: CrawlRotator;
// ==========================================================================
// CONCURRENT TASK TRACKING
// ==========================================================================
// activeTasks: Map of task ID -> task object for all currently running tasks
// taskPromises: Map of task ID -> Promise for cleanup when task completes
// maxConcurrentTasks: How many tasks this worker will run in parallel
// ==========================================================================
private activeTasks: Map<number, WorkerTask> = new Map();
private taskPromises: Map<number, Promise<void>> = new Map();
private maxConcurrentTasks: number = MAX_CONCURRENT_TASKS;
// ==========================================================================
// RESOURCE MONITORING FOR BACKOFF
// ==========================================================================
// CPU tracking uses differential measurement - we track last values and
// calculate percentage based on elapsed time since last check.
// ==========================================================================
private lastCpuUsage: { user: number; system: number } = { user: 0, system: 0 };
private lastCpuCheck: number = Date.now();
private isBackingOff: boolean = false;
private backoffReason: string | null = null;
constructor(role: TaskRole | null = null, workerId?: string) { constructor(role: TaskRole | null = null, workerId?: string) {
this.pool = getPool(); this.pool = getPool();
this.role = role; this.role = role;
this.workerId = workerId || `worker-${uuidv4().slice(0, 8)}`; this.workerId = workerId || `worker-${uuidv4().slice(0, 8)}`;
this.crawlRotator = new CrawlRotator(this.pool); this.crawlRotator = new CrawlRotator(this.pool);
// Initialize CPU tracking
const cpuUsage = process.cpuUsage();
this.lastCpuUsage = { user: cpuUsage.user, system: cpuUsage.system };
this.lastCpuCheck = Date.now();
}
/**
* Get current resource usage
*/
private getResourceStats(): ResourceStats {
const memUsage = process.memoryUsage();
const heapUsedMb = memUsage.heapUsed / 1024 / 1024;
const heapTotalMb = memUsage.heapTotal / 1024 / 1024;
const memoryPercent = heapUsedMb / heapTotalMb;
// Calculate CPU usage since last check
const cpuUsage = process.cpuUsage();
const now = Date.now();
const elapsed = now - this.lastCpuCheck;
let cpuPercent = 0;
if (elapsed > 0) {
const userDiff = (cpuUsage.user - this.lastCpuUsage.user) / 1000; // microseconds to ms
const systemDiff = (cpuUsage.system - this.lastCpuUsage.system) / 1000;
cpuPercent = ((userDiff + systemDiff) / elapsed) * 100;
}
// Update last values
this.lastCpuUsage = { user: cpuUsage.user, system: cpuUsage.system };
this.lastCpuCheck = now;
return {
memoryPercent,
memoryMb: Math.round(heapUsedMb),
memoryTotalMb: Math.round(heapTotalMb),
cpuPercent: Math.min(100, cpuPercent), // Cap at 100%
isBackingOff: this.isBackingOff,
backoffReason: this.backoffReason,
};
}
/**
* Check if we should back off from taking new tasks
*/
private shouldBackOff(): { backoff: boolean; reason: string | null } {
const stats = this.getResourceStats();
if (stats.memoryPercent > MEMORY_BACKOFF_THRESHOLD) {
return { backoff: true, reason: `Memory at ${(stats.memoryPercent * 100).toFixed(1)}% (threshold: ${MEMORY_BACKOFF_THRESHOLD * 100}%)` };
}
if (stats.cpuPercent > CPU_BACKOFF_THRESHOLD * 100) {
return { backoff: true, reason: `CPU at ${stats.cpuPercent.toFixed(1)}% (threshold: ${CPU_BACKOFF_THRESHOLD * 100}%)` };
}
return { backoff: false, reason: null };
}
/**
* Get count of currently running tasks
*/
get activeTaskCount(): number {
return this.activeTasks.size;
}
/**
* Check if we can accept more tasks
*/
private canAcceptMoreTasks(): boolean {
return this.activeTasks.size < this.maxConcurrentTasks;
} }
/** /**
* Initialize stealth systems (proxy rotation, fingerprints) * Initialize stealth systems (proxy rotation, fingerprints)
* Called once on worker startup before processing any tasks. * Called once on worker startup before processing any tasks.
* *
* IMPORTANT: Proxies are REQUIRED. Workers will fail to start if no proxies available. * IMPORTANT: Proxies are REQUIRED. Workers will wait until proxies are available.
* Workers listen for PostgreSQL NOTIFY 'proxy_added' to wake up immediately when proxies are added.
*/ */
private async initializeStealth(): Promise<void> { private async initializeStealth(): Promise<void> {
const MAX_WAIT_MINUTES = 60;
const POLL_INTERVAL_MS = 30000; // 30 seconds fallback polling
const maxAttempts = (MAX_WAIT_MINUTES * 60 * 1000) / POLL_INTERVAL_MS;
let attempts = 0;
let notifyClient: any = null;
// Set up PostgreSQL LISTEN for proxy notifications
try {
notifyClient = await this.pool.connect();
await notifyClient.query('LISTEN proxy_added');
console.log(`[TaskWorker] Listening for proxy_added notifications...`);
} catch (err: any) {
console.log(`[TaskWorker] Could not set up LISTEN (will poll): ${err.message}`);
}
// Create a promise that resolves when notified
let notifyResolve: (() => void) | null = null;
if (notifyClient) {
notifyClient.on('notification', (msg: any) => {
if (msg.channel === 'proxy_added') {
console.log(`[TaskWorker] Received proxy_added notification!`);
if (notifyResolve) notifyResolve();
}
});
}
try {
while (attempts < maxAttempts) {
try {
// Load proxies from database // Load proxies from database
await this.crawlRotator.initialize(); await this.crawlRotator.initialize();
const stats = this.crawlRotator.proxy.getStats(); const stats = this.crawlRotator.proxy.getStats();
if (stats.activeProxies === 0) { if (stats.activeProxies > 0) {
throw new Error('No active proxies available. Workers MUST use proxies for all requests. Add proxies to the database before starting workers.');
}
console.log(`[TaskWorker] Loaded ${stats.activeProxies} proxies (${stats.avgSuccessRate.toFixed(1)}% avg success rate)`); console.log(`[TaskWorker] Loaded ${stats.activeProxies} proxies (${stats.avgSuccessRate.toFixed(1)}% avg success rate)`);
// Wire rotator to Dutchie client - proxies will be used for ALL requests // Wire rotator to Dutchie client - proxies will be used for ALL requests
setCrawlRotator(this.crawlRotator); setCrawlRotator(this.crawlRotator);
console.log(`[TaskWorker] Stealth initialized: ${this.crawlRotator.userAgent.getCount()} fingerprints, proxy REQUIRED for all requests`); console.log(`[TaskWorker] Stealth initialized: ${this.crawlRotator.userAgent.getCount()} fingerprints, proxy REQUIRED for all requests`);
return;
}
attempts++;
console.log(`[TaskWorker] No active proxies available (attempt ${attempts}). Waiting for proxies...`);
// Wait for either notification or timeout
await new Promise<void>((resolve) => {
notifyResolve = resolve;
setTimeout(resolve, POLL_INTERVAL_MS);
});
} catch (error: any) {
attempts++;
console.log(`[TaskWorker] Error loading proxies (attempt ${attempts}): ${error.message}. Retrying...`);
await this.sleep(POLL_INTERVAL_MS);
}
}
throw new Error(`No active proxies available after waiting ${MAX_WAIT_MINUTES} minutes. Add proxies to the database.`);
} finally {
// Clean up LISTEN connection
if (notifyClient) {
try {
await notifyClient.query('UNLISTEN proxy_added');
notifyClient.release();
} catch {
// Ignore cleanup errors
}
}
}
} }
/** /**
@@ -189,21 +390,32 @@ export class TaskWorker {
const memUsage = process.memoryUsage(); const memUsage = process.memoryUsage();
const cpuUsage = process.cpuUsage(); const cpuUsage = process.cpuUsage();
const proxyLocation = this.crawlRotator.getProxyLocation(); const proxyLocation = this.crawlRotator.getProxyLocation();
const resourceStats = this.getResourceStats();
// Get array of active task IDs
const activeTaskIds = Array.from(this.activeTasks.keys());
await fetch(`${API_BASE_URL}/api/worker-registry/heartbeat`, { await fetch(`${API_BASE_URL}/api/worker-registry/heartbeat`, {
method: 'POST', method: 'POST',
headers: { 'Content-Type': 'application/json' }, headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ body: JSON.stringify({
worker_id: this.workerId, worker_id: this.workerId,
current_task_id: this.currentTask?.id || null, current_task_id: activeTaskIds[0] || null, // Primary task for backwards compat
status: this.currentTask ? 'active' : 'idle', current_task_ids: activeTaskIds, // All active tasks
active_task_count: this.activeTasks.size,
max_concurrent_tasks: this.maxConcurrentTasks,
status: this.activeTasks.size > 0 ? 'active' : 'idle',
resources: { resources: {
memory_mb: Math.round(memUsage.heapUsed / 1024 / 1024), memory_mb: Math.round(memUsage.heapUsed / 1024 / 1024),
memory_total_mb: Math.round(memUsage.heapTotal / 1024 / 1024), memory_total_mb: Math.round(memUsage.heapTotal / 1024 / 1024),
memory_rss_mb: Math.round(memUsage.rss / 1024 / 1024), memory_rss_mb: Math.round(memUsage.rss / 1024 / 1024),
memory_percent: Math.round(resourceStats.memoryPercent * 100),
cpu_user_ms: Math.round(cpuUsage.user / 1000), cpu_user_ms: Math.round(cpuUsage.user / 1000),
cpu_system_ms: Math.round(cpuUsage.system / 1000), cpu_system_ms: Math.round(cpuUsage.system / 1000),
cpu_percent: Math.round(resourceStats.cpuPercent),
proxy_location: proxyLocation, proxy_location: proxyLocation,
is_backing_off: this.isBackingOff,
backoff_reason: this.backoffReason,
} }
}) })
}); });
@@ -265,20 +477,85 @@ export class TaskWorker {
this.startRegistryHeartbeat(); this.startRegistryHeartbeat();
const roleMsg = this.role ? `for role: ${this.role}` : '(role-agnostic - any task)'; const roleMsg = this.role ? `for role: ${this.role}` : '(role-agnostic - any task)';
console.log(`[TaskWorker] ${this.friendlyName} starting ${roleMsg}`); console.log(`[TaskWorker] ${this.friendlyName} starting ${roleMsg} (max ${this.maxConcurrentTasks} concurrent tasks)`);
while (this.isRunning) { while (this.isRunning) {
try { try {
await this.processNextTask(); await this.mainLoop();
} catch (error: any) { } catch (error: any) {
console.error(`[TaskWorker] Loop error:`, error.message); console.error(`[TaskWorker] Loop error:`, error.message);
await this.sleep(POLL_INTERVAL_MS); await this.sleep(POLL_INTERVAL_MS);
} }
} }
// Wait for any remaining tasks to complete
if (this.taskPromises.size > 0) {
console.log(`[TaskWorker] Waiting for ${this.taskPromises.size} active tasks to complete...`);
await Promise.allSettled(this.taskPromises.values());
}
console.log(`[TaskWorker] Worker ${this.workerId} stopped`); console.log(`[TaskWorker] Worker ${this.workerId} stopped`);
} }
/**
* Main loop - tries to fill up to maxConcurrentTasks
*/
private async mainLoop(): Promise<void> {
// Check resource usage and back off if needed
const { backoff, reason } = this.shouldBackOff();
if (backoff) {
if (!this.isBackingOff) {
console.log(`[TaskWorker] ${this.friendlyName} backing off: ${reason}`);
}
this.isBackingOff = true;
this.backoffReason = reason;
await this.sleep(BACKOFF_DURATION_MS);
return;
}
// Clear backoff state
if (this.isBackingOff) {
console.log(`[TaskWorker] ${this.friendlyName} resuming normal operation`);
this.isBackingOff = false;
this.backoffReason = null;
}
// Check for decommission signal
const shouldDecommission = await this.checkDecommission();
if (shouldDecommission) {
console.log(`[TaskWorker] ${this.friendlyName} received decommission signal - waiting for ${this.activeTasks.size} tasks to complete`);
// Stop accepting new tasks, wait for current to finish
this.isRunning = false;
return;
}
// Try to claim more tasks if we have capacity
if (this.canAcceptMoreTasks()) {
const task = await taskService.claimTask(this.role, this.workerId);
if (task) {
console.log(`[TaskWorker] ${this.friendlyName} claimed task ${task.id} (${task.role}) [${this.activeTasks.size + 1}/${this.maxConcurrentTasks}]`);
this.activeTasks.set(task.id, task);
// Start task in background (don't await)
const taskPromise = this.executeTask(task);
this.taskPromises.set(task.id, taskPromise);
// Clean up when done
taskPromise.finally(() => {
this.activeTasks.delete(task.id);
this.taskPromises.delete(task.id);
});
// Immediately try to claim more tasks (don't wait for poll interval)
return;
}
}
// No task claimed or at capacity - wait before next poll
await this.sleep(POLL_INTERVAL_MS);
}
/** /**
* Stop the worker * Stop the worker
*/ */
@@ -291,23 +568,10 @@ export class TaskWorker {
} }
/** /**
* Process the next available task * Execute a single task (runs concurrently with other tasks)
*/ */
private async processNextTask(): Promise<void> { private async executeTask(task: WorkerTask): Promise<void> {
// Try to claim a task console.log(`[TaskWorker] ${this.friendlyName} starting task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
const task = await taskService.claimTask(this.role, this.workerId);
if (!task) {
// No tasks available, wait and retry
await this.sleep(POLL_INTERVAL_MS);
return;
}
this.currentTask = task;
console.log(`[TaskWorker] Claimed task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
// Start heartbeat
this.startHeartbeat(task.id);
try { try {
// Mark as running // Mark as running
@@ -336,7 +600,7 @@ export class TaskWorker {
// Mark as completed // Mark as completed
await taskService.completeTask(task.id, result); await taskService.completeTask(task.id, result);
await this.reportTaskCompletion(true); await this.reportTaskCompletion(true);
console.log(`[TaskWorker] ${this.friendlyName} completed task ${task.id}`); console.log(`[TaskWorker] ${this.friendlyName} completed task ${task.id} [${this.activeTasks.size}/${this.maxConcurrentTasks} active]`);
// Chain next task if applicable // Chain next task if applicable
const chainedTask = await taskService.chainNextTask({ const chainedTask = await taskService.chainNextTask({
@@ -358,9 +622,35 @@ export class TaskWorker {
await taskService.failTask(task.id, error.message); await taskService.failTask(task.id, error.message);
await this.reportTaskCompletion(false); await this.reportTaskCompletion(false);
console.error(`[TaskWorker] ${this.friendlyName} task ${task.id} error:`, error.message); console.error(`[TaskWorker] ${this.friendlyName} task ${task.id} error:`, error.message);
} finally { }
this.stopHeartbeat(); // Note: cleanup (removing from activeTasks) is handled by the .finally() callback registered in mainLoop
this.currentTask = null; }
/**
* Check if this worker has been flagged for decommission
* Returns true if the worker should stop once its active tasks finish
*/
private async checkDecommission(): Promise<boolean> {
try {
// Check worker_registry for decommission flag
const result = await this.pool.query(
`SELECT decommission_requested, decommission_reason
FROM worker_registry
WHERE worker_id = $1`,
[this.workerId]
);
if (result.rows.length > 0 && result.rows[0].decommission_requested) {
const reason = result.rows[0].decommission_reason || 'No reason provided';
console.log(`[TaskWorker] Decommission requested: ${reason}`);
return true;
}
return false;
} catch (error: any) {
// If we can't check, continue running
console.warn(`[TaskWorker] Could not check decommission status: ${error.message}`);
return false;
} }
} }
@@ -397,12 +687,25 @@ export class TaskWorker {
/** /**
* Get worker info * Get worker info
*/ */
getInfo(): { workerId: string; role: TaskRole | null; isRunning: boolean; currentTaskId: number | null } { getInfo(): {
workerId: string;
role: TaskRole | null;
isRunning: boolean;
activeTaskIds: number[];
activeTaskCount: number;
maxConcurrentTasks: number;
isBackingOff: boolean;
backoffReason: string | null;
} {
return { return {
workerId: this.workerId, workerId: this.workerId,
role: this.role, role: this.role,
isRunning: this.isRunning, isRunning: this.isRunning,
currentTaskId: this.currentTask?.id || null, activeTaskIds: Array.from(this.activeTasks.keys()),
activeTaskCount: this.activeTasks.size,
maxConcurrentTasks: this.maxConcurrentTasks,
isBackingOff: this.isBackingOff,
backoffReason: this.backoffReason,
}; };
} }
} }
@@ -414,11 +717,13 @@ export class TaskWorker {
async function main(): Promise<void> { async function main(): Promise<void> {
const role = process.env.WORKER_ROLE as TaskRole | undefined; const role = process.env.WORKER_ROLE as TaskRole | undefined;
// Per TASK_WORKFLOW_2024-12-10.md: Valid task roles
const validRoles: TaskRole[] = [ const validRoles: TaskRole[] = [
'store_discovery', 'store_discovery',
'entry_point_discovery', 'entry_point_discovery',
'product_discovery', 'product_discovery',
'product_refresh', 'payload_fetch', // NEW: Fetches from API, saves to disk
'product_refresh', // CHANGED: Reads from disk, processes to DB
'analytics_refresh', 'analytics_refresh',
]; ];
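
The role list above presumably gates which `WORKER_ROLE` values a worker accepts at startup. A minimal sketch of that kind of check follows. It assumes that an unset variable means "no fixed role", which the `role: TaskRole | null` field in `getInfo()` suggests but this hunk does not show. `VALID_ROLES`, `isTaskRole`, and `parseWorkerRole` are illustrative names, not identifiers from the repository.

```typescript
// Hypothetical stand-alone sketch; only the role strings come from the diff.
const VALID_ROLES = [
  'store_discovery',
  'entry_point_discovery',
  'product_discovery',
  'payload_fetch',   // fetches from the API, saves to disk
  'product_refresh', // reads from disk, processes to DB
  'analytics_refresh',
] as const;

type TaskRole = (typeof VALID_ROLES)[number];

// Type guard: narrows an arbitrary string to TaskRole.
function isTaskRole(value: string): value is TaskRole {
  return (VALID_ROLES as readonly string[]).includes(value);
}

// Returns null when no role is configured (assumed to mean "no fixed role"),
// and throws on a value that is not in the list.
function parseWorkerRole(raw: string | undefined): TaskRole | null {
  if (raw === undefined || raw === '') {
    return null;
  }
  if (!isTaskRole(raw)) {
    throw new Error(`Invalid WORKER_ROLE "${raw}". Valid roles: ${VALID_ROLES.join(', ')}`);
  }
  return raw;
}
```

Deriving `TaskRole` from the array keeps the type and the runtime list in one place, so adding a role such as `payload_fetch` cannot update one without the other.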

backend/src/types/user-agents.d.ts (new file, 49 lines)

@@ -0,0 +1,49 @@
/**
* Type declarations for user-agents npm package
* Per workflow-12102025.md: Used for realistic UA generation with market-share weighting
*/
declare module 'user-agents' {
interface UserAgentData {
userAgent: string;
platform: string;
screenWidth: number;
screenHeight: number;
viewportWidth: number;
viewportHeight: number;
deviceCategory: 'desktop' | 'mobile' | 'tablet';
appName: string;
connection?: {
downlink: number;
effectiveType: string;
rtt: number;
};
}
interface UserAgentOptions {
deviceCategory?: 'desktop' | 'mobile' | 'tablet';
platform?: RegExp | string;
screenWidth?: RegExp | { min?: number; max?: number };
screenHeight?: RegExp | { min?: number; max?: number };
}
interface UserAgentInstance {
data: UserAgentData;
toString(): string;
random(): UserAgentInstance;
}
class UserAgent {
constructor(options?: UserAgentOptions | UserAgentOptions[]);
data: UserAgentData;
toString(): string;
random(): UserAgentInstance;
}
// Make it callable
interface UserAgent {
(): UserAgentInstance;
}
export default UserAgent;
}


@@ -0,0 +1,406 @@
/**
* Payload Storage Utility
*
* Per TASK_WORKFLOW_2024-12-10.md: Store raw GraphQL payloads for historical analysis.
*
* Design Pattern: Metadata/Payload Separation
* - Metadata in PostgreSQL (raw_crawl_payloads table): Small, indexed, queryable
* - Payload on filesystem: Gzipped JSON at storage_path
*
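* Storage structure (base path from PAYLOAD_STORAGE_PATH, default ./storage/payloads):
* {base}/{year}/{month}/{day}/store_{dispensary_id}_{timestamp}.json.gz
* (date parts use the server's local time; {timestamp} is epoch milliseconds)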
*
* Benefits:
* - Compare any two crawls to see what changed
* - Replay/re-normalize historical data if logic changes
* - Debug issues by seeing exactly what the API returned
* - DB stays small, backups stay fast
* - ~90% compression (1.5MB -> 150KB per crawl)
*/
import * as fs from 'fs';
import * as path from 'path';
import * as zlib from 'zlib';
import { promisify } from 'util';
import { Pool } from 'pg';
import * as crypto from 'crypto';
const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);
// Base path for payload storage (matches image storage pattern)
const PAYLOAD_BASE_PATH = process.env.PAYLOAD_STORAGE_PATH || './storage/payloads';
/**
* Result from saving a payload
*/
export interface SavePayloadResult {
id: number;
storagePath: string;
sizeBytes: number;
sizeBytesRaw: number;
checksum: string;
}
/**
* Result from loading a payload
*/
export interface LoadPayloadResult {
payload: any;
metadata: {
id: number;
dispensaryId: number;
crawlRunId: number | null;
productCount: number;
fetchedAt: Date;
storagePath: string;
};
}
/**
* Generate storage path for a payload
*
* Format: {PAYLOAD_BASE_PATH}/{year}/{month}/{day}/store_{dispensary_id}_{timestamp}.json.gz
*/
function generateStoragePath(dispensaryId: number, timestamp: Date): string {
const year = timestamp.getFullYear();
const month = String(timestamp.getMonth() + 1).padStart(2, '0');
const day = String(timestamp.getDate()).padStart(2, '0');
const ts = timestamp.getTime();
return path.join(
PAYLOAD_BASE_PATH,
String(year),
month,
day,
`store_${dispensaryId}_${ts}.json.gz`
);
}
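
To make the layout concrete, here is a small sketch of the same path scheme with the base directory passed in as an argument. `payloadPathFor` is a hypothetical name used only for illustration. Like the function above, it reads the date parts in local time, so the folder a payload lands in depends on the server's timezone.

```typescript
import * as path from 'path';

// Hypothetical helper mirroring generateStoragePath, with an explicit base path.
function payloadPathFor(basePath: string, dispensaryId: number, timestamp: Date): string {
  const year = String(timestamp.getFullYear());
  const month = String(timestamp.getMonth() + 1).padStart(2, '0'); // getMonth() is 0-based
  const day = String(timestamp.getDate()).padStart(2, '0');
  const epochMs = timestamp.getTime(); // keeps file names unique per fetch time
  return path.join(basePath, year, month, day, `store_${dispensaryId}_${epochMs}.json.gz`);
}

// Example: a crawl of store 42 fetched on 5 December 2025 (local time).
const examplePath = payloadPathFor('/data/payloads', 42, new Date(2025, 11, 5, 9, 30));
console.log(examplePath);
```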
/**
* Ensure directory exists for a file path
*/
async function ensureDir(filePath: string): Promise<void> {
const dir = path.dirname(filePath);
await fs.promises.mkdir(dir, { recursive: true });
}
/**
* Calculate SHA256 checksum of data
*/
function calculateChecksum(data: Buffer): string {
return crypto.createHash('sha256').update(data).digest('hex');
}
/**
* Save a raw crawl payload to filesystem and record metadata in DB
*
* @param pool - Database connection pool
* @param dispensaryId - ID of the dispensary
* @param payload - Raw JSON payload from GraphQL
* @param crawlRunId - Optional crawl_run ID for linking
* @param productCount - Number of products in payload
* @returns SavePayloadResult with file info and DB record ID
*/
export async function saveRawPayload(
pool: Pool,
dispensaryId: number,
payload: any,
crawlRunId: number | null = null,
productCount: number = 0
): Promise<SavePayloadResult> {
const timestamp = new Date();
const storagePath = generateStoragePath(dispensaryId, timestamp);
// Serialize and compress
const jsonStr = JSON.stringify(payload);
const rawSize = Buffer.byteLength(jsonStr, 'utf8');
const compressed = await gzip(Buffer.from(jsonStr, 'utf8'));
const compressedSize = compressed.length;
const checksum = calculateChecksum(compressed);
// Write to filesystem
await ensureDir(storagePath);
await fs.promises.writeFile(storagePath, compressed);
// Record metadata in DB
const result = await pool.query(`
INSERT INTO raw_crawl_payloads (
crawl_run_id,
dispensary_id,
storage_path,
product_count,
size_bytes,
size_bytes_raw,
fetched_at,
checksum_sha256
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
RETURNING id
`, [
crawlRunId,
dispensaryId,
storagePath,
productCount,
compressedSize,
rawSize,
timestamp,
checksum
]);
console.log(`[PayloadStorage] Saved payload for store ${dispensaryId}: ${storagePath} (${(compressedSize / 1024).toFixed(1)}KB compressed, ${(rawSize / 1024).toFixed(1)}KB raw)`);
return {
id: result.rows[0].id,
storagePath,
sizeBytes: compressedSize,
sizeBytesRaw: rawSize,
checksum
};
}
/**
* Load a raw payload from filesystem by metadata ID
*
* @param pool - Database connection pool
* @param payloadId - ID from raw_crawl_payloads table
* @returns LoadPayloadResult with parsed payload and metadata
*/
export async function loadRawPayloadById(
pool: Pool,
payloadId: number
): Promise<LoadPayloadResult | null> {
const result = await pool.query(`
SELECT id, dispensary_id, crawl_run_id, storage_path, product_count, fetched_at
FROM raw_crawl_payloads
WHERE id = $1
`, [payloadId]);
if (result.rows.length === 0) {
return null;
}
const row = result.rows[0];
const payload = await loadPayloadFromPath(row.storage_path);
return {
payload,
metadata: {
id: row.id,
dispensaryId: row.dispensary_id,
crawlRunId: row.crawl_run_id,
productCount: row.product_count,
fetchedAt: row.fetched_at,
storagePath: row.storage_path
}
};
}
/**
* Load a raw payload directly from filesystem path
*
* @param storagePath - Path to gzipped JSON file
* @returns Parsed JSON payload
*/
export async function loadPayloadFromPath(storagePath: string): Promise<any> {
const compressed = await fs.promises.readFile(storagePath);
const decompressed = await gunzip(compressed);
return JSON.parse(decompressed.toString('utf8'));
}
/**
* Get the latest payload for a dispensary
*
* @param pool - Database connection pool
* @param dispensaryId - ID of the dispensary
* @returns LoadPayloadResult or null if none exists
*/
export async function getLatestPayload(
pool: Pool,
dispensaryId: number
): Promise<LoadPayloadResult | null> {
const result = await pool.query(`
SELECT id, dispensary_id, crawl_run_id, storage_path, product_count, fetched_at
FROM raw_crawl_payloads
WHERE dispensary_id = $1
ORDER BY fetched_at DESC
LIMIT 1
`, [dispensaryId]);
if (result.rows.length === 0) {
return null;
}
const row = result.rows[0];
const payload = await loadPayloadFromPath(row.storage_path);
return {
payload,
metadata: {
id: row.id,
dispensaryId: row.dispensary_id,
crawlRunId: row.crawl_run_id,
productCount: row.product_count,
fetchedAt: row.fetched_at,
storagePath: row.storage_path
}
};
}
/**
* Get the most recent payloads for a dispensary (by default the latest two, for comparison)
*
* @param pool - Database connection pool
* @param dispensaryId - ID of the dispensary
* @param limit - Number of recent payloads to retrieve (default 2)
* @returns Array of LoadPayloadResult, most recent first
*/
export async function getRecentPayloads(
pool: Pool,
dispensaryId: number,
limit: number = 2
): Promise<LoadPayloadResult[]> {
const result = await pool.query(`
SELECT id, dispensary_id, crawl_run_id, storage_path, product_count, fetched_at
FROM raw_crawl_payloads
WHERE dispensary_id = $1
ORDER BY fetched_at DESC
LIMIT $2
`, [dispensaryId, limit]);
const payloads: LoadPayloadResult[] = [];
for (const row of result.rows) {
const payload = await loadPayloadFromPath(row.storage_path);
payloads.push({
payload,
metadata: {
id: row.id,
dispensaryId: row.dispensary_id,
crawlRunId: row.crawl_run_id,
productCount: row.product_count,
fetchedAt: row.fetched_at,
storagePath: row.storage_path
}
});
}
return payloads;
}
/**
* List payload metadata without loading files (for browsing/pagination)
*
* @param pool - Database connection pool
* @param options - Query options
* @returns Array of metadata rows
*/
export async function listPayloadMetadata(
pool: Pool,
options: {
dispensaryId?: number;
startDate?: Date;
endDate?: Date;
limit?: number;
offset?: number;
} = {}
): Promise<Array<{
id: number;
dispensaryId: number;
crawlRunId: number | null;
storagePath: string;
productCount: number;
sizeBytes: number;
sizeBytesRaw: number;
fetchedAt: Date;
}>> {
const conditions: string[] = [];
const params: any[] = [];
let paramIndex = 1;
if (options.dispensaryId) {
conditions.push(`dispensary_id = $${paramIndex++}`);
params.push(options.dispensaryId);
}
if (options.startDate) {
conditions.push(`fetched_at >= $${paramIndex++}`);
params.push(options.startDate);
}
if (options.endDate) {
conditions.push(`fetched_at <= $${paramIndex++}`);
params.push(options.endDate);
}
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const limit = options.limit || 50;
const offset = options.offset || 0;
params.push(limit, offset);
const result = await pool.query(`
SELECT
id,
dispensary_id,
crawl_run_id,
storage_path,
product_count,
size_bytes,
size_bytes_raw,
fetched_at
FROM raw_crawl_payloads
${whereClause}
ORDER BY fetched_at DESC
LIMIT $${paramIndex++} OFFSET $${paramIndex}
`, params);
return result.rows.map(row => ({
id: row.id,
dispensaryId: row.dispensary_id,
crawlRunId: row.crawl_run_id,
storagePath: row.storage_path,
productCount: row.product_count,
sizeBytes: row.size_bytes,
sizeBytesRaw: row.size_bytes_raw,
fetchedAt: row.fetched_at
}));
}
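
The filter handling in `listPayloadMetadata` follows a common pattern: push each value onto the parameter array and reference it by its 1-based position. A minimal sketch of that pattern in isolation follows. `MetadataFilter` and `buildMetadataQuery` are hypothetical names, and the column list is shortened to `id` for brevity. Unlike the function above, this version uses `!== undefined` and `??`, so an explicit `0` is not treated as "not provided".

```typescript
interface MetadataFilter {
  dispensaryId?: number;
  startDate?: Date;
  endDate?: Date;
  limit?: number;
  offset?: number;
}

// Builds parameterized SQL text plus its values array; nothing is interpolated
// into the SQL except placeholder numbers.
function buildMetadataQuery(filter: MetadataFilter = {}): { text: string; values: unknown[] } {
  const conditions: string[] = [];
  const values: unknown[] = [];

  // Push the value first so values.length is the placeholder index.
  const add = (clause: string, value: unknown): void => {
    values.push(value);
    conditions.push(`${clause} $${values.length}`);
  };

  if (filter.dispensaryId !== undefined) add('dispensary_id =', filter.dispensaryId);
  if (filter.startDate) add('fetched_at >=', filter.startDate);
  if (filter.endDate) add('fetched_at <=', filter.endDate);

  const where = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
  values.push(filter.limit ?? 50, filter.offset ?? 0);

  const text = `SELECT id FROM raw_crawl_payloads ${where} ORDER BY fetched_at DESC ` +
    `LIMIT $${values.length - 1} OFFSET $${values.length}`;
  return { text: text.replace(/\s+/g, ' ').trim(), values };
}
```

The returned object has the `{ text, values }` shape that `pool.query` accepts as a query config in node-postgres.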
/**
* Delete old payloads (for retention policy)
*
* @param pool - Database connection pool
* @param olderThan - Delete payloads older than this date
* @returns Number of payloads deleted
*/
export async function deleteOldPayloads(
pool: Pool,
olderThan: Date
): Promise<number> {
// Get paths first
const result = await pool.query(`
SELECT id, storage_path FROM raw_crawl_payloads
WHERE fetched_at < $1
`, [olderThan]);
// Delete files
for (const row of result.rows) {
try {
await fs.promises.unlink(row.storage_path);
} catch (err: any) {
if (err.code !== 'ENOENT') {
console.warn(`[PayloadStorage] Failed to delete ${row.storage_path}: ${err.message}`);
}
}
}
// Delete DB records
await pool.query(`
DELETE FROM raw_crawl_payloads
WHERE fetched_at < $1
`, [olderThan]);
console.log(`[PayloadStorage] Deleted ${result.rows.length} payloads older than ${olderThan.toISOString()}`);
return result.rows.length;
}
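
`saveRawPayload` records a SHA-256 of the compressed bytes, but the loaders shown in this file do not select `checksum_sha256`, so nothing here verifies it on read. The sketch below shows how the compress, hash, and verify steps could fit together. It uses the synchronous zlib calls for brevity, and `packPayload` and `unpackPayload` are hypothetical names rather than functions from this repository.

```typescript
import * as zlib from 'zlib';
import * as crypto from 'crypto';

interface PackedPayload {
  compressed: Buffer; // gzipped JSON, as written to disk
  checksum: string;   // SHA-256 hex digest of the compressed bytes
  rawSize: number;    // size of the JSON before compression
}

function sha256Hex(data: Buffer): string {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// Serialize, gzip, and hash a payload (same order as saveRawPayload).
function packPayload(payload: unknown): PackedPayload {
  const json = Buffer.from(JSON.stringify(payload), 'utf8');
  const compressed = zlib.gzipSync(json);
  return { compressed, checksum: sha256Hex(compressed), rawSize: json.length };
}

// Verify the stored checksum before decompressing and parsing.
function unpackPayload(compressed: Buffer, expectedChecksum: string): unknown {
  const actual = sha256Hex(compressed);
  if (actual !== expectedChecksum) {
    throw new Error(`Checksum mismatch: expected ${expectedChecksum}, got ${actual}`);
  }
  return JSON.parse(zlib.gunzipSync(compressed).toString('utf8'));
}
```

Hashing the compressed bytes rather than the JSON means a file can be checked for corruption without decompressing it first.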


@@ -6,8 +6,8 @@ WORKDIR /app
# Copy package files # Copy package files
COPY package*.json ./ COPY package*.json ./
# Install dependencies # Install dependencies (npm install is more forgiving than npm ci)
RUN npm ci RUN npm install
# Copy source files # Copy source files
COPY . . COPY . .


@@ -7,8 +7,8 @@
<title>CannaIQ - Cannabis Menu Intelligence Platform</title> <title>CannaIQ - Cannabis Menu Intelligence Platform</title>
<meta name="description" content="CannaIQ provides real-time cannabis dispensary menu data, product tracking, and analytics for dispensaries across Arizona." /> <meta name="description" content="CannaIQ provides real-time cannabis dispensary menu data, product tracking, and analytics for dispensaries across Arizona." />
<meta name="keywords" content="cannabis, dispensary, menu, products, analytics, Arizona" /> <meta name="keywords" content="cannabis, dispensary, menu, products, analytics, Arizona" />
<script type="module" crossorigin src="/assets/index-BML8-px1.js"></script> <script type="module" crossorigin src="/assets/index-Dq9S0rVi.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-B2gR-58G.css"> <link rel="stylesheet" crossorigin href="/assets/index-DhM09B-d.css">
</head> </head>
<body> <body>
<div id="root"></div> <div id="root"></div>


@@ -8,6 +8,7 @@ import { ProductDetail } from './pages/ProductDetail';
import { Stores } from './pages/Stores'; import { Stores } from './pages/Stores';
import { Dispensaries } from './pages/Dispensaries'; import { Dispensaries } from './pages/Dispensaries';
import { DispensaryDetail } from './pages/DispensaryDetail'; import { DispensaryDetail } from './pages/DispensaryDetail';
import { DispensarySchedule } from './pages/DispensarySchedule';
import { StoreDetail } from './pages/StoreDetail'; import { StoreDetail } from './pages/StoreDetail';
import { StoreBrands } from './pages/StoreBrands'; import { StoreBrands } from './pages/StoreBrands';
import { StoreSpecials } from './pages/StoreSpecials'; import { StoreSpecials } from './pages/StoreSpecials';
@@ -46,7 +47,6 @@ import CrossStateCompare from './pages/CrossStateCompare';
import StateDetail from './pages/StateDetail'; import StateDetail from './pages/StateDetail';
import { Discovery } from './pages/Discovery'; import { Discovery } from './pages/Discovery';
import { WorkersDashboard } from './pages/WorkersDashboard'; import { WorkersDashboard } from './pages/WorkersDashboard';
import { JobQueue } from './pages/JobQueue';
import TasksDashboard from './pages/TasksDashboard'; import TasksDashboard from './pages/TasksDashboard';
import { ScraperOverviewDashboard } from './pages/ScraperOverviewDashboard'; import { ScraperOverviewDashboard } from './pages/ScraperOverviewDashboard';
import { SeoOrchestrator } from './pages/admin/seo/SeoOrchestrator'; import { SeoOrchestrator } from './pages/admin/seo/SeoOrchestrator';
@@ -66,6 +66,7 @@ export default function App() {
<Route path="/stores" element={<PrivateRoute><Stores /></PrivateRoute>} /> <Route path="/stores" element={<PrivateRoute><Stores /></PrivateRoute>} />
<Route path="/dispensaries" element={<PrivateRoute><Dispensaries /></PrivateRoute>} /> <Route path="/dispensaries" element={<PrivateRoute><Dispensaries /></PrivateRoute>} />
<Route path="/dispensaries/:state/:city/:slug" element={<PrivateRoute><DispensaryDetail /></PrivateRoute>} /> <Route path="/dispensaries/:state/:city/:slug" element={<PrivateRoute><DispensaryDetail /></PrivateRoute>} />
<Route path="/dispensaries/:state/:city/:slug/schedule" element={<PrivateRoute><DispensarySchedule /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug/brands" element={<PrivateRoute><StoreBrands /></PrivateRoute>} /> <Route path="/stores/:state/:storeName/:slug/brands" element={<PrivateRoute><StoreBrands /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug/specials" element={<PrivateRoute><StoreSpecials /></PrivateRoute>} /> <Route path="/stores/:state/:storeName/:slug/specials" element={<PrivateRoute><StoreSpecials /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug" element={<PrivateRoute><StoreDetail /></PrivateRoute>} /> <Route path="/stores/:state/:storeName/:slug" element={<PrivateRoute><StoreDetail /></PrivateRoute>} />
@@ -123,8 +124,6 @@ export default function App() {
<Route path="/discovery" element={<PrivateRoute><Discovery /></PrivateRoute>} /> <Route path="/discovery" element={<PrivateRoute><Discovery /></PrivateRoute>} />
{/* Workers Dashboard */} {/* Workers Dashboard */}
<Route path="/workers" element={<PrivateRoute><WorkersDashboard /></PrivateRoute>} /> <Route path="/workers" element={<PrivateRoute><WorkersDashboard /></PrivateRoute>} />
{/* Job Queue Management */}
<Route path="/job-queue" element={<PrivateRoute><JobQueue /></PrivateRoute>} />
{/* Task Queue Dashboard */} {/* Task Queue Dashboard */}
<Route path="/tasks" element={<PrivateRoute><TasksDashboard /></PrivateRoute>} /> <Route path="/tasks" element={<PrivateRoute><TasksDashboard /></PrivateRoute>} />
{/* Scraper Overview Dashboard (new primary) */} {/* Scraper Overview Dashboard (new primary) */}


@@ -1,5 +1,5 @@
import { ReactNode, useEffect, useState } from 'react'; import { ReactNode, useEffect, useState, useRef } from 'react';
import { useNavigate, useLocation } from 'react-router-dom'; import { useNavigate, useLocation, Link } from 'react-router-dom';
import { useAuthStore } from '../store/authStore'; import { useAuthStore } from '../store/authStore';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { StateSelector } from './StateSelector'; import { StateSelector } from './StateSelector';
@@ -48,8 +48,8 @@ interface NavLinkProps {
function NavLink({ to, icon, label, isActive }: NavLinkProps) { function NavLink({ to, icon, label, isActive }: NavLinkProps) {
return ( return (
<a <Link
href={to} to={to}
className={`flex items-center gap-3 px-3 py-2 rounded-lg text-sm font-medium transition-colors ${ className={`flex items-center gap-3 px-3 py-2 rounded-lg text-sm font-medium transition-colors ${
isActive isActive
? 'bg-emerald-50 text-emerald-700' ? 'bg-emerald-50 text-emerald-700'
@@ -58,7 +58,7 @@ function NavLink({ to, icon, label, isActive }: NavLinkProps) {
> >
<span className={`flex-shrink-0 ${isActive ? 'text-emerald-600' : 'text-gray-400'}`}>{icon}</span> <span className={`flex-shrink-0 ${isActive ? 'text-emerald-600' : 'text-gray-400'}`}>{icon}</span>
<span>{label}</span> <span>{label}</span>
</a> </Link>
); );
} }
@@ -86,6 +86,8 @@ export function Layout({ children }: LayoutProps) {
const { user, logout } = useAuthStore(); const { user, logout } = useAuthStore();
const [versionInfo, setVersionInfo] = useState<VersionInfo | null>(null); const [versionInfo, setVersionInfo] = useState<VersionInfo | null>(null);
const [sidebarOpen, setSidebarOpen] = useState(false); const [sidebarOpen, setSidebarOpen] = useState(false);
const navRef = useRef<HTMLElement>(null);
const scrollPositionRef = useRef<number>(0);
useEffect(() => { useEffect(() => {
const fetchVersion = async () => { const fetchVersion = async () => {
@@ -111,9 +113,27 @@ export function Layout({ children }: LayoutProps) {
return location.pathname.startsWith(path); return location.pathname.startsWith(path);
}; };
// Close sidebar on route change (mobile) // Save scroll position before route change
useEffect(() => {
const nav = navRef.current;
if (nav) {
const handleScroll = () => {
scrollPositionRef.current = nav.scrollTop;
};
nav.addEventListener('scroll', handleScroll);
return () => nav.removeEventListener('scroll', handleScroll);
}
}, []);
// Restore scroll position after route change and close mobile sidebar
useEffect(() => { useEffect(() => {
setSidebarOpen(false); setSidebarOpen(false);
// Restore scroll position after render
requestAnimationFrame(() => {
if (navRef.current) {
navRef.current.scrollTop = scrollPositionRef.current;
}
});
}, [location.pathname]); }, [location.pathname]);
const sidebarContent = ( const sidebarContent = (
@@ -131,7 +151,7 @@ export function Layout({ children }: LayoutProps) {
<span className="text-lg font-bold text-gray-900">CannaIQ</span> <span className="text-lg font-bold text-gray-900">CannaIQ</span>
{versionInfo && ( {versionInfo && (
<p className="text-xs text-gray-400"> <p className="text-xs text-gray-400">
v{versionInfo.version} ({versionInfo.git_sha}) {versionInfo.build_time !== 'unknown' && `- ${new Date(versionInfo.build_time).toLocaleDateString()}`} {versionInfo.git_sha || 'dev'}
</p> </p>
)} )}
</div> </div>
@@ -145,7 +165,7 @@ export function Layout({ children }: LayoutProps) {
</div> </div>
{/* Navigation */} {/* Navigation */}
<nav className="flex-1 px-3 py-4 space-y-6 overflow-y-auto"> <nav ref={navRef} className="flex-1 px-3 py-4 space-y-6 overflow-y-auto">
<NavSection title="Main"> <NavSection title="Main">
<NavLink to="/dashboard" icon={<LayoutDashboard className="w-4 h-4" />} label="Dashboard" isActive={isActive('/dashboard', true)} /> <NavLink to="/dashboard" icon={<LayoutDashboard className="w-4 h-4" />} label="Dashboard" isActive={isActive('/dashboard', true)} />
<NavLink to="/dispensaries" icon={<Building2 className="w-4 h-4" />} label="Dispensaries" isActive={isActive('/dispensaries')} /> <NavLink to="/dispensaries" icon={<Building2 className="w-4 h-4" />} label="Dispensaries" isActive={isActive('/dispensaries')} />
@@ -164,8 +184,7 @@ export function Layout({ children }: LayoutProps) {
<NavLink to="/admin/orchestrator" icon={<Activity className="w-4 h-4" />} label="Orchestrator" isActive={isActive('/admin/orchestrator')} /> <NavLink to="/admin/orchestrator" icon={<Activity className="w-4 h-4" />} label="Orchestrator" isActive={isActive('/admin/orchestrator')} />
<NavLink to="/users" icon={<UserCog className="w-4 h-4" />} label="Users" isActive={isActive('/users')} /> <NavLink to="/users" icon={<UserCog className="w-4 h-4" />} label="Users" isActive={isActive('/users')} />
<NavLink to="/workers" icon={<Users className="w-4 h-4" />} label="Workers" isActive={isActive('/workers')} /> <NavLink to="/workers" icon={<Users className="w-4 h-4" />} label="Workers" isActive={isActive('/workers')} />
<NavLink to="/job-queue" icon={<ListOrdered className="w-4 h-4" />} label="Job Queue" isActive={isActive('/job-queue')} /> <NavLink to="/tasks" icon={<ListChecks className="w-4 h-4" />} label="Tasks" isActive={isActive('/tasks')} />
<NavLink to="/tasks" icon={<ListChecks className="w-4 h-4" />} label="Task Queue" isActive={isActive('/tasks')} />
<NavLink to="/admin/seo" icon={<FileText className="w-4 h-4" />} label="SEO Pages" isActive={isActive('/admin/seo')} /> <NavLink to="/admin/seo" icon={<FileText className="w-4 h-4" />} label="SEO Pages" isActive={isActive('/admin/seo')} />
<NavLink to="/proxies" icon={<Shield className="w-4 h-4" />} label="Proxies" isActive={isActive('/proxies')} /> <NavLink to="/proxies" icon={<Shield className="w-4 h-4" />} label="Proxies" isActive={isActive('/proxies')} />
<NavLink to="/api-permissions" icon={<Key className="w-4 h-4" />} label="API Keys" isActive={isActive('/api-permissions')} /> <NavLink to="/api-permissions" icon={<Key className="w-4 h-4" />} label="API Keys" isActive={isActive('/api-permissions')} />


@@ -0,0 +1,138 @@
import { useState, useEffect, useRef } from 'react';
import { api } from '../lib/api';
import { Shield, X, Loader2 } from 'lucide-react';
interface PasswordConfirmModalProps {
isOpen: boolean;
onClose: () => void;
onConfirm: () => void;
title: string;
description: string;
}
export function PasswordConfirmModal({
isOpen,
onClose,
onConfirm,
title,
description,
}: PasswordConfirmModalProps) {
const [password, setPassword] = useState('');
const [error, setError] = useState('');
const [loading, setLoading] = useState(false);
const inputRef = useRef<HTMLInputElement>(null);
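useEffect(() => {
if (!isOpen) return;
setPassword('');
setError('');
// Focus the input once the modal has rendered; clear the timer if it closes first
const focusTimer = setTimeout(() => inputRef.current?.focus(), 100);
return () => clearTimeout(focusTimer);
}, [isOpen]);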
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
if (!password.trim()) {
setError('Password is required');
return;
}
setLoading(true);
setError('');
try {
const result = await api.verifyPassword(password);
if (result.verified) {
onConfirm();
onClose();
} else {
setError('Invalid password');
}
} catch (err: any) {
setError(err.message || 'Verification failed');
} finally {
setLoading(false);
}
};
if (!isOpen) return null;
return (
<div className="fixed inset-0 z-50 flex items-center justify-center">
{/* Backdrop */}
<div
className="absolute inset-0 bg-black bg-opacity-50"
onClick={onClose}
/>
{/* Modal */}
<div className="relative bg-white rounded-lg shadow-xl max-w-md w-full mx-4">
{/* Header */}
<div className="flex items-center justify-between px-6 py-4 border-b border-gray-200">
<div className="flex items-center gap-3">
<div className="p-2 bg-amber-100 rounded-lg">
<Shield className="w-5 h-5 text-amber-600" />
</div>
<h3 className="text-lg font-semibold text-gray-900">{title}</h3>
</div>
<button
onClick={onClose}
className="p-1 hover:bg-gray-100 rounded-lg transition-colors"
>
<X className="w-5 h-5 text-gray-500" />
</button>
</div>
{/* Body */}
<form onSubmit={handleSubmit}>
<div className="px-6 py-4">
<p className="text-gray-600 mb-4">{description}</p>
<div className="space-y-2">
<label
htmlFor="password"
className="block text-sm font-medium text-gray-700"
>
Enter your password to continue
</label>
<input
ref={inputRef}
type="password"
id="password"
value={password}
onChange={(e) => setPassword(e.target.value)}
className="w-full px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500 focus:border-emerald-500"
placeholder="Password"
disabled={loading}
/>
{error && (
<p className="text-sm text-red-600">{error}</p>
)}
</div>
</div>
{/* Footer */}
<div className="flex justify-end gap-3 px-6 py-4 border-t border-gray-200 bg-gray-50 rounded-b-lg">
<button
type="button"
onClick={onClose}
disabled={loading}
className="px-4 py-2 text-gray-700 hover:bg-gray-100 rounded-lg transition-colors"
>
Cancel
</button>
<button
type="submit"
disabled={loading}
className="px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 transition-colors disabled:opacity-50 flex items-center gap-2"
>
{loading && <Loader2 className="w-4 h-4 animate-spin" />}
Confirm
</button>
</div>
</form>
</div>
</div>
);
}


@@ -84,6 +84,13 @@ class ApiClient {
}); });
} }
async verifyPassword(password: string) {
return this.request<{ verified: boolean; error?: string }>('/api/auth/verify-password', {
method: 'POST',
body: JSON.stringify({ password }),
});
}
async getMe() { async getMe() {
return this.request<{ user: any }>('/api/auth/me'); return this.request<{ user: any }>('/api/auth/me');
} }
@@ -320,7 +327,7 @@ class ApiClient {
} }
async testAllProxies() { async testAllProxies() {
return this.request<{ jobId: number; message: string }>('/api/proxies/test-all', { return this.request<{ jobId: number; total: number; message: string }>('/api/proxies/test-all', {
method: 'POST', method: 'POST',
}); });
} }
@@ -983,6 +990,47 @@ class ApiClient {
}>(`/api/markets/stores/${id}/categories`); }>(`/api/markets/stores/${id}/categories`);
} }
async getStoreCrawlHistory(id: number, limit = 50) {
return this.request<{
dispensary: {
id: number;
name: string;
dba_name: string | null;
slug: string;
state: string;
city: string;
menu_type: string | null;
platform_dispensary_id: string | null;
last_menu_scrape: string | null;
} | null;
history: Array<{
id: number;
runId: string | null;
profileKey: string | null;
crawlerModule: string | null;
stateAtStart: string | null;
stateAtEnd: string | null;
totalSteps: number;
durationMs: number | null;
success: boolean;
errorMessage: string | null;
productsFound: number | null;
startedAt: string | null;
completedAt: string | null;
}>;
nextSchedule: {
scheduleId: number;
jobName: string;
enabled: boolean;
baseIntervalMinutes: number;
jitterMinutes: number;
nextRunAt: string | null;
lastRunAt: string | null;
lastStatus: string | null;
} | null;
}>(`/api/markets/stores/${id}/crawl-history?limit=${limit}`);
}
// Global Brands/Categories (from v_brands/v_categories views) // Global Brands/Categories (from v_brands/v_categories views)
async getMarketBrands(params?: { limit?: number; offset?: number }) { async getMarketBrands(params?: { limit?: number; offset?: number }) {
const searchParams = new URLSearchParams(); const searchParams = new URLSearchParams();
@@ -1518,10 +1566,11 @@ class ApiClient {
} }
// Intelligence API // Intelligence API
async getIntelligenceBrands(params?: { limit?: number; offset?: number }) { async getIntelligenceBrands(params?: { limit?: number; offset?: number; state?: string }) {
const searchParams = new URLSearchParams(); const searchParams = new URLSearchParams();
if (params?.limit) searchParams.append('limit', params.limit.toString()); if (params?.limit) searchParams.append('limit', params.limit.toString());
if (params?.offset) searchParams.append('offset', params.offset.toString()); if (params?.offset) searchParams.append('offset', params.offset.toString());
if (params?.state) searchParams.append('state', params.state);
const queryString = searchParams.toString() ? `?${searchParams.toString()}` : ''; const queryString = searchParams.toString() ? `?${searchParams.toString()}` : '';
return this.request<{ return this.request<{
brands: Array<{ brands: Array<{
@@ -1536,7 +1585,10 @@ class ApiClient {
}>(`/api/admin/intelligence/brands${queryString}`); }>(`/api/admin/intelligence/brands${queryString}`);
} }
async getIntelligencePricing() { async getIntelligencePricing(params?: { state?: string }) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
const queryString = searchParams.toString() ? `?${searchParams.toString()}` : '';
return this.request<{ return this.request<{
byCategory: Array<{ byCategory: Array<{
category: string; category: string;
@@ -1552,7 +1604,7 @@ class ApiClient {
maxPrice: number; maxPrice: number;
totalProducts: number; totalProducts: number;
}; };
}>('/api/admin/intelligence/pricing'); }>(`/api/admin/intelligence/pricing${queryString}`);
} }
async getIntelligenceStoreActivity(params?: { state?: string; chainId?: number; limit?: number }) { async getIntelligenceStoreActivity(params?: { state?: string; chainId?: number; limit?: number }) {
@@ -2884,6 +2936,46 @@ class ApiClient {
`/api/tasks/store/${dispensaryId}/active` `/api/tasks/store/${dispensaryId}/active`
); );
} }
// Task Pool Control
async getTaskPoolStatus() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/status'
);
}
async pauseTaskPool() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/pause',
{ method: 'POST' }
);
}
async resumeTaskPool() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/resume',
{ method: 'POST' }
);
}
// K8s Worker Control
async getK8sWorkers() {
return this.request<{
success: boolean;
available: boolean;
replicas: number;
readyReplicas: number;
availableReplicas?: number;
error?: string;
}>('/api/k8s/workers');
}
async scaleK8sWorkers(replicas: number) {
return this.request<{ success: boolean; replicas: number; message?: string; error?: string }>(
'/api/k8s/workers/scale',
{ method: 'POST', body: JSON.stringify({ replicas }) }
);
}
} }
export const api = new ApiClient(API_URL); export const api = new ApiClient(API_URL);
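
Several of the client methods in this diff repeat the same steps: build a `URLSearchParams`, append each optional filter, and prefix `?` only when something was added. A small helper could express that once. This is a sketch under assumptions: `toQueryString` is a hypothetical name, not part of `ApiClient`. It also skips only `undefined`, whereas the methods above use truthiness checks, so it would send `offset=0` where they omit it.

```typescript
// Hypothetical helper; not part of the ApiClient shown in this diff.
function toQueryString(params: Record<string, string | number | undefined> = {}): string {
  const searchParams = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) {
      searchParams.append(key, String(value)); // URLSearchParams handles URL encoding
    }
  }
  const qs = searchParams.toString();
  return qs ? `?${qs}` : '';
}

// Example: the state filter added to the intelligence endpoints.
const pricingUrl = `/api/admin/intelligence/pricing${toQueryString({ state: 'AZ' })}`;
console.log(pricingUrl);
```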


@@ -2,7 +2,7 @@ import { useEffect, useState, useRef } from 'react';
import { Layout } from '../components/Layout'; import { Layout } from '../components/Layout';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { Toast } from '../components/Toast'; import { Toast } from '../components/Toast';
- import { Key, Plus, Copy, Check, X, Trash2, Power, PowerOff, Store, Globe, Shield, Clock, Eye, EyeOff, Search, ChevronDown } from 'lucide-react';
+ import { Key, Plus, Copy, Check, X, Trash2, Power, PowerOff, Store, Globe, Shield, Clock, Eye, EyeOff, Search, ChevronDown, Pencil } from 'lucide-react';
interface ApiPermission { interface ApiPermission {
id: number; id: number;
@@ -161,6 +161,12 @@ export function ApiPermissions() {
allowed_ips: '', allowed_ips: '',
allowed_domains: '', allowed_domains: '',
}); });
const [editingPermission, setEditingPermission] = useState<ApiPermission | null>(null);
const [editForm, setEditForm] = useState({
user_name: '',
allowed_ips: '',
allowed_domains: '',
});
const [notification, setNotification] = useState<{ message: string; type: 'success' | 'error' | 'info' } | null>(null); const [notification, setNotification] = useState<{ message: string; type: 'success' | 'error' | 'info' } | null>(null);
useEffect(() => { useEffect(() => {
@@ -240,6 +246,33 @@ export function ApiPermissions() {
} }
}; };
const handleEdit = (perm: ApiPermission) => {
setEditingPermission(perm);
setEditForm({
user_name: perm.user_name,
allowed_ips: perm.allowed_ips || '',
allowed_domains: perm.allowed_domains || '',
});
};
const handleSaveEdit = async (e: React.FormEvent) => {
e.preventDefault();
if (!editingPermission) return;
try {
await api.updateApiPermission(editingPermission.id, {
user_name: editForm.user_name,
allowed_ips: editForm.allowed_ips || undefined,
allowed_domains: editForm.allowed_domains || undefined,
});
setNotification({ message: 'API key updated successfully', type: 'success' });
setEditingPermission(null);
loadPermissions();
} catch (error: any) {
setNotification({ message: 'Failed to update permission: ' + error.message, type: 'error' });
}
};
const copyToClipboard = async (text: string, id: number) => { const copyToClipboard = async (text: string, id: number) => {
await navigator.clipboard.writeText(text); await navigator.clipboard.writeText(text);
setCopiedId(id); setCopiedId(id);
@@ -494,21 +527,36 @@ export function ApiPermissions() {
</button> </button>
</div> </div>
- {/* Restrictions */}
- {(perm.allowed_ips || perm.allowed_domains) && (
- <div className="flex gap-4 mt-3 text-xs text-gray-500">
- {perm.allowed_ips && (
- <span>IPs: {perm.allowed_ips.split('\n').length} allowed</span>
- )}
- {perm.allowed_domains && (
- <span>Domains: {perm.allowed_domains.split('\n').length} allowed</span>
- )}
- </div>
- )}
+ {/* Allowed Domains - Always show */}
+ <div className="mt-3 text-xs">
+ <span className="text-gray-500 flex items-center gap-1">
+ <Globe className="w-3 h-3" />
+ Domains:{' '}
+ {perm.allowed_domains ? (
+ <span className="text-gray-700 font-mono">
+ {perm.allowed_domains.split('\n').filter(d => d.trim()).join(', ')}
+ </span>
+ ) : (
+ <span className="text-amber-600">Any domain (no restriction)</span>
+ )}
+ </span>
+ {perm.allowed_ips && (
+ <span className="text-gray-500 ml-4">
+ IPs: {perm.allowed_ips.split('\n').filter(ip => ip.trim()).length} allowed
+ </span>
+ )}
+ </div>
</div> </div>
{/* Actions */} {/* Actions */}
<div className="flex items-center gap-2 ml-4"> <div className="flex items-center gap-2 ml-4">
<button
onClick={() => handleEdit(perm)}
className="p-2 text-blue-600 hover:bg-blue-50 rounded-lg transition-colors"
title="Edit"
>
<Pencil className="w-5 h-5" />
</button>
<button <button
onClick={() => handleToggle(perm.id)} onClick={() => handleToggle(perm.id)}
className={`p-2 rounded-lg transition-colors ${ className={`p-2 rounded-lg transition-colors ${
@@ -534,6 +582,86 @@ export function ApiPermissions() {
</div> </div>
)} )}
</div> </div>
{/* Edit Modal */}
{editingPermission && (
<div className="fixed inset-0 bg-black/50 flex items-center justify-center z-50">
<div className="bg-white rounded-xl shadow-xl max-w-lg w-full mx-4 max-h-[90vh] overflow-y-auto">
<div className="px-6 py-4 border-b border-gray-200">
<h2 className="text-lg font-semibold text-gray-900 flex items-center gap-2">
<Pencil className="w-5 h-5 text-blue-600" />
Edit API Key
</h2>
<p className="text-sm text-gray-500 mt-1">
{editingPermission.store_name}
</p>
</div>
<form onSubmit={handleSaveEdit} className="p-6 space-y-5">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Label / Website Name
</label>
<input
type="text"
value={editForm.user_name}
onChange={(e) => setEditForm({ ...editForm, user_name: e.target.value })}
className="w-full px-4 py-2.5 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
required
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
<Globe className="w-4 h-4 inline mr-1" />
Allowed Domains
</label>
<textarea
value={editForm.allowed_domains}
onChange={(e) => setEditForm({ ...editForm, allowed_domains: e.target.value })}
rows={4}
className="w-full px-4 py-2.5 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent font-mono text-sm"
placeholder="example.com&#10;*.example.com&#10;subdomain.example.com"
/>
<p className="text-xs text-gray-500 mt-1">
One domain per line. Use * for wildcards (e.g., *.example.com). Leave empty to allow any domain.
</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
<Shield className="w-4 h-4 inline mr-1" />
Allowed IP Addresses
</label>
<textarea
value={editForm.allowed_ips}
onChange={(e) => setEditForm({ ...editForm, allowed_ips: e.target.value })}
rows={3}
className="w-full px-4 py-2.5 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent font-mono text-sm"
placeholder="192.168.1.1&#10;10.0.0.0/8"
/>
<p className="text-xs text-gray-500 mt-1">One per line. CIDR notation supported. Leave empty to allow any IP.</p>
</div>
<div className="flex gap-3 pt-2">
<button
type="submit"
className="flex-1 px-5 py-2.5 bg-blue-600 text-white rounded-lg hover:bg-blue-700 transition-colors"
>
Save Changes
</button>
<button
type="button"
onClick={() => setEditingPermission(null)}
className="px-5 py-2.5 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 transition-colors"
>
Cancel
</button>
</div>
</form>
</div>
</div>
)}
</div> </div>
</Layout> </Layout>
); );
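The new JSX splits `allowed_domains` and `allowed_ips` on newlines and drops blank entries inline, while `handleSaveEdit` sends the raw textarea value. A rough sketch of pulling that into one place follows. `parseAllowedList` and `toPayloadField` are hypothetical helpers, not part of the diff, and `toPayloadField` is slightly stricter than the diff's `editForm.allowed_ips || undefined`: it also treats whitespace-only input as empty.

```typescript
// Hypothetical helper mirroring the split('\n') / filter(trim) chain in the JSX.
function parseAllowedList(raw: string | null | undefined): string[] {
  if (!raw) return [];
  return raw
    .split('\n')
    .map((entry) => entry.trim()) // also strips a trailing '\r' from CRLF input
    .filter((entry) => entry.length > 0);
}

// Hypothetical helper: normalizes textarea input for the update payload.
// Returns undefined when nothing usable was entered.
function toPayloadField(raw: string): string | undefined {
  const entries = parseAllowedList(raw);
  return entries.length > 0 ? entries.join('\n') : undefined;
}
```

With helpers like these, the list view and the save handler would agree on what counts as an entry.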


@@ -1,6 +1,5 @@
import { useEffect, useState } from 'react'; import { useEffect, useState } from 'react';
import { Layout } from '../components/Layout'; import { Layout } from '../components/Layout';
import { HealthPanel } from '../components/HealthPanel';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { useNavigate } from 'react-router-dom'; import { useNavigate } from 'react-router-dom';
import { import {
@@ -42,7 +41,6 @@ export function Dashboard() {
const [activity, setActivity] = useState<any>(null); const [activity, setActivity] = useState<any>(null);
const [nationalStats, setNationalStats] = useState<any>(null); const [nationalStats, setNationalStats] = useState<any>(null);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [refreshing, setRefreshing] = useState(false);
const [pendingChangesCount, setPendingChangesCount] = useState(0); const [pendingChangesCount, setPendingChangesCount] = useState(0);
const [showNotification, setShowNotification] = useState(false); const [showNotification, setShowNotification] = useState(false);
const [taskCounts, setTaskCounts] = useState<Record<string, number> | null>(null); const [taskCounts, setTaskCounts] = useState<Record<string, number> | null>(null);
@@ -93,10 +91,7 @@ export function Dashboard() {
} }
}; };
- const loadData = async (isRefresh = false) => {
- if (isRefresh) {
- setRefreshing(true);
- }
+ const loadData = async () => {
try { try {
// Fetch dashboard data (primary data source) // Fetch dashboard data (primary data source)
const dashboard = await api.getMarketDashboard(); const dashboard = await api.getMarketDashboard();
@@ -158,7 +153,6 @@ export function Dashboard() {
console.error('Failed to load dashboard:', error); console.error('Failed to load dashboard:', error);
} finally { } finally {
setLoading(false); setLoading(false);
setRefreshing(false);
} }
}; };
@@ -271,23 +265,10 @@ export function Dashboard() {
<div className="space-y-8"> <div className="space-y-8">
{/* Header */} {/* Header */}
<div className="flex flex-col sm:flex-row sm:justify-between sm:items-center gap-4">
<div> <div>
<h1 className="text-xl sm:text-2xl font-semibold text-gray-900">Dashboard</h1> <h1 className="text-xl sm:text-2xl font-semibold text-gray-900">Dashboard</h1>
<p className="text-sm text-gray-500 mt-1">Monitor your dispensary data aggregation</p> <p className="text-sm text-gray-500 mt-1">Monitor your dispensary data aggregation</p>
</div> </div>
<button
onClick={() => loadData(true)}
disabled={refreshing}
className="inline-flex items-center justify-center gap-2 px-4 py-2 bg-white border border-gray-200 rounded-lg hover:bg-gray-50 transition-colors text-sm font-medium text-gray-700 self-start sm:self-auto disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${refreshing ? 'animate-spin' : ''}`} />
{refreshing ? 'Refreshing...' : 'Refresh'}
</button>
</div>
{/* System Health */}
<HealthPanel showQueues={false} refreshInterval={60000} />
{/* Stats Grid */} {/* Stats Grid */}
<div className="grid grid-cols-2 lg:grid-cols-3 gap-3 sm:gap-6"> <div className="grid grid-cols-2 lg:grid-cols-3 gap-3 sm:gap-6">


@@ -161,23 +161,6 @@ export function Dispensaries() {
))} ))}
</select> </select>
</div> </div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Filter by Status
</label>
<select
value={filterStatus}
onChange={(e) => handleStatusFilter(e.target.value)}
className={`w-full px-3 py-2 border rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500 ${
filterStatus === 'dropped' ? 'border-red-300 bg-red-50' : 'border-gray-300'
}`}
>
<option value="">All Statuses</option>
<option value="open">Open</option>
<option value="dropped">Dropped (Needs Review)</option>
<option value="closed">Closed</option>
</select>
</div>
</div> </div>
</div> </div>


@@ -204,47 +204,6 @@ export function DispensaryDetail() {
Back to Dispensaries Back to Dispensaries
</button> </button>
{/* Update Dropdown */}
<div className="relative">
<button
onClick={() => setShowUpdateDropdown(!showUpdateDropdown)}
disabled={isUpdating}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium text-white bg-blue-600 hover:bg-blue-700 rounded-lg disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${isUpdating ? 'animate-spin' : ''}`} />
{isUpdating ? 'Updating...' : 'Update'}
{!isUpdating && <ChevronDown className="w-4 h-4" />}
</button>
{showUpdateDropdown && !isUpdating && (
<div className="absolute right-0 mt-2 w-48 bg-white rounded-lg shadow-lg border border-gray-200 z-10">
<button
onClick={() => handleUpdate('products')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-t-lg"
>
Products
</button>
<button
onClick={() => handleUpdate('brands')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100"
>
Brands
</button>
<button
onClick={() => handleUpdate('specials')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100"
>
Specials
</button>
<button
onClick={() => handleUpdate('all')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-b-lg border-t border-gray-200"
>
All
</button>
</div>
)}
</div>
</div> </div>
{/* Dispensary Header */} {/* Dispensary Header */}
@@ -266,7 +225,7 @@ export function DispensaryDetail() {
<div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg"> <div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg">
<Calendar className="w-4 h-4" /> <Calendar className="w-4 h-4" />
<div> <div>
- <span className="font-medium">Last Crawl Date:</span>
+ <span className="font-medium">Last Updated:</span>
<span className="ml-2"> <span className="ml-2">
{dispensary.last_menu_scrape {dispensary.last_menu_scrape
? new Date(dispensary.last_menu_scrape).toLocaleDateString('en-US', { ? new Date(dispensary.last_menu_scrape).toLocaleDateString('en-US', {
@@ -331,7 +290,7 @@ export function DispensaryDetail() {
</a> </a>
)} )}
<Link <Link
- to="/schedule"
+ to={`/dispensaries/${state}/${city}/${slug}/schedule`}
className="flex items-center gap-2 text-sm text-blue-600 hover:text-blue-800" className="flex items-center gap-2 text-sm text-blue-600 hover:text-blue-800"
> >
<Clock className="w-4 h-4" /> <Clock className="w-4 h-4" />
@@ -533,57 +492,31 @@ export function DispensaryDetail() {
`$${product.regular_price}` `$${product.regular_price}`
) : '-'} ) : '-'}
</td> </td>
- <td className="text-center whitespace-nowrap">
- {product.quantity != null ? (
- <span className={`badge badge-sm ${product.quantity > 0 ? 'badge-info' : 'badge-error'}`}>
- {product.quantity}
- </span>
- ) : '-'}
+ <td className="text-center whitespace-nowrap text-sm text-gray-700">
+ {product.quantity != null ? product.quantity : '-'}
  </td>
- <td className="text-center whitespace-nowrap">
- {product.thc_percentage ? (
- <span className="badge badge-success badge-sm">{product.thc_percentage}%</span>
- ) : '-'}
+ <td className="text-center whitespace-nowrap text-sm text-gray-700">
+ {product.thc_percentage ? `${product.thc_percentage}%` : '-'}
  </td>
- <td className="text-center whitespace-nowrap">
- {product.cbd_percentage ? (
- <span className="badge badge-info badge-sm">{product.cbd_percentage}%</span>
- ) : '-'}
+ <td className="text-center whitespace-nowrap text-sm text-gray-700">
+ {product.cbd_percentage ? `${product.cbd_percentage}%` : '-'}
  </td>
- <td className="text-center whitespace-nowrap">
- {product.strain_type ? (
- <span className="badge badge-ghost badge-sm">{product.strain_type}</span>
- ) : '-'}
+ <td className="text-center whitespace-nowrap text-sm text-gray-700">
+ {product.strain_type || '-'}
  </td>
- <td className="text-center whitespace-nowrap">
- {product.in_stock ? (
- <span className="badge badge-success badge-sm">Yes</span>
- ) : product.in_stock === false ? (
- <span className="badge badge-error badge-sm">No</span>
- ) : '-'}
+ <td className="text-center whitespace-nowrap text-sm text-gray-700">
+ {product.in_stock ? 'Yes' : product.in_stock === false ? 'No' : '-'}
  </td>
  <td className="whitespace-nowrap text-xs text-gray-500">
  {product.updated_at ? formatDate(product.updated_at) : '-'}
  </td>
  <td>
- <div className="flex gap-1">
- {product.dutchie_url && (
- <a
- href={product.dutchie_url}
- target="_blank"
- rel="noopener noreferrer"
- className="btn btn-xs btn-outline"
- >
- Dutchie
- </a>
- )}
  <button
  onClick={() => navigate(`/products/${product.id}`)}
- className="btn btn-xs btn-primary"
+ className="btn btn-xs btn-ghost text-gray-500 hover:text-gray-700"
  >
  Details
  </button>
- </div>
</td> </td>
</tr> </tr>
))} ))}


@@ -0,0 +1,378 @@
import { useEffect, useState } from 'react';
import { useParams, useNavigate, Link } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import {
ArrowLeft,
Clock,
Calendar,
CheckCircle,
XCircle,
AlertCircle,
Package,
Timer,
Building2,
} from 'lucide-react';
interface CrawlHistoryItem {
id: number;
runId: string | null;
profileKey: string | null;
crawlerModule: string | null;
stateAtStart: string | null;
stateAtEnd: string | null;
totalSteps: number;
durationMs: number | null;
success: boolean;
errorMessage: string | null;
productsFound: number | null;
startedAt: string | null;
completedAt: string | null;
}
interface NextSchedule {
scheduleId: number;
jobName: string;
enabled: boolean;
baseIntervalMinutes: number;
jitterMinutes: number;
nextRunAt: string | null;
lastRunAt: string | null;
lastStatus: string | null;
}
interface Dispensary {
id: number;
name: string;
dba_name: string | null;
slug: string;
state: string;
city: string;
menu_type: string | null;
platform_dispensary_id: string | null;
last_menu_scrape: string | null;
}
export function DispensarySchedule() {
const { state, city, slug } = useParams();
const navigate = useNavigate();
const [dispensary, setDispensary] = useState<Dispensary | null>(null);
const [history, setHistory] = useState<CrawlHistoryItem[]>([]);
const [nextSchedule, setNextSchedule] = useState<NextSchedule | null>(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
loadScheduleData();
}, [slug]);
const loadScheduleData = async () => {
setLoading(true);
try {
// First get the dispensary to get the ID
const dispData = await api.getDispensary(slug!);
if (dispData?.id) {
const data = await api.getStoreCrawlHistory(dispData.id);
setDispensary(data.dispensary);
setHistory(data.history || []);
setNextSchedule(data.nextSchedule);
}
} catch (error) {
console.error('Failed to load schedule data:', error);
} finally {
setLoading(false);
}
};
const formatDate = (dateStr: string | null) => {
if (!dateStr) return 'Never';
const date = new Date(dateStr);
return date.toLocaleDateString('en-US', {
year: 'numeric',
month: 'short',
day: 'numeric',
hour: '2-digit',
minute: '2-digit',
});
};
const formatTimeAgo = (dateStr: string | null) => {
if (!dateStr) return 'Never';
const date = new Date(dateStr);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMinutes = Math.floor(diffMs / (1000 * 60));
const diffHours = Math.floor(diffMs / (1000 * 60 * 60));
const diffDays = Math.floor(diffMs / (1000 * 60 * 60 * 24));
if (diffMinutes < 1) return 'Just now';
if (diffMinutes < 60) return `${diffMinutes}m ago`;
if (diffHours < 24) return `${diffHours}h ago`;
if (diffDays === 1) return 'Yesterday';
if (diffDays < 7) return `${diffDays} days ago`;
return date.toLocaleDateString();
};
const formatTimeUntil = (dateStr: string | null) => {
if (!dateStr) return 'Not scheduled';
const date = new Date(dateStr);
const now = new Date();
const diffMs = date.getTime() - now.getTime();
if (diffMs < 0) return 'Overdue';
const diffMinutes = Math.floor(diffMs / (1000 * 60));
const diffHours = Math.floor(diffMinutes / 60);
if (diffMinutes < 60) return `in ${diffMinutes}m`;
return `in ${diffHours}h ${diffMinutes % 60}m`;
};
const formatDuration = (ms: number | null) => {
if (!ms) return '-';
if (ms < 1000) return `${ms}ms`;
const seconds = Math.floor(ms / 1000);
const minutes = Math.floor(seconds / 60);
if (minutes < 1) return `${seconds}s`;
return `${minutes}m ${seconds % 60}s`;
};
const formatInterval = (baseMinutes: number, jitterMinutes: number) => {
const hours = Math.floor(baseMinutes / 60);
const mins = baseMinutes % 60;
let base = hours > 0 ? `${hours}h` : '';
if (mins > 0) base += `${mins}m`;
return `Every ${base} (+/- ${jitterMinutes}m jitter)`;
};
if (loading) {
return (
<Layout>
<div className="text-center py-12">
<div className="inline-block animate-spin rounded-full h-8 w-8 border-4 border-gray-400 border-t-transparent"></div>
<p className="mt-2 text-sm text-gray-600">Loading schedule...</p>
</div>
</Layout>
);
}
if (!dispensary) {
return (
<Layout>
<div className="text-center py-12">
<p className="text-gray-600">Dispensary not found</p>
</div>
</Layout>
);
}
// Stats from history
const successCount = history.filter(h => h.success).length;
const failureCount = history.filter(h => !h.success).length;
const lastSuccess = history.find(h => h.success);
const avgDuration = history.length > 0
? Math.round(history.reduce((sum, h) => sum + (h.durationMs || 0), 0) / history.length)
: 0;
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between gap-4">
<button
onClick={() => navigate(`/dispensaries/${state}/${city}/${slug}`)}
className="flex items-center gap-2 text-sm text-gray-600 hover:text-gray-900"
>
<ArrowLeft className="w-4 h-4" />
Back to {dispensary.dba_name || dispensary.name}
</button>
</div>
{/* Dispensary Info */}
<div className="bg-white rounded-lg border border-gray-200 p-6">
<div className="flex items-start gap-4">
<div className="p-3 bg-blue-50 rounded-lg">
<Building2 className="w-8 h-8 text-blue-600" />
</div>
<div>
<h1 className="text-2xl font-bold text-gray-900">
{dispensary.dba_name || dispensary.name}
</h1>
<p className="text-sm text-gray-600 mt-1">
{dispensary.city}, {dispensary.state} - Crawl Schedule & History
</p>
<div className="flex items-center gap-4 mt-2 text-sm text-gray-500">
<span>Slug: {dispensary.slug}</span>
{dispensary.menu_type && (
<span className="px-2 py-0.5 bg-gray-100 rounded text-xs">
{dispensary.menu_type}
</span>
)}
</div>
</div>
</div>
</div>
{/* Next Scheduled Crawl */}
{nextSchedule && (
<div className="bg-white rounded-lg border border-gray-200 p-6">
<h2 className="text-lg font-semibold text-gray-900 mb-4 flex items-center gap-2">
<Clock className="w-5 h-5 text-blue-500" />
Upcoming Schedule
</h2>
<div className="grid grid-cols-4 gap-6">
<div>
<p className="text-sm text-gray-500">Next Run</p>
<p className="text-xl font-semibold text-blue-600">
{formatTimeUntil(nextSchedule.nextRunAt)}
</p>
<p className="text-xs text-gray-400">
{formatDate(nextSchedule.nextRunAt)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Interval</p>
<p className="text-lg font-medium">
{formatInterval(nextSchedule.baseIntervalMinutes, nextSchedule.jitterMinutes)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Last Run</p>
<p className="text-lg font-medium">
{formatTimeAgo(nextSchedule.lastRunAt)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Last Status</p>
<p className={`text-lg font-medium ${
nextSchedule.lastStatus === 'success' ? 'text-green-600' :
nextSchedule.lastStatus === 'error' ? 'text-red-600' : 'text-gray-600'
}`}>
{nextSchedule.lastStatus || '-'}
</p>
</div>
</div>
</div>
)}
{/* Stats Summary */}
<div className="grid grid-cols-4 gap-4">
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<CheckCircle className="w-8 h-8 text-green-500" />
<div>
<p className="text-sm text-gray-500">Successful Runs</p>
<p className="text-2xl font-bold text-green-600">{successCount}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<XCircle className="w-8 h-8 text-red-500" />
<div>
<p className="text-sm text-gray-500">Failed Runs</p>
<p className="text-2xl font-bold text-red-600">{failureCount}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<Timer className="w-8 h-8 text-blue-500" />
<div>
<p className="text-sm text-gray-500">Avg Duration</p>
<p className="text-2xl font-bold">{formatDuration(avgDuration)}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<Package className="w-8 h-8 text-purple-500" />
<div>
<p className="text-sm text-gray-500">Last Products Found</p>
<p className="text-2xl font-bold">
{lastSuccess?.productsFound?.toLocaleString() || '-'}
</p>
</div>
</div>
</div>
</div>
{/* Crawl History Table */}
<div className="bg-white rounded-lg border border-gray-200">
<div className="p-4 border-b border-gray-200">
<h2 className="text-lg font-semibold text-gray-900 flex items-center gap-2">
<Calendar className="w-5 h-5 text-gray-500" />
Crawl History
</h2>
</div>
<div className="overflow-x-auto">
<table className="table table-sm w-full">
<thead className="bg-gray-50">
<tr>
<th>Status</th>
<th>Started</th>
<th>Duration</th>
<th className="text-right">Products</th>
<th>State</th>
<th>Error</th>
</tr>
</thead>
<tbody>
{history.length === 0 ? (
<tr>
<td colSpan={6} className="text-center py-8 text-gray-500">
No crawl history available
</td>
</tr>
) : (
history.map((item) => (
<tr key={item.id} className="hover:bg-gray-50">
<td>
<span className={`inline-flex items-center gap-1 px-2 py-1 rounded text-xs font-medium ${
item.success
? 'bg-green-100 text-green-700'
: 'bg-red-100 text-red-700'
}`}>
{item.success ? (
<CheckCircle className="w-3 h-3" />
) : (
<XCircle className="w-3 h-3" />
)}
{item.success ? 'Success' : 'Failed'}
</span>
</td>
<td>
<div className="text-sm">{formatDate(item.startedAt)}</div>
<div className="text-xs text-gray-400">{formatTimeAgo(item.startedAt)}</div>
</td>
<td className="font-mono text-sm">
{formatDuration(item.durationMs)}
</td>
<td className="text-right font-mono text-sm">
{item.productsFound?.toLocaleString() || '-'}
</td>
<td className="text-sm text-gray-600">
{item.stateAtEnd || item.stateAtStart || '-'}
</td>
<td className="max-w-[200px]">
{item.errorMessage ? (
<span
className="text-xs text-red-600 truncate block cursor-help"
title={item.errorMessage}
>
{item.errorMessage.length > 50 ? `${item.errorMessage.substring(0, 50)}...` : item.errorMessage}
</span>
) : '-'}
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
</div>
</Layout>
);
}
export default DispensarySchedule;
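The formatting helpers in `DispensarySchedule` read the current time directly, which makes them awkward to exercise outside the component. A minimal sketch of the same logic as standalone functions follows. The injectable `now` parameter is an assumption added here for testability and is not in the diff; the branching otherwise follows `formatTimeUntil` and `formatDuration` as written above.

```typescript
// Same branching as the component's formatTimeUntil, with `now` passed in
// (an addition for illustration) instead of read from the system clock.
function formatTimeUntil(dateStr: string | null, now: Date = new Date()): string {
  if (!dateStr) return 'Not scheduled';
  const diffMs = new Date(dateStr).getTime() - now.getTime();
  if (diffMs < 0) return 'Overdue';
  const diffMinutes = Math.floor(diffMs / (1000 * 60));
  const diffHours = Math.floor(diffMinutes / 60);
  if (diffMinutes < 60) return `in ${diffMinutes}m`;
  return `in ${diffHours}h ${diffMinutes % 60}m`;
}

// Same logic as the component's formatDuration: null and 0 both render as '-'.
function formatDuration(ms: number | null): string {
  if (!ms) return '-';
  if (ms < 1000) return `${ms}ms`;
  const seconds = Math.floor(ms / 1000);
  const minutes = Math.floor(seconds / 60);
  if (minutes < 1) return `${seconds}s`;
  return `${minutes}m ${seconds % 60}s`;
}

// Example with a fixed reference time.
const reference = new Date('2025-01-01T00:00:00Z');
console.log(formatTimeUntil('2025-01-01T02:30:00Z', reference)); // in 2h 30m
console.log(formatDuration(125000)); // 2m 5s
```

Note that `formatDuration(0)` returns `'-'`, so a crawl recorded with a zero duration is displayed the same way as one with no duration at all.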


@@ -3,15 +3,16 @@ import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout'; import { Layout } from '../components/Layout';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { trackProductClick } from '../lib/analytics'; import { trackProductClick } from '../lib/analytics';
import { useStateFilter } from '../hooks/useStateFilter';
import { import {
Building2, Building2,
MapPin, MapPin,
Package, Package,
DollarSign, DollarSign,
RefreshCw,
Search, Search,
TrendingUp, TrendingUp,
BarChart3, BarChart3,
ChevronDown,
} from 'lucide-react'; } from 'lucide-react';
interface BrandData { interface BrandData {
@@ -25,19 +26,28 @@ interface BrandData {
export function IntelligenceBrands() { export function IntelligenceBrands() {
const navigate = useNavigate(); const navigate = useNavigate();
const { selectedState, setSelectedState, stateParam, stateLabel, isAllStates } = useStateFilter();
const [availableStates, setAvailableStates] = useState<string[]>([]);
const [brands, setBrands] = useState<BrandData[]>([]); const [brands, setBrands] = useState<BrandData[]>([]);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState(''); const [searchTerm, setSearchTerm] = useState('');
- const [sortBy, setSortBy] = useState<'stores' | 'skus' | 'name'>('stores');
+ const [sortBy, setSortBy] = useState<'stores' | 'skus' | 'name' | 'states'>('stores');
useEffect(() => { useEffect(() => {
loadBrands(); loadBrands();
}, [stateParam]);
useEffect(() => {
// Load available states
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []); }, []);
const loadBrands = async () => { const loadBrands = async () => {
try { try {
setLoading(true); setLoading(true);
- const data = await api.getIntelligenceBrands({ limit: 500 });
+ const data = await api.getIntelligenceBrands({ limit: 500, state: stateParam });
setBrands(data.brands || []); setBrands(data.brands || []);
} catch (error) { } catch (error) {
console.error('Failed to load brands:', error); console.error('Failed to load brands:', error);
@@ -58,6 +68,8 @@ export function IntelligenceBrands() {
return b.skuCount - a.skuCount; return b.skuCount - a.skuCount;
case 'name': case 'name':
return a.brandName.localeCompare(b.brandName); return a.brandName.localeCompare(b.brandName);
case 'states':
return b.states.length - a.states.length;
default: default:
return 0; return 0;
} }
@@ -89,37 +101,62 @@ export function IntelligenceBrands() {
<Layout> <Layout>
<div className="space-y-6"> <div className="space-y-6">
{/* Header */} {/* Header */}
- <div className="flex items-center justify-between">
+ <div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div> <div>
<h1 className="text-2xl font-bold text-gray-900">Brands Intelligence</h1> <h1 className="text-2xl font-bold text-gray-900">Brands Intelligence</h1>
<p className="text-sm text-gray-600 mt-1"> <p className="text-sm text-gray-600 mt-1">
Brand penetration and pricing analytics across markets Brand penetration and pricing analytics across markets
</p> </p>
</div> </div>
- <div className="flex gap-2">
+ <div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
  <button
- onClick={() => navigate('/admin/intelligence/pricing')}
- className="btn btn-sm btn-outline gap-1"
+ className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
  >
- <DollarSign className="w-4 h-4" />
- Pricing
+ <Building2 className="w-4 h-4" />
+ <span>Brands</span>
  </button>
  <button
  onClick={() => navigate('/admin/intelligence/stores')}
- className="btn btn-sm btn-outline gap-1"
+ className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
  >
  <MapPin className="w-4 h-4" />
- Stores
+ <span>Stores</span>
  </button>
  <button
- onClick={loadBrands}
- className="btn btn-sm btn-outline gap-2"
+ onClick={() => navigate('/admin/intelligence/pricing')}
+ className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
  >
- <RefreshCw className="w-4 h-4" />
- Refresh
+ <DollarSign className="w-4 h-4" />
+ <span>Pricing</span>
  </button>
</div> </div>
</div> </div>
</div>
{/* Summary Cards */} {/* Summary Cards */}
<div className="grid grid-cols-4 gap-4"> <div className="grid grid-cols-4 gap-4">
@@ -169,28 +206,32 @@ export function IntelligenceBrands() {
{/* Top Brands Chart */} {/* Top Brands Chart */}
<div className="bg-white rounded-lg border border-gray-200 p-4"> <div className="bg-white rounded-lg border border-gray-200 p-4">
- <h3 className="text-lg font-semibold text-gray-900 mb-4 flex items-center gap-2">
- <BarChart3 className="w-5 h-5 text-blue-500" />
+ <h3 className="text-lg font-semibold text-gray-900 flex items-center gap-2 mb-4">
+ <BarChart3 className="w-5 h-5 text-emerald-500" />
  Top 10 Brands by Store Count
  </h3>
  <div className="space-y-2">
- {topBrands.map((brand, idx) => (
+ {topBrands.map((brand) => {
+ const barWidth = Math.min((brand.storeCount / maxStoreCount) * 100, 100);
+ return (
  <div key={brand.brandName} className="flex items-center gap-3">
- <span className="text-sm text-gray-500 w-6">{idx + 1}.</span>
- <span className="text-sm font-medium w-40 truncate" title={brand.brandName}>
+ <span className="text-sm font-medium w-28 truncate shrink-0" title={brand.brandName}>
  {brand.brandName}
  </span>
- <div className="flex-1 bg-gray-100 rounded-full h-4 relative">
+ <div className="flex-1 min-w-0">
+ <div className="bg-gray-100 rounded h-5 overflow-hidden">
  <div
- className="bg-blue-500 rounded-full h-4"
- style={{ width: `${(brand.storeCount / maxStoreCount) * 100}%` }}
+ className="bg-gradient-to-r from-emerald-400 to-emerald-500 h-5 rounded transition-all"
+ style={{ width: `${barWidth}%` }}
  />
  </div>
- <span className="text-sm text-gray-600 w-16 text-right">
- {brand.storeCount} stores
+ </div>
+ <span className="text-sm font-mono font-semibold text-emerald-600 w-16 text-right shrink-0">
+ {brand.storeCount}
  </span>
  </div>
- ))}
+ );
+ })}
</div> </div>
</div> </div>
@@ -213,6 +254,7 @@ export function IntelligenceBrands() {
> >
<option value="stores">Sort by Stores</option> <option value="stores">Sort by Stores</option>
<option value="skus">Sort by SKUs</option> <option value="skus">Sort by SKUs</option>
<option value="states">Sort by States</option>
<option value="name">Sort by Name</option> <option value="name">Sort by Name</option>
</select> </select>
<span className="text-sm text-gray-500"> <span className="text-sm text-gray-500">

@@ -2,15 +2,16 @@ import { useEffect, useState } from 'react';
import { useNavigate } from 'react-router-dom'; import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout'; import { Layout } from '../components/Layout';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { useStateFilter } from '../hooks/useStateFilter';
import { import {
DollarSign, DollarSign,
Building2, Building2,
MapPin, MapPin,
Package, Package,
RefreshCw,
TrendingUp, TrendingUp,
TrendingDown, TrendingDown,
BarChart3, BarChart3,
ChevronDown,
} from 'lucide-react'; } from 'lucide-react';
interface CategoryPricing { interface CategoryPricing {
@@ -31,18 +32,27 @@ interface OverallPricing {
export function IntelligencePricing() { export function IntelligencePricing() {
const navigate = useNavigate(); const navigate = useNavigate();
const { selectedState, setSelectedState, stateParam, stateLabel, isAllStates } = useStateFilter();
const [availableStates, setAvailableStates] = useState<string[]>([]);
const [categories, setCategories] = useState<CategoryPricing[]>([]); const [categories, setCategories] = useState<CategoryPricing[]>([]);
const [overall, setOverall] = useState<OverallPricing | null>(null); const [overall, setOverall] = useState<OverallPricing | null>(null);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
useEffect(() => { useEffect(() => {
loadPricing(); loadPricing();
}, [stateParam]);
useEffect(() => {
// Load available states
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []); }, []);
const loadPricing = async () => { const loadPricing = async () => {
try { try {
setLoading(true); setLoading(true);
const data = await api.getIntelligencePricing(); const data = await api.getIntelligencePricing({ state: stateParam });
setCategories(data.byCategory || []); setCategories(data.byCategory || []);
setOverall(data.overall || null); setOverall(data.overall || null);
} catch (error) { } catch (error) {
@@ -76,37 +86,62 @@ export function IntelligencePricing() {
<Layout> <Layout>
<div className="space-y-6"> <div className="space-y-6">
{/* Header */} {/* Header */}
<div className="flex items-center justify-between"> <div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div> <div>
<h1 className="text-2xl font-bold text-gray-900">Pricing Intelligence</h1> <h1 className="text-2xl font-bold text-gray-900">Pricing Intelligence</h1>
<p className="text-sm text-gray-600 mt-1"> <p className="text-sm text-gray-600 mt-1">
Price distribution and trends by category Price distribution and trends by category
</p> </p>
</div> </div>
<div className="flex gap-2"> <div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
<button <button
onClick={() => navigate('/admin/intelligence/brands')} onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm btn-outline gap-1" className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
> >
<Building2 className="w-4 h-4" /> <Building2 className="w-4 h-4" />
Brands <span>Brands</span>
</button> </button>
<button <button
onClick={() => navigate('/admin/intelligence/stores')} onClick={() => navigate('/admin/intelligence/stores')}
className="btn btn-sm btn-outline gap-1" className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
> >
<MapPin className="w-4 h-4" /> <MapPin className="w-4 h-4" />
Stores <span>Stores</span>
</button> </button>
<button <button
onClick={loadPricing} className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
className="btn btn-sm btn-outline gap-2"
> >
<RefreshCw className="w-4 h-4" /> <DollarSign className="w-4 h-4" />
Refresh <span>Pricing</span>
</button> </button>
</div> </div>
</div> </div>
</div>
{/* Overall Stats */} {/* Overall Stats */}
{overall && ( {overall && (
@@ -150,7 +185,7 @@ export function IntelligencePricing() {
<div> <div>
<p className="text-sm text-gray-500">Products Priced</p> <p className="text-sm text-gray-500">Products Priced</p>
<p className="text-2xl font-bold"> <p className="text-2xl font-bold">
{overall.totalProducts.toLocaleString()} {(overall.totalProducts || 0).toLocaleString()}
</p> </p>
</div> </div>
</div> </div>
@@ -164,43 +199,29 @@ export function IntelligencePricing() {
<BarChart3 className="w-5 h-5 text-green-500" /> <BarChart3 className="w-5 h-5 text-green-500" />
Average Price by Category Average Price by Category
</h3> </h3>
<div className="space-y-3"> <div className="space-y-2">
{sortedCategories.map((cat) => ( {sortedCategories.slice(0, 12).map((cat) => {
const maxPrice = Math.max(...sortedCategories.map(c => c.avgPrice || 0), 1);
const barWidth = Math.min(((cat.avgPrice || 0) / maxPrice) * 100, 100);
return (
<div key={cat.category} className="flex items-center gap-3"> <div key={cat.category} className="flex items-center gap-3">
<span className="text-sm font-medium w-32 truncate" title={cat.category}> <span className="text-sm font-medium w-28 truncate shrink-0" title={cat.category}>
{cat.category || 'Unknown'} {cat.category || 'Unknown'}
</span> </span>
<div className="flex-1 relative"> <div className="flex-1 min-w-0">
{/* Price range bar */} <div className="bg-gray-100 rounded h-5 overflow-hidden">
<div className="bg-gray-100 rounded-full h-6 relative">
{/* Min-Max range */}
<div <div
className="absolute top-0 h-6 bg-blue-100 rounded-full" className="bg-gradient-to-r from-emerald-400 to-emerald-500 h-5 rounded transition-all"
style={{ style={{ width: `${barWidth}%` }}
left: `${(cat.minPrice / (overall?.maxPrice || 100)) * 100}%`,
width: `${((cat.maxPrice - cat.minPrice) / (overall?.maxPrice || 100)) * 100}%`,
}}
/>
{/* Average marker */}
<div
className="absolute top-0 h-6 w-1 bg-green-500 rounded"
style={{ left: `${(cat.avgPrice / (overall?.maxPrice || 100)) * 100}%` }}
/> />
</div> </div>
</div> </div>
<div className="flex gap-4 text-xs w-48"> <span className="text-sm font-mono font-semibold text-emerald-600 w-16 text-right shrink-0">
<span className="text-gray-500"> {formatPrice(cat.avgPrice)}
Min: <span className="text-blue-600 font-mono">{formatPrice(cat.minPrice)}</span>
</span>
<span className="text-gray-500">
Avg: <span className="text-green-600 font-mono font-bold">{formatPrice(cat.avgPrice)}</span>
</span>
<span className="text-gray-500">
Max: <span className="text-orange-600 font-mono">{formatPrice(cat.maxPrice)}</span>
</span> </span>
</div> </div>
</div> );
))} })}
</div> </div>
</div> </div>
@@ -236,7 +257,7 @@ export function IntelligencePricing() {
<span className="font-medium">{cat.category || 'Unknown'}</span> <span className="font-medium">{cat.category || 'Unknown'}</span>
</td> </td>
<td className="text-center"> <td className="text-center">
<span className="font-mono">{cat.productCount.toLocaleString()}</span> <span className="font-mono">{(cat.productCount || 0).toLocaleString()}</span>
</td> </td>
<td className="text-right"> <td className="text-right">
<span className="font-mono text-blue-600">{formatPrice(cat.minPrice)}</span> <span className="font-mono text-blue-600">{formatPrice(cat.minPrice)}</span>
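
Review note: the new "Average Price by Category" chart scales each bar against the largest average price and caps the result at 100%. The following is a minimal sketch of that calculation, not the component's actual code. `barWidthPercent`, `chartRows`, and `CategoryPricingLike` are hypothetical names introduced here for illustration. The sketch also computes the maximum once, outside the per-row callback, which is an assumption about how the loop could be arranged rather than what the diff does.

```typescript
// Hypothetical shape: only the fields the chart needs.
interface CategoryPricingLike {
  category: string;
  avgPrice: number | null;
}

// Hypothetical helper: converts a value into a bar width in percent.
// Mirrors the diff's expression: missing values count as 0, result is capped at 100.
function barWidthPercent(value: number | null | undefined, max: number): number {
  return Math.min(((value || 0) / max) * 100, 100);
}

// Builds the rows for the chart. The trailing `1` in Math.max keeps the
// divisor at least 1, so an empty or all-zero list never divides by zero.
function chartRows(categories: CategoryPricingLike[], limit = 12) {
  const maxPrice = Math.max(...categories.map(c => c.avgPrice || 0), 1);
  return categories.slice(0, limit).map(c => ({
    category: c.category || 'Unknown',
    width: barWidthPercent(c.avgPrice, maxPrice),
  }));
}
```

Rows with a missing `avgPrice` render as an empty bar instead of producing `NaN%` in the inline style.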

@@ -8,7 +8,6 @@ import {
Building2, Building2,
DollarSign, DollarSign,
Package, Package,
RefreshCw,
Search, Search,
Clock, Clock,
Activity, Activity,
@@ -34,12 +33,19 @@ export function IntelligenceStores() {
const [stores, setStores] = useState<StoreActivity[]>([]); const [stores, setStores] = useState<StoreActivity[]>([]);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState(''); const [searchTerm, setSearchTerm] = useState('');
const [localStates, setLocalStates] = useState<string[]>([]); const [availableStates, setAvailableStates] = useState<string[]>([]);
useEffect(() => { useEffect(() => {
loadStores(); loadStores();
}, [selectedState]); }, [selectedState]);
useEffect(() => {
// Load available states from orchestrator API
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []);
const loadStores = async () => { const loadStores = async () => {
try { try {
setLoading(true); setLoading(true);
@@ -48,10 +54,6 @@ export function IntelligenceStores() {
limit: 500, limit: 500,
}); });
setStores(data.stores || []); setStores(data.stores || []);
// Extract unique states from response for dropdown counts
const uniqueStates = [...new Set(data.stores.map((s: StoreActivity) => s.state))].sort();
setLocalStates(uniqueStates);
} catch (error) { } catch (error) {
console.error('Failed to load stores:', error); console.error('Failed to load stores:', error);
} finally { } finally {
@@ -97,49 +99,74 @@ export function IntelligenceStores() {
); );
} }
// Calculate stats // Calculate stats with null safety
const totalSKUs = stores.reduce((sum, s) => sum + s.skuCount, 0); const totalSKUs = stores.reduce((sum, s) => sum + (s.skuCount || 0), 0);
const totalSnapshots = stores.reduce((sum, s) => sum + s.snapshotCount, 0); const totalSnapshots = stores.reduce((sum, s) => sum + (s.snapshotCount || 0), 0);
const avgFrequency = stores.filter(s => s.crawlFrequencyHours).length > 0 const storesWithFrequency = stores.filter(s => s.crawlFrequencyHours != null);
? stores.filter(s => s.crawlFrequencyHours).reduce((sum, s) => sum + (s.crawlFrequencyHours || 0), 0) / const avgFrequency = storesWithFrequency.length > 0
stores.filter(s => s.crawlFrequencyHours).length ? storesWithFrequency.reduce((sum, s) => sum + (s.crawlFrequencyHours || 0), 0) / storesWithFrequency.length
: 0; : 0;
return ( return (
<Layout> <Layout>
<div className="space-y-6"> <div className="space-y-6">
{/* Header */} {/* Header */}
<div className="flex items-center justify-between"> <div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div> <div>
<h1 className="text-2xl font-bold text-gray-900">Store Activity</h1> <h1 className="text-2xl font-bold text-gray-900">Store Activity</h1>
<p className="text-sm text-gray-600 mt-1"> <p className="text-sm text-gray-600 mt-1">
Per-store SKU counts, snapshots, and crawl frequency Per-store SKU counts, snapshots, and crawl frequency
</p> </p>
</div> </div>
<div className="flex gap-2"> <div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
<button <button
onClick={() => navigate('/admin/intelligence/brands')} onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm btn-outline gap-1" className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
> >
<Building2 className="w-4 h-4" /> <Building2 className="w-4 h-4" />
Brands <span>Brands</span>
</button>
<button
className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
>
<MapPin className="w-4 h-4" />
<span>Stores</span>
</button> </button>
<button <button
onClick={() => navigate('/admin/intelligence/pricing')} onClick={() => navigate('/admin/intelligence/pricing')}
className="btn btn-sm btn-outline gap-1" className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
> >
<DollarSign className="w-4 h-4" /> <DollarSign className="w-4 h-4" />
Pricing <span>Pricing</span>
</button>
<button
onClick={loadStores}
className="btn btn-sm btn-outline gap-2"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button> </button>
</div> </div>
</div> </div>
</div>
{/* Summary Cards - Responsive: 2→4 columns */} {/* Summary Cards - Responsive: 2→4 columns */}
<div className="grid grid-cols-2 md:grid-cols-4 gap-4"> <div className="grid grid-cols-2 md:grid-cols-4 gap-4">
@@ -193,26 +220,6 @@ export function IntelligenceStores() {
className="input input-bordered input-sm w-full pl-10" className="input input-bordered input-sm w-full pl-10"
/> />
</div> </div>
<div className="dropdown">
<button tabIndex={0} className="btn btn-sm btn-outline gap-2">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-[1] menu p-2 shadow bg-base-100 rounded-box w-40 max-h-60 overflow-y-auto">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active' : ''}>
All States
</a>
</li>
{localStates.map(state => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
<span className="text-sm text-gray-500"> <span className="text-sm text-gray-500">
Showing {filteredStores.length} of {stores.length} stores Showing {filteredStores.length} of {stores.length} stores
</span> </span>
@@ -246,7 +253,7 @@ export function IntelligenceStores() {
<tr <tr
key={store.id} key={store.id}
className="hover:bg-gray-50 cursor-pointer" className="hover:bg-gray-50 cursor-pointer"
onClick={() => navigate(`/admin/orchestrator/stores?storeId=${store.id}`)} onClick={() => navigate(`/stores/list/${store.id}`)}
> >
<td> <td>
<span className="font-medium">{store.name}</span> <span className="font-medium">{store.name}</span>
@@ -262,10 +269,10 @@ export function IntelligenceStores() {
)} )}
</td> </td>
<td className="text-center"> <td className="text-center">
<span className="font-mono">{store.skuCount.toLocaleString()}</span> <span className="font-mono">{(store.skuCount || 0).toLocaleString()}</span>
</td> </td>
<td className="text-center"> <td className="text-center">
<span className="font-mono">{store.snapshotCount.toLocaleString()}</span> <span className="font-mono">{(store.snapshotCount || 0).toLocaleString()}</span>
</td> </td>
<td> <td>
<span className={store.lastCrawl ? 'text-green-600' : 'text-gray-400'}> <span className={store.lastCrawl ? 'text-green-600' : 'text-gray-400'}>
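
Review note: the stats block in this file now guards against missing counts and switches the frequency filter from a truthiness check to `!= null`. A minimal sketch of the resulting behavior follows, assuming a reduced store shape. `StoreActivityLike` and `summarizeStores` are hypothetical names, not part of the diff.

```typescript
// Hypothetical, reduced version of the StoreActivity shape used in the page.
interface StoreActivityLike {
  skuCount?: number | null;
  crawlFrequencyHours?: number | null;
}

// Sums SKU counts and averages crawl frequency with null safety.
function summarizeStores(stores: StoreActivityLike[]) {
  // Missing counts contribute 0 instead of turning the sum into NaN.
  const totalSKUs = stores.reduce((sum, s) => sum + (s.skuCount || 0), 0);

  // `!= null` excludes null and undefined but keeps 0,
  // so a store reporting 0 hours still counts toward the average.
  const withFrequency = stores.filter(s => s.crawlFrequencyHours != null);
  const avgFrequency = withFrequency.length > 0
    ? withFrequency.reduce((sum, s) => sum + (s.crawlFrequencyHours ?? 0), 0) / withFrequency.length
    : 0;

  return { totalSKUs, avgFrequency };
}
```

One consequence worth checking in review: under the old truthiness filter a frequency of `0` was dropped from the average, whereas the new filter includes it.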

File diff suppressed because it is too large

@@ -8,7 +8,6 @@
import { useState, useEffect } from 'react'; import { useState, useEffect } from 'react';
import { useNavigate } from 'react-router-dom'; import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout'; import { Layout } from '../components/Layout';
import { StateBadge } from '../components/StateSelector';
import { useStateStore } from '../store/stateStore'; import { useStateStore } from '../store/stateStore';
import { api } from '../lib/api'; import { api } from '../lib/api';
import { import {
@@ -21,7 +20,6 @@ import {
DollarSign, DollarSign,
MapPin, MapPin,
ArrowRight, ArrowRight,
RefreshCw,
AlertCircle AlertCircle
} from 'lucide-react'; } from 'lucide-react';
@@ -205,7 +203,6 @@ export default function NationalDashboard() {
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null); const [error, setError] = useState<string | null>(null);
const [summary, setSummary] = useState<NationalSummary | null>(null); const [summary, setSummary] = useState<NationalSummary | null>(null);
const [refreshing, setRefreshing] = useState(false);
const fetchData = async () => { const fetchData = async () => {
setLoading(true); setLoading(true);
@@ -230,18 +227,6 @@ export default function NationalDashboard() {
fetchData(); fetchData();
}, []); }, []);
const handleRefreshMetrics = async () => {
setRefreshing(true);
try {
await api.post('/api/admin/states/refresh-metrics');
await fetchData();
} catch (err) {
console.error('Failed to refresh metrics:', err);
} finally {
setRefreshing(false);
}
};
const handleStateClick = (stateCode: string) => { const handleStateClick = (stateCode: string) => {
setSelectedState(stateCode); setSelectedState(stateCode);
navigate(`/national/state/${stateCode}`); navigate(`/national/state/${stateCode}`);
@@ -278,32 +263,19 @@ export default function NationalDashboard() {
<Layout> <Layout>
<div className="space-y-6"> <div className="space-y-6">
{/* Header */} {/* Header */}
<div className="flex items-center justify-between">
<div> <div>
<h1 className="text-2xl font-bold text-gray-900">National Dashboard</h1> <h1 className="text-2xl font-bold text-gray-900">National Dashboard</h1>
<p className="text-gray-500 mt-1"> <p className="text-gray-500 mt-1">
Multi-state cannabis market intelligence Multi-state cannabis market intelligence
</p> </p>
</div> </div>
<div className="flex items-center gap-3">
<StateBadge />
<button
onClick={handleRefreshMetrics}
disabled={refreshing}
className="flex items-center gap-2 px-3 py-2 text-sm text-gray-600 hover:text-gray-900 border border-gray-200 rounded-lg hover:bg-gray-50 disabled:opacity-50"
>
<RefreshCw className={`w-4 h-4 ${refreshing ? 'animate-spin' : ''}`} />
Refresh Metrics
</button>
</div>
</div>
{/* Summary Cards */} {/* Summary Cards */}
{summary && ( {summary && (
<> <>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4"> <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4">
<MetricCard <MetricCard
title="Active States" title="Regions (US + CA)"
value={summary.activeStates} value={summary.activeStates}
icon={Globe} icon={Globe}
/> />

@@ -96,7 +96,8 @@ export function Proxies() {
try { try {
const response = await api.testAllProxies(); const response = await api.testAllProxies();
setNotification({ message: 'Proxy testing job started', type: 'success' }); setNotification({ message: 'Proxy testing job started', type: 'success' });
setActiveJob({ id: response.jobId, status: 'pending', tested_proxies: 0, total_proxies: proxies.length, passed_proxies: 0, failed_proxies: 0 }); // Use response.total if available, otherwise proxies.length, but immediately poll for accurate count
setActiveJob({ id: response.jobId, status: 'pending', tested_proxies: 0, total_proxies: response.total || proxies.length || 0, passed_proxies: 0, failed_proxies: 0 });
} catch (error: any) { } catch (error: any) {
setNotification({ message: 'Failed to start testing: ' + error.message, type: 'error' }); setNotification({ message: 'Failed to start testing: ' + error.message, type: 'error' });
} }
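
Review note: the new `total_proxies` value is chosen through a chain of `||` fallbacks. The sketch below isolates that expression so its edge cases are easy to see. `initialTotal` is a hypothetical helper name, and the parameter names are assumptions standing in for `response.total` and `proxies.length`.

```typescript
// Hypothetical helper mirroring `response.total || proxies.length || 0`.
// Because `||` treats 0 as falsy, a reported total of 0 falls through
// to the locally listed count.
function initialTotal(responseTotal: number | null | undefined, listedCount: number): number {
  return responseTotal || listedCount || 0;
}
```

If a server-reported total of `0` should be trusted as-is, `??` would be the operator to consider instead. Whether that matters depends on the API's behavior, which this diff does not show.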

@@ -153,29 +153,6 @@ export function StoreDetailPage() {
Back to Stores Back to Stores
</button> </button>
{/* Update Button */}
<div className="relative">
<button
onClick={() => setShowUpdateDropdown(!showUpdateDropdown)}
disabled={isUpdating}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium text-white bg-blue-600 hover:bg-blue-700 rounded-lg disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${isUpdating ? 'animate-spin' : ''}`} />
{isUpdating ? 'Crawling...' : 'Crawl Now'}
{!isUpdating && <ChevronDown className="w-4 h-4" />}
</button>
{showUpdateDropdown && !isUpdating && (
<div className="absolute right-0 mt-2 w-48 bg-white rounded-lg shadow-lg border border-gray-200 z-10">
<button
onClick={handleCrawl}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-lg"
>
Start Full Crawl
</button>
</div>
)}
</div>
</div> </div>
{/* Store Header */} {/* Store Header */}
@@ -200,7 +177,7 @@ export function StoreDetailPage() {
<div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg"> <div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg">
<Clock className="w-4 h-4" /> <Clock className="w-4 h-4" />
<div> <div>
<span className="font-medium">Last Crawl:</span> <span className="font-medium">Last Updated:</span>
<span className="ml-2"> <span className="ml-2">
{lastCrawl?.completed_at {lastCrawl?.completed_at
? new Date(lastCrawl.completed_at).toLocaleDateString('en-US', { ? new Date(lastCrawl.completed_at).toLocaleDateString('en-US', {
@@ -212,15 +189,6 @@ export function StoreDetailPage() {
}) })
: 'Never'} : 'Never'}
</span> </span>
{lastCrawl?.status && (
<span className={`ml-2 px-2 py-0.5 rounded text-xs ${
lastCrawl.status === 'completed' ? 'bg-green-100 text-green-800' :
lastCrawl.status === 'failed' ? 'bg-red-100 text-red-800' :
'bg-yellow-100 text-yellow-800'
}`}>
{lastCrawl.status}
</span>
)}
</div> </div>
</div> </div>
</div> </div>
@@ -282,8 +250,8 @@ export function StoreDetailPage() {
setStockFilter('in_stock'); setStockFilter('in_stock');
setSearchQuery(''); setSearchQuery('');
}} }}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${ className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'in_stock' ? 'border-blue-500' : 'border-gray-200' stockFilter === 'in_stock' ? 'border-gray-400' : 'border-gray-200'
}`} }`}
> >
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
@@ -303,8 +271,8 @@ export function StoreDetailPage() {
setStockFilter('out_of_stock'); setStockFilter('out_of_stock');
setSearchQuery(''); setSearchQuery('');
}} }}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${ className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'out_of_stock' ? 'border-blue-500' : 'border-gray-200' stockFilter === 'out_of_stock' ? 'border-gray-400' : 'border-gray-200'
}`} }`}
> >
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
@@ -320,8 +288,8 @@ export function StoreDetailPage() {
<button <button
onClick={() => setActiveTab('brands')} onClick={() => setActiveTab('brands')}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${ className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'brands' ? 'border-blue-500' : 'border-gray-200' activeTab === 'brands' ? 'border-gray-400' : 'border-gray-200'
}`} }`}
> >
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
@@ -337,8 +305,8 @@ export function StoreDetailPage() {
<button <button
onClick={() => setActiveTab('categories')} onClick={() => setActiveTab('categories')}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${ className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'categories' ? 'border-blue-500' : 'border-gray-200' activeTab === 'categories' ? 'border-gray-400' : 'border-gray-200'
}`} }`}
> >
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
@@ -364,7 +332,7 @@ export function StoreDetailPage() {
}} }}
className={`py-4 px-2 text-sm font-medium border-b-2 ${ className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'products' activeTab === 'products'
? 'border-blue-600 text-blue-600' ? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900' : 'border-transparent text-gray-600 hover:text-gray-900'
}`} }`}
> >
@@ -374,7 +342,7 @@ export function StoreDetailPage() {
onClick={() => setActiveTab('brands')} onClick={() => setActiveTab('brands')}
className={`py-4 px-2 text-sm font-medium border-b-2 ${ className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'brands' activeTab === 'brands'
? 'border-blue-600 text-blue-600' ? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900' : 'border-transparent text-gray-600 hover:text-gray-900'
}`} }`}
> >
@@ -384,7 +352,7 @@ export function StoreDetailPage() {
onClick={() => setActiveTab('categories')} onClick={() => setActiveTab('categories')}
className={`py-4 px-2 text-sm font-medium border-b-2 ${ className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'categories' activeTab === 'categories'
? 'border-blue-600 text-blue-600' ? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900' : 'border-transparent text-gray-600 hover:text-gray-900'
}`} }`}
> >
@@ -433,7 +401,7 @@ export function StoreDetailPage() {
{productsLoading ? ( {productsLoading ? (
<div className="text-center py-8"> <div className="text-center py-8">
<div className="inline-block animate-spin rounded-full h-6 w-6 border-4 border-blue-500 border-t-transparent"></div> <div className="inline-block animate-spin rounded-full h-6 w-6 border-4 border-gray-400 border-t-transparent"></div>
<p className="mt-2 text-sm text-gray-600">Loading products...</p> <p className="mt-2 text-sm text-gray-600">Loading products...</p>
</div> </div>
) : products.length === 0 ? ( ) : products.length === 0 ? (
@@ -485,9 +453,9 @@ export function StoreDetailPage() {
<div className="line-clamp-2" title={product.brand || '-'}>{product.brand || '-'}</div> <div className="line-clamp-2" title={product.brand || '-'}>{product.brand || '-'}</div>
</td> </td>
<td className="whitespace-nowrap"> <td className="whitespace-nowrap">
<span className="badge badge-ghost badge-sm">{product.type || '-'}</span> <span className="text-xs text-gray-500 bg-gray-100 px-1.5 py-0.5 rounded">{product.type || '-'}</span>
{product.subcategory && ( {product.subcategory && (
<span className="badge badge-ghost badge-sm ml-1">{product.subcategory}</span> <span className="text-xs text-gray-500 bg-gray-100 px-1.5 py-0.5 rounded ml-1">{product.subcategory}</span>
)} )}
</td> </td>
<td className="text-right font-semibold whitespace-nowrap"> <td className="text-right font-semibold whitespace-nowrap">
@@ -500,21 +468,14 @@ export function StoreDetailPage() {
`$${product.regular_price}` `$${product.regular_price}`
) : '-'} ) : '-'}
</td> </td>
<td className="text-center whitespace-nowrap"> <td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.thc_percentage ? ( {product.thc_percentage ? `${product.thc_percentage}%` : '-'}
<span className="badge badge-success badge-sm">{product.thc_percentage}%</span>
) : '-'}
</td> </td>
<td className="text-center whitespace-nowrap"> <td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.stock_status === 'in_stock' ? ( {product.stock_status === 'in_stock' ? 'In Stock' :
<span className="badge badge-success badge-sm">In Stock</span> product.stock_status === 'out_of_stock' ? 'Out' : '-'}
) : product.stock_status === 'out_of_stock' ? (
<span className="badge badge-error badge-sm">Out</span>
) : (
<span className="badge badge-warning badge-sm">Unknown</span>
)}
</td> </td>
<td className="text-center whitespace-nowrap"> <td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.total_quantity != null ? product.total_quantity : '-'} {product.total_quantity != null ? product.total_quantity : '-'}
</td> </td>
<td className="whitespace-nowrap text-xs text-gray-500"> <td className="whitespace-nowrap text-xs text-gray-500">

@@ -12,10 +12,16 @@ import {
Search, Search,
ChevronDown, ChevronDown,
ChevronUp, ChevronUp,
ChevronLeft,
ChevronRight,
Gauge, Gauge,
Users, Users,
Play,
Square,
Plus,
X,
Calendar, Calendar,
Zap, Trash2,
} from 'lucide-react'; } from 'lucide-react';
interface Task { interface Task {
@@ -65,6 +71,313 @@ interface TaskCounts {
stale: number; stale: number;
} }
interface Store {
id: number;
name: string;
state_code: string;
crawl_enabled: boolean;
}
interface CreateTaskModalProps {
isOpen: boolean;
onClose: () => void;
onTaskCreated: () => void;
}
const TASK_ROLES = [
{ id: 'product_refresh', name: 'Product Resync', description: 'Re-crawl products for price/stock changes' },
{ id: 'product_discovery', name: 'Product Discovery', description: 'Initial crawl for new dispensaries' },
{ id: 'store_discovery', name: 'Store Discovery', description: 'Discover new dispensary locations' },
{ id: 'entry_point_discovery', name: 'Entry Point Discovery', description: 'Resolve platform IDs from menu URLs' },
{ id: 'analytics_refresh', name: 'Analytics Refresh', description: 'Refresh materialized views' },
];
function CreateTaskModal({ isOpen, onClose, onTaskCreated }: CreateTaskModalProps) {
const [role, setRole] = useState('product_refresh');
const [priority, setPriority] = useState(10);
const [scheduleType, setScheduleType] = useState<'now' | 'scheduled'>('now');
const [scheduledFor, setScheduledFor] = useState('');
const [stores, setStores] = useState<Store[]>([]);
const [storeSearch, setStoreSearch] = useState('');
const [selectedStores, setSelectedStores] = useState<Store[]>([]);
const [loading, setLoading] = useState(false);
const [storesLoading, setStoresLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
if (isOpen) {
fetchStores();
}
}, [isOpen]);
const fetchStores = async () => {
setStoresLoading(true);
try {
const res = await api.get('/api/stores?limit=500');
setStores(res.data.stores || res.data || []);
} catch (err) {
console.error('Failed to fetch stores:', err);
} finally {
setStoresLoading(false);
}
};
const filteredStores = stores.filter(s =>
s.name.toLowerCase().includes(storeSearch.toLowerCase()) ||
s.state_code?.toLowerCase().includes(storeSearch.toLowerCase())
);
const toggleStore = (store: Store) => {
if (selectedStores.find(s => s.id === store.id)) {
setSelectedStores(selectedStores.filter(s => s.id !== store.id));
} else {
setSelectedStores([...selectedStores, store]);
}
};
const selectAll = () => setSelectedStores(filteredStores);
const clearAll = () => setSelectedStores([]);
const handleSubmit = async () => {
setLoading(true);
setError(null);
try {
const scheduledDate = scheduleType === 'scheduled' && scheduledFor
? new Date(scheduledFor).toISOString()
: undefined;
if (role === 'store_discovery' || role === 'analytics_refresh') {
await api.post('/api/tasks', {
role,
priority,
scheduled_for: scheduledDate,
platform: 'dutchie',
});
} else if (selectedStores.length === 0) {
setError('Please select at least one store');
setLoading(false);
return;
} else {
for (const store of selectedStores) {
await api.post('/api/tasks', {
role,
dispensary_id: store.id,
priority,
scheduled_for: scheduledDate,
platform: 'dutchie',
});
}
}
onTaskCreated();
onClose();
setSelectedStores([]);
setPriority(10);
setScheduleType('now');
setScheduledFor('');
} catch (err: any) {
setError(err.response?.data?.error || err.message || 'Failed to create task');
} finally {
setLoading(false);
}
};
if (!isOpen) return null;
const needsStore = role !== 'store_discovery' && role !== 'analytics_refresh';
return (
<div className="fixed inset-0 z-50 overflow-y-auto">
<div className="flex min-h-full items-center justify-center p-4">
<div className="fixed inset-0 bg-black/50" onClick={onClose} />
<div className="relative bg-white rounded-xl shadow-xl max-w-2xl w-full max-h-[90vh] overflow-hidden">
<div className="px-6 py-4 border-b border-gray-200 flex items-center justify-between">
<h2 className="text-lg font-semibold text-gray-900">Create New Task</h2>
<button onClick={onClose} className="p-1 hover:bg-gray-100 rounded">
<X className="w-5 h-5 text-gray-500" />
</button>
</div>
<div className="px-6 py-4 space-y-6 overflow-y-auto max-h-[calc(90vh-140px)]">
{error && (
<div className="bg-red-50 border border-red-200 rounded-lg p-3 text-red-700 text-sm">
{error}
</div>
)}
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Task Role</label>
<div className="grid grid-cols-1 gap-2">
{TASK_ROLES.map(r => (
<button
key={r.id}
onClick={() => setRole(r.id)}
className={`flex items-start gap-3 p-3 rounded-lg border text-left transition-colors ${
role === r.id
? 'border-emerald-500 bg-emerald-50'
: 'border-gray-200 hover:border-gray-300'
}`}
>
<div className={`w-4 h-4 rounded-full border-2 mt-0.5 flex-shrink-0 ${
role === r.id ? 'border-emerald-500 bg-emerald-500' : 'border-gray-300'
}`}>
{role === r.id && (
<div className="w-full h-full flex items-center justify-center">
<div className="w-1.5 h-1.5 bg-white rounded-full" />
</div>
)}
</div>
<div>
<p className="font-medium text-gray-900">{r.name}</p>
<p className="text-xs text-gray-500">{r.description}</p>
</div>
</button>
))}
</div>
</div>
{needsStore && (
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Select Stores ({selectedStores.length} selected)
</label>
<div className="border border-gray-200 rounded-lg overflow-hidden">
<div className="p-2 border-b border-gray-200 bg-gray-50">
<div className="relative">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-4 h-4 text-gray-400" />
<input
type="text"
value={storeSearch}
onChange={(e) => setStoreSearch(e.target.value)}
placeholder="Search stores..."
className="w-full pl-9 pr-3 py-2 text-sm border border-gray-200 rounded"
/>
</div>
<div className="flex gap-2 mt-2">
<button onClick={selectAll} className="text-xs text-emerald-600 hover:underline">
Select all ({filteredStores.length})
</button>
<span className="text-gray-300">|</span>
<button onClick={clearAll} className="text-xs text-gray-500 hover:underline">
Clear
</button>
</div>
</div>
<div className="max-h-48 overflow-y-auto">
{storesLoading ? (
<div className="p-4 text-center text-gray-500">
<RefreshCw className="w-5 h-5 animate-spin mx-auto mb-1" />
Loading stores...
</div>
) : filteredStores.length === 0 ? (
<div className="p-4 text-center text-gray-500">No stores found</div>
) : (
filteredStores.map(store => (
<label key={store.id} className="flex items-center gap-3 px-3 py-2 hover:bg-gray-50 cursor-pointer">
<input
type="checkbox"
checked={!!selectedStores.find(s => s.id === store.id)}
onChange={() => toggleStore(store)}
className="w-4 h-4 text-emerald-600 rounded"
/>
<div className="flex-1 min-w-0">
<p className="text-sm text-gray-900 truncate">{store.name}</p>
<p className="text-xs text-gray-500">{store.state_code}</p>
</div>
{!store.crawl_enabled && (
<span className="text-xs text-orange-600 bg-orange-50 px-1.5 py-0.5 rounded">disabled</span>
)}
</label>
))
)}
</div>
</div>
</div>
)}
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Priority: {priority}</label>
<input
type="range"
min="0"
max="100"
value={priority}
onChange={(e) => setPriority(parseInt(e.target.value, 10))}
className="w-full h-2 bg-gray-200 rounded-lg appearance-none cursor-pointer"
/>
<div className="flex justify-between text-xs text-gray-500 mt-1">
<span>0 (Low)</span>
<span>10 (Normal)</span>
<span>50 (High)</span>
<span>100 (Urgent)</span>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Schedule</label>
<div className="flex gap-4">
<label className="flex items-center gap-2 cursor-pointer">
<input
type="radio"
name="schedule"
checked={scheduleType === 'now'}
onChange={() => setScheduleType('now')}
className="w-4 h-4 text-emerald-600"
/>
<span className="text-sm text-gray-700">Run immediately</span>
</label>
<label className="flex items-center gap-2 cursor-pointer">
<input
type="radio"
name="schedule"
checked={scheduleType === 'scheduled'}
onChange={() => setScheduleType('scheduled')}
className="w-4 h-4 text-emerald-600"
/>
<span className="text-sm text-gray-700">Schedule for later</span>
</label>
</div>
{scheduleType === 'scheduled' && (
<div className="mt-3 relative">
<Calendar className="absolute left-3 top-1/2 -translate-y-1/2 w-4 h-4 text-gray-400" />
<input
type="datetime-local"
value={scheduledFor}
onChange={(e) => setScheduledFor(e.target.value)}
className="w-full pl-9 pr-3 py-2 text-sm border border-gray-200 rounded"
/>
</div>
)}
</div>
</div>
<div className="px-6 py-4 border-t border-gray-200 bg-gray-50 flex items-center justify-between">
<div className="text-sm text-gray-500">
{needsStore ? (
selectedStores.length > 0 ? `Will create ${selectedStores.length} task${selectedStores.length > 1 ? 's' : ''}` : 'Select stores to create tasks'
) : 'Will create 1 task'}
</div>
<div className="flex gap-3">
<button onClick={onClose} className="px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-lg">
Cancel
</button>
<button
onClick={handleSubmit}
disabled={loading || (needsStore && selectedStores.length === 0)}
className="px-4 py-2 text-sm bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 disabled:opacity-50 disabled:cursor-not-allowed flex items-center gap-2"
>
{loading && <RefreshCw className="w-4 h-4 animate-spin" />}
Create Task{needsStore && selectedStores.length > 1 ? 's' : ''}
</button>
</div>
</div>
</div>
</div>
</div>
);
}
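Review note on `handleSubmit`: the choice between one role-level task and one task per selected store could be pulled into a pure function, so the rule can be checked without mounting the modal. A minimal sketch, assuming hypothetical names `TaskPayload`, `STORELESS_ROLES`, and `buildTaskPayloads` (none of them are part of this commit), numeric store IDs, and the same payload fields the modal posts to `/api/tasks`:

```typescript
// Hypothetical helper, not part of the commit. Field names follow the
// payload that CreateTaskModal sends to /api/tasks.
interface TaskPayload {
  role: string;
  dispensary_id?: number; // assumed numeric; the Store type is not shown in the diff
  priority: number;
  scheduled_for?: string;
  platform: string;
}

// Roles the modal treats as store-independent.
const STORELESS_ROLES: string[] = ['store_discovery', 'analytics_refresh'];

function buildTaskPayloads(
  role: string,
  storeIds: number[],
  priority: number,
  scheduledFor?: string,
): TaskPayload[] {
  const base: TaskPayload = {
    role,
    priority,
    scheduled_for: scheduledFor,
    platform: 'dutchie',
  };
  // Store-independent roles always produce exactly one task.
  if (STORELESS_ROLES.indexOf(role) !== -1) {
    return [base];
  }
  // Every other role produces one task per selected store.
  // An empty selection yields an empty list, which the caller can reject.
  return storeIds.map((id) => ({ ...base, dispensary_id: id }));
}
```

With this shape the modal would only need to post each returned payload, and an empty result maps to the existing "Please select at least one store" message.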
const ROLES = [ const ROLES = [
'store_discovery', 'store_discovery',
'entry_point_discovery', 'entry_point_discovery',
@@ -82,6 +395,27 @@ const STATUS_COLORS: Record<string, string> = {
stale: 'bg-gray-100 text-gray-800', stale: 'bg-gray-100 text-gray-800',
}; };
const getStatusIcon = (status: string, poolPaused: boolean): React.ReactNode => {
switch (status) {
case 'pending':
return <Clock className="w-4 h-4" />;
case 'claimed':
return <PlayCircle className="w-4 h-4" />;
case 'running':
// Don't spin when pool is paused
return <RefreshCw className={`w-4 h-4 ${!poolPaused ? 'animate-spin' : ''}`} />;
case 'completed':
return <CheckCircle2 className="w-4 h-4" />;
case 'failed':
return <XCircle className="w-4 h-4" />;
case 'stale':
return <AlertTriangle className="w-4 h-4" />;
default:
return null;
}
};
// Static version for summary cards (always shows animation)
const STATUS_ICONS: Record<string, React.ReactNode> = { const STATUS_ICONS: Record<string, React.ReactNode> = {
pending: <Clock className="w-4 h-4" />, pending: <Clock className="w-4 h-4" />,
claimed: <PlayCircle className="w-4 h-4" />, claimed: <PlayCircle className="w-4 h-4" />,
@@ -116,6 +450,13 @@ export default function TasksDashboard() {
const [capacity, setCapacity] = useState<CapacityMetric[]>([]); const [capacity, setCapacity] = useState<CapacityMetric[]>([]);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null); const [error, setError] = useState<string | null>(null);
const [poolPaused, setPoolPaused] = useState(false);
const [poolLoading, setPoolLoading] = useState(false);
const [showCreateModal, setShowCreateModal] = useState(false);
// Pagination
const [page, setPage] = useState(0);
const tasksPerPage = 25;
// Filters // Filters
const [roleFilter, setRoleFilter] = useState<string>(''); const [roleFilter, setRoleFilter] = useState<string>('');
@@ -123,13 +464,10 @@ export default function TasksDashboard() {
const [searchQuery, setSearchQuery] = useState(''); const [searchQuery, setSearchQuery] = useState('');
const [showCapacity, setShowCapacity] = useState(true); const [showCapacity, setShowCapacity] = useState(true);
// Actions
const [actionLoading, setActionLoading] = useState(false);
const [actionMessage, setActionMessage] = useState<string | null>(null);
const fetchData = async () => { const fetchData = async () => {
try { try {
const [tasksRes, countsRes, capacityRes] = await Promise.all([ const [tasksRes, countsRes, capacityRes, poolStatus] = await Promise.all([
api.getTasks({ api.getTasks({
role: roleFilter || undefined, role: roleFilter || undefined,
status: statusFilter || undefined, status: statusFilter || undefined,
@@ -137,11 +475,13 @@ export default function TasksDashboard() {
}), }),
api.getTaskCounts(), api.getTaskCounts(),
api.getTaskCapacity(), api.getTaskCapacity(),
api.getTaskPoolStatus(),
]); ]);
setTasks(tasksRes.tasks || []); setTasks(tasksRes.tasks || []);
setCounts(countsRes); setCounts(countsRes);
setCapacity(capacityRes.metrics || []); setCapacity(capacityRes.metrics || []);
setPoolPaused(poolStatus.paused);
setError(null); setError(null);
} catch (err: any) { } catch (err: any) {
setError(err.message || 'Failed to load tasks'); setError(err.message || 'Failed to load tasks');
@@ -150,40 +490,40 @@ export default function TasksDashboard() {
} }
}; };
const togglePool = async () => {
setPoolLoading(true);
try {
if (poolPaused) {
await api.resumeTaskPool();
setPoolPaused(false);
} else {
await api.pauseTaskPool();
setPoolPaused(true);
}
} catch (err: any) {
setError(err.message || 'Failed to toggle pool');
} finally {
setPoolLoading(false);
}
};
const handleDeleteTask = async (taskId: number) => {
if (!confirm('Delete this task?')) return;
try {
await api.delete(`/api/tasks/${taskId}`);
fetchData();
} catch (err: any) {
console.error('Delete error:', err);
alert(err.response?.data?.error || 'Failed to delete task');
}
};
useEffect(() => { useEffect(() => {
fetchData(); fetchData();
const interval = setInterval(fetchData, 10000); // Refresh every 10 seconds const interval = setInterval(fetchData, 15000); // Auto-refresh every 15 seconds
return () => clearInterval(interval); return () => clearInterval(interval);
}, [roleFilter, statusFilter]); }, [roleFilter, statusFilter]);
const handleGenerateResync = async () => {
setActionLoading(true);
try {
const result = await api.generateResyncTasks();
setActionMessage(`Generated ${result.tasks_created} resync tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const handleRecoverStale = async () => {
setActionLoading(true);
try {
const result = await api.recoverStaleTasks();
setActionMessage(`Recovered ${result.tasks_recovered} stale tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const filteredTasks = tasks.filter((task) => { const filteredTasks = tasks.filter((task) => {
if (searchQuery) { if (searchQuery) {
const query = searchQuery.toLowerCase(); const query = searchQuery.toLowerCase();
@@ -197,6 +537,10 @@ export default function TasksDashboard() {
return true; return true;
}); });
// Pagination
const paginatedTasks = filteredTasks.slice(page * tasksPerPage, (page + 1) * tasksPerPage);
const totalPages = Math.ceil(filteredTasks.length / tasksPerPage);
const totalActive = (counts?.claimed || 0) + (counts?.running || 0); const totalActive = (counts?.claimed || 0) + (counts?.running || 0);
const totalPending = counts?.pending || 0; const totalPending = counts?.pending || 0;
@@ -213,7 +557,8 @@ export default function TasksDashboard() {
return ( return (
<Layout> <Layout>
<div className="space-y-6"> <div className="space-y-6">
{/* Header */} {/* Sticky Header */}
<div className="sticky top-0 z-10 bg-white pb-4 -mx-6 px-6 pt-2 border-b border-gray-200 shadow-sm">
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4"> <div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<div> <div>
<h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2"> <h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2">
@@ -225,50 +570,53 @@ export default function TasksDashboard() {
</p> </p>
</div> </div>
<div className="flex gap-2"> <div className="flex items-center gap-4">
{/* Create Task Button */}
<button <button
onClick={handleGenerateResync} onClick={() => setShowCreateModal(true)}
disabled={actionLoading} className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 transition-colors"
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 disabled:opacity-50"
> >
<Calendar className="w-4 h-4" /> <Plus className="w-4 h-4" />
Generate Resync Create Task
</button> </button>
{/* Pool Toggle */}
<button <button
onClick={handleRecoverStale} onClick={togglePool}
disabled={actionLoading} disabled={poolLoading}
className="flex items-center gap-2 px-4 py-2 bg-gray-600 text-white rounded-lg hover:bg-gray-700 disabled:opacity-50" className={`flex items-center gap-2 px-4 py-2 rounded-lg font-medium transition-colors ${
> poolPaused
<Zap className="w-4 h-4" /> ? 'bg-emerald-100 text-emerald-700 hover:bg-emerald-200'
Recover Stale : 'bg-red-100 text-red-700 hover:bg-red-200'
</button>
<button
onClick={fetchData}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
</div>
</div>
{/* Action Message */}
{actionMessage && (
<div
className={`p-4 rounded-lg ${
actionMessage.startsWith('Error')
? 'bg-red-50 text-red-700'
: 'bg-green-50 text-green-700'
}`} }`}
> >
{actionMessage} {poolPaused ? (
</div> <>
<Play className={`w-5 h-5 ${poolLoading ? 'animate-pulse' : ''}`} />
Resume Pool
</>
) : (
<>
<Square className={`w-5 h-5 ${poolLoading ? 'animate-pulse' : ''}`} />
Pause Pool
</>
)} )}
</button>
<span className="text-sm text-gray-400">Auto-refreshes every 15s</span>
</div>
</div>
</div>
{error && ( {error && (
<div className="p-4 bg-red-50 text-red-700 rounded-lg">{error}</div> <div className="p-4 bg-red-50 text-red-700 rounded-lg">{error}</div>
)} )}
{/* Create Task Modal */}
<CreateTaskModal
isOpen={showCreateModal}
onClose={() => setShowCreateModal(false)}
onTaskCreated={fetchData}
/>
{/* Status Summary Cards */} {/* Status Summary Cards */}
<div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-4"> <div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-4">
{Object.entries(counts || {}).map(([status, count]) => ( {Object.entries(counts || {}).map(([status, count]) => (
@@ -281,7 +629,7 @@ export default function TasksDashboard() {
> >
<div className="flex items-center gap-2 mb-2"> <div className="flex items-center gap-2 mb-2">
<span className={`p-1.5 rounded ${STATUS_COLORS[status]}`}> <span className={`p-1.5 rounded ${STATUS_COLORS[status]}`}>
{STATUS_ICONS[status]} {getStatusIcon(status, poolPaused)}
</span> </span>
<span className="text-sm font-medium text-gray-600 capitalize">{status}</span> <span className="text-sm font-medium text-gray-600 capitalize">{status}</span>
</div> </div>
@@ -471,17 +819,19 @@ export default function TasksDashboard() {
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase"> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Error Error
</th> </th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase w-16">
</th>
</tr> </tr>
</thead> </thead>
<tbody className="divide-y divide-gray-200"> <tbody className="divide-y divide-gray-200">
{filteredTasks.length === 0 ? ( {paginatedTasks.length === 0 ? (
<tr> <tr>
<td colSpan={8} className="px-4 py-8 text-center text-gray-500"> <td colSpan={9} className="px-4 py-8 text-center text-gray-500">
No tasks found No tasks found
</td> </td>
</tr> </tr>
) : ( ) : (
filteredTasks.map((task) => ( paginatedTasks.map((task) => (
<tr key={task.id} className="hover:bg-gray-50"> <tr key={task.id} className="hover:bg-gray-50">
<td className="px-4 py-3 text-sm font-mono text-gray-600">#{task.id}</td> <td className="px-4 py-3 text-sm font-mono text-gray-600">#{task.id}</td>
<td className="px-4 py-3 text-sm text-gray-900"> <td className="px-4 py-3 text-sm text-gray-900">
@@ -496,7 +846,7 @@ export default function TasksDashboard() {
STATUS_COLORS[task.status] STATUS_COLORS[task.status]
}`} }`}
> >
{STATUS_ICONS[task.status]} {getStatusIcon(task.status, poolPaused)}
{task.status} {task.status}
</span> </span>
</td> </td>
@@ -512,12 +862,47 @@ export default function TasksDashboard() {
<td className="px-4 py-3 text-sm text-red-600 max-w-xs truncate"> <td className="px-4 py-3 text-sm text-red-600 max-w-xs truncate">
{task.error_message || '-'} {task.error_message || '-'}
</td> </td>
<td className="px-4 py-3">
{(task.status === 'failed' || task.status === 'completed' || task.status === 'pending') && (
<button
onClick={() => handleDeleteTask(task.id)}
className="p-1 text-gray-400 hover:text-red-500 hover:bg-red-50 rounded transition-colors"
title="Delete task"
>
<Trash2 className="w-4 h-4" />
</button>
)}
</td>
</tr> </tr>
)) ))
)} )}
</tbody> </tbody>
</table> </table>
</div> </div>
{/* Pagination */}
<div className="px-4 py-3 border-t border-gray-200 bg-gray-50 flex items-center justify-between">
<div className="text-sm text-gray-500">
Showing {filteredTasks.length === 0 ? 0 : page * tasksPerPage + 1} - {Math.min((page + 1) * tasksPerPage, filteredTasks.length)} of {filteredTasks.length} tasks
</div>
<div className="flex items-center gap-2">
<button
onClick={() => setPage(p => Math.max(0, p - 1))}
disabled={page === 0}
className="px-3 py-1 text-sm border border-gray-200 rounded hover:bg-gray-100 disabled:opacity-50 disabled:cursor-not-allowed"
>
<ChevronLeft className="w-4 h-4" />
</button>
<span className="text-sm text-gray-600">Page {page + 1} of {totalPages || 1}</span>
<button
onClick={() => setPage(p => p + 1)}
disabled={page >= totalPages - 1}
className="px-3 py-1 text-sm border border-gray-200 rounded hover:bg-gray-100 disabled:opacity-50 disabled:cursor-not-allowed"
>
<ChevronRight className="w-4 h-4" />
</button>
</div>
</div>
</div> </div>
</div> </div>
</Layout> </Layout>
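Review note on the pagination added above: `page` lives in state while `filteredTasks` can shrink whenever a filter, the search box, or the auto-refresh changes the list, so a page index past the last page produces an empty slice even though tasks exist. A minimal sketch, assuming a hypothetical helper `paginate` (not part of this commit), that clamps the requested page before slicing and also returns the numbers needed for a "Showing X - Y of Z" label:

```typescript
// Hypothetical helper, not part of the commit.
interface PageInfo<T> {
  items: T[];         // rows for the clamped page
  page: number;       // zero-based page index after clamping
  totalPages: number; // at least 1, even for an empty list
  from: number;       // 1-based index of the first row shown, 0 if none
  to: number;         // 1-based index of the last row shown, 0 if none
}

function paginate<T>(all: T[], requestedPage: number, perPage: number): PageInfo<T> {
  const totalPages = Math.max(1, Math.ceil(all.length / perPage));
  // Clamp into [0, totalPages - 1] so a stale page index cannot overshoot.
  const page = Math.min(Math.max(0, requestedPage), totalPages - 1);
  const start = page * perPage;
  const items = all.slice(start, start + perPage);
  return {
    items,
    page,
    totalPages,
    from: items.length === 0 ? 0 : start + 1,
    to: start + items.length,
  };
}
```

In the component this would be called with `filteredTasks`, `page`, and `tasksPerPage`. Resetting `page` to 0 when the filters change is another option.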


@@ -18,6 +18,11 @@ import {
Server, Server,
MapPin, MapPin,
Trash2, Trash2,
PowerOff,
Undo2,
Plus,
MemoryStick,
AlertTriangle,
} from 'lucide-react'; } from 'lucide-react';
// Worker from registry // Worker from registry
@@ -36,16 +41,25 @@ interface Worker {
tasks_completed: number; tasks_completed: number;
tasks_failed: number; tasks_failed: number;
current_task_id: number | null; current_task_id: number | null;
current_task_ids?: number[]; // Multiple concurrent tasks
active_task_count?: number;
max_concurrent_tasks?: number;
health_status: string; health_status: string;
seconds_since_heartbeat: number; seconds_since_heartbeat: number;
decommission_requested?: boolean;
decommission_reason?: string;
metadata: { metadata: {
cpu?: number; cpu?: number;
memory?: number; memory?: number;
memoryTotal?: number; memoryTotal?: number;
memory_mb?: number; memory_mb?: number;
memory_total_mb?: number; memory_total_mb?: number;
memory_percent?: number; // NEW: memory as percentage
cpu_user_ms?: number; cpu_user_ms?: number;
cpu_system_ms?: number; cpu_system_ms?: number;
cpu_percent?: number; // NEW: CPU percentage
is_backing_off?: boolean; // NEW: resource backoff state
backoff_reason?: string; // NEW: why backing off
proxy_location?: { proxy_location?: {
city?: string; city?: string;
state?: string; state?: string;
@@ -209,26 +223,257 @@ function HealthBadge({ status, healthStatus }: { status: string; healthStatus: s
); );
} }
// Format CPU time for display
function formatCpuTime(ms: number): string {
if (ms < 1000) return `${ms}ms`;
if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
return `${(ms / 60000).toFixed(1)}m`;
}
// Resource usage badge showing memory%, CPU%, and backoff status
function ResourceBadge({ worker }: { worker: Worker }) {
const memPercent = worker.metadata?.memory_percent;
const cpuPercent = worker.metadata?.cpu_percent;
const isBackingOff = worker.metadata?.is_backing_off;
const backoffReason = worker.metadata?.backoff_reason;
if (isBackingOff) {
return (
<div className="flex items-center gap-1.5" title={backoffReason || 'Backing off due to resource pressure'}>
<AlertTriangle className="w-4 h-4 text-amber-500 animate-pulse" />
<span className="text-xs text-amber-600 font-medium">Backing off</span>
</div>
);
}
// No data yet
if (memPercent === undefined && cpuPercent === undefined) {
return <span className="text-gray-400 text-xs">-</span>;
}
// Color based on usage level
const getColor = (pct: number) => {
if (pct >= 90) return 'text-red-600';
if (pct >= 75) return 'text-amber-600';
if (pct >= 50) return 'text-yellow-600';
return 'text-emerald-600';
};
return (
<div className="flex flex-col gap-0.5 text-xs">
{memPercent !== undefined && (
<div className="flex items-center gap-1" title={`Memory: ${worker.metadata?.memory_mb || 0}MB / ${worker.metadata?.memory_total_mb || 0}MB`}>
<MemoryStick className={`w-3 h-3 ${getColor(memPercent)}`} />
<span className={getColor(memPercent)}>{memPercent}%</span>
</div>
)}
{cpuPercent !== undefined && (
<div className="flex items-center gap-1">
<Cpu className={`w-3 h-3 ${getColor(cpuPercent)}`} />
<span className={getColor(cpuPercent)}>{cpuPercent}%</span>
</div>
)}
</div>
);
}
// Task count badge showing active/max concurrent tasks
function TaskCountBadge({ worker, tasks }: { worker: Worker; tasks: Task[] }) {
const activeCount = worker.active_task_count ?? (worker.current_task_id ? 1 : 0);
const maxCount = worker.max_concurrent_tasks ?? 1;
const taskIds = worker.current_task_ids ?? (worker.current_task_id ? [worker.current_task_id] : []);
if (activeCount === 0) {
return <span className="text-gray-400 text-sm">Idle</span>;
}
// Get task names for tooltip
const taskNames = taskIds.map(id => {
const task = tasks.find(t => t.id === id);
return task ? `#${id}: ${task.role}${task.dispensary_name ? ` (${task.dispensary_name})` : ''}` : `#${id}`;
}).join('\n');
return (
<div className="flex items-center gap-2" title={taskNames}>
<span className="text-sm font-medium text-blue-600">
{activeCount}/{maxCount} tasks
</span>
{taskIds.length === 1 && (
<span className="text-xs text-gray-500">#{taskIds[0]}</span>
)}
</div>
);
}
// Pod visualization - shows pod as hub with worker nodes radiating out
function PodVisualization({
podName,
workers,
isSelected = false,
onSelect
}: {
podName: string;
workers: Worker[];
isSelected?: boolean;
onSelect?: () => void;
}) {
const busyCount = workers.filter(w => w.current_task_id !== null).length;
const allBusy = busyCount === workers.length;
const allIdle = busyCount === 0;
// Aggregate resource stats for the pod
const totalMemoryMb = workers.reduce((sum, w) => sum + (w.metadata?.memory_mb || 0), 0);
const totalCpuUserMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_user_ms || 0), 0);
const totalCpuSystemMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_system_ms || 0), 0);
const totalCompleted = workers.reduce((sum, w) => sum + w.tasks_completed, 0);
const totalFailed = workers.reduce((sum, w) => sum + w.tasks_failed, 0);
// Pod color based on worker status
const podColor = allBusy ? 'bg-blue-500' : allIdle ? 'bg-emerald-500' : 'bg-yellow-500';
const podBorder = allBusy ? 'border-blue-400' : allIdle ? 'border-emerald-400' : 'border-yellow-400';
const podGlow = allBusy ? 'shadow-blue-200' : allIdle ? 'shadow-emerald-200' : 'shadow-yellow-200';
// Selection ring
const selectionRing = isSelected ? 'ring-4 ring-purple-400 ring-offset-2' : '';
// Build pod tooltip
const podTooltip = [
`Pod: ${podName}`,
`Workers: ${busyCount}/${workers.length} busy`,
`Memory: ${totalMemoryMb} MB (RSS)`,
`CPU: ${formatCpuTime(totalCpuUserMs)} user, ${formatCpuTime(totalCpuSystemMs)} system`,
`Tasks: ${totalCompleted} completed, ${totalFailed} failed`,
'Click to select',
].join('\n');
return (
<div className="flex flex-col items-center p-4">
{/* Pod hub */}
<div className="relative">
{/* Center pod circle */}
<div
className={`w-20 h-20 rounded-full ${podColor} border-4 ${podBorder} shadow-lg ${podGlow} ${selectionRing} flex items-center justify-center text-white font-bold text-xs text-center leading-tight z-10 relative cursor-pointer hover:scale-105 transition-all`}
title={podTooltip}
onClick={onSelect}
>
<span className="px-1">{podName}</span>
</div>
{/* Worker nodes radiating out */}
{workers.map((worker, index) => {
const angle = (index * 360) / workers.length - 90; // Start from top
const radians = (angle * Math.PI) / 180;
const radius = 55; // Distance from center
const x = Math.cos(radians) * radius;
const y = Math.sin(radians) * radius;
const isBusy = worker.current_task_id !== null;
const isDecommissioning = worker.decommission_requested;
const workerColor = isDecommissioning ? 'bg-orange-500' : isBusy ? 'bg-blue-500' : 'bg-emerald-500';
const workerBorder = isDecommissioning ? 'border-orange-300' : isBusy ? 'border-blue-300' : 'border-emerald-300';
// Line from center to worker
const lineLength = radius - 10;
const lineX = Math.cos(radians) * (lineLength / 2 + 10);
const lineY = Math.sin(radians) * (lineLength / 2 + 10);
return (
<div key={worker.id}>
{/* Connection line */}
<div
className={`absolute w-0.5 ${isDecommissioning ? 'bg-orange-300' : isBusy ? 'bg-blue-300' : 'bg-emerald-300'}`}
style={{
height: `${lineLength}px`,
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${lineX}px, ${lineY}px) rotate(${angle + 90}deg)`,
transformOrigin: 'center',
}}
/>
{/* Worker node */}
<div
className={`absolute w-6 h-6 rounded-full ${workerColor} border-2 ${workerBorder} flex items-center justify-center text-white text-xs font-bold cursor-pointer hover:scale-110 transition-transform`}
style={{
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${x}px, ${y}px)`,
}}
title={`${worker.friendly_name}\nStatus: ${isDecommissioning ? 'Stopping after current task' : isBusy ? `Working on task #${worker.current_task_id}` : 'Idle - waiting for tasks'}\nMemory: ${worker.metadata?.memory_mb || 0} MB\nCPU: ${formatCpuTime(worker.metadata?.cpu_user_ms || 0)} user, ${formatCpuTime(worker.metadata?.cpu_system_ms || 0)} sys\nCompleted: ${worker.tasks_completed} | Failed: ${worker.tasks_failed}\nLast heartbeat: ${new Date(worker.last_heartbeat_at).toLocaleTimeString()}`}
>
{index + 1}
</div>
</div>
);
})}
</div>
{/* Pod stats */}
<div className="mt-12 text-center">
<p className="text-xs text-gray-500">
{busyCount}/{workers.length} busy
</p>
{isSelected && (
<p className="text-xs text-purple-600 font-medium mt-1">Selected</p>
)}
</div>
</div>
);
}
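Review note on the layout math in `PodVisualization`: each worker node is placed on a circle around the pod hub, starting at the top and spacing nodes evenly. A minimal sketch of that calculation as a standalone function, assuming a hypothetical name `workerOffset` (the commit keeps the math inline):

```typescript
// Hypothetical helper mirroring the inline math in PodVisualization.
// Returns the pixel offset of worker `index` from the hub centre.
function workerOffset(
  index: number,
  count: number,
  radius: number,
): { x: number; y: number } {
  // Spread nodes evenly over 360 degrees; subtracting 90 puts index 0 at the top,
  // because screen y grows downward and sin(-90deg) is -1.
  const angleDeg = (index * 360) / count - 90;
  const radians = (angleDeg * Math.PI) / 180;
  return {
    x: Math.cos(radians) * radius,
    y: Math.sin(radians) * radius,
  };
}
```

With four workers and the radius of 55 used in the component, the offsets land roughly at top, right, bottom, and left of the hub. The x value at the top is a tiny floating-point number rather than exactly zero.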
// Group workers by pod
function groupWorkersByPod(workers: Worker[]): Map<string, Worker[]> {
const pods = new Map<string, Worker[]>();
for (const worker of workers) {
const podName = worker.pod_name || 'Unknown';
if (!pods.has(podName)) {
pods.set(podName, []);
}
pods.get(podName)!.push(worker);
}
return pods;
}
// Format estimated time remaining
function formatEstimatedTime(hours: number): string {
if (hours < 1) {
return `${Math.round(hours * 60)} minutes`;
}
if (hours < 24) {
return `${hours.toFixed(1)} hours`;
}
const days = hours / 24;
if (days < 7) {
return `${days.toFixed(1)} days`;
}
return `${(days / 7).toFixed(1)} weeks`;
}
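Review note on the queue estimate that feeds `formatEstimatedTime`: the estimate card later in this file divides the pending count by completed-plus-failed tasks per hour of summed worker uptime. Dividing by summed uptime gives a rate per worker-hour, so the resulting figure is in worker-hours. The diff does not say whether worker-hours or wall-clock time is intended. A minimal sketch that returns both, assuming hypothetical names `WorkerStats` and `estimateQueueHours`, and assuming `totalCompleted` and `totalFailed` (defined outside this hunk) are sums over the active workers:

```typescript
// Hypothetical types and helper, not part of the commit.
interface WorkerStats {
  startedAtMs: number; // worker start time as epoch milliseconds
  completed: number;
  failed: number;
}

interface QueueEstimate {
  workerHours: number;    // pending tasks divided by the per-worker-hour rate
  wallClockHours: number; // workerHours spread across all workers in parallel
}

function estimateQueueHours(
  workers: WorkerStats[],
  pending: number,
  nowMs: number,
): QueueEstimate | null {
  const msPerHour = 3600000;
  const totalHoursUp = workers.reduce(
    (sum, w) => sum + (nowMs - w.startedAtMs) / msPerHour,
    0,
  );
  const totalDone = workers.reduce((sum, w) => sum + w.completed + w.failed, 0);
  // Same guard as the card: too little uptime or no history means no estimate.
  if (workers.length === 0 || totalHoursUp <= 0.1 || totalDone === 0) {
    return null;
  }
  const tasksPerWorkerHour = totalDone / totalHoursUp;
  const workerHours = pending / tasksPerWorkerHour;
  return { workerHours, wallClockHours: workerHours / workers.length };
}
```

The wall-clock figure assumes every worker keeps processing at the average rate, which is a simplification.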
export function WorkersDashboard() { export function WorkersDashboard() {
const [workers, setWorkers] = useState<Worker[]>([]); const [workers, setWorkers] = useState<Worker[]>([]);
const [tasks, setTasks] = useState<Task[]>([]); const [tasks, setTasks] = useState<Task[]>([]);
const [pendingTaskCount, setPendingTaskCount] = useState<number>(0);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null); const [error, setError] = useState<string | null>(null);
// Pod selection state
const [selectedPod, setSelectedPod] = useState<string | null>(null);
// Pagination // Pagination
const [page, setPage] = useState(0); const [page, setPage] = useState(0);
const workersPerPage = 15; const workersPerPage = 15;
const fetchData = useCallback(async () => { const fetchData = useCallback(async () => {
try { try {
// Fetch workers from registry // Fetch workers from registry, running tasks, and task counts
const workersRes = await api.get('/api/worker-registry/workers'); const [workersRes, tasksRes, countsRes] = await Promise.all([
api.get('/api/worker-registry/workers'),
// Fetch running tasks to get current task details api.get('/api/tasks?status=running&limit=100'),
const tasksRes = await api.get('/api/tasks?status=running&limit=100'); api.get('/api/tasks/counts'),
]);
setWorkers(workersRes.data.workers || []); setWorkers(workersRes.data.workers || []);
setTasks(tasksRes.data.tasks || []); setTasks(tasksRes.data.tasks || []);
setPendingTaskCount(countsRes.data?.pending || 0);
setError(null); setError(null);
} catch (err: any) { } catch (err: any) {
console.error('Fetch error:', err); console.error('Fetch error:', err);
@@ -238,16 +483,6 @@ export function WorkersDashboard() {
} }
}, []); }, []);
// Cleanup stale workers
const handleCleanupStale = async () => {
try {
await api.post('/api/worker-registry/cleanup', { stale_threshold_minutes: 2 });
fetchData();
} catch (err: any) {
console.error('Cleanup error:', err);
}
};
// Remove a single worker // Remove a single worker
const handleRemoveWorker = async (workerId: string) => { const handleRemoveWorker = async (workerId: string) => {
if (!confirm('Remove this worker from the registry?')) return; if (!confirm('Remove this worker from the registry?')) return;
@@ -259,6 +494,46 @@ export function WorkersDashboard() {
} }
}; };
// Decommission a worker (graceful shutdown after current task)
const handleDecommissionWorker = async (workerId: string, friendlyName: string) => {
if (!confirm(`Decommission ${friendlyName}? Worker will stop after completing its current task.`)) return;
try {
const res = await api.post(`/api/worker-registry/workers/${workerId}/decommission`, {
reason: 'Manual decommission from admin UI'
});
if (res.data.success) {
fetchData();
}
} catch (err: any) {
console.error('Decommission error:', err);
alert(err.response?.data?.error || 'Failed to decommission worker');
}
};
// Cancel decommission
const handleCancelDecommission = async (workerId: string) => {
try {
await api.post(`/api/worker-registry/workers/${workerId}/cancel-decommission`);
fetchData();
} catch (err: any) {
console.error('Cancel decommission error:', err);
}
};
// Add a worker by scaling up the K8s deployment
const handleAddWorker = async () => {
try {
const res = await api.post('/api/workers/k8s/scale-up');
if (res.data.success) {
// Refresh after a short delay to see the new worker
setTimeout(fetchData, 2000);
}
} catch (err: any) {
console.error('Add worker error:', err);
alert(err.response?.data?.error || 'Failed to add worker. K8s scaling may not be available.');
}
};
useEffect(() => { useEffect(() => {
fetchData(); fetchData();
const interval = setInterval(fetchData, 5000); const interval = setInterval(fetchData, 5000);
@@ -303,25 +578,9 @@ export function WorkersDashboard() {
<h1 className="text-2xl font-bold text-gray-900">Workers</h1> <h1 className="text-2xl font-bold text-gray-900">Workers</h1>
<p className="text-gray-500 mt-1"> <p className="text-gray-500 mt-1">
{workers.length} registered workers ({busyWorkers.length} busy, {idleWorkers.length} idle) {workers.length} registered workers ({busyWorkers.length} busy, {idleWorkers.length} idle)
<span className="text-xs text-gray-400 ml-2">(auto-refresh 5s)</span>
</p> </p>
</div> </div>
<div className="flex items-center gap-2">
<button
onClick={handleCleanupStale}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 transition-colors"
title="Mark stale workers (no heartbeat > 2 min) as offline"
>
<Trash2 className="w-4 h-4" />
Cleanup Stale
</button>
<button
onClick={() => fetchData()}
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 transition-colors"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
</div>
</div> </div>
{error && ( {error && (
@@ -389,6 +648,197 @@ export function WorkersDashboard() {
</div> </div>
</div> </div>
{/* Estimated Completion Time Card */}
{pendingTaskCount > 0 && activeWorkers.length > 0 && (() => {
// Calculate average task rate across all workers
const totalHoursUp = activeWorkers.reduce((sum, w) => {
if (!w.started_at) return sum;
const start = new Date(w.started_at);
const now = new Date();
return sum + (now.getTime() - start.getTime()) / (1000 * 60 * 60);
}, 0);
const totalTasksDone = totalCompleted + totalFailed;
const avgTasksPerHour = totalHoursUp > 0.1 ? totalTasksDone / totalHoursUp : 0;
const estimatedHours = avgTasksPerHour > 0 ? pendingTaskCount / avgTasksPerHour : null;
return (
<div className="bg-gradient-to-r from-amber-50 to-orange-50 rounded-lg border border-amber-200 p-4">
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-amber-100 rounded-lg flex items-center justify-center">
<Clock className="w-5 h-5 text-amber-600" />
</div>
<div>
<p className="text-sm text-amber-700 font-medium">Estimated Time to Complete Queue</p>
<p className="text-2xl font-bold text-amber-900">
{estimatedHours !== null ? formatEstimatedTime(estimatedHours) : 'Calculating...'}
</p>
</div>
</div>
<div className="text-right text-sm text-amber-700">
<p><span className="font-semibold">{pendingTaskCount}</span> pending tasks</p>
<p><span className="font-semibold">{activeWorkers.length}</span> active workers</p>
{avgTasksPerHour > 0 && (
<p className="text-xs text-amber-600 mt-1">
~{avgTasksPerHour.toFixed(1)} tasks/hour
</p>
)}
</div>
</div>
</div>
);
})()}
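Review note: the estimate in the card above can be isolated as a pure function, which makes the arithmetic easier to check. The sketch below mirrors the inline calculation; the helper name `estimateQueueHours` and the `WorkerUptime` shape are hypothetical and not part of this diff. Because uptime is summed across workers, the resulting rate is tasks per worker-hour, so the estimate behaves like the time for a single worker to drain the queue. Whether the card intends that is not stated in the diff.

```typescript
// Hypothetical helper mirroring the inline calculation in the card above.
// Names and types are assumptions for illustration only.
interface WorkerUptime {
  started_at: string | null; // ISO timestamp, as read by the dashboard
}

function estimateQueueHours(
  workers: WorkerUptime[],
  tasksDone: number, // completed + failed, as in the hunk
  pendingTasks: number,
  now: Date = new Date(),
): number | null {
  // Sum uptime over all workers, in hours (worker-hours).
  const totalHoursUp = workers.reduce((sum, w) => {
    if (!w.started_at) return sum;
    const startMs = new Date(w.started_at).getTime();
    return sum + (now.getTime() - startMs) / (1000 * 60 * 60);
  }, 0);

  // Same guard as the hunk: ignore rates based on under ~6 minutes of uptime.
  const tasksPerWorkerHour = totalHoursUp > 0.1 ? tasksDone / totalHoursUp : 0;

  // null corresponds to the 'Calculating...' state in the card.
  return tasksPerWorkerHour > 0 ? pendingTasks / tasksPerWorkerHour : null;
}
```

Passing `now` as a parameter keeps the function deterministic, so it can be tested without mocking the clock.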
{/* Worker Pods Visualization */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50">
<div className="flex items-center justify-between">
<div>
<h3 className="text-sm font-semibold text-gray-900 flex items-center gap-2">
<Zap className="w-4 h-4 text-emerald-500" />
Worker Pods ({Array.from(groupWorkersByPod(workers)).length} pods, {activeWorkers.length} workers)
</h3>
<p className="text-xs text-gray-500 mt-0.5">
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-emerald-500"></span> idle</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-blue-500"></span> busy</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-yellow-500"></span> mixed</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-orange-500"></span> stopping</span>
</p>
</div>
<div className="text-sm text-gray-500">
{busyWorkers.length} busy, {activeWorkers.length - busyWorkers.length} idle
{selectedPod && (
<button
onClick={() => setSelectedPod(null)}
className="ml-3 text-xs text-purple-600 hover:text-purple-800 underline"
>
Clear selection
</button>
)}
</div>
</div>
</div>
{workers.length === 0 ? (
<div className="px-4 py-12 text-center text-gray-500">
<Users className="w-12 h-12 mx-auto mb-3 text-gray-300" />
<p className="font-medium">No worker pods running</p>
<p className="text-xs mt-1">Start pods to process tasks from the queue</p>
</div>
) : (
<div className="p-6">
<div className="flex flex-wrap justify-center gap-8">
{Array.from(groupWorkersByPod(workers)).map(([podName, podWorkers]) => (
<PodVisualization
key={podName}
podName={podName}
workers={podWorkers}
isSelected={selectedPod === podName}
onSelect={() => setSelectedPod(selectedPod === podName ? null : podName)}
/>
))}
</div>
{/* Selected Pod Control Panel */}
{selectedPod && (() => {
const podWorkers = groupWorkersByPod(workers).get(selectedPod) || [];
const busyInPod = podWorkers.filter(w => w.current_task_id !== null).length;
const idleInPod = podWorkers.filter(w => w.current_task_id === null && !w.decommission_requested).length;
const stoppingInPod = podWorkers.filter(w => w.decommission_requested).length;
return (
<div className="mt-6 border-t border-gray-200 pt-6">
<div className="bg-purple-50 rounded-lg border border-purple-200 p-4">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-purple-100 rounded-lg flex items-center justify-center">
<Server className="w-5 h-5 text-purple-600" />
</div>
<div>
<h4 className="font-semibold text-purple-900">{selectedPod}</h4>
<p className="text-xs text-purple-600">
{podWorkers.length} workers: {busyInPod} busy, {idleInPod} idle{stoppingInPod > 0 && `, ${stoppingInPod} stopping`}
</p>
</div>
</div>
</div>
{/* Worker list in selected pod */}
<div className="space-y-2">
{podWorkers.map((worker) => {
const isBusy = worker.current_task_id !== null;
const isDecommissioning = worker.decommission_requested;
return (
<div key={worker.id} className="flex items-center justify-between bg-white rounded-lg px-3 py-2 border border-purple-100">
<div className="flex items-center gap-3">
<div className={`w-8 h-8 rounded-full flex items-center justify-center text-white text-sm font-bold ${
isDecommissioning ? 'bg-orange-500' :
isBusy ? 'bg-blue-500' : 'bg-emerald-500'
}`}>
{worker.friendly_name?.charAt(0) || '?'}
</div>
<div>
<p className="text-sm font-medium text-gray-900">{worker.friendly_name}</p>
<p className="text-xs text-gray-500">
{isDecommissioning ? (
<span className="text-orange-600">Stopping after current task...</span>
) : isBusy ? (
<span className="text-blue-600">Working on task #{worker.current_task_id}</span>
) : (
<span className="text-emerald-600">Idle - ready for tasks</span>
)}
</p>
</div>
</div>
<div className="flex items-center gap-2">
{isDecommissioning ? (
<button
onClick={() => handleCancelDecommission(worker.worker_id)}
className="flex items-center gap-1.5 px-3 py-1.5 text-sm bg-white border border-gray-300 text-gray-700 rounded-lg hover:bg-gray-50 transition-colors"
title="Cancel decommission"
>
<Undo2 className="w-4 h-4" />
Cancel
</button>
) : (
<button
onClick={() => handleDecommissionWorker(worker.worker_id, worker.friendly_name)}
className="flex items-center gap-1.5 px-3 py-1.5 text-sm bg-orange-100 text-orange-700 rounded-lg hover:bg-orange-200 transition-colors"
title={isBusy ? 'Worker will stop after completing current task' : 'Remove idle worker'}
>
<PowerOff className="w-4 h-4" />
{isBusy ? 'Stop after task' : 'Remove'}
</button>
)}
</div>
</div>
);
})}
</div>
{/* Add Worker button */}
<div className="mt-4 pt-4 border-t border-purple-200">
<button
onClick={handleAddWorker}
className="flex items-center gap-1.5 px-3 py-2 text-sm bg-emerald-100 text-emerald-700 rounded-lg hover:bg-emerald-200 transition-colors"
>
<Plus className="w-4 h-4" />
Add Worker
</button>
</div>
</div>
</div>
);
})()}
</div>
)}
</div>
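Review note: in the selected-pod panel, each worker row applies a precedence of stopping, then busy, then idle, while the summary counts filter independently. As written, a worker that is busy and has `decommission_requested` set appears in both the busy and stopping counts. A minimal sketch of counting with the same precedence as the rows follows; `classifyWorker`, `countByState`, and the `PodWorker` field types are assumptions for illustration, not code from this diff.

```typescript
// Hypothetical sketch: classify each worker exactly once, using the same
// precedence as the row rendering (stopping > busy > idle).
interface PodWorker {
  current_task_id: number | null; // assumed type; the diff only compares with null
  decommission_requested?: boolean;
}

type WorkerState = 'stopping' | 'busy' | 'idle';

function classifyWorker(w: PodWorker): WorkerState {
  if (w.decommission_requested) return 'stopping';
  return w.current_task_id !== null ? 'busy' : 'idle';
}

function countByState(workers: PodWorker[]): Record<WorkerState, number> {
  const counts: Record<WorkerState, number> = { stopping: 0, busy: 0, idle: 0 };
  for (const w of workers) {
    counts[classifyWorker(w)] += 1;
  }
  return counts;
}
```

With this partition the three counts always sum to the number of workers in the pod.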
{/* Workers Table */} {/* Workers Table */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden"> <div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50 flex items-center justify-between"> <div className="px-4 py-3 border-b border-gray-200 bg-gray-50 flex items-center justify-between">
@@ -431,10 +881,10 @@ export function WorkersDashboard() {
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Worker</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Worker</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Role</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Role</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Status</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Status</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Exit Location</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Resources</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Current Task</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Tasks</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Duration</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Duration</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Utilization</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Throughput</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Heartbeat</th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Heartbeat</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase"></th> <th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase"></th>
</tr> </tr>
@@ -449,16 +899,29 @@ export function WorkersDashboard() {
<tr key={worker.id} className="hover:bg-gray-50"> <tr key={worker.id} className="hover:bg-gray-50">
<td className="px-4 py-3"> <td className="px-4 py-3">
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
<div className={`w-10 h-10 rounded-full flex items-center justify-center text-white font-bold text-sm ${ <div className={`w-10 h-10 rounded-full flex items-center justify-center text-white font-bold text-sm relative ${
worker.decommission_requested ? 'bg-orange-500' :
worker.health_status === 'offline' ? 'bg-gray-400' : worker.health_status === 'offline' ? 'bg-gray-400' :
worker.health_status === 'stale' ? 'bg-yellow-500' : worker.health_status === 'stale' ? 'bg-yellow-500' :
worker.health_status === 'busy' ? 'bg-blue-500' : worker.health_status === 'busy' ? 'bg-blue-500' :
'bg-emerald-500' 'bg-emerald-500'
}`}> }`}>
{worker.friendly_name?.charAt(0) || '?'} {worker.friendly_name?.charAt(0) || '?'}
{worker.decommission_requested && (
<div className="absolute -top-1 -right-1 w-4 h-4 bg-red-500 rounded-full flex items-center justify-center">
<PowerOff className="w-2.5 h-2.5 text-white" />
</div>
)}
</div> </div>
<div> <div>
<p className="font-medium text-gray-900">{worker.friendly_name}</p> <p className="font-medium text-gray-900 flex items-center gap-1.5">
{worker.friendly_name}
{worker.decommission_requested && (
<span className="text-xs text-orange-600 bg-orange-100 px-1.5 py-0.5 rounded" title={worker.decommission_reason || 'Pending decommission'}>
stopping
</span>
)}
</p>
<p className="text-xs text-gray-400 font-mono">{worker.worker_id.slice(0, 20)}...</p> <p className="text-xs text-gray-400 font-mono">{worker.worker_id.slice(0, 20)}...</p>
</div> </div>
</div> </div>
@@ -470,45 +933,10 @@ export function WorkersDashboard() {
<HealthBadge status={worker.status} healthStatus={worker.health_status} /> <HealthBadge status={worker.status} healthStatus={worker.health_status} />
</td> </td>
<td className="px-4 py-3"> <td className="px-4 py-3">
{(() => { <ResourceBadge worker={worker} />
const loc = worker.metadata?.proxy_location;
if (!loc) {
return <span className="text-gray-400 text-sm">-</span>;
}
const parts = [loc.city, loc.state, loc.country].filter(Boolean);
if (parts.length === 0) {
return loc.isRotating ? (
<span className="text-xs text-purple-600 font-medium" title="Rotating proxy - exit location varies per request">
Rotating
</span>
) : (
<span className="text-gray-400 text-sm">Unknown</span>
);
}
return (
<div className="flex items-center gap-1.5" title={loc.timezone || ''}>
<MapPin className="w-3 h-3 text-gray-400" />
<span className="text-sm text-gray-700">
{parts.join(', ')}
</span>
{loc.isRotating && (
<span className="text-xs text-purple-500" title="Rotating proxy">*</span>
)}
</div>
);
})()}
</td> </td>
<td className="px-4 py-3"> <td className="px-4 py-3">
{worker.current_task_id ? ( <TaskCountBadge worker={worker} tasks={tasks} />
<div>
<span className="text-sm text-gray-900">Task #{worker.current_task_id}</span>
{currentTask?.dispensary_name && (
<p className="text-xs text-gray-500">{currentTask.dispensary_name}</p>
)}
</div>
) : (
<span className="text-gray-400 text-sm">Idle</span>
)}
</td> </td>
<td className="px-4 py-3"> <td className="px-4 py-3">
{currentTask?.started_at ? ( {currentTask?.started_at ? (


@@ -7,7 +7,7 @@
import { useState, useEffect } from 'react'; import { useState, useEffect } from 'react';
import { api } from '../../../lib/api'; import { api } from '../../../lib/api';
import { Building2, Tag, Globe, Target, FileText, RefreshCw, Sparkles, Loader2 } from 'lucide-react'; import { Building2, Tag, Globe, Target, FileText, RefreshCw, Sparkles, Loader2, AlertCircle } from 'lucide-react';
interface SeoPage { interface SeoPage {
id: number; id: number;
@@ -47,11 +47,31 @@ export function PagesTab() {
const [search, setSearch] = useState(''); const [search, setSearch] = useState('');
const [syncing, setSyncing] = useState(false); const [syncing, setSyncing] = useState(false);
const [generatingId, setGeneratingId] = useState<number | null>(null); const [generatingId, setGeneratingId] = useState<number | null>(null);
const [hasActiveAiProvider, setHasActiveAiProvider] = useState<boolean | null>(null);
useEffect(() => { useEffect(() => {
loadPages(); loadPages();
checkAiProvider();
}, [typeFilter, search]); }, [typeFilter, search]);
async function checkAiProvider() {
try {
const data = await api.getSettings();
const settings = data.settings || [];
// Check if either Anthropic or OpenAI is configured with an API key AND enabled
const anthropicKey = settings.find((s: any) => s.key === 'anthropic_api_key')?.value;
const anthropicEnabled = settings.find((s: any) => s.key === 'anthropic_enabled')?.value === 'true';
const openaiKey = settings.find((s: any) => s.key === 'openai_api_key')?.value;
const openaiEnabled = settings.find((s: any) => s.key === 'openai_enabled')?.value === 'true';
const hasProvider = (anthropicKey && anthropicEnabled) || (openaiKey && openaiEnabled);
setHasActiveAiProvider(!!hasProvider);
} catch (error) {
console.error('Failed to check AI provider:', error);
setHasActiveAiProvider(false);
}
}
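Review note: the provider check above can be expressed as a pure function over the settings list, separate from the API call and state update. The sketch below assumes the setting keys shown in the diff (`anthropic_api_key`, `anthropic_enabled`, `openai_api_key`, `openai_enabled`) and that each key appears at most once; the `Setting` interface and the name `detectActiveAiProvider` are hypothetical.

```typescript
// Hypothetical pure version of the provider check; not part of this diff.
interface Setting {
  key: string;
  value?: string | null;
}

function detectActiveAiProvider(settings: Setting[]): boolean {
  // Index settings by key (assumes keys are unique).
  const byKey = new Map<string, string | null | undefined>();
  for (const s of settings) {
    byKey.set(s.key, s.value);
  }

  // A provider counts as active only if it has a non-empty API key
  // AND its enabled flag is the string 'true'.
  const isActive = (provider: string): boolean =>
    Boolean(byKey.get(`${provider}_api_key`)) &&
    byKey.get(`${provider}_enabled`) === 'true';

  return isActive('anthropic') || isActive('openai');
}
```

Keeping the rule in one place also makes it straightforward to add another provider prefix later without repeating the lookup logic.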
async function loadPages() { async function loadPages() {
setLoading(true); setLoading(true);
try { try {
@@ -188,12 +208,18 @@ export function PagesTab() {
<td className="px-3 sm:px-4 py-3"> <td className="px-3 sm:px-4 py-3">
<button <button
onClick={() => handleGenerate(page.id)} onClick={() => handleGenerate(page.id)}
disabled={generatingId === page.id} disabled={generatingId === page.id || hasActiveAiProvider === false}
className="flex items-center gap-1 px-2 sm:px-3 py-1.5 text-xs font-medium bg-purple-50 text-purple-700 rounded-lg hover:bg-purple-100 disabled:opacity-50" className={`flex items-center gap-1 px-2 sm:px-3 py-1.5 text-xs font-medium rounded-lg disabled:cursor-not-allowed ${
title="Generate content" hasActiveAiProvider === false
? 'bg-gray-100 text-gray-400'
: 'bg-purple-50 text-purple-700 hover:bg-purple-100 disabled:opacity-50'
}`}
title={hasActiveAiProvider === false ? 'No Active AI Provider' : 'Generate content'}
> >
{generatingId === page.id ? ( {generatingId === page.id ? (
<Loader2 className="w-3.5 h-3.5 animate-spin" /> <Loader2 className="w-3.5 h-3.5 animate-spin" />
) : hasActiveAiProvider === false ? (
<AlertCircle className="w-3.5 h-3.5" />
) : ( ) : (
<Sparkles className="w-3.5 h-3.5" /> <Sparkles className="w-3.5 h-3.5" />
)} )}


@@ -7,16 +7,6 @@
"src": "favicon.ico", "src": "favicon.ico",
"sizes": "64x64 32x32 24x24 16x16", "sizes": "64x64 32x32 24x24 16x16",
"type": "image/x-icon" "type": "image/x-icon"
},
{
"src": "logo192.png",
"type": "image/png",
"sizes": "192x192"
},
{
"src": "logo512.png",
"type": "image/png",
"sizes": "512x512"
} }
], ],
"start_url": ".", "start_url": ".",


@@ -373,10 +373,12 @@ export function mapCategoryForUI(apiCategory) {
* Map API brand to UI-compatible format * Map API brand to UI-compatible format
*/ */
export function mapBrandForUI(apiBrand) { export function mapBrandForUI(apiBrand) {
// API returns 'brand' field (see /api/v1/brands endpoint)
const brandName = apiBrand.brand || apiBrand.brand_name || '';
return { return {
id: apiBrand.brand_name, id: brandName,
name: apiBrand.brand_name, name: brandName,
slug: apiBrand.brand_name?.toLowerCase().replace(/\s+/g, '-'), slug: brandName ? brandName.toLowerCase().replace(/\s+/g, '-') : '',
logo: apiBrand.brand_logo_url || null, logo: apiBrand.brand_logo_url || null,
productCount: parseInt(apiBrand.product_count || 0, 10), productCount: parseInt(apiBrand.product_count || 0, 10),
dispensaryCount: parseInt(apiBrand.dispensary_count || 0, 10), dispensaryCount: parseInt(apiBrand.dispensary_count || 0, 10),
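Review note: the name fallback and slug rule in this hunk can be exercised in isolation. The sketch below reproduces only that part of `mapBrandForUI`; the `ApiBrand` interface and the helper name `brandSlug` are hypothetical. The rule lowercases the name and collapses whitespace runs into hyphens. It does not strip punctuation, so names containing characters such as `&` would keep them in the slug.

```typescript
// Hypothetical extraction of the name fallback and slug rule from the hunk.
interface ApiBrand {
  brand?: string;      // field returned by /api/v1/brands, per the diff comment
  brand_name?: string; // older field name, kept as a fallback
}

function brandSlug(apiBrand: ApiBrand): string {
  // Prefer 'brand', fall back to 'brand_name', then to an empty string.
  const name = apiBrand.brand || apiBrand.brand_name || '';
  // Lowercase and replace each run of whitespace with a single hyphen.
  return name ? name.toLowerCase().replace(/\s+/g, '-') : '';
}
```

Returning an empty string instead of `undefined` means downstream code such as a name filter can call string methods without a null check.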


@@ -27,7 +27,7 @@ const Brands = () => {
}, []); }, []);
const filteredBrands = brands.filter((brand) => const filteredBrands = brands.filter((brand) =>
brand.name.toLowerCase().includes(searchQuery.toLowerCase()) brand.name && brand.name.toLowerCase().includes(searchQuery.toLowerCase())
); );
// Group brands alphabetically // Group brands alphabetically

36
k8s/scraper-rbac.yaml Normal file

@@ -0,0 +1,36 @@
# RBAC configuration for scraper pod to control worker scaling
# Allows the scraper to read and scale the scraper-worker Deployment or StatefulSet
apiVersion: v1
kind: ServiceAccount
metadata:
name: scraper-sa
namespace: dispensary-scraper
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: worker-scaler
namespace: dispensary-scraper
rules:
# Allow reading deployment and statefulset status
- apiGroups: ["apps"]
resources: ["deployments", "statefulsets"]
verbs: ["get", "list"]
# Allow scaling deployments and statefulsets
- apiGroups: ["apps"]
resources: ["deployments/scale", "statefulsets/scale"]
verbs: ["get", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: scraper-worker-scaler
namespace: dispensary-scraper
subjects:
- kind: ServiceAccount
name: scraper-sa
namespace: dispensary-scraper
roleRef:
kind: Role
name: worker-scaler
apiGroup: rbac.authorization.k8s.io


@@ -1,4 +1,67 @@
# Task Worker Pods # Task Worker Deployment
#
# Simple Deployment that runs task-worker.js to process tasks from worker_tasks queue.
# Workers pull tasks using DB-level locking (FOR UPDATE SKIP LOCKED).
#
# The worker will wait up to 60 minutes for active proxies to be added before failing.
# This allows deployment to succeed even if proxies aren't configured yet.
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: scraper-worker
namespace: dispensary-scraper
spec:
replicas: 25
selector:
matchLabels:
app: scraper-worker
template:
metadata:
labels:
app: scraper-worker
spec:
imagePullSecrets:
- name: regcred
containers:
- name: worker
image: code.cannabrands.app/creationshop/dispensary-scraper:latest
command: ["node"]
args: ["dist/tasks/task-worker.js"]
envFrom:
- configMapRef:
name: scraper-config
- secretRef:
name: scraper-secrets
env:
- name: WORKER_MODE
value: "true"
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "pgrep -f 'task-worker' > /dev/null"
initialDelaySeconds: 60
periodSeconds: 30
failureThreshold: 3
terminationGracePeriodSeconds: 60
---
# =============================================================================
# ALTERNATIVE: StatefulSet with multiple workers per pod (not currently used)
# =============================================================================
# Task Worker Pods (StatefulSet)
# Each pod runs 5 role-agnostic workers that pull tasks from worker_tasks queue. # Each pod runs 5 role-agnostic workers that pull tasks from worker_tasks queue.
# #
# Architecture: # Architecture:


@@ -25,6 +25,7 @@ spec:
labels: labels:
app: scraper app: scraper
spec: spec:
serviceAccountName: scraper-sa
imagePullSecrets: imagePullSecrets:
- name: regcred - name: regcred
containers: containers:


@@ -1 +1 @@
1.5.4 1.6.0


@@ -312,3 +312,184 @@
border-radius: 4px; border-radius: 4px;
border-left: 4px solid #c62828; border-left: 4px solid #c62828;
} }
/* ========================================
Brand Grid Widget
======================================== */
.cannaiq-brand-grid {
display: grid;
gap: 20px;
margin: 20px 0;
}
.cannaiq-brand-card {
background: #fff;
border-radius: 8px;
padding: 20px;
text-align: center;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
transition: transform 0.2s, box-shadow 0.2s;
}
.cannaiq-brand-card:hover {
transform: translateY(-4px);
box-shadow: 0 4px 16px rgba(0, 0, 0, 0.15);
}
.cannaiq-brand-name {
font-size: 16px;
font-weight: 600;
margin: 0 0 8px 0;
color: #333;
}
.cannaiq-brand-count {
font-size: 13px;
color: #666;
}
/* ========================================
Category List Widget
======================================== */
.cannaiq-category-grid {
display: grid;
gap: 16px;
margin: 20px 0;
}
.cannaiq-category-list {
display: flex;
flex-direction: column;
gap: 8px;
margin: 20px 0;
}
.cannaiq-category-pills {
display: flex;
flex-wrap: wrap;
gap: 10px;
margin: 20px 0;
}
.cannaiq-category-item {
display: flex;
align-items: center;
justify-content: space-between;
padding: 12px 16px;
background: #fff;
border-radius: 8px;
text-decoration: none;
color: #333;
box-shadow: 0 1px 4px rgba(0, 0, 0, 0.08);
transition: background 0.2s, transform 0.2s;
}
.cannaiq-category-item:hover {
background: #f3f4f6;
transform: translateX(4px);
}
.cannaiq-category-pills-item {
display: inline-flex;
align-items: center;
gap: 6px;
padding: 8px 16px;
background: #f3f4f6;
border-radius: 20px;
text-decoration: none;
color: #333;
font-size: 14px;
transition: background 0.2s;
}
.cannaiq-category-pills-item:hover {
background: #e5e7eb;
}
.cannaiq-category-name {
font-weight: 500;
}
.cannaiq-category-count {
font-size: 13px;
color: #666;
}
/* ========================================
Specials/Deals Grid Widget
======================================== */
.cannaiq-specials-grid {
display: grid;
gap: 24px;
margin: 20px 0;
}
.cannaiq-special-card {
background: #fff;
border-radius: 8px;
overflow: hidden;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
transition: transform 0.2s, box-shadow 0.2s;
position: relative;
}
.cannaiq-special-card:hover {
transform: translateY(-4px);
box-shadow: 0 4px 16px rgba(0, 0, 0, 0.15);
}
.cannaiq-discount-badge {
position: absolute;
top: 12px;
right: 12px;
background: #ef4444;
color: #fff;
font-size: 13px;
font-weight: 700;
padding: 4px 10px;
border-radius: 4px;
z-index: 1;
}
.cannaiq-special-image {
width: 100%;
aspect-ratio: 1;
overflow: hidden;
background: #f5f5f5;
}
.cannaiq-special-image img {
width: 100%;
height: 100%;
object-fit: cover;
}
.cannaiq-special-content {
padding: 16px;
}
.cannaiq-special-title {
font-size: 16px;
font-weight: 600;
margin: 0 0 8px 0;
color: #333;
line-height: 1.4;
}
.cannaiq-special-price {
display: flex;
align-items: center;
gap: 8px;
margin-top: 12px;
}
.cannaiq-special-price .cannaiq-price-sale {
font-size: 20px;
font-weight: 700;
color: #16a34a;
}
.cannaiq-special-price .cannaiq-price-regular {
font-size: 14px;
color: #999;
}


@@ -3,7 +3,7 @@
* Plugin Name: CannaIQ Menus * Plugin Name: CannaIQ Menus
* Plugin URI: https://cannaiq.co * Plugin URI: https://cannaiq.co
* Description: Display cannabis product menus from CannaIQ with Elementor integration. Real-time menu data updated daily. * Description: Display cannabis product menus from CannaIQ with Elementor integration. Real-time menu data updated daily.
* Version: 1.5.4 * Version: 1.6.0
* Author: CannaIQ * Author: CannaIQ
* Author URI: https://cannaiq.co * Author URI: https://cannaiq.co
* License: GPL v2 or later * License: GPL v2 or later
@@ -15,7 +15,7 @@ if (!defined('ABSPATH')) {
exit; // Exit if accessed directly exit; // Exit if accessed directly
} }
define('CANNAIQ_MENUS_VERSION', '1.5.4'); define('CANNAIQ_MENUS_VERSION', '1.6.0');
define('CANNAIQ_MENUS_API_URL', 'https://cannaiq.co/api/v1'); define('CANNAIQ_MENUS_API_URL', 'https://cannaiq.co/api/v1');
define('CANNAIQ_MENUS_PLUGIN_DIR', plugin_dir_path(__FILE__)); define('CANNAIQ_MENUS_PLUGIN_DIR', plugin_dir_path(__FILE__));
define('CANNAIQ_MENUS_PLUGIN_URL', plugin_dir_url(__FILE__)); define('CANNAIQ_MENUS_PLUGIN_URL', plugin_dir_url(__FILE__));
@@ -46,14 +46,17 @@ class CannaIQ_Menus_Plugin {
// Initialize plugin // Initialize plugin
load_plugin_textdomain('cannaiq-menus', false, dirname(plugin_basename(__FILE__)) . '/languages'); load_plugin_textdomain('cannaiq-menus', false, dirname(plugin_basename(__FILE__)) . '/languages');
// Register shortcodes // Register shortcodes - primary CannaIQ shortcodes
add_shortcode('cannaiq_products', [$this, 'products_shortcode']); add_shortcode('cannaiq_products', [$this, 'products_shortcode']);
add_shortcode('cannaiq_product', [$this, 'single_product_shortcode']); add_shortcode('cannaiq_product', [$this, 'single_product_shortcode']);
// Legacy shortcode support (backward compatibility)
add_shortcode('crawlsy_products', [$this, 'products_shortcode']); // DEPRECATED: Legacy shortcode aliases for backward compatibility only
add_shortcode('crawlsy_product', [$this, 'single_product_shortcode']); // These allow sites that used the old plugin names to continue working
add_shortcode('dutchie_products', [$this, 'products_shortcode']); // New implementations should use [cannaiq_products] and [cannaiq_product]
add_shortcode('dutchie_product', [$this, 'single_product_shortcode']); add_shortcode('crawlsy_products', [$this, 'products_shortcode']); // deprecated
add_shortcode('crawlsy_product', [$this, 'single_product_shortcode']); // deprecated
add_shortcode('dutchie_products', [$this, 'products_shortcode']); // deprecated
add_shortcode('dutchie_product', [$this, 'single_product_shortcode']); // deprecated
} }
/** /**
@@ -62,9 +65,15 @@ class CannaIQ_Menus_Plugin {
public function register_elementor_widgets($widgets_manager) { public function register_elementor_widgets($widgets_manager) {
require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/product-grid.php'; require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/product-grid.php';
require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/single-product.php'; require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/single-product.php';
require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/brand-grid.php';
require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/category-list.php';
require_once CANNAIQ_MENUS_PLUGIN_DIR . 'widgets/specials-grid.php';
$widgets_manager->register(new \CannaIQ_Menus_Product_Grid_Widget()); $widgets_manager->register(new \CannaIQ_Menus_Product_Grid_Widget());
$widgets_manager->register(new \CannaIQ_Menus_Single_Product_Widget()); $widgets_manager->register(new \CannaIQ_Menus_Single_Product_Widget());
$widgets_manager->register(new \CannaIQ_Menus_Brand_Grid_Widget());
$widgets_manager->register(new \CannaIQ_Menus_Category_List_Widget());
$widgets_manager->register(new \CannaIQ_Menus_Specials_Grid_Widget());
} }
/** /**
@@ -108,7 +117,9 @@ class CannaIQ_Menus_Plugin {
public function register_settings() { public function register_settings() {
register_setting('cannaiq_menus_settings', 'cannaiq_api_token'); register_setting('cannaiq_menus_settings', 'cannaiq_api_token');
// Migrate old settings if they exist // MIGRATION: Auto-migrate API tokens from old plugin versions
// This runs once - if user had crawlsy or dutchie plugin, their token is preserved
// Can be removed in a future major version once all users have migrated
$old_crawlsy_token = get_option('crawlsy_api_token'); $old_crawlsy_token = get_option('crawlsy_api_token');
$old_dutchie_token = get_option('dutchie_api_token'); $old_dutchie_token = get_option('dutchie_api_token');
@@ -392,6 +403,152 @@ class CannaIQ_Menus_Plugin {
return $data['product'] ?? false; return $data['product'] ?? false;
} }
/**
* Fetch Categories from API
*/
public function fetch_categories($args = []) {
$api_token = get_option('cannaiq_api_token');
if (!$api_token) {
return false;
}
$query_args = http_build_query($args);
$url = CANNAIQ_MENUS_API_URL . '/categories' . ($query_args ? '?' . $query_args : '');
$response = wp_remote_get($url, [
'headers' => [
'X-API-Key' => $api_token
],
'timeout' => 30
]);
if (is_wp_error($response)) {
return false;
}
$body = wp_remote_retrieve_body($response);
$data = json_decode($body, true);
return $data['categories'] ?? false;
}
/**
* Fetch Brands from API
*/
public function fetch_brands($args = []) {
$api_token = get_option('cannaiq_api_token');
if (!$api_token) {
return false;
}
$query_args = http_build_query($args);
$url = CANNAIQ_MENUS_API_URL . '/brands' . ($query_args ? '?' . $query_args : '');
$response = wp_remote_get($url, [
'headers' => [
'X-API-Key' => $api_token
],
'timeout' => 30
]);
if (is_wp_error($response)) {
return false;
}
$body = wp_remote_retrieve_body($response);
$data = json_decode($body, true);
return $data['brands'] ?? false;
}
/**
* Fetch Specials/Deals from API
*/
public function fetch_specials($args = []) {
$api_token = get_option('cannaiq_api_token');
if (!$api_token) {
return false;
}
$query_args = http_build_query($args);
$url = CANNAIQ_MENUS_API_URL . '/specials' . ($query_args ? '?' . $query_args : '');
$response = wp_remote_get($url, [
'headers' => [
'X-API-Key' => $api_token
],
'timeout' => 30
]);
if (is_wp_error($response)) {
return false;
}
$body = wp_remote_retrieve_body($response);
$data = json_decode($body, true);
return $data['products'] ?? false;
}
/**
* Get categories as options for Elementor select control
* Returns cached results for performance
*/
public function get_category_options() {
$cache_key = 'cannaiq_category_options';
$cached = get_transient($cache_key);
if ($cached !== false) {
return $cached;
}
$categories = $this->fetch_categories();
$options = ['' => __('All Categories', 'cannaiq-menus')];
if ($categories) {
foreach ($categories as $cat) {
$name = $cat['type'] ?? $cat['name'] ?? '';
if ($name) {
$options[$name] = ucwords(str_replace('_', ' ', $name));
}
}
}
set_transient($cache_key, $options, 5 * MINUTE_IN_SECONDS);
return $options;
}
/**
* Get brands as options for Elementor select control
* Returns cached results for performance
*/
public function get_brand_options() {
$cache_key = 'cannaiq_brand_options';
$cached = get_transient($cache_key);
if ($cached !== false) {
return $cached;
}
$brands = $this->fetch_brands(['limit' => 200]);
$options = ['' => __('All Brands', 'cannaiq-menus')];
if ($brands) {
foreach ($brands as $brand) {
$name = $brand['brand'] ?? $brand['brand_name'] ?? '';
if ($name) {
$options[$name] = $name;
}
}
}
set_transient($cache_key, $options, 5 * MINUTE_IN_SECONDS);
return $options;
}
} }
// Initialize Plugin // Initialize Plugin


@@ -0,0 +1,184 @@
<?php
/**
* Elementor Brand Grid Widget
*/
if (!defined('ABSPATH')) {
exit;
}
class CannaIQ_Menus_Brand_Grid_Widget extends \Elementor\Widget_Base {
public function get_name() {
return 'cannaiq_brand_grid';
}
public function get_title() {
return __('CannaIQ Brand Grid', 'cannaiq-menus');
}
public function get_icon() {
return 'eicon-gallery-grid';
}
public function get_categories() {
return ['general'];
}
protected function register_controls() {
// Content Section
$this->start_controls_section(
'content_section',
[
'label' => __('Content', 'cannaiq-menus'),
'tab' => \Elementor\Controls_Manager::TAB_CONTENT,
]
);
$this->add_control(
'limit',
[
'label' => __('Number of Brands', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::NUMBER,
'default' => 12,
'min' => 1,
'max' => 100,
]
);
$this->add_control(
'columns',
[
'label' => __('Columns', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SELECT,
'default' => '4',
'options' => [
'2' => __('2 Columns', 'cannaiq-menus'),
'3' => __('3 Columns', 'cannaiq-menus'),
'4' => __('4 Columns', 'cannaiq-menus'),
'6' => __('6 Columns', 'cannaiq-menus'),
],
]
);
$this->add_control(
'show_product_count',
[
'label' => __('Show Product Count', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SWITCHER,
'label_on' => __('Yes', 'cannaiq-menus'),
'label_off' => __('No', 'cannaiq-menus'),
'return_value' => 'yes',
'default' => 'yes',
]
);
$this->add_control(
'link_to_products',
[
'label' => __('Link to Products Page', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::URL,
'placeholder' => __('/products', 'cannaiq-menus'),
'description' => __('Brand name will be appended as ?brand=Name', 'cannaiq-menus'),
]
);
$this->end_controls_section();
// Style Section
$this->start_controls_section(
'style_section',
[
'label' => __('Style', 'cannaiq-menus'),
'tab' => \Elementor\Controls_Manager::TAB_STYLE,
]
);
$this->add_control(
'card_background',
[
'label' => __('Card Background', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::COLOR,
'default' => '#ffffff',
'selectors' => [
'{{WRAPPER}} .cannaiq-brand-card' => 'background-color: {{VALUE}};',
],
]
);
$this->add_control(
'card_border_radius',
[
'label' => __('Border Radius', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SLIDER,
'size_units' => ['px'],
'range' => [
'px' => [
'min' => 0,
'max' => 50,
],
],
'default' => [
'size' => 8,
],
'selectors' => [
'{{WRAPPER}} .cannaiq-brand-card' => 'border-radius: {{SIZE}}{{UNIT}};',
],
]
);
$this->add_control(
'text_color',
[
'label' => __('Text Color', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::COLOR,
'default' => '#333333',
'selectors' => [
'{{WRAPPER}} .cannaiq-brand-card' => 'color: {{VALUE}};',
],
]
);
$this->end_controls_section();
}
protected function render() {
$settings = $this->get_settings_for_display();
$plugin = CannaIQ_Menus_Plugin::instance();
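// Fall back to the control's default of 12 if the limit field was cleared.
$limit = absint($settings['limit'] ?? 0) ?: 12;
$brands = $plugin->fetch_brands(['limit' => $limit]);
if (!$brands) {
echo '<p>' . esc_html__('No brands found.', 'cannaiq-menus') . '</p>';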
return;
}
$columns = $settings['columns'];
$link_base = $settings['link_to_products']['url'] ?? '';
?>
<div class="cannaiq-brand-grid cannaiq-grid-cols-<?php echo esc_attr($columns); ?>">
<?php foreach ($brands as $brand):
$brand_name = $brand['brand'] ?? $brand['brand_name'] ?? '';
$product_count = $brand['product_count'] ?? 0;
// add_query_arg() preserves any query string already present in the base URL.
$brand_url = $link_base ? add_query_arg('brand', urlencode($brand_name), $link_base) : '#';
?>
<div class="cannaiq-brand-card"
<?php if ($brand_url !== '#'): ?>onclick="window.location.href='<?php echo esc_js(esc_url_raw($brand_url)); ?>'"<?php endif; ?>
style="cursor: <?php echo ($brand_url !== '#') ? 'pointer' : 'default'; ?>;">
<div class="cannaiq-brand-content">
<h3 class="cannaiq-brand-name">
<?php echo esc_html($brand_name); ?>
</h3>
<?php if ($settings['show_product_count'] === 'yes' && $product_count > 0): ?>
<span class="cannaiq-brand-count">
<?php echo esc_html(sprintf(_n('%d product', '%d products', (int) $product_count, 'cannaiq-menus'), $product_count)); ?>
</span>
<?php endif; ?>
</div>
</div>
<?php endforeach; ?>
</div>
<?php
}
}


@@ -0,0 +1,205 @@
<?php
/**
* Elementor Category List Widget
*/
if (!defined('ABSPATH')) {
exit;
}
class CannaIQ_Menus_Category_List_Widget extends \Elementor\Widget_Base {
public function get_name() {
return 'cannaiq_category_list';
}
public function get_title() {
return __('CannaIQ Category List', 'cannaiq-menus');
}
public function get_icon() {
return 'eicon-bullet-list';
}
public function get_categories() {
return ['general'];
}
protected function register_controls() {
// Content Section
$this->start_controls_section(
'content_section',
[
'label' => __('Content', 'cannaiq-menus'),
'tab' => \Elementor\Controls_Manager::TAB_CONTENT,
]
);
$this->add_control(
'layout',
[
'label' => __('Layout', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SELECT,
'default' => 'grid',
'options' => [
'grid' => __('Grid', 'cannaiq-menus'),
'list' => __('List', 'cannaiq-menus'),
'pills' => __('Pills/Tags', 'cannaiq-menus'),
],
]
);
$this->add_control(
'columns',
[
'label' => __('Columns', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SELECT,
'default' => '3',
'options' => [
'2' => __('2 Columns', 'cannaiq-menus'),
'3' => __('3 Columns', 'cannaiq-menus'),
'4' => __('4 Columns', 'cannaiq-menus'),
'6' => __('6 Columns', 'cannaiq-menus'),
],
'condition' => [
'layout' => 'grid',
],
]
);
$this->add_control(
'show_product_count',
[
'label' => __('Show Product Count', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SWITCHER,
'label_on' => __('Yes', 'cannaiq-menus'),
'label_off' => __('No', 'cannaiq-menus'),
'return_value' => 'yes',
'default' => 'yes',
]
);
$this->add_control(
'link_to_products',
[
'label' => __('Link to Products Page', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::URL,
'placeholder' => __('/products', 'cannaiq-menus'),
'description' => __('Category name will be appended as ?category=Name', 'cannaiq-menus'),
]
);
$this->end_controls_section();
// Style Section
$this->start_controls_section(
'style_section',
[
'label' => __('Style', 'cannaiq-menus'),
'tab' => \Elementor\Controls_Manager::TAB_STYLE,
]
);
$this->add_control(
'card_background',
[
'label' => __('Card Background', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::COLOR,
'default' => '#ffffff',
'selectors' => [
'{{WRAPPER}} .cannaiq-category-item' => 'background-color: {{VALUE}};',
],
]
);
$this->add_control(
'card_border_radius',
[
'label' => __('Border Radius', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::SLIDER,
'size_units' => ['px'],
'range' => [
'px' => [
'min' => 0,
'max' => 50,
],
],
'default' => [
'size' => 8,
],
'selectors' => [
'{{WRAPPER}} .cannaiq-category-item' => 'border-radius: {{SIZE}}{{UNIT}};',
],
]
);
$this->add_control(
'text_color',
[
'label' => __('Text Color', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::COLOR,
'default' => '#333333',
'selectors' => [
'{{WRAPPER}} .cannaiq-category-item' => 'color: {{VALUE}};',
],
]
);
$this->add_control(
'hover_background',
[
'label' => __('Hover Background', 'cannaiq-menus'),
'type' => \Elementor\Controls_Manager::COLOR,
'default' => '#f3f4f6',
'selectors' => [
'{{WRAPPER}} .cannaiq-category-item:hover' => 'background-color: {{VALUE}};',
],
]
);
$this->end_controls_section();
}
protected function render() {
$settings = $this->get_settings_for_display();
$plugin = CannaIQ_Menus_Plugin::instance();
$categories = $plugin->fetch_categories();
if (!$categories) {
echo '<p>' . esc_html__('No categories found.', 'cannaiq-menus') . '</p>';
return;
}
$layout = $settings['layout'];
$columns = $settings['columns'];
$link_base = $settings['link_to_products']['url'] ?? '';
$container_class = 'cannaiq-category-' . $layout;
if ($layout === 'grid') {
$container_class .= ' cannaiq-grid-cols-' . $columns;
}
?>
<div class="<?php echo esc_attr($container_class); ?>">
<?php foreach ($categories as $category):
$cat_name = $category['type'] ?? $category['name'] ?? '';
$display_name = ucwords(str_replace('_', ' ', $cat_name));
$product_count = $category['product_count'] ?? 0;
// add_query_arg() preserves any query string already present in the base URL.
$cat_url = $link_base ? add_query_arg('category', urlencode($cat_name), $link_base) : '#';
?>
<a href="<?php echo esc_url($cat_url); ?>" class="cannaiq-category-item cannaiq-category-<?php echo esc_attr($layout); ?>-item">
<span class="cannaiq-category-name">
<?php echo esc_html($display_name); ?>
</span>
<?php if ($settings['show_product_count'] === 'yes' && $product_count > 0): ?>
<span class="cannaiq-category-count">
(<?php echo esc_html($product_count); ?>)
</span>
<?php endif; ?>
</a>
<?php endforeach; ?>
</div>
<?php
}
}

Some files were not shown because too many files have changed in this diff