Compare commits

..

48 Commits

Author SHA1 Message Date
Kelly
90567511dd ci: Remove explicit migration step from deploy
Auto-migrate runs at server startup and handles migration errors gracefully.
The explicit kubectl exec migration step was failing due to trigger
already existing (schema_migrations table out of sync).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 15:56:47 -07:00
Kelly
fc7fc5ea85 ci: Run migrations via kubectl exec instead of separate step
Removes the migrate step that required db_* secrets (which CI can't
access since postgres is cluster-internal). Instead, run migrations
via kubectl exec on the deployed scraper pod, which already has DB
access via its env vars.

Deploy order:
1. Deploy scraper image
2. Wait for rollout
3. Run migrations via kubectl exec
4. Deploy remaining services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 15:16:12 -07:00
kelly
ab8956b14b Merge pull request 'fix: Revert CI event array syntax to single value' (#39) from fix/ci-event-syntax into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/39
2025-12-11 22:10:03 +00:00
Kelly
1d9c90641f fix: Revert event array syntax to single value
The [push, manual] array syntax broke CI config parsing.
Reverting to event: push which is known to work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 14:48:34 -07:00
kelly
6126b907f2 Merge pull request 'ci: Support manual pipeline events for deploy' (#38) from ci/support-manual-pipelines into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/38
2025-12-11 21:12:26 +00:00
Kelly
cc93d2d483 ci: Support manual pipeline events for deploy
Allow deploy steps to run on both push and manual events.
This enables triggering deploys via `woodpecker-cli pipeline create`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 13:54:04 -07:00
kelly
7642c17ec0 Merge pull request 'ci: Fix pipeline config path' (#37) from fix/ci-config-path into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/37
2025-12-11 20:01:30 +00:00
Kelly
cb60dcf352 ci: Add root-level woodpecker config for CI detection 2025-12-11 12:37:35 -07:00
kelly
5ffe05d519 Merge pull request 'feat: Concurrent task processing with resource-based backoff' (#36) from feat/worker-scaling into master 2025-12-11 18:49:01 +00:00
Kelly
8e2f07c941 feat(workers): Concurrent task processing with resource-based backoff
Workers can now process multiple tasks concurrently (default: 3 max).
Self-regulate based on resource usage - back off at 85% memory or 90% CPU.

Backend changes:
- TaskWorker handles concurrent tasks using async Maps
- Resource monitoring (memory %, CPU %) with backoff logic
- Heartbeat reports active_task_count, max_concurrent_tasks, resource stats
- Decommission support via worker_commands table

Frontend changes:
- Workers Dashboard shows tasks per worker (N/M format)
- Resource badges with color-coded thresholds
- Pod visualization with clickable selection
- Decommission controls per worker

New env vars:
- MAX_CONCURRENT_TASKS (default: 3)
- MEMORY_BACKOFF_THRESHOLD (default: 0.85)
- CPU_BACKOFF_THRESHOLD (default: 0.90)
- BACKOFF_DURATION_MS (default: 10000)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 11:47:24 -07:00
kelly
0b6e615075 Merge pull request 'fix: Use React Router Link to prevent scroll reset' (#35) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/35
2025-12-11 17:14:25 +00:00
Kelly
be251c6fb3 fix: Use React Router Link for nav to prevent scroll reset
Changed sidebar NavLink from <a> to <Link> for client-side navigation.
This prevents full page reload and scroll position reset.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:59:34 -07:00
kelly
efb1e89e33 Merge pull request 'fix: Show only git SHA in header' (#34) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/34
2025-12-11 16:56:04 +00:00
Kelly
529c447413 refactor: Move worker scaling to Workers page with password confirmation
- Worker scaling controls now on /workers page only (removed from /tasks)
- Password confirmation required before scaling
- Show only git SHA in header (removed version number)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:41:58 -07:00
Kelly
1eaf95c06b fix: Show only git SHA in header, remove version number
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:35:41 -07:00
kelly
138ed17d8b Merge pull request 'feat: Worker scaling from admin UI with password confirmation' (#33) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/33
2025-12-11 16:29:04 +00:00
Kelly
a880c41d89 feat: Add password confirmation for worker scaling + RBAC
- Add /api/auth/verify-password endpoint for re-authentication
- Add PasswordConfirmModal component for sensitive actions
- Worker scaling (+/-) now requires password confirmation
- Add RBAC (ServiceAccount, Role, RoleBinding) for scraper pod
- Scraper pod can now read/scale worker deployment via k8s API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:16:27 -07:00
Kelly
2a9ae61dce fix(ui): Make Tasks Dashboard header sticky
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 09:04:21 -07:00
kelly
1f21911fa1 Merge pull request 'feat(admin): Worker scaling controls via k8s API' (#32) from feat/worker-scaling into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/32
2025-12-11 16:01:25 +00:00
Kelly
6f0a58f5d2 fix(k8s): Correct API call signatures for k8s client v1.4
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 08:47:27 -07:00
Kelly
8206dce821 feat(admin): Worker scaling controls via k8s API
- Add /api/k8s/workers endpoint to get deployment status
- Add /api/k8s/workers/scale endpoint to scale replicas (0-50)
- Add worker scaling UI to Tasks Dashboard (+/- 5 workers)
- Shows ready/desired replica count
- Uses in-cluster config in k8s, kubeconfig locally

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 08:24:32 -07:00
kelly
ced1afaa8a Merge pull request 'fix(ci): CI config fix + 25 workers + pool starts paused' (#31) from fix/ci-filename into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/31
2025-12-11 08:47:26 +00:00
Kelly
d6c602c567 fix(ui): Remove Cleanup Stale button from Workers page
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:29:25 -07:00
Kelly
a252a7fefd feat(tasks): 25 workers, pool starts paused by default
- Increase worker replicas from 5 to 25
- Task pool now starts PAUSED on deploy, admin must click Start Pool
- Prevents workers from grabbing tasks before system is ready

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:19:02 -07:00
Kelly
83b06c21cc fix(tasks): Stop spinner in status cards when pool is paused
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 01:14:25 -07:00
kelly
f5214da54c Merge pull request 'fix(ci): Fix Woodpecker config - remove invalid top-level when' (#30) from fix/ci-filename into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/30
2025-12-11 08:07:16 +00:00
Kelly
e3d4dd0127 fix(ci): Fix Woodpecker config - remove invalid top-level when
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 00:45:08 -07:00
kelly
d0ee0d72f5 Merge pull request 'feat(tasks): Add task pool start/stop toggle' (#29) from feat/task-pool-toggle into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/29
2025-12-11 07:21:02 +00:00
kelly
521f0550cd Merge pull request 'feat: Admin UI cleanup and dispensary schedule page' (#28) from fix/worker-proxy-wait into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/28
2025-12-11 07:08:03 +00:00
Kelly
8a09691e91 feat(tasks): Add task pool start/stop toggle
- Add task-pool-state.ts for shared pause/resume state
- Add /api/tasks/pool/status, pause, resume endpoints
- Add Start/Stop Pool toggle button to TasksDashboard
- Spinner stops when pool is closed
- Fix is_active column name in store-discovery.ts
- Fix missing active column in task-service.ts claimTask
- Auto-refresh every 15 seconds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 00:07:14 -07:00
Kelly
459ad7d9c9 fix(tasks): Fix missing column errors in task queries
- Change 'active' to 'is_active' in states table query (store-discovery.ts)
- Remove non-existent 'active' column check from worker_tasks query (task-service.ts)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:54:28 -07:00
Kelly
d102d27731 feat(admin): Dispensary schedule page and UI cleanup
- Add DispensarySchedule page showing crawl history and upcoming schedule
- Add /dispensaries/:state/:city/:slug/schedule route
- Add API endpoint for store crawl history
- Update View Schedule link to use dispensary-specific route
- Remove colored badges from DispensaryDetail product table (plain text)
- Make Details button ghost style in product table
- Add "Sort by States" option to IntelligenceBrands
- Remove status filter dropdown from Dispensaries page

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:50:47 -07:00
kelly
01810c40a1 Merge pull request 'fix(worker): Wait for proxies instead of crashing' (#27) from fix/worker-proxy-wait into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/27
2025-12-11 06:43:29 +00:00
Kelly
b7d33e1cbf fix(admin): Clean up store detail and intelligence pages
- Remove Update dropdown from DispensaryDetail page
- Remove Crawl Now button from StoreDetailPage
- Change "Last Crawl" to "Last Updated" on both detail pages
- Tone down emerald colors on StoreDetailPage (use gray borders/tabs)
- Simplify THC/CBD/Stock badges to plain text
- Remove duplicate state dropdown from IntelligenceStores filters
- Make store rows clickable in IntelligenceStores

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:42:50 -07:00
Kelly
5b34b5a78c fix(admin): Consistent navigation across Intelligence pages
- Add state selector dropdown to all three Intelligence pages (Brands, Stores, Pricing)
- Use consistent emerald-styled page navigation badges with current page highlighted
- Remove Refresh buttons from all Intelligence pages
- Update chart styling to use emerald gradient bars (matching Pricing page)
- Load all available states from orchestrator API instead of extracting from local data
- Fix z-index and styling on state dropdown for better visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:32:57 -07:00
Kelly
c091d2316b fix(dashboard): Remove refresh button and HealthPanel
- Removed refresh button and refreshing state from Dashboard
- Removed HealthPanel component (deploy status auto-refresh)
- Simplified header layout

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:28:18 -07:00
Kelly
e8862b8a8b fix(national): Remove Refresh Metrics button
Removed unused refresh button and related state/handlers from
National Dashboard.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:27:01 -07:00
Kelly
1b46ab699d fix(national): Show all states count, not filtered "active" states
The "Active States" metric was arbitrary and confusing. Changed to
show total states count - all states in the system regardless of
whether they have data or not.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:25:50 -07:00
Kelly
ac1995f63f fix(pricing): Simplify category chart to prevent overflow
- Replace complex price range bars with simple horizontal bars
- Use overflow-hidden to prevent bars extending beyond container
- Calculate bar width as percentage of max avg price
- Limit to top 12 categories for cleaner display
- Fixed-width labels prevent layout shift

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:24:31 -07:00
Kelly
de93669652 fix(national): Count active states by product data, not crawl status
Active states should count states with actual product data, not just
states where crawling is enabled. A state can have historical data
even if crawling is currently disabled.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:23:20 -07:00
Kelly
dffc124920 fix(national): Fix active states count and remove StateBadge
- Change active_states to count states with crawl_enabled=true dispensaries
- Filter all national summary queries by crawl_enabled=true
- Remove unused StateBadge from National Dashboard header
- StateBadge was showing "Arizona" with no way to change it

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:22:19 -07:00
Kelly
932ceb0287 feat(intelligence): Add state filter to all Intelligence pages
- Add state filter to Intelligence Brands API and frontend
- Add state filter to Intelligence Pricing API and frontend
- Add state filter to Intelligence Stores API and frontend
- Fix null safety issues with toLocaleString() calls
- Update backend /stores endpoint to return skuCount, snapshotCount, chainName
- Add overall stats to pricing endpoint

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:19:54 -07:00
Kelly
824d48fd85 fix: Add curl to Docker, add active flag to worker_tasks
- Install curl in Docker container for Dutchie HTTP requests
- Add 'active' column to worker_tasks (default false) to prevent
  accidental task execution on startup
- Update task-service to only claim tasks where active=true

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:12:09 -07:00
Kelly
47fdab0382 fix: Filter orchestrator states by crawl_enabled
The states dropdown was showing count of ALL dispensaries instead of
just crawl-enabled ones. Now correctly filters to match the actual
stores that would be displayed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:09:04 -07:00
Kelly
ed7ddc6375 ci: Add database migration step to deploy pipeline
Migrations now run automatically before deployments:
1. Build new Docker image
2. Run migrations using the new image
3. Deploy to Kubernetes

Requires new secrets: db_host, db_port, db_name, db_user, db_pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 23:07:25 -07:00
Kelly
cf06f4a8c0 feat(worker): Listen for proxy_added notifications
- Workers now use PostgreSQL LISTEN/NOTIFY to wake up immediately when proxies are added
- Added trigger on proxies table to NOTIFY 'proxy_added' when active proxy inserted/updated
- Falls back to 30s polling if LISTEN fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 22:58:00 -07:00
kelly
61e915968f Merge pull request 'feat(tasks): Refactor task workflow with payload/refresh separation' (#26) from feat/task-workflow-refactor into master 2025-12-11 05:24:11 +00:00
kelly
a4338669a9 Merge pull request 'fix(auth): Prioritize JWT token over trusted origin bypass' (#24) from fix/auth-token-priority into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/24
2025-12-11 01:34:10 +00:00
42 changed files with 3698 additions and 954 deletions

View File

@@ -1,6 +1,3 @@
when:
- event: [push, pull_request]
steps:
# ===========================================
# PR VALIDATION: Parallel type checks (PRs only)
@@ -163,7 +160,7 @@ steps:
event: push
# ===========================================
# STAGE 3: Deploy (after Docker builds)
# STAGE 3: Deploy and Run Migrations
# ===========================================
deploy:
image: bitnami/kubectl:latest
@@ -174,12 +171,15 @@ steps:
- mkdir -p ~/.kube
- echo "$KUBECONFIG_CONTENT" | tr -d '[:space:]' | base64 -d > ~/.kube/config
- chmod 600 ~/.kube/config
# Deploy backend first
- kubectl set image deployment/scraper scraper=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
# Note: Migrations run automatically at startup via auto-migrate
# Deploy remaining services
- kubectl set image deployment/scraper-worker worker=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/cannaiq-frontend cannaiq-frontend=code.cannabrands.app/creationshop/cannaiq-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/findadispo-frontend findadispo-frontend=code.cannabrands.app/creationshop/findadispo-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/findagram-frontend findagram-frontend=code.cannabrands.app/creationshop/findagram-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
- kubectl rollout status deployment/cannaiq-frontend -n dispensary-scraper --timeout=120s
depends_on:
- docker-backend

191
.woodpecker/ci.yml Normal file
View File

@@ -0,0 +1,191 @@
steps:
# ===========================================
# PR VALIDATION: Parallel type checks (PRs only)
# ===========================================
typecheck-backend:
image: code.cannabrands.app/creationshop/node:20
commands:
- cd backend
- npm ci --prefer-offline
- npx tsc --noEmit
depends_on: []
when:
event: pull_request
typecheck-cannaiq:
image: code.cannabrands.app/creationshop/node:20
commands:
- cd cannaiq
- npm ci --prefer-offline
- npx tsc --noEmit
depends_on: []
when:
event: pull_request
typecheck-findadispo:
image: code.cannabrands.app/creationshop/node:20
commands:
- cd findadispo/frontend
- npm ci --prefer-offline
- npx tsc --noEmit 2>/dev/null || true
depends_on: []
when:
event: pull_request
typecheck-findagram:
image: code.cannabrands.app/creationshop/node:20
commands:
- cd findagram/frontend
- npm ci --prefer-offline
- npx tsc --noEmit 2>/dev/null || true
depends_on: []
when:
event: pull_request
# ===========================================
# AUTO-MERGE: Merge PR after all checks pass
# ===========================================
auto-merge:
image: alpine:latest
environment:
GITEA_TOKEN:
from_secret: gitea_token
commands:
- apk add --no-cache curl
- |
echo "Merging PR #${CI_COMMIT_PULL_REQUEST}..."
curl -s -X POST \
-H "Authorization: token $GITEA_TOKEN" \
-H "Content-Type: application/json" \
-d '{"Do":"merge"}' \
"https://code.cannabrands.app/api/v1/repos/Creationshop/dispensary-scraper/pulls/${CI_COMMIT_PULL_REQUEST}/merge"
depends_on:
- typecheck-backend
- typecheck-cannaiq
- typecheck-findadispo
- typecheck-findagram
when:
event: pull_request
# ===========================================
# MASTER DEPLOY: Parallel Docker builds
# ===========================================
docker-backend:
image: woodpeckerci/plugin-docker-buildx
settings:
registry: code.cannabrands.app
repo: code.cannabrands.app/creationshop/dispensary-scraper
tags:
- latest
- ${CI_COMMIT_SHA:0:8}
dockerfile: backend/Dockerfile
context: backend
username:
from_secret: registry_username
password:
from_secret: registry_password
platforms: linux/amd64
provenance: false
build_args:
APP_BUILD_VERSION: ${CI_COMMIT_SHA:0:8}
APP_GIT_SHA: ${CI_COMMIT_SHA}
APP_BUILD_TIME: ${CI_PIPELINE_CREATED}
CONTAINER_IMAGE_TAG: ${CI_COMMIT_SHA:0:8}
depends_on: []
when:
branch: master
event: push
docker-cannaiq:
image: woodpeckerci/plugin-docker-buildx
settings:
registry: code.cannabrands.app
repo: code.cannabrands.app/creationshop/cannaiq-frontend
tags:
- latest
- ${CI_COMMIT_SHA:0:8}
dockerfile: cannaiq/Dockerfile
context: cannaiq
username:
from_secret: registry_username
password:
from_secret: registry_password
platforms: linux/amd64
provenance: false
depends_on: []
when:
branch: master
event: push
docker-findadispo:
image: woodpeckerci/plugin-docker-buildx
settings:
registry: code.cannabrands.app
repo: code.cannabrands.app/creationshop/findadispo-frontend
tags:
- latest
- ${CI_COMMIT_SHA:0:8}
dockerfile: findadispo/frontend/Dockerfile
context: findadispo/frontend
username:
from_secret: registry_username
password:
from_secret: registry_password
platforms: linux/amd64
provenance: false
depends_on: []
when:
branch: master
event: push
docker-findagram:
image: woodpeckerci/plugin-docker-buildx
settings:
registry: code.cannabrands.app
repo: code.cannabrands.app/creationshop/findagram-frontend
tags:
- latest
- ${CI_COMMIT_SHA:0:8}
dockerfile: findagram/frontend/Dockerfile
context: findagram/frontend
username:
from_secret: registry_username
password:
from_secret: registry_password
platforms: linux/amd64
provenance: false
depends_on: []
when:
branch: master
event: push
# ===========================================
# STAGE 3: Deploy and Run Migrations
# ===========================================
deploy:
image: bitnami/kubectl:latest
environment:
KUBECONFIG_CONTENT:
from_secret: kubeconfig_data
commands:
- mkdir -p ~/.kube
- echo "$KUBECONFIG_CONTENT" | tr -d '[:space:]' | base64 -d > ~/.kube/config
- chmod 600 ~/.kube/config
# Deploy backend first
- kubectl set image deployment/scraper scraper=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl rollout status deployment/scraper -n dispensary-scraper --timeout=300s
# Note: Migrations run automatically at startup via auto-migrate
# Deploy remaining services
- kubectl set image deployment/scraper-worker worker=code.cannabrands.app/creationshop/dispensary-scraper:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/cannaiq-frontend cannaiq-frontend=code.cannabrands.app/creationshop/cannaiq-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/findadispo-frontend findadispo-frontend=code.cannabrands.app/creationshop/findadispo-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl set image deployment/findagram-frontend findagram-frontend=code.cannabrands.app/creationshop/findagram-frontend:${CI_COMMIT_SHA:0:8} -n dispensary-scraper
- kubectl rollout status deployment/cannaiq-frontend -n dispensary-scraper --timeout=120s
depends_on:
- docker-backend
- docker-cannaiq
- docker-findadispo
- docker-findagram
when:
branch: master
event: push

View File

@@ -25,8 +25,9 @@ ENV APP_GIT_SHA=${APP_GIT_SHA}
ENV APP_BUILD_TIME=${APP_BUILD_TIME}
ENV CONTAINER_IMAGE_TAG=${CONTAINER_IMAGE_TAG}
# Install Chromium dependencies
# Install Chromium dependencies and curl for HTTP requests
RUN apt-get update && apt-get install -y \
curl \
chromium \
fonts-liberation \
libnss3 \

View File

@@ -362,6 +362,148 @@ SET status = 'pending', retry_count = retry_count + 1
WHERE status = 'failed' AND retry_count < max_retries;
```
## Concurrent Task Processing (Added 2024-12)
Workers can now process multiple tasks concurrently within a single worker instance. This improves throughput by utilizing async I/O efficiently.
### Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Pod (K8s) │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ TaskWorker │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ (concurrent)│ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ │ Resource Monitor │ │
│ │ ├── Memory: 65% (threshold: 85%) │ │
│ │ ├── CPU: 45% (threshold: 90%) │ │
│ │ └── Status: Normal │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MAX_CONCURRENT_TASKS` | 3 | Maximum tasks a worker will run concurrently |
| `MEMORY_BACKOFF_THRESHOLD` | 0.85 | Back off when heap memory exceeds 85% |
| `CPU_BACKOFF_THRESHOLD` | 0.90 | Back off when CPU exceeds 90% |
| `BACKOFF_DURATION_MS` | 10000 | How long to wait when backing off (10s) |
### How It Works
1. **Main Loop**: Worker continuously tries to fill up to `MAX_CONCURRENT_TASKS`
2. **Resource Monitoring**: Before claiming a new task, worker checks memory and CPU
3. **Backoff**: If resources exceed thresholds, worker pauses and stops claiming new tasks
4. **Concurrent Execution**: Tasks run in parallel using `Promise` - they don't block each other
5. **Graceful Shutdown**: On SIGTERM/decommission, worker stops claiming but waits for active tasks
### Resource Monitoring
```typescript
// ResourceStats interface
interface ResourceStats {
memoryPercent: number; // Current heap usage as decimal (0.0-1.0)
memoryMb: number; // Current heap used in MB
memoryTotalMb: number; // Total heap available in MB
cpuPercent: number; // CPU usage as percentage (0-100)
isBackingOff: boolean; // True if worker is in backoff state
backoffReason: string; // Why the worker is backing off
}
```
### Heartbeat Data
Workers report the following in their heartbeat:
```json
{
"worker_id": "worker-abc123",
"current_task_id": 456,
"current_task_ids": [456, 457, 458],
"active_task_count": 3,
"max_concurrent_tasks": 3,
"status": "active",
"resources": {
"memory_mb": 256,
"memory_total_mb": 512,
"memory_rss_mb": 320,
"memory_percent": 50,
"cpu_user_ms": 12500,
"cpu_system_ms": 3200,
"cpu_percent": 45,
"is_backing_off": false,
"backoff_reason": null
}
}
```
### Backoff Behavior
When resources exceed thresholds:
1. Worker logs the backoff reason:
```
[TaskWorker] MyWorker backing off: Memory at 87.3% (threshold: 85%)
```
2. Worker stops claiming new tasks but continues existing tasks
3. After `BACKOFF_DURATION_MS`, worker rechecks resources
4. When resources return to normal:
```
[TaskWorker] MyWorker resuming normal operation
```
### UI Display
The Workers Dashboard shows:
- **Tasks Column**: `2/3 tasks` (active/max concurrent)
- **Resources Column**: Memory % and CPU % with color coding
- Green: < 50%
- Yellow: 50-74%
- Amber: 75-89%
- Red: 90%+
- **Backing Off**: Orange warning badge when worker is in backoff state
### Task Count Badge Details
```
┌─────────────────────────────────────────────┐
│ Worker: "MyWorker" │
│ Tasks: 2/3 tasks #456, #457 │
│ Resources: 🧠 65% 💻 45% │
│ Status: ● Active │
└─────────────────────────────────────────────┘
```
### Best Practices
1. **Start Conservative**: Use `MAX_CONCURRENT_TASKS=3` initially
2. **Monitor Resources**: Watch for frequent backoffs in logs
3. **Tune Per Workload**: I/O-bound tasks benefit from higher concurrency
4. **Scale Horizontally**: Add more pods rather than cranking concurrency too high
### Code References
| File | Purpose |
|------|---------|
| `src/tasks/task-worker.ts:68-71` | Concurrency environment variables |
| `src/tasks/task-worker.ts:104-111` | ResourceStats interface |
| `src/tasks/task-worker.ts:149-179` | getResourceStats() method |
| `src/tasks/task-worker.ts:184-196` | shouldBackOff() method |
| `src/tasks/task-worker.ts:462-516` | mainLoop() with concurrent claiming |
| `src/routes/worker-registry.ts:148-195` | Heartbeat endpoint handling |
| `cannaiq/src/pages/WorkersDashboard.tsx:233-305` | UI components for resources |
## Monitoring
### Logs

View File

@@ -0,0 +1,27 @@
-- Migration: Worker Commands Table
-- Purpose: Store commands for workers (decommission, etc.)
-- Workers poll this table after each task to check for commands
CREATE TABLE IF NOT EXISTS worker_commands (
id SERIAL PRIMARY KEY,
worker_id TEXT NOT NULL,
command TEXT NOT NULL, -- 'decommission', 'pause', 'resume'
reason TEXT,
issued_by TEXT,
issued_at TIMESTAMPTZ DEFAULT NOW(),
acknowledged_at TIMESTAMPTZ,
executed_at TIMESTAMPTZ,
status TEXT DEFAULT 'pending' -- 'pending', 'acknowledged', 'executed', 'cancelled'
);
-- Index for worker lookups
CREATE INDEX IF NOT EXISTS idx_worker_commands_worker_id ON worker_commands(worker_id);
CREATE INDEX IF NOT EXISTS idx_worker_commands_pending ON worker_commands(worker_id, status) WHERE status = 'pending';
-- Add decommission_requested column to worker_registry for quick checks
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_requested BOOLEAN DEFAULT FALSE;
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_reason TEXT;
ALTER TABLE worker_registry ADD COLUMN IF NOT EXISTS decommission_requested_at TIMESTAMPTZ;
-- Comment
COMMENT ON TABLE worker_commands IS 'Commands issued to workers (decommission after task, pause, etc.)';

View File

@@ -0,0 +1,27 @@
-- Migration: 082_proxy_notification_trigger
-- Date: 2024-12-11
-- Description: Add PostgreSQL NOTIFY trigger to alert workers when proxies are added
-- Create function to notify workers when active proxy is added/activated
CREATE OR REPLACE FUNCTION notify_proxy_added()
RETURNS TRIGGER AS $$
BEGIN
-- Only notify if proxy is active
IF NEW.active = true THEN
PERFORM pg_notify('proxy_added', NEW.id::text);
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Drop existing trigger if any
DROP TRIGGER IF EXISTS proxy_added_trigger ON proxies;
-- Create trigger on insert and update of active column
CREATE TRIGGER proxy_added_trigger
AFTER INSERT OR UPDATE OF active ON proxies
FOR EACH ROW
EXECUTE FUNCTION notify_proxy_added();
COMMENT ON FUNCTION notify_proxy_added() IS
'Sends PostgreSQL NOTIFY to proxy_added channel when an active proxy is added or activated. Workers LISTEN on this channel to wake up immediately.';

View File

@@ -1,6 +1,6 @@
{
"name": "dutchie-menus-backend",
"version": "1.5.1",
"version": "1.6.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
@@ -46,6 +46,97 @@
"resolved": "https://registry.npmjs.org/@ioredis/commands/-/commands-1.4.0.tgz",
"integrity": "sha512-aFT2yemJJo+TZCmieA7qnYGQooOS7QfNmYrzGtsYd3g9j5iDP8AimYYAesf79ohjbLG12XxC4nG5DyEnC88AsQ=="
},
"node_modules/@jsep-plugin/assignment": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/@jsep-plugin/assignment/-/assignment-1.3.0.tgz",
"integrity": "sha512-VVgV+CXrhbMI3aSusQyclHkenWSAm95WaiKrMxRFam3JSUiIaQjoMIw2sEs/OX4XifnqeQUN4DYbJjlA8EfktQ==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@jsep-plugin/regex": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/@jsep-plugin/regex/-/regex-1.0.4.tgz",
"integrity": "sha512-q7qL4Mgjs1vByCaTnDFcBnV9HS7GVPJX5vyVoCgZHNSC9rjwIlmbXG5sUuorR5ndfHAIlJ8pVStxvjXHbNvtUg==",
"engines": {
"node": ">= 10.16.0"
},
"peerDependencies": {
"jsep": "^0.4.0||^1.0.0"
}
},
"node_modules/@kubernetes/client-node": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/@kubernetes/client-node/-/client-node-1.4.0.tgz",
"integrity": "sha512-Zge3YvF7DJi264dU1b3wb/GmzR99JhUpqTvp+VGHfwZT+g7EOOYNScDJNZwXy9cszyIGPIs0VHr+kk8e95qqrA==",
"dependencies": {
"@types/js-yaml": "^4.0.1",
"@types/node": "^24.0.0",
"@types/node-fetch": "^2.6.13",
"@types/stream-buffers": "^3.0.3",
"form-data": "^4.0.0",
"hpagent": "^1.2.0",
"isomorphic-ws": "^5.0.0",
"js-yaml": "^4.1.0",
"jsonpath-plus": "^10.3.0",
"node-fetch": "^2.7.0",
"openid-client": "^6.1.3",
"rfc4648": "^1.3.0",
"socks-proxy-agent": "^8.0.4",
"stream-buffers": "^3.0.2",
"tar-fs": "^3.0.9",
"ws": "^8.18.2"
}
},
"node_modules/@kubernetes/client-node/node_modules/@types/node": {
"version": "24.10.3",
"resolved": "https://registry.npmjs.org/@types/node/-/node-24.10.3.tgz",
"integrity": "sha512-gqkrWUsS8hcm0r44yn7/xZeV1ERva/nLgrLxFRUGb7aoNMIJfZJ3AC261zDQuOAKC7MiXai1WCpYc48jAHoShQ==",
"dependencies": {
"undici-types": "~7.16.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/tar-fs": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/tar-fs/-/tar-fs-3.1.1.tgz",
"integrity": "sha512-LZA0oaPOc2fVo82Txf3gw+AkEd38szODlptMYejQUhndHMLQ9M059uXR+AfS7DNo0NpINvSqDsvyaCrBVkptWg==",
"dependencies": {
"pump": "^3.0.0",
"tar-stream": "^3.1.5"
},
"optionalDependencies": {
"bare-fs": "^4.0.1",
"bare-path": "^3.0.0"
}
},
"node_modules/@kubernetes/client-node/node_modules/undici-types": {
"version": "7.16.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.16.0.tgz",
"integrity": "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="
},
"node_modules/@kubernetes/client-node/node_modules/ws": {
"version": "8.18.3",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.18.3.tgz",
"integrity": "sha512-PEIGCY5tSlUt50cqyMXfCzX+oOPqN0vuGqWzbcJ2xvnkzkq46oOpz7dQaTDBdfICb4N14+GARUDw2XV2N4tvzg==",
"engines": {
"node": ">=10.0.0"
},
"peerDependencies": {
"bufferutil": "^4.0.1",
"utf-8-validate": ">=5.0.2"
},
"peerDependenciesMeta": {
"bufferutil": {
"optional": true
},
"utf-8-validate": {
"optional": true
}
}
},
"node_modules/@mapbox/node-pre-gyp": {
"version": "1.0.11",
"resolved": "https://registry.npmjs.org/@mapbox/node-pre-gyp/-/node-pre-gyp-1.0.11.tgz",
@@ -251,6 +342,11 @@
"integrity": "sha512-r8Tayk8HJnX0FztbZN7oVqGccWgw98T/0neJphO91KkmOzug1KkofZURD4UaD5uH8AqcFLfdPErnBod0u71/qg==",
"dev": true
},
"node_modules/@types/js-yaml": {
"version": "4.0.9",
"resolved": "https://registry.npmjs.org/@types/js-yaml/-/js-yaml-4.0.9.tgz",
"integrity": "sha512-k4MGaQl5TGo/iipqb2UDG2UwjXziSWkh0uysQelTlJpX1qGlpUZYm8PnO4DxG1qBomtJUdYJ6qR6xdIah10JLg=="
},
"node_modules/@types/jsonwebtoken": {
"version": "9.0.10",
"resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.10.tgz",
@@ -276,7 +372,6 @@
"version": "20.19.25",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz",
"integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==",
"devOptional": true,
"dependencies": {
"undici-types": "~6.21.0"
}
@@ -287,6 +382,15 @@
"integrity": "sha512-0ikrnug3/IyneSHqCBeslAhlK2aBfYek1fGo4bP4QnZPmiqSGRK+Oy7ZMisLWkesffJvQ1cqAcBnJC+8+nxIAg==",
"dev": true
},
"node_modules/@types/node-fetch": {
"version": "2.6.13",
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.13.tgz",
"integrity": "sha512-QGpRVpzSaUs30JBSGPjOg4Uveu384erbHBoT1zeONvyCfwQxIkUshLAOqN/k9EjGviPRmWTTe6aH2qySWKTVSw==",
"dependencies": {
"@types/node": "*",
"form-data": "^4.0.4"
}
},
"node_modules/@types/pg": {
"version": "8.15.6",
"resolved": "https://registry.npmjs.org/@types/pg/-/pg-8.15.6.tgz",
@@ -340,6 +444,14 @@
"@types/node": "*"
}
},
"node_modules/@types/stream-buffers": {
"version": "3.0.8",
"resolved": "https://registry.npmjs.org/@types/stream-buffers/-/stream-buffers-3.0.8.tgz",
"integrity": "sha512-J+7VaHKNvlNPJPEJXX/fKa9DZtR/xPMwuIbe+yNOwp1YB+ApUOBv2aUpEoBJEi8nJgbgs1x8e73ttg0r1rSUdw==",
"dependencies": {
"@types/node": "*"
}
},
"node_modules/@types/uuid": {
"version": "9.0.8",
"resolved": "https://registry.npmjs.org/@types/uuid/-/uuid-9.0.8.tgz",
@@ -520,6 +632,78 @@
}
}
},
"node_modules/bare-fs": {
"version": "4.5.2",
"resolved": "https://registry.npmjs.org/bare-fs/-/bare-fs-4.5.2.tgz",
"integrity": "sha512-veTnRzkb6aPHOvSKIOy60KzURfBdUflr5VReI+NSaPL6xf+XLdONQgZgpYvUuZLVQ8dCqxpBAudaOM1+KpAUxw==",
"optional": true,
"dependencies": {
"bare-events": "^2.5.4",
"bare-path": "^3.0.0",
"bare-stream": "^2.6.4",
"bare-url": "^2.2.2",
"fast-fifo": "^1.3.2"
},
"engines": {
"bare": ">=1.16.0"
},
"peerDependencies": {
"bare-buffer": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
}
}
},
"node_modules/bare-os": {
"version": "3.6.2",
"resolved": "https://registry.npmjs.org/bare-os/-/bare-os-3.6.2.tgz",
"integrity": "sha512-T+V1+1srU2qYNBmJCXZkUY5vQ0B4FSlL3QDROnKQYOqeiQR8UbjNHlPa+TIbM4cuidiN9GaTaOZgSEgsvPbh5A==",
"optional": true,
"engines": {
"bare": ">=1.14.0"
}
},
"node_modules/bare-path": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/bare-path/-/bare-path-3.0.0.tgz",
"integrity": "sha512-tyfW2cQcB5NN8Saijrhqn0Zh7AnFNsnczRcuWODH0eYAXBsJ5gVxAUuNr7tsHSC6IZ77cA0SitzT+s47kot8Mw==",
"optional": true,
"dependencies": {
"bare-os": "^3.0.1"
}
},
"node_modules/bare-stream": {
"version": "2.7.0",
"resolved": "https://registry.npmjs.org/bare-stream/-/bare-stream-2.7.0.tgz",
"integrity": "sha512-oyXQNicV1y8nc2aKffH+BUHFRXmx6VrPzlnaEvMhram0nPBrKcEdcyBg5r08D0i8VxngHFAiVyn1QKXpSG0B8A==",
"optional": true,
"dependencies": {
"streamx": "^2.21.0"
},
"peerDependencies": {
"bare-buffer": "*",
"bare-events": "*"
},
"peerDependenciesMeta": {
"bare-buffer": {
"optional": true
},
"bare-events": {
"optional": true
}
}
},
"node_modules/bare-url": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/bare-url/-/bare-url-2.3.2.tgz",
"integrity": "sha512-ZMq4gd9ngV5aTMa5p9+UfY0b3skwhHELaDkhEHetMdX0LRkW9kzaym4oo/Eh+Ghm0CCDuMTsRIGM/ytUc1ZYmw==",
"optional": true,
"dependencies": {
"bare-path": "^3.0.0"
}
},
"node_modules/base64-js": {
"version": "1.5.1",
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
@@ -2019,6 +2203,14 @@
"node": ">=16.0.0"
}
},
"node_modules/hpagent": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/hpagent/-/hpagent-1.2.0.tgz",
"integrity": "sha512-A91dYTeIB6NoXG+PxTQpCCDDnfHsW9kc06Lvpu1TEe9gnd6ZFeiBoRO9JvzEv6xK7EX97/dUE8g/vBMTqTS3CA==",
"engines": {
"node": ">=14"
}
},
"node_modules/htmlparser2": {
"version": "10.0.0",
"resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz",
@@ -2382,6 +2574,22 @@
"node": ">=0.10.0"
}
},
"node_modules/isomorphic-ws": {
"version": "5.0.0",
"resolved": "https://registry.npmjs.org/isomorphic-ws/-/isomorphic-ws-5.0.0.tgz",
"integrity": "sha512-muId7Zzn9ywDsyXgTIafTry2sV3nySZeUDe6YedVd1Hvuuep5AsIlqK+XefWpYTyJG5e503F2xIuT2lcU6rCSw==",
"peerDependencies": {
"ws": "*"
}
},
"node_modules/jose": {
"version": "6.1.3",
"resolved": "https://registry.npmjs.org/jose/-/jose-6.1.3.tgz",
"integrity": "sha512-0TpaTfihd4QMNwrz/ob2Bp7X04yuxJkjRGi4aKmOqwhov54i6u79oCv7T+C7lo70MKH6BesI3vscD1yb/yzKXQ==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/js-tokens": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz",
@@ -2398,6 +2606,14 @@
"js-yaml": "bin/js-yaml.js"
}
},
"node_modules/jsep": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/jsep/-/jsep-1.4.0.tgz",
"integrity": "sha512-B7qPcEVE3NVkmSJbaYxvv4cHkVW7DQsZz13pUMrfS8z8Q/BuShN+gcTXrUlPiGqM2/t/EEaI030bpxMqY8gMlw==",
"engines": {
"node": ">= 10.16.0"
}
},
"node_modules/json-parse-even-better-errors": {
"version": "2.3.1",
"resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz",
@@ -2419,6 +2635,23 @@
"graceful-fs": "^4.1.6"
}
},
"node_modules/jsonpath-plus": {
"version": "10.3.0",
"resolved": "https://registry.npmjs.org/jsonpath-plus/-/jsonpath-plus-10.3.0.tgz",
"integrity": "sha512-8TNmfeTCk2Le33A3vRRwtuworG/L5RrgMvdjhKZxvyShO+mBu2fP50OWUjRLNtvw344DdDarFh9buFAZs5ujeA==",
"dependencies": {
"@jsep-plugin/assignment": "^1.3.0",
"@jsep-plugin/regex": "^1.0.4",
"jsep": "^1.4.0"
},
"bin": {
"jsonpath": "bin/jsonpath-cli.js",
"jsonpath-plus": "bin/jsonpath-cli.js"
},
"engines": {
"node": ">=18.0.0"
}
},
"node_modules/jsonwebtoken": {
"version": "9.0.2",
"resolved": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.2.tgz",
@@ -2493,6 +2726,11 @@
"resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
"integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg=="
},
"node_modules/lodash.clonedeep": {
"version": "4.5.0",
"resolved": "https://registry.npmjs.org/lodash.clonedeep/-/lodash.clonedeep-4.5.0.tgz",
"integrity": "sha512-H5ZhCF25riFd9uB5UCkVKo61m3S/xZk1x4wA6yp/L3RFP6Z/eHH1ymQcGLo7J3GMPfm0V/7m1tryHuGVxpqEBQ=="
},
"node_modules/lodash.defaults": {
"version": "4.2.0",
"resolved": "https://registry.npmjs.org/lodash.defaults/-/lodash.defaults-4.2.0.tgz",
@@ -2942,6 +3180,14 @@
"url": "https://github.com/fb55/nth-check?sponsor=1"
}
},
"node_modules/oauth4webapi": {
"version": "3.8.3",
"resolved": "https://registry.npmjs.org/oauth4webapi/-/oauth4webapi-3.8.3.tgz",
"integrity": "sha512-pQ5BsX3QRTgnt5HxgHwgunIRaDXBdkT23tf8dfzmtTIL2LTpdmxgbpbBm0VgFWAIDlezQvQCTgnVIUmHupXHxw==",
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/object-assign": {
"version": "4.1.1",
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
@@ -2980,6 +3226,18 @@
"wrappy": "1"
}
},
"node_modules/openid-client": {
"version": "6.8.1",
"resolved": "https://registry.npmjs.org/openid-client/-/openid-client-6.8.1.tgz",
"integrity": "sha512-VoYT6enBo6Vj2j3Q5Ec0AezS+9YGzQo1f5Xc42lreMGlfP4ljiXPKVDvCADh+XHCV/bqPu/wWSiCVXbJKvrODw==",
"dependencies": {
"jose": "^6.1.0",
"oauth4webapi": "^3.8.2"
},
"funding": {
"url": "https://github.com/sponsors/panva"
}
},
"node_modules/pac-proxy-agent": {
"version": "7.2.0",
"resolved": "https://registry.npmjs.org/pac-proxy-agent/-/pac-proxy-agent-7.2.0.tgz",
@@ -3883,6 +4141,11 @@
"url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1"
}
},
"node_modules/rfc4648": {
"version": "1.5.4",
"resolved": "https://registry.npmjs.org/rfc4648/-/rfc4648-1.5.4.tgz",
"integrity": "sha512-rRg/6Lb+IGfJqO05HZkN50UtY7K/JhxJag1kP23+zyMfrvoB0B7RWv06MbOzoc79RgCdNTiUaNsTT1AJZ7Z+cg=="
},
"node_modules/rimraf": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz",
@@ -4313,6 +4576,14 @@
"node": ">= 0.8"
}
},
"node_modules/stream-buffers": {
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/stream-buffers/-/stream-buffers-3.0.3.tgz",
"integrity": "sha512-pqMqwQCso0PBJt2PQmDO0cFj0lyqmiwOMiMSkVtRokl7e+ZTRYgDHKnuZNbqjiJXgsg4nuqtD/zxuo9KqTp0Yw==",
"engines": {
"node": ">= 0.10.0"
}
},
"node_modules/streamx": {
"version": "2.23.0",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.23.0.tgz",
@@ -4532,8 +4803,7 @@
"node_modules/undici-types": {
"version": "6.21.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
"integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
"devOptional": true
"integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ=="
},
"node_modules/universalify": {
"version": "2.0.1",
@@ -4556,6 +4826,14 @@
"resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz",
"integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg=="
},
"node_modules/user-agents": {
"version": "1.1.669",
"resolved": "https://registry.npmjs.org/user-agents/-/user-agents-1.1.669.tgz",
"integrity": "sha512-pbIzG+AOqCaIpySKJ4IAm1l0VyE4jMnK4y1thV8lm8PYxI+7X5uWcppOK7zY79TCKKTAnJH3/4gaVIZHsjrmJA==",
"dependencies": {
"lodash.clonedeep": "^4.5.0"
}
},
"node_modules/util": {
"version": "0.12.5",
"resolved": "https://registry.npmjs.org/util/-/util-0.12.5.tgz",

View File

@@ -146,6 +146,7 @@ import tasksRoutes from './routes/tasks';
import workerRegistryRoutes from './routes/worker-registry';
// Per TASK_WORKFLOW_2024-12-10.md: Raw payload access API
import payloadsRoutes from './routes/payloads';
import k8sRoutes from './routes/k8s';
// Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com)
// These domains can access the API without authentication
@@ -230,6 +231,10 @@ console.log('[WorkerRegistry] Routes registered at /api/worker-registry');
app.use('/api/payloads', payloadsRoutes);
console.log('[Payloads] Routes registered at /api/payloads');
// K8s control routes - worker scaling from admin UI
app.use('/api/k8s', k8sRoutes);
console.log('[K8s] Routes registered at /api/k8s');
// Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation
try {
const analyticsV2Router = createAnalyticsV2Router(getPool());

View File

@@ -702,12 +702,10 @@ export class StateQueryService {
async getNationalSummary(): Promise<NationalSummary> {
const stateMetrics = await this.getAllStateMetrics();
// Get all states count and aggregate metrics
const result = await this.pool.query(`
SELECT
COUNT(DISTINCT s.code) AS total_states,
COUNT(DISTINCT CASE WHEN EXISTS (
SELECT 1 FROM dispensaries d WHERE d.state = s.code AND d.menu_type IS NOT NULL
) THEN s.code END) AS active_states,
(SELECT COUNT(*) FROM dispensaries WHERE state IS NOT NULL) AS total_stores,
(SELECT COUNT(*) FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
@@ -725,7 +723,7 @@ export class StateQueryService {
return {
totalStates: parseInt(data.total_states),
activeStates: parseInt(data.active_states),
activeStates: parseInt(data.total_states), // Same as totalStates - all states shown
totalStores: parseInt(data.total_stores),
totalProducts: parseInt(data.total_products),
totalBrands: parseInt(data.total_brands),

View File

@@ -47,4 +47,27 @@ router.post('/refresh', authMiddleware, async (req: AuthRequest, res) => {
res.json({ token });
});
// Verify password for sensitive actions (requires current user to be authenticated)
router.post('/verify-password', authMiddleware, async (req: AuthRequest, res) => {
try {
const { password } = req.body;
if (!password) {
return res.status(400).json({ error: 'Password required' });
}
// Re-authenticate the current user with the provided password
const user = await authenticateUser(req.user!.email, password);
if (!user) {
return res.status(401).json({ error: 'Invalid password', verified: false });
}
res.json({ verified: true });
} catch (error) {
console.error('Password verification error:', error);
res.status(500).json({ error: 'Internal server error' });
}
});
export default router;

View File

@@ -14,13 +14,25 @@ router.use(authMiddleware);
/**
* GET /api/admin/intelligence/brands
* List all brands with state presence, store counts, and pricing
* Query params:
* - state: Filter by state (e.g., "AZ")
* - limit: Max results (default 500)
* - offset: Pagination offset
*/
router.get('/brands', async (req: Request, res: Response) => {
try {
const { limit = '500', offset = '0' } = req.query;
const { limit = '500', offset = '0', state } = req.query;
const limitNum = Math.min(parseInt(limit as string, 10), 1000);
const offsetNum = parseInt(offset as string, 10);
// Build WHERE clause based on state filter
let stateFilter = '';
const params: any[] = [limitNum, offsetNum];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $3';
params.push(state);
}
const { rows } = await pool.query(`
SELECT
sp.brand_name_raw as brand_name,
@@ -32,17 +44,26 @@ router.get('/brands', async (req: Request, res: Response) => {
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.brand_name_raw IS NOT NULL AND sp.brand_name_raw != ''
${stateFilter}
GROUP BY sp.brand_name_raw
ORDER BY store_count DESC, sku_count DESC
LIMIT $1 OFFSET $2
`, [limitNum, offsetNum]);
`, params);
// Get total count
// Get total count with same state filter
const countParams: any[] = [];
let countStateFilter = '';
if (state && state !== 'all') {
countStateFilter = 'AND d.state = $1';
countParams.push(state);
}
const { rows: countRows } = await pool.query(`
SELECT COUNT(DISTINCT brand_name_raw) as total
FROM store_products
WHERE brand_name_raw IS NOT NULL AND brand_name_raw != ''
`);
SELECT COUNT(DISTINCT sp.brand_name_raw) as total
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.brand_name_raw IS NOT NULL AND sp.brand_name_raw != ''
${countStateFilter}
`, countParams);
res.json({
brands: rows.map((r: any) => ({
@@ -147,23 +168,58 @@ router.get('/brands/:brandName/penetration', async (req: Request, res: Response)
/**
* GET /api/admin/intelligence/pricing
* Get pricing analytics by category
* Query params:
* - state: Filter by state (e.g., "AZ")
*/
router.get('/pricing', async (req: Request, res: Response) => {
try {
const { rows: categoryRows } = await pool.query(`
SELECT
sp.category_raw as category,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) as median_price,
COUNT(*) as product_count
FROM store_products sp
WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0
GROUP BY sp.category_raw
ORDER BY product_count DESC
`);
const { state } = req.query;
// Build WHERE clause based on state filter
let stateFilter = '';
const categoryParams: any[] = [];
const stateQueryParams: any[] = [];
const overallParams: any[] = [];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $1';
categoryParams.push(state);
overallParams.push(state);
}
// Category pricing with optional state filter
const categoryQuery = state && state !== 'all'
? `
SELECT
sp.category_raw as category,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) as median_price,
COUNT(*) as product_count
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0 ${stateFilter}
GROUP BY sp.category_raw
ORDER BY product_count DESC
`
: `
SELECT
sp.category_raw as category,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sp.price_rec)::numeric, 2) as median_price,
COUNT(*) as product_count
FROM store_products sp
WHERE sp.category_raw IS NOT NULL AND sp.price_rec > 0
GROUP BY sp.category_raw
ORDER BY product_count DESC
`;
const { rows: categoryRows } = await pool.query(categoryQuery, categoryParams);
// State pricing
const { rows: stateRows } = await pool.query(`
SELECT
d.state,
@@ -178,6 +234,31 @@ router.get('/pricing', async (req: Request, res: Response) => {
ORDER BY avg_price DESC
`);
// Overall stats with optional state filter
const overallQuery = state && state !== 'all'
? `
SELECT
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
COUNT(*) as total_products
FROM store_products sp
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.price_rec > 0 ${stateFilter}
`
: `
SELECT
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MIN(sp.price_rec) as min_price,
MAX(sp.price_rec) as max_price,
COUNT(*) as total_products
FROM store_products sp
WHERE sp.price_rec > 0
`;
const { rows: overallRows } = await pool.query(overallQuery, overallParams);
const overall = overallRows[0];
res.json({
byCategory: categoryRows.map((r: any) => ({
category: r.category,
@@ -194,6 +275,12 @@ router.get('/pricing', async (req: Request, res: Response) => {
maxPrice: r.max_price ? parseFloat(r.max_price) : null,
productCount: parseInt(r.product_count, 10),
})),
overall: {
avgPrice: overall?.avg_price ? parseFloat(overall.avg_price) : null,
minPrice: overall?.min_price ? parseFloat(overall.min_price) : null,
maxPrice: overall?.max_price ? parseFloat(overall.max_price) : null,
totalProducts: parseInt(overall?.total_products || '0', 10),
},
});
} catch (error: any) {
console.error('[Intelligence] Error fetching pricing:', error.message);
@@ -204,9 +291,23 @@ router.get('/pricing', async (req: Request, res: Response) => {
/**
* GET /api/admin/intelligence/stores
* Get store intelligence summary
* Query params:
* - state: Filter by state (e.g., "AZ")
* - limit: Max results (default 200)
*/
router.get('/stores', async (req: Request, res: Response) => {
try {
const { state, limit = '200' } = req.query;
const limitNum = Math.min(parseInt(limit as string, 10), 500);
// Build WHERE clause based on state filter
let stateFilter = '';
const params: any[] = [limitNum];
if (state && state !== 'all') {
stateFilter = 'AND d.state = $2';
params.push(state);
}
const { rows: storeRows } = await pool.query(`
SELECT
d.id,
@@ -216,17 +317,22 @@ router.get('/stores', async (req: Request, res: Response) => {
d.state,
d.menu_type,
d.crawl_enabled,
COUNT(DISTINCT sp.id) as product_count,
c.name as chain_name,
COUNT(DISTINCT sp.id) as sku_count,
COUNT(DISTINCT sp.brand_name_raw) as brand_count,
ROUND(AVG(sp.price_rec)::numeric, 2) as avg_price,
MAX(sp.updated_at) as last_product_update
MAX(sp.updated_at) as last_crawl,
(SELECT COUNT(*) FROM store_product_snapshots sps
WHERE sps.store_product_id IN (SELECT id FROM store_products WHERE dispensary_id = d.id)) as snapshot_count
FROM dispensaries d
LEFT JOIN store_products sp ON sp.dispensary_id = d.id
WHERE d.state IS NOT NULL
GROUP BY d.id, d.name, d.dba_name, d.city, d.state, d.menu_type, d.crawl_enabled
ORDER BY product_count DESC
LIMIT 200
`);
LEFT JOIN chains c ON d.chain_id = c.id
WHERE d.state IS NOT NULL AND d.crawl_enabled = true
${stateFilter}
GROUP BY d.id, d.name, d.dba_name, d.city, d.state, d.menu_type, d.crawl_enabled, c.name
ORDER BY sku_count DESC
LIMIT $1
`, params);
res.json({
stores: storeRows.map((r: any) => ({
@@ -237,10 +343,13 @@ router.get('/stores', async (req: Request, res: Response) => {
state: r.state,
menuType: r.menu_type,
crawlEnabled: r.crawl_enabled,
productCount: parseInt(r.product_count || '0', 10),
chainName: r.chain_name || null,
skuCount: parseInt(r.sku_count || '0', 10),
snapshotCount: parseInt(r.snapshot_count || '0', 10),
brandCount: parseInt(r.brand_count || '0', 10),
avgPrice: r.avg_price ? parseFloat(r.avg_price) : null,
lastProductUpdate: r.last_product_update,
lastCrawl: r.last_crawl,
crawlFrequencyHours: 4, // Default crawl frequency
})),
total: storeRows.length,
});

140
backend/src/routes/k8s.ts Normal file
View File

@@ -0,0 +1,140 @@
/**
* Kubernetes Control Routes
*
* Provides admin UI control over k8s resources like worker scaling.
* Uses in-cluster config when running in k8s, or kubeconfig locally.
*/
import { Router, Request, Response } from 'express';
import * as k8s from '@kubernetes/client-node';
const router = Router();
// K8s client setup - lazy initialization
let appsApi: k8s.AppsV1Api | null = null;
let k8sError: string | null = null;
function getK8sClient(): k8s.AppsV1Api | null {
if (appsApi) return appsApi;
if (k8sError) return null;
try {
const kc = new k8s.KubeConfig();
// Try in-cluster config first (when running in k8s)
try {
kc.loadFromCluster();
console.log('[K8s] Loaded in-cluster config');
} catch {
// Fall back to default kubeconfig (local dev)
try {
kc.loadFromDefault();
console.log('[K8s] Loaded default kubeconfig');
} catch (e) {
k8sError = 'No k8s config available';
console.log('[K8s] No config available - k8s routes disabled');
return null;
}
}
appsApi = kc.makeApiClient(k8s.AppsV1Api);
return appsApi;
} catch (e: any) {
k8sError = e.message;
console.error('[K8s] Failed to initialize client:', e.message);
return null;
}
}
const NAMESPACE = process.env.K8S_NAMESPACE || 'dispensary-scraper';
const WORKER_DEPLOYMENT = 'scraper-worker';
/**
* GET /api/k8s/workers
* Get current worker deployment status
*/
router.get('/workers', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.json({
success: true,
available: false,
error: k8sError || 'K8s not available',
replicas: 0,
readyReplicas: 0,
});
}
try {
const deployment = await client.readNamespacedDeployment({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
});
res.json({
success: true,
available: true,
replicas: deployment.spec?.replicas || 0,
readyReplicas: deployment.status?.readyReplicas || 0,
availableReplicas: deployment.status?.availableReplicas || 0,
updatedReplicas: deployment.status?.updatedReplicas || 0,
});
} catch (e: any) {
console.error('[K8s] Error getting deployment:', e.message);
res.status(500).json({
success: false,
error: e.message,
});
}
});
/**
* POST /api/k8s/workers/scale
* Scale worker deployment
* Body: { replicas: number }
*/
router.post('/workers/scale', async (req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: k8sError || 'K8s not available',
});
}
const { replicas } = req.body;
if (typeof replicas !== 'number' || replicas < 0 || replicas > 50) {
return res.status(400).json({
success: false,
error: 'replicas must be a number between 0 and 50',
});
}
try {
// Patch the deployment to set replicas
await client.patchNamespacedDeploymentScale({
name: WORKER_DEPLOYMENT,
namespace: NAMESPACE,
body: { spec: { replicas } },
});
console.log(`[K8s] Scaled ${WORKER_DEPLOYMENT} to ${replicas} replicas`);
res.json({
success: true,
replicas,
message: `Scaled to ${replicas} workers`,
});
} catch (e: any) {
console.error('[K8s] Error scaling deployment:', e.message);
res.status(500).json({
success: false,
error: e.message,
});
}
});
export default router;

View File

@@ -291,6 +291,107 @@ router.get('/stores/:id/summary', async (req: Request, res: Response) => {
}
});
/**
* GET /api/markets/stores/:id/crawl-history
* Get crawl history for a specific store
*/
router.get('/stores/:id/crawl-history', async (req: Request, res: Response) => {
try {
const { id } = req.params;
const { limit = '50' } = req.query;
const dispensaryId = parseInt(id, 10);
const limitNum = Math.min(parseInt(limit as string, 10), 100);
// Get crawl history from crawl_orchestration_traces
const { rows: historyRows } = await pool.query(`
SELECT
id,
run_id,
profile_key,
crawler_module,
state_at_start,
state_at_end,
total_steps,
duration_ms,
success,
error_message,
products_found,
started_at,
completed_at
FROM crawl_orchestration_traces
WHERE dispensary_id = $1
ORDER BY started_at DESC
LIMIT $2
`, [dispensaryId, limitNum]);
// Get next scheduled crawl if available
const { rows: scheduleRows } = await pool.query(`
SELECT
js.id as schedule_id,
js.job_name,
js.enabled,
js.base_interval_minutes,
js.jitter_minutes,
js.next_run_at,
js.last_run_at,
js.last_status
FROM job_schedules js
WHERE js.enabled = true
AND js.job_config->>'dispensaryId' = $1::text
ORDER BY js.next_run_at
LIMIT 1
`, [dispensaryId.toString()]);
// Get dispensary info for slug
const { rows: dispRows } = await pool.query(`
SELECT
id,
name,
dba_name,
slug,
state,
city,
menu_type,
platform_dispensary_id,
last_menu_scrape
FROM dispensaries
WHERE id = $1
`, [dispensaryId]);
res.json({
dispensary: dispRows[0] || null,
history: historyRows.map(row => ({
id: row.id,
runId: row.run_id,
profileKey: row.profile_key,
crawlerModule: row.crawler_module,
stateAtStart: row.state_at_start,
stateAtEnd: row.state_at_end,
totalSteps: row.total_steps,
durationMs: row.duration_ms,
success: row.success,
errorMessage: row.error_message,
productsFound: row.products_found,
startedAt: row.started_at?.toISOString() || null,
completedAt: row.completed_at?.toISOString() || null,
})),
nextSchedule: scheduleRows[0] ? {
scheduleId: scheduleRows[0].schedule_id,
jobName: scheduleRows[0].job_name,
enabled: scheduleRows[0].enabled,
baseIntervalMinutes: scheduleRows[0].base_interval_minutes,
jitterMinutes: scheduleRows[0].jitter_minutes,
nextRunAt: scheduleRows[0].next_run_at?.toISOString() || null,
lastRunAt: scheduleRows[0].last_run_at?.toISOString() || null,
lastStatus: scheduleRows[0].last_status,
} : null,
});
} catch (error: any) {
console.error('[Markets] Error fetching crawl history:', error.message);
res.status(500).json({ error: error.message });
}
});
/**
* GET /api/markets/stores/:id/products
* Get products for a store with filtering and pagination

View File

@@ -78,14 +78,14 @@ router.get('/metrics', async (_req: Request, res: Response) => {
/**
* GET /api/admin/orchestrator/states
* Returns array of states with at least one known dispensary
* Returns array of states with at least one crawl-enabled dispensary
*/
router.get('/states', async (_req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT DISTINCT state, COUNT(*) as store_count
FROM dispensaries
WHERE state IS NOT NULL
WHERE state IS NOT NULL AND crawl_enabled = true
GROUP BY state
ORDER BY state
`);

View File

@@ -13,6 +13,12 @@ import {
TaskFilter,
} from '../tasks/task-service';
import { pool } from '../db/pool';
import {
isTaskPoolPaused,
pauseTaskPool,
resumeTaskPool,
getTaskPoolStatus,
} from '../tasks/task-pool-state';
const router = Router();
@@ -592,4 +598,42 @@ router.post('/migration/full-migrate', async (req: Request, res: Response) => {
}
});
/**
* GET /api/tasks/pool/status
* Check if task pool is paused
*/
router.get('/pool/status', async (_req: Request, res: Response) => {
const status = getTaskPoolStatus();
res.json({
success: true,
...status,
});
});
/**
* POST /api/tasks/pool/pause
* Pause the task pool - workers won't pick up new tasks
*/
router.post('/pool/pause', async (_req: Request, res: Response) => {
pauseTaskPool();
res.json({
success: true,
paused: true,
message: 'Task pool paused - workers will not pick up new tasks',
});
});
/**
* POST /api/tasks/pool/resume
* Resume the task pool - workers will pick up tasks again
*/
router.post('/pool/resume', async (_req: Request, res: Response) => {
resumeTaskPool();
res.json({
success: true,
paused: false,
message: 'Task pool resumed - workers will pick up new tasks',
});
});
export default router;

View File

@@ -138,17 +138,36 @@ router.post('/register', async (req: Request, res: Response) => {
*
* Body:
* - worker_id: string (required)
* - current_task_id: number (optional) - task currently being processed
* - current_task_id: number (optional) - task currently being processed (primary task)
* - current_task_ids: number[] (optional) - all tasks currently being processed (concurrent)
* - active_task_count: number (optional) - number of tasks currently running
* - max_concurrent_tasks: number (optional) - max concurrent tasks this worker can handle
* - status: string (optional) - 'active', 'idle'
* - resources: object (optional) - memory_mb, cpu_user_ms, cpu_system_ms, etc.
*/
router.post('/heartbeat', async (req: Request, res: Response) => {
try {
const { worker_id, current_task_id, status = 'active', resources } = req.body;
const {
worker_id,
current_task_id,
current_task_ids,
active_task_count,
max_concurrent_tasks,
status = 'active',
resources
} = req.body;
if (!worker_id) {
return res.status(400).json({ success: false, error: 'worker_id is required' });
}
// Build metadata object with all the new fields
const metadata: Record<string, unknown> = {};
if (resources) Object.assign(metadata, resources);
if (current_task_ids) metadata.current_task_ids = current_task_ids;
if (active_task_count !== undefined) metadata.active_task_count = active_task_count;
if (max_concurrent_tasks !== undefined) metadata.max_concurrent_tasks = max_concurrent_tasks;
// Store resources in metadata jsonb column
const { rows } = await pool.query(`
UPDATE worker_registry
@@ -159,7 +178,7 @@ router.post('/heartbeat', async (req: Request, res: Response) => {
updated_at = NOW()
WHERE worker_id = $3
RETURNING id, friendly_name, status
`, [current_task_id || null, status, worker_id, resources ? JSON.stringify(resources) : null]);
`, [current_task_id || null, status, worker_id, Object.keys(metadata).length > 0 ? JSON.stringify(metadata) : null]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found - please register first' });
@@ -330,12 +349,21 @@ router.get('/workers', async (req: Request, res: Response) => {
tasks_completed,
tasks_failed,
current_task_id,
-- Concurrent task fields from metadata
(metadata->>'current_task_ids')::jsonb as current_task_ids,
(metadata->>'active_task_count')::int as active_task_count,
(metadata->>'max_concurrent_tasks')::int as max_concurrent_tasks,
-- Decommission fields
COALESCE(decommission_requested, false) as decommission_requested,
decommission_reason,
-- Full metadata for resources
metadata,
EXTRACT(EPOCH FROM (NOW() - last_heartbeat_at)) as seconds_since_heartbeat,
CASE
WHEN status = 'offline' OR status = 'terminated' THEN status
WHEN last_heartbeat_at < NOW() - INTERVAL '2 minutes' THEN 'stale'
WHEN current_task_id IS NOT NULL THEN 'busy'
WHEN (metadata->>'active_task_count')::int > 0 THEN 'busy'
ELSE 'ready'
END as health_status,
created_at
@@ -672,4 +700,163 @@ router.get('/capacity', async (_req: Request, res: Response) => {
}
});
// ============================================================
// WORKER LIFECYCLE MANAGEMENT
// ============================================================
/**
* POST /api/worker-registry/workers/:workerId/decommission
* Request graceful decommission of a worker (will stop after current task)
*/
router.post('/workers/:workerId/decommission', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
const { reason, issued_by } = req.body;
// Update worker_registry to flag for decommission
const result = await pool.query(
`UPDATE worker_registry
SET decommission_requested = true,
decommission_reason = $2,
decommission_requested_at = NOW()
WHERE worker_id = $1
RETURNING friendly_name, status, current_task_id`,
[workerId, reason || 'Manual decommission from admin']
);
if (result.rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
const worker = result.rows[0];
// Also log to worker_commands for audit trail
await pool.query(
`INSERT INTO worker_commands (worker_id, command, reason, issued_by)
VALUES ($1, 'decommission', $2, $3)
ON CONFLICT DO NOTHING`,
[workerId, reason || 'Manual decommission', issued_by || 'admin']
).catch(() => {
// Table might not exist yet - ignore
});
res.json({
success: true,
message: worker.current_task_id
? `Worker ${worker.friendly_name} will stop after completing task #${worker.current_task_id}`
: `Worker ${worker.friendly_name} will stop on next poll`,
worker: {
friendly_name: worker.friendly_name,
status: worker.status,
current_task_id: worker.current_task_id,
decommission_requested: true
}
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/workers/:workerId/cancel-decommission
* Cancel a pending decommission request
*/
router.post('/workers/:workerId/cancel-decommission', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
const result = await pool.query(
`UPDATE worker_registry
SET decommission_requested = false,
decommission_reason = NULL,
decommission_requested_at = NULL
WHERE worker_id = $1
RETURNING friendly_name`,
[workerId]
);
if (result.rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
res.json({
success: true,
message: `Decommission cancelled for ${result.rows[0].friendly_name}`
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/spawn
* Spawn a new worker in the current pod (only works in multi-worker-per-pod mode)
* For now, this is a placeholder - actual spawning requires the pod supervisor
*/
router.post('/spawn', async (req: Request, res: Response) => {
try {
const { pod_name, role } = req.body;
// For now, we can't actually spawn workers from the API
// This would require a supervisor process in each pod that listens for spawn commands
// Instead, return instructions for how to scale
res.json({
success: false,
error: 'Direct worker spawning not yet implemented',
instructions: 'To add workers, scale the K8s deployment: kubectl scale deployment/scraper-worker --replicas=N'
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/worker-registry/pods
* Get workers grouped by pod
*/
router.get('/pods', async (_req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
COALESCE(pod_name, 'Unknown') as pod_name,
COUNT(*) as worker_count,
COUNT(*) FILTER (WHERE current_task_id IS NOT NULL) as busy_count,
COUNT(*) FILTER (WHERE current_task_id IS NULL) as idle_count,
SUM(tasks_completed) as total_completed,
SUM(tasks_failed) as total_failed,
SUM((metadata->>'memory_rss_mb')::int) as total_memory_mb,
array_agg(json_build_object(
'worker_id', worker_id,
'friendly_name', friendly_name,
'status', status,
'current_task_id', current_task_id,
'tasks_completed', tasks_completed,
'tasks_failed', tasks_failed,
'decommission_requested', COALESCE(decommission_requested, false),
'last_heartbeat_at', last_heartbeat_at
)) as workers
FROM worker_registry
WHERE status NOT IN ('offline', 'terminated')
GROUP BY pod_name
ORDER BY pod_name
`);
res.json({
success: true,
pods: rows.map(row => ({
pod_name: row.pod_name,
worker_count: parseInt(row.worker_count),
busy_count: parseInt(row.busy_count),
idle_count: parseInt(row.idle_count),
total_completed: parseInt(row.total_completed) || 0,
total_failed: parseInt(row.total_failed) || 0,
total_memory_mb: parseInt(row.total_memory_mb) || 0,
workers: row.workers
}))
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
export default router;

View File

@@ -35,7 +35,7 @@ const router = Router();
// ============================================================
const K8S_NAMESPACE = process.env.K8S_NAMESPACE || 'dispensary-scraper';
const K8S_STATEFULSET_NAME = process.env.K8S_WORKER_STATEFULSET || 'scraper-worker';
const K8S_DEPLOYMENT_NAME = process.env.K8S_WORKER_DEPLOYMENT || 'scraper-worker';
// Initialize K8s client - uses in-cluster config when running in K8s,
// or kubeconfig when running locally
@@ -70,7 +70,7 @@ function getK8sClient(): k8s.AppsV1Api | null {
/**
* GET /api/workers/k8s/replicas - Get current worker replica count
* Returns current and desired replica counts from the StatefulSet
* Returns current and desired replica counts from the Deployment
*/
router.get('/k8s/replicas', async (_req: Request, res: Response) => {
const client = getK8sClient();
@@ -84,21 +84,21 @@ router.get('/k8s/replicas', async (_req: Request, res: Response) => {
}
try {
const response = await client.readNamespacedStatefulSet({
name: K8S_STATEFULSET_NAME,
const response = await client.readNamespacedDeployment({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const statefulSet = response;
const deployment = response;
res.json({
success: true,
replicas: {
current: statefulSet.status?.readyReplicas || 0,
desired: statefulSet.spec?.replicas || 0,
available: statefulSet.status?.availableReplicas || 0,
updated: statefulSet.status?.updatedReplicas || 0,
current: deployment.status?.readyReplicas || 0,
desired: deployment.spec?.replicas || 0,
available: deployment.status?.availableReplicas || 0,
updated: deployment.status?.updatedReplicas || 0,
},
statefulset: K8S_STATEFULSET_NAME,
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
@@ -112,7 +112,7 @@ router.get('/k8s/replicas', async (_req: Request, res: Response) => {
/**
* POST /api/workers/k8s/scale - Scale worker replicas
* Body: { replicas: number } - desired replica count (1-20)
* Body: { replicas: number } - desired replica count (0-20)
*/
router.post('/k8s/scale', async (req: Request, res: Response) => {
const client = getK8sClient();
@@ -136,21 +136,21 @@ router.post('/k8s/scale', async (req: Request, res: Response) => {
try {
// Get current state first
const currentResponse = await client.readNamespacedStatefulSetScale({
name: K8S_STATEFULSET_NAME,
const currentResponse = await client.readNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const currentReplicas = currentResponse.spec?.replicas || 0;
// Update scale using replaceNamespacedStatefulSetScale
await client.replaceNamespacedStatefulSetScale({
name: K8S_STATEFULSET_NAME,
// Update scale using replaceNamespacedDeploymentScale
await client.replaceNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
body: {
apiVersion: 'autoscaling/v1',
kind: 'Scale',
metadata: {
name: K8S_STATEFULSET_NAME,
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
},
spec: {
@@ -159,14 +159,14 @@ router.post('/k8s/scale', async (req: Request, res: Response) => {
},
});
console.log(`[Workers] Scaled ${K8S_STATEFULSET_NAME} from ${currentReplicas} to ${replicas} replicas`);
console.log(`[Workers] Scaled ${K8S_DEPLOYMENT_NAME} from ${currentReplicas} to ${replicas} replicas`);
res.json({
success: true,
message: `Scaled from ${currentReplicas} to ${replicas} replicas`,
previous: currentReplicas,
desired: replicas,
statefulset: K8S_STATEFULSET_NAME,
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
@@ -178,6 +178,73 @@ router.post('/k8s/scale', async (req: Request, res: Response) => {
}
});
/**
* POST /api/workers/k8s/scale-up - Scale up worker replicas by 1
* Convenience endpoint for adding a single worker
*/
router.post('/k8s/scale-up', async (_req: Request, res: Response) => {
const client = getK8sClient();
if (!client) {
return res.status(503).json({
success: false,
error: 'K8s client not available (not running in cluster or no kubeconfig)',
});
}
try {
// Get current replica count
const currentResponse = await client.readNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
const currentReplicas = currentResponse.spec?.replicas || 0;
const newReplicas = currentReplicas + 1;
// Cap at 20 replicas
if (newReplicas > 20) {
return res.status(400).json({
success: false,
error: 'Maximum replica count (20) reached',
});
}
// Scale up by 1
await client.replaceNamespacedDeploymentScale({
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
body: {
apiVersion: 'autoscaling/v1',
kind: 'Scale',
metadata: {
name: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
},
spec: {
replicas: newReplicas,
},
},
});
console.log(`[Workers] Scaled up ${K8S_DEPLOYMENT_NAME} from ${currentReplicas} to ${newReplicas} replicas`);
res.json({
success: true,
message: `Added worker (${currentReplicas}${newReplicas} replicas)`,
previous: currentReplicas,
desired: newReplicas,
deployment: K8S_DEPLOYMENT_NAME,
namespace: K8S_NAMESPACE,
});
} catch (err: any) {
console.error('[Workers] K8s scale-up error:', err.body?.message || err.message);
res.status(500).json({
success: false,
error: err.body?.message || err.message,
});
}
});
// ============================================================
// STATIC ROUTES (must come before parameterized routes)
// ============================================================

View File

@@ -25,7 +25,7 @@ export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult
try {
// Get states to discover
const statesResult = await pool.query(`
SELECT code FROM states WHERE active = true ORDER BY code
SELECT code FROM states WHERE is_active = true ORDER BY code
`);
const stateCodes = statesResult.rows.map(r => r.code);

View File

@@ -0,0 +1,37 @@
/**
* Task Pool State
*
* Shared state for task pool pause/resume functionality.
* This is kept separate to avoid circular dependencies between
* task-service.ts and routes/tasks.ts.
*
* State is in-memory and resets on server restart.
* By default, the pool is PAUSED (closed) - admin must explicitly start it.
* This prevents workers from immediately grabbing tasks on deploy before
* the system is ready.
*/
let taskPoolPaused = true;
export function isTaskPoolPaused(): boolean {
return taskPoolPaused;
}
export function pauseTaskPool(): void {
taskPoolPaused = true;
console.log('[TaskPool] Task pool PAUSED - workers will not pick up new tasks');
}
export function resumeTaskPool(): void {
taskPoolPaused = false;
console.log('[TaskPool] Task pool RESUMED - workers can pick up tasks');
}
export function getTaskPoolStatus(): { paused: boolean; message: string } {
return {
paused: taskPoolPaused,
message: taskPoolPaused
? 'Task pool is paused - workers will not pick up new tasks'
: 'Task pool is open - workers are picking up tasks',
};
}

View File

@@ -9,6 +9,7 @@
*/
import { pool } from '../db/pool';
import { isTaskPoolPaused } from './task-pool-state';
// Helper to check if a table exists
async function tableExists(tableName: string): Promise<boolean> {
@@ -149,8 +150,14 @@ class TaskService {
/**
* Claim a task atomically for a worker
* If role is null, claims ANY available task (role-agnostic worker)
* Returns null if task pool is paused.
*/
async claimTask(role: TaskRole | null, workerId: string): Promise<WorkerTask | null> {
// Check if task pool is paused - don't claim any tasks
if (isTaskPoolPaused()) {
return null;
}
if (role) {
// Role-specific claiming - use the SQL function
const result = await pool.query(

View File

@@ -64,6 +64,33 @@ const POLL_INTERVAL_MS = parseInt(process.env.POLL_INTERVAL_MS || '5000');
const HEARTBEAT_INTERVAL_MS = parseInt(process.env.HEARTBEAT_INTERVAL_MS || '30000');
const API_BASE_URL = process.env.API_BASE_URL || 'http://localhost:3010';
// =============================================================================
// CONCURRENT TASK PROCESSING SETTINGS
// =============================================================================
// Workers can process multiple tasks simultaneously using async I/O.
// This improves throughput for I/O-bound tasks (network calls, DB queries).
//
// Resource thresholds trigger "backoff" - the worker stops claiming new tasks
// but continues processing existing ones until resources return to normal.
//
// See: docs/WORKER_TASK_ARCHITECTURE.md#concurrent-task-processing
// =============================================================================
// Maximum number of tasks this worker will run concurrently
// Tune based on workload: I/O-bound tasks benefit from higher concurrency
const MAX_CONCURRENT_TASKS = parseInt(process.env.MAX_CONCURRENT_TASKS || '3');
// When heap memory usage exceeds this threshold (as decimal 0.0-1.0), stop claiming new tasks
// Default 85% - gives headroom before OOM
const MEMORY_BACKOFF_THRESHOLD = parseFloat(process.env.MEMORY_BACKOFF_THRESHOLD || '0.85');
// When CPU usage exceeds this threshold (as decimal 0.0-1.0), stop claiming new tasks
// Default 90% - allows some burst capacity
const CPU_BACKOFF_THRESHOLD = parseFloat(process.env.CPU_BACKOFF_THRESHOLD || '0.90');
// How long to wait (ms) when in backoff state before rechecking resources
const BACKOFF_DURATION_MS = parseInt(process.env.BACKOFF_DURATION_MS || '10000');
export interface TaskContext {
pool: Pool;
workerId: string;
@@ -94,6 +121,25 @@ const TASK_HANDLERS: Record<TaskRole, TaskHandler> = {
analytics_refresh: handleAnalyticsRefresh,
};
/**
* Resource usage stats reported to the registry and used for backoff decisions.
* These values are included in worker heartbeats and displayed in the UI.
*/
interface ResourceStats {
/** Current heap memory usage as decimal (0.0 to 1.0) */
memoryPercent: number;
/** Current heap used in MB */
memoryMb: number;
/** Total heap available in MB */
memoryTotalMb: number;
/** CPU usage percentage since last check (0 to 100) */
cpuPercent: number;
/** True if worker is currently in backoff state */
isBackingOff: boolean;
/** Reason for backoff (e.g., "Memory at 87.3% (threshold: 85%)") */
backoffReason: string | null;
}
export class TaskWorker {
private pool: Pool;
private workerId: string;
@@ -102,14 +148,106 @@ export class TaskWorker {
private isRunning: boolean = false;
private heartbeatInterval: NodeJS.Timeout | null = null;
private registryHeartbeatInterval: NodeJS.Timeout | null = null;
private currentTask: WorkerTask | null = null;
private crawlRotator: CrawlRotator;
// ==========================================================================
// CONCURRENT TASK TRACKING
// ==========================================================================
// activeTasks: Map of task ID -> task object for all currently running tasks
// taskPromises: Map of task ID -> Promise for cleanup when task completes
// maxConcurrentTasks: How many tasks this worker will run in parallel
// ==========================================================================
private activeTasks: Map<number, WorkerTask> = new Map();
private taskPromises: Map<number, Promise<void>> = new Map();
private maxConcurrentTasks: number = MAX_CONCURRENT_TASKS;
// ==========================================================================
// RESOURCE MONITORING FOR BACKOFF
// ==========================================================================
// CPU tracking uses differential measurement - we track last values and
// calculate percentage based on elapsed time since last check.
// ==========================================================================
private lastCpuUsage: { user: number; system: number } = { user: 0, system: 0 };
private lastCpuCheck: number = Date.now();
private isBackingOff: boolean = false;
private backoffReason: string | null = null;
constructor(role: TaskRole | null = null, workerId?: string) {
this.pool = getPool();
this.role = role;
this.workerId = workerId || `worker-${uuidv4().slice(0, 8)}`;
this.crawlRotator = new CrawlRotator(this.pool);
// Initialize CPU tracking
const cpuUsage = process.cpuUsage();
this.lastCpuUsage = { user: cpuUsage.user, system: cpuUsage.system };
this.lastCpuCheck = Date.now();
}
/**
* Get current resource usage
*/
private getResourceStats(): ResourceStats {
const memUsage = process.memoryUsage();
const heapUsedMb = memUsage.heapUsed / 1024 / 1024;
const heapTotalMb = memUsage.heapTotal / 1024 / 1024;
const memoryPercent = heapUsedMb / heapTotalMb;
// Calculate CPU usage since last check
const cpuUsage = process.cpuUsage();
const now = Date.now();
const elapsed = now - this.lastCpuCheck;
let cpuPercent = 0;
if (elapsed > 0) {
const userDiff = (cpuUsage.user - this.lastCpuUsage.user) / 1000; // microseconds to ms
const systemDiff = (cpuUsage.system - this.lastCpuUsage.system) / 1000;
cpuPercent = ((userDiff + systemDiff) / elapsed) * 100;
}
// Update last values
this.lastCpuUsage = { user: cpuUsage.user, system: cpuUsage.system };
this.lastCpuCheck = now;
return {
memoryPercent,
memoryMb: Math.round(heapUsedMb),
memoryTotalMb: Math.round(heapTotalMb),
cpuPercent: Math.min(100, cpuPercent), // Cap at 100%
isBackingOff: this.isBackingOff,
backoffReason: this.backoffReason,
};
}
/**
* Check if we should back off from taking new tasks
*/
private shouldBackOff(): { backoff: boolean; reason: string | null } {
const stats = this.getResourceStats();
if (stats.memoryPercent > MEMORY_BACKOFF_THRESHOLD) {
return { backoff: true, reason: `Memory at ${(stats.memoryPercent * 100).toFixed(1)}% (threshold: ${MEMORY_BACKOFF_THRESHOLD * 100}%)` };
}
if (stats.cpuPercent > CPU_BACKOFF_THRESHOLD * 100) {
return { backoff: true, reason: `CPU at ${stats.cpuPercent.toFixed(1)}% (threshold: ${CPU_BACKOFF_THRESHOLD * 100}%)` };
}
return { backoff: false, reason: null };
}
/**
* Get count of currently running tasks
*/
get activeTaskCount(): number {
return this.activeTasks.size;
}
/**
* Check if we can accept more tasks
*/
private canAcceptMoreTasks(): boolean {
return this.activeTasks.size < this.maxConcurrentTasks;
}
/**
@@ -117,40 +255,79 @@ export class TaskWorker {
* Called once on worker startup before processing any tasks.
*
* IMPORTANT: Proxies are REQUIRED. Workers will wait until proxies are available.
* Workers listen for PostgreSQL NOTIFY 'proxy_added' to wake up immediately when proxies are added.
*/
private async initializeStealth(): Promise<void> {
const MAX_WAIT_MINUTES = 60;
const RETRY_INTERVAL_MS = 30000; // 30 seconds
const maxAttempts = (MAX_WAIT_MINUTES * 60 * 1000) / RETRY_INTERVAL_MS;
const POLL_INTERVAL_MS = 30000; // 30 seconds fallback polling
const maxAttempts = (MAX_WAIT_MINUTES * 60 * 1000) / POLL_INTERVAL_MS;
let attempts = 0;
let notifyClient: any = null;
while (attempts < maxAttempts) {
try {
// Load proxies from database
await this.crawlRotator.initialize();
const stats = this.crawlRotator.proxy.getStats();
if (stats.activeProxies > 0) {
console.log(`[TaskWorker] Loaded ${stats.activeProxies} proxies (${stats.avgSuccessRate.toFixed(1)}% avg success rate)`);
// Wire rotator to Dutchie client - proxies will be used for ALL requests
setCrawlRotator(this.crawlRotator);
console.log(`[TaskWorker] Stealth initialized: ${this.crawlRotator.userAgent.getCount()} fingerprints, proxy REQUIRED for all requests`);
return;
}
attempts++;
console.log(`[TaskWorker] No active proxies available (attempt ${attempts}). Waiting ${RETRY_INTERVAL_MS / 1000}s for proxies to be added...`);
await this.sleep(RETRY_INTERVAL_MS);
} catch (error: any) {
attempts++;
console.log(`[TaskWorker] Error loading proxies (attempt ${attempts}): ${error.message}. Retrying in ${RETRY_INTERVAL_MS / 1000}s...`);
await this.sleep(RETRY_INTERVAL_MS);
}
// Set up PostgreSQL LISTEN for proxy notifications
try {
notifyClient = await this.pool.connect();
await notifyClient.query('LISTEN proxy_added');
console.log(`[TaskWorker] Listening for proxy_added notifications...`);
} catch (err: any) {
console.log(`[TaskWorker] Could not set up LISTEN (will poll): ${err.message}`);
}
throw new Error(`No active proxies available after waiting ${MAX_WAIT_MINUTES} minutes. Add proxies to the database.`);
// Create a promise that resolves when notified
let notifyResolve: (() => void) | null = null;
if (notifyClient) {
notifyClient.on('notification', (msg: any) => {
if (msg.channel === 'proxy_added') {
console.log(`[TaskWorker] Received proxy_added notification!`);
if (notifyResolve) notifyResolve();
}
});
}
try {
while (attempts < maxAttempts) {
try {
// Load proxies from database
await this.crawlRotator.initialize();
const stats = this.crawlRotator.proxy.getStats();
if (stats.activeProxies > 0) {
console.log(`[TaskWorker] Loaded ${stats.activeProxies} proxies (${stats.avgSuccessRate.toFixed(1)}% avg success rate)`);
// Wire rotator to Dutchie client - proxies will be used for ALL requests
setCrawlRotator(this.crawlRotator);
console.log(`[TaskWorker] Stealth initialized: ${this.crawlRotator.userAgent.getCount()} fingerprints, proxy REQUIRED for all requests`);
return;
}
attempts++;
console.log(`[TaskWorker] No active proxies available (attempt ${attempts}). Waiting for proxies...`);
// Wait for either notification or timeout
await new Promise<void>((resolve) => {
notifyResolve = resolve;
setTimeout(resolve, POLL_INTERVAL_MS);
});
} catch (error: any) {
attempts++;
console.log(`[TaskWorker] Error loading proxies (attempt ${attempts}): ${error.message}. Retrying...`);
await this.sleep(POLL_INTERVAL_MS);
}
}
throw new Error(`No active proxies available after waiting ${MAX_WAIT_MINUTES} minutes. Add proxies to the database.`);
} finally {
// Clean up LISTEN connection
if (notifyClient) {
try {
await notifyClient.query('UNLISTEN proxy_added');
notifyClient.release();
} catch {
// Ignore cleanup errors
}
}
}
}
/**
@@ -213,21 +390,32 @@ export class TaskWorker {
const memUsage = process.memoryUsage();
const cpuUsage = process.cpuUsage();
const proxyLocation = this.crawlRotator.getProxyLocation();
const resourceStats = this.getResourceStats();
// Get array of active task IDs
const activeTaskIds = Array.from(this.activeTasks.keys());
await fetch(`${API_BASE_URL}/api/worker-registry/heartbeat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
worker_id: this.workerId,
current_task_id: this.currentTask?.id || null,
status: this.currentTask ? 'active' : 'idle',
current_task_id: activeTaskIds[0] || null, // Primary task for backwards compat
current_task_ids: activeTaskIds, // All active tasks
active_task_count: this.activeTasks.size,
max_concurrent_tasks: this.maxConcurrentTasks,
status: this.activeTasks.size > 0 ? 'active' : 'idle',
resources: {
memory_mb: Math.round(memUsage.heapUsed / 1024 / 1024),
memory_total_mb: Math.round(memUsage.heapTotal / 1024 / 1024),
memory_rss_mb: Math.round(memUsage.rss / 1024 / 1024),
memory_percent: Math.round(resourceStats.memoryPercent * 100),
cpu_user_ms: Math.round(cpuUsage.user / 1000),
cpu_system_ms: Math.round(cpuUsage.system / 1000),
cpu_percent: Math.round(resourceStats.cpuPercent),
proxy_location: proxyLocation,
is_backing_off: this.isBackingOff,
backoff_reason: this.backoffReason,
}
})
});
@@ -289,20 +477,85 @@ export class TaskWorker {
this.startRegistryHeartbeat();
const roleMsg = this.role ? `for role: ${this.role}` : '(role-agnostic - any task)';
console.log(`[TaskWorker] ${this.friendlyName} starting ${roleMsg}`);
console.log(`[TaskWorker] ${this.friendlyName} starting ${roleMsg} (max ${this.maxConcurrentTasks} concurrent tasks)`);
while (this.isRunning) {
try {
await this.processNextTask();
await this.mainLoop();
} catch (error: any) {
console.error(`[TaskWorker] Loop error:`, error.message);
await this.sleep(POLL_INTERVAL_MS);
}
}
// Wait for any remaining tasks to complete
if (this.taskPromises.size > 0) {
console.log(`[TaskWorker] Waiting for ${this.taskPromises.size} active tasks to complete...`);
await Promise.allSettled(this.taskPromises.values());
}
console.log(`[TaskWorker] Worker ${this.workerId} stopped`);
}
/**
* Main loop - tries to fill up to maxConcurrentTasks
*/
private async mainLoop(): Promise<void> {
// Check resource usage and backoff if needed
const { backoff, reason } = this.shouldBackOff();
if (backoff) {
if (!this.isBackingOff) {
console.log(`[TaskWorker] ${this.friendlyName} backing off: ${reason}`);
}
this.isBackingOff = true;
this.backoffReason = reason;
await this.sleep(BACKOFF_DURATION_MS);
return;
}
// Clear backoff state
if (this.isBackingOff) {
console.log(`[TaskWorker] ${this.friendlyName} resuming normal operation`);
this.isBackingOff = false;
this.backoffReason = null;
}
// Check for decommission signal
const shouldDecommission = await this.checkDecommission();
if (shouldDecommission) {
console.log(`[TaskWorker] ${this.friendlyName} received decommission signal - waiting for ${this.activeTasks.size} tasks to complete`);
// Stop accepting new tasks, wait for current to finish
this.isRunning = false;
return;
}
// Try to claim more tasks if we have capacity
if (this.canAcceptMoreTasks()) {
const task = await taskService.claimTask(this.role, this.workerId);
if (task) {
console.log(`[TaskWorker] ${this.friendlyName} claimed task ${task.id} (${task.role}) [${this.activeTasks.size + 1}/${this.maxConcurrentTasks}]`);
this.activeTasks.set(task.id, task);
// Start task in background (don't await)
const taskPromise = this.executeTask(task);
this.taskPromises.set(task.id, taskPromise);
// Clean up when done
taskPromise.finally(() => {
this.activeTasks.delete(task.id);
this.taskPromises.delete(task.id);
});
// Immediately try to claim more tasks (don't wait for poll interval)
return;
}
}
// No task claimed or at capacity - wait before next poll
await this.sleep(POLL_INTERVAL_MS);
}
/**
* Stop the worker
*/
@@ -315,23 +568,10 @@ export class TaskWorker {
}
/**
* Process the next available task
* Execute a single task (runs concurrently with other tasks)
*/
private async processNextTask(): Promise<void> {
// Try to claim a task
const task = await taskService.claimTask(this.role, this.workerId);
if (!task) {
// No tasks available, wait and retry
await this.sleep(POLL_INTERVAL_MS);
return;
}
this.currentTask = task;
console.log(`[TaskWorker] Claimed task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
// Start heartbeat
this.startHeartbeat(task.id);
private async executeTask(task: WorkerTask): Promise<void> {
console.log(`[TaskWorker] ${this.friendlyName} starting task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
try {
// Mark as running
@@ -360,7 +600,7 @@ export class TaskWorker {
// Mark as completed
await taskService.completeTask(task.id, result);
await this.reportTaskCompletion(true);
console.log(`[TaskWorker] ${this.friendlyName} completed task ${task.id}`);
console.log(`[TaskWorker] ${this.friendlyName} completed task ${task.id} [${this.activeTasks.size}/${this.maxConcurrentTasks} active]`);
// Chain next task if applicable
const chainedTask = await taskService.chainNextTask({
@@ -382,9 +622,35 @@ export class TaskWorker {
await taskService.failTask(task.id, error.message);
await this.reportTaskCompletion(false);
console.error(`[TaskWorker] ${this.friendlyName} task ${task.id} error:`, error.message);
} finally {
this.stopHeartbeat();
this.currentTask = null;
}
// Note: cleanup (removing from activeTasks) is handled in mainLoop's finally block
}
/**
* Check if this worker has been flagged for decommission
* Returns true if worker should stop after current task
*/
private async checkDecommission(): Promise<boolean> {
try {
// Check worker_registry for decommission flag
const result = await this.pool.query(
`SELECT decommission_requested, decommission_reason
FROM worker_registry
WHERE worker_id = $1`,
[this.workerId]
);
if (result.rows.length > 0 && result.rows[0].decommission_requested) {
const reason = result.rows[0].decommission_reason || 'No reason provided';
console.log(`[TaskWorker] Decommission requested: ${reason}`);
return true;
}
return false;
} catch (error: any) {
// If we can't check, continue running
console.warn(`[TaskWorker] Could not check decommission status: ${error.message}`);
return false;
}
}
@@ -421,12 +687,25 @@ export class TaskWorker {
/**
* Get worker info
*/
getInfo(): { workerId: string; role: TaskRole | null; isRunning: boolean; currentTaskId: number | null } {
getInfo(): {
workerId: string;
role: TaskRole | null;
isRunning: boolean;
activeTaskIds: number[];
activeTaskCount: number;
maxConcurrentTasks: number;
isBackingOff: boolean;
backoffReason: string | null;
} {
return {
workerId: this.workerId,
role: this.role,
isRunning: this.isRunning,
currentTaskId: this.currentTask?.id || null,
activeTaskIds: Array.from(this.activeTasks.keys()),
activeTaskCount: this.activeTasks.size,
maxConcurrentTasks: this.maxConcurrentTasks,
isBackingOff: this.isBackingOff,
backoffReason: this.backoffReason,
};
}
}

View File

@@ -7,8 +7,8 @@
<title>CannaIQ - Cannabis Menu Intelligence Platform</title>
<meta name="description" content="CannaIQ provides real-time cannabis dispensary menu data, product tracking, and analytics for dispensaries across Arizona." />
<meta name="keywords" content="cannabis, dispensary, menu, products, analytics, Arizona" />
<script type="module" crossorigin src="/assets/index-BML8-px1.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-B2gR-58G.css">
<script type="module" crossorigin src="/assets/index-Dq9S0rVi.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-DhM09B-d.css">
</head>
<body>
<div id="root"></div>

View File

@@ -8,6 +8,7 @@ import { ProductDetail } from './pages/ProductDetail';
import { Stores } from './pages/Stores';
import { Dispensaries } from './pages/Dispensaries';
import { DispensaryDetail } from './pages/DispensaryDetail';
import { DispensarySchedule } from './pages/DispensarySchedule';
import { StoreDetail } from './pages/StoreDetail';
import { StoreBrands } from './pages/StoreBrands';
import { StoreSpecials } from './pages/StoreSpecials';
@@ -66,6 +67,7 @@ export default function App() {
<Route path="/stores" element={<PrivateRoute><Stores /></PrivateRoute>} />
<Route path="/dispensaries" element={<PrivateRoute><Dispensaries /></PrivateRoute>} />
<Route path="/dispensaries/:state/:city/:slug" element={<PrivateRoute><DispensaryDetail /></PrivateRoute>} />
<Route path="/dispensaries/:state/:city/:slug/schedule" element={<PrivateRoute><DispensarySchedule /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug/brands" element={<PrivateRoute><StoreBrands /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug/specials" element={<PrivateRoute><StoreSpecials /></PrivateRoute>} />
<Route path="/stores/:state/:storeName/:slug" element={<PrivateRoute><StoreDetail /></PrivateRoute>} />

View File

@@ -1,5 +1,5 @@
import { ReactNode, useEffect, useState } from 'react';
import { useNavigate, useLocation } from 'react-router-dom';
import { ReactNode, useEffect, useState, useRef } from 'react';
import { useNavigate, useLocation, Link } from 'react-router-dom';
import { useAuthStore } from '../store/authStore';
import { api } from '../lib/api';
import { StateSelector } from './StateSelector';
@@ -48,8 +48,8 @@ interface NavLinkProps {
function NavLink({ to, icon, label, isActive }: NavLinkProps) {
return (
<a
href={to}
<Link
to={to}
className={`flex items-center gap-3 px-3 py-2 rounded-lg text-sm font-medium transition-colors ${
isActive
? 'bg-emerald-50 text-emerald-700'
@@ -58,7 +58,7 @@ function NavLink({ to, icon, label, isActive }: NavLinkProps) {
>
<span className={`flex-shrink-0 ${isActive ? 'text-emerald-600' : 'text-gray-400'}`}>{icon}</span>
<span>{label}</span>
</a>
</Link>
);
}
@@ -86,6 +86,8 @@ export function Layout({ children }: LayoutProps) {
const { user, logout } = useAuthStore();
const [versionInfo, setVersionInfo] = useState<VersionInfo | null>(null);
const [sidebarOpen, setSidebarOpen] = useState(false);
const navRef = useRef<HTMLElement>(null);
const scrollPositionRef = useRef<number>(0);
useEffect(() => {
const fetchVersion = async () => {
@@ -111,9 +113,27 @@ export function Layout({ children }: LayoutProps) {
return location.pathname.startsWith(path);
};
// Close sidebar on route change (mobile)
// Save scroll position before route change
useEffect(() => {
const nav = navRef.current;
if (nav) {
const handleScroll = () => {
scrollPositionRef.current = nav.scrollTop;
};
nav.addEventListener('scroll', handleScroll);
return () => nav.removeEventListener('scroll', handleScroll);
}
}, []);
// Restore scroll position after route change and close mobile sidebar
useEffect(() => {
setSidebarOpen(false);
// Restore scroll position after render
requestAnimationFrame(() => {
if (navRef.current) {
navRef.current.scrollTop = scrollPositionRef.current;
}
});
}, [location.pathname]);
const sidebarContent = (
@@ -131,7 +151,7 @@ export function Layout({ children }: LayoutProps) {
<span className="text-lg font-bold text-gray-900">CannaIQ</span>
{versionInfo && (
<p className="text-xs text-gray-400">
v{versionInfo.version} ({versionInfo.git_sha}) {versionInfo.build_time !== 'unknown' && `- ${new Date(versionInfo.build_time).toLocaleDateString()}`}
{versionInfo.git_sha || 'dev'}
</p>
)}
</div>
@@ -145,7 +165,7 @@ export function Layout({ children }: LayoutProps) {
</div>
{/* Navigation */}
<nav className="flex-1 px-3 py-4 space-y-6 overflow-y-auto">
<nav ref={navRef} className="flex-1 px-3 py-4 space-y-6 overflow-y-auto">
<NavSection title="Main">
<NavLink to="/dashboard" icon={<LayoutDashboard className="w-4 h-4" />} label="Dashboard" isActive={isActive('/dashboard', true)} />
<NavLink to="/dispensaries" icon={<Building2 className="w-4 h-4" />} label="Dispensaries" isActive={isActive('/dispensaries')} />

View File

@@ -0,0 +1,138 @@
import { useState, useEffect, useRef } from 'react';
import { api } from '../lib/api';
import { Shield, X, Loader2 } from 'lucide-react';
interface PasswordConfirmModalProps {
isOpen: boolean;
onClose: () => void;
onConfirm: () => void;
title: string;
description: string;
}
export function PasswordConfirmModal({
isOpen,
onClose,
onConfirm,
title,
description,
}: PasswordConfirmModalProps) {
const [password, setPassword] = useState('');
const [error, setError] = useState('');
const [loading, setLoading] = useState(false);
const inputRef = useRef<HTMLInputElement>(null);
useEffect(() => {
if (isOpen) {
setPassword('');
setError('');
// Focus the input when modal opens
setTimeout(() => inputRef.current?.focus(), 100);
}
}, [isOpen]);
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
if (!password.trim()) {
setError('Password is required');
return;
}
setLoading(true);
setError('');
try {
const result = await api.verifyPassword(password);
if (result.verified) {
onConfirm();
onClose();
} else {
setError('Invalid password');
}
} catch (err: any) {
setError(err.message || 'Verification failed');
} finally {
setLoading(false);
}
};
if (!isOpen) return null;
return (
<div className="fixed inset-0 z-50 flex items-center justify-center">
{/* Backdrop */}
<div
className="absolute inset-0 bg-black bg-opacity-50"
onClick={onClose}
/>
{/* Modal */}
<div className="relative bg-white rounded-lg shadow-xl max-w-md w-full mx-4">
{/* Header */}
<div className="flex items-center justify-between px-6 py-4 border-b border-gray-200">
<div className="flex items-center gap-3">
<div className="p-2 bg-amber-100 rounded-lg">
<Shield className="w-5 h-5 text-amber-600" />
</div>
<h3 className="text-lg font-semibold text-gray-900">{title}</h3>
</div>
<button
onClick={onClose}
className="p-1 hover:bg-gray-100 rounded-lg transition-colors"
>
<X className="w-5 h-5 text-gray-500" />
</button>
</div>
{/* Body */}
<form onSubmit={handleSubmit}>
<div className="px-6 py-4">
<p className="text-gray-600 mb-4">{description}</p>
<div className="space-y-2">
<label
htmlFor="password"
className="block text-sm font-medium text-gray-700"
>
Enter your password to continue
</label>
<input
ref={inputRef}
type="password"
id="password"
value={password}
onChange={(e) => setPassword(e.target.value)}
className="w-full px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500 focus:border-emerald-500"
placeholder="Password"
disabled={loading}
/>
{error && (
<p className="text-sm text-red-600">{error}</p>
)}
</div>
</div>
{/* Footer */}
<div className="flex justify-end gap-3 px-6 py-4 border-t border-gray-200 bg-gray-50 rounded-b-lg">
<button
type="button"
onClick={onClose}
disabled={loading}
className="px-4 py-2 text-gray-700 hover:bg-gray-100 rounded-lg transition-colors"
>
Cancel
</button>
<button
type="submit"
disabled={loading}
className="px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 transition-colors disabled:opacity-50 flex items-center gap-2"
>
{loading && <Loader2 className="w-4 h-4 animate-spin" />}
Confirm
</button>
</div>
</form>
</div>
</div>
);
}

View File

@@ -84,6 +84,13 @@ class ApiClient {
});
}
async verifyPassword(password: string) {
return this.request<{ verified: boolean; error?: string }>('/api/auth/verify-password', {
method: 'POST',
body: JSON.stringify({ password }),
});
}
async getMe() {
return this.request<{ user: any }>('/api/auth/me');
}
@@ -983,6 +990,47 @@ class ApiClient {
}>(`/api/markets/stores/${id}/categories`);
}
async getStoreCrawlHistory(id: number, limit = 50) {
return this.request<{
dispensary: {
id: number;
name: string;
dba_name: string | null;
slug: string;
state: string;
city: string;
menu_type: string | null;
platform_dispensary_id: string | null;
last_menu_scrape: string | null;
} | null;
history: Array<{
id: number;
runId: string | null;
profileKey: string | null;
crawlerModule: string | null;
stateAtStart: string | null;
stateAtEnd: string | null;
totalSteps: number;
durationMs: number | null;
success: boolean;
errorMessage: string | null;
productsFound: number | null;
startedAt: string | null;
completedAt: string | null;
}>;
nextSchedule: {
scheduleId: number;
jobName: string;
enabled: boolean;
baseIntervalMinutes: number;
jitterMinutes: number;
nextRunAt: string | null;
lastRunAt: string | null;
lastStatus: string | null;
} | null;
}>(`/api/markets/stores/${id}/crawl-history?limit=${limit}`);
}
// Global Brands/Categories (from v_brands/v_categories views)
async getMarketBrands(params?: { limit?: number; offset?: number }) {
const searchParams = new URLSearchParams();
@@ -1518,10 +1566,11 @@ class ApiClient {
}
// Intelligence API
async getIntelligenceBrands(params?: { limit?: number; offset?: number }) {
async getIntelligenceBrands(params?: { limit?: number; offset?: number; state?: string }) {
const searchParams = new URLSearchParams();
if (params?.limit) searchParams.append('limit', params.limit.toString());
if (params?.offset) searchParams.append('offset', params.offset.toString());
if (params?.state) searchParams.append('state', params.state);
const queryString = searchParams.toString() ? `?${searchParams.toString()}` : '';
return this.request<{
brands: Array<{
@@ -1536,7 +1585,10 @@ class ApiClient {
}>(`/api/admin/intelligence/brands${queryString}`);
}
async getIntelligencePricing() {
async getIntelligencePricing(params?: { state?: string }) {
const searchParams = new URLSearchParams();
if (params?.state) searchParams.append('state', params.state);
const queryString = searchParams.toString() ? `?${searchParams.toString()}` : '';
return this.request<{
byCategory: Array<{
category: string;
@@ -1552,7 +1604,7 @@ class ApiClient {
maxPrice: number;
totalProducts: number;
};
}>('/api/admin/intelligence/pricing');
}>(`/api/admin/intelligence/pricing${queryString}`);
}
async getIntelligenceStoreActivity(params?: { state?: string; chainId?: number; limit?: number }) {
@@ -2884,6 +2936,46 @@ class ApiClient {
`/api/tasks/store/${dispensaryId}/active`
);
}
// Task Pool Control
async getTaskPoolStatus() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/status'
);
}
async pauseTaskPool() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/pause',
{ method: 'POST' }
);
}
async resumeTaskPool() {
return this.request<{ success: boolean; paused: boolean; message: string }>(
'/api/tasks/pool/resume',
{ method: 'POST' }
);
}
// K8s Worker Control
async getK8sWorkers() {
return this.request<{
success: boolean;
available: boolean;
replicas: number;
readyReplicas: number;
availableReplicas?: number;
error?: string;
}>('/api/k8s/workers');
}
async scaleK8sWorkers(replicas: number) {
return this.request<{ success: boolean; replicas: number; message?: string; error?: string }>(
'/api/k8s/workers/scale',
{ method: 'POST', body: JSON.stringify({ replicas }) }
);
}
}
export const api = new ApiClient(API_URL);

View File

@@ -1,6 +1,5 @@
import { useEffect, useState } from 'react';
import { Layout } from '../components/Layout';
import { HealthPanel } from '../components/HealthPanel';
import { api } from '../lib/api';
import { useNavigate } from 'react-router-dom';
import {
@@ -42,7 +41,6 @@ export function Dashboard() {
const [activity, setActivity] = useState<any>(null);
const [nationalStats, setNationalStats] = useState<any>(null);
const [loading, setLoading] = useState(true);
const [refreshing, setRefreshing] = useState(false);
const [pendingChangesCount, setPendingChangesCount] = useState(0);
const [showNotification, setShowNotification] = useState(false);
const [taskCounts, setTaskCounts] = useState<Record<string, number> | null>(null);
@@ -93,10 +91,7 @@ export function Dashboard() {
}
};
const loadData = async (isRefresh = false) => {
if (isRefresh) {
setRefreshing(true);
}
const loadData = async () => {
try {
// Fetch dashboard data (primary data source)
const dashboard = await api.getMarketDashboard();
@@ -158,7 +153,6 @@ export function Dashboard() {
console.error('Failed to load dashboard:', error);
} finally {
setLoading(false);
setRefreshing(false);
}
};
@@ -271,24 +265,11 @@ export function Dashboard() {
<div className="space-y-8">
{/* Header */}
<div className="flex flex-col sm:flex-row sm:justify-between sm:items-center gap-4">
<div>
<h1 className="text-xl sm:text-2xl font-semibold text-gray-900">Dashboard</h1>
<p className="text-sm text-gray-500 mt-1">Monitor your dispensary data aggregation</p>
</div>
<button
onClick={() => loadData(true)}
disabled={refreshing}
className="inline-flex items-center justify-center gap-2 px-4 py-2 bg-white border border-gray-200 rounded-lg hover:bg-gray-50 transition-colors text-sm font-medium text-gray-700 self-start sm:self-auto disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${refreshing ? 'animate-spin' : ''}`} />
{refreshing ? 'Refreshing...' : 'Refresh'}
</button>
<div>
<h1 className="text-xl sm:text-2xl font-semibold text-gray-900">Dashboard</h1>
<p className="text-sm text-gray-500 mt-1">Monitor your dispensary data aggregation</p>
</div>
{/* System Health */}
<HealthPanel showQueues={false} refreshInterval={60000} />
{/* Stats Grid */}
<div className="grid grid-cols-2 lg:grid-cols-3 gap-3 sm:gap-6">
{/* Products */}

View File

@@ -161,23 +161,6 @@ export function Dispensaries() {
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Filter by Status
</label>
<select
value={filterStatus}
onChange={(e) => handleStatusFilter(e.target.value)}
className={`w-full px-3 py-2 border rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500 ${
filterStatus === 'dropped' ? 'border-red-300 bg-red-50' : 'border-gray-300'
}`}
>
<option value="">All Statuses</option>
<option value="open">Open</option>
<option value="dropped">Dropped (Needs Review)</option>
<option value="closed">Closed</option>
</select>
</div>
</div>
</div>

View File

@@ -204,47 +204,6 @@ export function DispensaryDetail() {
Back to Dispensaries
</button>
{/* Update Dropdown */}
<div className="relative">
<button
onClick={() => setShowUpdateDropdown(!showUpdateDropdown)}
disabled={isUpdating}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium text-white bg-blue-600 hover:bg-blue-700 rounded-lg disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${isUpdating ? 'animate-spin' : ''}`} />
{isUpdating ? 'Updating...' : 'Update'}
{!isUpdating && <ChevronDown className="w-4 h-4" />}
</button>
{showUpdateDropdown && !isUpdating && (
<div className="absolute right-0 mt-2 w-48 bg-white rounded-lg shadow-lg border border-gray-200 z-10">
<button
onClick={() => handleUpdate('products')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-t-lg"
>
Products
</button>
<button
onClick={() => handleUpdate('brands')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100"
>
Brands
</button>
<button
onClick={() => handleUpdate('specials')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100"
>
Specials
</button>
<button
onClick={() => handleUpdate('all')}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-b-lg border-t border-gray-200"
>
All
</button>
</div>
)}
</div>
</div>
{/* Dispensary Header */}
@@ -266,7 +225,7 @@ export function DispensaryDetail() {
<div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg">
<Calendar className="w-4 h-4" />
<div>
<span className="font-medium">Last Crawl Date:</span>
<span className="font-medium">Last Updated:</span>
<span className="ml-2">
{dispensary.last_menu_scrape
? new Date(dispensary.last_menu_scrape).toLocaleDateString('en-US', {
@@ -331,7 +290,7 @@ export function DispensaryDetail() {
</a>
)}
<Link
to="/schedule"
to={`/dispensaries/${state}/${city}/${slug}/schedule`}
className="flex items-center gap-2 text-sm text-blue-600 hover:text-blue-800"
>
<Clock className="w-4 h-4" />
@@ -533,57 +492,31 @@ export function DispensaryDetail() {
`$${product.regular_price}`
) : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.quantity != null ? (
<span className={`badge badge-sm ${product.quantity > 0 ? 'badge-info' : 'badge-error'}`}>
{product.quantity}
</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.quantity != null ? product.quantity : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.thc_percentage ? (
<span className="badge badge-success badge-sm">{product.thc_percentage}%</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.thc_percentage ? `${product.thc_percentage}%` : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.cbd_percentage ? (
<span className="badge badge-info badge-sm">{product.cbd_percentage}%</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.cbd_percentage ? `${product.cbd_percentage}%` : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.strain_type ? (
<span className="badge badge-ghost badge-sm">{product.strain_type}</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.strain_type || '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.in_stock ? (
<span className="badge badge-success badge-sm">Yes</span>
) : product.in_stock === false ? (
<span className="badge badge-error badge-sm">No</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.in_stock ? 'Yes' : product.in_stock === false ? 'No' : '-'}
</td>
<td className="whitespace-nowrap text-xs text-gray-500">
{product.updated_at ? formatDate(product.updated_at) : '-'}
</td>
<td>
<div className="flex gap-1">
{product.dutchie_url && (
<a
href={product.dutchie_url}
target="_blank"
rel="noopener noreferrer"
className="btn btn-xs btn-outline"
>
Dutchie
</a>
)}
<button
onClick={() => navigate(`/products/${product.id}`)}
className="btn btn-xs btn-primary"
>
Details
</button>
</div>
<button
onClick={() => navigate(`/products/${product.id}`)}
className="btn btn-xs btn-ghost text-gray-500 hover:text-gray-700"
>
Details
</button>
</td>
</tr>
))}

View File

@@ -0,0 +1,378 @@
import { useEffect, useState } from 'react';
import { useParams, useNavigate, Link } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import {
ArrowLeft,
Clock,
Calendar,
CheckCircle,
XCircle,
AlertCircle,
Package,
Timer,
Building2,
} from 'lucide-react';
interface CrawlHistoryItem {
id: number;
runId: string | null;
profileKey: string | null;
crawlerModule: string | null;
stateAtStart: string | null;
stateAtEnd: string | null;
totalSteps: number;
durationMs: number | null;
success: boolean;
errorMessage: string | null;
productsFound: number | null;
startedAt: string | null;
completedAt: string | null;
}
interface NextSchedule {
scheduleId: number;
jobName: string;
enabled: boolean;
baseIntervalMinutes: number;
jitterMinutes: number;
nextRunAt: string | null;
lastRunAt: string | null;
lastStatus: string | null;
}
interface Dispensary {
id: number;
name: string;
dba_name: string | null;
slug: string;
state: string;
city: string;
menu_type: string | null;
platform_dispensary_id: string | null;
last_menu_scrape: string | null;
}
export function DispensarySchedule() {
const { state, city, slug } = useParams();
const navigate = useNavigate();
const [dispensary, setDispensary] = useState<Dispensary | null>(null);
const [history, setHistory] = useState<CrawlHistoryItem[]>([]);
const [nextSchedule, setNextSchedule] = useState<NextSchedule | null>(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
loadScheduleData();
}, [slug]);
const loadScheduleData = async () => {
setLoading(true);
try {
// First get the dispensary to get the ID
const dispData = await api.getDispensary(slug!);
if (dispData?.id) {
const data = await api.getStoreCrawlHistory(dispData.id);
setDispensary(data.dispensary);
setHistory(data.history || []);
setNextSchedule(data.nextSchedule);
}
} catch (error) {
console.error('Failed to load schedule data:', error);
} finally {
setLoading(false);
}
};
const formatDate = (dateStr: string | null) => {
if (!dateStr) return 'Never';
const date = new Date(dateStr);
return date.toLocaleDateString('en-US', {
year: 'numeric',
month: 'short',
day: 'numeric',
hour: '2-digit',
minute: '2-digit',
});
};
const formatTimeAgo = (dateStr: string | null) => {
if (!dateStr) return 'Never';
const date = new Date(dateStr);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMinutes = Math.floor(diffMs / (1000 * 60));
const diffHours = Math.floor(diffMs / (1000 * 60 * 60));
const diffDays = Math.floor(diffMs / (1000 * 60 * 60 * 24));
if (diffMinutes < 1) return 'Just now';
if (diffMinutes < 60) return `${diffMinutes}m ago`;
if (diffHours < 24) return `${diffHours}h ago`;
if (diffDays === 1) return 'Yesterday';
if (diffDays < 7) return `${diffDays} days ago`;
return date.toLocaleDateString();
};
const formatTimeUntil = (dateStr: string | null) => {
if (!dateStr) return 'Not scheduled';
const date = new Date(dateStr);
const now = new Date();
const diffMs = date.getTime() - now.getTime();
if (diffMs < 0) return 'Overdue';
const diffMinutes = Math.floor(diffMs / (1000 * 60));
const diffHours = Math.floor(diffMinutes / 60);
if (diffMinutes < 60) return `in ${diffMinutes}m`;
return `in ${diffHours}h ${diffMinutes % 60}m`;
};
const formatDuration = (ms: number | null) => {
if (!ms) return '-';
if (ms < 1000) return `${ms}ms`;
const seconds = Math.floor(ms / 1000);
const minutes = Math.floor(seconds / 60);
if (minutes < 1) return `${seconds}s`;
return `${minutes}m ${seconds % 60}s`;
};
const formatInterval = (baseMinutes: number, jitterMinutes: number) => {
const hours = Math.floor(baseMinutes / 60);
const mins = baseMinutes % 60;
let base = hours > 0 ? `${hours}h` : '';
if (mins > 0) base += `${mins}m`;
return `Every ${base} (+/- ${jitterMinutes}m jitter)`;
};
if (loading) {
return (
<Layout>
<div className="text-center py-12">
<div className="inline-block animate-spin rounded-full h-8 w-8 border-4 border-gray-400 border-t-transparent"></div>
<p className="mt-2 text-sm text-gray-600">Loading schedule...</p>
</div>
</Layout>
);
}
if (!dispensary) {
return (
<Layout>
<div className="text-center py-12">
<p className="text-gray-600">Dispensary not found</p>
</div>
</Layout>
);
}
// Stats from history
const successCount = history.filter(h => h.success).length;
const failureCount = history.filter(h => !h.success).length;
const lastSuccess = history.find(h => h.success);
const avgDuration = history.length > 0
? Math.round(history.reduce((sum, h) => sum + (h.durationMs || 0), 0) / history.length)
: 0;
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between gap-4">
<button
onClick={() => navigate(`/dispensaries/${state}/${city}/${slug}`)}
className="flex items-center gap-2 text-sm text-gray-600 hover:text-gray-900"
>
<ArrowLeft className="w-4 h-4" />
Back to {dispensary.dba_name || dispensary.name}
</button>
</div>
{/* Dispensary Info */}
<div className="bg-white rounded-lg border border-gray-200 p-6">
<div className="flex items-start gap-4">
<div className="p-3 bg-blue-50 rounded-lg">
<Building2 className="w-8 h-8 text-blue-600" />
</div>
<div>
<h1 className="text-2xl font-bold text-gray-900">
{dispensary.dba_name || dispensary.name}
</h1>
<p className="text-sm text-gray-600 mt-1">
{dispensary.city}, {dispensary.state} - Crawl Schedule & History
</p>
<div className="flex items-center gap-4 mt-2 text-sm text-gray-500">
<span>Slug: {dispensary.slug}</span>
{dispensary.menu_type && (
<span className="px-2 py-0.5 bg-gray-100 rounded text-xs">
{dispensary.menu_type}
</span>
)}
</div>
</div>
</div>
</div>
{/* Next Scheduled Crawl */}
{nextSchedule && (
<div className="bg-white rounded-lg border border-gray-200 p-6">
<h2 className="text-lg font-semibold text-gray-900 mb-4 flex items-center gap-2">
<Clock className="w-5 h-5 text-blue-500" />
Upcoming Schedule
</h2>
<div className="grid grid-cols-4 gap-6">
<div>
<p className="text-sm text-gray-500">Next Run</p>
<p className="text-xl font-semibold text-blue-600">
{formatTimeUntil(nextSchedule.nextRunAt)}
</p>
<p className="text-xs text-gray-400">
{formatDate(nextSchedule.nextRunAt)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Interval</p>
<p className="text-lg font-medium">
{formatInterval(nextSchedule.baseIntervalMinutes, nextSchedule.jitterMinutes)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Last Run</p>
<p className="text-lg font-medium">
{formatTimeAgo(nextSchedule.lastRunAt)}
</p>
</div>
<div>
<p className="text-sm text-gray-500">Last Status</p>
<p className={`text-lg font-medium ${
nextSchedule.lastStatus === 'success' ? 'text-green-600' :
nextSchedule.lastStatus === 'error' ? 'text-red-600' : 'text-gray-600'
}`}>
{nextSchedule.lastStatus || '-'}
</p>
</div>
</div>
</div>
)}
{/* Stats Summary */}
<div className="grid grid-cols-4 gap-4">
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<CheckCircle className="w-8 h-8 text-green-500" />
<div>
<p className="text-sm text-gray-500">Successful Runs</p>
<p className="text-2xl font-bold text-green-600">{successCount}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<XCircle className="w-8 h-8 text-red-500" />
<div>
<p className="text-sm text-gray-500">Failed Runs</p>
<p className="text-2xl font-bold text-red-600">{failureCount}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<Timer className="w-8 h-8 text-blue-500" />
<div>
<p className="text-sm text-gray-500">Avg Duration</p>
<p className="text-2xl font-bold">{formatDuration(avgDuration)}</p>
</div>
</div>
</div>
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center gap-3">
<Package className="w-8 h-8 text-purple-500" />
<div>
<p className="text-sm text-gray-500">Last Products Found</p>
<p className="text-2xl font-bold">
{lastSuccess?.productsFound?.toLocaleString() || '-'}
</p>
</div>
</div>
</div>
</div>
{/* Crawl History Table */}
<div className="bg-white rounded-lg border border-gray-200">
<div className="p-4 border-b border-gray-200">
<h2 className="text-lg font-semibold text-gray-900 flex items-center gap-2">
<Calendar className="w-5 h-5 text-gray-500" />
Crawl History
</h2>
</div>
<div className="overflow-x-auto">
<table className="table table-sm w-full">
<thead className="bg-gray-50">
<tr>
<th>Status</th>
<th>Started</th>
<th>Duration</th>
<th className="text-right">Products</th>
<th>State</th>
<th>Error</th>
</tr>
</thead>
<tbody>
{history.length === 0 ? (
<tr>
<td colSpan={6} className="text-center py-8 text-gray-500">
No crawl history available
</td>
</tr>
) : (
history.map((item) => (
<tr key={item.id} className="hover:bg-gray-50">
<td>
<span className={`inline-flex items-center gap-1 px-2 py-1 rounded text-xs font-medium ${
item.success
? 'bg-green-100 text-green-700'
: 'bg-red-100 text-red-700'
}`}>
{item.success ? (
<CheckCircle className="w-3 h-3" />
) : (
<XCircle className="w-3 h-3" />
)}
{item.success ? 'Success' : 'Failed'}
</span>
</td>
<td>
<div className="text-sm">{formatDate(item.startedAt)}</div>
<div className="text-xs text-gray-400">{formatTimeAgo(item.startedAt)}</div>
</td>
<td className="font-mono text-sm">
{formatDuration(item.durationMs)}
</td>
<td className="text-right font-mono text-sm">
{item.productsFound?.toLocaleString() || '-'}
</td>
<td className="text-sm text-gray-600">
{item.stateAtEnd || item.stateAtStart || '-'}
</td>
<td className="max-w-[200px]">
{item.errorMessage ? (
<span
className="text-xs text-red-600 truncate block cursor-help"
title={item.errorMessage}
>
{item.errorMessage.substring(0, 50)}...
</span>
) : '-'}
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
</div>
</Layout>
);
}
export default DispensarySchedule;

View File

@@ -3,15 +3,16 @@ import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { trackProductClick } from '../lib/analytics';
import { useStateFilter } from '../hooks/useStateFilter';
import {
Building2,
MapPin,
Package,
DollarSign,
RefreshCw,
Search,
TrendingUp,
BarChart3,
ChevronDown,
} from 'lucide-react';
interface BrandData {
@@ -25,19 +26,28 @@ interface BrandData {
export function IntelligenceBrands() {
const navigate = useNavigate();
const { selectedState, setSelectedState, stateParam, stateLabel, isAllStates } = useStateFilter();
const [availableStates, setAvailableStates] = useState<string[]>([]);
const [brands, setBrands] = useState<BrandData[]>([]);
const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState('');
const [sortBy, setSortBy] = useState<'stores' | 'skus' | 'name'>('stores');
const [sortBy, setSortBy] = useState<'stores' | 'skus' | 'name' | 'states'>('stores');
useEffect(() => {
loadBrands();
}, [stateParam]);
useEffect(() => {
// Load available states
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []);
const loadBrands = async () => {
try {
setLoading(true);
const data = await api.getIntelligenceBrands({ limit: 500 });
const data = await api.getIntelligenceBrands({ limit: 500, state: stateParam });
setBrands(data.brands || []);
} catch (error) {
console.error('Failed to load brands:', error);
@@ -58,6 +68,8 @@ export function IntelligenceBrands() {
return b.skuCount - a.skuCount;
case 'name':
return a.brandName.localeCompare(b.brandName);
case 'states':
return b.states.length - a.states.length;
default:
return 0;
}
@@ -89,35 +101,60 @@ export function IntelligenceBrands() {
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div>
<h1 className="text-2xl font-bold text-gray-900">Brands Intelligence</h1>
<p className="text-sm text-gray-600 mt-1">
Brand penetration and pricing analytics across markets
</p>
</div>
<div className="flex gap-2">
<button
onClick={() => navigate('/admin/intelligence/pricing')}
className="btn btn-sm btn-outline gap-1"
>
<DollarSign className="w-4 h-4" />
Pricing
</button>
<button
onClick={() => navigate('/admin/intelligence/stores')}
className="btn btn-sm btn-outline gap-1"
>
<MapPin className="w-4 h-4" />
Stores
</button>
<button
onClick={loadBrands}
className="btn btn-sm btn-outline gap-2"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
<div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
<button
className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
>
<Building2 className="w-4 h-4" />
<span>Brands</span>
</button>
<button
onClick={() => navigate('/admin/intelligence/stores')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<MapPin className="w-4 h-4" />
<span>Stores</span>
</button>
<button
onClick={() => navigate('/admin/intelligence/pricing')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<DollarSign className="w-4 h-4" />
<span>Pricing</span>
</button>
</div>
</div>
</div>
@@ -169,28 +206,32 @@ export function IntelligenceBrands() {
{/* Top Brands Chart */}
<div className="bg-white rounded-lg border border-gray-200 p-4">
<h3 className="text-lg font-semibold text-gray-900 mb-4 flex items-center gap-2">
<BarChart3 className="w-5 h-5 text-blue-500" />
<h3 className="text-lg font-semibold text-gray-900 flex items-center gap-2 mb-4">
<BarChart3 className="w-5 h-5 text-emerald-500" />
Top 10 Brands by Store Count
</h3>
<div className="space-y-2">
{topBrands.map((brand, idx) => (
<div key={brand.brandName} className="flex items-center gap-3">
<span className="text-sm text-gray-500 w-6">{idx + 1}.</span>
<span className="text-sm font-medium w-40 truncate" title={brand.brandName}>
{brand.brandName}
</span>
<div className="flex-1 bg-gray-100 rounded-full h-4 relative">
<div
className="bg-blue-500 rounded-full h-4"
style={{ width: `${(brand.storeCount / maxStoreCount) * 100}%` }}
/>
{topBrands.map((brand) => {
const barWidth = Math.min((brand.storeCount / maxStoreCount) * 100, 100);
return (
<div key={brand.brandName} className="flex items-center gap-3">
<span className="text-sm font-medium w-28 truncate shrink-0" title={brand.brandName}>
{brand.brandName}
</span>
<div className="flex-1 min-w-0">
<div className="bg-gray-100 rounded h-5 overflow-hidden">
<div
className="bg-gradient-to-r from-emerald-400 to-emerald-500 h-5 rounded transition-all"
style={{ width: `${barWidth}%` }}
/>
</div>
</div>
<span className="text-sm font-mono font-semibold text-emerald-600 w-16 text-right shrink-0">
{brand.storeCount}
</span>
</div>
<span className="text-sm text-gray-600 w-16 text-right">
{brand.storeCount} stores
</span>
</div>
))}
);
})}
</div>
</div>
@@ -213,6 +254,7 @@ export function IntelligenceBrands() {
>
<option value="stores">Sort by Stores</option>
<option value="skus">Sort by SKUs</option>
<option value="states">Sort by States</option>
<option value="name">Sort by Name</option>
</select>
<span className="text-sm text-gray-500">

View File

@@ -2,15 +2,16 @@ import { useEffect, useState } from 'react';
import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { useStateFilter } from '../hooks/useStateFilter';
import {
DollarSign,
Building2,
MapPin,
Package,
RefreshCw,
TrendingUp,
TrendingDown,
BarChart3,
ChevronDown,
} from 'lucide-react';
interface CategoryPricing {
@@ -31,18 +32,27 @@ interface OverallPricing {
export function IntelligencePricing() {
const navigate = useNavigate();
const { selectedState, setSelectedState, stateParam, stateLabel, isAllStates } = useStateFilter();
const [availableStates, setAvailableStates] = useState<string[]>([]);
const [categories, setCategories] = useState<CategoryPricing[]>([]);
const [overall, setOverall] = useState<OverallPricing | null>(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
loadPricing();
}, [stateParam]);
useEffect(() => {
// Load available states
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []);
const loadPricing = async () => {
try {
setLoading(true);
const data = await api.getIntelligencePricing();
const data = await api.getIntelligencePricing({ state: stateParam });
setCategories(data.byCategory || []);
setOverall(data.overall || null);
} catch (error) {
@@ -76,35 +86,60 @@ export function IntelligencePricing() {
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div>
<h1 className="text-2xl font-bold text-gray-900">Pricing Intelligence</h1>
<p className="text-sm text-gray-600 mt-1">
Price distribution and trends by category
</p>
</div>
<div className="flex gap-2">
<button
onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm btn-outline gap-1"
>
<Building2 className="w-4 h-4" />
Brands
</button>
<button
onClick={() => navigate('/admin/intelligence/stores')}
className="btn btn-sm btn-outline gap-1"
>
<MapPin className="w-4 h-4" />
Stores
</button>
<button
onClick={loadPricing}
className="btn btn-sm btn-outline gap-2"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
<div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
<button
onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<Building2 className="w-4 h-4" />
<span>Brands</span>
</button>
<button
onClick={() => navigate('/admin/intelligence/stores')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<MapPin className="w-4 h-4" />
<span>Stores</span>
</button>
<button
className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
>
<DollarSign className="w-4 h-4" />
<span>Pricing</span>
</button>
</div>
</div>
</div>
@@ -150,7 +185,7 @@ export function IntelligencePricing() {
<div>
<p className="text-sm text-gray-500">Products Priced</p>
<p className="text-2xl font-bold">
{overall.totalProducts.toLocaleString()}
{(overall.totalProducts || 0).toLocaleString()}
</p>
</div>
</div>
@@ -164,43 +199,29 @@ export function IntelligencePricing() {
<BarChart3 className="w-5 h-5 text-green-500" />
Average Price by Category
</h3>
<div className="space-y-3">
{sortedCategories.map((cat) => (
<div key={cat.category} className="flex items-center gap-3">
<span className="text-sm font-medium w-32 truncate" title={cat.category}>
{cat.category || 'Unknown'}
</span>
<div className="flex-1 relative">
{/* Price range bar */}
<div className="bg-gray-100 rounded-full h-6 relative">
{/* Min-Max range */}
<div
className="absolute top-0 h-6 bg-blue-100 rounded-full"
style={{
left: `${(cat.minPrice / (overall?.maxPrice || 100)) * 100}%`,
width: `${((cat.maxPrice - cat.minPrice) / (overall?.maxPrice || 100)) * 100}%`,
}}
/>
{/* Average marker */}
<div
className="absolute top-0 h-6 w-1 bg-green-500 rounded"
style={{ left: `${(cat.avgPrice / (overall?.maxPrice || 100)) * 100}%` }}
/>
<div className="space-y-2">
{sortedCategories.slice(0, 12).map((cat) => {
const maxPrice = Math.max(...sortedCategories.map(c => c.avgPrice || 0), 1);
const barWidth = Math.min(((cat.avgPrice || 0) / maxPrice) * 100, 100);
return (
<div key={cat.category} className="flex items-center gap-3">
<span className="text-sm font-medium w-28 truncate shrink-0" title={cat.category}>
{cat.category || 'Unknown'}
</span>
<div className="flex-1 min-w-0">
<div className="bg-gray-100 rounded h-5 overflow-hidden">
<div
className="bg-gradient-to-r from-emerald-400 to-emerald-500 h-5 rounded transition-all"
style={{ width: `${barWidth}%` }}
/>
</div>
</div>
</div>
<div className="flex gap-4 text-xs w-48">
<span className="text-gray-500">
Min: <span className="text-blue-600 font-mono">{formatPrice(cat.minPrice)}</span>
</span>
<span className="text-gray-500">
Avg: <span className="text-green-600 font-mono font-bold">{formatPrice(cat.avgPrice)}</span>
</span>
<span className="text-gray-500">
Max: <span className="text-orange-600 font-mono">{formatPrice(cat.maxPrice)}</span>
<span className="text-sm font-mono font-semibold text-emerald-600 w-16 text-right shrink-0">
{formatPrice(cat.avgPrice)}
</span>
</div>
</div>
))}
);
})}
</div>
</div>
@@ -236,7 +257,7 @@ export function IntelligencePricing() {
<span className="font-medium">{cat.category || 'Unknown'}</span>
</td>
<td className="text-center">
<span className="font-mono">{cat.productCount.toLocaleString()}</span>
<span className="font-mono">{(cat.productCount || 0).toLocaleString()}</span>
</td>
<td className="text-right">
<span className="font-mono text-blue-600">{formatPrice(cat.minPrice)}</span>

View File

@@ -8,7 +8,6 @@ import {
Building2,
DollarSign,
Package,
RefreshCw,
Search,
Clock,
Activity,
@@ -34,12 +33,19 @@ export function IntelligenceStores() {
const [stores, setStores] = useState<StoreActivity[]>([]);
const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState('');
const [localStates, setLocalStates] = useState<string[]>([]);
const [availableStates, setAvailableStates] = useState<string[]>([]);
useEffect(() => {
loadStores();
}, [selectedState]);
useEffect(() => {
// Load available states from orchestrator API
api.getOrchestratorStates().then(data => {
setAvailableStates(data.states?.map((s: any) => s.state) || []);
}).catch(console.error);
}, []);
const loadStores = async () => {
try {
setLoading(true);
@@ -48,10 +54,6 @@ export function IntelligenceStores() {
limit: 500,
});
setStores(data.stores || []);
// Extract unique states from response for dropdown counts
const uniqueStates = [...new Set(data.stores.map((s: StoreActivity) => s.state))].sort();
setLocalStates(uniqueStates);
} catch (error) {
console.error('Failed to load stores:', error);
} finally {
@@ -97,47 +99,72 @@ export function IntelligenceStores() {
);
}
// Calculate stats
const totalSKUs = stores.reduce((sum, s) => sum + s.skuCount, 0);
const totalSnapshots = stores.reduce((sum, s) => sum + s.snapshotCount, 0);
const avgFrequency = stores.filter(s => s.crawlFrequencyHours).length > 0
? stores.filter(s => s.crawlFrequencyHours).reduce((sum, s) => sum + (s.crawlFrequencyHours || 0), 0) /
stores.filter(s => s.crawlFrequencyHours).length
// Calculate stats with null safety
const totalSKUs = stores.reduce((sum, s) => sum + (s.skuCount || 0), 0);
const totalSnapshots = stores.reduce((sum, s) => sum + (s.snapshotCount || 0), 0);
const storesWithFrequency = stores.filter(s => s.crawlFrequencyHours != null);
const avgFrequency = storesWithFrequency.length > 0
? storesWithFrequency.reduce((sum, s) => sum + (s.crawlFrequencyHours || 0), 0) / storesWithFrequency.length
: 0;
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex flex-col gap-4 sm:flex-row sm:items-center sm:justify-between">
<div>
<h1 className="text-2xl font-bold text-gray-900">Store Activity</h1>
<p className="text-sm text-gray-600 mt-1">
Per-store SKU counts, snapshots, and crawl frequency
</p>
</div>
<div className="flex gap-2">
<button
onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm btn-outline gap-1"
>
<Building2 className="w-4 h-4" />
Brands
</button>
<button
onClick={() => navigate('/admin/intelligence/pricing')}
className="btn btn-sm btn-outline gap-1"
>
<DollarSign className="w-4 h-4" />
Pricing
</button>
<button
onClick={loadStores}
className="btn btn-sm btn-outline gap-2"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
<div className="flex flex-wrap gap-2 items-center">
{/* State Selector */}
<div className="dropdown dropdown-end">
<button tabIndex={0} className="btn btn-sm gap-2 bg-emerald-50 border-emerald-200 hover:bg-emerald-100">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-50 menu p-2 shadow-lg bg-white rounded-box w-44 max-h-60 overflow-y-auto border border-gray-200">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active bg-emerald-100' : ''}>
All States
</a>
</li>
<div className="divider my-1"></div>
{availableStates.map((state) => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active bg-emerald-100' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
{/* Page Navigation */}
<div className="flex gap-1">
<button
onClick={() => navigate('/admin/intelligence/brands')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<Building2 className="w-4 h-4" />
<span>Brands</span>
</button>
<button
className="btn btn-sm gap-1 bg-emerald-600 text-white hover:bg-emerald-700 border-emerald-600"
>
<MapPin className="w-4 h-4" />
<span>Stores</span>
</button>
<button
onClick={() => navigate('/admin/intelligence/pricing')}
className="btn btn-sm gap-1 bg-white border-gray-300 text-gray-700 hover:bg-gray-100"
>
<DollarSign className="w-4 h-4" />
<span>Pricing</span>
</button>
</div>
</div>
</div>
@@ -193,26 +220,6 @@ export function IntelligenceStores() {
className="input input-bordered input-sm w-full pl-10"
/>
</div>
<div className="dropdown">
<button tabIndex={0} className="btn btn-sm btn-outline gap-2">
{stateLabel}
<ChevronDown className="w-4 h-4" />
</button>
<ul tabIndex={0} className="dropdown-content z-[1] menu p-2 shadow bg-base-100 rounded-box w-40 max-h-60 overflow-y-auto">
<li>
<a onClick={() => setSelectedState(null)} className={isAllStates ? 'active' : ''}>
All States
</a>
</li>
{localStates.map(state => (
<li key={state}>
<a onClick={() => setSelectedState(state)} className={selectedState === state ? 'active' : ''}>
{state}
</a>
</li>
))}
</ul>
</div>
<span className="text-sm text-gray-500">
Showing {filteredStores.length} of {stores.length} stores
</span>
@@ -246,7 +253,7 @@ export function IntelligenceStores() {
<tr
key={store.id}
className="hover:bg-gray-50 cursor-pointer"
onClick={() => navigate(`/admin/orchestrator/stores?storeId=${store.id}`)}
onClick={() => navigate(`/stores/list/${store.id}`)}
>
<td>
<span className="font-medium">{store.name}</span>
@@ -262,10 +269,10 @@ export function IntelligenceStores() {
)}
</td>
<td className="text-center">
<span className="font-mono">{store.skuCount.toLocaleString()}</span>
<span className="font-mono">{(store.skuCount || 0).toLocaleString()}</span>
</td>
<td className="text-center">
<span className="font-mono">{store.snapshotCount.toLocaleString()}</span>
<span className="font-mono">{(store.snapshotCount || 0).toLocaleString()}</span>
</td>
<td>
<span className={store.lastCrawl ? 'text-green-600' : 'text-gray-400'}>

View File

@@ -11,7 +11,6 @@ import {
ChevronRight,
Users,
Inbox,
Zap,
Timer,
Plus,
X,
@@ -566,122 +565,6 @@ function PriorityBadge({ priority }: { priority: number }) {
);
}
// Pod visualization - shows pod as hub with worker nodes radiating out
function PodVisualization({ podName, workers }: { podName: string; workers: Worker[] }) {
const busyCount = workers.filter(w => w.current_task_id !== null).length;
const allBusy = busyCount === workers.length;
const allIdle = busyCount === 0;
// Aggregate resource stats for the pod
const totalMemoryMb = workers.reduce((sum, w) => sum + (w.metadata?.memory_rss_mb || 0), 0);
const totalCpuUserMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_user_ms || 0), 0);
const totalCpuSystemMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_system_ms || 0), 0);
const totalCompleted = workers.reduce((sum, w) => sum + w.tasks_completed, 0);
const totalFailed = workers.reduce((sum, w) => sum + w.tasks_failed, 0);
// Format CPU time
const formatCpuTime = (ms: number) => {
if (ms < 1000) return `${ms}ms`;
if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
return `${(ms / 60000).toFixed(1)}m`;
};
// Pod color based on worker status
const podColor = allBusy ? 'bg-blue-500' : allIdle ? 'bg-emerald-500' : 'bg-yellow-500';
const podBorder = allBusy ? 'border-blue-400' : allIdle ? 'border-emerald-400' : 'border-yellow-400';
const podGlow = allBusy ? 'shadow-blue-200' : allIdle ? 'shadow-emerald-200' : 'shadow-yellow-200';
// Build pod tooltip
const podTooltip = [
`Pod: ${podName}`,
`Workers: ${busyCount}/${workers.length} busy`,
`Memory: ${totalMemoryMb} MB (RSS)`,
`CPU: ${formatCpuTime(totalCpuUserMs)} user, ${formatCpuTime(totalCpuSystemMs)} system`,
`Tasks: ${totalCompleted} completed, ${totalFailed} failed`,
].join('\n');
return (
<div className="flex flex-col items-center p-4">
{/* Pod hub */}
<div className="relative">
{/* Center pod circle */}
<div
className={`w-20 h-20 rounded-full ${podColor} border-4 ${podBorder} shadow-lg ${podGlow} flex items-center justify-center text-white font-bold text-xs text-center leading-tight z-10 relative cursor-help`}
title={podTooltip}
>
<span className="px-1">{podName}</span>
</div>
{/* Worker nodes radiating out */}
{workers.map((worker, index) => {
const angle = (index * 360) / workers.length - 90; // Start from top
const radians = (angle * Math.PI) / 180;
const radius = 55; // Distance from center
const x = Math.cos(radians) * radius;
const y = Math.sin(radians) * radius;
const isBusy = worker.current_task_id !== null;
const workerColor = isBusy ? 'bg-blue-500' : 'bg-emerald-500';
const workerBorder = isBusy ? 'border-blue-300' : 'border-emerald-300';
// Line from center to worker
const lineLength = radius - 10;
const lineX = Math.cos(radians) * (lineLength / 2 + 10);
const lineY = Math.sin(radians) * (lineLength / 2 + 10);
return (
<div key={worker.id}>
{/* Connection line */}
<div
className={`absolute w-0.5 ${isBusy ? 'bg-blue-300' : 'bg-emerald-300'}`}
style={{
height: `${lineLength}px`,
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${lineX}px, ${lineY}px) rotate(${angle + 90}deg)`,
transformOrigin: 'center',
}}
/>
{/* Worker node */}
<div
className={`absolute w-6 h-6 rounded-full ${workerColor} border-2 ${workerBorder} flex items-center justify-center text-white text-xs font-bold cursor-pointer hover:scale-110 transition-transform`}
style={{
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${x}px, ${y}px)`,
}}
title={`${worker.friendly_name}\nStatus: ${isBusy ? `Working on task #${worker.current_task_id}` : 'Idle - waiting for tasks'}\nMemory: ${worker.metadata?.memory_rss_mb || 0} MB\nCPU: ${formatCpuTime(worker.metadata?.cpu_user_ms || 0)} user, ${formatCpuTime(worker.metadata?.cpu_system_ms || 0)} sys\nCompleted: ${worker.tasks_completed} | Failed: ${worker.tasks_failed}\nLast heartbeat: ${new Date(worker.last_heartbeat_at).toLocaleTimeString()}`}
>
{index + 1}
</div>
</div>
);
})}
</div>
{/* Pod stats */}
<div className="mt-12 text-center">
<p className="text-xs text-gray-500">
{busyCount}/{workers.length} busy
</p>
</div>
</div>
);
}
// Group workers by pod
function groupWorkersByPod(workers: Worker[]): Map<string, Worker[]> {
const pods = new Map<string, Worker[]>();
for (const worker of workers) {
const podName = worker.pod_name || 'Unknown';
if (!pods.has(podName)) {
pods.set(podName, []);
}
pods.get(podName)!.push(worker);
}
return pods;
}
export function JobQueue() {
const [workers, setWorkers] = useState<Worker[]>([]);
const [tasks, setTasks] = useState<Task[]>([]);
@@ -768,7 +651,6 @@ export function JobQueue() {
// Get active workers (for display)
const activeWorkers = workers.filter(w => w.status !== 'offline' && w.status !== 'terminated');
const busyWorkers = workers.filter(w => w.current_task_id !== null);
if (loading) {
return (
@@ -874,46 +756,6 @@ export function JobQueue() {
</div>
)}
{/* Pods & Workers Section */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50">
<div className="flex items-center justify-between">
<div>
<h3 className="text-sm font-semibold text-gray-900 flex items-center gap-2">
<Zap className="w-4 h-4 text-emerald-500" />
Worker Pods ({Array.from(groupWorkersByPod(workers)).length} pods, {activeWorkers.length} workers)
</h3>
<p className="text-xs text-gray-500 mt-0.5">
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-emerald-500"></span> idle</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-blue-500"></span> busy</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-yellow-500"></span> mixed</span>
</p>
</div>
<div className="text-sm text-gray-500">
{busyWorkers.length} busy, {activeWorkers.length - busyWorkers.length} idle
</div>
</div>
</div>
{workers.length === 0 ? (
<div className="px-4 py-12 text-center text-gray-500">
<Users className="w-12 h-12 mx-auto mb-3 text-gray-300" />
<p className="font-medium">No worker pods running</p>
<p className="text-xs mt-1">Start pods to process tasks from the queue</p>
</div>
) : (
<div className="p-6">
<div className="flex flex-wrap justify-center gap-8">
{Array.from(groupWorkersByPod(workers)).map(([podName, podWorkers]) => (
<PodVisualization key={podName} podName={podName} workers={podWorkers} />
))}
</div>
</div>
)}
</div>
{/* Task Pool Section */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50">

View File

@@ -8,7 +8,6 @@
import { useState, useEffect } from 'react';
import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { StateBadge } from '../components/StateSelector';
import { useStateStore } from '../store/stateStore';
import { api } from '../lib/api';
import {
@@ -21,7 +20,6 @@ import {
DollarSign,
MapPin,
ArrowRight,
RefreshCw,
AlertCircle
} from 'lucide-react';
@@ -205,7 +203,6 @@ export default function NationalDashboard() {
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const [summary, setSummary] = useState<NationalSummary | null>(null);
const [refreshing, setRefreshing] = useState(false);
const fetchData = async () => {
setLoading(true);
@@ -230,18 +227,6 @@ export default function NationalDashboard() {
fetchData();
}, []);
const handleRefreshMetrics = async () => {
setRefreshing(true);
try {
await api.post('/api/admin/states/refresh-metrics');
await fetchData();
} catch (err) {
console.error('Failed to refresh metrics:', err);
} finally {
setRefreshing(false);
}
};
const handleStateClick = (stateCode: string) => {
setSelectedState(stateCode);
navigate(`/national/state/${stateCode}`);
@@ -278,24 +263,11 @@ export default function NationalDashboard() {
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-2xl font-bold text-gray-900">National Dashboard</h1>
<p className="text-gray-500 mt-1">
Multi-state cannabis market intelligence
</p>
</div>
<div className="flex items-center gap-3">
<StateBadge />
<button
onClick={handleRefreshMetrics}
disabled={refreshing}
className="flex items-center gap-2 px-3 py-2 text-sm text-gray-600 hover:text-gray-900 border border-gray-200 rounded-lg hover:bg-gray-50 disabled:opacity-50"
>
<RefreshCw className={`w-4 h-4 ${refreshing ? 'animate-spin' : ''}`} />
Refresh Metrics
</button>
</div>
<div>
<h1 className="text-2xl font-bold text-gray-900">National Dashboard</h1>
<p className="text-gray-500 mt-1">
Multi-state cannabis market intelligence
</p>
</div>
{/* Summary Cards */}
@@ -303,7 +275,7 @@ export default function NationalDashboard() {
<>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4">
<MetricCard
title="Active States"
title="Regions (US + CA)"
value={summary.activeStates}
icon={Globe}
/>

View File

@@ -153,29 +153,6 @@ export function StoreDetailPage() {
Back to Stores
</button>
{/* Update Button */}
<div className="relative">
<button
onClick={() => setShowUpdateDropdown(!showUpdateDropdown)}
disabled={isUpdating}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium text-white bg-blue-600 hover:bg-blue-700 rounded-lg disabled:opacity-50 disabled:cursor-not-allowed"
>
<RefreshCw className={`w-4 h-4 ${isUpdating ? 'animate-spin' : ''}`} />
{isUpdating ? 'Crawling...' : 'Crawl Now'}
{!isUpdating && <ChevronDown className="w-4 h-4" />}
</button>
{showUpdateDropdown && !isUpdating && (
<div className="absolute right-0 mt-2 w-48 bg-white rounded-lg shadow-lg border border-gray-200 z-10">
<button
onClick={handleCrawl}
className="w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100 rounded-lg"
>
Start Full Crawl
</button>
</div>
)}
</div>
</div>
{/* Store Header */}
@@ -200,7 +177,7 @@ export function StoreDetailPage() {
<div className="flex items-center gap-2 text-sm text-gray-600 bg-gray-50 px-4 py-2 rounded-lg">
<Clock className="w-4 h-4" />
<div>
<span className="font-medium">Last Crawl:</span>
<span className="font-medium">Last Updated:</span>
<span className="ml-2">
{lastCrawl?.completed_at
? new Date(lastCrawl.completed_at).toLocaleDateString('en-US', {
@@ -212,15 +189,6 @@ export function StoreDetailPage() {
})
: 'Never'}
</span>
{lastCrawl?.status && (
<span className={`ml-2 px-2 py-0.5 rounded text-xs ${
lastCrawl.status === 'completed' ? 'bg-green-100 text-green-800' :
lastCrawl.status === 'failed' ? 'bg-red-100 text-red-800' :
'bg-yellow-100 text-yellow-800'
}`}>
{lastCrawl.status}
</span>
)}
</div>
</div>
</div>
@@ -282,8 +250,8 @@ export function StoreDetailPage() {
setStockFilter('in_stock');
setSearchQuery('');
}}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'in_stock' ? 'border-blue-500' : 'border-gray-200'
className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'in_stock' ? 'border-gray-400' : 'border-gray-200'
}`}
>
<div className="flex items-center gap-3">
@@ -303,8 +271,8 @@ export function StoreDetailPage() {
setStockFilter('out_of_stock');
setSearchQuery('');
}}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'out_of_stock' ? 'border-blue-500' : 'border-gray-200'
className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
stockFilter === 'out_of_stock' ? 'border-gray-400' : 'border-gray-200'
}`}
>
<div className="flex items-center gap-3">
@@ -320,8 +288,8 @@ export function StoreDetailPage() {
<button
onClick={() => setActiveTab('brands')}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'brands' ? 'border-blue-500' : 'border-gray-200'
className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'brands' ? 'border-gray-400' : 'border-gray-200'
}`}
>
<div className="flex items-center gap-3">
@@ -337,8 +305,8 @@ export function StoreDetailPage() {
<button
onClick={() => setActiveTab('categories')}
className={`bg-white rounded-lg border p-4 hover:border-blue-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'categories' ? 'border-blue-500' : 'border-gray-200'
className={`bg-white rounded-lg border p-4 hover:border-gray-300 hover:shadow-md transition-all cursor-pointer text-left ${
activeTab === 'categories' ? 'border-gray-400' : 'border-gray-200'
}`}
>
<div className="flex items-center gap-3">
@@ -364,7 +332,7 @@ export function StoreDetailPage() {
}}
className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'products'
? 'border-blue-600 text-blue-600'
? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900'
}`}
>
@@ -374,7 +342,7 @@ export function StoreDetailPage() {
onClick={() => setActiveTab('brands')}
className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'brands'
? 'border-blue-600 text-blue-600'
? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900'
}`}
>
@@ -384,7 +352,7 @@ export function StoreDetailPage() {
onClick={() => setActiveTab('categories')}
className={`py-4 px-2 text-sm font-medium border-b-2 ${
activeTab === 'categories'
? 'border-blue-600 text-blue-600'
? 'border-gray-800 text-gray-900'
: 'border-transparent text-gray-600 hover:text-gray-900'
}`}
>
@@ -433,7 +401,7 @@ export function StoreDetailPage() {
{productsLoading ? (
<div className="text-center py-8">
<div className="inline-block animate-spin rounded-full h-6 w-6 border-4 border-blue-500 border-t-transparent"></div>
<div className="inline-block animate-spin rounded-full h-6 w-6 border-4 border-gray-400 border-t-transparent"></div>
<p className="mt-2 text-sm text-gray-600">Loading products...</p>
</div>
) : products.length === 0 ? (
@@ -485,9 +453,9 @@ export function StoreDetailPage() {
<div className="line-clamp-2" title={product.brand || '-'}>{product.brand || '-'}</div>
</td>
<td className="whitespace-nowrap">
<span className="badge badge-ghost badge-sm">{product.type || '-'}</span>
<span className="text-xs text-gray-500 bg-gray-100 px-1.5 py-0.5 rounded">{product.type || '-'}</span>
{product.subcategory && (
<span className="badge badge-ghost badge-sm ml-1">{product.subcategory}</span>
<span className="text-xs text-gray-500 bg-gray-100 px-1.5 py-0.5 rounded ml-1">{product.subcategory}</span>
)}
</td>
<td className="text-right font-semibold whitespace-nowrap">
@@ -500,21 +468,14 @@ export function StoreDetailPage() {
`$${product.regular_price}`
) : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.thc_percentage ? (
<span className="badge badge-success badge-sm">{product.thc_percentage}%</span>
) : '-'}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.thc_percentage ? `${product.thc_percentage}%` : '-'}
</td>
<td className="text-center whitespace-nowrap">
{product.stock_status === 'in_stock' ? (
<span className="badge badge-success badge-sm">In Stock</span>
) : product.stock_status === 'out_of_stock' ? (
<span className="badge badge-error badge-sm">Out</span>
) : (
<span className="badge badge-warning badge-sm">Unknown</span>
)}
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.stock_status === 'in_stock' ? 'In Stock' :
product.stock_status === 'out_of_stock' ? 'Out' : '-'}
</td>
<td className="text-center whitespace-nowrap">
<td className="text-center whitespace-nowrap text-sm text-gray-700">
{product.total_quantity != null ? product.total_quantity : '-'}
</td>
<td className="whitespace-nowrap text-xs text-gray-500">

View File

@@ -14,8 +14,8 @@ import {
ChevronUp,
Gauge,
Users,
Calendar,
Zap,
Play,
Square,
} from 'lucide-react';
interface Task {
@@ -82,6 +82,27 @@ const STATUS_COLORS: Record<string, string> = {
stale: 'bg-gray-100 text-gray-800',
};
const getStatusIcon = (status: string, poolPaused: boolean): React.ReactNode => {
switch (status) {
case 'pending':
return <Clock className="w-4 h-4" />;
case 'claimed':
return <PlayCircle className="w-4 h-4" />;
case 'running':
// Don't spin when pool is paused
return <RefreshCw className={`w-4 h-4 ${!poolPaused ? 'animate-spin' : ''}`} />;
case 'completed':
return <CheckCircle2 className="w-4 h-4" />;
case 'failed':
return <XCircle className="w-4 h-4" />;
case 'stale':
return <AlertTriangle className="w-4 h-4" />;
default:
return null;
}
};
// Static version for summary cards (always shows animation)
const STATUS_ICONS: Record<string, React.ReactNode> = {
pending: <Clock className="w-4 h-4" />,
claimed: <PlayCircle className="w-4 h-4" />,
@@ -116,6 +137,8 @@ export default function TasksDashboard() {
const [capacity, setCapacity] = useState<CapacityMetric[]>([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const [poolPaused, setPoolPaused] = useState(false);
const [poolLoading, setPoolLoading] = useState(false);
// Filters
const [roleFilter, setRoleFilter] = useState<string>('');
@@ -123,13 +146,10 @@ export default function TasksDashboard() {
const [searchQuery, setSearchQuery] = useState('');
const [showCapacity, setShowCapacity] = useState(true);
// Actions
const [actionLoading, setActionLoading] = useState(false);
const [actionMessage, setActionMessage] = useState<string | null>(null);
const fetchData = async () => {
try {
const [tasksRes, countsRes, capacityRes] = await Promise.all([
const [tasksRes, countsRes, capacityRes, poolStatus] = await Promise.all([
api.getTasks({
role: roleFilter || undefined,
status: statusFilter || undefined,
@@ -137,11 +157,13 @@ export default function TasksDashboard() {
}),
api.getTaskCounts(),
api.getTaskCapacity(),
api.getTaskPoolStatus(),
]);
setTasks(tasksRes.tasks || []);
setCounts(countsRes);
setCapacity(capacityRes.metrics || []);
setPoolPaused(poolStatus.paused);
setError(null);
} catch (err: any) {
setError(err.message || 'Failed to load tasks');
@@ -150,40 +172,29 @@ export default function TasksDashboard() {
}
};
const togglePool = async () => {
setPoolLoading(true);
try {
if (poolPaused) {
await api.resumeTaskPool();
setPoolPaused(false);
} else {
await api.pauseTaskPool();
setPoolPaused(true);
}
} catch (err: any) {
setError(err.message || 'Failed to toggle pool');
} finally {
setPoolLoading(false);
}
};
useEffect(() => {
fetchData();
const interval = setInterval(fetchData, 10000); // Refresh every 10 seconds
const interval = setInterval(fetchData, 15000); // Auto-refresh every 15 seconds
return () => clearInterval(interval);
}, [roleFilter, statusFilter]);
const handleGenerateResync = async () => {
setActionLoading(true);
try {
const result = await api.generateResyncTasks();
setActionMessage(`Generated ${result.tasks_created} resync tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const handleRecoverStale = async () => {
setActionLoading(true);
try {
const result = await api.recoverStaleTasks();
setActionMessage(`Recovered ${result.tasks_recovered} stale tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const filteredTasks = tasks.filter((task) => {
if (searchQuery) {
const query = searchQuery.toLowerCase();
@@ -213,58 +224,47 @@ export default function TasksDashboard() {
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<div>
<h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2">
<ListChecks className="w-7 h-7 text-emerald-600" />
Task Queue
</h1>
<p className="text-gray-500 mt-1">
{totalActive} active, {totalPending} pending tasks
</p>
</div>
{/* Sticky Header */}
<div className="sticky top-0 z-10 bg-white pb-4 -mx-6 px-6 pt-2 border-b border-gray-200 shadow-sm">
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<div>
<h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2">
<ListChecks className="w-7 h-7 text-emerald-600" />
Task Queue
</h1>
<p className="text-gray-500 mt-1">
{totalActive} active, {totalPending} pending tasks
</p>
</div>
<div className="flex gap-2">
<button
onClick={handleGenerateResync}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 disabled:opacity-50"
<div className="flex items-center gap-4">
{/* Pool Toggle */}
<button
onClick={togglePool}
disabled={poolLoading}
className={`flex items-center gap-2 px-4 py-2 rounded-lg font-medium transition-colors ${
poolPaused
? 'bg-emerald-100 text-emerald-700 hover:bg-emerald-200'
: 'bg-red-100 text-red-700 hover:bg-red-200'
}`}
>
<Calendar className="w-4 h-4" />
Generate Resync
</button>
<button
onClick={handleRecoverStale}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-gray-600 text-white rounded-lg hover:bg-gray-700 disabled:opacity-50"
>
<Zap className="w-4 h-4" />
Recover Stale
</button>
<button
onClick={fetchData}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200"
>
<RefreshCw className="w-4 h-4" />
Refresh
{poolPaused ? (
<>
<Play className={`w-5 h-5 ${poolLoading ? 'animate-pulse' : ''}`} />
Start Pool
</>
) : (
<>
<Square className={`w-5 h-5 ${poolLoading ? 'animate-pulse' : ''}`} />
Stop Pool
</>
)}
</button>
<span className="text-sm text-gray-400">Auto-refreshes every 15s</span>
</div>
</div>
</div>
{/* Action Message */}
{actionMessage && (
<div
className={`p-4 rounded-lg ${
actionMessage.startsWith('Error')
? 'bg-red-50 text-red-700'
: 'bg-green-50 text-green-700'
}`}
>
{actionMessage}
</div>
)}
{error && (
<div className="p-4 bg-red-50 text-red-700 rounded-lg">{error}</div>
)}
@@ -281,7 +281,7 @@ export default function TasksDashboard() {
>
<div className="flex items-center gap-2 mb-2">
<span className={`p-1.5 rounded ${STATUS_COLORS[status]}`}>
{STATUS_ICONS[status]}
{getStatusIcon(status, poolPaused)}
</span>
<span className="text-sm font-medium text-gray-600 capitalize">{status}</span>
</div>
@@ -496,7 +496,7 @@ export default function TasksDashboard() {
STATUS_COLORS[task.status]
}`}
>
{STATUS_ICONS[task.status]}
{getStatusIcon(task.status, poolPaused)}
{task.status}
</span>
</td>

View File

@@ -18,9 +18,11 @@ import {
Server,
MapPin,
Trash2,
PowerOff,
Undo2,
Plus,
Minus,
Loader2,
MemoryStick,
AlertTriangle,
} from 'lucide-react';
// Worker from registry
@@ -39,16 +41,25 @@ interface Worker {
tasks_completed: number;
tasks_failed: number;
current_task_id: number | null;
current_task_ids?: number[]; // Multiple concurrent tasks
active_task_count?: number;
max_concurrent_tasks?: number;
health_status: string;
seconds_since_heartbeat: number;
decommission_requested?: boolean;
decommission_reason?: string;
metadata: {
cpu?: number;
memory?: number;
memoryTotal?: number;
memory_mb?: number;
memory_total_mb?: number;
memory_percent?: number; // NEW: memory as percentage
cpu_user_ms?: number;
cpu_system_ms?: number;
cpu_percent?: number; // NEW: CPU percentage
is_backing_off?: boolean; // NEW: resource backoff state
backoff_reason?: string; // NEW: why backing off
proxy_location?: {
city?: string;
state?: string;
@@ -72,14 +83,6 @@ interface Task {
worker_id: string | null;
}
// K8s replica info (added 2024-12-10)
interface K8sReplicas {
current: number;
desired: number;
available: number;
updated: number;
}
function formatRelativeTime(dateStr: string | null): string {
if (!dateStr) return '-';
const date = new Date(dateStr);
@@ -220,69 +223,257 @@ function HealthBadge({ status, healthStatus }: { status: string; healthStatus: s
);
}
// Format CPU time for display
function formatCpuTime(ms: number): string {
if (ms < 1000) return `${ms}ms`;
if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
return `${(ms / 60000).toFixed(1)}m`;
}
// Resource usage badge showing memory%, CPU%, and backoff status
function ResourceBadge({ worker }: { worker: Worker }) {
const memPercent = worker.metadata?.memory_percent;
const cpuPercent = worker.metadata?.cpu_percent;
const isBackingOff = worker.metadata?.is_backing_off;
const backoffReason = worker.metadata?.backoff_reason;
if (isBackingOff) {
return (
<div className="flex items-center gap-1.5" title={backoffReason || 'Backing off due to resource pressure'}>
<AlertTriangle className="w-4 h-4 text-amber-500 animate-pulse" />
<span className="text-xs text-amber-600 font-medium">Backing off</span>
</div>
);
}
// No data yet
if (memPercent === undefined && cpuPercent === undefined) {
return <span className="text-gray-400 text-xs">-</span>;
}
// Color based on usage level
const getColor = (pct: number) => {
if (pct >= 90) return 'text-red-600';
if (pct >= 75) return 'text-amber-600';
if (pct >= 50) return 'text-yellow-600';
return 'text-emerald-600';
};
return (
<div className="flex flex-col gap-0.5 text-xs">
{memPercent !== undefined && (
<div className="flex items-center gap-1" title={`Memory: ${worker.metadata?.memory_mb || 0}MB / ${worker.metadata?.memory_total_mb || 0}MB`}>
<MemoryStick className={`w-3 h-3 ${getColor(memPercent)}`} />
<span className={getColor(memPercent)}>{memPercent}%</span>
</div>
)}
{cpuPercent !== undefined && (
<div className="flex items-center gap-1">
<Cpu className={`w-3 h-3 ${getColor(cpuPercent)}`} />
<span className={getColor(cpuPercent)}>{cpuPercent}%</span>
</div>
)}
</div>
);
}
// Task count badge showing active/max concurrent tasks
function TaskCountBadge({ worker, tasks }: { worker: Worker; tasks: Task[] }) {
const activeCount = worker.active_task_count ?? (worker.current_task_id ? 1 : 0);
const maxCount = worker.max_concurrent_tasks ?? 1;
const taskIds = worker.current_task_ids ?? (worker.current_task_id ? [worker.current_task_id] : []);
if (activeCount === 0) {
return <span className="text-gray-400 text-sm">Idle</span>;
}
// Get task names for tooltip
const taskNames = taskIds.map(id => {
const task = tasks.find(t => t.id === id);
return task ? `#${id}: ${task.role}${task.dispensary_name ? ` (${task.dispensary_name})` : ''}` : `#${id}`;
}).join('\n');
return (
<div className="flex items-center gap-2" title={taskNames}>
<span className="text-sm font-medium text-blue-600">
{activeCount}/{maxCount} tasks
</span>
{taskIds.length === 1 && (
<span className="text-xs text-gray-500">#{taskIds[0]}</span>
)}
</div>
);
}
// Pod visualization - shows pod as hub with worker nodes radiating out
function PodVisualization({
podName,
workers,
isSelected = false,
onSelect
}: {
podName: string;
workers: Worker[];
isSelected?: boolean;
onSelect?: () => void;
}) {
const busyCount = workers.filter(w => w.current_task_id !== null).length;
const allBusy = busyCount === workers.length;
const allIdle = busyCount === 0;
// Aggregate resource stats for the pod
const totalMemoryMb = workers.reduce((sum, w) => sum + (w.metadata?.memory_mb || 0), 0);
const totalCpuUserMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_user_ms || 0), 0);
const totalCpuSystemMs = workers.reduce((sum, w) => sum + (w.metadata?.cpu_system_ms || 0), 0);
const totalCompleted = workers.reduce((sum, w) => sum + w.tasks_completed, 0);
const totalFailed = workers.reduce((sum, w) => sum + w.tasks_failed, 0);
// Pod color based on worker status
const podColor = allBusy ? 'bg-blue-500' : allIdle ? 'bg-emerald-500' : 'bg-yellow-500';
const podBorder = allBusy ? 'border-blue-400' : allIdle ? 'border-emerald-400' : 'border-yellow-400';
const podGlow = allBusy ? 'shadow-blue-200' : allIdle ? 'shadow-emerald-200' : 'shadow-yellow-200';
// Selection ring
const selectionRing = isSelected ? 'ring-4 ring-purple-400 ring-offset-2' : '';
// Build pod tooltip
const podTooltip = [
`Pod: ${podName}`,
`Workers: ${busyCount}/${workers.length} busy`,
`Memory: ${totalMemoryMb} MB (RSS)`,
`CPU: ${formatCpuTime(totalCpuUserMs)} user, ${formatCpuTime(totalCpuSystemMs)} system`,
`Tasks: ${totalCompleted} completed, ${totalFailed} failed`,
'Click to select',
].join('\n');
return (
<div className="flex flex-col items-center p-4">
{/* Pod hub */}
<div className="relative">
{/* Center pod circle */}
<div
className={`w-20 h-20 rounded-full ${podColor} border-4 ${podBorder} shadow-lg ${podGlow} ${selectionRing} flex items-center justify-center text-white font-bold text-xs text-center leading-tight z-10 relative cursor-pointer hover:scale-105 transition-all`}
title={podTooltip}
onClick={onSelect}
>
<span className="px-1">{podName}</span>
</div>
{/* Worker nodes radiating out */}
{workers.map((worker, index) => {
const angle = (index * 360) / workers.length - 90; // Start from top
const radians = (angle * Math.PI) / 180;
const radius = 55; // Distance from center
const x = Math.cos(radians) * radius;
const y = Math.sin(radians) * radius;
const isBusy = worker.current_task_id !== null;
const isDecommissioning = worker.decommission_requested;
const workerColor = isDecommissioning ? 'bg-orange-500' : isBusy ? 'bg-blue-500' : 'bg-emerald-500';
const workerBorder = isDecommissioning ? 'border-orange-300' : isBusy ? 'border-blue-300' : 'border-emerald-300';
// Line from center to worker
const lineLength = radius - 10;
const lineX = Math.cos(radians) * (lineLength / 2 + 10);
const lineY = Math.sin(radians) * (lineLength / 2 + 10);
return (
<div key={worker.id}>
{/* Connection line */}
<div
className={`absolute w-0.5 ${isDecommissioning ? 'bg-orange-300' : isBusy ? 'bg-blue-300' : 'bg-emerald-300'}`}
style={{
height: `${lineLength}px`,
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${lineX}px, ${lineY}px) rotate(${angle + 90}deg)`,
transformOrigin: 'center',
}}
/>
{/* Worker node */}
<div
className={`absolute w-6 h-6 rounded-full ${workerColor} border-2 ${workerBorder} flex items-center justify-center text-white text-xs font-bold cursor-pointer hover:scale-110 transition-transform`}
style={{
left: '50%',
top: '50%',
transform: `translate(-50%, -50%) translate(${x}px, ${y}px)`,
}}
title={`${worker.friendly_name}\nStatus: ${isDecommissioning ? 'Stopping after current task' : isBusy ? `Working on task #${worker.current_task_id}` : 'Idle - waiting for tasks'}\nMemory: ${worker.metadata?.memory_mb || 0} MB\nCPU: ${formatCpuTime(worker.metadata?.cpu_user_ms || 0)} user, ${formatCpuTime(worker.metadata?.cpu_system_ms || 0)} sys\nCompleted: ${worker.tasks_completed} | Failed: ${worker.tasks_failed}\nLast heartbeat: ${new Date(worker.last_heartbeat_at).toLocaleTimeString()}`}
>
{index + 1}
</div>
</div>
);
})}
</div>
{/* Pod stats */}
<div className="mt-12 text-center">
<p className="text-xs text-gray-500">
{busyCount}/{workers.length} busy
</p>
{isSelected && (
<p className="text-xs text-purple-600 font-medium mt-1">Selected</p>
)}
</div>
</div>
);
}
// Group workers by pod
function groupWorkersByPod(workers: Worker[]): Map<string, Worker[]> {
const pods = new Map<string, Worker[]>();
for (const worker of workers) {
const podName = worker.pod_name || 'Unknown';
if (!pods.has(podName)) {
pods.set(podName, []);
}
pods.get(podName)!.push(worker);
}
return pods;
}
// Format estimated time remaining
function formatEstimatedTime(hours: number): string {
if (hours < 1) {
return `${Math.round(hours * 60)} minutes`;
}
if (hours < 24) {
return `${hours.toFixed(1)} hours`;
}
const days = hours / 24;
if (days < 7) {
return `${days.toFixed(1)} days`;
}
return `${(days / 7).toFixed(1)} weeks`;
}
export function WorkersDashboard() {
const [workers, setWorkers] = useState<Worker[]>([]);
const [tasks, setTasks] = useState<Task[]>([]);
const [pendingTaskCount, setPendingTaskCount] = useState<number>(0);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
// K8s scaling state (added 2024-12-10)
const [k8sReplicas, setK8sReplicas] = useState<K8sReplicas | null>(null);
const [k8sError, setK8sError] = useState<string | null>(null);
const [scaling, setScaling] = useState(false);
const [targetReplicas, setTargetReplicas] = useState<number | null>(null);
// Pod selection state
const [selectedPod, setSelectedPod] = useState<string | null>(null);
// Pagination
const [page, setPage] = useState(0);
const workersPerPage = 15;
// Fetch K8s replica count (added 2024-12-10)
const fetchK8sReplicas = useCallback(async () => {
try {
const res = await api.get('/api/workers/k8s/replicas');
if (res.data.success && res.data.replicas) {
setK8sReplicas(res.data.replicas);
if (targetReplicas === null) {
setTargetReplicas(res.data.replicas.desired);
}
setK8sError(null);
}
} catch (err: any) {
// K8s not available (local dev or no RBAC)
setK8sError(err.response?.data?.error || 'K8s not available');
setK8sReplicas(null);
}
}, [targetReplicas]);
// Scale workers (added 2024-12-10)
const handleScale = useCallback(async (replicas: number) => {
if (replicas < 0 || replicas > 20) return;
setScaling(true);
try {
const res = await api.post('/api/workers/k8s/scale', { replicas });
if (res.data.success) {
setTargetReplicas(replicas);
// Refresh after a short delay to see the change
setTimeout(fetchK8sReplicas, 1000);
}
} catch (err: any) {
console.error('Scale error:', err);
setK8sError(err.response?.data?.error || 'Failed to scale');
} finally {
setScaling(false);
}
}, [fetchK8sReplicas]);
const fetchData = useCallback(async () => {
try {
// Fetch workers from registry
const workersRes = await api.get('/api/worker-registry/workers');
// Fetch running tasks to get current task details
const tasksRes = await api.get('/api/tasks?status=running&limit=100');
// Fetch workers from registry, running tasks, and task counts
const [workersRes, tasksRes, countsRes] = await Promise.all([
api.get('/api/worker-registry/workers'),
api.get('/api/tasks?status=running&limit=100'),
api.get('/api/tasks/counts'),
]);
setWorkers(workersRes.data.workers || []);
setTasks(tasksRes.data.tasks || []);
setPendingTaskCount(countsRes.data?.pending || 0);
setError(null);
} catch (err: any) {
console.error('Fetch error:', err);
@@ -292,16 +483,6 @@ export function WorkersDashboard() {
}
}, []);
// Cleanup stale workers
const handleCleanupStale = async () => {
try {
await api.post('/api/worker-registry/cleanup', { stale_threshold_minutes: 2 });
fetchData();
} catch (err: any) {
console.error('Cleanup error:', err);
}
};
// Remove a single worker
const handleRemoveWorker = async (workerId: string) => {
if (!confirm('Remove this worker from the registry?')) return;
@@ -313,16 +494,51 @@ export function WorkersDashboard() {
}
};
// Decommission a worker (graceful shutdown after current task)
const handleDecommissionWorker = async (workerId: string, friendlyName: string) => {
if (!confirm(`Decommission ${friendlyName}? Worker will stop after completing its current task.`)) return;
try {
const res = await api.post(`/api/worker-registry/workers/${workerId}/decommission`, {
reason: 'Manual decommission from admin UI'
});
if (res.data.success) {
fetchData();
}
} catch (err: any) {
console.error('Decommission error:', err);
alert(err.response?.data?.error || 'Failed to decommission worker');
}
};
// Cancel decommission
const handleCancelDecommission = async (workerId: string) => {
try {
await api.post(`/api/worker-registry/workers/${workerId}/cancel-decommission`);
fetchData();
} catch (err: any) {
console.error('Cancel decommission error:', err);
}
};
// Add a worker by scaling up the K8s deployment
const handleAddWorker = async () => {
try {
const res = await api.post('/api/workers/k8s/scale-up');
if (res.data.success) {
// Refresh after a short delay to see the new worker
setTimeout(fetchData, 2000);
}
} catch (err: any) {
console.error('Add worker error:', err);
alert(err.response?.data?.error || 'Failed to add worker. K8s scaling may not be available.');
}
};
useEffect(() => {
fetchData();
fetchK8sReplicas(); // Added 2024-12-10
const interval = setInterval(fetchData, 5000);
const k8sInterval = setInterval(fetchK8sReplicas, 10000); // K8s refresh every 10s
return () => {
clearInterval(interval);
clearInterval(k8sInterval);
};
}, [fetchData, fetchK8sReplicas]);
return () => clearInterval(interval);
}, [fetchData]);
// Paginated workers
const paginatedWorkers = workers.slice(
@@ -362,25 +578,9 @@ export function WorkersDashboard() {
<h1 className="text-2xl font-bold text-gray-900">Workers</h1>
<p className="text-gray-500 mt-1">
{workers.length} registered workers ({busyWorkers.length} busy, {idleWorkers.length} idle)
<span className="text-xs text-gray-400 ml-2">(auto-refresh 5s)</span>
</p>
</div>
<div className="flex items-center gap-2">
<button
onClick={handleCleanupStale}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 transition-colors"
title="Mark stale workers (no heartbeat > 2 min) as offline"
>
<Trash2 className="w-4 h-4" />
Cleanup Stale
</button>
<button
onClick={() => fetchData()}
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 transition-colors"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
</div>
</div>
{error && (
@@ -389,68 +589,6 @@ export function WorkersDashboard() {
</div>
)}
{/* K8s Scaling Card (added 2024-12-10) */}
{k8sReplicas && (
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-purple-100 rounded-lg flex items-center justify-center">
<Server className="w-5 h-5 text-purple-600" />
</div>
<div>
<p className="text-sm text-gray-500">K8s Worker Pods</p>
<p className="text-xl font-semibold">
{k8sReplicas.current} / {k8sReplicas.desired}
{k8sReplicas.current !== k8sReplicas.desired && (
<span className="text-sm font-normal text-yellow-600 ml-2">scaling...</span>
)}
</p>
</div>
</div>
<div className="flex items-center gap-2">
<button
onClick={() => handleScale((targetReplicas || k8sReplicas.desired) - 1)}
disabled={scaling || (targetReplicas || k8sReplicas.desired) <= 0}
className="w-8 h-8 flex items-center justify-center bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
title="Scale down"
>
<Minus className="w-4 h-4" />
</button>
<input
type="number"
min="0"
max="20"
value={targetReplicas ?? k8sReplicas.desired}
onChange={(e) => setTargetReplicas(Math.max(0, Math.min(20, parseInt(e.target.value) || 0)))}
onBlur={() => {
if (targetReplicas !== null && targetReplicas !== k8sReplicas.desired) {
handleScale(targetReplicas);
}
}}
onKeyDown={(e) => {
if (e.key === 'Enter' && targetReplicas !== null && targetReplicas !== k8sReplicas.desired) {
handleScale(targetReplicas);
}
}}
className="w-16 text-center border border-gray-300 rounded-lg px-2 py-1 text-lg font-semibold"
/>
<button
onClick={() => handleScale((targetReplicas || k8sReplicas.desired) + 1)}
disabled={scaling || (targetReplicas || k8sReplicas.desired) >= 20}
className="w-8 h-8 flex items-center justify-center bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
title="Scale up"
>
<Plus className="w-4 h-4" />
</button>
{scaling && <Loader2 className="w-4 h-4 text-purple-600 animate-spin ml-2" />}
</div>
</div>
{k8sError && (
<p className="text-xs text-red-500 mt-2">{k8sError}</p>
)}
</div>
)}
{/* Stats Cards */}
<div className="grid grid-cols-5 gap-4">
<div className="bg-white rounded-lg border border-gray-200 p-4">
@@ -510,6 +648,197 @@ export function WorkersDashboard() {
</div>
</div>
{/* Estimated Completion Time Card */}
{pendingTaskCount > 0 && activeWorkers.length > 0 && (() => {
// Calculate average task rate across all workers
const totalHoursUp = activeWorkers.reduce((sum, w) => {
if (!w.started_at) return sum;
const start = new Date(w.started_at);
const now = new Date();
return sum + (now.getTime() - start.getTime()) / (1000 * 60 * 60);
}, 0);
const totalTasksDone = totalCompleted + totalFailed;
const avgTasksPerHour = totalHoursUp > 0.1 ? totalTasksDone / totalHoursUp : 0;
const estimatedHours = avgTasksPerHour > 0 ? pendingTaskCount / avgTasksPerHour : null;
return (
<div className="bg-gradient-to-r from-amber-50 to-orange-50 rounded-lg border border-amber-200 p-4">
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-amber-100 rounded-lg flex items-center justify-center">
<Clock className="w-5 h-5 text-amber-600" />
</div>
<div>
<p className="text-sm text-amber-700 font-medium">Estimated Time to Complete Queue</p>
<p className="text-2xl font-bold text-amber-900">
{estimatedHours !== null ? formatEstimatedTime(estimatedHours) : 'Calculating...'}
</p>
</div>
</div>
<div className="text-right text-sm text-amber-700">
<p><span className="font-semibold">{pendingTaskCount}</span> pending tasks</p>
<p><span className="font-semibold">{activeWorkers.length}</span> active workers</p>
{avgTasksPerHour > 0 && (
<p className="text-xs text-amber-600 mt-1">
~{avgTasksPerHour.toFixed(1)} tasks/hour
</p>
)}
</div>
</div>
</div>
);
})()}
{/* Worker Pods Visualization */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50">
<div className="flex items-center justify-between">
<div>
<h3 className="text-sm font-semibold text-gray-900 flex items-center gap-2">
<Zap className="w-4 h-4 text-emerald-500" />
Worker Pods ({Array.from(groupWorkersByPod(workers)).length} pods, {activeWorkers.length} workers)
</h3>
<p className="text-xs text-gray-500 mt-0.5">
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-emerald-500"></span> idle</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-blue-500"></span> busy</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-yellow-500"></span> mixed</span>
<span className="mx-2">|</span>
<span className="inline-flex items-center gap-1"><span className="w-2 h-2 rounded-full bg-orange-500"></span> stopping</span>
</p>
</div>
<div className="text-sm text-gray-500">
{busyWorkers.length} busy, {activeWorkers.length - busyWorkers.length} idle
{selectedPod && (
<button
onClick={() => setSelectedPod(null)}
className="ml-3 text-xs text-purple-600 hover:text-purple-800 underline"
>
Clear selection
</button>
)}
</div>
</div>
</div>
{workers.length === 0 ? (
<div className="px-4 py-12 text-center text-gray-500">
<Users className="w-12 h-12 mx-auto mb-3 text-gray-300" />
<p className="font-medium">No worker pods running</p>
<p className="text-xs mt-1">Start pods to process tasks from the queue</p>
</div>
) : (
<div className="p-6">
<div className="flex flex-wrap justify-center gap-8">
{Array.from(groupWorkersByPod(workers)).map(([podName, podWorkers]) => (
<PodVisualization
key={podName}
podName={podName}
workers={podWorkers}
isSelected={selectedPod === podName}
onSelect={() => setSelectedPod(selectedPod === podName ? null : podName)}
/>
))}
</div>
{/* Selected Pod Control Panel */}
{selectedPod && (() => {
const podWorkers = groupWorkersByPod(workers).get(selectedPod) || [];
const busyInPod = podWorkers.filter(w => w.current_task_id !== null).length;
const idleInPod = podWorkers.filter(w => w.current_task_id === null && !w.decommission_requested).length;
const stoppingInPod = podWorkers.filter(w => w.decommission_requested).length;
return (
<div className="mt-6 border-t border-gray-200 pt-6">
<div className="bg-purple-50 rounded-lg border border-purple-200 p-4">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-purple-100 rounded-lg flex items-center justify-center">
<Server className="w-5 h-5 text-purple-600" />
</div>
<div>
<h4 className="font-semibold text-purple-900">{selectedPod}</h4>
<p className="text-xs text-purple-600">
{podWorkers.length} workers: {busyInPod} busy, {idleInPod} idle{stoppingInPod > 0 && `, ${stoppingInPod} stopping`}
</p>
</div>
</div>
</div>
{/* Worker list in selected pod */}
<div className="space-y-2">
{podWorkers.map((worker) => {
const isBusy = worker.current_task_id !== null;
const isDecommissioning = worker.decommission_requested;
return (
<div key={worker.id} className="flex items-center justify-between bg-white rounded-lg px-3 py-2 border border-purple-100">
<div className="flex items-center gap-3">
<div className={`w-8 h-8 rounded-full flex items-center justify-center text-white text-sm font-bold ${
isDecommissioning ? 'bg-orange-500' :
isBusy ? 'bg-blue-500' : 'bg-emerald-500'
}`}>
{worker.friendly_name?.charAt(0) || '?'}
</div>
<div>
<p className="text-sm font-medium text-gray-900">{worker.friendly_name}</p>
<p className="text-xs text-gray-500">
{isDecommissioning ? (
<span className="text-orange-600">Stopping after current task...</span>
) : isBusy ? (
<span className="text-blue-600">Working on task #{worker.current_task_id}</span>
) : (
<span className="text-emerald-600">Idle - ready for tasks</span>
)}
</p>
</div>
</div>
<div className="flex items-center gap-2">
{isDecommissioning ? (
<button
onClick={() => handleCancelDecommission(worker.worker_id)}
className="flex items-center gap-1.5 px-3 py-1.5 text-sm bg-white border border-gray-300 text-gray-700 rounded-lg hover:bg-gray-50 transition-colors"
title="Cancel decommission"
>
<Undo2 className="w-4 h-4" />
Cancel
</button>
) : (
<button
onClick={() => handleDecommissionWorker(worker.worker_id, worker.friendly_name)}
className="flex items-center gap-1.5 px-3 py-1.5 text-sm bg-orange-100 text-orange-700 rounded-lg hover:bg-orange-200 transition-colors"
title={isBusy ? 'Worker will stop after completing current task' : 'Remove idle worker'}
>
<PowerOff className="w-4 h-4" />
{isBusy ? 'Stop after task' : 'Remove'}
</button>
)}
</div>
</div>
);
})}
</div>
{/* Add Worker button */}
<div className="mt-4 pt-4 border-t border-purple-200">
<button
onClick={handleAddWorker}
className="flex items-center gap-1.5 px-3 py-2 text-sm bg-emerald-100 text-emerald-700 rounded-lg hover:bg-emerald-200 transition-colors"
>
<Plus className="w-4 h-4" />
Add Worker
</button>
</div>
</div>
</div>
);
})()}
</div>
)}
</div>
{/* Workers Table */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="px-4 py-3 border-b border-gray-200 bg-gray-50 flex items-center justify-between">
@@ -552,10 +881,10 @@ export function WorkersDashboard() {
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Worker</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Role</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Status</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Exit Location</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Current Task</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Resources</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Tasks</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Duration</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Utilization</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Throughput</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Heartbeat</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase"></th>
</tr>
@@ -570,16 +899,29 @@ export function WorkersDashboard() {
<tr key={worker.id} className="hover:bg-gray-50">
<td className="px-4 py-3">
<div className="flex items-center gap-3">
<div className={`w-10 h-10 rounded-full flex items-center justify-center text-white font-bold text-sm ${
<div className={`w-10 h-10 rounded-full flex items-center justify-center text-white font-bold text-sm relative ${
worker.decommission_requested ? 'bg-orange-500' :
worker.health_status === 'offline' ? 'bg-gray-400' :
worker.health_status === 'stale' ? 'bg-yellow-500' :
worker.health_status === 'busy' ? 'bg-blue-500' :
'bg-emerald-500'
}`}>
{worker.friendly_name?.charAt(0) || '?'}
{worker.decommission_requested && (
<div className="absolute -top-1 -right-1 w-4 h-4 bg-red-500 rounded-full flex items-center justify-center">
<PowerOff className="w-2.5 h-2.5 text-white" />
</div>
)}
</div>
<div>
<p className="font-medium text-gray-900">{worker.friendly_name}</p>
<p className="font-medium text-gray-900 flex items-center gap-1.5">
{worker.friendly_name}
{worker.decommission_requested && (
<span className="text-xs text-orange-600 bg-orange-100 px-1.5 py-0.5 rounded" title={worker.decommission_reason || 'Pending decommission'}>
stopping
</span>
)}
</p>
<p className="text-xs text-gray-400 font-mono">{worker.worker_id.slice(0, 20)}...</p>
</div>
</div>
@@ -591,45 +933,10 @@ export function WorkersDashboard() {
<HealthBadge status={worker.status} healthStatus={worker.health_status} />
</td>
<td className="px-4 py-3">
{(() => {
const loc = worker.metadata?.proxy_location;
if (!loc) {
return <span className="text-gray-400 text-sm">-</span>;
}
const parts = [loc.city, loc.state, loc.country].filter(Boolean);
if (parts.length === 0) {
return loc.isRotating ? (
<span className="text-xs text-purple-600 font-medium" title="Rotating proxy - exit location varies per request">
Rotating
</span>
) : (
<span className="text-gray-400 text-sm">Unknown</span>
);
}
return (
<div className="flex items-center gap-1.5" title={loc.timezone || ''}>
<MapPin className="w-3 h-3 text-gray-400" />
<span className="text-sm text-gray-700">
{parts.join(', ')}
</span>
{loc.isRotating && (
<span className="text-xs text-purple-500" title="Rotating proxy">*</span>
)}
</div>
);
})()}
<ResourceBadge worker={worker} />
</td>
<td className="px-4 py-3">
{worker.current_task_id ? (
<div>
<span className="text-sm text-gray-900">Task #{worker.current_task_id}</span>
{currentTask?.dispensary_name && (
<p className="text-xs text-gray-500">{currentTask.dispensary_name}</p>
)}
</div>
) : (
<span className="text-gray-400 text-sm">Idle</span>
)}
<TaskCountBadge worker={worker} tasks={tasks} />
</td>
<td className="px-4 py-3">
{currentTask?.started_at ? (

36
k8s/scraper-rbac.yaml Normal file
View File

@@ -0,0 +1,36 @@
# RBAC configuration for scraper pod to control worker scaling
# Allows the scraper to read and scale the scraper-worker statefulset
apiVersion: v1
kind: ServiceAccount
metadata:
name: scraper-sa
namespace: dispensary-scraper
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: worker-scaler
namespace: dispensary-scraper
rules:
# Allow reading deployment and statefulset status
- apiGroups: ["apps"]
resources: ["deployments", "statefulsets"]
verbs: ["get", "list"]
# Allow scaling deployments and statefulsets
- apiGroups: ["apps"]
resources: ["deployments/scale", "statefulsets/scale"]
verbs: ["get", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: scraper-worker-scaler
namespace: dispensary-scraper
subjects:
- kind: ServiceAccount
name: scraper-sa
namespace: dispensary-scraper
roleRef:
kind: Role
name: worker-scaler
apiGroup: rbac.authorization.k8s.io

View File

@@ -12,7 +12,7 @@ metadata:
name: scraper-worker
namespace: dispensary-scraper
spec:
replicas: 5
replicas: 25
selector:
matchLabels:
app: scraper-worker

View File

@@ -25,6 +25,7 @@ spec:
labels:
app: scraper
spec:
serviceAccountName: scraper-sa
imagePullSecrets:
- name: regcred
containers:

365
workflow-12102025.md Normal file
View File

@@ -0,0 +1,365 @@
# Workflow Documentation - December 10, 2025
## Purpose
This document captures the intended behavior for the CannaiQ crawl system, specifically around proxy rotation, fingerprinting, and anti-detection.
---
## Stealth & Anti-Detection Requirements
### 1. Task Determines Work, Proxy Determines Identity
The task payload contains:
- `dispensary_id` - which store to crawl
- `role` - what type of work (product_resync, entry_point_discovery, etc.)
The **proxy** determines the session identity:
- Proxy location (city, state, timezone) → sets Accept-Language and timezone headers
- Language is always English (`en-US`)
**Flow:**
```
Task claimed
└─► Get proxy from rotation
└─► Proxy has location (city, state, timezone)
└─► Build headers using proxy's timezone
- Accept-Language: en-US,en;q=0.9
- Timezone-consistent behavior
```
### 2. On 403 Block - Immediate Backoff
When a 403 is received:
1. **Immediately** stop using current IP
2. Get a new proxy (new IP)
3. Get a new UA/fingerprint
4. Retry the request
**Per-proxy failure tracking:**
- Track UA rotation attempts per proxy
- After 3 UA/fingerprint rotations on the same proxy → disable that proxy
- This means: if we rotate UA 3 times and still get 403, the proxy is burned
### 3. Fingerprint Rotation Rules
Each request uses:
- Proxy (IP)
- User-Agent
- sec-ch-ua headers (Client Hints)
- Accept-Language (from proxy location)
On 403:
1. Record failure on current proxy
2. Rotate to new proxy
3. Pick new random fingerprint
4. If same proxy fails 3 times with different fingerprints → disable proxy
### 4. Proxy Table Schema
```sql
CREATE TABLE proxies (
id SERIAL PRIMARY KEY,
host VARCHAR(255) NOT NULL,
port INTEGER NOT NULL,
username VARCHAR(100),
password VARCHAR(100),
protocol VARCHAR(10) DEFAULT 'http',
active BOOLEAN DEFAULT true,
-- Location (determines session headers)
city VARCHAR(100),
state VARCHAR(50),
country VARCHAR(100),
country_code VARCHAR(10),
timezone VARCHAR(50),
-- Health tracking
failure_count INTEGER DEFAULT 0,
consecutive_403_count INTEGER DEFAULT 0, -- Track 403s specifically
last_used_at TIMESTAMPTZ,
last_failure_at TIMESTAMPTZ,
last_error TEXT,
-- Performance
response_time_ms INTEGER,
max_connections INTEGER DEFAULT 1
);
```
### 5. Failure Threshold
- **3 consecutive 403s** with different fingerprints → disable proxy
- Reset `consecutive_403_count` to 0 on successful request
- General `failure_count` tracks all errors (timeouts, connection errors, etc.)
---
## Implementation Status
### COMPLETED - December 10, 2025
All code changes have been implemented per this specification:
#### 1. crawl-rotator.ts ✅
- [x] Added `consecutive403Count` to Proxy interface
- [x] Added `markBlocked()` method that increments `consecutive_403_count` and disables proxy at 3
- [x] Added `getProxyTimezone()` to return current proxy's timezone
- [x] `markSuccess()` now resets `consecutive_403_count` to 0
- [x] Replaced hardcoded UA list with `intoli/user-agents` library for realistic fingerprints
- [x] `BrowserFingerprint` interface includes full fingerprint data (UA, platform, screen size, viewport, sec-ch-ua headers)
#### 2. client.ts ✅
- [x] `startSession()` no longer takes state/timezone params
- [x] `startSession()` gets identity from proxy via `crawlRotator.getProxyLocation()`
- [x] Added `handle403Block()` that:
- Calls `crawlRotator.recordBlock()` (tracks consecutive 403s)
- Immediately rotates both proxy and fingerprint via `rotateBoth()`
- Returns false if no more proxies available
- [x] `executeGraphQL()` calls `handle403Block()` on 403 (not `rotateProxyOn403`)
- [x] `fetchPage()` uses same 403 handling
- [x] 500ms backoff after rotation (not linear delay)
#### 3. Task Handlers ✅
- [x] `entry-point-discovery.ts`: `startSession()` called with no params
- [x] `product-refresh.ts`: `startSession()` called with no params
#### 4. Dependencies ✅
- [x] Added `user-agents` npm package for realistic UA generation
---
## Files Changed
| File | Changes |
|------|---------|
| `backend/src/services/crawl-rotator.ts` | Complete rewrite with `consecutive403Count`, `markBlocked()`, `intoli/user-agents` |
| `backend/src/platforms/dutchie/client.ts` | `startSession()` uses proxy location, `handle403Block()` for 403 handling |
| `backend/src/tasks/handlers/entry-point-discovery.ts` | `startSession()` no params |
| `backend/src/tasks/handlers/product-refresh.ts` | `startSession()` no params |
| `backend/package.json` | Added `user-agents` dependency |
---
## Migration Required
The `proxies` table needs `consecutive_403_count` column if not already present:
```sql
ALTER TABLE proxies ADD COLUMN IF NOT EXISTS consecutive_403_count INTEGER DEFAULT 0;
```
---
## Key Behaviors Summary
| Behavior | Implementation |
|----------|----------------|
| Session identity | From proxy location (`getProxyLocation()`) |
| Language | Always `en-US,en;q=0.9` |
| 403 handling | `handle403Block()``recordBlock()``rotateBoth()` |
| Proxy disable | After 3 consecutive 403s (`consecutive403Count >= 3`) |
| Success reset | `markSuccess()` resets `consecutive403Count` to 0 |
| UA generation | `intoli/user-agents` library (daily updated, realistic fingerprints) |
| Fingerprint data | Full: UA, platform, screen size, viewport, sec-ch-ua headers |
---
## User-Agent Generation
### Data Source
The `intoli/user-agents` npm library provides daily-updated market share data collected from Intoli's residential proxy network (millions of real users). The package auto-releases new versions daily to npm.
### Device Category Distribution (hardcoded)
| Category | Share |
|----------|-------|
| Mobile | 62% |
| Desktop | 36% |
| Tablet | 2% |
### Browser Filter (whitelist only)
Only these browsers are allowed:
- Chrome (67%)
- Safari (20%)
- Edge (6%)
- Firefox (3%)
Samsung Internet, Opera, and other niche browsers are filtered out.
### Desktop OS Distribution (from library)
| OS | Share |
|----|-------|
| Windows | 72% |
| macOS | 17% |
| Linux | 4% |
### UA Lifecycle
1. **Session start** (new proxy IP obtained) → Roll device category (62/36/2) → Generate UA filtered to device + top 4 browsers → Store on session
2. **UA sticks** until IP rotates (403 block or manual rotation)
3. **IP rotation** triggers new UA generation
### Failure Handling
- If UA generation fails → Alert admin dashboard, **stop crawl immediately**
- No fallback to static UA list
- This forces investigation rather than silent degradation
### Session Logging
Each session logs:
- Device category (mobile/desktop/tablet)
- Full UA string
- Browser name (Chrome/Safari/Edge/Firefox)
- IP address (from proxy)
- Session start timestamp
Logs are rotated monthly.
### Implementation
Located in `backend/src/services/crawl-rotator.ts`:
```typescript
// Per workflow-12102025.md: Device category distribution
const DEVICE_WEIGHTS = { mobile: 62, desktop: 36, tablet: 2 };
// Per workflow-12102025.md: Browser whitelist
const ALLOWED_BROWSERS = ['Chrome', 'Safari', 'Edge', 'Firefox'];
```
---
## HTTP Fingerprinting
### Goal
Make HTTP requests indistinguishable from real browser traffic. No repeatable footprint.
### Components
1. **Full Header Set** - All headers a real browser sends
2. **Header Ordering** - Browser-specific order (Chrome vs Firefox vs Safari)
3. **TLS Fingerprint** - Use `curl-impersonate` to match browser TLS signature
4. **Dynamic Referer** - Set per dispensary being crawled
5. **Natural Randomization** - Vary optional headers like real users
### Required Headers
| Header | Chrome | Firefox | Safari | Notes |
|--------|--------|---------|--------|-------|
| `User-Agent` | ✅ | ✅ | ✅ | From UA generation |
| `Accept` | ✅ | ✅ | ✅ | Content types |
| `Accept-Language` | ✅ | ✅ | ✅ | Always `en-US,en;q=0.9` |
| `Accept-Encoding` | ✅ | ✅ | ✅ | `gzip, deflate, br` |
| `Connection` | ✅ | ✅ | ✅ | `keep-alive` |
| `Origin` | ✅ | ✅ | ✅ | `https://dutchie.com` (POST only) |
| `Referer` | ✅ | ✅ | ✅ | Dynamic per dispensary |
| `sec-ch-ua` | ✅ | ❌ | ❌ | Chromium only |
| `sec-ch-ua-mobile` | ✅ | ❌ | ❌ | Chromium only |
| `sec-ch-ua-platform` | ✅ | ❌ | ❌ | Chromium only |
| `sec-fetch-dest` | ✅ | ✅ | ❌ | `empty` for XHR |
| `sec-fetch-mode` | ✅ | ✅ | ❌ | `cors` for XHR |
| `sec-fetch-site` | ✅ | ✅ | ❌ | `same-origin` |
| `Upgrade-Insecure-Requests` | ✅ | ✅ | ✅ | `1` (page loads only) |
| `DNT` | ~30% | ~30% | ~30% | Randomized per session |
### Header Ordering
Each browser sends headers in a specific order. Fingerprinting services detect mismatches.
**Chrome order (GraphQL request):**
1. Host
2. Connection
3. Content-Length (POST)
4. sec-ch-ua
5. DNT (if enabled)
6. sec-ch-ua-mobile
7. User-Agent
8. sec-ch-ua-platform
9. Content-Type (POST)
10. Accept
11. Origin (POST)
12. sec-fetch-site
13. sec-fetch-mode
14. sec-fetch-dest
15. Referer
16. Accept-Encoding
17. Accept-Language
**Firefox order (GraphQL request):**
1. Host
2. User-Agent
3. Accept
4. Accept-Language
5. Accept-Encoding
6. Content-Type (POST)
7. Content-Length (POST)
8. Origin (POST)
9. DNT (if enabled)
10. Connection
11. Referer
12. sec-fetch-dest
13. sec-fetch-mode
14. sec-fetch-site
**Safari order (GraphQL request):**
1. Host
2. Connection
3. Content-Length (POST)
4. Accept
5. User-Agent
6. Content-Type (POST)
7. Origin (POST)
8. Referer
9. Accept-Encoding
10. Accept-Language
### TLS Fingerprinting
Use `curl-impersonate` instead of standard curl:
- `curl_chrome131` - Mimics Chrome 131 TLS handshake
- `curl_ff133` - Mimics Firefox 133 TLS handshake
- `curl_safari17` - Mimics Safari 17 TLS handshake
Match TLS binary to browser in UA.
### Dynamic Referer
Set Referer to the dispensary's actual page URL:
```
Crawling "harvest-of-tempe" → Referer: https://dutchie.com/dispensary/harvest-of-tempe
Crawling "zen-leaf-mesa" → Referer: https://dutchie.com/dispensary/zen-leaf-mesa
```
Derived from dispensary's `menu_url` field.
### Natural Randomization
Per-session randomization (set once when session starts, consistent for session):
| Feature | Distribution | Implementation |
|---------|--------------|----------------|
| DNT header | 30% have it | `Math.random() < 0.30` |
| Accept quality values | Slight variation | `q=0.9` vs `q=0.8` |
### Implementation Files
| File | Purpose |
|------|---------|
| `src/services/crawl-rotator.ts` | `BrowserFingerprint` includes full header config |
| `src/platforms/dutchie/client.ts` | Build headers from fingerprint, use curl-impersonate |
| `src/services/http-fingerprint.ts` | Header ordering per browser (NEW) |