Compare commits

...

37 Commits

Author SHA1 Message Date
Kelly
aea93bc96b fix(ci): Revert volume caching - may have broken CI trigger 2025-12-10 08:53:10 -07:00
Kelly
4e84f30f8b feat: Auto-retry tasks, 403 proxy rotation, task deletion
- Fix 403 handler to rotate BOTH proxy and fingerprint (was only fingerprint)
- Add auto-retry logic to task service (retry up to max_retries before failing)
- Add error tooltip on task status badge showing retry count and error message
- Add DELETE /api/tasks/:id endpoint (only for non-running tasks)
- Add delete button to JobQueue task table

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 08:41:14 -07:00
Kelly
b20a0a4fa5 fix: Add generic delete method to ApiClient + CI speedups
- Add delete<T>() method to ApiClient for WorkersDashboard cleanup
- Add npm cache volume for faster npm ci
- Add TypeScript incremental builds with tsBuildInfoFile cache
- Should significantly speed up repeated CI runs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 08:27:11 -07:00
Kelly
6eb1babc86 feat: Auto-migrations on startup, worker exit location, proxy improvements
- Add auto-migration system that runs SQL files from migrations/ on server startup
- Track applied migrations in schema_migrations table
- Show proxy exit location in Workers dashboard
- Add "Cleanup Stale" button to remove old workers
- Add remove button for individual workers
- Include proxy location (city, state, country) in worker heartbeats
- Update Proxy interface with location fields
- Re-enable bulk proxy import without ON CONFLICT

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 08:05:24 -07:00
kelly
9a9c2f76a2 Merge pull request 'feat: Stealth worker system with mandatory proxy rotation' (#10) from feat/stealth-worker-system into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/10
2025-12-10 08:13:42 +00:00
Kelly
56cc171287 feat: Stealth worker system with mandatory proxy rotation
## Worker System
- Role-agnostic workers that can handle any task type
- Pod-based architecture with StatefulSet (5-15 pods, 5 workers each)
- Custom pod names (Aethelgard, Xylos, Kryll, etc.)
- Worker registry with friendly names and resource monitoring
- Hub-and-spoke visualization on JobQueue page

## Stealth & Anti-Detection (REQUIRED)
- Proxies are MANDATORY - workers fail to start without active proxies
- CrawlRotator initializes on worker startup
- Loads proxies from `proxies` table
- Auto-rotates proxy + fingerprint on 403 errors
- 12 browser fingerprints (Chrome, Firefox, Safari, Edge)
- Locale/timezone matching for geographic consistency

## Task System
- Renamed product_resync → product_refresh
- Task chaining: store_discovery → entry_point → product_discovery
- Priority-based claiming with FOR UPDATE SKIP LOCKED
- Heartbeat and stale task recovery

## UI Updates
- JobQueue: Pod visualization, resource monitoring on hover
- WorkersDashboard: Simplified worker list
- Removed unused filters from task list

## Other
- IP2Location service for visitor analytics
- Findagram consumer features scaffolding
- Documentation updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 00:44:59 -07:00
Kelly
0295637ed6 fix: Public API column mappings and OOS detection
- Fix store_products column references (name_raw, brand_name_raw, category_raw)
- Fix v_product_snapshots column references (crawled_at, *_cents pricing)
- Fix dispensaries column references (zipcode, logo_image, remove hours/amenities)
- Add services and license_type to dispensary API response
- Add consecutive_misses OOS tracking to product-resync handler
- Add migration 075 for consecutive_misses column
- Add CRAWL_PIPELINE.md documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 20:44:53 -07:00
Kelly
9c6dd37316 fix(ci): Use YAML list format for docker-buildx build_args
The woodpecker docker-buildx plugin requires build_args as a YAML list,
not a comma-separated string. This fixes the build version/hash not being
passed to the Docker image.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 18:03:50 -07:00
kelly
524d13209a Merge pull request 'fix: Remove legacy imports from task handlers' (#9) from fix/task-handler-typescript-errors into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/9
2025-12-10 00:42:39 +00:00
Kelly
9199db3927 fix: Remove legacy imports from task handlers
- Remove non-existent DutchieClient import from product-resync and entry-point-discovery
- Remove non-existent DiscoveryCrawler import from store-discovery
- Use scrapeStore from scraper-v2 for product resync
- Use discoverState from discovery module for store discovery
- Fix Pool type by using getPool() instead of pool wrapper
- Update FullDiscoveryResult property access to use correct field names

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 17:25:19 -07:00
Kelly
a0652c7c73 fix(types): Fix TypeScript errors in TasksDashboard, Layout, and Users
- Fix TaskCounts type in api.ts to match TasksDashboard interface
- Make VersionInfo.version optional in Layout.tsx
- Fix boolean type in Users.tsx disabled prop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 17:02:40 -07:00
Kelly
89c262ee20 feat(tasks): Add unified task-based worker architecture
Replace fragmented job systems (job_schedules, dispensary_crawl_jobs, SyncOrchestrator)
with a single unified task queue:

- Add worker_tasks table with atomic task claiming via SELECT FOR UPDATE SKIP LOCKED
- Add TaskService for CRUD, claiming, and capacity metrics
- Add TaskWorker with role-based handlers (resync, discovery, analytics)
- Add /api/tasks endpoints for management and migration from legacy systems
- Add TasksDashboard UI and integrate task counts into main dashboard
- Add comprehensive documentation

Task roles: store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh

Run workers with: WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 16:27:03 -07:00
Kelly
7f9cf559cf fix(k8s): Update worker deployment to use v2 hydration worker
The old dutchie-az/services/worker.js no longer exists. Workers now use
the hydration pipeline at dist/scripts/run-hydration.js with --loop mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 15:01:18 -07:00
Kelly
bbe039c868 feat(api): Add job queue management endpoints and fix SQL type errors
- Add GET /api/job-queue/available - list dispensaries available for crawling
- Add GET /api/job-queue/history - get recent job history with results
- Add POST /api/job-queue/enqueue-batch - queue multiple dispensaries at once
- Add POST /api/job-queue/enqueue-state - queue all crawl-enabled dispensaries for a state
- Add POST /api/job-queue/clear-pending - clear pending jobs with optional filters
- Fix SQL parameter type errors by adding explicit casts ($2::text, $3::integer)
- Fix route ordering to prevent /:id from matching /available and /history

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 14:10:55 -07:00
Kelly
4e5c09a2a5 chore(dashboard): Remove DeployStatus block
Version info already shown in Layout sidebar header.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 14:10:22 -07:00
Kelly
7f65598332 feat(admin): Show version info at top of sidebar
- Add package.json version to /api/version endpoint
- Move version display from footer to top (next to logo)
- Show format: v1.5.1 (abc1234) - 12/9/2024

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 13:58:36 -07:00
Kelly
75315ed91e fix(ci): Use comma-separated build_args for docker-buildx plugin
The docker-buildx plugin expects build_args as a comma-separated string,
not a YAML list. This should fix the build_sha/build_time being null.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 13:56:37 -07:00
Kelly
7fe7d17b43 fix(consumer): Use relative API URLs for findadispo/findagram
The consumer frontends were hardcoded to use cannaiq.co as the API
URL, but each domain has its own /api path in the ingress that routes
to the shared backend. Using relative URLs allows each site to make
API calls to its own domain.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 13:38:10 -07:00
Kelly
7e517b5801 ci: Use self-hosted base images to avoid Docker Hub rate limits
Cached node:20, node:20-slim, and nginx:alpine to code.cannabrands.app.
No more Docker Hub dependency for builds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 13:07:21 -07:00
Kelly
38ba9021d1 ci: Retry build (Docker Hub rate limit) 2025-12-09 12:58:36 -07:00
Kelly
ddebad48d3 ci: Remove auto-migrations from deploy step
Database was restored from backup - no migrations needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:52:04 -07:00
Kelly
1cebf2e296 fix(health): Add build_sha and build_time to health endpoint
Reads APP_GIT_SHA and APP_BUILD_TIME env vars set during Docker build.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:22:52 -07:00
Kelly
1d6e67d837 feat(api): Add store metrics endpoints with localhost bypass
New public API v1 endpoints for third-party integrations:
- GET /api/v1/stores/:id/metrics - Store performance metrics
- GET /api/v1/stores/:id/product-metrics - Product-level price changes
- GET /api/v1/stores/:id/competitor-snapshot - Competitive intelligence

Also adds localhost IP bypass for local development testing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:14:13 -07:00
Kelly
cfb4b6e4ce fix(cannaiq): Fix TypeScript error in DeployStatus component
Properly destructure api.get response which returns { data: T }

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:08:29 -07:00
Kelly
f418c403d6 feat(auth): Add *.cannabrands.app to trusted origins whitelist
Adds pattern-based origin matching to support wildcard subdomains.
All *.cannabrands.app origins now bypass API key authentication.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:06:14 -07:00
Kelly
be4221af46 ci: Retrigger build 2025-12-09 11:54:16 -07:00
Kelly
ca07606b05 feat(k8s): Add Redis deployment for production
- Add k8s/redis.yaml with Redis 7 Alpine deployment
- Add REDIS_HOST and REDIS_PORT to configmap
- Redis configured with 200MB max memory and LRU eviction
- 1GB persistent volume for data persistence

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:40:11 -07:00
Kelly
baf1bf2eb7 fix(health): Require Redis in production, optional in local
Redis health check now returns error status when not configured in
production/staging environments, but remains optional in local dev.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:38:49 -07:00
Kelly
4ef3a8d72b fix(build): Fix TypeScript errors breaking CI build
- Add missing 'original' property to LocalImageSizes in brand logo download
- Remove test scripts with type errors (test-image-download.ts, test-stealth-with-db.ts)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:36:28 -07:00
Kelly
09dd756eff feat(admin): Add deploy status panel to dashboard
Shows running version vs latest git commit, pipeline status with steps,
and how many commits behind if not on latest. Uses Woodpecker and Gitea
APIs to fetch CI/CD information. Auto-refreshes every 30 seconds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:26:41 -07:00
Kelly
ec8ef6210c ci: Run migrations inside K8s cluster after deploy
DB is internal to the cluster, so migrations must run via kubectl exec
into the scraper pod after deployment completes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:16:21 -07:00
Kelly
a9b7a4d7a9 ci: Add proper SQL migration runner with tracking
- Creates run-migrations.ts that reads migrations/*.sql files
- Tracks applied migrations in schema_migrations table by filename
- Handles existing version-based schema by adding filename column
- CI now runs migrations before deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:12:50 -07:00
Kelly
5119d5ccf9 ci: Add migration step before deploy
Migrations now run automatically after Docker builds but before K8s deploy.
Requires DATABASE_URL secret to be configured in Woodpecker.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:09:49 -07:00
Kelly
91efd1d03d feat(images): Add local image storage with on-demand resizing
- Store product images locally with hierarchy: /images/products/<state>/<store>/<brand>/<product>/
- Add /img/* proxy endpoint for on-demand resizing via Sharp
- Implement per-product image checking to skip existing downloads
- Fix pathToUrl() to correctly generate /images/... URLs
- Add frontend getImageUrl() helper with preset sizes (thumb, medium, large)
- Update all product pages to use optimized image URLs
- Add stealth session support for Dutchie GraphQL crawls
- Include test scripts for crawl and image verification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:04:50 -07:00
Kelly
aa776226b0 fix(consumer): Wire findagram/findadispo to public API
- Update Dockerfiles to use cannaiq.co as API base URL
- Change findagram API client from /api/az to /api/v1 endpoints
- Add trusted origin bypass in public-api middleware for consumer sites
- Consumer sites (findagram.co, findadispo.com) can now access /api/v1
  endpoints without API key authentication

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:04:50 -07:00
kelly
e9435150e9 Merge pull request 'feature/wp-plugin-versioning-and-fixes' (#7) from feature/wp-plugin-versioning-and-fixes into master 2025-12-09 17:15:33 +00:00
Kelly
d399b966e6 ci: trigger build 2025-12-09 10:03:29 -07:00
109 changed files with 16413 additions and 2336 deletions

View File

@@ -6,7 +6,7 @@ steps:
# PR VALIDATION: Parallel type checks (PRs only)
# ===========================================
typecheck-backend:
image: node:20
image: code.cannabrands.app/creationshop/node:20
commands:
- cd backend
- npm ci --prefer-offline
@@ -16,7 +16,7 @@ steps:
event: pull_request
typecheck-cannaiq:
image: node:20
image: code.cannabrands.app/creationshop/node:20
commands:
- cd cannaiq
- npm ci --prefer-offline
@@ -26,7 +26,7 @@ steps:
event: pull_request
typecheck-findadispo:
image: node:20
image: code.cannabrands.app/creationshop/node:20
commands:
- cd findadispo/frontend
- npm ci --prefer-offline
@@ -36,7 +36,7 @@ steps:
event: pull_request
typecheck-findagram:
image: node:20
image: code.cannabrands.app/creationshop/node:20
commands:
- cd findagram/frontend
- npm ci --prefer-offline
@@ -65,7 +65,7 @@ steps:
platforms: linux/amd64
provenance: false
build_args:
- APP_BUILD_VERSION=${CI_COMMIT_SHA:0:8}
- APP_BUILD_VERSION=${CI_COMMIT_SHA}
- APP_GIT_SHA=${CI_COMMIT_SHA}
- APP_BUILD_TIME=${CI_PIPELINE_CREATED}
- CONTAINER_IMAGE_TAG=${CI_COMMIT_SHA:0:8}
@@ -138,7 +138,7 @@ steps:
event: push
# ===========================================
# STAGE 3: Deploy (after all Docker builds)
# STAGE 3: Deploy (after Docker builds)
# ===========================================
deploy:
image: bitnami/kubectl:latest

View File

@@ -213,22 +213,23 @@ CannaiQ has **TWO databases** with distinct purposes:
| Table | Purpose | Row Count |
|-------|---------|-----------|
| `dispensaries` | Store/dispensary records | ~188+ rows |
| `dutchie_products` | Product catalog | ~37,000+ rows |
| `dutchie_product_snapshots` | Price/stock history | ~millions |
| `store_products` | Canonical product schema | ~37,000+ rows |
| `store_product_snapshots` | Canonical snapshot schema | growing |
| `store_products` | Product catalog | ~37,000+ rows |
| `store_product_snapshots` | Price/stock history | ~millions |
**LEGACY TABLES (EMPTY - DO NOT USE):**
| Table | Status | Action |
|-------|--------|--------|
| `stores` | EMPTY (0 rows) | Use `dispensaries` instead |
| `products` | EMPTY (0 rows) | Use `dutchie_products` or `store_products` |
| `products` | EMPTY (0 rows) | Use `store_products` instead |
| `dutchie_products` | LEGACY (0 rows) | Use `store_products` instead |
| `dutchie_product_snapshots` | LEGACY (0 rows) | Use `store_product_snapshots` instead |
| `categories` | EMPTY (0 rows) | Categories stored in product records |
**Code must NEVER:**
- Query the `stores` table (use `dispensaries`)
- Query the `products` table (use `dutchie_products` or `store_products`)
- Query the `products` table (use `store_products`)
- Query the `dutchie_products` table (use `store_products`)
- Query the `categories` table (categories are in product records)
**CRITICAL RULES:**
@@ -343,23 +344,23 @@ npx tsx src/scripts/etl/042_legacy_import.ts
- SCHEMA ONLY - no data inserts from legacy tables
**ETL Script 042** (`backend/src/scripts/etl/042_legacy_import.ts`):
- Copies data from `dutchie_products` → `store_products`
- Copies data from `dutchie_product_snapshots` → `store_product_snapshots`
- Copies data from legacy `dutchie_legacy.dutchie_products` → `store_products`
- Copies data from legacy `dutchie_legacy.dutchie_product_snapshots` → `store_product_snapshots`
- Extracts brands from product data into `brands` table
- Links dispensaries to chains and states
- INSERT-ONLY and IDEMPOTENT (uses ON CONFLICT DO NOTHING)
- Run manually: `cd backend && npx tsx src/scripts/etl/042_legacy_import.ts`
**Tables touched by ETL:**
| Source Table | Target Table |
|--------------|--------------|
| Source Table (dutchie_legacy) | Target Table (dutchie_menus) |
|-------------------------------|------------------------------|
| `dutchie_products` | `store_products` |
| `dutchie_product_snapshots` | `store_product_snapshots` |
| (brand names extracted) | `brands` |
| (state codes mapped) | `dispensaries.state_id` |
| (chain names matched) | `dispensaries.chain_id` |
**Legacy tables remain intact** - `dutchie_products` and `dutchie_product_snapshots` are not modified.
**Note:** The legacy `dutchie_products` and `dutchie_product_snapshots` tables in `dutchie_legacy` are read-only sources. All new crawl data goes directly to `store_products` and `store_product_snapshots`.
**Migration 045** (`backend/migrations/045_add_image_columns.sql`):
- Adds `thumbnail_url` to `store_products` and `store_product_snapshots`
@@ -459,15 +460,66 @@ const result = await pool.query(`
### Local Storage Structure
```
/storage/products/{brand}/{state}/{product_id}/
/storage/images/products/{state}/{store}/{brand}/{product}/
image-{hash}.webp
image-{hash}-medium.webp
image-{hash}-thumb.webp
/storage/brands/{brand}/
/storage/images/brands/{brand}/
logo-{hash}.webp
```
### Image Proxy API (On-Demand Resizing)
Images are stored at full resolution and resized on-demand via the `/img` endpoint.
**Endpoint:** `GET /img/<path>?<params>`
**Parameters:**
| Param | Description | Example |
|-------|-------------|---------|
| `w` | Width in pixels (max 4000) | `?w=200` |
| `h` | Height in pixels (max 4000) | `?h=200` |
| `q` | Quality 1-100 (default 80) | `?q=70` |
| `fit` | Resize mode: cover, contain, fill, inside, outside | `?fit=cover` |
| `blur` | Blur sigma 0.3-1000 | `?blur=5` |
| `gray` | Grayscale (1 = enabled) | `?gray=1` |
| `format` | Output: webp, jpeg, png, avif (default webp) | `?format=jpeg` |
**Examples:**
```bash
# Thumbnail (50px)
GET /img/products/az/store/brand/product/image-abc123.webp?w=50
# Card image (200px, cover fit)
GET /img/products/az/store/brand/product/image-abc123.webp?w=200&h=200&fit=cover
# JPEG at 70% quality
GET /img/products/az/store/brand/product/image-abc123.webp?w=400&format=jpeg&q=70
# Grayscale blur
GET /img/products/az/store/brand/product/image-abc123.webp?w=200&gray=1&blur=3
```
**Frontend Usage:**
```typescript
import { getImageUrl, ImageSizes } from '../lib/images';
// Returns /img/products/.../image.webp?w=50 for local images
// Returns original URL for remote images (CDN, etc.)
const thumbUrl = getImageUrl(product.image_url, ImageSizes.thumb);
const cardUrl = getImageUrl(product.image_url, ImageSizes.medium);
const detailUrl = getImageUrl(product.image_url, ImageSizes.detail);
```
**Size Presets:**
| Preset | Width | Use Case |
|--------|-------|----------|
| `thumb` | 50px | Table thumbnails |
| `small` | 100px | Small cards |
| `medium` | 200px | Grid cards |
| `large` | 400px | Large cards |
| `detail` | 600px | Product detail |
| `full` | - | No resize |
### Storage Adapter
```typescript
@@ -480,8 +532,9 @@ import { saveImage, getImageUrl } from '../utils/storage-adapter';
| File | Purpose |
|------|---------|
| `backend/src/utils/local-storage.ts` | Local filesystem adapter |
| `backend/src/utils/storage-adapter.ts` | Unified storage abstraction |
| `backend/src/utils/image-storage.ts` | Image download and storage |
| `backend/src/routes/image-proxy.ts` | On-demand image resizing endpoint |
| `cannaiq/src/lib/images.ts` | Frontend image URL helper |
| `docker-compose.local.yml` | Local stack without MinIO |
| `start-local.sh` | Convenience startup script |
@@ -829,7 +882,7 @@ export default defineConfig({
18) **Dashboard Architecture**
- **Frontend**: Rebuild the frontend with `VITE_API_URL` pointing to the correct backend and redeploy.
- **Backend**: `/api/dashboard/stats` MUST use the canonical DB pool. Use the correct tables: `dutchie_products`, `dispensaries`, and views like `v_dashboard_stats`, `v_latest_snapshots`.
- **Backend**: `/api/dashboard/stats` MUST use the canonical DB pool. Use the correct tables: `store_products`, `dispensaries`, and views like `v_dashboard_stats`, `v_latest_snapshots`.
19) **Deployment (Gitea + Kubernetes)**
- **Registry**: Gitea at `code.cannabrands.app/creationshop/dispensary-scraper`

backend/.gitignore vendored Normal file
View File

@@ -0,0 +1,3 @@
# IP2Location database (downloaded separately)
data/ip2location/

View File

@@ -1,6 +1,6 @@
# Build stage
# Image: code.cannabrands.app/creationshop/dispensary-scraper
FROM node:20-slim AS builder
FROM code.cannabrands.app/creationshop/node:20-slim AS builder
WORKDIR /app
@@ -11,7 +11,7 @@ COPY . .
RUN npm run build
# Production stage
FROM node:20-slim
FROM code.cannabrands.app/creationshop/node:20-slim
# Build arguments for version info
ARG APP_BUILD_VERSION=dev

View File

@@ -0,0 +1,538 @@
# Crawl Pipeline Documentation
## Overview
The crawl pipeline fetches product data from Dutchie dispensary menus and stores it in the canonical database. This document covers the complete flow from task scheduling to data storage.
---
## Pipeline Stages
```
┌────────────────────────┐
│ store_discovery        │  Find new dispensaries
└──────────┬─────────────┘
           ▼
┌────────────────────────┐
│ entry_point_discovery  │  Resolve slug → platform_dispensary_id
└──────────┬─────────────┘
           ▼
┌────────────────────────┐
│ product_discovery      │  Initial product crawl
└──────────┬─────────────┘
           ▼
┌────────────────────────┐
│ product_resync         │  Recurring crawl (every 4 hours)
└────────────────────────┘
```
---
## Stage Details
### 1. Store Discovery
**Purpose:** Find new dispensaries to crawl
**Handler:** `src/tasks/handlers/store-discovery.ts`
**Flow:**
1. Query Dutchie `ConsumerDispensaries` GraphQL for cities/states
2. Extract dispensary info (name, address, menu_url)
3. Insert into `dutchie_discovery_locations`
4. Queue `entry_point_discovery` for each new location
---
### 2. Entry Point Discovery
**Purpose:** Resolve menu URL slug to platform_dispensary_id (MongoDB ObjectId)
**Handler:** `src/tasks/handlers/entry-point-discovery.ts`
**Flow:**
1. Load dispensary from database
2. Extract slug from `menu_url`:
- `/embedded-menu/<slug>` or `/dispensary/<slug>`
3. Start stealth session (fingerprint + proxy)
4. Query `resolveDispensaryIdWithDetails(slug)` via GraphQL
5. Update dispensary with `platform_dispensary_id`
6. Queue `product_discovery` task
**Example:**
```
menu_url: https://dutchie.com/embedded-menu/deeply-rooted
slug: deeply-rooted
platform_dispensary_id: 6405ef617056e8014d79101b
```
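As a concrete illustration of the slug extraction in step 2, here is a minimal sketch; the helper name and return shape are illustrative, not the project's actual code:
```typescript
// Illustrative only: extract the menu slug from a Dutchie menu URL.
// Supports the two path shapes mentioned above:
//   /embedded-menu/<slug> and /dispensary/<slug>
export function extractMenuSlug(menuUrl: string): string | null {
  try {
    const { pathname } = new URL(menuUrl);
    const match = pathname.match(/\/(?:embedded-menu|dispensary)\/([^/?#]+)/);
    return match ? match[1] : null;
  } catch {
    return null; // not a valid URL
  }
}

// Example from above: returns "deeply-rooted"
// extractMenuSlug('https://dutchie.com/embedded-menu/deeply-rooted');
```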
---
### 3. Product Discovery
**Purpose:** Initial crawl of a new dispensary
**Handler:** `src/tasks/handlers/product-discovery.ts`
Same as product_resync but for first-time crawls.
---
### 4. Product Resync
**Purpose:** Recurring crawl to capture price/stock changes
**Handler:** `src/tasks/handlers/product-resync.ts`
**Flow:**
#### Step 1: Load Dispensary Info
```sql
SELECT id, name, platform_dispensary_id, menu_url, state
FROM dispensaries
WHERE id = $1 AND crawl_enabled = true
```
#### Step 2: Start Stealth Session
- Generate random browser fingerprint
- Set locale/timezone matching state
- Optional proxy rotation
#### Step 3: Fetch Products via GraphQL
**Endpoint:** `https://dutchie.com/api-3/graphql`
**Variables:**
```javascript
{
includeEnterpriseSpecials: false,
productsFilter: {
dispensaryId: "<platform_dispensary_id>",
pricingType: "rec",
Status: "All",
types: [],
useCache: false,
isDefaultSort: true,
sortBy: "popularSortIdx",
sortDirection: 1,
bypassOnlineThresholds: true,
isKioskMenu: false,
removeProductsBelowOptionThresholds: false
},
page: 0,
perPage: 100
}
```
**Key Notes:**
- `Status: "All"` returns all products (Active returns same count)
- `Status: null` returns 0 products (broken)
- `pricingType: "rec"` returns BOTH rec and med prices
- Paginate until `products.length < perPage` or `allProducts.length >= totalCount`
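A minimal sketch of the pagination rule above, assuming a hypothetical `fetchPage()` wrapper around the GraphQL request shown earlier; only the loop and termination logic reflect these notes:
```typescript
// Hypothetical page shape and fetcher; the real GraphQL client lives in
// src/platforms/dutchie/. Only the loop/termination logic below reflects
// the pagination notes above.
interface ProductsPage {
  products: unknown[];
  totalCount: number;
}

type FetchPage = (page: number, perPage: number) => Promise<ProductsPage>;

// Paginate until a short page comes back or totalCount is reached.
async function fetchAllProducts(fetchPage: FetchPage): Promise<unknown[]> {
  const perPage = 100;
  const allProducts: unknown[] = [];
  for (let page = 0; ; page++) {
    const { products, totalCount } = await fetchPage(page, perPage);
    allProducts.push(...products);
    if (products.length < perPage || allProducts.length >= totalCount) break;
  }
  return allProducts;
}
```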
#### Step 4: Normalize Data
Transform raw Dutchie payload to canonical format via `DutchieNormalizer`.
#### Step 5: Upsert Products
Insert/update `store_products` table with normalized data.
#### Step 6: Create Snapshots
Insert point-in-time record to `store_product_snapshots`.
#### Step 7: Track Missing Products (OOS Detection)
```sql
-- Reset consecutive_misses for products IN the feed
UPDATE store_products
SET consecutive_misses = 0, last_seen_at = NOW()
WHERE dispensary_id = $1
AND provider = 'dutchie'
AND provider_product_id = ANY($2)
-- Increment for products NOT in feed
UPDATE store_products
SET consecutive_misses = consecutive_misses + 1
WHERE dispensary_id = $1
AND provider = 'dutchie'
AND provider_product_id NOT IN (...)
AND consecutive_misses < 3
-- Mark OOS at 3 consecutive misses
UPDATE store_products
SET stock_status = 'oos', is_in_stock = false
WHERE dispensary_id = $1
AND consecutive_misses >= 3
AND stock_status != 'oos'
```
#### Step 8: Download Images
For new products, download and store images locally.
#### Step 9: Update Dispensary
```sql
UPDATE dispensaries SET last_crawl_at = NOW() WHERE id = $1
```
---
## GraphQL Payload Structure
### Product Fields (from filteredProducts.products[])
| Field | Type | Description |
|-------|------|-------------|
| `_id` / `id` | string | MongoDB ObjectId (24 hex chars) |
| `Name` | string | Product display name |
| `brandName` | string | Brand name |
| `brand.name` | string | Brand name (nested) |
| `brand.description` | string | Brand description |
| `type` | string | Category (Flower, Edible, Concentrate, etc.) |
| `subcategory` | string | Subcategory |
| `strainType` | string | Hybrid, Indica, Sativa, N/A |
| `Status` | string | Always "Active" in feed |
| `Image` | string | Primary image URL |
| `images[]` | array | All product images |
### Pricing Fields
| Field | Type | Description |
|-------|------|-------------|
| `Prices[]` | number[] | Rec prices per option |
| `recPrices[]` | number[] | Rec prices |
| `medicalPrices[]` | number[] | Medical prices |
| `recSpecialPrices[]` | number[] | Rec sale prices |
| `medicalSpecialPrices[]` | number[] | Medical sale prices |
| `Options[]` | string[] | Size options ("1/8oz", "1g", etc.) |
| `rawOptions[]` | string[] | Raw weight options ("3.5g") |
### Inventory Fields (POSMetaData.children[])
| Field | Type | Description |
|-------|------|-------------|
| `quantity` | number | Total inventory count |
| `quantityAvailable` | number | Available for online orders |
| `kioskQuantityAvailable` | number | Available for kiosk orders |
| `option` | string | Which size option this is for |
### Potency Fields
| Field | Type | Description |
|-------|------|-------------|
| `THCContent.range[]` | number[] | THC percentage |
| `CBDContent.range[]` | number[] | CBD percentage |
| `cannabinoidsV2[]` | array | Detailed cannabinoid breakdown |
### Specials (specialData.bogoSpecials[])
| Field | Type | Description |
|-------|------|-------------|
| `specialName` | string | Deal name |
| `specialType` | string | "bogo", "sale", etc. |
| `itemsForAPrice.value` | string | Bundle price |
| `bogoRewards[].totalQuantity.quantity` | number | Required quantity |
---
## OOS Detection Logic
Products disappear from the Dutchie feed when they go out of stock. We track this via `consecutive_misses`:
| Scenario | Action |
|----------|--------|
| Product in feed | `consecutive_misses = 0` |
| Product missing 1st time | `consecutive_misses = 1` |
| Product missing 2nd time | `consecutive_misses = 2` |
| Product missing 3rd time | `consecutive_misses = 3`, mark `stock_status = 'oos'` |
| Product returns to feed | `consecutive_misses = 0`, update stock_status |
**Why 3 misses?**
- Protects against false positives from crawl failures
- Single bad crawl doesn't trigger mass OOS alerts
- Balances detection speed vs accuracy
---
## Database Tables
### store_products
Current state of each product:
- `provider_product_id` - Dutchie's MongoDB ObjectId
- `name_raw`, `brand_name_raw` - Raw values from feed
- `price_rec`, `price_med` - Current prices
- `is_in_stock`, `stock_status` - Availability
- `consecutive_misses` - OOS detection counter
- `last_seen_at` - Last time product was in feed
### store_product_snapshots
Point-in-time records for historical analysis:
- One row per product per crawl
- Captures price, stock, potency at that moment
- Used for price history, analytics
### dispensaries
Store metadata:
- `platform_dispensary_id` - MongoDB ObjectId for GraphQL
- `menu_url` - Source URL
- `last_crawl_at` - Last successful crawl
- `crawl_enabled` - Whether to crawl
---
## Worker Roles
Workers pull tasks from the `worker_tasks` queue based on their assigned role.
| Role | Name | Description | Handler |
|------|------|-------------|---------|
| `product_resync` | Product Resync | Re-crawl dispensary products for price/stock changes | `handleProductResync` |
| `product_discovery` | Product Discovery | Initial product discovery for new dispensaries | `handleProductDiscovery` |
| `store_discovery` | Store Discovery | Discover new dispensary locations | `handleStoreDiscovery` |
| `entry_point_discovery` | Entry Point Discovery | Resolve platform IDs from menu URLs | `handleEntryPointDiscovery` |
| `analytics_refresh` | Analytics Refresh | Refresh materialized views and analytics | `handleAnalyticsRefresh` |
**API Endpoint:** `GET /api/worker-registry/roles`
---
## Scheduling
Crawls are scheduled via `worker_tasks` table:
| Role | Frequency | Description |
|------|-----------|-------------|
| `product_resync` | Every 4 hours | Regular product refresh |
| `product_discovery` | On-demand | First crawl for new stores |
| `entry_point_discovery` | On-demand | New store setup |
| `store_discovery` | Daily | Find new stores |
| `analytics_refresh` | Daily | Refresh analytics materialized views |
---
## Priority & On-Demand Tasks
Tasks are claimed by workers in order of **priority DESC, created_at ASC**.
### Priority Levels
| Priority | Use Case | Example |
|----------|----------|---------|
| 0 | Scheduled/batch tasks | Daily product_resync generation |
| 10 | On-demand/chained tasks | entry_point → product_discovery |
| Higher | Urgent/manual triggers | Admin-triggered immediate crawl |
### Task Chaining
When a task completes, the system automatically creates follow-up tasks:
```
store_discovery (completed)
└─► entry_point_discovery (priority: 10) for each new store
entry_point_discovery (completed, success)
└─► product_discovery (priority: 10) for that store
product_discovery (completed)
└─► [no chain] Store enters regular resync schedule
```
### On-Demand Task Creation
Use the task service to create high-priority tasks:
```typescript
// Create immediate product resync for a store
await taskService.createTask({
role: 'product_resync',
dispensary_id: 123,
platform: 'dutchie',
priority: 20, // Higher than batch tasks
});
// Convenience methods with default high priority (10)
await taskService.createEntryPointTask(dispensaryId, 'dutchie');
await taskService.createProductDiscoveryTask(dispensaryId, 'dutchie');
await taskService.createStoreDiscoveryTask('dutchie', 'AZ');
```
### Claim Function
The `claim_task()` SQL function atomically claims tasks:
- Respects priority ordering (higher = first)
- Uses `FOR UPDATE SKIP LOCKED` for concurrency
- Prevents multiple active tasks per store
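A sketch of how a worker might invoke `claim_task()` from TypeScript with `pg`; the SQL function itself is defined in migration 074, while the surrounding calling code is illustrative:
```typescript
import { Pool } from 'pg';

const pool = new Pool(); // connection settings come from the usual PG* env vars

// Atomically claim the next pending task for this role, or return null.
// claim_task() is the SQL function from migration 074; the rest is a sketch.
async function claimNextTask(role: string, workerId: string) {
  const { rows } = await pool.query(
    'SELECT * FROM claim_task($1, $2)',
    [role, workerId]
  );
  // The function RETURNS worker_tasks; an all-NULL row means nothing was claimable.
  return rows[0]?.id ? rows[0] : null;
}
```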
---
## Image Storage
Images are downloaded from Dutchie's AWS S3 and stored locally with on-demand resizing.
### Storage Path
```
/storage/images/products/<state>/<store>/<brand>/<product_id>/image-<hash>.webp
/storage/images/brands/<brand>/logo-<hash>.webp
```
**Example:**
```
/storage/images/products/az/az-deeply-rooted/bud-bros/6913e3cd444eac3935e928b9/image-ae38b1f9.webp
```
### Image Proxy API
Served via `/img/*` with on-demand resizing using **sharp**:
```
GET /img/products/az/az-deeply-rooted/bud-bros/6913e3cd444eac3935e928b9/image-ae38b1f9.webp?w=200
```
| Param | Description |
|-------|-------------|
| `w` | Width in pixels (max 4000) |
| `h` | Height in pixels (max 4000) |
| `q` | Quality 1-100 (default 80) |
| `fit` | cover, contain, fill, inside, outside |
| `blur` | Blur sigma (0.3-1000) |
| `gray` | Grayscale (1 = enabled) |
| `format` | webp, jpeg, png, avif (default webp) |
### Key Files
| File | Purpose |
|------|---------|
| `src/utils/image-storage.ts` | Download & save images to local filesystem |
| `src/routes/image-proxy.ts` | On-demand resize/transform at `/img/*` |
### Download Rules
| Scenario | Image Action |
|----------|--------------|
| **New product (first crawl)** | Download if `primaryImageUrl` exists |
| **Existing product (refresh)** | Download only if `local_image_path` is NULL (backfill) |
| **Product already has local image** | Skip download entirely |
**Logic:**
- Images are downloaded **once** and never re-downloaded on subsequent crawls
- `skipIfExists: true` - filesystem check prevents re-download even if queued
- First crawl: all products get images
- Refresh crawl: only new products or products missing local images
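A small sketch of that decision; the field names are assumptions and the real logic lives in `src/utils/image-storage.ts`:
```typescript
// Illustrative decision helper mirroring the table above; field names are
// assumptions, not the actual image-storage API.
interface ProductImageState {
  primaryImageUrl: string | null; // remote URL from the Dutchie feed
  localImagePath: string | null;  // set once an image is stored locally
}

function shouldDownloadImage(p: ProductImageState): boolean {
  if (!p.primaryImageUrl) return false; // nothing to download
  // Product already has a local image → skip entirely (downloaded once).
  if (p.localImagePath) return false;
  // New product or backfill: download (the saver also passes skipIfExists=true,
  // so a filesystem hit still short-circuits the write).
  return true;
}
```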
### Storage Rules
- **NO MinIO** - local filesystem only (`STORAGE_DRIVER=local`)
- Store full resolution, resize on-demand via `/img` proxy
- Convert to webp for consistency using **sharp**
- Preserve original Dutchie URL as fallback in `image_url` column
- Local path stored in `local_image_path` column
---
## Stealth & Anti-Detection
**PROXIES ARE REQUIRED** - Workers will fail to start if no active proxies are available in the database. All HTTP requests to Dutchie go through a proxy.
Workers automatically initialize anti-detection systems on startup.
### Components
| Component | Purpose | Source |
|-----------|---------|--------|
| **CrawlRotator** | Coordinates proxy + UA rotation | `src/services/crawl-rotator.ts` |
| **ProxyRotator** | Round-robin proxy selection, health tracking | `src/services/crawl-rotator.ts` |
| **UserAgentRotator** | Cycles through realistic browser fingerprints | `src/services/crawl-rotator.ts` |
| **Dutchie Client** | Curl-based HTTP with auto-retry on 403 | `src/platforms/dutchie/client.ts` |
### Initialization Flow
```
Worker Start
├─► initializeStealth()
│ │
│ ├─► CrawlRotator.initialize()
│ │ └─► Load proxies from `proxies` table
│ │
│ └─► setCrawlRotator(rotator)
│ └─► Wire to Dutchie client
└─► Process tasks...
```
### Stealth Session (per task)
Each crawl task starts a stealth session:
```typescript
// In product-refresh.ts, entry-point-discovery.ts
const session = startSession(dispensary.state || 'AZ', 'America/Phoenix');
```
This creates a new identity with:
- **Random fingerprint:** Chrome/Firefox/Safari/Edge on Win/Mac/Linux
- **Accept-Language:** Matches timezone (e.g., `America/Phoenix` → `en-US,en;q=0.9`)
- **sec-ch-ua headers:** Proper Client Hints for the browser profile
### On 403 Block
When Dutchie returns 403, the client automatically:
1. Records failure on current proxy (increments `failure_count`)
2. If proxy has 5+ failures, deactivates it
3. Rotates to next healthy proxy
4. Rotates fingerprint
5. Retries the request
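A hedged sketch of that retry flow; the rotator interface below is an assumption, with the real logic in `src/services/crawl-rotator.ts` and `src/platforms/dutchie/client.ts`:
```typescript
// Illustrative retry loop for the 403 handling steps above.
interface Rotator {
  recordProxyFailure(): Promise<void>; // increments failure_count, deactivates at 5+
  rotateProxy(): Promise<void>;
  rotateFingerprint(): void;
}

async function requestWithRotation<T>(
  doRequest: () => Promise<{ status: number; body?: T }>,
  rotator: Rotator,
  maxRetries = 3
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await doRequest();
    if (res.status !== 403) return res.body as T;
    // Blocked: mark the proxy, rotate identity, then retry.
    await rotator.recordProxyFailure();
    await rotator.rotateProxy();
    rotator.rotateFingerprint();
  }
  throw new Error('Still blocked (403) after rotating proxy and fingerprint');
}
```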
### Proxy Table Schema
```sql
CREATE TABLE proxies (
id SERIAL PRIMARY KEY,
host VARCHAR(255) NOT NULL,
port INTEGER NOT NULL,
username VARCHAR(100),
password VARCHAR(100),
protocol VARCHAR(10) DEFAULT 'http', -- http, https, socks5
is_active BOOLEAN DEFAULT true,
last_used_at TIMESTAMPTZ,
failure_count INTEGER DEFAULT 0,
success_count INTEGER DEFAULT 0,
avg_response_time_ms INTEGER,
last_failure_at TIMESTAMPTZ,
last_error TEXT
);
```
### Configuration
Proxies are mandatory. There is no environment variable to disable them. Workers will refuse to start without active proxies in the database.
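A minimal sketch of such a startup guard, assuming the `proxies` schema shown above; the actual check in the worker may differ:
```typescript
import { Pool } from 'pg';

// Illustrative startup guard matching the rule above: refuse to run
// without at least one active proxy row.
export async function assertProxiesConfigured(pool: Pool): Promise<void> {
  const { rows } = await pool.query(
    'SELECT COUNT(*)::int AS active FROM proxies WHERE is_active = true'
  );
  if (rows[0].active === 0) {
    throw new Error('No active proxies in the proxies table - worker refusing to start');
  }
}
```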
### Fingerprints Available
The client includes 6 browser fingerprints:
- Chrome 131 on Windows
- Chrome 131 on macOS
- Chrome 120 on Windows
- Firefox 133 on Windows
- Safari 17.2 on macOS
- Edge 131 on Windows
Each includes proper `sec-ch-ua`, `sec-ch-ua-platform`, and `sec-ch-ua-mobile` headers.
---
## Error Handling
- **GraphQL errors:** Logged, task marked failed, retried later
- **Normalization errors:** Logged as warnings, continue with valid products
- **Image download errors:** Non-fatal, logged, continue
- **Database errors:** Task fails, will be retried
- **403 blocks:** Auto-rotate proxy + fingerprint, retry (up to 3 retries)
---
## Files
| File | Purpose |
|------|---------|
| `src/tasks/handlers/product-resync.ts` | Main crawl handler |
| `src/tasks/handlers/entry-point-discovery.ts` | Slug → ID resolution |
| `src/platforms/dutchie/index.ts` | GraphQL client, session management |
| `src/hydration/normalizers/dutchie.ts` | Payload normalization |
| `src/hydration/canonical-upsert.ts` | Database upsert logic |
| `src/utils/image-storage.ts` | Image download and local storage |
| `src/routes/image-proxy.ts` | On-demand image resizing |
| `migrations/075_consecutive_misses.sql` | OOS tracking column |

View File

@@ -0,0 +1,400 @@
# Worker Task Architecture
This document describes the unified task-based worker system that replaces the legacy fragmented job systems.
## Overview
The task worker architecture provides a single, unified system for managing all background work in CannaiQ:
- **Store discovery** - Find new dispensaries on platforms
- **Entry point discovery** - Resolve platform IDs from menu URLs
- **Product discovery** - Initial product fetch for new stores
- **Product resync** - Regular price/stock updates for existing stores
- **Analytics refresh** - Refresh materialized views and analytics
## Architecture
### Database Tables
**`worker_tasks`** - Central task queue
```sql
CREATE TABLE worker_tasks (
id SERIAL PRIMARY KEY,
role task_role NOT NULL, -- What type of work
dispensary_id INTEGER, -- Which store (if applicable)
platform VARCHAR(50), -- Which platform (dutchie, etc.)
status task_status DEFAULT 'pending',
priority INTEGER DEFAULT 0, -- Higher = process first
scheduled_for TIMESTAMP, -- Don't process before this time
worker_id VARCHAR(100), -- Which worker claimed it
claimed_at TIMESTAMP,
started_at TIMESTAMP,
completed_at TIMESTAMP,
last_heartbeat_at TIMESTAMP, -- For stale detection
result JSONB, -- Output from handler
error_message TEXT,
retry_count INTEGER DEFAULT 0,
max_retries INTEGER DEFAULT 3,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
```
**Key indexes:**
- `idx_worker_tasks_pending_priority` - For efficient task claiming
- `idx_worker_tasks_active_dispensary` - Prevents concurrent tasks per store (partial unique index)
### Task Roles
| Role | Purpose | Per-Store | Scheduled |
|------|---------|-----------|-----------|
| `store_discovery` | Find new stores on a platform | No | Daily |
| `entry_point_discovery` | Resolve platform IDs | Yes | On-demand |
| `product_discovery` | Initial product fetch | Yes | After entry_point |
| `product_resync` | Price/stock updates | Yes | Every 4 hours |
| `analytics_refresh` | Refresh MVs | No | Daily |
### Task Lifecycle
```
pending → claimed → running → completed
                            ↘ failed
```
1. **pending** - Task is waiting to be picked up
2. **claimed** - Worker has claimed it (atomic via SELECT FOR UPDATE SKIP LOCKED)
3. **running** - Worker is actively processing
4. **completed** - Task finished successfully
5. **failed** - Task encountered an error
6. **stale** - Task lost its worker (recovered automatically)
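A condensed sketch of this lifecycle from the worker's side; the `TaskService` method names are assumptions (see `src/tasks/task-worker.ts` for the real loop):
```typescript
// Illustrative worker loop: claim → run → heartbeat → complete/fail.
interface TaskServiceLike {
  claimTask(role: string, workerId: string): Promise<{ id: number } | null>;
  markRunning(taskId: number): Promise<void>;
  heartbeat(taskId: number): Promise<void>;
  complete(taskId: number, result: unknown): Promise<void>;
  fail(taskId: number, message: string): Promise<void>;
}

declare function handleTask(task: { id: number }): Promise<unknown>; // role-specific handler (assumed)

async function runWorker(svc: TaskServiceLike, role: string, workerId: string) {
  for (;;) {
    const task = await svc.claimTask(role, workerId);          // pending → claimed
    if (!task) { await new Promise(r => setTimeout(r, 5000)); continue; }
    await svc.markRunning(task.id);                             // claimed → running
    const beat = setInterval(() => svc.heartbeat(task.id), 30_000);
    try {
      const result = await handleTask(task);
      await svc.complete(task.id, result);                      // running → completed
    } catch (err) {
      await svc.fail(task.id, (err as Error).message);          // running → failed
    } finally {
      clearInterval(beat);
    }
  }
}
```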
## Files
### Core Files
| File | Purpose |
|------|---------|
| `src/tasks/task-service.ts` | TaskService - CRUD, claiming, capacity metrics |
| `src/tasks/task-worker.ts` | TaskWorker - Main worker loop |
| `src/tasks/index.ts` | Module exports |
| `src/routes/tasks.ts` | API endpoints |
| `migrations/074_worker_task_queue.sql` | Database schema |
### Task Handlers
| File | Role |
|------|------|
| `src/tasks/handlers/store-discovery.ts` | `store_discovery` |
| `src/tasks/handlers/entry-point-discovery.ts` | `entry_point_discovery` |
| `src/tasks/handlers/product-discovery.ts` | `product_discovery` |
| `src/tasks/handlers/product-resync.ts` | `product_resync` |
| `src/tasks/handlers/analytics-refresh.ts` | `analytics_refresh` |
## Running Workers
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `WORKER_ROLE` | (required) | Which task role to process |
| `WORKER_ID` | auto-generated | Custom worker identifier |
| `POLL_INTERVAL_MS` | 5000 | How often to check for tasks |
| `HEARTBEAT_INTERVAL_MS` | 30000 | How often to update heartbeat |
### Starting a Worker
```bash
# Start a product resync worker
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts
# Start with custom ID
WORKER_ROLE=product_resync WORKER_ID=resync-1 npx tsx src/tasks/task-worker.ts
# Start multiple workers for different roles
WORKER_ROLE=store_discovery npx tsx src/tasks/task-worker.ts &
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts &
```
### Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: task-worker-resync
spec:
replicas: 3
template:
spec:
containers:
- name: worker
image: code.cannabrands.app/creationshop/dispensary-scraper:latest
command: ["npx", "tsx", "src/tasks/task-worker.ts"]
env:
- name: WORKER_ROLE
value: "product_resync"
```
## API Endpoints
### Task Management
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks` | GET | List tasks with filters |
| `/api/tasks` | POST | Create a new task |
| `/api/tasks/:id` | GET | Get task by ID |
| `/api/tasks/counts` | GET | Get counts by status |
| `/api/tasks/capacity` | GET | Get capacity metrics |
| `/api/tasks/capacity/:role` | GET | Get role-specific capacity |
| `/api/tasks/recover-stale` | POST | Recover tasks from dead workers |
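For example, a task can be created through `POST /api/tasks`; this sketch uses Node's built-in `fetch`, and the base URL and response handling are assumptions:
```typescript
// Illustrative client call for POST /api/tasks. The body mirrors the
// createTask fields used elsewhere in this doc; the base URL is an
// assumption (the backend listens on port 3010 in the k8s manifests).
async function createResyncTask(dispensaryId: number) {
  const res = await fetch('http://localhost:3010/api/tasks', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      role: 'product_resync',
      dispensary_id: dispensaryId,
      platform: 'dutchie',
      priority: 10,
    }),
  });
  if (!res.ok) throw new Error(`Create task failed: ${res.status}`);
  return res.json();
}
```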
### Task Generation
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/generate/resync` | POST | Generate daily resync tasks |
| `/api/tasks/generate/discovery` | POST | Create store discovery task |
### Migration (from legacy systems)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/migration/status` | GET | Compare old vs new systems |
| `/api/tasks/migration/disable-old-schedules` | POST | Disable job_schedules |
| `/api/tasks/migration/cancel-pending-crawl-jobs` | POST | Cancel old crawl jobs |
| `/api/tasks/migration/create-resync-tasks` | POST | Create tasks for all stores |
| `/api/tasks/migration/full-migrate` | POST | One-click migration |
### Role-Specific Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/role/:role/last-completion` | GET | Last completion time |
| `/api/tasks/role/:role/recent` | GET | Recent completions |
| `/api/tasks/store/:id/active` | GET | Check if store has active task |
## Capacity Planning
The `v_worker_capacity` view provides real-time metrics:
```sql
SELECT * FROM v_worker_capacity;
```
Returns:
- `pending_tasks` - Tasks waiting to be claimed
- `ready_tasks` - Tasks ready now (scheduled_for is null or past)
- `claimed_tasks` - Tasks claimed but not started
- `running_tasks` - Tasks actively processing
- `completed_last_hour` - Recent completions
- `failed_last_hour` - Recent failures
- `active_workers` - Workers with recent heartbeats
- `avg_duration_sec` - Average task duration
- `tasks_per_worker_hour` - Throughput estimate
- `estimated_hours_to_drain` - Time to clear queue
### Scaling Recommendations
```javascript
// API: GET /api/tasks/capacity/:role
{
"role": "product_resync",
"pending_tasks": 500,
"active_workers": 3,
"workers_needed": {
"for_1_hour": 10,
"for_4_hours": 3,
"for_8_hours": 2
}
}
```
## Task Chaining
Tasks can automatically create follow-up tasks:
```
store_discovery → entry_point_discovery → product_discovery
                  (store has platform_dispensary_id)
                                          ↓
                                   Daily resync tasks
```
The `chainNextTask()` method handles this automatically.
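A hedged sketch of what that chaining logic might look like; the real implementation is `TaskService.chainNextTask()`, and the callback and field names here are assumptions:
```typescript
// Illustrative version of the chaining rules in the diagram above.
interface CompletedTask {
  role: string;
  dispensary_id?: number;
  platform?: string;
}

async function chainNextTask(
  task: CompletedTask,
  createTask: (t: Record<string, unknown>) => Promise<void>
): Promise<void> {
  switch (task.role) {
    case 'entry_point_discovery':
      // Store now has a platform_dispensary_id → queue its first product crawl.
      await createTask({
        role: 'product_discovery',
        dispensary_id: task.dispensary_id,
        platform: task.platform,
        priority: 10,
      });
      break;
    case 'product_discovery':
      // No immediate chain; the store falls into the daily resync schedule.
      break;
    // store_discovery queues one entry_point_discovery per new store inside
    // its handler, so it is not shown here.
    default:
      break;
  }
}
```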
## Stale Task Recovery
Tasks are considered stale if `last_heartbeat_at` is older than the threshold (default 10 minutes).
```sql
SELECT recover_stale_tasks(10); -- 10 minute threshold
```
Or via API:
```bash
curl -X POST /api/tasks/recover-stale \
-H 'Content-Type: application/json' \
-d '{"threshold_minutes": 10}'
```
## Migration from Legacy Systems
### Legacy Systems Replaced
1. **job_schedules + job_run_logs** - Scheduled job definitions
2. **dispensary_crawl_jobs** - Per-dispensary crawl queue
3. **SyncOrchestrator + HydrationWorker** - Raw payload processing
### Migration Steps
**Option 1: One-Click Migration**
```bash
curl -X POST /api/tasks/migration/full-migrate
```
This will:
1. Disable all job_schedules
2. Cancel pending dispensary_crawl_jobs
3. Generate resync tasks for all stores
4. Create discovery and analytics tasks
**Option 2: Manual Migration**
```bash
# 1. Check current status
curl /api/tasks/migration/status
# 2. Disable old schedules
curl -X POST /api/tasks/migration/disable-old-schedules
# 3. Cancel pending crawl jobs
curl -X POST /api/tasks/migration/cancel-pending-crawl-jobs
# 4. Create resync tasks
curl -X POST /api/tasks/migration/create-resync-tasks \
-H 'Content-Type: application/json' \
-d '{"state_code": "AZ"}'
# 5. Generate daily resync schedule
curl -X POST /api/tasks/generate/resync \
-H 'Content-Type: application/json' \
-d '{"batches_per_day": 6}'
```
## Per-Store Locking
The system prevents concurrent tasks for the same store using a partial unique index:
```sql
CREATE UNIQUE INDEX idx_worker_tasks_active_dispensary
ON worker_tasks (dispensary_id)
WHERE dispensary_id IS NOT NULL
AND status IN ('claimed', 'running');
```
This ensures only one task can be active per store at any time.
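If two workers ever race past the `claim_task()` filter, this index raises `unique_violation` (SQLSTATE 23505); a sketch of treating that as "another worker won", relying on the error code that `pg` surfaces on the thrown error:
```typescript
// Illustrative guard around a claim attempt: a 23505 from the partial unique
// index means another worker already holds the active task for this store.
async function tryClaimForStore<T>(claim: () => Promise<T>): Promise<T | null> {
  try {
    return await claim();
  } catch (err: unknown) {
    if ((err as { code?: string }).code === '23505') {
      return null; // someone else got there first; move on to the next task
    }
    throw err;
  }
}
```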
## Task Priority
Tasks are claimed in priority order (higher first), then by creation time:
```sql
ORDER BY priority DESC, created_at ASC
```
Default priorities:
- `store_discovery`: 0
- `entry_point_discovery`: 10 (high - new stores)
- `product_discovery`: 10 (high - new stores)
- `product_resync`: 0
- `analytics_refresh`: 0
## Scheduled Tasks
Tasks can be scheduled for future execution:
```javascript
await taskService.createTask({
role: 'product_resync',
dispensary_id: 123,
scheduled_for: new Date('2025-01-10T06:00:00Z'),
});
```
The `generate_resync_tasks()` function creates staggered tasks throughout the day:
```sql
SELECT generate_resync_tasks(6, '2025-01-10'); -- 6 batches = every 4 hours
```
## Dashboard Integration
The admin dashboard shows task queue status in the main overview:
```
Task Queue Summary
------------------
Pending: 45
Running: 3
Completed: 1,234
Failed: 12
```
Full task management is available at `/admin/tasks`.
## Error Handling
Failed tasks include the error message in `error_message` and can be retried:
```sql
-- View failed tasks
SELECT id, role, dispensary_id, error_message, retry_count
FROM worker_tasks
WHERE status = 'failed'
ORDER BY completed_at DESC
LIMIT 20;
-- Retry failed tasks
UPDATE worker_tasks
SET status = 'pending', retry_count = retry_count + 1
WHERE status = 'failed' AND retry_count < max_retries;
```
## Monitoring
### Logs
Workers log to stdout:
```
[TaskWorker] Starting worker worker-product_resync-a1b2c3d4 for role: product_resync
[TaskWorker] Claimed task 123 (product_resync) for dispensary 456
[TaskWorker] Task 123 completed successfully
```
### Health Check
Check if workers are active:
```sql
SELECT worker_id, role, COUNT(*), MAX(last_heartbeat_at)
FROM worker_tasks
WHERE last_heartbeat_at > NOW() - INTERVAL '5 minutes'
GROUP BY worker_id, role;
```
### Metrics
```sql
-- Tasks by status
SELECT status, COUNT(*) FROM worker_tasks GROUP BY status;
-- Tasks by role
SELECT role, status, COUNT(*) FROM worker_tasks GROUP BY role, status;
-- Average duration by role
SELECT role, AVG(EXTRACT(EPOCH FROM (completed_at - started_at))) as avg_seconds
FROM worker_tasks
WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '24 hours'
GROUP BY role;
```

View File

@@ -0,0 +1,69 @@
apiVersion: batch/v1
kind: CronJob
metadata:
name: ip2location-update
namespace: default
spec:
# Run on the 1st of every month at 3am UTC
schedule: "0 3 1 * *"
concurrencyPolicy: Forbid
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 3
jobTemplate:
spec:
template:
spec:
containers:
- name: ip2location-updater
image: curlimages/curl:latest
command:
- /bin/sh
- -c
- |
set -e
echo "Downloading IP2Location LITE DB5..."
# Download to temp
cd /tmp
curl -L -o ip2location.zip "https://www.ip2location.com/download/?token=${IP2LOCATION_TOKEN}&file=DB5LITEBIN"
# Extract
unzip -o ip2location.zip
# Find and copy the BIN file
BIN_FILE=$(ls *.BIN 2>/dev/null | head -1)
if [ -z "$BIN_FILE" ]; then
echo "ERROR: No BIN file found"
exit 1
fi
# Copy to shared volume
cp "$BIN_FILE" /data/IP2LOCATION-LITE-DB5.BIN
echo "Done! Database updated: /data/IP2LOCATION-LITE-DB5.BIN"
env:
- name: IP2LOCATION_TOKEN
valueFrom:
secretKeyRef:
name: dutchie-backend-secret
key: IP2LOCATION_TOKEN
volumeMounts:
- name: ip2location-data
mountPath: /data
restartPolicy: OnFailure
volumes:
- name: ip2location-data
persistentVolumeClaim:
claimName: ip2location-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ip2location-pvc
namespace: default
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Mi

View File

@@ -26,6 +26,12 @@ spec:
name: dutchie-backend-config
- secretRef:
name: dutchie-backend-secret
env:
- name: IP2LOCATION_DB_PATH
value: /data/ip2location/IP2LOCATION-LITE-DB5.BIN
volumeMounts:
- name: ip2location-data
mountPath: /data/ip2location
resources:
requests:
memory: "256Mi"
@@ -45,3 +51,7 @@ spec:
port: 3010
initialDelaySeconds: 5
periodSeconds: 5
volumes:
- name: ip2location-data
persistentVolumeClaim:
claimName: ip2location-pvc

View File

@@ -0,0 +1,12 @@
-- Add timezone column to proxies table for geo-consistent fingerprinting
-- This allows matching Accept-Language and other headers to proxy location
ALTER TABLE proxies
ADD COLUMN IF NOT EXISTS timezone VARCHAR(50);
-- Add timezone to failed_proxies as well
ALTER TABLE failed_proxies
ADD COLUMN IF NOT EXISTS timezone VARCHAR(50);
-- Comment explaining usage
COMMENT ON COLUMN proxies.timezone IS 'IANA timezone (e.g., America/Phoenix) for geo-consistent fingerprinting';

View File

@@ -0,0 +1,322 @@
-- Migration 074: Worker Task Queue System
-- Implements role-based task queue with per-store locking and capacity tracking
-- Task queue table
CREATE TABLE IF NOT EXISTS worker_tasks (
id SERIAL PRIMARY KEY,
-- Task identification
role VARCHAR(50) NOT NULL, -- store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh
dispensary_id INTEGER REFERENCES dispensaries(id) ON DELETE CASCADE,
platform VARCHAR(20), -- dutchie, jane, treez, etc.
-- Task state
status VARCHAR(20) NOT NULL DEFAULT 'pending',
priority INTEGER DEFAULT 0, -- Higher = more urgent
-- Scheduling
scheduled_for TIMESTAMPTZ, -- For batch scheduling (e.g., every 4 hours)
-- Ownership
worker_id VARCHAR(100), -- Pod name or worker ID
claimed_at TIMESTAMPTZ,
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
last_heartbeat_at TIMESTAMPTZ,
-- Results
result JSONB, -- Task output data
error_message TEXT,
retry_count INTEGER DEFAULT 0,
max_retries INTEGER DEFAULT 3,
-- Metadata
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
-- Constraints
CONSTRAINT valid_status CHECK (status IN ('pending', 'claimed', 'running', 'completed', 'failed', 'stale'))
);
-- Indexes for efficient task claiming
CREATE INDEX IF NOT EXISTS idx_worker_tasks_pending
ON worker_tasks(role, priority DESC, created_at ASC)
WHERE status = 'pending';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_claimed
ON worker_tasks(worker_id, claimed_at)
WHERE status = 'claimed';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_running
ON worker_tasks(worker_id, last_heartbeat_at)
WHERE status = 'running';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_dispensary
ON worker_tasks(dispensary_id)
WHERE dispensary_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_worker_tasks_scheduled
ON worker_tasks(scheduled_for)
WHERE status = 'pending' AND scheduled_for IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_worker_tasks_history
ON worker_tasks(role, completed_at DESC)
WHERE status IN ('completed', 'failed');
-- Partial unique index to prevent duplicate active tasks per store
-- Only one task can be claimed/running for a given dispensary at a time
CREATE UNIQUE INDEX IF NOT EXISTS idx_worker_tasks_unique_active_store
ON worker_tasks(dispensary_id)
WHERE status IN ('claimed', 'running') AND dispensary_id IS NOT NULL;
-- Worker registration table (tracks active workers)
CREATE TABLE IF NOT EXISTS worker_registry (
id SERIAL PRIMARY KEY,
worker_id VARCHAR(100) UNIQUE NOT NULL,
role VARCHAR(50) NOT NULL,
pod_name VARCHAR(100),
hostname VARCHAR(100),
started_at TIMESTAMPTZ DEFAULT NOW(),
last_heartbeat_at TIMESTAMPTZ DEFAULT NOW(),
tasks_completed INTEGER DEFAULT 0,
tasks_failed INTEGER DEFAULT 0,
status VARCHAR(20) DEFAULT 'active',
CONSTRAINT valid_worker_status CHECK (status IN ('active', 'idle', 'offline'))
);
CREATE INDEX IF NOT EXISTS idx_worker_registry_role
ON worker_registry(role, status);
CREATE INDEX IF NOT EXISTS idx_worker_registry_heartbeat
ON worker_registry(last_heartbeat_at)
WHERE status = 'active';
-- Task completion tracking (summarized history)
CREATE TABLE IF NOT EXISTS task_completion_log (
id SERIAL PRIMARY KEY,
role VARCHAR(50) NOT NULL,
date DATE NOT NULL DEFAULT CURRENT_DATE,
hour INTEGER NOT NULL DEFAULT EXTRACT(HOUR FROM NOW()),
tasks_created INTEGER DEFAULT 0,
tasks_completed INTEGER DEFAULT 0,
tasks_failed INTEGER DEFAULT 0,
avg_duration_sec NUMERIC(10,2),
min_duration_sec NUMERIC(10,2),
max_duration_sec NUMERIC(10,2),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(role, date, hour)
);
-- Capacity planning view
CREATE OR REPLACE VIEW v_worker_capacity AS
SELECT
role,
COUNT(*) FILTER (WHERE status = 'pending') as pending_tasks,
COUNT(*) FILTER (WHERE status = 'pending' AND (scheduled_for IS NULL OR scheduled_for <= NOW())) as ready_tasks,
COUNT(*) FILTER (WHERE status = 'claimed') as claimed_tasks,
COUNT(*) FILTER (WHERE status = 'running') as running_tasks,
COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') as completed_last_hour,
COUNT(*) FILTER (WHERE status = 'failed' AND completed_at > NOW() - INTERVAL '1 hour') as failed_last_hour,
COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) as active_workers,
AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') as avg_duration_sec,
-- Capacity planning metrics
CASE
WHEN COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') > 0
THEN 3600.0 / NULLIF(AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour'), 0)
ELSE NULL
END as tasks_per_worker_hour,
-- Estimated time to drain queue
CASE
WHEN COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) > 0
AND COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') > 0
THEN COUNT(*) FILTER (WHERE status = 'pending') / NULLIF(
COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) *
(3600.0 / NULLIF(AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour'), 0)),
0
)
ELSE NULL
END as estimated_hours_to_drain
FROM worker_tasks
GROUP BY role;
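-- Illustrative usage (not part of the migration): a dashboard or autoscaler
-- could poll this view to size the worker fleet, e.g.
--   SELECT role, pending_tasks, active_workers,
--          tasks_per_worker_hour, estimated_hours_to_drain
--   FROM v_worker_capacity
--   ORDER BY pending_tasks DESC;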
-- Task history view (for UI)
CREATE OR REPLACE VIEW v_task_history AS
SELECT
t.id,
t.role,
t.dispensary_id,
d.name as dispensary_name,
t.platform,
t.status,
t.priority,
t.worker_id,
t.scheduled_for,
t.claimed_at,
t.started_at,
t.completed_at,
t.error_message,
t.retry_count,
t.created_at,
EXTRACT(EPOCH FROM (t.completed_at - t.started_at)) as duration_sec
FROM worker_tasks t
LEFT JOIN dispensaries d ON d.id = t.dispensary_id
ORDER BY t.created_at DESC;
-- Function to claim a task atomically
CREATE OR REPLACE FUNCTION claim_task(
p_role VARCHAR(50),
p_worker_id VARCHAR(100)
) RETURNS worker_tasks AS $$
DECLARE
claimed_task worker_tasks;
BEGIN
UPDATE worker_tasks
SET
status = 'claimed',
worker_id = p_worker_id,
claimed_at = NOW(),
updated_at = NOW()
WHERE id = (
SELECT id FROM worker_tasks
WHERE role = p_role
AND status = 'pending'
AND (scheduled_for IS NULL OR scheduled_for <= NOW())
-- Exclude stores that already have an active task
AND (dispensary_id IS NULL OR dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY priority DESC, created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING * INTO claimed_task;
RETURN claimed_task;
END;
$$ LANGUAGE plpgsql;
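-- Illustrative usage (not part of the migration): a worker claims its next
-- task in a single call; an all-NULL row means nothing was claimable.
--   SELECT id, role, dispensary_id
--   FROM claim_task('product_discovery', 'pod-aethelgard-0');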
-- Function to mark stale tasks (workers that died)
CREATE OR REPLACE FUNCTION recover_stale_tasks(
stale_threshold_minutes INTEGER DEFAULT 10
) RETURNS INTEGER AS $$
DECLARE
recovered_count INTEGER;
BEGIN
WITH stale AS (
UPDATE worker_tasks
SET
status = 'pending',
worker_id = NULL,
claimed_at = NULL,
started_at = NULL,
retry_count = retry_count + 1,
updated_at = NOW()
WHERE status IN ('claimed', 'running')
AND last_heartbeat_at < NOW() - (stale_threshold_minutes || ' minutes')::INTERVAL
AND retry_count < max_retries
RETURNING id
)
SELECT COUNT(*) INTO recovered_count FROM stale;
-- Mark tasks that exceeded retries as failed
UPDATE worker_tasks
SET
status = 'failed',
error_message = 'Exceeded max retries after worker failures',
completed_at = NOW(),
updated_at = NOW()
WHERE status IN ('claimed', 'running')
AND last_heartbeat_at < NOW() - (stale_threshold_minutes || ' minutes')::INTERVAL
AND retry_count >= max_retries;
RETURN recovered_count;
END;
$$ LANGUAGE plpgsql;
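-- Illustrative usage (not part of the migration): run this periodically (cron
-- or a server interval) so tasks from dead workers return to pending:
--   SELECT recover_stale_tasks(10) AS recovered;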
-- Function to generate daily resync tasks
CREATE OR REPLACE FUNCTION generate_resync_tasks(
p_batches_per_day INTEGER DEFAULT 6, -- Every 4 hours
p_date DATE DEFAULT CURRENT_DATE
) RETURNS INTEGER AS $$
DECLARE
store_count INTEGER;
stores_per_batch INTEGER;
batch_num INTEGER;
scheduled_time TIMESTAMPTZ;
created_count INTEGER := 0;
batch_count INTEGER;
BEGIN
-- Count active stores that need resync
SELECT COUNT(*) INTO store_count
FROM dispensaries
WHERE crawl_enabled = true
AND menu_type = 'dutchie'
AND platform_dispensary_id IS NOT NULL;
IF store_count = 0 THEN
RETURN 0;
END IF;
stores_per_batch := CEIL(store_count::NUMERIC / p_batches_per_day);
FOR batch_num IN 0..(p_batches_per_day - 1) LOOP
scheduled_time := p_date + (batch_num * 4 || ' hours')::INTERVAL;
INSERT INTO worker_tasks (role, dispensary_id, platform, scheduled_for, priority)
SELECT
'product_resync',
d.id,
'dutchie',
scheduled_time,
0
FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY id) as rn
FROM dispensaries
WHERE crawl_enabled = true
AND menu_type = 'dutchie'
AND platform_dispensary_id IS NOT NULL
) d
WHERE d.rn > (batch_num * stores_per_batch)
AND d.rn <= ((batch_num + 1) * stores_per_batch)
ON CONFLICT DO NOTHING;
GET DIAGNOSTICS batch_count = ROW_COUNT;
created_count := created_count + batch_count;
END LOOP;
RETURN created_count;
END;
$$ LANGUAGE plpgsql;
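-- Illustrative usage (not part of the migration): schedule once per day to
-- spread refreshes across six 4-hour batches:
--   SELECT generate_resync_tasks(6, CURRENT_DATE) AS tasks_created;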
-- Trigger to update timestamp
CREATE OR REPLACE FUNCTION update_worker_tasks_timestamp()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS worker_tasks_updated_at ON worker_tasks;
CREATE TRIGGER worker_tasks_updated_at
BEFORE UPDATE ON worker_tasks
FOR EACH ROW
EXECUTE FUNCTION update_worker_tasks_timestamp();
-- Comments
COMMENT ON TABLE worker_tasks IS 'Central task queue for all worker roles';
COMMENT ON TABLE worker_registry IS 'Registry of active workers and their stats';
COMMENT ON TABLE task_completion_log IS 'Hourly aggregated task completion metrics';
COMMENT ON VIEW v_worker_capacity IS 'Real-time capacity planning metrics per role';
COMMENT ON VIEW v_task_history IS 'Task history with dispensary details for UI';
COMMENT ON FUNCTION claim_task IS 'Atomically claim a task for a worker, respecting per-store locking';
COMMENT ON FUNCTION recover_stale_tasks IS 'Release tasks from dead workers back to pending';
COMMENT ON FUNCTION generate_resync_tasks IS 'Generate daily product resync tasks in batches';

View File

@@ -0,0 +1,13 @@
-- Migration 075: Add consecutive_misses column to store_products
-- Used to track how many consecutive crawls a product has been missing from the feed
-- After 3 consecutive misses, the product is marked as OOS
ALTER TABLE store_products
ADD COLUMN IF NOT EXISTS consecutive_misses INTEGER NOT NULL DEFAULT 0;
-- Index for finding products that need OOS check
CREATE INDEX IF NOT EXISTS idx_store_products_consecutive_misses
ON store_products (dispensary_id, consecutive_misses)
WHERE consecutive_misses > 0;
COMMENT ON COLUMN store_products.consecutive_misses IS 'Number of consecutive crawls where product was not in feed. Reset to 0 when seen. At 3, mark OOS.';
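-- Illustrative usage (not part of the migration): a crawl run could bump the
-- counter for products missing from the feed and flip them to OOS at 3, then
-- reset the counter for products it did see ($2 = product ids seen this crawl):
--   UPDATE store_products
--   SET consecutive_misses = consecutive_misses + 1,
--       is_in_stock = CASE WHEN consecutive_misses + 1 >= 3 THEN false ELSE is_in_stock END
--   WHERE dispensary_id = $1
--     AND id <> ALL($2::int[]);
--
--   UPDATE store_products
--   SET consecutive_misses = 0
--   WHERE dispensary_id = $1
--     AND id = ANY($2::int[]);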

View File

@@ -0,0 +1,71 @@
-- Visitor location analytics for Findagram
-- Tracks visitor locations to understand popular areas
CREATE TABLE IF NOT EXISTS visitor_locations (
id SERIAL PRIMARY KEY,
-- Location data (from IP lookup)
ip_hash VARCHAR(64), -- Hashed IP for privacy (SHA256)
city VARCHAR(100),
state VARCHAR(100),
state_code VARCHAR(10),
country VARCHAR(100),
country_code VARCHAR(10),
latitude DECIMAL(10, 7),
longitude DECIMAL(10, 7),
-- Visit metadata
domain VARCHAR(50) NOT NULL, -- 'findagram.co', 'findadispo.com', etc.
page_path VARCHAR(255), -- '/products', '/dispensaries/123', etc.
referrer VARCHAR(500),
user_agent VARCHAR(500),
-- Session tracking
session_id VARCHAR(64), -- For grouping page views in a session
-- Timestamps
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Indexes for analytics queries
CREATE INDEX IF NOT EXISTS idx_visitor_locations_domain ON visitor_locations(domain);
CREATE INDEX IF NOT EXISTS idx_visitor_locations_city_state ON visitor_locations(city, state_code);
CREATE INDEX IF NOT EXISTS idx_visitor_locations_created_at ON visitor_locations(created_at);
CREATE INDEX IF NOT EXISTS idx_visitor_locations_session ON visitor_locations(session_id);
-- Aggregated daily stats (materialized for performance)
CREATE TABLE IF NOT EXISTS visitor_location_stats (
id SERIAL PRIMARY KEY,
date DATE NOT NULL,
domain VARCHAR(50) NOT NULL,
city VARCHAR(100),
state VARCHAR(100),
state_code VARCHAR(10),
country_code VARCHAR(10),
-- Metrics
visit_count INTEGER DEFAULT 0,
unique_sessions INTEGER DEFAULT 0,
UNIQUE(date, domain, city, state_code, country_code)
);
CREATE INDEX IF NOT EXISTS idx_visitor_stats_date ON visitor_location_stats(date);
CREATE INDEX IF NOT EXISTS idx_visitor_stats_domain ON visitor_location_stats(domain);
CREATE INDEX IF NOT EXISTS idx_visitor_stats_state ON visitor_location_stats(state_code);
-- View for easy querying of top locations
CREATE OR REPLACE VIEW v_top_visitor_locations AS
SELECT
domain,
city,
state,
state_code,
country_code,
COUNT(*) as total_visits,
COUNT(DISTINCT session_id) as unique_sessions,
MAX(created_at) as last_visit
FROM visitor_locations
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY domain, city, state, state_code, country_code
ORDER BY total_visits DESC;
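-- Illustrative usage (not part of the migration): top cities for one domain
-- over the last 30 days, straight from the view:
--   SELECT city, state_code, total_visits, unique_sessions
--   FROM v_top_visitor_locations
--   WHERE domain = 'findagram.co'
--   LIMIT 20;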

View File

@@ -0,0 +1,141 @@
-- Migration 076: Worker Registry for Dynamic Workers
-- Workers register on startup, receive a friendly name, and report heartbeats
-- Name pool for workers (expandable, no hardcoding)
CREATE TABLE IF NOT EXISTS worker_name_pool (
id SERIAL PRIMARY KEY,
name VARCHAR(50) UNIQUE NOT NULL,
in_use BOOLEAN DEFAULT FALSE,
assigned_to VARCHAR(100), -- worker_id
assigned_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Seed with initial names (can add more via API)
INSERT INTO worker_name_pool (name) VALUES
('Alice'), ('Bella'), ('Clara'), ('Diana'), ('Elena'),
('Fiona'), ('Grace'), ('Hazel'), ('Iris'), ('Julia'),
('Katie'), ('Luna'), ('Mia'), ('Nora'), ('Olive'),
('Pearl'), ('Quinn'), ('Rosa'), ('Sara'), ('Tara'),
('Uma'), ('Vera'), ('Wendy'), ('Xena'), ('Yuki'), ('Zara'),
('Amber'), ('Blake'), ('Coral'), ('Dawn'), ('Echo'),
('Fleur'), ('Gem'), ('Haven'), ('Ivy'), ('Jade'),
('Kira'), ('Lotus'), ('Maple'), ('Nova'), ('Onyx'),
('Pixel'), ('Quest'), ('Raven'), ('Sage'), ('Terra'),
('Unity'), ('Violet'), ('Willow'), ('Xylo'), ('Yara'), ('Zen')
ON CONFLICT (name) DO NOTHING;
-- Worker registry - tracks active workers
CREATE TABLE IF NOT EXISTS worker_registry (
id SERIAL PRIMARY KEY,
worker_id VARCHAR(100) UNIQUE NOT NULL, -- e.g., "pod-abc123" or uuid
friendly_name VARCHAR(50), -- assigned from pool
role VARCHAR(50) NOT NULL, -- task role
pod_name VARCHAR(100), -- k8s pod name
hostname VARCHAR(100), -- machine hostname
ip_address VARCHAR(50), -- worker IP
status VARCHAR(20) DEFAULT 'starting', -- starting, active, idle, offline, terminated
started_at TIMESTAMPTZ DEFAULT NOW(),
last_heartbeat_at TIMESTAMPTZ DEFAULT NOW(),
last_task_at TIMESTAMPTZ,
tasks_completed INTEGER DEFAULT 0,
tasks_failed INTEGER DEFAULT 0,
current_task_id INTEGER,
metadata JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Indexes for worker registry
CREATE INDEX IF NOT EXISTS idx_worker_registry_status ON worker_registry(status);
CREATE INDEX IF NOT EXISTS idx_worker_registry_role ON worker_registry(role);
CREATE INDEX IF NOT EXISTS idx_worker_registry_heartbeat ON worker_registry(last_heartbeat_at);
-- Function to assign a name to a new worker
CREATE OR REPLACE FUNCTION assign_worker_name(p_worker_id VARCHAR(100))
RETURNS VARCHAR(50) AS $$
DECLARE
v_name VARCHAR(50);
BEGIN
-- Try to get an unused name
UPDATE worker_name_pool
SET in_use = TRUE, assigned_to = p_worker_id, assigned_at = NOW()
WHERE id = (
SELECT id FROM worker_name_pool
WHERE in_use = FALSE
ORDER BY RANDOM()
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING name INTO v_name;
-- If no names available, generate one
IF v_name IS NULL THEN
v_name := 'Worker-' || SUBSTRING(p_worker_id FROM 1 FOR 8);
END IF;
RETURN v_name;
END;
$$ LANGUAGE plpgsql;
-- Function to release a worker's name back to the pool
CREATE OR REPLACE FUNCTION release_worker_name(p_worker_id VARCHAR(100))
RETURNS VOID AS $$
BEGIN
UPDATE worker_name_pool
SET in_use = FALSE, assigned_to = NULL, assigned_at = NULL
WHERE assigned_to = p_worker_id;
END;
$$ LANGUAGE plpgsql;
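-- Illustrative usage (not part of the migration): on startup a worker claims a
-- friendly name, and releases it again on graceful shutdown:
--   SELECT assign_worker_name('pod-abc123') AS friendly_name;
--   SELECT release_worker_name('pod-abc123');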
-- Function to mark stale workers as offline
CREATE OR REPLACE FUNCTION mark_stale_workers(stale_threshold_minutes INTEGER DEFAULT 5)
RETURNS INTEGER AS $$
DECLARE
v_count INTEGER;
BEGIN
UPDATE worker_registry
SET status = 'offline', updated_at = NOW()
WHERE status IN ('active', 'idle', 'starting')
AND last_heartbeat_at < NOW() - (stale_threshold_minutes || ' minutes')::INTERVAL;
GET DIAGNOSTICS v_count = ROW_COUNT;
-- Release names from offline workers
PERFORM release_worker_name(worker_id)
FROM worker_registry
WHERE status = 'offline'
AND last_heartbeat_at < NOW() - INTERVAL '30 minutes';
RETURN COALESCE(v_count, 0);
END;
$$ LANGUAGE plpgsql;
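-- Illustrative usage (not part of the migration): call from the same periodic
-- housekeeping job that runs recover_stale_tasks():
--   SELECT mark_stale_workers(5) AS marked_offline;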
-- View for dashboard
CREATE OR REPLACE VIEW v_active_workers AS
SELECT
wr.id,
wr.worker_id,
wr.friendly_name,
wr.role,
wr.status,
wr.pod_name,
wr.hostname,
wr.started_at,
wr.last_heartbeat_at,
wr.last_task_at,
wr.tasks_completed,
wr.tasks_failed,
wr.current_task_id,
EXTRACT(EPOCH FROM (NOW() - wr.last_heartbeat_at)) as seconds_since_heartbeat,
CASE
WHEN wr.status = 'offline' THEN 'offline'
WHEN wr.last_heartbeat_at < NOW() - INTERVAL '2 minutes' THEN 'stale'
WHEN wr.current_task_id IS NOT NULL THEN 'busy'
ELSE 'ready'
END as health_status
FROM worker_registry wr
WHERE wr.status != 'terminated'
ORDER BY wr.status = 'active' DESC, wr.last_heartbeat_at DESC;
COMMENT ON TABLE worker_registry IS 'Tracks all workers that have registered with the system';
COMMENT ON TABLE worker_name_pool IS 'Pool of friendly names for workers - expandable via API';

View File

@@ -0,0 +1,35 @@
-- Migration: Add visitor location and dispensary name to click events
-- Captures where visitors are clicking from and which dispensary
-- Add visitor location columns
ALTER TABLE product_click_events
ADD COLUMN IF NOT EXISTS visitor_city VARCHAR(100);
ALTER TABLE product_click_events
ADD COLUMN IF NOT EXISTS visitor_state VARCHAR(10);
ALTER TABLE product_click_events
ADD COLUMN IF NOT EXISTS visitor_lat DECIMAL(10, 7);
ALTER TABLE product_click_events
ADD COLUMN IF NOT EXISTS visitor_lng DECIMAL(10, 7);
-- Add dispensary name for easier reporting
ALTER TABLE product_click_events
ADD COLUMN IF NOT EXISTS dispensary_name VARCHAR(255);
-- Create index for location-based analytics
CREATE INDEX IF NOT EXISTS idx_product_click_events_visitor_state
ON product_click_events(visitor_state)
WHERE visitor_state IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_product_click_events_visitor_city
ON product_click_events(visitor_city)
WHERE visitor_city IS NOT NULL;
-- Add comments
COMMENT ON COLUMN product_click_events.visitor_city IS 'City where the visitor is located (from IP geolocation)';
COMMENT ON COLUMN product_click_events.visitor_state IS 'State where the visitor is located (from IP geolocation)';
COMMENT ON COLUMN product_click_events.visitor_lat IS 'Visitor latitude (from IP geolocation)';
COMMENT ON COLUMN product_click_events.visitor_lng IS 'Visitor longitude (from IP geolocation)';
COMMENT ON COLUMN product_click_events.dispensary_name IS 'Name of the dispensary (denormalized for easier reporting)';

View File

@@ -1026,6 +1026,17 @@
"url": "https://github.com/sponsors/fb55"
}
},
"node_modules/csv-parser": {
"version": "3.2.0",
"resolved": "https://registry.npmjs.org/csv-parser/-/csv-parser-3.2.0.tgz",
"integrity": "sha512-fgKbp+AJbn1h2dcAHKIdKNSSjfp43BZZykXsCjzALjKy80VXQNHPFJ6T9Afwdzoj24aMkq8GwDS7KGcDPpejrA==",
"bin": {
"csv-parser": "bin/csv-parser"
},
"engines": {
"node": ">= 10"
}
},
"node_modules/data-uri-to-buffer": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/data-uri-to-buffer/-/data-uri-to-buffer-6.0.2.tgz",
@@ -2235,6 +2246,14 @@
"node": ">= 12"
}
},
"node_modules/ip2location-nodejs": {
"version": "9.7.0",
"resolved": "https://registry.npmjs.org/ip2location-nodejs/-/ip2location-nodejs-9.7.0.tgz",
"integrity": "sha512-eQ4T5TXm1cx0+pQcRycPiuaiRuoDEMd9O89Be7Ugk555qi9UY9enXSznkkqr3kQRyUaXx7zj5dORC5LGTPOttA==",
"dependencies": {
"csv-parser": "^3.0.0"
}
},
"node_modules/ipaddr.js": {
"version": "2.2.0",
"resolved": "https://registry.npmjs.org/ipaddr.js/-/ipaddr.js-2.2.0.tgz",

View File

@@ -21,6 +21,7 @@
"helmet": "^7.1.0",
"https-proxy-agent": "^7.0.2",
"ioredis": "^5.8.2",
"ip2location-nodejs": "^9.7.0",
"ipaddr.js": "^2.2.0",
"jsonwebtoken": "^9.0.2",
"minio": "^7.1.3",
@@ -1531,6 +1532,17 @@
"url": "https://github.com/sponsors/fb55"
}
},
"node_modules/csv-parser": {
"version": "3.2.0",
"resolved": "https://registry.npmjs.org/csv-parser/-/csv-parser-3.2.0.tgz",
"integrity": "sha512-fgKbp+AJbn1h2dcAHKIdKNSSjfp43BZZykXsCjzALjKy80VXQNHPFJ6T9Afwdzoj24aMkq8GwDS7KGcDPpejrA==",
"bin": {
"csv-parser": "bin/csv-parser"
},
"engines": {
"node": ">= 10"
}
},
"node_modules/data-uri-to-buffer": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/data-uri-to-buffer/-/data-uri-to-buffer-6.0.2.tgz",
@@ -2754,6 +2766,14 @@
"node": ">= 12"
}
},
"node_modules/ip2location-nodejs": {
"version": "9.7.0",
"resolved": "https://registry.npmjs.org/ip2location-nodejs/-/ip2location-nodejs-9.7.0.tgz",
"integrity": "sha512-eQ4T5TXm1cx0+pQcRycPiuaiRuoDEMd9O89Be7Ugk555qi9UY9enXSznkkqr3kQRyUaXx7zj5dORC5LGTPOttA==",
"dependencies": {
"csv-parser": "^3.0.0"
}
},
"node_modules/ipaddr.js": {
"version": "2.2.0",
"resolved": "https://registry.npmjs.org/ipaddr.js/-/ipaddr.js-2.2.0.tgz",

View File

@@ -35,6 +35,7 @@
"helmet": "^7.1.0",
"https-proxy-agent": "^7.0.2",
"ioredis": "^5.8.2",
"ip2location-nodejs": "^9.7.0",
"ipaddr.js": "^2.2.0",
"jsonwebtoken": "^9.0.2",
"minio": "^7.1.3",

View File

@@ -0,0 +1,65 @@
#!/bin/bash
# Download IP2Location LITE DB3 (City-level) database
# Free for commercial use with attribution
# https://lite.ip2location.com/database/db3-ip-country-region-city
set -e
DATA_DIR="${1:-./data/ip2location}"
DB_FILE="IP2LOCATION-LITE-DB3.BIN"
mkdir -p "$DATA_DIR"
cd "$DATA_DIR"
echo "Downloading IP2Location LITE DB3 database..."
# IP2Location LITE DB3 - includes city, region, country, lat/lng
# You need to register at https://lite.ip2location.com/ to get a download token
# Then set IP2LOCATION_TOKEN environment variable
if [ -z "$IP2LOCATION_TOKEN" ]; then
echo ""
echo "ERROR: IP2LOCATION_TOKEN not set"
echo ""
echo "To download the database:"
echo "1. Register free at https://lite.ip2location.com/"
echo "2. Get your download token from the dashboard"
echo "3. Run: IP2LOCATION_TOKEN=your_token ./scripts/download-ip2location.sh"
echo ""
exit 1
fi
# Download DB3.LITE (IPv4 + City)
DOWNLOAD_URL="https://www.ip2location.com/download/?token=${IP2LOCATION_TOKEN}&file=DB3LITEBIN"
echo "Downloading from IP2Location..."
curl -L -o ip2location.zip "$DOWNLOAD_URL"
echo "Extracting..."
unzip -o ip2location.zip
# Rename to standard name
if [ -f "IP2LOCATION-LITE-DB3.BIN" ]; then
echo "Database ready: $DATA_DIR/IP2LOCATION-LITE-DB3.BIN"
elif [ -f "IP-COUNTRY-REGION-CITY.BIN" ]; then
mv "IP-COUNTRY-REGION-CITY.BIN" "$DB_FILE"
echo "Database ready: $DATA_DIR/$DB_FILE"
else
# Find whatever BIN file was extracted
BIN_FILE=$(ls *.BIN 2>/dev/null | head -1)
if [ -n "$BIN_FILE" ]; then
mv "$BIN_FILE" "$DB_FILE"
echo "Database ready: $DATA_DIR/$DB_FILE"
else
echo "ERROR: No BIN file found in archive"
ls -la
exit 1
fi
fi
# Cleanup
rm -f ip2location.zip *.txt LICENSE* README*
echo ""
echo "Done! Database saved to: $DATA_DIR/$DB_FILE"
echo "Update monthly by re-running this script."

View File

@@ -29,6 +29,11 @@ const TRUSTED_ORIGINS = [
'http://localhost:5173',
];
// Pattern-based trusted origins (wildcards)
const TRUSTED_ORIGIN_PATTERNS = [
/^https:\/\/.*\.cannabrands\.app$/, // *.cannabrands.app
];
// Trusted IPs for internal pod-to-pod communication
const TRUSTED_IPS = [
'127.0.0.1',
@@ -42,8 +47,16 @@ const TRUSTED_IPS = [
function isTrustedRequest(req: Request): boolean {
// Check origin header
const origin = req.headers.origin;
if (origin && TRUSTED_ORIGINS.includes(origin)) {
return true;
if (origin) {
if (TRUSTED_ORIGINS.includes(origin)) {
return true;
}
// Check pattern-based origins (wildcards like *.cannabrands.app)
for (const pattern of TRUSTED_ORIGIN_PATTERNS) {
if (pattern.test(origin)) {
return true;
}
}
}
// Check referer header (for same-origin requests without CORS)
@@ -54,6 +67,18 @@ function isTrustedRequest(req: Request): boolean {
return true;
}
}
// Check pattern-based referers
try {
const refererUrl = new URL(referer);
const refererOrigin = refererUrl.origin;
for (const pattern of TRUSTED_ORIGIN_PATTERNS) {
if (pattern.test(refererOrigin)) {
return true;
}
}
} catch {
// Invalid referer URL, skip
}
}
// Check IP for internal requests (pod-to-pod, localhost)

View File

@@ -0,0 +1,141 @@
/**
* Auto-Migration System
*
* Runs SQL migration files from the migrations/ folder automatically on server startup.
* Uses a schema_migrations table to track which migrations have been applied.
*
* Safe to run multiple times - only applies new migrations.
*/
import { Pool } from 'pg';
import fs from 'fs';
import path from 'path';
const MIGRATIONS_DIR = path.join(__dirname, '../../migrations');
/**
* Ensure schema_migrations table exists
*/
async function ensureMigrationsTable(pool: Pool): Promise<void> {
await pool.query(`
CREATE TABLE IF NOT EXISTS schema_migrations (
id SERIAL PRIMARY KEY,
name VARCHAR(255) UNIQUE NOT NULL,
applied_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
)
`);
}
/**
* Get list of already-applied migrations
*/
async function getAppliedMigrations(pool: Pool): Promise<Set<string>> {
const result = await pool.query('SELECT name FROM schema_migrations');
return new Set(result.rows.map(row => row.name));
}
/**
* Get list of migration files from disk
*/
function getMigrationFiles(): string[] {
if (!fs.existsSync(MIGRATIONS_DIR)) {
console.log('[AutoMigrate] No migrations directory found');
return [];
}
return fs.readdirSync(MIGRATIONS_DIR)
.filter(f => f.endsWith('.sql'))
.sort(); // Sort alphabetically (001_, 002_, etc.)
}
/**
* Run a single migration file
*/
async function runMigration(pool: Pool, filename: string): Promise<void> {
const filepath = path.join(MIGRATIONS_DIR, filename);
const sql = fs.readFileSync(filepath, 'utf8');
const client = await pool.connect();
try {
await client.query('BEGIN');
// Run the migration SQL
await client.query(sql);
// Record that this migration was applied
await client.query(
'INSERT INTO schema_migrations (name) VALUES ($1) ON CONFLICT (name) DO NOTHING',
[filename]
);
await client.query('COMMIT');
console.log(`[AutoMigrate] ✓ Applied: ${filename}`);
} catch (error: any) {
await client.query('ROLLBACK');
console.error(`[AutoMigrate] ✗ Failed: ${filename}`);
throw error;
} finally {
client.release();
}
}
/**
* Run all pending migrations
*
* @param pool - Database connection pool
* @returns Number of migrations applied
*/
export async function runAutoMigrations(pool: Pool): Promise<number> {
console.log('[AutoMigrate] Checking for pending migrations...');
try {
// Ensure migrations table exists
await ensureMigrationsTable(pool);
// Get applied and available migrations
const applied = await getAppliedMigrations(pool);
const available = getMigrationFiles();
// Find pending migrations
const pending = available.filter(f => !applied.has(f));
if (pending.length === 0) {
console.log('[AutoMigrate] No pending migrations');
return 0;
}
console.log(`[AutoMigrate] Found ${pending.length} pending migrations`);
// Run each pending migration in order
for (const filename of pending) {
await runMigration(pool, filename);
}
console.log(`[AutoMigrate] Successfully applied ${pending.length} migrations`);
return pending.length;
} catch (error: any) {
console.error('[AutoMigrate] Migration failed:', error.message);
// Don't crash the server - log and continue
// The specific failing migration will have been rolled back
return -1;
}
}
/**
* Check migration status without running anything
*/
export async function checkMigrationStatus(pool: Pool): Promise<{
applied: string[];
pending: string[];
}> {
await ensureMigrationsTable(pool);
const applied = await getAppliedMigrations(pool);
const available = getMigrationFiles();
return {
applied: available.filter(f => applied.has(f)),
pending: available.filter(f => !applied.has(f)),
};
}
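A hypothetical way to surface checkMigrationStatus from an admin or health route; the route path, file location, and import paths below are illustrative and not part of this change.

// Hypothetical admin endpoint; paths and mount point are assumptions.
import { Router } from 'express';
import { getPool } from './pool';
import { checkMigrationStatus } from './auto-migrate';

const router = Router();

router.get('/migrations', async (_req, res) => {
  const status = await checkMigrationStatus(getPool());
  res.json(status); // { applied: [...], pending: [...] }
});

export default router;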

View File

@@ -0,0 +1,200 @@
#!/usr/bin/env npx tsx
/**
* Database Migration Runner
*
* Runs SQL migrations from backend/migrations/*.sql in order.
* Tracks applied migrations in schema_migrations table.
*
* Usage:
* npx tsx src/db/run-migrations.ts
*
* Environment:
* DATABASE_URL or CANNAIQ_DB_* variables
*/
import { Pool } from 'pg';
import * as fs from 'fs/promises';
import * as path from 'path';
import dotenv from 'dotenv';
dotenv.config();
function getConnectionString(): string {
if (process.env.DATABASE_URL) {
return process.env.DATABASE_URL;
}
if (process.env.CANNAIQ_DB_URL) {
return process.env.CANNAIQ_DB_URL;
}
const host = process.env.CANNAIQ_DB_HOST || 'localhost';
const port = process.env.CANNAIQ_DB_PORT || '54320';
const name = process.env.CANNAIQ_DB_NAME || 'dutchie_menus';
const user = process.env.CANNAIQ_DB_USER || 'dutchie';
const pass = process.env.CANNAIQ_DB_PASS || 'dutchie_local_pass';
return `postgresql://${user}:${pass}@${host}:${port}/${name}`;
}
interface MigrationFile {
filename: string;
number: number;
path: string;
}
async function getMigrationFiles(migrationsDir: string): Promise<MigrationFile[]> {
const files = await fs.readdir(migrationsDir);
const migrations: MigrationFile[] = files
.filter(f => f.endsWith('.sql'))
.map(filename => {
// Extract number from filename like "005_api_tokens.sql" or "073_proxy_timezone.sql"
const match = filename.match(/^(\d+)_/);
if (!match) return null;
return {
filename,
number: parseInt(match[1], 10),
path: path.join(migrationsDir, filename),
};
})
.filter((m): m is MigrationFile => m !== null)
.sort((a, b) => a.number - b.number);
return migrations;
}
async function ensureMigrationsTable(pool: Pool): Promise<void> {
// Migrate to filename-based tracking (handles duplicate version numbers)
// Check if old version-based PK exists
const pkCheck = await pool.query(`
SELECT constraint_name FROM information_schema.table_constraints
WHERE table_name = 'schema_migrations' AND constraint_type = 'PRIMARY KEY'
`);
if (pkCheck.rows.length === 0) {
// Table doesn't exist, create with filename as PK
await pool.query(`
CREATE TABLE IF NOT EXISTS schema_migrations (
filename VARCHAR(255) NOT NULL PRIMARY KEY,
version VARCHAR(10),
name VARCHAR(255),
applied_at TIMESTAMPTZ DEFAULT NOW()
)
`);
} else {
// Table exists - add filename column if missing
await pool.query(`
ALTER TABLE schema_migrations ADD COLUMN IF NOT EXISTS filename VARCHAR(255)
`);
// Populate filename from version+name for existing rows
await pool.query(`
UPDATE schema_migrations SET filename = version || '_' || name || '.sql'
WHERE filename IS NULL
`);
}
}
async function getAppliedMigrations(pool: Pool): Promise<Set<string>> {
// Try filename first, fall back to version_name combo
const result = await pool.query(`
SELECT COALESCE(filename, version || '_' || name || '.sql') as filename
FROM schema_migrations
`);
return new Set(result.rows.map(r => r.filename));
}
async function applyMigration(pool: Pool, migration: MigrationFile): Promise<void> {
const sql = await fs.readFile(migration.path, 'utf-8');
// Extract version and name from filename like "005_api_tokens.sql"
const version = String(migration.number).padStart(3, '0');
const name = migration.filename.replace(/^\d+_/, '').replace(/\.sql$/, '');
const client = await pool.connect();
try {
await client.query('BEGIN');
// Run the migration SQL
await client.query(sql);
// Record that it was applied - use INSERT with ON CONFLICT for safety
await client.query(`
INSERT INTO schema_migrations (filename, version, name)
VALUES ($1, $2, $3)
ON CONFLICT DO NOTHING
`, [migration.filename, version, name]);
await client.query('COMMIT');
} catch (error) {
await client.query('ROLLBACK');
throw error;
} finally {
client.release();
}
}
async function main() {
const pool = new Pool({ connectionString: getConnectionString() });
// Migrations directory relative to this file
const migrationsDir = path.resolve(__dirname, '../../migrations');
console.log('╔════════════════════════════════════════════════════════════╗');
console.log('║ DATABASE MIGRATION RUNNER ║');
console.log('╚════════════════════════════════════════════════════════════╝');
console.log(`Migrations dir: ${migrationsDir}`);
console.log('');
try {
// Ensure tracking table exists
await ensureMigrationsTable(pool);
// Get all migration files
const allMigrations = await getMigrationFiles(migrationsDir);
console.log(`Found ${allMigrations.length} migration files`);
// Get already-applied migrations
const applied = await getAppliedMigrations(pool);
console.log(`Already applied: ${applied.size} migrations`);
console.log('');
// Find pending migrations (compare by filename)
const pending = allMigrations.filter(m => !applied.has(m.filename));
if (pending.length === 0) {
console.log('✅ No pending migrations. Database is up to date.');
await pool.end();
return;
}
console.log(`Pending migrations: ${pending.length}`);
console.log('─'.repeat(60));
// Apply each pending migration
for (const migration of pending) {
process.stdout.write(` ${migration.filename}... `);
try {
await applyMigration(pool, migration);
console.log('✅');
} catch (error: any) {
console.log('❌');
console.error(`\nError applying ${migration.filename}:`);
console.error(error.message);
process.exit(1);
}
}
console.log('');
console.log('═'.repeat(60));
console.log(`✅ Applied ${pending.length} migrations successfully`);
} catch (error: any) {
console.error('Migration runner failed:', error.message);
process.exit(1);
} finally {
await pool.end();
}
}
main();

View File

@@ -191,6 +191,23 @@ export async function runFullDiscovery(
}
}
// Step 5: Detect dropped stores (in DB but not in discovery results)
if (!dryRun) {
console.log('\n[Discovery] Step 5: Detecting dropped stores...');
const droppedResult = await detectDroppedStores(pool, stateCode);
if (droppedResult.droppedCount > 0) {
console.log(`[Discovery] Found ${droppedResult.droppedCount} dropped stores:`);
droppedResult.droppedStores.slice(0, 10).forEach(s => {
console.log(` - ${s.name} (${s.city}, ${s.state}) - last seen: ${s.lastSeenAt}`);
});
if (droppedResult.droppedCount > 10) {
console.log(` ... and ${droppedResult.droppedCount - 10} more`);
}
} else {
console.log(`[Discovery] No dropped stores detected`);
}
}
return {
cities: cityResult,
locations: locationResults,
@@ -200,6 +217,107 @@ export async function runFullDiscovery(
};
}
// ============================================================
// DROPPED STORE DETECTION
// ============================================================
export interface DroppedStoreResult {
droppedCount: number;
droppedStores: Array<{
id: number;
name: string;
city: string;
state: string;
platformDispensaryId: string;
lastSeenAt: string;
}>;
}
/**
* Detect stores that exist in dispensaries but were not found in discovery.
* Marks them as status='dropped' for manual review.
*
* A store is considered "dropped" if:
* 1. It has a platform_dispensary_id (was verified via Dutchie)
* 2. It was NOT seen in the latest discovery crawl (last_seen_at in discovery < 24h ago)
* 3. It's currently marked as 'open' status
*/
export async function detectDroppedStores(
pool: Pool,
stateCode?: string
): Promise<DroppedStoreResult> {
// Find dispensaries that:
// 1. Have platform_dispensary_id (verified Dutchie stores)
// 2. Are currently 'open' status
// 3. Have a linked discovery record that wasn't seen in the last discovery run
// (last_seen_at in dutchie_discovery_locations is older than 24 hours)
const params: any[] = [];
let stateFilter = '';
if (stateCode) {
stateFilter = ` AND d.state = $1`;
params.push(stateCode);
}
const query = `
WITH recently_seen AS (
SELECT DISTINCT platform_location_id
FROM dutchie_discovery_locations
WHERE last_seen_at > NOW() - INTERVAL '24 hours'
AND active = true
)
SELECT
d.id,
d.name,
d.city,
d.state,
d.platform_dispensary_id,
d.updated_at as last_seen_at
FROM dispensaries d
WHERE d.platform_dispensary_id IS NOT NULL
AND d.platform = 'dutchie'
AND (d.status = 'open' OR d.status IS NULL)
AND d.crawl_enabled = true
AND d.platform_dispensary_id NOT IN (SELECT platform_location_id FROM recently_seen)
${stateFilter}
ORDER BY d.name
`;
const result = await pool.query(query, params);
const droppedStores = result.rows;
// Mark these stores as 'dropped' status
if (droppedStores.length > 0) {
const ids = droppedStores.map(s => s.id);
await pool.query(`
UPDATE dispensaries
SET status = 'dropped', updated_at = NOW()
WHERE id = ANY($1::int[])
`, [ids]);
// Log to promotion log for audit
for (const store of droppedStores) {
await pool.query(`
INSERT INTO dutchie_promotion_log
(dispensary_id, action, state_code, store_name, triggered_by)
VALUES ($1, 'dropped', $2, $3, 'discovery_detection')
`, [store.id, store.state, store.name]);
}
}
return {
droppedCount: droppedStores.length,
droppedStores: droppedStores.map(s => ({
id: s.id,
name: s.name,
city: s.city,
state: s.state,
platformDispensaryId: s.platform_dispensary_id,
lastSeenAt: s.last_seen_at,
})),
};
}
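// Illustrative follow-up (not part of this change): dropped stores stay in
// status='dropped' for manual review, so restoring one is just a status flip:
//
//   await pool.query(
//     `UPDATE dispensaries SET status = 'open', updated_at = NOW() WHERE id = $1`,
//     [dispensaryId]
//   );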
// ============================================================
// SINGLE CITY DISCOVERY
// ============================================================

View File

@@ -16,6 +16,12 @@ import {
NormalizedBrand,
NormalizationResult,
} from './types';
import {
downloadProductImage,
ProductImageContext,
isImageStorageReady,
LocalImageSizes,
} from '../utils/image-storage';
const BATCH_SIZE = 100;
@@ -23,10 +29,21 @@ const BATCH_SIZE = 100;
// PRODUCT UPSERTS
// ============================================================
export interface NewProductInfo {
id: number; // store_products.id
externalProductId: string; // provider_product_id
name: string;
brandName: string | null;
primaryImageUrl: string | null;
hasLocalImage?: boolean; // True if local_image_path is already set
}
export interface UpsertProductsResult {
upserted: number;
new: number;
updated: number;
newProducts: NewProductInfo[]; // Details of newly created products
productsNeedingImages: NewProductInfo[]; // Products (new or updated) that need image downloads
}
/**
@@ -41,12 +58,14 @@ export async function upsertStoreProducts(
options: { dryRun?: boolean } = {}
): Promise<UpsertProductsResult> {
if (products.length === 0) {
return { upserted: 0, new: 0, updated: 0 };
return { upserted: 0, new: 0, updated: 0, newProducts: [], productsNeedingImages: [] };
}
const { dryRun = false } = options;
let newCount = 0;
let updatedCount = 0;
const newProducts: NewProductInfo[] = [];
const productsNeedingImages: NewProductInfo[] = [];
// Process in batches
for (let i = 0; i < products.length; i += BATCH_SIZE) {
@@ -104,7 +123,7 @@ export async function upsertStoreProducts(
image_url = EXCLUDED.image_url,
last_seen_at = NOW(),
updated_at = NOW()
RETURNING (xmax = 0) as is_new`,
RETURNING id, (xmax = 0) as is_new, (local_image_path IS NOT NULL) as has_local_image`,
[
product.dispensaryId,
product.platform,
@@ -129,10 +148,30 @@ export async function upsertStoreProducts(
]
);
if (result.rows[0]?.is_new) {
const row = result.rows[0];
const productInfo: NewProductInfo = {
id: row.id,
externalProductId: product.externalProductId,
name: product.name,
brandName: product.brandName,
primaryImageUrl: product.primaryImageUrl,
hasLocalImage: row.has_local_image,
};
if (row.is_new) {
newCount++;
// Track new products
newProducts.push(productInfo);
// New products always need images (if they have a source URL)
if (product.primaryImageUrl && !row.has_local_image) {
productsNeedingImages.push(productInfo);
}
} else {
updatedCount++;
// Updated products need images only if they don't have a local image yet
if (product.primaryImageUrl && !row.has_local_image) {
productsNeedingImages.push(productInfo);
}
}
}
@@ -149,6 +188,8 @@ export async function upsertStoreProducts(
upserted: newCount + updatedCount,
new: newCount,
updated: updatedCount,
newProducts,
productsNeedingImages,
};
}
@@ -564,6 +605,19 @@ export async function upsertBrands(
// FULL HYDRATION
// ============================================================
export interface ImageDownloadResult {
downloaded: number;
skipped: number;
failed: number;
bytesTotal: number;
}
export interface DispensaryContext {
stateCode: string;
storeSlug: string;
hasExistingProducts?: boolean; // True if store already has products with local images
}
export interface HydratePayloadResult {
productsUpserted: number;
productsNew: number;
@@ -574,6 +628,154 @@ export interface HydratePayloadResult {
variantsUpserted: number;
variantsNew: number;
variantSnapshotsCreated: number;
imagesDownloaded: number;
imagesSkipped: number;
imagesFailed: number;
imagesBytesTotal: number;
}
/**
* Helper to create slug from string
*/
function slugify(str: string): string {
return str
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-+|-+$/g, '')
.substring(0, 50) || 'unknown';
}
/**
* Download images for new products and update their local paths
*/
export async function downloadProductImages(
pool: Pool,
newProducts: NewProductInfo[],
dispensaryContext: DispensaryContext,
options: { dryRun?: boolean; concurrency?: number } = {}
): Promise<ImageDownloadResult> {
const { dryRun = false, concurrency = 5 } = options;
// Filter products that have images to download
const productsWithImages = newProducts.filter(p => p.primaryImageUrl);
if (productsWithImages.length === 0) {
return { downloaded: 0, skipped: 0, failed: 0, bytesTotal: 0 };
}
// Check if image storage is ready
if (!isImageStorageReady()) {
console.warn('[ImageDownload] Image storage not initialized, skipping downloads');
return { downloaded: 0, skipped: productsWithImages.length, failed: 0, bytesTotal: 0 };
}
if (dryRun) {
console.log(`[DryRun] Would download ${productsWithImages.length} images`);
return { downloaded: 0, skipped: productsWithImages.length, failed: 0, bytesTotal: 0 };
}
let downloaded = 0;
let skipped = 0;
let failed = 0;
let bytesTotal = 0;
// Process in batches with concurrency limit
for (let i = 0; i < productsWithImages.length; i += concurrency) {
const batch = productsWithImages.slice(i, i + concurrency);
const results = await Promise.allSettled(
batch.map(async (product) => {
const ctx: ProductImageContext = {
stateCode: dispensaryContext.stateCode,
storeSlug: dispensaryContext.storeSlug,
brandSlug: slugify(product.brandName || 'unknown'),
productId: product.externalProductId,
};
const result = await downloadProductImage(product.primaryImageUrl!, ctx, { skipIfExists: true });
if (result.success) {
// Update the database with local image path
const imagesJson = JSON.stringify({
full: result.urls!.full,
medium: result.urls!.medium,
thumb: result.urls!.thumb,
});
await pool.query(
`UPDATE store_products
SET local_image_path = $1, images = $2
WHERE id = $3`,
[result.urls!.full, imagesJson, product.id]
);
}
return result;
})
);
for (const result of results) {
if (result.status === 'fulfilled') {
const downloadResult = result.value;
if (downloadResult.success) {
if (downloadResult.skipped) {
skipped++;
} else {
downloaded++;
bytesTotal += downloadResult.bytesDownloaded || 0;
}
} else {
failed++;
console.warn(`[ImageDownload] Failed: ${downloadResult.error}`);
}
} else {
failed++;
console.error(`[ImageDownload] Error:`, result.reason);
}
}
}
console.log(`[ImageDownload] Downloaded: ${downloaded}, Skipped: ${skipped}, Failed: ${failed}, Bytes: ${bytesTotal}`);
return { downloaded, skipped, failed, bytesTotal };
}
/**
* Get dispensary context for image paths
* Also checks if this dispensary already has products with local images
* to skip unnecessary filesystem checks for existing stores
*/
async function getDispensaryContext(pool: Pool, dispensaryId: number): Promise<DispensaryContext | null> {
try {
const result = await pool.query(
`SELECT
d.state,
d.slug,
d.name,
EXISTS(
SELECT 1 FROM store_products sp
WHERE sp.dispensary_id = d.id
AND sp.local_image_path IS NOT NULL
LIMIT 1
) as has_local_images
FROM dispensaries d
WHERE d.id = $1`,
[dispensaryId]
);
if (result.rows.length === 0) {
return null;
}
const row = result.rows[0];
return {
stateCode: row.state || 'unknown',
storeSlug: row.slug || slugify(row.name || `store-${dispensaryId}`),
hasExistingProducts: row.has_local_images,
};
} catch (error) {
console.error('[getDispensaryContext] Error:', error);
return null;
}
}
/**
@@ -584,9 +786,9 @@ export async function hydrateToCanonical(
dispensaryId: number,
normResult: NormalizationResult,
crawlRunId: number | null,
options: { dryRun?: boolean } = {}
options: { dryRun?: boolean; downloadImages?: boolean } = {}
): Promise<HydratePayloadResult> {
const { dryRun = false } = options;
const { dryRun = false, downloadImages: shouldDownloadImages = true } = options;
// 1. Upsert brands
const brandResult = await upsertBrands(pool, normResult.brands, { dryRun });
@@ -634,6 +836,36 @@ export async function hydrateToCanonical(
{ dryRun }
);
// 6. Download images for products that need them
// This includes:
// - New products (always need images)
// - Updated products that don't have local images yet (backfill)
// This avoids:
// - Filesystem checks for products that already have local images
// - Unnecessary HTTP requests for products with existing images
let imageResult: ImageDownloadResult = { downloaded: 0, skipped: 0, failed: 0, bytesTotal: 0 };
if (shouldDownloadImages && productResult.productsNeedingImages.length > 0) {
const dispensaryContext = await getDispensaryContext(pool, dispensaryId);
if (dispensaryContext) {
const newIds = new Set(productResult.newProducts.map(p => p.id));
const newCount = productResult.productsNeedingImages.filter(p => newIds.has(p.id)).length;
const backfillCount = productResult.productsNeedingImages.length - newCount;
console.log(`[Hydration] Downloading images for ${productResult.productsNeedingImages.length} products (${newCount} new, ${backfillCount} backfill)...`);
imageResult = await downloadProductImages(
pool,
productResult.productsNeedingImages,
dispensaryContext,
{ dryRun }
);
} else {
console.warn(`[Hydration] Could not get dispensary context for ID ${dispensaryId}, skipping image downloads`);
}
} else if (productResult.productsNeedingImages.length === 0 && productResult.upserted > 0) {
// All products already have local images
console.log(`[Hydration] All ${productResult.upserted} products already have local images, skipping downloads`);
}
return {
productsUpserted: productResult.upserted,
productsNew: productResult.new,
@@ -644,5 +876,9 @@ export async function hydrateToCanonical(
variantsUpserted: variantResult.upserted,
variantsNew: variantResult.new,
variantSnapshotsCreated: variantResult.snapshotsCreated,
imagesDownloaded: imageResult.downloaded,
imagesSkipped: imageResult.skipped,
imagesFailed: imageResult.failed,
imagesBytesTotal: imageResult.bytesTotal,
};
}
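A minimal sketch of calling hydrateToCanonical with the new image option; the pool, dispensaryId, normResult, and crawlRunId values would come from the surrounding crawl pipeline and are assumed here.

// Sketch only: caller names are assumptions, not code from this change.
const result = await hydrateToCanonical(pool, dispensaryId, normResult, crawlRunId, {
  dryRun: false,
  downloadImages: true, // set false to skip the new image-download step
});
console.log(
  `[Hydration] products=${result.productsUpserted} images=${result.imagesDownloaded}` +
  ` skipped=${result.imagesSkipped} failed=${result.imagesFailed}`
);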

View File

@@ -6,7 +6,10 @@ import { initializeMinio, isMinioEnabled } from './utils/minio';
import { initializeImageStorage } from './utils/image-storage';
import { logger } from './services/logger';
import { cleanupOrphanedJobs } from './services/proxyTestQueue';
import { runAutoMigrations } from './db/auto-migrate';
import { getPool } from './db/pool';
import healthRoutes from './routes/health';
import imageProxyRoutes from './routes/image-proxy';
dotenv.config();
@@ -29,6 +32,10 @@ app.use(express.json());
const LOCAL_IMAGES_PATH = process.env.LOCAL_IMAGES_PATH || './public/images';
app.use('/images', express.static(LOCAL_IMAGES_PATH));
// Image proxy with on-demand resizing
// Usage: /img/products/az/store/brand/product/image.webp?w=200&h=200
app.use('/img', imageProxyRoutes);
// Serve static downloads (plugin files, etc.)
// Uses ./public/downloads relative to working directory (works for both Docker and local dev)
const LOCAL_DOWNLOADS_PATH = process.env.LOCAL_DOWNLOADS_PATH || './public/downloads';
@@ -102,6 +109,7 @@ import apiPermissionsRoutes from './routes/api-permissions';
import parallelScrapeRoutes from './routes/parallel-scrape';
import crawlerSandboxRoutes from './routes/crawler-sandbox';
import versionRoutes from './routes/version';
import deployStatusRoutes from './routes/deploy-status';
import publicApiRoutes from './routes/public-api';
import usersRoutes from './routes/users';
import staleProcessesRoutes from './routes/stale-processes';
@@ -121,7 +129,6 @@ import { createStatesRouter } from './routes/states';
import { createAnalyticsV2Router } from './routes/analytics-v2';
import { createDiscoveryRoutes } from './discovery';
import pipelineRoutes from './routes/pipeline';
import { getPool } from './db/pool';
// Consumer API routes (findadispo.com, findagram.co)
import consumerAuthRoutes from './routes/consumer-auth';
@@ -133,6 +140,8 @@ import eventsRoutes from './routes/events';
import clickAnalyticsRoutes from './routes/click-analytics';
import seoRoutes from './routes/seo';
import priceAnalyticsRoutes from './routes/price-analytics';
import tasksRoutes from './routes/tasks';
import workerRegistryRoutes from './routes/worker-registry';
// Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com)
// These domains can access the API without authentication
@@ -175,6 +184,8 @@ app.use('/api/api-permissions', apiPermissionsRoutes);
app.use('/api/parallel-scrape', parallelScrapeRoutes);
app.use('/api/crawler-sandbox', crawlerSandboxRoutes);
app.use('/api/version', versionRoutes);
app.use('/api/admin/deploy-status', deployStatusRoutes);
console.log('[DeployStatus] Routes registered at /api/admin/deploy-status');
app.use('/api/users', usersRoutes);
app.use('/api/stale-processes', staleProcessesRoutes);
// Admin routes - orchestrator actions
@@ -203,6 +214,14 @@ app.use('/api/monitor', workersRoutes);
app.use('/api/job-queue', jobQueueRoutes);
console.log('[Workers] Routes registered at /api/workers, /api/monitor, and /api/job-queue');
// Task queue management - worker tasks with capacity planning
app.use('/api/tasks', tasksRoutes);
console.log('[Tasks] Routes registered at /api/tasks');
// Worker registry - dynamic worker registration, heartbeats, and name management
app.use('/api/worker-registry', workerRegistryRoutes);
console.log('[WorkerRegistry] Routes registered at /api/worker-registry');
// Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation
try {
const analyticsV2Router = createAnalyticsV2Router(getPool());
@@ -289,6 +308,17 @@ async function startServer() {
try {
logger.info('system', 'Starting server...');
// Run auto-migrations before anything else
const pool = getPool();
const migrationsApplied = await runAutoMigrations(pool);
if (migrationsApplied > 0) {
logger.info('system', `Applied ${migrationsApplied} database migrations`);
} else if (migrationsApplied === 0) {
logger.info('system', 'Database schema up to date');
} else {
logger.warn('system', 'Some migrations failed - check logs');
}
await initializeMinio();
await initializeImageStorage();
logger.info('system', isMinioEnabled() ? 'MinIO storage initialized' : 'Local filesystem storage initialized');

View File

@@ -213,7 +213,24 @@ const FINGERPRINTS: Fingerprint[] = [
let currentFingerprintIndex = 0;
// Forward declaration for session (actual CrawlSession interface defined later)
let currentSession: {
sessionId: string;
fingerprint: Fingerprint;
proxyUrl: string | null;
stateCode?: string;
timezone?: string;
startedAt: Date;
} | null = null;
/**
* Get current fingerprint - returns session fingerprint if active, otherwise default
*/
export function getFingerprint(): Fingerprint {
// Use session fingerprint if a session is active
if (currentSession) {
return currentSession.fingerprint;
}
return FINGERPRINTS[currentFingerprintIndex];
}
@@ -228,6 +245,103 @@ export function resetFingerprint(): void {
currentFingerprintIndex = 0;
}
/**
* Get a random fingerprint from the pool
*/
export function getRandomFingerprint(): Fingerprint {
const index = Math.floor(Math.random() * FINGERPRINTS.length);
return FINGERPRINTS[index];
}
// ============================================================
// SESSION MANAGEMENT
// Per-session fingerprint rotation for stealth
// ============================================================
export interface CrawlSession {
sessionId: string;
fingerprint: Fingerprint;
proxyUrl: string | null;
stateCode?: string;
timezone?: string;
startedAt: Date;
}
// Note: currentSession variable declared earlier in file for proper scoping
/**
* Timezone to Accept-Language mapping
* US timezones all use en-US but this can be extended for international
*/
const TIMEZONE_TO_LOCALE: Record<string, string> = {
'America/Phoenix': 'en-US,en;q=0.9',
'America/Los_Angeles': 'en-US,en;q=0.9',
'America/Denver': 'en-US,en;q=0.9',
'America/Chicago': 'en-US,en;q=0.9',
'America/New_York': 'en-US,en;q=0.9',
'America/Detroit': 'en-US,en;q=0.9',
'America/Anchorage': 'en-US,en;q=0.9',
'Pacific/Honolulu': 'en-US,en;q=0.9',
};
/**
* Get Accept-Language header for a given timezone
*/
export function getLocaleForTimezone(timezone?: string): string {
if (!timezone) return 'en-US,en;q=0.9';
return TIMEZONE_TO_LOCALE[timezone] || 'en-US,en;q=0.9';
}
/**
* Start a new crawl session with a random fingerprint
* Call this before crawling a store to get a fresh identity
*/
export function startSession(stateCode?: string, timezone?: string): CrawlSession {
const baseFp = getRandomFingerprint();
// Override Accept-Language based on timezone for geographic consistency
const fingerprint: Fingerprint = {
...baseFp,
acceptLanguage: getLocaleForTimezone(timezone),
};
currentSession = {
sessionId: `session_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`,
fingerprint,
proxyUrl: currentProxy,
stateCode,
timezone,
startedAt: new Date(),
};
console.log(`[Dutchie Client] Started session ${currentSession.sessionId}`);
console.log(`[Dutchie Client] Fingerprint: ${fingerprint.userAgent.slice(0, 50)}...`);
console.log(`[Dutchie Client] Accept-Language: ${fingerprint.acceptLanguage}`);
if (timezone) {
console.log(`[Dutchie Client] Timezone: ${timezone}`);
}
return currentSession;
}
/**
* End the current crawl session
*/
export function endSession(): void {
if (currentSession) {
const duration = Math.round((Date.now() - currentSession.startedAt.getTime()) / 1000);
console.log(`[Dutchie Client] Ended session ${currentSession.sessionId} (${duration}s)`);
currentSession = null;
}
}
/**
* Get current active session
*/
export function getCurrentSession(): CrawlSession | null {
return currentSession;
}
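// Illustrative usage (not part of this change): wrap each store crawl in a
// session so it gets a fresh fingerprint matched to the store's timezone:
//
//   startSession('AZ', 'America/Phoenix');
//   try {
//     // ... executeGraphQL / fetchPage calls for this store ...
//   } finally {
//     endSession();
//   }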
// ============================================================
// CURL HTTP CLIENT
// ============================================================
@@ -420,7 +534,8 @@ export async function executeGraphQL(
}
if (response.status === 403 && retryOn403) {
console.warn(`[Dutchie Client] 403 blocked - rotating fingerprint...`);
console.warn(`[Dutchie Client] 403 blocked - rotating proxy and fingerprint...`);
await rotateProxyOn403('403 Forbidden on GraphQL');
rotateFingerprint();
attempt++;
await sleep(1000 * attempt);
@@ -503,7 +618,8 @@ export async function fetchPage(
}
if (response.status === 403 && retryOn403) {
console.warn(`[Dutchie Client] 403 blocked - rotating fingerprint...`);
console.warn(`[Dutchie Client] 403 blocked - rotating proxy and fingerprint...`);
await rotateProxyOn403('403 Forbidden on page fetch');
rotateFingerprint();
attempt++;
await sleep(1000 * attempt);

View File

@@ -18,6 +18,13 @@ export {
getFingerprint,
rotateFingerprint,
resetFingerprint,
getRandomFingerprint,
getLocaleForTimezone,
// Session Management (per-store fingerprint rotation)
startSession,
endSession,
getCurrentSession,
// Proxy
setProxy,
@@ -32,6 +39,7 @@ export {
// Types
type CurlResponse,
type Fingerprint,
type CrawlSession,
type ExecuteGraphQLOptions,
type FetchPageOptions,
} from './client';

View File

@@ -5,31 +5,35 @@ import { pool } from '../db/pool';
const router = Router();
router.use(authMiddleware);
// Get categories (flat list)
// Get categories (flat list) - derived from actual product data
router.get('/', async (req, res) => {
try {
const { store_id } = req.query;
const { store_id, in_stock_only } = req.query;
let query = `
SELECT
c.*,
COUNT(DISTINCT p.id) as product_count,
pc.name as parent_name
FROM categories c
LEFT JOIN store_products p ON c.name = p.category_raw
LEFT JOIN categories pc ON c.parent_id = pc.id
category_raw as name,
category_raw as slug,
COUNT(*) as product_count,
COUNT(*) FILTER (WHERE is_in_stock = true) as in_stock_count
FROM store_products
WHERE category_raw IS NOT NULL
`;
const params: any[] = [];
if (store_id) {
query += ' WHERE c.store_id = $1';
params.push(store_id);
query += ` AND dispensary_id = $${params.length}`;
}
if (in_stock_only === 'true') {
query += ` AND is_in_stock = true`;
}
query += `
GROUP BY c.id, pc.name
ORDER BY c.display_order, c.name
GROUP BY category_raw
ORDER BY category_raw
`;
const result = await pool.query(query, params);
@@ -40,49 +44,85 @@ router.get('/', async (req, res) => {
}
});
// Get category tree (hierarchical)
// Get category tree (hierarchical) - category -> subcategory structure from product data
router.get('/tree', async (req, res) => {
try {
const { store_id } = req.query;
const { store_id, in_stock_only } = req.query;
if (!store_id) {
return res.status(400).json({ error: 'store_id is required' });
// Get category + subcategory combinations with counts
let query = `
SELECT
category_raw as category,
subcategory_raw as subcategory,
COUNT(*) as product_count,
COUNT(*) FILTER (WHERE is_in_stock = true) as in_stock_count
FROM store_products
WHERE category_raw IS NOT NULL
`;
const params: any[] = [];
if (store_id) {
params.push(store_id);
query += ` AND dispensary_id = $${params.length}`;
}
// Get all categories for the store
const result = await pool.query(`
SELECT
c.*,
COUNT(DISTINCT p.id) as product_count
FROM categories c
LEFT JOIN store_products p ON c.name = p.category_raw AND p.is_in_stock = true AND p.dispensary_id = $1
WHERE c.store_id = $1
GROUP BY c.id
ORDER BY c.display_order, c.name
`, [store_id]);
if (in_stock_only === 'true') {
query += ` AND is_in_stock = true`;
}
// Build tree structure
const categories = result.rows;
const categoryMap = new Map();
const tree: any[] = [];
query += `
GROUP BY category_raw, subcategory_raw
ORDER BY category_raw, subcategory_raw
`;
// First pass: create map
categories.forEach((cat: { id: number; parent_id?: number }) => {
categoryMap.set(cat.id, { ...cat, children: [] });
});
const result = await pool.query(query, params);
// Second pass: build tree
categories.forEach((cat: { id: number; parent_id?: number }) => {
const node = categoryMap.get(cat.id);
if (cat.parent_id) {
const parent = categoryMap.get(cat.parent_id);
if (parent) {
parent.children.push(node);
}
} else {
tree.push(node);
// Build tree structure: category -> subcategories
const categoryMap = new Map<string, {
name: string;
slug: string;
product_count: number;
in_stock_count: number;
subcategories: Array<{
name: string;
slug: string;
product_count: number;
in_stock_count: number;
}>;
}>();
for (const row of result.rows) {
const category = row.category;
const subcategory = row.subcategory;
const count = parseInt(row.product_count);
const inStockCount = parseInt(row.in_stock_count);
if (!categoryMap.has(category)) {
categoryMap.set(category, {
name: category,
slug: category.toLowerCase().replace(/\s+/g, '-'),
product_count: 0,
in_stock_count: 0,
subcategories: []
});
}
});
const cat = categoryMap.get(category)!;
cat.product_count += count;
cat.in_stock_count += inStockCount;
if (subcategory) {
cat.subcategories.push({
name: subcategory,
slug: subcategory.toLowerCase().replace(/\s+/g, '-'),
product_count: count,
in_stock_count: inStockCount
});
}
}
const tree = Array.from(categoryMap.values());
res.json({ tree });
} catch (error) {
@@ -91,4 +131,91 @@ router.get('/tree', async (req, res) => {
}
});
// Get all unique subcategories for a category
router.get('/:category/subcategories', async (req, res) => {
try {
const { category } = req.params;
const { store_id, in_stock_only } = req.query;
let query = `
SELECT
subcategory_raw as name,
subcategory_raw as slug,
COUNT(*) as product_count,
COUNT(*) FILTER (WHERE is_in_stock = true) as in_stock_count
FROM store_products
WHERE category_raw = $1
AND subcategory_raw IS NOT NULL
`;
const params: any[] = [category];
if (store_id) {
params.push(store_id);
query += ` AND dispensary_id = $${params.length}`;
}
if (in_stock_only === 'true') {
query += ` AND is_in_stock = true`;
}
query += `
GROUP BY subcategory_raw
ORDER BY subcategory_raw
`;
const result = await pool.query(query, params);
res.json({
category,
subcategories: result.rows
});
} catch (error) {
console.error('Error fetching subcategories:', error);
res.status(500).json({ error: 'Failed to fetch subcategories' });
}
});
// Get global category summary (across all stores)
router.get('/summary', async (req, res) => {
try {
const { state } = req.query;
let query = `
SELECT
sp.category_raw as category,
COUNT(DISTINCT sp.id) as product_count,
COUNT(DISTINCT sp.dispensary_id) as store_count,
COUNT(*) FILTER (WHERE sp.is_in_stock = true) as in_stock_count
FROM store_products sp
`;
const params: any[] = [];
if (state) {
query += `
JOIN dispensaries d ON sp.dispensary_id = d.id
WHERE sp.category_raw IS NOT NULL
AND d.state = $1
`;
params.push(state);
} else {
query += ` WHERE sp.category_raw IS NOT NULL`;
}
query += `
GROUP BY sp.category_raw
ORDER BY product_count DESC
`;
const result = await pool.query(query, params);
res.json({
categories: result.rows,
total_categories: result.rows.length
});
} catch (error) {
console.error('Error fetching category summary:', error);
res.status(500).json({ error: 'Failed to fetch category summary' });
}
});
export default router;
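A minimal sketch of how a client might consume the tree endpoint above, assuming the router is mounted at /api/categories (the mount point is not shown in this diff) and a Node 18+ / browser global fetch:

// Sketch only: /api/categories mount point and field access are based on the handlers above.
async function loadCategoryTree(storeId: string): Promise<void> {
  const res = await fetch(`/api/categories/tree?store_id=${storeId}&in_stock_only=true`);
  const { tree } = await res.json();
  for (const cat of tree) {
    console.log(`${cat.name}: ${cat.in_stock_count}/${cat.product_count} in stock`);
    for (const sub of cat.subcategories) {
      console.log(`  ${sub.name}: ${sub.product_count}`);
    }
  }
}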

View File

@@ -0,0 +1,269 @@
import { Router, Request, Response } from 'express';
import axios from 'axios';
const router = Router();
// Woodpecker and Gitea API config - uses env vars or falls back to the defaults below
const WOODPECKER_SERVER = process.env.WOODPECKER_SERVER || 'https://ci.cannabrands.app';
const WOODPECKER_TOKEN = process.env.WOODPECKER_TOKEN;
const GITEA_SERVER = process.env.GITEA_SERVER || 'https://code.cannabrands.app';
const GITEA_TOKEN = process.env.GITEA_TOKEN;
const REPO_OWNER = 'Creationshop';
const REPO_NAME = 'dispensary-scraper';
interface PipelineStep {
name: string;
state: 'pending' | 'running' | 'success' | 'failure' | 'skipped';
started?: number;
stopped?: number;
}
interface PipelineInfo {
number: number;
status: string;
event: string;
branch: string;
message: string;
commit: string;
author: string;
created: number;
started?: number;
finished?: number;
steps?: PipelineStep[];
}
interface DeployStatusResponse {
running: {
sha: string;
sha_full: string;
build_time: string;
image_tag: string;
};
latest: {
sha: string;
sha_full: string;
message: string;
author: string;
timestamp: string;
} | null;
is_latest: boolean;
commits_behind: number;
pipeline: PipelineInfo | null;
error?: string;
}
/**
* Fetch latest commit from Gitea
*/
async function getLatestCommit(): Promise<{
sha: string;
message: string;
author: string;
timestamp: string;
} | null> {
if (!GITEA_TOKEN) {
console.warn('[DeployStatus] GITEA_TOKEN not set, skipping latest commit fetch');
return null;
}
try {
const response = await axios.get(
`${GITEA_SERVER}/api/v1/repos/${REPO_OWNER}/${REPO_NAME}/commits?limit=1`,
{
headers: { Authorization: `token ${GITEA_TOKEN}` },
timeout: 5000,
}
);
if (response.data && response.data.length > 0) {
const commit = response.data[0];
return {
sha: commit.sha,
message: commit.commit?.message?.split('\n')[0] || '',
author: commit.commit?.author?.name || commit.author?.login || 'unknown',
timestamp: commit.commit?.author?.date || commit.created,
};
}
} catch (error: any) {
console.error('[DeployStatus] Failed to fetch latest commit:', error.message);
}
return null;
}
/**
* Fetch latest pipeline from Woodpecker
*/
async function getLatestPipeline(): Promise<PipelineInfo | null> {
if (!WOODPECKER_TOKEN) {
console.warn('[DeployStatus] WOODPECKER_TOKEN not set, skipping pipeline fetch');
return null;
}
try {
// Get latest pipeline
const listResponse = await axios.get(
`${WOODPECKER_SERVER}/api/repos/${REPO_OWNER}/${REPO_NAME}/pipelines?page=1&per_page=1`,
{
headers: { Authorization: `Bearer ${WOODPECKER_TOKEN}` },
timeout: 5000,
}
);
if (!listResponse.data || listResponse.data.length === 0) {
return null;
}
const pipeline = listResponse.data[0];
// Get pipeline steps
let steps: PipelineStep[] = [];
try {
const stepsResponse = await axios.get(
`${WOODPECKER_SERVER}/api/repos/${REPO_OWNER}/${REPO_NAME}/pipelines/${pipeline.number}`,
{
headers: { Authorization: `Bearer ${WOODPECKER_TOKEN}` },
timeout: 5000,
}
);
if (stepsResponse.data?.workflows) {
for (const workflow of stepsResponse.data.workflows) {
if (workflow.children) {
for (const step of workflow.children) {
steps.push({
name: step.name,
state: step.state,
started: step.start_time,
stopped: step.end_time,
});
}
}
}
}
} catch (stepError) {
// Steps fetch failed, continue without them
}
return {
number: pipeline.number,
status: pipeline.status,
event: pipeline.event,
branch: pipeline.branch,
message: pipeline.message?.split('\n')[0] || '',
commit: pipeline.commit?.slice(0, 8) || '',
author: pipeline.author || 'unknown',
created: pipeline.created_at,
started: pipeline.started_at,
finished: pipeline.finished_at,
steps,
};
} catch (error: any) {
console.error('[DeployStatus] Failed to fetch pipeline:', error.message);
}
return null;
}
/**
* Count commits between two SHAs
*/
async function countCommitsBetween(fromSha: string, toSha: string): Promise<number> {
if (!GITEA_TOKEN || !fromSha || !toSha) return 0;
if (fromSha === toSha) return 0;
try {
const response = await axios.get(
`${GITEA_SERVER}/api/v1/repos/${REPO_OWNER}/${REPO_NAME}/commits?sha=${toSha}&limit=50`,
{
headers: { Authorization: `token ${GITEA_TOKEN}` },
timeout: 5000,
}
);
if (response.data) {
const commits = response.data;
for (let i = 0; i < commits.length; i++) {
if (commits[i].sha.startsWith(fromSha)) {
return i;
}
}
// If not found in the first page of commits, report the page size as a lower bound
return commits.length;
}
} catch (error: any) {
console.error('[DeployStatus] Failed to count commits:', error.message);
}
return 0;
}
/**
* GET /api/admin/deploy-status
* Returns deployment status with version comparison and CI info
*/
router.get('/', async (req: Request, res: Response) => {
try {
// Get running version from env vars (set during Docker build)
const runningSha = process.env.APP_GIT_SHA || 'unknown';
const running = {
sha: runningSha.slice(0, 8),
sha_full: runningSha,
build_time: process.env.APP_BUILD_TIME || new Date().toISOString(),
image_tag: process.env.CONTAINER_IMAGE_TAG?.slice(0, 8) || 'local',
};
// Fetch latest commit and pipeline in parallel
const [latestCommit, pipeline] = await Promise.all([
getLatestCommit(),
getLatestPipeline(),
]);
// Build latest info
const latest = latestCommit ? {
sha: latestCommit.sha.slice(0, 8),
sha_full: latestCommit.sha,
message: latestCommit.message,
author: latestCommit.author,
timestamp: latestCommit.timestamp,
} : null;
// Determine if running latest
const isLatest = latest
? runningSha.startsWith(latest.sha_full.slice(0, 8)) ||
latest.sha_full.startsWith(runningSha.slice(0, 8))
: true;
// Count commits behind
const commitsBehind = isLatest
? 0
: await countCommitsBetween(runningSha, latest?.sha_full || '');
const response: DeployStatusResponse = {
running,
latest,
is_latest: isLatest,
commits_behind: commitsBehind,
pipeline,
};
res.json(response);
} catch (error: any) {
console.error('[DeployStatus] Error:', error);
res.status(500).json({
error: error.message,
running: {
sha: process.env.APP_GIT_SHA?.slice(0, 8) || 'unknown',
sha_full: process.env.APP_GIT_SHA || 'unknown',
build_time: process.env.APP_BUILD_TIME || 'unknown',
image_tag: process.env.CONTAINER_IMAGE_TAG?.slice(0, 8) || 'local',
},
latest: null,
is_latest: true,
commits_behind: 0,
pipeline: null,
});
}
});
export default router;
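For reference, a hedged sketch of polling this endpoint from an admin UI; the /api/admin/deploy-status path comes from the route's doc comment and the field names follow DeployStatusResponse above:

// Sketch only: assumes global fetch and that the router is mounted at /api/admin/deploy-status.
async function checkDeployStatus(): Promise<void> {
  const res = await fetch('/api/admin/deploy-status');
  const status = await res.json();
  if (!status.is_latest) {
    console.warn(`Running ${status.running.sha}, ${status.commits_behind} commit(s) behind ${status.latest?.sha}`);
  }
  if (status.pipeline) {
    console.log(`Pipeline #${status.pipeline.number}: ${status.pipeline.status} (${status.pipeline.branch})`);
  }
}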

View File

@@ -8,10 +8,12 @@ router.use(authMiddleware);
// Valid menu_type values
const VALID_MENU_TYPES = ['dutchie', 'treez', 'jane', 'weedmaps', 'leafly', 'meadow', 'blaze', 'flowhub', 'dispense', 'cova', 'other', 'unknown'];
// Get all dispensaries (with pagination)
router.get('/', async (req, res) => {
try {
const { menu_type, city, state, crawl_enabled, dutchie_verified, status, limit, offset, search } = req.query;
const pageLimit = Math.min(parseInt(limit as string) || 50, 500);
const pageOffset = parseInt(offset as string) || 0;
let query = `
SELECT
@@ -98,15 +100,40 @@ router.get('/', async (req, res) => {
}
}
// Filter by status (e.g., 'dropped', 'open', 'closed')
if (status) {
conditions.push(`status = $${params.length + 1}`);
params.push(status);
}
// Search filter (name, dba_name, city)
if (search) {
conditions.push(`(name ILIKE $${params.length + 1} OR dba_name ILIKE $${params.length + 1} OR city ILIKE $${params.length + 1})`);
params.push(`%${search}%`);
}
// Build WHERE clause
const whereClause = conditions.length > 0 ? ` WHERE ${conditions.join(' AND ')}` : '';
// Get total count first
const countResult = await pool.query(`SELECT COUNT(*) FROM dispensaries${whereClause}`, params);
const total = parseInt(countResult.rows[0].count);
// Add pagination
query += whereClause;
query += ` ORDER BY name`;
query += ` LIMIT $${params.length + 1} OFFSET $${params.length + 2}`;
params.push(pageLimit, pageOffset);
const result = await pool.query(query, params);
res.json({
dispensaries: result.rows,
total,
limit: pageLimit,
offset: pageOffset,
hasMore: pageOffset + result.rows.length < total
});
} catch (error) {
console.error('Error fetching dispensaries:', error);
res.status(500).json({ error: 'Failed to fetch dispensaries' });
@@ -140,6 +167,7 @@ router.get('/stats/crawl-status', async (req, res) => {
COUNT(*) FILTER (WHERE crawl_enabled = false OR crawl_enabled IS NULL) as disabled_count,
COUNT(*) FILTER (WHERE dutchie_verified = true) as verified_count,
COUNT(*) FILTER (WHERE dutchie_verified = false OR dutchie_verified IS NULL) as unverified_count,
COUNT(*) FILTER (WHERE status = 'dropped') as dropped_count,
COUNT(*) as total_count
FROM dispensaries
`;
@@ -169,6 +197,34 @@ router.get('/stats/crawl-status', async (req, res) => {
}
});
// Get dropped stores count (for dashboard alert)
router.get('/stats/dropped', async (req, res) => {
try {
const result = await pool.query(`
SELECT
COUNT(*) as dropped_count,
json_agg(json_build_object(
'id', id,
'name', name,
'city', city,
'state', state,
'dropped_at', updated_at
) ORDER BY updated_at DESC) FILTER (WHERE status = 'dropped') as dropped_stores
FROM dispensaries
WHERE status = 'dropped'
`);
const row = result.rows[0];
res.json({
dropped_count: parseInt(row.dropped_count) || 0,
dropped_stores: row.dropped_stores || []
});
} catch (error) {
console.error('Error fetching dropped stores:', error);
res.status(500).json({ error: 'Failed to fetch dropped stores' });
}
});
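A hedged sketch of calling the paginated list endpoint above from a dashboard, assuming the router is mounted at /api/dispensaries and the auth header handling is done elsewhere:

// Sketch only: query params and response fields follow the handler above.
async function listDispensaries(page = 0): Promise<void> {
  const params = new URLSearchParams({ state: 'AZ', status: 'open', limit: '50', offset: String(page * 50) });
  const res = await fetch(`/api/dispensaries?${params}`);
  const data = await res.json();
  console.log(`Showing ${data.dispensaries.length} of ${data.total} (hasMore: ${data.hasMore})`);
}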
// Get single dispensary by slug or ID
router.get('/:slugOrId', async (req, res) => {
try {

View File

@@ -22,11 +22,17 @@ interface ProductClickEventPayload {
store_id?: string;
brand_id?: string;
campaign_id?: string;
dispensary_name?: string;
action: 'view' | 'open_store' | 'open_product' | 'compare' | 'other';
source: string;
page_type?: string; // Page where event occurred (e.g., StoreDetailPage, BrandsIntelligence)
url_path?: string; // URL path for debugging
occurred_at?: string;
// Visitor location (from frontend IP geolocation)
visitor_city?: string;
visitor_state?: string;
visitor_lat?: number;
visitor_lng?: number;
}
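A hedged example of the kind of payload a storefront might POST, using only fields declared in the interface above; the /api/analytics mount point is an assumption (not shown in this diff) and all values are placeholders:

// Sketch only: illustrative values.
async function reportProductClick(): Promise<void> {
  await fetch('/api/analytics/product-click', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      product_id: 'prod_123',                 // placeholder id
      store_id: 'store_456',                  // placeholder id
      dispensary_name: 'Example Dispensary',  // placeholder
      action: 'open_product',
      source: 'web',                          // placeholder source tag
      page_type: 'StoreDetailPage',
      url_path: '/stores/example-dispensary',
      visitor_city: 'Phoenix',                // from frontend IP geolocation
      visitor_state: 'AZ',
      visitor_lat: 33.4484,
      visitor_lng: -112.074,
    }),
  });
}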
/**
@@ -77,13 +83,14 @@ router.post('/product-click', optionalAuthMiddleware, async (req: Request, res:
// Insert the event with enhanced fields
await pool.query(
`INSERT INTO product_click_events
(product_id, store_id, brand_id, campaign_id, dispensary_name, action, source, user_id, ip_address, user_agent, occurred_at, event_type, page_type, url_path, device_type, visitor_city, visitor_state, visitor_lat, visitor_lng)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19)`,
[
payload.product_id,
payload.store_id || null,
payload.brand_id || null,
payload.campaign_id || null,
payload.dispensary_name || null,
payload.action,
payload.source,
userId,
@@ -93,7 +100,11 @@ router.post('/product-click', optionalAuthMiddleware, async (req: Request, res:
'product_click', // event_type
payload.page_type || null,
payload.url_path || null,
deviceType,
payload.visitor_city || null,
payload.visitor_state || null,
payload.visitor_lat || null,
payload.visitor_lng || null
]
);

View File

@@ -45,6 +45,8 @@ interface ApiHealth extends HealthStatus {
uptime: number;
timestamp: string;
version: string;
build_sha: string | null;
build_time: string | null;
}
interface DbHealth extends HealthStatus {
@@ -113,6 +115,8 @@ async function getApiHealth(): Promise<ApiHealth> {
uptime: Math.floor((Date.now() - serverStartTime) / 1000),
timestamp: new Date().toISOString(),
version: packageVersion,
build_sha: process.env.APP_GIT_SHA && process.env.APP_GIT_SHA !== 'unknown' ? process.env.APP_GIT_SHA : null,
build_time: process.env.APP_BUILD_TIME && process.env.APP_BUILD_TIME !== 'unknown' ? process.env.APP_BUILD_TIME : null,
};
}
@@ -138,14 +142,16 @@ async function getDbHealth(): Promise<DbHealth> {
async function getRedisHealth(): Promise<RedisHealth> {
const start = Date.now();
const isLocal = process.env.NODE_ENV === 'development' || process.env.NODE_ENV === 'local' || !process.env.NODE_ENV;
// Check if Redis is configured
if (!process.env.REDIS_URL && !process.env.REDIS_HOST) {
// Redis is optional in local dev, required in prod/staging
return {
status: isLocal ? 'ok' : 'error',
connected: false,
latency_ms: 0,
error: isLocal ? 'Redis not configured (optional in local)' : 'Redis not configured (required in production)',
};
}

View File

@@ -0,0 +1,214 @@
/**
* Image Proxy Route
*
* On-demand image resizing service. Serves images with URL-based transforms.
*
* Usage:
* /img/<path>?w=200&h=200&q=80&fit=cover
*
* Parameters:
* w - width (pixels)
* h - height (pixels)
* q - quality (1-100, default 80)
* fit - resize fit: cover, contain, fill, inside, outside (default: inside)
* blur - blur sigma (0.3-1000)
* gray - grayscale (1 = enabled)
* format - output format: webp, jpeg, png, avif (default: webp)
*
* Examples:
* /img/products/az/store/brand/product/image.webp?w=200
* /img/products/az/store/brand/product/image.webp?w=600&h=400&fit=cover
* /img/products/az/store/brand/product/image.webp?w=100&blur=5&gray=1
*/
import { Router, Request, Response } from 'express';
import * as fs from 'fs/promises';
import * as path from 'path';
// @ts-ignore
const sharp = require('sharp');
const router = Router();
// Base path for images
function getImagesBasePath(): string {
if (process.env.IMAGES_PATH) {
return process.env.IMAGES_PATH;
}
if (process.env.STORAGE_BASE_PATH) {
return path.join(process.env.STORAGE_BASE_PATH, 'images');
}
return './storage/images';
}
const IMAGES_BASE_PATH = getImagesBasePath();
// Allowed fit modes
const ALLOWED_FITS = ['cover', 'contain', 'fill', 'inside', 'outside'] as const;
type FitMode = typeof ALLOWED_FITS[number];
// Allowed formats
const ALLOWED_FORMATS = ['webp', 'jpeg', 'jpg', 'png', 'avif'] as const;
type OutputFormat = typeof ALLOWED_FORMATS[number];
// Cache headers (1 year for immutable content-addressed images)
const CACHE_MAX_AGE = 31536000; // 1 year in seconds
interface TransformParams {
width?: number;
height?: number;
quality: number;
fit: FitMode;
blur?: number;
grayscale: boolean;
format: OutputFormat;
}
function parseTransformParams(query: any): TransformParams {
return {
width: query.w ? Math.min(Math.max(parseInt(query.w, 10), 1), 4000) : undefined,
height: query.h ? Math.min(Math.max(parseInt(query.h, 10), 1), 4000) : undefined,
quality: query.q ? Math.min(Math.max(parseInt(query.q, 10), 1), 100) : 80,
fit: ALLOWED_FITS.includes(query.fit) ? query.fit : 'inside',
blur: query.blur ? Math.min(Math.max(parseFloat(query.blur), 0.3), 1000) : undefined,
grayscale: query.gray === '1' || query.grayscale === '1',
format: ALLOWED_FORMATS.includes(query.format) ? query.format : 'webp',
};
}
function getContentType(format: OutputFormat): string {
switch (format) {
case 'jpeg':
case 'jpg':
return 'image/jpeg';
case 'png':
return 'image/png';
case 'avif':
return 'image/avif';
case 'webp':
default:
return 'image/webp';
}
}
/**
* Image proxy endpoint
* GET /img/*
*/
router.get('/*', async (req: Request, res: Response) => {
try {
// Get the image path from URL (everything after /img/)
const imagePath = req.params[0];
if (!imagePath) {
return res.status(400).json({ error: 'Image path required' });
}
// Security: prevent directory traversal
const normalizedPath = path.normalize(imagePath).replace(/^(\.\.(\/|\\|$))+/, '');
const basePath = path.resolve(IMAGES_BASE_PATH);
const fullPath = path.resolve(path.join(IMAGES_BASE_PATH, normalizedPath));
// Ensure path is within base directory
if (!fullPath.startsWith(basePath)) {
console.error(`[ImageProxy] Path traversal attempt: ${fullPath} not in ${basePath}`);
return res.status(403).json({ error: 'Access denied' });
}
// Check if file exists
try {
await fs.access(fullPath);
} catch {
return res.status(404).json({ error: 'Image not found' });
}
// Parse transform parameters
const params = parseTransformParams(req.query);
// Check if any transforms are requested
const hasTransforms = params.width || params.height || params.blur || params.grayscale;
// Read the original image
const imageBuffer = await fs.readFile(fullPath);
let outputBuffer: Buffer;
if (hasTransforms) {
// Apply transforms
let pipeline = sharp(imageBuffer);
// Resize
if (params.width || params.height) {
pipeline = pipeline.resize(params.width, params.height, {
fit: params.fit,
withoutEnlargement: true,
});
}
// Blur
if (params.blur) {
pipeline = pipeline.blur(params.blur);
}
// Grayscale
if (params.grayscale) {
pipeline = pipeline.grayscale();
}
// Output format
switch (params.format) {
case 'jpeg':
case 'jpg':
pipeline = pipeline.jpeg({ quality: params.quality });
break;
case 'png':
pipeline = pipeline.png({ quality: params.quality });
break;
case 'avif':
pipeline = pipeline.avif({ quality: params.quality });
break;
case 'webp':
default:
pipeline = pipeline.webp({ quality: params.quality });
}
outputBuffer = await pipeline.toBuffer();
} else {
// No transforms - serve original (but maybe convert format)
if (params.format !== 'webp' || params.quality !== 80) {
let pipeline = sharp(imageBuffer);
switch (params.format) {
case 'jpeg':
case 'jpg':
pipeline = pipeline.jpeg({ quality: params.quality });
break;
case 'png':
pipeline = pipeline.png({ quality: params.quality });
break;
case 'avif':
pipeline = pipeline.avif({ quality: params.quality });
break;
case 'webp':
default:
pipeline = pipeline.webp({ quality: params.quality });
}
outputBuffer = await pipeline.toBuffer();
} else {
outputBuffer = imageBuffer;
}
}
// Set headers
res.setHeader('Content-Type', getContentType(params.format));
res.setHeader('Cache-Control', `public, max-age=${CACHE_MAX_AGE}, immutable`);
res.setHeader('X-Image-Size', outputBuffer.length);
// Send image
res.send(outputBuffer);
} catch (error: any) {
console.error('[ImageProxy] Error:', error.message);
res.status(500).json({ error: 'Failed to process image' });
}
});
export default router;
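A minimal sketch of wiring this router into the Express app so it answers at the /img prefix used in the doc comment; the import path and server wiring are assumptions, since they are not part of this diff:

// Sketch only.
import express from 'express';
import imageProxyRouter from './routes/imageProxy'; // path is an assumption

const app = express();
app.use('/img', imageProxyRouter);
// e.g. GET /img/products/az/store/brand/product/image.webp?w=200&h=200&fit=cover&format=webp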

View File

@@ -143,6 +143,152 @@ router.get('/', async (req: Request, res: Response) => {
}
});
/**
* GET /api/job-queue/available - List dispensaries available for crawling
* Query: { state_code?: string, limit?: number }
* NOTE: Must be defined BEFORE /:id route to avoid conflict
*/
router.get('/available', async (req: Request, res: Response) => {
try {
const { state_code, limit = '100' } = req.query;
let query = `
SELECT
d.id,
d.name,
d.city,
s.code as state_code,
d.platform_dispensary_id,
d.crawl_enabled,
(SELECT MAX(created_at) FROM dispensary_crawl_jobs WHERE dispensary_id = d.id AND status = 'completed') as last_crawl,
EXISTS (
SELECT 1 FROM dispensary_crawl_jobs
WHERE dispensary_id = d.id AND status IN ('pending', 'running')
) as has_pending_job
FROM dispensaries d
LEFT JOIN states s ON s.id = d.state_id
WHERE d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
`;
const params: any[] = [];
let paramIndex = 1;
if (state_code) {
params.push((state_code as string).toUpperCase());
query += ` AND s.code = $${paramIndex++}`;
}
query += ` ORDER BY d.name LIMIT $${paramIndex}`;
params.push(parseInt(limit as string));
const { rows } = await pool.query(query, params);
// Get counts by state
const { rows: stateCounts } = await pool.query(`
SELECT s.code, COUNT(*) as count
FROM dispensaries d
JOIN states s ON s.id = d.state_id
WHERE d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
GROUP BY s.code
ORDER BY count DESC
`);
res.json({
success: true,
dispensaries: rows,
total: rows.length,
by_state: stateCounts
});
} catch (error: any) {
console.error('[JobQueue] Error listing available:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/job-queue/history - Get recent job history with results
* Query: { state_code?: string, status?: string, limit?: number, hours?: number }
* NOTE: Must be defined BEFORE /:id route to avoid conflict
*/
router.get('/history', async (req: Request, res: Response) => {
try {
const {
state_code,
status,
limit = '50',
hours = '24'
} = req.query;
let query = `
SELECT
j.id,
j.dispensary_id,
d.name as dispensary_name,
s.code as state_code,
j.job_type,
j.status,
j.products_found,
j.error_message,
j.started_at,
j.completed_at,
j.duration_ms,
j.created_at
FROM dispensary_crawl_jobs j
LEFT JOIN dispensaries d ON d.id = j.dispensary_id
LEFT JOIN states s ON s.id = d.state_id
WHERE j.created_at > NOW() - INTERVAL '${parseInt(hours as string)} hours'
`;
const params: any[] = [];
let paramIndex = 1;
if (status && status !== 'all') {
params.push(status);
query += ` AND j.status = $${paramIndex++}`;
}
if (state_code) {
params.push((state_code as string).toUpperCase());
query += ` AND s.code = $${paramIndex++}`;
}
query += ` ORDER BY j.created_at DESC LIMIT $${paramIndex}`;
params.push(parseInt(limit as string));
const { rows } = await pool.query(query, params);
// Get summary stats
const { rows: stats } = await pool.query(`
SELECT
COUNT(*) FILTER (WHERE status = 'completed') as completed,
COUNT(*) FILTER (WHERE status = 'failed') as failed,
COUNT(*) FILTER (WHERE status = 'running') as running,
COUNT(*) FILTER (WHERE status = 'pending') as pending,
SUM(products_found) FILTER (WHERE status = 'completed') as total_products,
AVG(duration_ms) FILTER (WHERE status = 'completed') as avg_duration_ms
FROM dispensary_crawl_jobs
WHERE created_at > NOW() - INTERVAL '${parseInt(hours as string)} hours'
`);
res.json({
success: true,
jobs: rows,
summary: {
completed: parseInt(stats[0].completed) || 0,
failed: parseInt(stats[0].failed) || 0,
running: parseInt(stats[0].running) || 0,
pending: parseInt(stats[0].pending) || 0,
total_products: parseInt(stats[0].total_products) || 0,
avg_duration_ms: Math.round(parseFloat(stats[0].avg_duration_ms)) || null
},
hours: parseInt(hours as string)
});
} catch (error: any) {
console.error('[JobQueue] Error getting history:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/job-queue/stats - Queue statistics
*/
@@ -463,5 +609,165 @@ router.get('/paused', async (_req: Request, res: Response) => {
res.json({ success: true, queue_paused: queuePaused });
});
/**
* POST /api/job-queue/enqueue-batch - Queue multiple dispensaries at once
* Body: { dispensary_ids: number[], job_type?: string, priority?: number }
*/
router.post('/enqueue-batch', async (req: Request, res: Response) => {
try {
const { dispensary_ids, job_type = 'dutchie_product_crawl', priority = 0 } = req.body;
if (!Array.isArray(dispensary_ids) || dispensary_ids.length === 0) {
return res.status(400).json({ success: false, error: 'dispensary_ids array is required' });
}
if (dispensary_ids.length > 500) {
return res.status(400).json({ success: false, error: 'Maximum 500 dispensaries per batch' });
}
// Insert jobs, skipping duplicates
const { rows } = await pool.query(`
INSERT INTO dispensary_crawl_jobs (dispensary_id, job_type, priority, trigger_type, status, created_at)
SELECT
d.id,
$2::text,
$3::integer,
'api_batch',
'pending',
NOW()
FROM dispensaries d
WHERE d.id = ANY($1::int[])
AND d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
AND NOT EXISTS (
SELECT 1 FROM dispensary_crawl_jobs cj
WHERE cj.dispensary_id = d.id
AND cj.job_type = $2::text
AND cj.status IN ('pending', 'running')
)
RETURNING id, dispensary_id
`, [dispensary_ids, job_type, priority]);
res.json({
success: true,
queued: rows.length,
requested: dispensary_ids.length,
job_ids: rows.map(r => r.id),
message: `Queued ${rows.length} of ${dispensary_ids.length} dispensaries`
});
} catch (error: any) {
console.error('[JobQueue] Error batch enqueuing:', error);
res.status(500).json({ success: false, error: error.message });
}
});
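A hedged sketch of queueing a batch from a script, assuming the router is mounted at /api/job-queue and the ids are placeholders:

// Sketch only: body fields follow the handler above.
async function queueBatch(): Promise<void> {
  const res = await fetch('/api/job-queue/enqueue-batch', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      dispensary_ids: [101, 102, 103],   // placeholder ids
      job_type: 'dutchie_product_crawl',
      priority: 5,
    }),
  });
  const { queued, requested } = await res.json();
  console.log(`Queued ${queued} of ${requested}`);
}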
/**
* POST /api/job-queue/enqueue-state - Queue all crawl-enabled dispensaries for a state
* Body: { state_code: string, job_type?: string, priority?: number, limit?: number }
*/
router.post('/enqueue-state', async (req: Request, res: Response) => {
try {
const { state_code, job_type = 'dutchie_product_crawl', priority = 0, limit = 200 } = req.body;
if (!state_code) {
return res.status(400).json({ success: false, error: 'state_code is required (e.g., "AZ")' });
}
// Get state_id and queue jobs
const { rows } = await pool.query(`
WITH target_state AS (
SELECT id FROM states WHERE code = $1
)
INSERT INTO dispensary_crawl_jobs (dispensary_id, job_type, priority, trigger_type, status, created_at)
SELECT
d.id,
$2::text,
$3::integer,
'api_state',
'pending',
NOW()
FROM dispensaries d, target_state
WHERE d.state_id = target_state.id
AND d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
AND NOT EXISTS (
SELECT 1 FROM dispensary_crawl_jobs cj
WHERE cj.dispensary_id = d.id
AND cj.job_type = $2::text
AND cj.status IN ('pending', 'running')
)
LIMIT $4::integer
RETURNING id, dispensary_id
`, [state_code.toUpperCase(), job_type, priority, limit]);
// Get total available count
const countResult = await pool.query(`
WITH target_state AS (
SELECT id FROM states WHERE code = $1
)
SELECT COUNT(*) as total
FROM dispensaries d, target_state
WHERE d.state_id = target_state.id
AND d.crawl_enabled = true
AND d.platform_dispensary_id IS NOT NULL
`, [state_code.toUpperCase()]);
res.json({
success: true,
queued: rows.length,
total_available: parseInt(countResult.rows[0].total),
state: state_code.toUpperCase(),
job_type,
message: `Queued ${rows.length} dispensaries for ${state_code.toUpperCase()}`
});
} catch (error: any) {
console.error('[JobQueue] Error enqueuing state:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/job-queue/clear-pending - Clear all pending jobs (optionally filtered)
* Body: { state_code?: string, job_type?: string }
*/
router.post('/clear-pending', async (req: Request, res: Response) => {
try {
const { state_code, job_type } = req.body;
let query = `
UPDATE dispensary_crawl_jobs
SET status = 'cancelled', completed_at = NOW(), updated_at = NOW()
WHERE status = 'pending'
`;
const params: any[] = [];
let paramIndex = 1;
if (job_type) {
params.push(job_type);
query += ` AND job_type = $${paramIndex++}`;
}
if (state_code) {
params.push((state_code as string).toUpperCase());
query += ` AND dispensary_id IN (
SELECT d.id FROM dispensaries d
JOIN states s ON s.id = d.state_id
WHERE s.code = $${paramIndex++}
)`;
}
const result = await pool.query(query, params);
res.json({
success: true,
cleared: result.rowCount,
message: `Cancelled ${result.rowCount} pending jobs`
});
} catch (error: any) {
console.error('[JobQueue] Error clearing pending:', error);
res.status(500).json({ success: false, error: error.message });
}
});
export default router;
export { queuePaused };

View File

@@ -1,11 +1,29 @@
import { Router } from 'express';
import { authMiddleware } from '../auth/middleware';
import { pool } from '../db/pool';
const router = Router();
router.use(authMiddleware);
/**
* Convert local image path to proxy URL
* /images/products/... -> /img/products/...
*/
function getImageUrl(localPath: string): string {
if (!localPath) return '';
// If already a full URL, return as-is
if (localPath.startsWith('http')) return localPath;
// Convert /images/ path to /img/ proxy path
if (localPath.startsWith('/images/')) {
return '/img' + localPath.substring(7);
}
// Handle paths without leading slash
if (localPath.startsWith('images/')) {
return '/img/' + localPath.substring(7);
}
return '/img/' + localPath;
}
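Illustrative inputs and outputs for the helper above:

// getImageUrl('/images/products/az/foo.webp')   -> '/img/products/az/foo.webp'
// getImageUrl('images/products/az/foo.webp')    -> '/img/products/az/foo.webp'
// getImageUrl('products/az/foo.webp')           -> '/img/products/az/foo.webp'
// getImageUrl('https://cdn.example.com/a.png')  -> 'https://cdn.example.com/a.png' (already absolute)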
// Freshness threshold: data older than this is considered stale
const STALE_THRESHOLD_HOURS = 4;

View File

@@ -2,7 +2,7 @@ import { Router } from 'express';
import { authMiddleware, requireRole } from '../auth/middleware';
import { pool } from '../db/pool';
import { testProxy, addProxy, addProxiesFromList } from '../services/proxy';
import { createProxyTestJob, getProxyTestJob, getActiveProxyTestJob, cancelProxyTestJob } from '../services/proxyTestQueue';
import { createProxyTestJob, getProxyTestJob, getActiveProxyTestJob, cancelProxyTestJob, ProxyTestMode } from '../services/proxyTestQueue';
const router = Router();
router.use(authMiddleware);
@@ -11,9 +11,10 @@ router.use(authMiddleware);
router.get('/', async (req, res) => {
try {
const result = await pool.query(`
SELECT id, host, port, protocol, username, password, active, is_anonymous,
last_tested_at, test_result, response_time_ms, created_at,
city, state, country, country_code, location_updated_at,
COALESCE(max_connections, 1) as max_connections
FROM proxies
ORDER BY created_at DESC
`);
@@ -166,13 +167,39 @@ router.post('/:id/test', requireRole('superadmin', 'admin'), async (req, res) =>
});
// Start proxy test job
// Query params: mode=all|failed|inactive, concurrency=10
router.post('/test-all', requireRole('superadmin', 'admin'), async (req, res) => {
try {
const mode = (req.query.mode as ProxyTestMode) || 'all';
const concurrency = parseInt(req.query.concurrency as string) || 10;
// Validate mode
if (!['all', 'failed', 'inactive'].includes(mode)) {
return res.status(400).json({ error: 'Invalid mode. Use: all, failed, or inactive' });
}
// Validate concurrency (1-50)
if (concurrency < 1 || concurrency > 50) {
return res.status(400).json({ error: 'Concurrency must be between 1 and 50' });
}
const jobId = await createProxyTestJob(mode, concurrency);
res.json({ jobId, mode, concurrency, message: `Proxy test job started (mode: ${mode}, concurrency: ${concurrency})` });
} catch (error: any) {
console.error('Error starting proxy test job:', error);
res.status(500).json({ error: error.message || 'Failed to start proxy test job' });
}
});
// Convenience endpoint: Test only failed proxies
router.post('/test-failed', requireRole('superadmin', 'admin'), async (req, res) => {
try {
const concurrency = parseInt(req.query.concurrency as string) || 10;
const jobId = await createProxyTestJob('failed', concurrency);
res.json({ jobId, mode: 'failed', concurrency, message: 'Retesting failed proxies...' });
} catch (error: any) {
console.error('Error starting failed proxy test:', error);
res.status(500).json({ error: error.message || 'Failed to start proxy test job' });
}
});
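A hedged sketch of kicking off a test run with the new mode and concurrency knobs, assuming the router is mounted at /api/proxies and the caller has an admin session:

// Sketch only: query params and response shape follow the handler above.
async function retestInactiveProxies(): Promise<string> {
  const res = await fetch('/api/proxies/test-all?mode=inactive&concurrency=25', { method: 'POST' });
  const { jobId } = await res.json();
  return jobId; // progress can then be polled via the test-job endpoints in this file
}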
@@ -197,7 +224,7 @@ router.post('/test-job/:jobId/cancel', requireRole('superadmin', 'admin'), async
router.put('/:id', requireRole('superadmin', 'admin'), async (req, res) => {
try {
const { id } = req.params;
const { host, port, protocol, username, password, active } = req.body;
const { host, port, protocol, username, password, active, max_connections } = req.body;
const result = await pool.query(`
UPDATE proxies
@@ -207,10 +234,11 @@ router.put('/:id', requireRole('superadmin', 'admin'), async (req, res) => {
username = COALESCE($4, username),
password = COALESCE($5, password),
active = COALESCE($6, active),
max_connections = COALESCE($7, max_connections),
updated_at = CURRENT_TIMESTAMP
WHERE id = $8
RETURNING *
`, [host, port, protocol, username, password, active, max_connections, id]);
if (result.rows.length === 0) {
return res.status(404).json({ error: 'Proxy not found' });

File diff suppressed because it is too large

backend/src/routes/tasks.ts (new file, 595 lines)
View File

@@ -0,0 +1,595 @@
/**
* Task Queue API Routes
*
* Endpoints for managing worker tasks, viewing capacity metrics,
* and generating batch tasks.
*/
import { Router, Request, Response } from 'express';
import {
taskService,
TaskRole,
TaskStatus,
TaskFilter,
} from '../tasks/task-service';
import { pool } from '../db/pool';
const router = Router();
/**
* GET /api/tasks
* List tasks with optional filters
*
* Query params:
* - role: Filter by role
* - status: Filter by status (comma-separated for multiple)
* - dispensary_id: Filter by dispensary
* - worker_id: Filter by worker
* - limit: Max results (default 100)
* - offset: Pagination offset
*/
router.get('/', async (req: Request, res: Response) => {
try {
const filter: TaskFilter = {};
if (req.query.role) {
filter.role = req.query.role as TaskRole;
}
if (req.query.status) {
const statuses = (req.query.status as string).split(',') as TaskStatus[];
filter.status = statuses.length === 1 ? statuses[0] : statuses;
}
if (req.query.dispensary_id) {
filter.dispensary_id = parseInt(req.query.dispensary_id as string, 10);
}
if (req.query.worker_id) {
filter.worker_id = req.query.worker_id as string;
}
if (req.query.limit) {
filter.limit = parseInt(req.query.limit as string, 10);
}
if (req.query.offset) {
filter.offset = parseInt(req.query.offset as string, 10);
}
const tasks = await taskService.listTasks(filter);
res.json({ tasks, count: tasks.length });
} catch (error: unknown) {
console.error('Error listing tasks:', error);
res.status(500).json({ error: 'Failed to list tasks' });
}
});
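A hedged sketch of querying this list endpoint with the filters described in the doc comment, assuming the router is mounted at /api/tasks:

// Sketch only: role/status values are examples drawn from elsewhere in this file.
async function listFailedRefreshTasks(): Promise<void> {
  const params = new URLSearchParams({ role: 'product_refresh', status: 'failed,pending', limit: '25' });
  const res = await fetch(`/api/tasks?${params}`);
  const { tasks, count } = await res.json();
  console.log(`Found ${count} matching tasks`, tasks);
}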
/**
* GET /api/tasks/counts
* Get task counts by status
*/
router.get('/counts', async (_req: Request, res: Response) => {
try {
const counts = await taskService.getTaskCounts();
res.json(counts);
} catch (error: unknown) {
console.error('Error getting task counts:', error);
res.status(500).json({ error: 'Failed to get task counts' });
}
});
/**
* GET /api/tasks/capacity
* Get capacity metrics for all roles
*/
router.get('/capacity', async (_req: Request, res: Response) => {
try {
const metrics = await taskService.getCapacityMetrics();
res.json({ metrics });
} catch (error: unknown) {
console.error('Error getting capacity metrics:', error);
res.status(500).json({ error: 'Failed to get capacity metrics' });
}
});
/**
* GET /api/tasks/capacity/:role
* Get capacity metrics for a specific role
*/
router.get('/capacity/:role', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const capacity = await taskService.getRoleCapacity(role);
if (!capacity) {
return res.status(404).json({ error: 'Role not found or no data' });
}
// Calculate workers needed for different SLAs
const workersFor1Hour = await taskService.calculateWorkersNeeded(role, 1);
const workersFor4Hours = await taskService.calculateWorkersNeeded(role, 4);
const workersFor8Hours = await taskService.calculateWorkersNeeded(role, 8);
res.json({
...capacity,
workers_needed: {
for_1_hour: workersFor1Hour,
for_4_hours: workersFor4Hours,
for_8_hours: workersFor8Hours,
},
});
} catch (error: unknown) {
console.error('Error getting role capacity:', error);
res.status(500).json({ error: 'Failed to get role capacity' });
}
});
/**
* GET /api/tasks/:id
* Get a specific task by ID
*/
router.get('/:id', async (req: Request, res: Response) => {
try {
const taskId = parseInt(req.params.id, 10);
const task = await taskService.getTask(taskId);
if (!task) {
return res.status(404).json({ error: 'Task not found' });
}
res.json(task);
} catch (error: unknown) {
console.error('Error getting task:', error);
res.status(500).json({ error: 'Failed to get task' });
}
});
/**
* DELETE /api/tasks/:id
* Delete a specific task by ID
* Only allows deletion of failed, completed, or pending tasks (not running)
*/
router.delete('/:id', async (req: Request, res: Response) => {
try {
const taskId = parseInt(req.params.id, 10);
// First check if task exists and its status
const task = await taskService.getTask(taskId);
if (!task) {
return res.status(404).json({ error: 'Task not found' });
}
// Don't allow deleting running tasks
if (task.status === 'running' || task.status === 'claimed') {
return res.status(400).json({ error: 'Cannot delete a running or claimed task' });
}
// Delete the task
await pool.query('DELETE FROM worker_tasks WHERE id = $1', [taskId]);
res.json({ success: true, message: `Task ${taskId} deleted` });
} catch (error: unknown) {
console.error('Error deleting task:', error);
res.status(500).json({ error: 'Failed to delete task' });
}
});
/**
* POST /api/tasks
* Create a new task
*
* Body:
* - role: TaskRole (required)
* - dispensary_id: number (optional)
* - platform: string (optional)
* - priority: number (optional, default 0)
* - scheduled_for: ISO date string (optional)
*/
router.post('/', async (req: Request, res: Response) => {
try {
const { role, dispensary_id, platform, priority, scheduled_for } = req.body;
if (!role) {
return res.status(400).json({ error: 'Role is required' });
}
// Check if store already has an active task
if (dispensary_id) {
const hasActive = await taskService.hasActiveTask(dispensary_id);
if (hasActive) {
return res.status(409).json({
error: 'Store already has an active task',
dispensary_id,
});
}
}
const task = await taskService.createTask({
role,
dispensary_id,
platform,
priority,
scheduled_for: scheduled_for ? new Date(scheduled_for) : undefined,
});
res.status(201).json(task);
} catch (error: unknown) {
console.error('Error creating task:', error);
res.status(500).json({ error: 'Failed to create task' });
}
});
/**
* POST /api/tasks/generate/resync
* Generate daily resync tasks for all active stores
*
* Body:
* - batches_per_day: number (optional, default 6 = every 4 hours)
* - date: ISO date string (optional, default today)
*/
router.post('/generate/resync', async (req: Request, res: Response) => {
try {
const { batches_per_day, date } = req.body;
const batchesPerDay = batches_per_day ?? 6;
const targetDate = date ? new Date(date) : new Date();
const createdCount = await taskService.generateDailyResyncTasks(
batchesPerDay,
targetDate
);
res.json({
success: true,
tasks_created: createdCount,
batches_per_day: batchesPerDay,
date: targetDate.toISOString().split('T')[0],
});
} catch (error: unknown) {
console.error('Error generating resync tasks:', error);
res.status(500).json({ error: 'Failed to generate resync tasks' });
}
});
/**
* POST /api/tasks/generate/discovery
* Generate store discovery tasks for a platform
*
* Body:
* - platform: string (required, e.g., 'dutchie')
* - state_code: string (optional, e.g., 'AZ')
* - priority: number (optional)
*/
router.post('/generate/discovery', async (req: Request, res: Response) => {
try {
const { platform, state_code, priority } = req.body;
if (!platform) {
return res.status(400).json({ error: 'Platform is required' });
}
const task = await taskService.createStoreDiscoveryTask(
platform,
state_code,
priority ?? 0
);
res.status(201).json(task);
} catch (error: unknown) {
console.error('Error creating discovery task:', error);
res.status(500).json({ error: 'Failed to create discovery task' });
}
});
/**
* POST /api/tasks/recover-stale
* Recover stale tasks from dead workers
*
* Body:
* - threshold_minutes: number (optional, default 10)
*/
router.post('/recover-stale', async (req: Request, res: Response) => {
try {
const { threshold_minutes } = req.body;
const recovered = await taskService.recoverStaleTasks(threshold_minutes ?? 10);
res.json({
success: true,
tasks_recovered: recovered,
});
} catch (error: unknown) {
console.error('Error recovering stale tasks:', error);
res.status(500).json({ error: 'Failed to recover stale tasks' });
}
});
/**
* GET /api/tasks/role/:role/last-completion
* Get the last completion time for a role
*/
router.get('/role/:role/last-completion', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const lastCompletion = await taskService.getLastCompletion(role);
res.json({
role,
last_completion: lastCompletion?.toISOString() ?? null,
time_since: lastCompletion
? Math.floor((Date.now() - lastCompletion.getTime()) / 1000)
: null,
});
} catch (error: unknown) {
console.error('Error getting last completion:', error);
res.status(500).json({ error: 'Failed to get last completion' });
}
});
/**
* GET /api/tasks/role/:role/recent
* Get recent completions for a role
*/
router.get('/role/:role/recent', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const limit = parseInt(req.query.limit as string, 10) || 10;
const tasks = await taskService.getRecentCompletions(role, limit);
res.json({ tasks });
} catch (error: unknown) {
console.error('Error getting recent completions:', error);
res.status(500).json({ error: 'Failed to get recent completions' });
}
});
/**
* GET /api/tasks/store/:dispensaryId/active
* Check if a store has an active task
*/
router.get('/store/:dispensaryId/active', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId, 10);
const hasActive = await taskService.hasActiveTask(dispensaryId);
res.json({
dispensary_id: dispensaryId,
has_active_task: hasActive,
});
} catch (error: unknown) {
console.error('Error checking active task:', error);
res.status(500).json({ error: 'Failed to check active task' });
}
});
// ============================================================
// MIGRATION ROUTES - Disable old job systems
// ============================================================
/**
* GET /api/tasks/migration/status
* Get status of old job systems vs new task queue
*/
router.get('/migration/status', async (_req: Request, res: Response) => {
try {
// Get old job system counts
const [schedules, crawlJobs, rawPayloads, taskCounts] = await Promise.all([
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE enabled = true) as enabled
FROM job_schedules
`),
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE status = 'pending') as pending,
COUNT(*) FILTER (WHERE status = 'running') as running
FROM dispensary_crawl_jobs
`),
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE processed = false) as unprocessed
FROM raw_payloads
`),
taskService.getTaskCounts(),
]);
res.json({
old_systems: {
job_schedules: {
total: parseInt(schedules.rows[0].total) || 0,
enabled: parseInt(schedules.rows[0].enabled) || 0,
},
dispensary_crawl_jobs: {
total: parseInt(crawlJobs.rows[0].total) || 0,
pending: parseInt(crawlJobs.rows[0].pending) || 0,
running: parseInt(crawlJobs.rows[0].running) || 0,
},
raw_payloads: {
total: parseInt(rawPayloads.rows[0].total) || 0,
unprocessed: parseInt(rawPayloads.rows[0].unprocessed) || 0,
},
},
new_task_queue: taskCounts,
recommendation: schedules.rows[0].enabled > 0
? 'Disable old job schedules before switching to new task queue'
: 'Ready to use new task queue',
});
} catch (error: unknown) {
console.error('Error getting migration status:', error);
res.status(500).json({ error: 'Failed to get migration status' });
}
});
/**
* POST /api/tasks/migration/disable-old-schedules
* Disable all old job schedules to prepare for new task queue
*/
router.post('/migration/disable-old-schedules', async (_req: Request, res: Response) => {
try {
const result = await pool.query(`
UPDATE job_schedules
SET enabled = false,
updated_at = NOW()
WHERE enabled = true
RETURNING id, job_name
`);
res.json({
success: true,
disabled_count: result.rowCount,
disabled_schedules: result.rows.map(r => ({ id: r.id, job_name: r.job_name })),
});
} catch (error: unknown) {
console.error('Error disabling old schedules:', error);
res.status(500).json({ error: 'Failed to disable old schedules' });
}
});
/**
* POST /api/tasks/migration/cancel-pending-crawl-jobs
* Cancel all pending crawl jobs from the old system
*/
router.post('/migration/cancel-pending-crawl-jobs', async (_req: Request, res: Response) => {
try {
const result = await pool.query(`
UPDATE dispensary_crawl_jobs
SET status = 'cancelled',
completed_at = NOW(),
updated_at = NOW()
WHERE status = 'pending'
RETURNING id
`);
res.json({
success: true,
cancelled_count: result.rowCount,
});
} catch (error: unknown) {
console.error('Error cancelling pending crawl jobs:', error);
res.status(500).json({ error: 'Failed to cancel pending crawl jobs' });
}
});
/**
* POST /api/tasks/migration/create-resync-tasks
* Create product_refresh tasks for all crawl-enabled dispensaries
*/
router.post('/migration/create-resync-tasks', async (req: Request, res: Response) => {
try {
const { priority = 0, state_code } = req.body;
let query = `
SELECT id, name FROM dispensaries
WHERE crawl_enabled = true
AND platform_dispensary_id IS NOT NULL
`;
const params: any[] = [];
if (state_code) {
query += `
AND state_id = (SELECT id FROM states WHERE code = $1)
`;
params.push(state_code.toUpperCase());
}
query += ` ORDER BY id`;
const dispensaries = await pool.query(query, params);
let created = 0;
for (const disp of dispensaries.rows) {
// Check if already has pending/running task
const hasActive = await taskService.hasActiveTask(disp.id);
if (!hasActive) {
await taskService.createTask({
role: 'product_refresh',
dispensary_id: disp.id,
platform: 'dutchie',
priority,
});
created++;
}
}
res.json({
success: true,
tasks_created: created,
dispensaries_checked: dispensaries.rows.length,
state_filter: state_code || 'all',
});
} catch (error: unknown) {
console.error('Error creating resync tasks:', error);
res.status(500).json({ error: 'Failed to create resync tasks' });
}
});
/**
* POST /api/tasks/migration/full-migrate
* One-click migration: disable old systems, create new tasks
*/
router.post('/migration/full-migrate', async (req: Request, res: Response) => {
try {
const results: any = {
success: true,
steps: [],
};
// Step 1: Disable old job schedules
const disableResult = await pool.query(`
UPDATE job_schedules
SET enabled = false, updated_at = NOW()
WHERE enabled = true
RETURNING id
`);
results.steps.push({
step: 'disable_job_schedules',
count: disableResult.rowCount,
});
// Step 2: Cancel pending crawl jobs
const cancelResult = await pool.query(`
UPDATE dispensary_crawl_jobs
SET status = 'cancelled', completed_at = NOW(), updated_at = NOW()
WHERE status = 'pending'
RETURNING id
`);
results.steps.push({
step: 'cancel_pending_crawl_jobs',
count: cancelResult.rowCount,
});
// Step 3: Generate initial resync tasks
const resyncCount = await taskService.generateDailyResyncTasks(6);
results.steps.push({
step: 'generate_resync_tasks',
count: resyncCount,
});
// Step 4: Create store discovery task
const discoveryTask = await taskService.createStoreDiscoveryTask('dutchie', undefined, 0);
results.steps.push({
step: 'create_discovery_task',
task_id: discoveryTask.id,
});
// Step 5: Create analytics refresh task
const analyticsTask = await taskService.createTask({
role: 'analytics_refresh',
priority: 0,
});
results.steps.push({
step: 'create_analytics_task',
task_id: analyticsTask.id,
});
results.message = 'Migration complete. New task workers will pick up tasks.';
res.json(results);
} catch (error: unknown) {
console.error('Error during full migration:', error);
res.status(500).json({ error: 'Failed to complete migration' });
}
});
export default router;
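A hedged sketch of driving the migration endpoints above step by step from an admin script (an alternative to the one-click full-migrate route); paths assume the /api/tasks mount and the state code is a placeholder:

// Sketch only.
async function migrateToTaskQueue(): Promise<void> {
  // 1. Inspect old vs new queue state
  const status = await (await fetch('/api/tasks/migration/status')).json();
  console.log('Old schedules enabled:', status.old_systems.job_schedules.enabled);
  // 2. Disable old schedules and cancel pending crawl jobs
  await fetch('/api/tasks/migration/disable-old-schedules', { method: 'POST' });
  await fetch('/api/tasks/migration/cancel-pending-crawl-jobs', { method: 'POST' });
  // 3. Seed the new queue with refresh tasks (optionally per state)
  await fetch('/api/tasks/migration/create-resync-tasks', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ state_code: 'AZ', priority: 0 }), // placeholder state
  });
}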

View File

@@ -1,18 +1,32 @@
import { Router, Request, Response } from 'express';
import { readFileSync } from 'fs';
import { join } from 'path';
const router = Router();
// Read package.json version at startup
let packageVersion = 'unknown';
try {
const packageJson = JSON.parse(readFileSync(join(__dirname, '../../package.json'), 'utf-8'));
packageVersion = packageJson.version || 'unknown';
} catch {
// Fallback if package.json not found
}
/**
* GET /api/version
* Returns build version information for display in admin UI
*/
router.get('/', async (req: Request, res: Response) => {
try {
const gitSha = process.env.APP_GIT_SHA || 'unknown';
const versionInfo = {
version: packageVersion,
build_version: process.env.APP_BUILD_VERSION?.slice(0, 8) || 'dev',
git_sha: gitSha.slice(0, 8) || 'unknown',
git_sha_full: gitSha,
build_time: process.env.APP_BUILD_TIME || 'unknown',
image_tag: process.env.CONTAINER_IMAGE_TAG?.slice(0, 8) || 'local',
};
res.json(versionInfo);

View File

@@ -0,0 +1,652 @@
/**
* Worker Registry API Routes
*
* Dynamic worker management - workers register on startup, get assigned names,
* and report heartbeats. Everything is API-driven, no hardcoding.
*
* Endpoints:
* POST /api/worker-registry/register - Worker reports for duty
* POST /api/worker-registry/heartbeat - Worker heartbeat
* POST /api/worker-registry/deregister - Worker signing off
* GET /api/worker-registry/workers - List all workers (for dashboard)
* GET /api/worker-registry/workers/:id - Get specific worker
* POST /api/worker-registry/cleanup - Mark stale workers offline
*
* GET /api/worker-registry/names - List all names in pool
* POST /api/worker-registry/names - Add names to pool
* DELETE /api/worker-registry/names/:name - Remove name from pool
*
* GET /api/worker-registry/roles - List available task roles
* POST /api/worker-registry/roles - Add a new role (future)
*/
import { Router, Request, Response } from 'express';
import { pool } from '../db/pool';
import os from 'os';
const router = Router();
// ============================================================
// WORKER REGISTRATION
// ============================================================
/**
* POST /api/worker-registry/register
* Worker reports for duty - gets assigned a friendly name
*
* Body:
* - role: string (optional) - task role, or null for role-agnostic workers
* - worker_id: string (optional) - custom ID, auto-generated if not provided
* - pod_name: string (optional) - k8s pod name
* - hostname: string (optional) - machine hostname
* - metadata: object (optional) - additional worker info
*
* Returns:
* - worker_id: assigned worker ID
* - friendly_name: assigned name from pool
* - role: confirmed role (or null if agnostic)
* - message: welcome message
*/
router.post('/register', async (req: Request, res: Response) => {
try {
const {
role = null, // Role is now optional - null means agnostic
worker_id,
pod_name,
hostname,
ip_address,
metadata = {}
} = req.body;
// Generate worker_id if not provided
const finalWorkerId = worker_id || `worker-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
const finalHostname = hostname || os.hostname();
const clientIp = ip_address || req.ip || req.socket.remoteAddress;
// Check if worker already registered
const existing = await pool.query(
'SELECT id, friendly_name, status FROM worker_registry WHERE worker_id = $1',
[finalWorkerId]
);
if (existing.rows.length > 0) {
// Re-activate existing worker
const { rows } = await pool.query(`
UPDATE worker_registry
SET status = 'active',
role = $1,
pod_name = $2,
hostname = $3,
ip_address = $4,
last_heartbeat_at = NOW(),
started_at = NOW(),
metadata = $5,
updated_at = NOW()
WHERE worker_id = $6
RETURNING id, worker_id, friendly_name, role
`, [role, pod_name, finalHostname, clientIp, metadata, finalWorkerId]);
const worker = rows[0];
const roleMsg = role ? `for ${role}` : 'as role-agnostic';
console.log(`[WorkerRegistry] Worker "${worker.friendly_name}" (${finalWorkerId}) re-registered ${roleMsg}`);
return res.json({
success: true,
worker_id: worker.worker_id,
friendly_name: worker.friendly_name,
role: worker.role,
message: role
? `Welcome back, ${worker.friendly_name}! You are assigned to ${role}.`
: `Welcome back, ${worker.friendly_name}! You are ready to take any task.`
});
}
// Assign a friendly name
const nameResult = await pool.query('SELECT assign_worker_name($1) as name', [finalWorkerId]);
const friendlyName = nameResult.rows[0].name;
// Register the worker
const { rows } = await pool.query(`
INSERT INTO worker_registry (
worker_id, friendly_name, role, pod_name, hostname, ip_address, status, metadata
) VALUES ($1, $2, $3, $4, $5, $6, 'active', $7)
RETURNING id, worker_id, friendly_name, role
`, [finalWorkerId, friendlyName, role, pod_name, finalHostname, clientIp, metadata]);
const worker = rows[0];
const roleMsg = role ? `for ${role}` : 'as role-agnostic';
console.log(`[WorkerRegistry] New worker "${friendlyName}" (${finalWorkerId}) reporting for duty ${roleMsg}`);
res.json({
success: true,
worker_id: worker.worker_id,
friendly_name: worker.friendly_name,
role: worker.role,
message: role
? `Hello ${friendlyName}! You are now registered for ${role}. Ready for work!`
: `Hello ${friendlyName}! You are ready to take any task from the pool.`
});
} catch (error: any) {
console.error('[WorkerRegistry] Registration error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
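A minimal sketch of the client side of this handshake, assuming the hub URL comes from an API_BASE env var (a placeholder) and a Node 18+ global fetch:

// Sketch only: request body fields follow the doc comment above.
const API_BASE = process.env.API_BASE || 'http://localhost:3000'; // placeholder default

async function registerWorker(): Promise<string> {
  const res = await fetch(`${API_BASE}/api/worker-registry/register`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      role: null,                      // role-agnostic worker
      pod_name: process.env.POD_NAME,  // set by the StatefulSet, if any
      metadata: { version: '1.0.0' },  // placeholder
    }),
  });
  const body = await res.json();
  console.log(body.message);           // e.g. "Hello <name>! You are ready to take any task..."
  return body.worker_id;
}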
/**
* POST /api/worker-registry/heartbeat
* Worker sends heartbeat to stay alive
*
* Body:
* - worker_id: string (required)
* - current_task_id: number (optional) - task currently being processed
* - status: string (optional) - 'active', 'idle'
*/
router.post('/heartbeat', async (req: Request, res: Response) => {
try {
const { worker_id, current_task_id, status = 'active', resources } = req.body;
if (!worker_id) {
return res.status(400).json({ success: false, error: 'worker_id is required' });
}
// Store resources in metadata jsonb column
const { rows } = await pool.query(`
UPDATE worker_registry
SET last_heartbeat_at = NOW(),
current_task_id = $1,
status = $2,
metadata = COALESCE(metadata, '{}'::jsonb) || COALESCE($4::jsonb, '{}'::jsonb),
updated_at = NOW()
WHERE worker_id = $3
RETURNING id, friendly_name, status
`, [current_task_id || null, status, worker_id, resources ? JSON.stringify(resources) : null]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found - please register first' });
}
res.json({
success: true,
worker: rows[0]
});
} catch (error: any) {
console.error('[WorkerRegistry] Heartbeat error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/task-completed
* Worker reports task completion
*
* Body:
* - worker_id: string (required)
* - success: boolean (required)
*/
router.post('/task-completed', async (req: Request, res: Response) => {
try {
const { worker_id, success } = req.body;
if (!worker_id) {
return res.status(400).json({ success: false, error: 'worker_id is required' });
}
const incrementField = success ? 'tasks_completed' : 'tasks_failed';
const { rows } = await pool.query(`
UPDATE worker_registry
SET ${incrementField} = ${incrementField} + 1,
last_task_at = NOW(),
current_task_id = NULL,
status = 'idle',
updated_at = NOW()
WHERE worker_id = $1
RETURNING id, friendly_name, tasks_completed, tasks_failed
`, [worker_id]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
res.json({ success: true, worker: rows[0] });
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/deregister
* Worker signing off (graceful shutdown)
*
* Body:
* - worker_id: string (required)
*/
router.post('/deregister', async (req: Request, res: Response) => {
try {
const { worker_id } = req.body;
if (!worker_id) {
return res.status(400).json({ success: false, error: 'worker_id is required' });
}
// Release the name back to the pool
await pool.query('SELECT release_worker_name($1)', [worker_id]);
// Mark as terminated
const { rows } = await pool.query(`
UPDATE worker_registry
SET status = 'terminated',
current_task_id = NULL,
updated_at = NOW()
WHERE worker_id = $1
RETURNING id, friendly_name
`, [worker_id]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
console.log(`[WorkerRegistry] Worker "${rows[0].friendly_name}" (${worker_id}) signed off`);
res.json({
success: true,
message: `Goodbye ${rows[0].friendly_name}! Thanks for your work.`
});
} catch (error: any) {
console.error('[WorkerRegistry] Deregister error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// WORKER LISTING (for Dashboard)
// ============================================================
/**
* GET /api/worker-registry/workers
* List all workers (for dashboard)
*
* Query params:
* - status: filter by status (active, idle, offline, all)
* - role: filter by role
* - include_terminated: include terminated workers (default: false)
*/
router.get('/workers', async (req: Request, res: Response) => {
try {
const { status, role, include_terminated = 'false' } = req.query;
let whereClause = include_terminated === 'true' ? 'WHERE 1=1' : "WHERE status != 'terminated'";
const params: any[] = [];
let paramIndex = 1;
if (status && status !== 'all') {
whereClause += ` AND status = $${paramIndex}`;
params.push(status);
paramIndex++;
}
if (role) {
whereClause += ` AND role = $${paramIndex}`;
params.push(role);
paramIndex++;
}
const { rows } = await pool.query(`
SELECT
id,
worker_id,
friendly_name,
role,
pod_name,
hostname,
ip_address,
status,
started_at,
last_heartbeat_at,
last_task_at,
tasks_completed,
tasks_failed,
current_task_id,
metadata,
EXTRACT(EPOCH FROM (NOW() - last_heartbeat_at)) as seconds_since_heartbeat,
CASE
WHEN status = 'offline' OR status = 'terminated' THEN status
WHEN last_heartbeat_at < NOW() - INTERVAL '2 minutes' THEN 'stale'
WHEN current_task_id IS NOT NULL THEN 'busy'
ELSE 'ready'
END as health_status,
created_at
FROM worker_registry
${whereClause}
ORDER BY
CASE status
WHEN 'active' THEN 1
WHEN 'idle' THEN 2
WHEN 'offline' THEN 3
ELSE 4
END,
last_heartbeat_at DESC
`, params);
// Get summary counts
const { rows: summary } = await pool.query(`
SELECT
COUNT(*) FILTER (WHERE status = 'active') as active_count,
COUNT(*) FILTER (WHERE status = 'idle') as idle_count,
COUNT(*) FILTER (WHERE status = 'offline') as offline_count,
COUNT(*) FILTER (WHERE status != 'terminated') as total_count,
COUNT(DISTINCT role) FILTER (WHERE status IN ('active', 'idle')) as active_roles
FROM worker_registry
`);
res.json({
success: true,
workers: rows,
summary: summary[0]
});
} catch (error: any) {
console.error('[WorkerRegistry] List workers error:', error);
res.status(500).json({ success: false, error: error.message });
}
});
/**
* GET /api/worker-registry/workers/:workerId
* Get specific worker details
*/
router.get('/workers/:workerId', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
const { rows } = await pool.query(`
SELECT * FROM worker_registry WHERE worker_id = $1
`, [workerId]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
res.json({ success: true, worker: rows[0] });
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* DELETE /api/worker-registry/workers/:workerId
* Remove a worker (admin action)
*/
router.delete('/workers/:workerId', async (req: Request, res: Response) => {
try {
const { workerId } = req.params;
// Release name
await pool.query('SELECT release_worker_name($1)', [workerId]);
// Delete worker
const { rows } = await pool.query(`
DELETE FROM worker_registry WHERE worker_id = $1 RETURNING friendly_name
`, [workerId]);
if (rows.length === 0) {
return res.status(404).json({ success: false, error: 'Worker not found' });
}
res.json({ success: true, message: `Worker ${rows[0].friendly_name} removed` });
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/cleanup
* Mark stale workers as offline
*
* Body:
* - stale_threshold_minutes: number (default: 5)
*/
router.post('/cleanup', async (req: Request, res: Response) => {
try {
const { stale_threshold_minutes = 5 } = req.body;
const { rows } = await pool.query(
'SELECT mark_stale_workers($1) as count',
[stale_threshold_minutes]
);
res.json({
success: true,
stale_workers_marked: rows[0].count,
message: `Marked ${rows[0].count} stale workers as offline`
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// NAME POOL MANAGEMENT
// ============================================================
/**
* GET /api/worker-registry/names
* List all names in the pool
*/
router.get('/names', async (_req: Request, res: Response) => {
try {
const { rows } = await pool.query(`
SELECT
id,
name,
in_use,
assigned_to,
assigned_at
FROM worker_name_pool
ORDER BY in_use DESC, name ASC
`);
const { rows: summary } = await pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE in_use = true) as in_use,
COUNT(*) FILTER (WHERE in_use = false) as available
FROM worker_name_pool
`);
res.json({
success: true,
names: rows,
summary: summary[0]
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* POST /api/worker-registry/names
* Add names to the pool
*
* Body:
* - names: string[] (required) - array of names to add
*/
router.post('/names', async (req: Request, res: Response) => {
try {
const { names } = req.body;
if (!names || !Array.isArray(names) || names.length === 0) {
return res.status(400).json({ success: false, error: 'names array is required' });
}
// Parameterized unnest() insert avoids hand-building SQL value lists
const { rowCount } = await pool.query(`
INSERT INTO worker_name_pool (name)
SELECT unnest($1::text[])
ON CONFLICT (name) DO NOTHING
`, [names]);
res.json({
success: true,
added: rowCount,
message: `Added ${rowCount} new names to the pool`
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
/**
* DELETE /api/worker-registry/names/:name
* Remove a name from the pool (only if not in use)
*/
router.delete('/names/:name', async (req: Request, res: Response) => {
try {
const { name } = req.params;
const { rows } = await pool.query(`
DELETE FROM worker_name_pool
WHERE name = $1 AND in_use = false
RETURNING name
`, [name]);
if (rows.length === 0) {
return res.status(400).json({
success: false,
error: 'Name not found or currently in use'
});
}
res.json({ success: true, message: `Name "${name}" removed from pool` });
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
// ============================================================
// ROLE MANAGEMENT
// ============================================================
/**
* GET /api/worker-registry/roles
* List available task roles
*/
router.get('/roles', async (_req: Request, res: Response) => {
// These are the roles the task handlers support
const roles = [
{
id: 'product_refresh',
name: 'Product Refresh',
description: 'Re-crawl dispensary products for price/stock changes',
handler: 'handleProductRefresh'
},
{
id: 'product_discovery',
name: 'Product Discovery',
description: 'Initial product discovery for new dispensaries',
handler: 'handleProductDiscovery'
},
{
id: 'store_discovery',
name: 'Store Discovery',
description: 'Discover new dispensary locations',
handler: 'handleStoreDiscovery'
},
{
id: 'entry_point_discovery',
name: 'Entry Point Discovery',
description: 'Resolve platform IDs from menu URLs',
handler: 'handleEntryPointDiscovery'
},
{
id: 'analytics_refresh',
name: 'Analytics Refresh',
description: 'Refresh materialized views and analytics',
handler: 'handleAnalyticsRefresh'
}
];
// Get active worker counts per role
try {
const { rows } = await pool.query(`
SELECT role, COUNT(*) as worker_count
FROM worker_registry
WHERE status IN ('active', 'idle')
GROUP BY role
`);
const countMap = new Map(rows.map(r => [r.role, parseInt(r.worker_count)]));
const rolesWithCounts = roles.map(r => ({
...r,
active_workers: countMap.get(r.id) || 0
}));
res.json({ success: true, roles: rolesWithCounts });
} catch {
// If table doesn't exist yet, just return roles without counts
res.json({ success: true, roles: roles.map(r => ({ ...r, active_workers: 0 })) });
}
});
/**
* GET /api/worker-registry/capacity
* Get capacity planning info
*/
router.get('/capacity', async (_req: Request, res: Response) => {
try {
// Get worker counts by role
const { rows: workerCounts } = await pool.query(`
SELECT role, COUNT(*) as count
FROM worker_registry
WHERE status IN ('active', 'idle')
GROUP BY role
`);
// Get pending task counts by role (if worker_tasks exists)
let taskCounts: any[] = [];
try {
const result = await pool.query(`
SELECT role, COUNT(*) as pending_count
FROM worker_tasks
WHERE status = 'pending'
GROUP BY role
`);
taskCounts = result.rows;
} catch {
// worker_tasks might not exist yet
}
// Get crawl-enabled store count
const storeCountResult = await pool.query(`
SELECT COUNT(*) as count
FROM dispensaries
WHERE crawl_enabled = true AND platform_dispensary_id IS NOT NULL
`);
const totalStores = parseInt(storeCountResult.rows[0].count);
const workerMap = new Map(workerCounts.map(r => [r.role, parseInt(r.count)]));
const taskMap = new Map(taskCounts.map(r => [r.role, parseInt(r.pending_count)]));
const roles = ['product_refresh', 'product_discovery', 'store_discovery', 'entry_point_discovery', 'analytics_refresh'];
const capacity = roles.map(role => ({
role,
active_workers: workerMap.get(role) || 0,
pending_tasks: taskMap.get(role) || 0,
// Rough estimate: 20 seconds per task, 4-hour cycle
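// (4-hour cycle = 14,400 s; 14,400 s / 20 s per task = 720 tasks per worker per cycle)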
tasks_per_worker_per_cycle: 720,
workers_needed_for_all_stores: Math.ceil(totalStores / 720)
}));
res.json({
success: true,
total_stores: totalStores,
capacity
});
} catch (error: any) {
res.status(500).json({ success: false, error: error.message });
}
});
export default router;
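
For context, a minimal sketch of how a worker process might drive the endpoints above, assuming Node 18+ for global fetch; the base URL, environment variables, and the shape of the resources payload are assumptions, not part of this file:

// Hypothetical worker-side client (sketch only, not part of this router file).
const API_BASE = process.env.REGISTRY_URL || 'http://localhost:3000/api/worker-registry';

async function post(path: string, body: unknown): Promise<any> {
  const res = await fetch(`${API_BASE}${path}`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(body),
  });
  return res.json();
}

function startHeartbeatLoop(workerId: string, intervalMs = 30_000): NodeJS.Timeout {
  // Heartbeat every 30s; `resources` is merged into worker_registry.metadata by the endpoint.
  return setInterval(async () => {
    try {
      await post('/heartbeat', {
        worker_id: workerId,
        status: 'active',
        resources: { rss_mb: Math.round(process.memoryUsage().rss / 1e6) },
      });
    } catch (err) {
      console.error('[Worker] Heartbeat failed:', err);
    }
  }, intervalMs);
}

async function reportTaskDone(workerId: string, ok: boolean): Promise<void> {
  // e.g. await reportTaskDone(workerId, true) after a task handler resolves
  await post('/task-completed', { worker_id: workerId, success: ok });
}

process.on('SIGTERM', async () => {
  // Graceful shutdown: release the friendly name and mark the registry row terminated.
  await post('/deregister', { worker_id: process.env.WORKER_ID });
  process.exit(0);
});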

View File

@@ -0,0 +1,250 @@
#!/usr/bin/env npx tsx
/**
* Crawl Single Store - Verbose test showing each step
*
* Usage:
* DATABASE_URL="postgresql://dutchie:dutchie_local_pass@localhost:54320/dutchie_menus" \
* npx tsx src/scripts/crawl-single-store.ts <dispensaryId>
*
* Example:
* DATABASE_URL="..." npx tsx src/scripts/crawl-single-store.ts 112
*/
import { Pool } from 'pg';
import dotenv from 'dotenv';
import {
executeGraphQL,
startSession,
endSession,
getFingerprint,
GRAPHQL_HASHES,
DUTCHIE_CONFIG,
} from '../platforms/dutchie';
dotenv.config();
// ============================================================
// DATABASE CONNECTION
// ============================================================
function getConnectionString(): string {
if (process.env.DATABASE_URL) {
return process.env.DATABASE_URL;
}
if (process.env.CANNAIQ_DB_URL) {
return process.env.CANNAIQ_DB_URL;
}
const host = process.env.CANNAIQ_DB_HOST || 'localhost';
const port = process.env.CANNAIQ_DB_PORT || '54320';
const name = process.env.CANNAIQ_DB_NAME || 'dutchie_menus';
const user = process.env.CANNAIQ_DB_USER || 'dutchie';
const pass = process.env.CANNAIQ_DB_PASS || 'dutchie_local_pass';
return `postgresql://${user}:${pass}@${host}:${port}/${name}`;
}
const pool = new Pool({ connectionString: getConnectionString() });
// ============================================================
// MAIN
// ============================================================
async function main() {
const dispensaryId = parseInt(process.argv[2], 10);
if (!dispensaryId) {
console.error('Usage: npx tsx src/scripts/crawl-single-store.ts <dispensaryId>');
console.error('Example: npx tsx src/scripts/crawl-single-store.ts 112');
process.exit(1);
}
console.log('');
console.log('╔════════════════════════════════════════════════════════════╗');
console.log('║ SINGLE STORE CRAWL - VERBOSE OUTPUT ║');
console.log('╚════════════════════════════════════════════════════════════╝');
console.log('');
try {
// ============================================================
// STEP 1: Get dispensary info from database
// ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 1: Load Dispensary Info from Database │');
console.log('└─────────────────────────────────────────────────────────────┘');
const dispResult = await pool.query(`
SELECT
id,
name,
platform_dispensary_id,
menu_url,
menu_type,
city,
state
FROM dispensaries
WHERE id = $1
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
throw new Error(`Dispensary ${dispensaryId} not found`);
}
const disp = dispResult.rows[0];
console.log(` Dispensary ID: ${disp.id}`);
console.log(` Name: ${disp.name}`);
console.log(` City, State: ${disp.city}, ${disp.state}`);
console.log(` Menu Type: ${disp.menu_type}`);
console.log(` Platform ID: ${disp.platform_dispensary_id}`);
console.log(` Menu URL: ${disp.menu_url}`);
if (!disp.platform_dispensary_id) {
throw new Error('Dispensary does not have a platform_dispensary_id - cannot crawl');
}
// Extract cName from menu_url
const cNameMatch = disp.menu_url?.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
const cName = cNameMatch ? cNameMatch[1] : 'dispensary';
console.log(` cName (derived): ${cName}`);
console.log('');
// ============================================================
// STEP 2: Start stealth session
// ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 2: Start Stealth Session │');
console.log('└─────────────────────────────────────────────────────────────┘');
// Use the store's state, with a hardcoded Arizona timezone for this test script
const session = startSession(disp.state || 'AZ', 'America/Phoenix');
const fp = getFingerprint();
console.log(` Session ID: ${session.sessionId}`);
console.log(` User-Agent: ${fp.userAgent.slice(0, 60)}...`);
console.log(` Accept-Language: ${fp.acceptLanguage}`);
console.log(` Sec-CH-UA: ${fp.secChUa || '(not set)'}`);
console.log('');
// ============================================================
// STEP 3: Execute GraphQL query
// ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 3: Execute GraphQL Query (FilteredProducts) │');
console.log('└─────────────────────────────────────────────────────────────┘');
const variables = {
includeEnterpriseSpecials: false,
productsFilter: {
dispensaryId: disp.platform_dispensary_id,
pricingType: 'rec',
Status: 'Active',
types: [],
useCache: true,
isDefaultSort: true,
sortBy: 'popularSortIdx',
sortDirection: 1,
bypassOnlineThresholds: true,
isKioskMenu: false,
removeProductsBelowOptionThresholds: false,
},
page: 0,
perPage: 100,
};
console.log(` Endpoint: ${DUTCHIE_CONFIG.graphqlEndpoint}`);
console.log(` Operation: FilteredProducts`);
console.log(` Hash: ${GRAPHQL_HASHES.FilteredProducts.slice(0, 20)}...`);
console.log(` dispensaryId: ${variables.productsFilter.dispensaryId}`);
console.log(` pricingType: ${variables.productsFilter.pricingType}`);
console.log(` Status: ${variables.productsFilter.Status}`);
console.log(` perPage: ${variables.perPage}`);
console.log('');
console.log(' Sending request...');
const startTime = Date.now();
const result = await executeGraphQL(
'FilteredProducts',
variables,
GRAPHQL_HASHES.FilteredProducts,
{ cName, maxRetries: 3 }
);
const elapsed = Date.now() - startTime;
console.log(` Response time: ${elapsed}ms`);
console.log('');
// ============================================================
// STEP 4: Process response
// ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 4: Process Response │');
console.log('└─────────────────────────────────────────────────────────────┘');
const data = result?.data?.filteredProducts;
if (!data) {
console.log(' ERROR: No data returned from GraphQL');
console.log(' Raw result:', JSON.stringify(result, null, 2).slice(0, 500));
endSession();
return;
}
const products = data.products || [];
const totalCount = data.queryInfo?.totalCount || 0;
const totalPages = Math.ceil(totalCount / 100);
console.log(` Total products: ${totalCount}`);
console.log(` Products in page: ${products.length}`);
console.log(` Total pages: ${totalPages}`);
console.log('');
// Show first few products
console.log(' First 5 products:');
console.log(' ─────────────────────────────────────────────────────────');
for (let i = 0; i < Math.min(5, products.length); i++) {
const p = products[i];
const name = (p.name || 'Unknown').slice(0, 40);
const brand = (p.brand?.name || 'Unknown').slice(0, 15);
const price = p.Prices?.[0]?.price || p.medPrice || p.recPrice || 'N/A';
const category = p.type || p.category || 'N/A';
console.log(`  ${i + 1}. ${name.padEnd(42)} | ${brand.padEnd(17)} | ${category.padEnd(12)} | $${price}`);
}
console.log('');
// ============================================================
// STEP 5: End session
// ============================================================
console.log('┌─────────────────────────────────────────────────────────────┐');
console.log('│ STEP 5: End Session │');
console.log('└─────────────────────────────────────────────────────────────┘');
endSession();
console.log('');
// ============================================================
// SUMMARY
// ============================================================
console.log('╔════════════════════════════════════════════════════════════╗');
console.log('║ SUMMARY ║');
console.log('╠════════════════════════════════════════════════════════════╣');
console.log(`║ Store: ${disp.name.slice(0, 38).padEnd(38)}`);
console.log(`║ Products Found: ${String(totalCount).padEnd(38)}`);
console.log(`║ Response Time: ${(elapsed + 'ms').padEnd(38)}`);
console.log(`║ Status: ${'SUCCESS'.padEnd(38)}`);
console.log('╚════════════════════════════════════════════════════════════╝');
} catch (error: any) {
console.error('');
console.error('╔════════════════════════════════════════════════════════════╗');
console.error('║ ERROR ║');
console.error('╚════════════════════════════════════════════════════════════╝');
console.error(` ${error.message}`);
if (error.stack) {
console.error('');
console.error('Stack trace:');
console.error(error.stack.split('\n').slice(0, 5).join('\n'));
}
process.exit(1);
} finally {
await pool.end();
}
}
main();

View File

@@ -23,6 +23,7 @@ import {
DutchieNormalizer,
hydrateToCanonical,
} from '../hydration';
import { initializeImageStorage } from '../utils/image-storage';
dotenv.config();
@@ -137,6 +138,11 @@ async function main() {
console.log(`Test Crawl to Canonical - Dispensary ${dispensaryId}`);
console.log('============================================================\n');
// Initialize image storage
console.log('[Init] Initializing image storage...');
await initializeImageStorage();
console.log(' Image storage ready\n');
try {
// Step 1: Get dispensary info
console.log('[Step 1] Getting dispensary info...');

View File

@@ -0,0 +1,80 @@
#!/usr/bin/env npx tsx
/**
* Test Image Proxy - Standalone test without backend
*
* Usage:
* npx tsx src/scripts/test-image-proxy.ts
*/
import express from 'express';
import imageProxyRoutes from '../routes/image-proxy';
import axios from 'axios';
const app = express();
const PORT = 3099;
// Mount the image proxy
app.use('/img', imageProxyRoutes);
// Start server
app.listen(PORT, async () => {
console.log(`Test image proxy running on http://localhost:${PORT}`);
console.log('');
console.log('Testing image proxy...');
console.log('');
// Test cases
const tests = [
{
name: 'Original image',
url: '/img/products/az/az-deeply-rooted/clout-king/68b4b20a0f9ef3e90eb51e96/image-268a6e44.webp',
},
{
name: 'Resize to 200px width',
url: '/img/products/az/az-deeply-rooted/clout-king/68b4b20a0f9ef3e90eb51e96/image-268a6e44.webp?w=200',
},
{
name: 'Resize to 100x100 cover',
url: '/img/products/az/az-deeply-rooted/clout-king/68b4b20a0f9ef3e90eb51e96/image-268a6e44.webp?w=100&h=100&fit=cover',
},
{
name: 'Grayscale + blur',
url: '/img/products/az/az-deeply-rooted/clout-king/68b4b20a0f9ef3e90eb51e96/image-268a6e44.webp?w=200&gray=1&blur=2',
},
{
name: 'Convert to JPEG',
url: '/img/products/az/az-deeply-rooted/clout-king/68b4b20a0f9ef3e90eb51e96/image-268a6e44.webp?w=200&format=jpeg&q=70',
},
{
name: 'Non-existent image',
url: '/img/products/az/nonexistent/image.webp',
},
];
for (const test of tests) {
try {
const response = await axios.get(`http://localhost:${PORT}${test.url}`, {
responseType: 'arraybuffer',
validateStatus: () => true,
});
const contentType = response.headers['content-type'];
const size = response.data.length;
const status = response.status;
console.log(`${test.name}:`);
console.log(` URL: ${test.url.slice(0, 80)}${test.url.length > 80 ? '...' : ''}`);
console.log(` Status: ${status}`);
console.log(` Content-Type: ${contentType}`);
console.log(` Size: ${(size / 1024).toFixed(1)} KB`);
console.log('');
} catch (error: any) {
console.log(`${test.name}: ERROR - ${error.message}`);
console.log('');
}
}
console.log('Tests complete!');
process.exit(0);
});
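
The query parameters exercised by these test cases (w, h, fit, gray, blur, format, q) imply a small URL-building helper on the consumer side; a hedged sketch follows, where the helper name and option types are my own and not part of the image proxy route:

// Hypothetical helper (assumption): builds /img/... URLs using the query
// params exercised by the test cases above.
interface ImgOpts {
  w?: number;
  h?: number;
  fit?: 'cover' | 'contain';
  gray?: boolean;
  blur?: number;
  format?: 'webp' | 'jpeg' | 'png';
  q?: number; // quality, e.g. 70
}

export function buildImageUrl(storagePath: string, opts: ImgOpts = {}): string {
  const params = new URLSearchParams();
  if (opts.w) params.set('w', String(opts.w));
  if (opts.h) params.set('h', String(opts.h));
  if (opts.fit) params.set('fit', opts.fit);
  if (opts.gray) params.set('gray', '1');
  if (opts.blur) params.set('blur', String(opts.blur));
  if (opts.format) params.set('format', opts.format);
  if (opts.q) params.set('q', String(opts.q));
  const qs = params.toString();
  return `/img/${storagePath}${qs ? `?${qs}` : ''}`;
}

// e.g. buildImageUrl('products/az/store/brand/abc123/image.webp', { w: 200, format: 'jpeg', q: 70 })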

View File

@@ -0,0 +1,117 @@
/**
* Test script for stealth session management
*
* Tests:
* 1. Per-session fingerprint rotation
* 2. Geographic consistency (timezone → Accept-Language)
* 3. Proxy location loading from database
*
* Usage:
* npx tsx src/scripts/test-stealth-session.ts
*/
import {
startSession,
endSession,
getCurrentSession,
getFingerprint,
getRandomFingerprint,
getLocaleForTimezone,
buildHeaders,
} from '../platforms/dutchie';
console.log('='.repeat(60));
console.log('STEALTH SESSION TEST');
console.log('='.repeat(60));
// Test 1: Timezone to Locale mapping
console.log('\n[Test 1] Timezone to Locale Mapping:');
const testTimezones = [
'America/Phoenix',
'America/Los_Angeles',
'America/New_York',
'America/Chicago',
undefined,
'Invalid/Timezone',
];
for (const tz of testTimezones) {
const locale = getLocaleForTimezone(tz);
console.log(`  ${tz || '(undefined)'} → ${locale}`);
}
// Test 2: Random fingerprint selection
console.log('\n[Test 2] Random Fingerprint Selection (5 samples):');
for (let i = 0; i < 5; i++) {
const fp = getRandomFingerprint();
console.log(` ${i + 1}. ${fp.userAgent.slice(0, 60)}...`);
}
// Test 3: Session Management
console.log('\n[Test 3] Session Management:');
// Before session - should use default fingerprint
console.log(' Before session:');
const beforeFp = getFingerprint();
console.log(` getFingerprint(): ${beforeFp.userAgent.slice(0, 50)}...`);
console.log(` getCurrentSession(): ${getCurrentSession()}`);
// Start session with Arizona timezone
console.log('\n Starting session (AZ, America/Phoenix):');
const session1 = startSession('AZ', 'America/Phoenix');
console.log(` Session ID: ${session1.sessionId}`);
console.log(` Fingerprint UA: ${session1.fingerprint.userAgent.slice(0, 50)}...`);
console.log(` Accept-Language: ${session1.fingerprint.acceptLanguage}`);
console.log(` Timezone: ${session1.timezone}`);
// During session - should use session fingerprint
console.log('\n During session:');
const duringFp = getFingerprint();
console.log(` getFingerprint(): ${duringFp.userAgent.slice(0, 50)}...`);
console.log(` Same as session? ${duringFp.userAgent === session1.fingerprint.userAgent}`);
// Test buildHeaders with session
console.log('\n buildHeaders() during session:');
const headers = buildHeaders('/embedded-menu/test-store');
console.log(` User-Agent: ${headers['user-agent'].slice(0, 50)}...`);
console.log(` Accept-Language: ${headers['accept-language']}`);
console.log(` Origin: ${headers['origin']}`);
console.log(` Referer: ${headers['referer']}`);
// End session
console.log('\n Ending session:');
endSession();
console.log(` getCurrentSession(): ${getCurrentSession()}`);
// Test 4: Multiple sessions should have different fingerprints
console.log('\n[Test 4] Multiple Sessions (fingerprint variety):');
const fingerprints: string[] = [];
for (let i = 0; i < 10; i++) {
const session = startSession('CA', 'America/Los_Angeles');
fingerprints.push(session.fingerprint.userAgent);
endSession();
}
const uniqueCount = new Set(fingerprints).size;
console.log(` 10 sessions created, ${uniqueCount} unique fingerprints`);
console.log(` Variety: ${uniqueCount >= 3 ? '✅ Good' : '⚠️ Low - may need more fingerprint options'}`);
// Test 5: Geographic consistency check
console.log('\n[Test 5] Geographic Consistency:');
const geoTests = [
{ state: 'AZ', tz: 'America/Phoenix' },
{ state: 'CA', tz: 'America/Los_Angeles' },
{ state: 'NY', tz: 'America/New_York' },
{ state: 'IL', tz: 'America/Chicago' },
];
for (const { state, tz } of geoTests) {
const session = startSession(state, tz);
const consistent = session.fingerprint.acceptLanguage.includes('en-US');
console.log(` ${state} (${tz}): Accept-Language=${session.fingerprint.acceptLanguage} ${consistent ? '✅' : '❌'}`);
endSession();
}
console.log('\n' + '='.repeat(60));
console.log('TEST COMPLETE');
console.log('='.repeat(60));

View File

@@ -61,6 +61,13 @@ export interface Proxy {
failureCount: number;
successCount: number;
avgResponseTimeMs: number | null;
maxConnections: number; // Number of concurrent connections allowed (for rotating proxies)
// Location info (if known)
city?: string;
state?: string;
country?: string;
countryCode?: string;
timezone?: string;
}
export interface ProxyStats {
@@ -109,18 +116,27 @@ export class ProxyRotator {
username,
password,
protocol,
is_active as "isActive",
last_used_at as "lastUsedAt",
active as "isActive",
last_tested_at as "lastUsedAt",
failure_count as "failureCount",
success_count as "successCount",
avg_response_time_ms as "avgResponseTimeMs"
0 as "successCount",
response_time_ms as "avgResponseTimeMs",
COALESCE(max_connections, 1) as "maxConnections",
city,
state,
country,
country_code as "countryCode",
timezone
FROM proxies
WHERE is_active = true
ORDER BY failure_count ASC, last_used_at ASC NULLS FIRST
WHERE active = true
ORDER BY failure_count ASC, last_tested_at ASC NULLS FIRST
`);
this.proxies = result.rows;
console.log(`[ProxyRotator] Loaded ${this.proxies.length} active proxies`);
// Calculate total concurrent capacity
const totalCapacity = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0);
console.log(`[ProxyRotator] Loaded ${this.proxies.length} active proxies (${totalCapacity} max concurrent connections)`);
} catch (error) {
// Table might not exist - that's okay
console.warn(`[ProxyRotator] Could not load proxies: ${error}`);
@@ -192,11 +208,11 @@ export class ProxyRotator {
UPDATE proxies
SET
failure_count = failure_count + 1,
last_failure_at = NOW(),
last_error = $2,
is_active = CASE WHEN failure_count >= 4 THEN false ELSE is_active END
updated_at = NOW(),
test_result = $2,
active = CASE WHEN failure_count >= 4 THEN false ELSE active END
WHERE id = $1
`, [proxyId, error || null]);
`, [proxyId, error || 'failed']);
} catch (err) {
console.error(`[ProxyRotator] Failed to update proxy ${proxyId}:`, err);
}
@@ -226,12 +242,13 @@ export class ProxyRotator {
await this.pool.query(`
UPDATE proxies
SET
success_count = success_count + 1,
last_used_at = NOW(),
avg_response_time_ms = CASE
WHEN avg_response_time_ms IS NULL THEN $2
ELSE (avg_response_time_ms * 0.8) + ($2 * 0.2)
END
last_tested_at = NOW(),
test_result = 'success',
response_time_ms = CASE
WHEN response_time_ms IS NULL THEN $2
ELSE (response_time_ms * 0.8 + $2 * 0.2)::integer
END,
updated_at = NOW()
WHERE id = $1
`, [proxyId, responseTimeMs || null]);
} catch (err) {
@@ -255,7 +272,7 @@ export class ProxyRotator {
*/
getStats(): ProxyStats {
const totalProxies = this.proxies.length;
const activeProxies = this.proxies.filter(p => p.isActive).length;
const activeProxies = this.proxies.reduce((sum, p) => sum + p.maxConnections, 0); // Total concurrent capacity
const blockedProxies = this.proxies.filter(p => p.failureCount >= 5).length;
const successRates = this.proxies
@@ -268,7 +285,7 @@ export class ProxyRotator {
return {
totalProxies,
activeProxies,
activeProxies, // Total concurrent capacity across all proxies
blockedProxies,
avgSuccessRate,
};
@@ -402,6 +419,26 @@ export class CrawlRotator {
await this.proxy.markFailed(current.id, error);
}
}
/**
* Get current proxy location info (for reporting)
* Note: For rotating proxies (like IPRoyal), the actual exit location varies per request
*/
getProxyLocation(): { city?: string; state?: string; country?: string; timezone?: string; isRotating: boolean } | null {
const current = this.proxy.getCurrent();
if (!current) return null;
// Check if this is a rotating proxy (max_connections > 1 usually indicates rotating)
const isRotating = current.maxConnections > 1;
return {
city: current.city,
state: current.state,
country: current.country,
timezone: current.timezone,
isRotating
};
}
}
// ============================================================
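
A hedged sketch of how a worker might report the exit location from getProxyLocation() with its heartbeat, assuming the registry heartbeat endpoint merges a resources payload into metadata; the import path and field nesting are assumptions:

// Hypothetical usage (sketch only): attach the proxy exit location to the
// worker heartbeat so the dashboard can display it.
import { CrawlRotator } from './crawl-rotator'; // import path is an assumption

async function heartbeatWithLocation(rotator: CrawlRotator, workerId: string): Promise<void> {
  // { city, state, country, timezone, isRotating } or null when no proxy is active
  const location = rotator.getProxyLocation();
  await fetch('http://localhost:3000/api/worker-registry/heartbeat', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({
      worker_id: workerId,
      status: 'active',
      resources: location ? { proxy_location: location } : {},
    }),
  });
}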

View File

@@ -0,0 +1,134 @@
/**
* IP2Location Service
*
* Uses a local IP2Location LITE database (DB3 or DB5) for IP geolocation.
* No external API calls, no rate limits.
*
* Database: IP2Location LITE (free, monthly updates); the default path points at DB5
* Fields: country, region, city, plus latitude/longitude when using DB5 or above
*/
import path from 'path';
import fs from 'fs';
// @ts-ignore - no types for ip2location-nodejs
const { IP2Location } = require('ip2location-nodejs');
const DB_PATH = process.env.IP2LOCATION_DB_PATH ||
path.join(__dirname, '../../data/ip2location/IP2LOCATION-LITE-DB5.BIN');
let ip2location: any = null;
let dbLoaded = false;
/**
* Initialize IP2Location database
*/
export function initIP2Location(): boolean {
if (dbLoaded) return true;
try {
if (!fs.existsSync(DB_PATH)) {
console.warn(`IP2Location database not found at: ${DB_PATH}`);
console.warn('Run: ./scripts/download-ip2location.sh to download');
return false;
}
ip2location = new IP2Location();
ip2location.open(DB_PATH);
dbLoaded = true;
console.log('IP2Location database loaded successfully');
return true;
} catch (err) {
console.error('Failed to load IP2Location database:', err);
return false;
}
}
/**
* Close IP2Location database
*/
export function closeIP2Location(): void {
if (ip2location) {
ip2location.close();
ip2location = null;
dbLoaded = false;
}
}
export interface GeoLocation {
city: string | null;
state: string | null;
stateCode: string | null;
country: string | null;
countryCode: string | null;
lat: number | null;
lng: number | null;
}
/**
* Lookup IP address location
*
* @param ip - IPv4 or IPv6 address
* @returns Location data or null if not found
*/
export function lookupIP(ip: string): GeoLocation | null {
// Skip private/localhost IPs
if (!ip || ip === '127.0.0.1' || ip === '::1' ||
ip.startsWith('192.168.') || ip.startsWith('10.') ||
ip.startsWith('172.16.') || ip.startsWith('172.17.') ||
ip.startsWith('::ffff:127.') || ip.startsWith('::ffff:192.168.') ||
ip.startsWith('::ffff:10.')) {
return null;
}
// Strip IPv6 prefix if present
const cleanIP = ip.replace(/^::ffff:/, '');
// Initialize on first use if not already loaded
if (!dbLoaded) {
if (!initIP2Location()) {
return null;
}
}
try {
const result = ip2location.getAll(cleanIP);
if (!result || result.ip === '?' || result.countryShort === '-') {
return null;
}
// LITE DB3 doesn't include lat/lng (DB5 and above do) - treat 0 as "not set"
const lat = typeof result.latitude === 'number' && result.latitude !== 0 ? result.latitude : null;
const lng = typeof result.longitude === 'number' && result.longitude !== 0 ? result.longitude : null;
return {
city: result.city !== '-' ? result.city : null,
state: result.region !== '-' ? result.region : null,
stateCode: null, // DB3 doesn't include state codes
country: result.countryLong !== '-' ? result.countryLong : null,
countryCode: result.countryShort !== '-' ? result.countryShort : null,
lat,
lng,
};
} catch (err) {
console.error('IP2Location lookup error:', err);
return null;
}
}
/**
* Check if IP2Location database is available
*/
export function isIP2LocationAvailable(): boolean {
if (dbLoaded) return true;
return fs.existsSync(DB_PATH);
}
// Export singleton-style interface
export default {
init: initIP2Location,
close: closeIP2Location,
lookup: lookupIP,
isAvailable: isIP2LocationAvailable,
};
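
As a usage sketch, the lookup could back a small Express middleware for visitor analytics; everything here except lookupIP and GeoLocation is an assumption (import path, trust-proxy setup):

// Hypothetical Express middleware (sketch only): attach a visitor's geo
// lookup to the request for downstream analytics handlers.
import { Request, Response, NextFunction } from 'express';
import { lookupIP, GeoLocation } from './ip2location'; // import path is an assumption

declare module 'express-serve-static-core' {
  interface Request {
    geo?: GeoLocation | null;
  }
}

export function geoMiddleware(req: Request, _res: Response, next: NextFunction): void {
  // req.ip reflects X-Forwarded-For only if `app.set('trust proxy', true)` is configured.
  req.geo = lookupIP(req.ip || '');
  next();
}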

View File

@@ -276,7 +276,6 @@ export async function addProxiesFromList(proxies: Array<{
await pool.query(`
INSERT INTO proxies (host, port, protocol, username, password, active)
VALUES ($1, $2, $3, $4, $5, false)
ON CONFLICT (host, port, protocol) DO NOTHING
`, [
proxy.host,
proxy.port,
@@ -285,27 +284,9 @@ export async function addProxiesFromList(proxies: Array<{
proxy.password
]);
// Check if it was actually inserted
const result = await pool.query(`
SELECT id FROM proxies
WHERE host = $1 AND port = $2 AND protocol = $3
`, [proxy.host, proxy.port, proxy.protocol]);
if (result.rows.length > 0) {
// Check if it was just inserted (no last_tested_at means new)
const checkResult = await pool.query(`
SELECT last_tested_at FROM proxies
WHERE host = $1 AND port = $2 AND protocol = $3
`, [proxy.host, proxy.port, proxy.protocol]);
if (checkResult.rows[0].last_tested_at === null) {
added++;
if (added % 100 === 0) {
console.log(`📥 Imported ${added} proxies...`);
}
} else {
duplicates++;
}
added++;
if (added % 100 === 0) {
console.log(`📥 Imported ${added} proxies...`);
}
} catch (error: any) {
failed++;

View File

@@ -8,8 +8,12 @@ interface ProxyTestJob {
tested_proxies: number;
passed_proxies: number;
failed_proxies: number;
mode?: string; // 'all' | 'failed' | 'inactive'
}
// Concurrency settings
const DEFAULT_CONCURRENCY = 10; // Test 10 proxies at a time
// Simple in-memory queue - could be replaced with Bull/Bee-Queue for production
const activeJobs = new Map<number, { cancelled: boolean }>();
@@ -33,18 +37,35 @@ export async function cleanupOrphanedJobs(): Promise<void> {
}
}
export async function createProxyTestJob(): Promise<number> {
export type ProxyTestMode = 'all' | 'failed' | 'inactive';
export async function createProxyTestJob(mode: ProxyTestMode = 'all', concurrency: number = DEFAULT_CONCURRENCY): Promise<number> {
// Check for existing running jobs first
const existingJob = await getActiveProxyTestJob();
if (existingJob) {
throw new Error('A proxy test job is already running. Please cancel it first.');
}
const result = await pool.query(`
SELECT COUNT(*) as count FROM proxies
`);
// Get count based on mode
let countQuery: string;
switch (mode) {
case 'failed':
countQuery = `SELECT COUNT(*) as count FROM proxies WHERE test_result = 'failed' OR active = false`;
break;
case 'inactive':
countQuery = `SELECT COUNT(*) as count FROM proxies WHERE active = false`;
break;
default:
countQuery = `SELECT COUNT(*) as count FROM proxies`;
}
const result = await pool.query(countQuery);
const totalProxies = parseInt(result.rows[0].count);
if (totalProxies === 0) {
throw new Error(`No proxies to test with mode '${mode}'`);
}
const jobResult = await pool.query(`
INSERT INTO proxy_test_jobs (status, total_proxies)
VALUES ('pending', $1)
@@ -53,8 +74,8 @@ export async function createProxyTestJob(): Promise<number> {
const jobId = jobResult.rows[0].id;
// Start job in background
runProxyTestJob(jobId).catch(err => {
// Start job in background with mode and concurrency
runProxyTestJob(jobId, mode, concurrency).catch(err => {
console.error(`❌ Proxy test job ${jobId} failed:`, err);
});
@@ -111,7 +132,7 @@ export async function cancelProxyTestJob(jobId: number): Promise<boolean> {
return result.rows.length > 0;
}
async function runProxyTestJob(jobId: number): Promise<void> {
async function runProxyTestJob(jobId: number, mode: ProxyTestMode = 'all', concurrency: number = DEFAULT_CONCURRENCY): Promise<void> {
// Register job as active
activeJobs.set(jobId, { cancelled: false });
@@ -125,20 +146,30 @@ async function runProxyTestJob(jobId: number): Promise<void> {
WHERE id = $1
`, [jobId]);
console.log(`🔍 Starting proxy test job ${jobId}...`);
console.log(`🔍 Starting proxy test job ${jobId} (mode: ${mode}, concurrency: ${concurrency})...`);
// Get all proxies
const result = await pool.query(`
SELECT id, host, port, protocol, username, password
FROM proxies
ORDER BY id
`);
// Get proxies based on mode
let query: string;
switch (mode) {
case 'failed':
query = `SELECT id, host, port, protocol, username, password FROM proxies WHERE test_result = 'failed' OR active = false ORDER BY id`;
break;
case 'inactive':
query = `SELECT id, host, port, protocol, username, password FROM proxies WHERE active = false ORDER BY id`;
break;
default:
query = `SELECT id, host, port, protocol, username, password FROM proxies ORDER BY id`;
}
const result = await pool.query(query);
const proxies = result.rows;
let tested = 0;
let passed = 0;
let failed = 0;
for (const proxy of result.rows) {
// Process proxies in batches for parallel testing
for (let i = 0; i < proxies.length; i += concurrency) {
// Check if job was cancelled
const jobControl = activeJobs.get(jobId);
if (jobControl?.cancelled) {
@@ -146,23 +177,34 @@ async function runProxyTestJob(jobId: number): Promise<void> {
break;
}
// Test the proxy
const testResult = await testProxy(
proxy.host,
proxy.port,
proxy.protocol,
proxy.username,
proxy.password
const batch = proxies.slice(i, i + concurrency);
// Test batch in parallel
const batchResults = await Promise.all(
batch.map(async (proxy) => {
const testResult = await testProxy(
proxy.host,
proxy.port,
proxy.protocol,
proxy.username,
proxy.password
);
// Save result
await saveProxyTestResult(proxy.id, testResult);
return testResult.success;
})
);
// Save result
await saveProxyTestResult(proxy.id, testResult);
tested++;
if (testResult.success) {
passed++;
} else {
failed++;
// Count results
for (const success of batchResults) {
tested++;
if (success) {
passed++;
} else {
failed++;
}
}
// Update job progress
@@ -175,10 +217,8 @@ async function runProxyTestJob(jobId: number): Promise<void> {
WHERE id = $4
`, [tested, passed, failed, jobId]);
// Log progress every 10 proxies
if (tested % 10 === 0) {
console.log(`📊 Job ${jobId}: ${tested}/${result.rows.length} proxies tested (${passed} passed, ${failed} failed)`);
}
// Log progress
console.log(`📊 Job ${jobId}: ${tested}/${proxies.length} proxies tested (${passed} passed, ${failed} failed)`);
}
// Mark job as completed
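
The loop above tests `concurrency` proxies per batch with Promise.all; the same pattern as a generic helper, a sketch with names of my own choosing:

// Generic sketch of the batch-parallel pattern used above (helper name and
// signature are my own, not part of the codebase).
async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    // Each batch runs in parallel; batches run sequentially so progress
    // can be checkpointed (and cancellation checked) between them.
    const batch = items.slice(i, i + batchSize);
    results.push(...await Promise.all(batch.map(fn)));
  }
  return results;
}

// e.g. const outcomes = await mapInBatches(proxies, concurrency, p => testProxy(p.host, p.port, p.protocol, p.username, p.password));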

View File

@@ -3,7 +3,7 @@ import StealthPlugin from 'puppeteer-extra-plugin-stealth';
import { Browser, Page } from 'puppeteer';
import { SocksProxyAgent } from 'socks-proxy-agent';
import { pool } from '../db/pool';
import { uploadImageFromUrl, getImageUrl } from '../utils/minio';
import { downloadProductImageLegacy } from '../utils/image-storage';
import { logger } from './logger';
import { registerScraper, updateScraperStats, completeScraper } from '../routes/scraper-monitor';
import { incrementProxyFailure, getActiveProxy, isBotDetectionError, putProxyInTimeout } from './proxy';
@@ -767,7 +767,8 @@ export async function saveProducts(storeId: number, categoryId: number, products
if (product.imageUrl && !localImagePath) {
try {
localImagePath = await uploadImageFromUrl(product.imageUrl, productId);
const result = await downloadProductImageLegacy(product.imageUrl, 0, productId);
localImagePath = result.urls?.original || null;
await client.query(`
UPDATE products
SET local_image_path = $1

View File

@@ -0,0 +1,92 @@
/**
* Analytics Refresh Handler
*
* Refreshes materialized views and pre-computed analytics tables.
* Should run daily or on-demand after major data changes.
*/
import { TaskContext, TaskResult } from '../task-worker';
export async function handleAnalyticsRefresh(ctx: TaskContext): Promise<TaskResult> {
const { pool } = ctx;
console.log(`[AnalyticsRefresh] Starting analytics refresh...`);
const refreshed: string[] = [];
const failed: string[] = [];
// List of materialized views to refresh
const materializedViews = [
'mv_state_metrics',
'mv_brand_metrics',
'mv_category_metrics',
'v_brand_summary',
'v_dashboard_stats',
];
for (const viewName of materializedViews) {
try {
// Heartbeat before each refresh
await ctx.heartbeat();
// Check if view exists
const existsResult = await pool.query(`
SELECT EXISTS (
SELECT 1 FROM pg_matviews WHERE matviewname = $1
UNION
SELECT 1 FROM pg_views WHERE viewname = $1
) as exists
`, [viewName]);
if (!existsResult.rows[0].exists) {
console.log(`[AnalyticsRefresh] View ${viewName} does not exist, skipping`);
continue;
}
// Try to refresh (only works for materialized views)
try {
await pool.query(`REFRESH MATERIALIZED VIEW CONCURRENTLY ${viewName}`);
refreshed.push(viewName);
console.log(`[AnalyticsRefresh] Refreshed ${viewName}`);
} catch (refreshError: any) {
// Try non-concurrent refresh
try {
await pool.query(`REFRESH MATERIALIZED VIEW ${viewName}`);
refreshed.push(viewName);
console.log(`[AnalyticsRefresh] Refreshed ${viewName} (non-concurrent)`);
} catch (nonConcurrentError: any) {
// Not a materialized view or other error
console.log(`[AnalyticsRefresh] ${viewName} is not a materialized view or refresh failed`);
}
}
} catch (error: any) {
console.error(`[AnalyticsRefresh] Error refreshing ${viewName}:`, error.message);
failed.push(viewName);
}
}
// Run analytics capture functions if they exist
const captureFunctions = [
'capture_brand_snapshots',
'capture_category_snapshots',
];
for (const funcName of captureFunctions) {
try {
await pool.query(`SELECT ${funcName}()`);
console.log(`[AnalyticsRefresh] Executed ${funcName}()`);
} catch (error: any) {
// Function might not exist
console.log(`[AnalyticsRefresh] ${funcName}() not available`);
}
}
console.log(`[AnalyticsRefresh] Complete: ${refreshed.length} refreshed, ${failed.length} failed`);
return {
success: failed.length === 0,
refreshed,
failed,
error: failed.length > 0 ? `Failed to refresh: ${failed.join(', ')}` : undefined,
};
}
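
To schedule this handler, an analytics_refresh row can be enqueued like any other task; a hedged sketch assuming the worker_tasks columns used by the other handlers in this changeset:

// Hypothetical sketch: queue a one-off analytics_refresh task (the priority
// value is an assumption about the worker_tasks schema).
import { Pool } from 'pg';

export async function queueAnalyticsRefresh(pool: Pool): Promise<void> {
  await pool.query(`
    INSERT INTO worker_tasks (role, dispensary_id, priority, scheduled_for)
    VALUES ('analytics_refresh', NULL, 1, NOW())
  `);
}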

View File

@@ -0,0 +1,188 @@
/**
* Entry Point Discovery Handler
*
* Resolves platform IDs for a discovered store using Dutchie GraphQL.
* This is the step between store_discovery and product_discovery.
*
* Flow:
* 1. Load dispensary info from database
* 2. Extract slug from menu_url
* 3. Start stealth session (fingerprint + optional proxy)
* 4. Query Dutchie GraphQL to resolve slug → platform_dispensary_id
* 5. Update dispensary record with resolved ID
* 6. Queue product_discovery task if successful
*/
import { TaskContext, TaskResult } from '../task-worker';
import { startSession, endSession } from '../../platforms/dutchie';
import { resolveDispensaryIdWithDetails } from '../../platforms/dutchie/queries';
export async function handleEntryPointDiscovery(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const dispensaryId = task.dispensary_id;
if (!dispensaryId) {
return { success: false, error: 'No dispensary_id specified for entry_point_discovery task' };
}
try {
// ============================================================
// STEP 1: Load dispensary info
// ============================================================
const dispResult = await pool.query(`
SELECT id, name, menu_url, platform_dispensary_id, menu_type, state
FROM dispensaries
WHERE id = $1
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
return { success: false, error: `Dispensary ${dispensaryId} not found` };
}
const dispensary = dispResult.rows[0];
// If already has platform_dispensary_id, we're done
if (dispensary.platform_dispensary_id) {
console.log(`[EntryPointDiscovery] Dispensary ${dispensaryId} already has platform ID: ${dispensary.platform_dispensary_id}`);
return {
success: true,
alreadyResolved: true,
platformId: dispensary.platform_dispensary_id,
};
}
const menuUrl = dispensary.menu_url;
if (!menuUrl) {
return { success: false, error: `Dispensary ${dispensaryId} has no menu_url` };
}
console.log(`[EntryPointDiscovery] Resolving platform ID for ${dispensary.name}`);
console.log(`[EntryPointDiscovery] Menu URL: ${menuUrl}`);
// ============================================================
// STEP 2: Extract slug from menu URL
// ============================================================
let slug: string | null = null;
const embeddedMatch = menuUrl.match(/\/embedded-menu\/([^/?]+)/);
const dispensaryMatch = menuUrl.match(/\/dispensary\/([^/?]+)/);
if (embeddedMatch) {
slug = embeddedMatch[1];
} else if (dispensaryMatch) {
slug = dispensaryMatch[1];
}
if (!slug) {
// Mark as non-dutchie menu type
await pool.query(`
UPDATE dispensaries
SET menu_type = 'unknown', updated_at = NOW()
WHERE id = $1
`, [dispensaryId]);
return {
success: false,
error: `Could not extract slug from menu_url: ${menuUrl}`,
};
}
console.log(`[EntryPointDiscovery] Extracted slug: ${slug}`);
await ctx.heartbeat();
// ============================================================
// STEP 3: Start stealth session
// ============================================================
const session = startSession(dispensary.state || 'AZ', 'America/Phoenix');
console.log(`[EntryPointDiscovery] Session started: ${session.sessionId}`);
try {
// ============================================================
// STEP 4: Resolve platform ID via GraphQL
// ============================================================
console.log(`[EntryPointDiscovery] Querying Dutchie GraphQL for slug: ${slug}`);
const result = await resolveDispensaryIdWithDetails(slug);
if (!result.dispensaryId) {
// Resolution failed - could be 403, 404, or invalid response
const reason = result.httpStatus
? `HTTP ${result.httpStatus}`
: result.error || 'Unknown error';
console.log(`[EntryPointDiscovery] Failed to resolve ${slug}: ${reason}`);
// Mark as failed resolution but keep menu_type as dutchie
await pool.query(`
UPDATE dispensaries
SET
menu_type = CASE
WHEN $2 = 404 THEN 'removed'
WHEN $2 = 403 THEN 'blocked'
ELSE 'dutchie'
END,
updated_at = NOW()
WHERE id = $1
`, [dispensaryId, result.httpStatus || 0]);
return {
success: false,
error: `Could not resolve platform ID: ${reason}`,
slug,
httpStatus: result.httpStatus,
};
}
const platformId = result.dispensaryId;
console.log(`[EntryPointDiscovery] Resolved ${slug} -> ${platformId}`);
await ctx.heartbeat();
// ============================================================
// STEP 5: Update dispensary with resolved ID
// ============================================================
await pool.query(`
UPDATE dispensaries
SET
platform_dispensary_id = $2,
menu_type = 'dutchie',
crawl_enabled = true,
updated_at = NOW()
WHERE id = $1
`, [dispensaryId, platformId]);
console.log(`[EntryPointDiscovery] Updated dispensary ${dispensaryId} with platform ID`);
// ============================================================
// STEP 6: Queue product_discovery task
// ============================================================
await pool.query(`
INSERT INTO worker_tasks (role, dispensary_id, priority, scheduled_for)
VALUES ('product_discovery', $1, 5, NOW())
ON CONFLICT DO NOTHING
`, [dispensaryId]);
console.log(`[EntryPointDiscovery] Queued product_discovery task for dispensary ${dispensaryId}`);
return {
success: true,
platformId,
slug,
queuedProductDiscovery: true,
};
} finally {
// Always end session
endSession();
}
} catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error';
console.error(`[EntryPointDiscovery] Error for dispensary ${dispensaryId}:`, errorMessage);
return {
success: false,
error: errorMessage,
};
}
}
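
For reference, the slug extraction in STEP 2 accepts both Dutchie URL shapes; a tiny standalone sketch with made-up example URLs:

// Standalone sketch of the slug extraction above (example URLs are made up).
function extractDutchieSlug(menuUrl: string): string | null {
  const match = menuUrl.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
  return match ? match[1] : null;
}

// extractDutchieSlug('https://dutchie.com/embedded-menu/deeply-rooted?menuType=rec') === 'deeply-rooted'
// extractDutchieSlug('https://dutchie.com/dispensary/deeply-rooted')                 === 'deeply-rooted'
// extractDutchieSlug('https://example.com/some-other-menu')                          === null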

View File

@@ -0,0 +1,11 @@
/**
* Task Handlers Index
*
* Exports all task handlers for the task worker.
*/
export { handleProductRefresh } from './product-refresh';
export { handleProductDiscovery } from './product-discovery';
export { handleStoreDiscovery } from './store-discovery';
export { handleEntryPointDiscovery } from './entry-point-discovery';
export { handleAnalyticsRefresh } from './analytics-refresh';

View File

@@ -0,0 +1,16 @@
/**
* Product Discovery Handler
*
* Initial product fetch for stores that have 0 products.
* Same logic as product_refresh, but triggered for a store's initial discovery.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { handleProductRefresh } from './product-refresh';
export async function handleProductDiscovery(ctx: TaskContext): Promise<TaskResult> {
// Product discovery is essentially the same as refresh for the first time
// The main difference is in when this task is triggered (new store vs scheduled)
console.log(`[ProductDiscovery] Starting initial product fetch for dispensary ${ctx.task.dispensary_id}`);
return handleProductRefresh(ctx);
}

View File

@@ -0,0 +1,344 @@
/**
* Product Refresh Handler
*
* Re-crawls a store to capture price/stock changes using the GraphQL pipeline.
*
* Flow:
* 1. Load dispensary info from database
* 2. Start stealth session (fingerprint + optional proxy)
* 3. Fetch products via GraphQL (Status: 'All')
* 4. Normalize data via DutchieNormalizer
* 5. Upsert to store_products and store_product_snapshots
* 6. Track missing products (increment consecutive_misses, mark OOS at 3)
* 7. Download new product images
* 8. End session
*/
import { TaskContext, TaskResult } from '../task-worker';
import {
executeGraphQL,
startSession,
endSession,
GRAPHQL_HASHES,
DUTCHIE_CONFIG,
} from '../../platforms/dutchie';
import { DutchieNormalizer } from '../../hydration/normalizers/dutchie';
import {
upsertStoreProducts,
createStoreProductSnapshots,
downloadProductImages,
} from '../../hydration/canonical-upsert';
const normalizer = new DutchieNormalizer();
export async function handleProductRefresh(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const dispensaryId = task.dispensary_id;
if (!dispensaryId) {
return { success: false, error: 'No dispensary_id specified for product_refresh task' };
}
try {
// ============================================================
// STEP 1: Load dispensary info
// ============================================================
const dispResult = await pool.query(`
SELECT
id, name, platform_dispensary_id, menu_url, menu_type, city, state
FROM dispensaries
WHERE id = $1 AND crawl_enabled = true
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
return { success: false, error: `Dispensary ${dispensaryId} not found or not crawl_enabled` };
}
const dispensary = dispResult.rows[0];
const platformId = dispensary.platform_dispensary_id;
if (!platformId) {
return { success: false, error: `Dispensary ${dispensaryId} has no platform_dispensary_id` };
}
// Extract cName from menu_url
const cNameMatch = dispensary.menu_url?.match(/\/(?:embedded-menu|dispensary)\/([^/?]+)/);
const cName = cNameMatch ? cNameMatch[1] : 'dispensary';
console.log(`[ProductResync] Starting crawl for ${dispensary.name} (ID: ${dispensaryId})`);
console.log(`[ProductResync] Platform ID: ${platformId}, cName: ${cName}`);
// ============================================================
// STEP 2: Start stealth session
// ============================================================
const session = startSession(dispensary.state || 'AZ', 'America/Phoenix');
console.log(`[ProductResync] Session started: ${session.sessionId}`);
await ctx.heartbeat();
// ============================================================
// STEP 3: Fetch products via GraphQL (Status: 'All')
// ============================================================
const allProducts: any[] = [];
let page = 0;
let totalCount = 0;
const perPage = DUTCHIE_CONFIG.perPage;
const maxPages = DUTCHIE_CONFIG.maxPages;
try {
while (page < maxPages) {
const variables = {
includeEnterpriseSpecials: false,
productsFilter: {
dispensaryId: platformId,
pricingType: 'rec',
Status: 'All',
types: [],
useCache: false,
isDefaultSort: true,
sortBy: 'popularSortIdx',
sortDirection: 1,
bypassOnlineThresholds: true,
isKioskMenu: false,
removeProductsBelowOptionThresholds: false,
},
page,
perPage,
};
console.log(`[ProductResync] Fetching page ${page + 1}...`);
const result = await executeGraphQL(
'FilteredProducts',
variables,
GRAPHQL_HASHES.FilteredProducts,
{ cName, maxRetries: 3 }
);
const data = result?.data?.filteredProducts;
if (!data || !data.products) {
if (page === 0) {
throw new Error('No product data returned from GraphQL');
}
break;
}
const products = data.products;
allProducts.push(...products);
if (page === 0) {
totalCount = data.queryInfo?.totalCount || products.length;
console.log(`[ProductResync] Total products reported: ${totalCount}`);
}
if (allProducts.length >= totalCount || products.length < perPage) {
break;
}
page++;
if (page < maxPages) {
await new Promise(r => setTimeout(r, DUTCHIE_CONFIG.pageDelayMs));
}
if (page % 5 === 0) {
await ctx.heartbeat();
}
}
console.log(`[ProductResync] Fetched ${allProducts.length} products in ${page + 1} pages`);
} finally {
endSession();
}
if (allProducts.length === 0) {
return {
success: false,
error: 'No products returned from GraphQL',
productsProcessed: 0,
};
}
await ctx.heartbeat();
// ============================================================
// STEP 4: Normalize data
// ============================================================
console.log(`[ProductResync] Normalizing ${allProducts.length} products...`);
// Build RawPayload for the normalizer
const rawPayload = {
id: `resync-${dispensaryId}-${Date.now()}`,
dispensary_id: dispensaryId,
crawl_run_id: null,
platform: 'dutchie',
payload_version: 1,
raw_json: { data: { filteredProducts: { products: allProducts } } },
product_count: allProducts.length,
pricing_type: 'dual',
crawl_mode: 'dual_mode',
fetched_at: new Date(),
processed: false,
normalized_at: null,
hydration_error: null,
hydration_attempts: 0,
created_at: new Date(),
};
const normalizationResult = normalizer.normalize(rawPayload);
if (normalizationResult.errors.length > 0) {
console.warn(`[ProductResync] Normalization warnings: ${normalizationResult.errors.map(e => e.message).join(', ')}`);
}
if (normalizationResult.products.length === 0) {
return {
success: false,
error: 'Normalization produced no products',
productsProcessed: 0,
};
}
console.log(`[ProductResync] Normalized ${normalizationResult.products.length} products`);
await ctx.heartbeat();
// ============================================================
// STEP 5: Upsert to canonical tables
// ============================================================
console.log(`[ProductResync] Upserting to store_products...`);
const upsertResult = await upsertStoreProducts(
pool,
normalizationResult.products,
normalizationResult.pricing,
normalizationResult.availability
);
console.log(`[ProductResync] Upserted: ${upsertResult.upserted} (${upsertResult.new} new, ${upsertResult.updated} updated)`);
await ctx.heartbeat();
// Create snapshots
console.log(`[ProductResync] Creating snapshots...`);
const snapshotsResult = await createStoreProductSnapshots(
pool,
dispensaryId,
normalizationResult.products,
normalizationResult.pricing,
normalizationResult.availability,
null // No crawl_run_id in new system
);
console.log(`[ProductResync] Created ${snapshotsResult.created} snapshots`);
await ctx.heartbeat();
// ============================================================
// STEP 6: Track missing products (consecutive_misses logic)
// - Products in feed: reset consecutive_misses to 0
// - Products not in feed: increment consecutive_misses
// - At 3 consecutive misses: mark as OOS
// ============================================================
const currentProductIds = allProducts
.map((p: any) => p._id || p.id)
.filter(Boolean);
// Reset consecutive_misses for products that ARE in the feed
if (currentProductIds.length > 0) {
await pool.query(`
UPDATE store_products
SET consecutive_misses = 0, last_seen_at = NOW()
WHERE dispensary_id = $1
AND provider = 'dutchie'
AND provider_product_id = ANY($2)
`, [dispensaryId, currentProductIds]);
}
// Increment consecutive_misses for products NOT in the feed
const incrementResult = await pool.query(`
UPDATE store_products
SET consecutive_misses = consecutive_misses + 1
WHERE dispensary_id = $1
AND provider = 'dutchie'
AND provider_product_id NOT IN (SELECT unnest($2::text[]))
AND consecutive_misses < 3
RETURNING id
`, [dispensaryId, currentProductIds]);
const incrementedCount = incrementResult.rowCount || 0;
if (incrementedCount > 0) {
console.log(`[ProductResync] Incremented consecutive_misses for ${incrementedCount} products`);
}
// Mark as OOS any products that hit 3 consecutive misses
const oosResult = await pool.query(`
UPDATE store_products
SET stock_status = 'oos', is_in_stock = false
WHERE dispensary_id = $1
AND provider = 'dutchie'
AND consecutive_misses >= 3
AND stock_status != 'oos'
RETURNING id
`, [dispensaryId]);
const markedOosCount = oosResult.rowCount || 0;
if (markedOosCount > 0) {
console.log(`[ProductResync] Marked ${markedOosCount} products as OOS (3+ consecutive misses)`);
}
await ctx.heartbeat();
// ============================================================
// STEP 7: Download images for new products
// ============================================================
if (upsertResult.productsNeedingImages.length > 0) {
console.log(`[ProductResync] Downloading images for ${upsertResult.productsNeedingImages.length} products...`);
try {
const dispensaryContext = {
stateCode: dispensary.state || 'AZ',
storeSlug: cName,
};
await downloadProductImages(
pool,
upsertResult.productsNeedingImages,
dispensaryContext
);
} catch (imgError: any) {
// Image download errors shouldn't fail the whole task
console.warn(`[ProductResync] Image download error (non-fatal): ${imgError.message}`);
}
}
// ============================================================
// STEP 8: Update dispensary last_crawl_at
// ============================================================
await pool.query(`
UPDATE dispensaries
SET last_crawl_at = NOW()
WHERE id = $1
`, [dispensaryId]);
console.log(`[ProductResync] Completed ${dispensary.name}`);
return {
success: true,
productsProcessed: normalizationResult.products.length,
snapshotsCreated: snapshotsResult.created,
newProducts: upsertResult.new,
updatedProducts: upsertResult.updated,
markedOos: markedOosCount,
};
} catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error';
console.error(`[ProductResync] Error for dispensary ${dispensaryId}:`, errorMessage);
return {
success: false,
error: errorMessage,
};
}
}

View File

@@ -0,0 +1,66 @@
/**
* Store Discovery Handler
*
* Discovers new stores by crawling location APIs and adding them
* to discovery_locations table.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { discoverState } from '../../discovery';
export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const platform = task.platform || 'default';
console.log(`[StoreDiscovery] Starting discovery for platform: ${platform}`);
try {
// Get states to discover
const statesResult = await pool.query(`
SELECT code FROM states WHERE active = true ORDER BY code
`);
const stateCodes = statesResult.rows.map(r => r.code);
if (stateCodes.length === 0) {
return { success: true, storesDiscovered: 0, message: 'No active states to discover' };
}
let totalDiscovered = 0;
let totalPromoted = 0;
// Run discovery for each state
for (const stateCode of stateCodes) {
// Heartbeat before each state
await ctx.heartbeat();
console.log(`[StoreDiscovery] Discovering stores in ${stateCode}...`);
try {
const result = await discoverState(pool, stateCode);
totalDiscovered += result.totalLocationsFound || 0;
totalPromoted += result.totalLocationsUpserted || 0;
console.log(`[StoreDiscovery] ${stateCode}: found ${result.totalLocationsFound}, upserted ${result.totalLocationsUpserted}`);
} catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error';
console.error(`[StoreDiscovery] Error discovering ${stateCode}:`, errorMessage);
// Continue with other states
}
}
console.log(`[StoreDiscovery] Complete: ${totalDiscovered} discovered, ${totalPromoted} promoted`);
return {
success: true,
storesDiscovered: totalDiscovered,
storesPromoted: totalPromoted,
statesProcessed: stateCodes.length,
};
} catch (error: unknown) {
const errorMessage = error instanceof Error ? error.message : 'Unknown error';
console.error(`[StoreDiscovery] Error:`, errorMessage);
return {
success: false,
error: errorMessage,
};
}
}
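A minimal sketch of how a store_discovery task might be enqueued through the task service introduced later in this change (import path and the 'dutchie' platform value are illustrative):

import { taskService } from '../task-service';

async function enqueueStoreDiscovery(): Promise<void> {
  // Creates a pending store_discovery task; any role-agnostic worker can claim it.
  await taskService.createStoreDiscoveryTask('dutchie');
}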

View File

@@ -0,0 +1,25 @@
/**
* Task Queue Module
*
* Exports task service, worker, and types for use throughout the application.
*/
export {
taskService,
TaskRole,
TaskStatus,
WorkerTask,
CreateTaskParams,
CapacityMetrics,
TaskFilter,
} from './task-service';
export { TaskWorker, TaskContext, TaskResult } from './task-worker';
export {
handleProductRefresh,
handleProductDiscovery,
handleStoreDiscovery,
handleEntryPointDiscovery,
handleAnalyticsRefresh,
} from './handlers';

View File

@@ -0,0 +1,93 @@
#!/usr/bin/env npx tsx
/**
* Start Pod - Simulates a Kubernetes pod locally
*
* Starts 5 workers with a pod name from the predefined list.
*
* Usage:
* npx tsx src/tasks/start-pod.ts <pod-index>
* npx tsx src/tasks/start-pod.ts 0 # Starts pod "Aethelgard" with 5 workers
* npx tsx src/tasks/start-pod.ts 1 # Starts pod "Xylos" with 5 workers
*/
import { spawn } from 'child_process';
import path from 'path';
const POD_NAMES = [
'Aethelgard',
'Xylos',
'Kryll',
'Coriolis',
'Dimidium',
'Veridia',
'Zetani',
'Talos IV',
'Onyx',
'Celestia',
'Gormand',
'Betha',
'Ragnar',
'Syphon',
'Axiom',
'Nadir',
'Terra Nova',
'Acheron',
'Nexus',
'Vespera',
'Helios Prime',
'Oasis',
'Mordina',
'Cygnus',
'Umbra',
];
const WORKERS_PER_POD = 5;
async function main() {
const podIndex = parseInt(process.argv[2] ?? '0', 10);
if (Number.isNaN(podIndex) || podIndex < 0 || podIndex >= POD_NAMES.length) {
console.error(`Invalid pod index: ${process.argv[2]}. Must be 0-${POD_NAMES.length - 1}`);
process.exit(1);
}
const podName = POD_NAMES[podIndex];
console.log(`[Pod] Starting pod "${podName}" with ${WORKERS_PER_POD} workers...`);
const workerScript = path.join(__dirname, 'task-worker.ts');
const workers: ReturnType<typeof spawn>[] = [];
for (let i = 1; i <= WORKERS_PER_POD; i++) {
const workerId = `${podName}-worker-${i}`;
const worker = spawn('npx', ['tsx', workerScript], {
env: {
...process.env,
WORKER_ID: workerId,
POD_NAME: podName,
},
stdio: 'inherit',
});
workers.push(worker);
console.log(`[Pod] Started worker ${i}/${WORKERS_PER_POD}: ${workerId}`);
}
// Handle shutdown
const shutdown = () => {
console.log(`\n[Pod] Shutting down pod "${podName}"...`);
workers.forEach(w => w.kill('SIGTERM'));
setTimeout(() => process.exit(0), 2000);
};
process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
// Keep the process alive
await new Promise(() => {});
}
main().catch(err => {
console.error('[Pod] Fatal error:', err);
process.exit(1);
});

View File

@@ -0,0 +1,548 @@
/**
* Task Service
*
* Central service for managing worker tasks with:
* - Atomic task claiming (per-store locking)
* - Task lifecycle management
* - Auto-chaining of related tasks
* - Capacity planning metrics
*/
import { pool } from '../db/pool';
export type TaskRole =
| 'store_discovery'
| 'entry_point_discovery'
| 'product_discovery'
| 'product_refresh'
| 'analytics_refresh';
export type TaskStatus =
| 'pending'
| 'claimed'
| 'running'
| 'completed'
| 'failed'
| 'stale';
export interface WorkerTask {
id: number;
role: TaskRole;
dispensary_id: number | null;
dispensary_name?: string; // JOINed from dispensaries
dispensary_slug?: string; // JOINed from dispensaries
platform: string | null;
status: TaskStatus;
priority: number;
scheduled_for: Date | null;
worker_id: string | null;
claimed_at: Date | null;
started_at: Date | null;
completed_at: Date | null;
last_heartbeat_at: Date | null;
result: Record<string, unknown> | null;
error_message: string | null;
retry_count: number;
max_retries: number;
created_at: Date;
updated_at: Date;
}
export interface CreateTaskParams {
role: TaskRole;
dispensary_id?: number;
platform?: string;
priority?: number;
scheduled_for?: Date;
}
export interface CapacityMetrics {
role: string;
pending_tasks: number;
ready_tasks: number;
claimed_tasks: number;
running_tasks: number;
completed_last_hour: number;
failed_last_hour: number;
active_workers: number;
avg_duration_sec: number | null;
tasks_per_worker_hour: number | null;
estimated_hours_to_drain: number | null;
}
export interface TaskFilter {
role?: TaskRole;
status?: TaskStatus | TaskStatus[];
dispensary_id?: number;
worker_id?: string;
limit?: number;
offset?: number;
}
class TaskService {
/**
* Create a new task
*/
async createTask(params: CreateTaskParams): Promise<WorkerTask> {
const result = await pool.query(
`INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for)
VALUES ($1, $2, $3, $4, $5)
RETURNING *`,
[
params.role,
params.dispensary_id ?? null,
params.platform ?? null,
params.priority ?? 0,
params.scheduled_for ?? null,
]
);
return result.rows[0] as WorkerTask;
}
/**
* Create multiple tasks in a batch
*/
async createTasks(tasks: CreateTaskParams[]): Promise<number> {
if (tasks.length === 0) return 0;
const values = tasks.map((t, i) => {
const base = i * 5;
return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4}, $${base + 5})`;
});
const params = tasks.flatMap((t) => [
t.role,
t.dispensary_id ?? null,
t.platform ?? null,
t.priority ?? 0,
t.scheduled_for ?? null,
]);
const result = await pool.query(
`INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for)
VALUES ${values.join(', ')}
ON CONFLICT DO NOTHING`,
params
);
return result.rowCount ?? 0;
}
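// Example of the generated placeholder list for a two-task batch:
//   VALUES ($1, $2, $3, $4, $5), ($6, $7, $8, $9, $10)
// with params flattened in the same order (role, dispensary_id, platform,
// priority, scheduled_for per task). Illustrative comment only.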
/**
* Claim a task atomically for a worker
* If role is null, claims ANY available task (role-agnostic worker)
*/
async claimTask(role: TaskRole | null, workerId: string): Promise<WorkerTask | null> {
if (role) {
// Role-specific claiming - use the SQL function
const result = await pool.query(
`SELECT * FROM claim_task($1, $2)`,
[role, workerId]
);
return (result.rows[0] as WorkerTask) || null;
}
// Role-agnostic claiming - claim ANY pending task
const result = await pool.query(`
UPDATE worker_tasks
SET
status = 'claimed',
worker_id = $1,
claimed_at = NOW()
WHERE id = (
SELECT id FROM worker_tasks
WHERE status = 'pending'
AND (scheduled_for IS NULL OR scheduled_for <= NOW())
-- Exclude stores that already have an active task
AND (dispensary_id IS NULL OR dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY priority DESC, created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING *
`, [workerId]);
return (result.rows[0] as WorkerTask) || null;
}
/**
* Mark a task as running (worker started processing)
*/
async startTask(taskId: number): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET status = 'running', started_at = NOW(), last_heartbeat_at = NOW()
WHERE id = $1`,
[taskId]
);
}
/**
* Update heartbeat to prevent stale detection
*/
async heartbeat(taskId: number): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET last_heartbeat_at = NOW()
WHERE id = $1 AND status = 'running'`,
[taskId]
);
}
/**
* Mark a task as completed
*/
async completeTask(taskId: number, result?: Record<string, unknown>): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET status = 'completed', completed_at = NOW(), result = $2
WHERE id = $1`,
[taskId, result ? JSON.stringify(result) : null]
);
}
/**
* Mark a task as failed, with auto-retry if under max_retries
* Returns true if task was re-queued for retry, false if permanently failed
*/
async failTask(taskId: number, errorMessage: string): Promise<boolean> {
// Get current retry state
const result = await pool.query(
`SELECT retry_count, max_retries FROM worker_tasks WHERE id = $1`,
[taskId]
);
if (result.rows.length === 0) {
return false;
}
const { retry_count, max_retries } = result.rows[0];
const newRetryCount = (retry_count || 0) + 1;
if (newRetryCount < (max_retries || 3)) {
// Re-queue for retry - reset to pending with incremented retry_count
await pool.query(
`UPDATE worker_tasks
SET status = 'pending',
worker_id = NULL,
claimed_at = NULL,
started_at = NULL,
retry_count = $2,
error_message = $3,
updated_at = NOW()
WHERE id = $1`,
[taskId, newRetryCount, `Retry ${newRetryCount}: ${errorMessage}`]
);
console.log(`[TaskService] Task ${taskId} queued for retry ${newRetryCount}/${max_retries || 3}`);
return true;
}
// Max retries exceeded - mark as permanently failed
await pool.query(
`UPDATE worker_tasks
SET status = 'failed',
completed_at = NOW(),
retry_count = $2,
error_message = $3
WHERE id = $1`,
[taskId, newRetryCount, `Failed after ${newRetryCount} attempts: ${errorMessage}`]
);
console.log(`[TaskService] Task ${taskId} permanently failed after ${newRetryCount} attempts`);
return false;
}
/**
* Get a task by ID
*/
async getTask(taskId: number): Promise<WorkerTask | null> {
const result = await pool.query(
`SELECT * FROM worker_tasks WHERE id = $1`,
[taskId]
);
return (result.rows[0] as WorkerTask) || null;
}
/**
* List tasks with filters
*/
async listTasks(filter: TaskFilter = {}): Promise<WorkerTask[]> {
const conditions: string[] = [];
const params: (string | number | string[])[] = [];
let paramIndex = 1;
if (filter.role) {
conditions.push(`t.role = $${paramIndex++}`);
params.push(filter.role);
}
if (filter.status) {
if (Array.isArray(filter.status)) {
conditions.push(`t.status = ANY($${paramIndex++})`);
params.push(filter.status);
} else {
conditions.push(`t.status = $${paramIndex++}`);
params.push(filter.status);
}
}
if (filter.dispensary_id) {
conditions.push(`t.dispensary_id = $${paramIndex++}`);
params.push(filter.dispensary_id);
}
if (filter.worker_id) {
conditions.push(`t.worker_id = $${paramIndex++}`);
params.push(filter.worker_id);
}
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
const limit = filter.limit ?? 100;
const offset = filter.offset ?? 0;
const result = await pool.query(
`SELECT
t.*,
d.name as dispensary_name,
d.slug as dispensary_slug
FROM worker_tasks t
LEFT JOIN dispensaries d ON d.id = t.dispensary_id
${whereClause}
ORDER BY t.created_at DESC
LIMIT ${limit} OFFSET ${offset}`,
params
);
return result.rows as WorkerTask[];
}
/**
* Get capacity metrics for all roles
*/
async getCapacityMetrics(): Promise<CapacityMetrics[]> {
const result = await pool.query(
`SELECT * FROM v_worker_capacity`
);
return result.rows as CapacityMetrics[];
}
/**
* Get capacity metrics for a specific role
*/
async getRoleCapacity(role: TaskRole): Promise<CapacityMetrics | null> {
const result = await pool.query(
`SELECT * FROM v_worker_capacity WHERE role = $1`,
[role]
);
return (result.rows[0] as CapacityMetrics) || null;
}
/**
* Recover stale tasks from dead workers
*/
async recoverStaleTasks(staleThresholdMinutes = 10): Promise<number> {
const result = await pool.query(
`SELECT recover_stale_tasks($1)`,
[staleThresholdMinutes]
);
return (result.rows[0] as { recover_stale_tasks: number })?.recover_stale_tasks ?? 0;
}
/**
* Generate daily resync tasks for all active stores
*/
async generateDailyResyncTasks(batchesPerDay = 6, date?: Date): Promise<number> {
const result = await pool.query(
`SELECT generate_resync_tasks($1, $2)`,
[batchesPerDay, date ?? new Date()]
);
return (result.rows[0] as { generate_resync_tasks: number })?.generate_resync_tasks ?? 0;
}
/**
* Chain next task after completion
* Called automatically when a task completes successfully
*/
async chainNextTask(completedTask: WorkerTask): Promise<WorkerTask | null> {
if (completedTask.status !== 'completed') {
return null;
}
switch (completedTask.role) {
case 'store_discovery': {
// New stores discovered -> create entry_point_discovery tasks
const newStoreIds = (completedTask.result as { newStoreIds?: number[] })?.newStoreIds;
if (newStoreIds && newStoreIds.length > 0) {
for (const storeId of newStoreIds) {
await this.createTask({
role: 'entry_point_discovery',
dispensary_id: storeId,
platform: completedTask.platform ?? undefined,
priority: 10, // High priority for new stores
});
}
}
break;
}
case 'entry_point_discovery': {
// Entry point resolved -> create product_discovery task
const success = (completedTask.result as { success?: boolean })?.success;
if (success && completedTask.dispensary_id) {
return this.createTask({
role: 'product_discovery',
dispensary_id: completedTask.dispensary_id,
platform: completedTask.platform ?? undefined,
priority: 10,
});
}
break;
}
case 'product_discovery': {
// Product discovery done -> store is now ready for regular resync
// No immediate chaining needed; will be picked up by daily batch generation
break;
}
}
return null;
}
/**
* Create store discovery task for a platform/state
*/
async createStoreDiscoveryTask(
platform: string,
stateCode?: string,
priority = 0
): Promise<WorkerTask> {
return this.createTask({
role: 'store_discovery',
platform,
priority,
});
}
/**
* Create entry point discovery task for a specific store
*/
async createEntryPointTask(
dispensaryId: number,
platform: string,
priority = 10
): Promise<WorkerTask> {
return this.createTask({
role: 'entry_point_discovery',
dispensary_id: dispensaryId,
platform,
priority,
});
}
/**
* Create product discovery task for a specific store
*/
async createProductDiscoveryTask(
dispensaryId: number,
platform: string,
priority = 10
): Promise<WorkerTask> {
return this.createTask({
role: 'product_discovery',
dispensary_id: dispensaryId,
platform,
priority,
});
}
/**
* Get task counts by status for dashboard
*/
async getTaskCounts(): Promise<Record<TaskStatus, number>> {
const result = await pool.query(
`SELECT status, COUNT(*) as count
FROM worker_tasks
GROUP BY status`
);
const counts: Record<TaskStatus, number> = {
pending: 0,
claimed: 0,
running: 0,
completed: 0,
failed: 0,
stale: 0,
};
for (const row of result.rows) {
const typedRow = row as { status: TaskStatus; count: string };
counts[typedRow.status] = parseInt(typedRow.count, 10);
}
return counts;
}
/**
* Get recent task completions for a role
*/
async getRecentCompletions(role: TaskRole, limit = 10): Promise<WorkerTask[]> {
const result = await pool.query(
`SELECT * FROM worker_tasks
WHERE role = $1 AND status = 'completed'
ORDER BY completed_at DESC
LIMIT $2`,
[role, limit]
);
return result.rows as WorkerTask[];
}
/**
* Check if a store has any active tasks
*/
async hasActiveTask(dispensaryId: number): Promise<boolean> {
const result = await pool.query(
`SELECT EXISTS(
SELECT 1 FROM worker_tasks
WHERE dispensary_id = $1
AND status IN ('claimed', 'running')
) as exists`,
[dispensaryId]
);
return (result.rows[0] as { exists: boolean })?.exists ?? false;
}
/**
* Get the last completion time for a role
*/
async getLastCompletion(role: TaskRole): Promise<Date | null> {
const result = await pool.query(
`SELECT MAX(completed_at) as completed_at
FROM worker_tasks
WHERE role = $1 AND status = 'completed'`,
[role]
);
return (result.rows[0] as { completed_at: Date | null })?.completed_at ?? null;
}
/**
* Calculate workers needed to complete tasks within SLA
*/
async calculateWorkersNeeded(role: TaskRole, slaHours: number): Promise<number> {
const capacity = await this.getRoleCapacity(role);
if (!capacity || !capacity.tasks_per_worker_hour) {
return 1; // Default to 1 worker if no data
}
const pendingTasks = capacity.pending_tasks;
const tasksPerWorkerHour = capacity.tasks_per_worker_hour;
const tasksNeededPerHour = pendingTasks / slaHours;
return Math.ceil(tasksNeededPerHour / tasksPerWorkerHour);
}
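// Worked example: 1,200 pending tasks, a 24-hour SLA, 10 tasks/worker/hour
// -> 1,200 / 24 = 50 tasks needed per hour -> ceil(50 / 10) = 5 workers.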
}
export const taskService = new TaskService();
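For reference, a minimal sketch of the claim/heartbeat/complete lifecycle as a direct consumer of taskService would drive it (the TaskWorker in the next file does this for real; the work itself is a placeholder here):

import { taskService } from './task-service';

async function runOneTask(workerId: string): Promise<void> {
  const task = await taskService.claimTask(null, workerId); // null = claim any role
  if (!task) return;                                        // nothing pending
  await taskService.startTask(task.id);
  try {
    await taskService.heartbeat(task.id);                   // keep-alive while working
    // ... perform the actual work for task.role here ...
    await taskService.completeTask(task.id, { ok: true });
  } catch (err: any) {
    await taskService.failTask(task.id, err.message);       // re-queues until max_retries
  }
}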

View File

@@ -0,0 +1,459 @@
/**
* Task Worker
*
* A unified worker that pulls tasks from the worker_tasks queue.
* Workers register on startup, get a friendly name, and pull tasks.
*
* Architecture:
* - Tasks are generated on schedule (by scheduler or API)
* - Workers PULL tasks from the pool (not assigned to them)
* - Tasks are claimed in order of priority (DESC) then creation time (ASC)
* - Workers report heartbeats to worker_registry
* - Workers are ROLE-AGNOSTIC by default (can handle any task type)
*
* Stealth & Anti-Detection:
* PROXIES ARE REQUIRED - workers will fail to start if no proxies available.
*
* On startup, workers initialize the CrawlRotator which provides:
* - Proxy rotation: Loads proxies from `proxies` table, ALL requests use proxy
* - User-Agent rotation: Cycles through realistic browser fingerprints
* - Fingerprint rotation: Changes browser profile on blocks
* - Locale/timezone: Matches Accept-Language to target state
*
* The CrawlRotator is wired to the Dutchie client via setCrawlRotator().
* Task handlers call startSession() which picks a random fingerprint.
* On 403 errors, the client automatically:
* 1. Records failure on current proxy
* 2. Rotates to next proxy
* 3. Rotates fingerprint
* 4. Retries the request
*
* Usage:
* npx tsx src/tasks/task-worker.ts # Role-agnostic (any task)
* WORKER_ROLE=product_refresh npx tsx src/tasks/task-worker.ts # Role-specific
*
* Environment:
* WORKER_ROLE - Which task role to process (optional, null = any task)
* WORKER_ID - Optional custom worker ID (auto-generated if not provided)
* POD_NAME - Kubernetes pod name (optional)
* POLL_INTERVAL_MS - How often to check for tasks (default: 5000)
* HEARTBEAT_INTERVAL_MS - How often to update heartbeat (default: 30000)
* API_BASE_URL - Backend API URL for registration (default: http://localhost:3010)
*/
import { Pool } from 'pg';
import { v4 as uuidv4 } from 'uuid';
import { taskService, TaskRole, WorkerTask } from './task-service';
import { getPool } from '../db/pool';
import os from 'os';
// Stealth/rotation support
import { CrawlRotator } from '../services/crawl-rotator';
import { setCrawlRotator } from '../platforms/dutchie';
// Task handlers by role
import { handleProductRefresh } from './handlers/product-refresh';
import { handleProductDiscovery } from './handlers/product-discovery';
import { handleStoreDiscovery } from './handlers/store-discovery';
import { handleEntryPointDiscovery } from './handlers/entry-point-discovery';
import { handleAnalyticsRefresh } from './handlers/analytics-refresh';
const POLL_INTERVAL_MS = parseInt(process.env.POLL_INTERVAL_MS || '5000', 10);
const HEARTBEAT_INTERVAL_MS = parseInt(process.env.HEARTBEAT_INTERVAL_MS || '30000', 10);
const API_BASE_URL = process.env.API_BASE_URL || 'http://localhost:3010';
export interface TaskContext {
pool: Pool;
workerId: string;
task: WorkerTask;
heartbeat: () => Promise<void>;
}
export interface TaskResult {
success: boolean;
productsProcessed?: number;
snapshotsCreated?: number;
storesDiscovered?: number;
error?: string;
[key: string]: unknown;
}
type TaskHandler = (ctx: TaskContext) => Promise<TaskResult>;
const TASK_HANDLERS: Record<TaskRole, TaskHandler> = {
product_refresh: handleProductRefresh,
product_discovery: handleProductDiscovery,
store_discovery: handleStoreDiscovery,
entry_point_discovery: handleEntryPointDiscovery,
analytics_refresh: handleAnalyticsRefresh,
};
export class TaskWorker {
private pool: Pool;
private workerId: string;
private role: TaskRole | null; // null = role-agnostic (any task)
private friendlyName: string = '';
private isRunning: boolean = false;
private heartbeatInterval: NodeJS.Timeout | null = null;
private registryHeartbeatInterval: NodeJS.Timeout | null = null;
private currentTask: WorkerTask | null = null;
private crawlRotator: CrawlRotator;
constructor(role: TaskRole | null = null, workerId?: string) {
this.pool = getPool();
this.role = role;
this.workerId = workerId || `worker-${uuidv4().slice(0, 8)}`;
this.crawlRotator = new CrawlRotator(this.pool);
}
/**
* Initialize stealth systems (proxy rotation, fingerprints)
* Called once on worker startup before processing any tasks.
*
* IMPORTANT: Proxies are REQUIRED. Workers will fail to start if no proxies available.
*/
private async initializeStealth(): Promise<void> {
// Load proxies from database
await this.crawlRotator.initialize();
const stats = this.crawlRotator.proxy.getStats();
if (stats.activeProxies === 0) {
throw new Error('No active proxies available. Workers MUST use proxies for all requests. Add proxies to the database before starting workers.');
}
console.log(`[TaskWorker] Loaded ${stats.activeProxies} proxies (${stats.avgSuccessRate.toFixed(1)}% avg success rate)`);
// Wire rotator to Dutchie client - proxies will be used for ALL requests
setCrawlRotator(this.crawlRotator);
console.log(`[TaskWorker] Stealth initialized: ${this.crawlRotator.userAgent.getCount()} fingerprints, proxy REQUIRED for all requests`);
}
/**
* Register worker with the registry (get friendly name)
*/
private async register(): Promise<void> {
try {
const response = await fetch(`${API_BASE_URL}/api/worker-registry/register`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
role: this.role,
worker_id: this.workerId,
pod_name: process.env.POD_NAME || process.env.HOSTNAME,
hostname: os.hostname(),
metadata: {
pid: process.pid,
node_version: process.version,
started_at: new Date().toISOString()
}
})
});
const data = await response.json();
if (data.success) {
this.friendlyName = data.friendly_name;
console.log(`[TaskWorker] ${data.message}`);
} else {
console.warn(`[TaskWorker] Registration warning: ${data.error}`);
this.friendlyName = this.workerId.slice(0, 12);
}
} catch (error: any) {
// Registration is optional - worker can still function without it
console.warn(`[TaskWorker] Could not register with API (will continue): ${error.message}`);
this.friendlyName = this.workerId.slice(0, 12);
}
}
/**
* Deregister worker from the registry
*/
private async deregister(): Promise<void> {
try {
await fetch(`${API_BASE_URL}/api/worker-registry/deregister`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ worker_id: this.workerId })
});
console.log(`[TaskWorker] ${this.friendlyName} signed off`);
} catch {
// Ignore deregistration errors
}
}
/**
* Send heartbeat to registry with resource usage and proxy location
*/
private async sendRegistryHeartbeat(): Promise<void> {
try {
const memUsage = process.memoryUsage();
const cpuUsage = process.cpuUsage();
const proxyLocation = this.crawlRotator.getProxyLocation();
await fetch(`${API_BASE_URL}/api/worker-registry/heartbeat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
worker_id: this.workerId,
current_task_id: this.currentTask?.id || null,
status: this.currentTask ? 'active' : 'idle',
resources: {
memory_mb: Math.round(memUsage.heapUsed / 1024 / 1024),
memory_total_mb: Math.round(memUsage.heapTotal / 1024 / 1024),
memory_rss_mb: Math.round(memUsage.rss / 1024 / 1024),
cpu_user_ms: Math.round(cpuUsage.user / 1000),
cpu_system_ms: Math.round(cpuUsage.system / 1000),
proxy_location: proxyLocation,
}
})
});
} catch {
// Ignore heartbeat errors
}
}
/**
* Report task completion to registry
*/
private async reportTaskCompletion(success: boolean): Promise<void> {
try {
await fetch(`${API_BASE_URL}/api/worker-registry/task-completed`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
worker_id: this.workerId,
success
})
});
} catch {
// Ignore errors
}
}
/**
* Start registry heartbeat interval
*/
private startRegistryHeartbeat(): void {
this.registryHeartbeatInterval = setInterval(async () => {
await this.sendRegistryHeartbeat();
}, HEARTBEAT_INTERVAL_MS);
}
/**
* Stop registry heartbeat interval
*/
private stopRegistryHeartbeat(): void {
if (this.registryHeartbeatInterval) {
clearInterval(this.registryHeartbeatInterval);
this.registryHeartbeatInterval = null;
}
}
/**
* Start the worker loop
*/
async start(): Promise<void> {
this.isRunning = true;
// Initialize stealth systems (proxy rotation, fingerprints)
await this.initializeStealth();
// Register with the API to get a friendly name
await this.register();
// Start registry heartbeat
this.startRegistryHeartbeat();
const roleMsg = this.role ? `for role: ${this.role}` : '(role-agnostic - any task)';
console.log(`[TaskWorker] ${this.friendlyName} starting ${roleMsg}`);
while (this.isRunning) {
try {
await this.processNextTask();
} catch (error: any) {
console.error(`[TaskWorker] Loop error:`, error.message);
await this.sleep(POLL_INTERVAL_MS);
}
}
console.log(`[TaskWorker] Worker ${this.workerId} stopped`);
}
/**
* Stop the worker
*/
async stop(): Promise<void> {
this.isRunning = false;
this.stopHeartbeat();
this.stopRegistryHeartbeat();
await this.deregister();
console.log(`[TaskWorker] ${this.friendlyName} stopped`);
}
/**
* Process the next available task
*/
private async processNextTask(): Promise<void> {
// Try to claim a task
const task = await taskService.claimTask(this.role, this.workerId);
if (!task) {
// No tasks available, wait and retry
await this.sleep(POLL_INTERVAL_MS);
return;
}
this.currentTask = task;
console.log(`[TaskWorker] Claimed task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
// Start heartbeat
this.startHeartbeat(task.id);
try {
// Mark as running
await taskService.startTask(task.id);
// Get handler for this role
const handler = TASK_HANDLERS[task.role];
if (!handler) {
throw new Error(`No handler registered for role: ${task.role}`);
}
// Create context
const ctx: TaskContext = {
pool: this.pool,
workerId: this.workerId,
task,
heartbeat: async () => {
await taskService.heartbeat(task.id);
},
};
// Execute the task
const result = await handler(ctx);
if (result.success) {
// Mark as completed
await taskService.completeTask(task.id, result);
await this.reportTaskCompletion(true);
console.log(`[TaskWorker] ${this.friendlyName} completed task ${task.id}`);
// Chain next task if applicable
const chainedTask = await taskService.chainNextTask({
...task,
status: 'completed',
result,
});
if (chainedTask) {
console.log(`[TaskWorker] Chained new task ${chainedTask.id} (${chainedTask.role})`);
}
} else {
// Mark as failed
await taskService.failTask(task.id, result.error || 'Unknown error');
await this.reportTaskCompletion(false);
console.log(`[TaskWorker] ${this.friendlyName} failed task ${task.id}: ${result.error}`);
}
} catch (error: any) {
// Mark as failed
await taskService.failTask(task.id, error.message);
await this.reportTaskCompletion(false);
console.error(`[TaskWorker] ${this.friendlyName} task ${task.id} error:`, error.message);
} finally {
this.stopHeartbeat();
this.currentTask = null;
}
}
/**
* Start heartbeat interval
*/
private startHeartbeat(taskId: number): void {
this.heartbeatInterval = setInterval(async () => {
try {
await taskService.heartbeat(taskId);
} catch (error: any) {
console.warn(`[TaskWorker] Heartbeat failed:`, error.message);
}
}, HEARTBEAT_INTERVAL_MS);
}
/**
* Stop heartbeat interval
*/
private stopHeartbeat(): void {
if (this.heartbeatInterval) {
clearInterval(this.heartbeatInterval);
this.heartbeatInterval = null;
}
}
/**
* Sleep helper
*/
private sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
/**
* Get worker info
*/
getInfo(): { workerId: string; role: TaskRole | null; isRunning: boolean; currentTaskId: number | null } {
return {
workerId: this.workerId,
role: this.role,
isRunning: this.isRunning,
currentTaskId: this.currentTask?.id || null,
};
}
}
// ============================================================
// CLI ENTRY POINT
// ============================================================
async function main(): Promise<void> {
const role = process.env.WORKER_ROLE as TaskRole | undefined;
const validRoles: TaskRole[] = [
'store_discovery',
'entry_point_discovery',
'product_discovery',
'product_refresh',
'analytics_refresh',
];
// If role specified, validate it
if (role && !validRoles.includes(role)) {
console.error(`Error: Invalid WORKER_ROLE: ${role}`);
console.error(`Valid roles: ${validRoles.join(', ')}`);
console.error('Or omit WORKER_ROLE for role-agnostic worker (any task)');
process.exit(1);
}
const workerId = process.env.WORKER_ID;
// Pass null for role-agnostic, or the specific role
const worker = new TaskWorker(role || null, workerId);
// Handle graceful shutdown
process.on('SIGTERM', () => {
console.log('[TaskWorker] Received SIGTERM, shutting down...');
worker.stop();
});
process.on('SIGINT', () => {
console.log('[TaskWorker] Received SIGINT, shutting down...');
worker.stop();
});
await worker.start();
}
// Run if this is the main module
if (require.main === module) {
main().catch((error) => {
console.error('[TaskWorker] Fatal error:', error);
process.exit(1);
});
}
export { main };
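For embedding a worker in another process instead of using the CLI entry point above, a minimal sketch using the exported class (role and worker ID values are illustrative):

import { TaskWorker } from './task-worker';

const worker = new TaskWorker('product_refresh', 'embedded-worker-1');

process.on('SIGTERM', () => { void worker.stop(); });

// start() resolves only after stop() is called and the worker loop exits.
worker.start().catch((err) => {
  console.error('Worker crashed:', err);
  process.exit(1);
});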

View File

@@ -1,26 +1,29 @@
/**
* Local Image Storage Utility
*
* Downloads and stores product images to local filesystem.
* Replaces MinIO-based storage with simple local file storage.
* Downloads and stores product images to local filesystem with proper hierarchy.
*
* Directory structure:
* /images/products/<dispensary_id>/<product_id>.webp
* /images/products/<dispensary_id>/<product_id>-thumb.webp
* /images/products/<dispensary_id>/<product_id>-medium.webp
* /images/brands/<brand_slug>.webp
* /images/products/<state>/<store_slug>/<brand_slug>/<product_id>/image.webp
* /images/products/<state>/<store_slug>/<brand_slug>/<product_id>/image-medium.webp
* /images/products/<state>/<store_slug>/<brand_slug>/<product_id>/image-thumb.webp
* /images/brands/<brand_slug>/logo.webp
*
* This structure allows:
* - Easy migration to MinIO/S3 (bucket per state)
* - Browsing by state/store/brand
* - Multiple images per product (future: gallery)
*/
import axios from 'axios';
import sharp from 'sharp';
// @ts-ignore - sharp module typing quirk
const sharp = require('sharp');
import * as fs from 'fs/promises';
import * as path from 'path';
import { createHash } from 'crypto';
// Base path for image storage - configurable via env
// Uses project-relative paths by default, NOT /app or other privileged paths
function getImagesBasePath(): string {
// Priority: IMAGES_PATH > STORAGE_BASE_PATH/images > ./storage/images
if (process.env.IMAGES_PATH) {
return process.env.IMAGES_PATH;
}
@@ -35,16 +38,28 @@ const IMAGES_BASE_PATH = getImagesBasePath();
const IMAGES_PUBLIC_URL = process.env.IMAGES_PUBLIC_URL || '/images';
export interface LocalImageSizes {
full: string; // URL path: /images/products/123/456.webp
medium: string; // URL path: /images/products/123/456-medium.webp
thumb: string; // URL path: /images/products/123/456-thumb.webp
original: string; // URL path to original image
// Legacy compatibility - all point to original until we add image proxy
full: string;
medium: string;
thumb: string;
}
export interface DownloadResult {
success: boolean;
urls?: LocalImageSizes;
localPaths?: LocalImageSizes;
error?: string;
bytesDownloaded?: number;
skipped?: boolean; // True if image already exists
}
export interface ProductImageContext {
stateCode: string; // e.g., "AZ", "CA"
storeSlug: string; // e.g., "deeply-rooted"
brandSlug: string; // e.g., "high-west-farms"
productId: string; // External product ID
dispensaryId?: number; // For backwards compat
}
/**
@@ -58,6 +73,17 @@ async function ensureDir(dirPath: string): Promise<void> {
}
}
/**
* Sanitize a string for use in file paths
*/
function slugify(str: string): string {
return str
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-+|-+$/g, '')
.substring(0, 50) || 'unknown';
}
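// e.g. slugify('High West Farms!') -> 'high-west-farms'; slugify('  ') -> 'unknown'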
/**
* Generate a short hash from a URL for deduplication
*/
@@ -81,53 +107,30 @@ async function downloadImage(imageUrl: string): Promise<Buffer> {
}
/**
* Process and save image in multiple sizes
* Returns the file paths relative to IMAGES_BASE_PATH
* Process and save original image (convert to webp for consistency)
*
* We store only the original - resizing will be done on-demand via
* an image proxy service (imgproxy, thumbor, or similar) in the future.
*/
async function processAndSaveImage(
buffer: Buffer,
outputDir: string,
baseFilename: string
): Promise<{ full: string; medium: string; thumb: string; totalBytes: number }> {
): Promise<{ original: string; totalBytes: number }> {
await ensureDir(outputDir);
const fullPath = path.join(outputDir, `${baseFilename}.webp`);
const mediumPath = path.join(outputDir, `${baseFilename}-medium.webp`);
const thumbPath = path.join(outputDir, `${baseFilename}-thumb.webp`);
const originalPath = path.join(outputDir, `${baseFilename}.webp`);
// Process images in parallel
const [fullBuffer, mediumBuffer, thumbBuffer] = await Promise.all([
// Full: max 1200x1200, high quality
sharp(buffer)
.resize(1200, 1200, { fit: 'inside', withoutEnlargement: true })
.webp({ quality: 85 })
.toBuffer(),
// Medium: 600x600
sharp(buffer)
.resize(600, 600, { fit: 'inside', withoutEnlargement: true })
.webp({ quality: 80 })
.toBuffer(),
// Thumb: 200x200
sharp(buffer)
.resize(200, 200, { fit: 'inside', withoutEnlargement: true })
.webp({ quality: 75 })
.toBuffer(),
]);
// Convert to webp, preserve original dimensions, high quality
const originalBuffer = await sharp(buffer)
.webp({ quality: 90 })
.toBuffer();
// Save all sizes
await Promise.all([
fs.writeFile(fullPath, fullBuffer),
fs.writeFile(mediumPath, mediumBuffer),
fs.writeFile(thumbPath, thumbBuffer),
]);
const totalBytes = fullBuffer.length + mediumBuffer.length + thumbBuffer.length;
await fs.writeFile(originalPath, originalBuffer);
return {
full: fullPath,
medium: mediumPath,
thumb: thumbPath,
totalBytes,
original: originalPath,
totalBytes: originalBuffer.length,
};
}
@@ -135,47 +138,107 @@ async function processAndSaveImage(
* Convert a file path to a public URL
*/
function pathToUrl(filePath: string): string {
// Find /products/ or /brands/ in the path and extract from there
const productsMatch = filePath.match(/(\/products\/.*)/);
const brandsMatch = filePath.match(/(\/brands\/.*)/);
if (productsMatch) {
return `${IMAGES_PUBLIC_URL}${productsMatch[1]}`;
}
if (brandsMatch) {
return `${IMAGES_PUBLIC_URL}${brandsMatch[1]}`;
}
// Fallback: try to replace base path (works if paths match exactly)
const relativePath = filePath.replace(IMAGES_BASE_PATH, '');
return `${IMAGES_PUBLIC_URL}${relativePath}`;
}
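// e.g. '<IMAGES_BASE_PATH>/products/az/deeply-rooted/acme/123/image-ab12cd34.webp'
//      -> '/images/products/az/deeply-rooted/acme/123/image-ab12cd34.webp'
// (assumes the default IMAGES_PUBLIC_URL of '/images'; the path is illustrative)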
/**
* Download and store a product image locally
* Build the directory path for a product image
* Structure: /images/products/<state>/<store>/<brand>/<product>/
*/
function buildProductImagePath(ctx: ProductImageContext): string {
const state = slugify(ctx.stateCode || 'unknown');
const store = slugify(ctx.storeSlug || 'unknown');
const brand = slugify(ctx.brandSlug || 'unknown');
const product = slugify(ctx.productId || 'unknown');
return path.join(IMAGES_BASE_PATH, 'products', state, store, brand, product);
}
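// e.g. { stateCode: 'AZ', storeSlug: 'Deeply Rooted', brandSlug: 'High West Farms', productId: 'abc123' }
//      -> <IMAGES_BASE_PATH>/products/az/deeply-rooted/high-west-farms/abc123
// (illustrative values; each segment goes through slugify above)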
/**
* Download and store a product image with proper hierarchy
*
* @param imageUrl - The third-party image URL to download
* @param dispensaryId - The dispensary ID (for directory organization)
* @param productId - The product ID or external ID (for filename)
* @param ctx - Product context (state, store, brand, product)
* @param options - Download options
* @returns Download result with local URLs
*/
export async function downloadProductImage(
imageUrl: string,
dispensaryId: number,
productId: string | number
ctx: ProductImageContext,
options: { skipIfExists?: boolean } = {}
): Promise<DownloadResult> {
const { skipIfExists = true } = options;
try {
if (!imageUrl) {
return { success: false, error: 'No image URL provided' };
}
const outputDir = buildProductImagePath(ctx);
const urlHash = hashUrl(imageUrl);
const baseFilename = `image-${urlHash}`;
// Check if image already exists
if (skipIfExists) {
const existingPath = path.join(outputDir, `${baseFilename}.webp`);
try {
await fs.access(existingPath);
// Image exists, return existing URL
const url = pathToUrl(existingPath);
return {
success: true,
skipped: true,
urls: {
original: url,
full: url,
medium: url,
thumb: url,
},
localPaths: {
original: existingPath,
full: existingPath,
medium: existingPath,
thumb: existingPath,
},
};
} catch {
// Image doesn't exist, continue to download
}
}
// Download the image
const buffer = await downloadImage(imageUrl);
// Organize by dispensary ID
const outputDir = path.join(IMAGES_BASE_PATH, 'products', String(dispensaryId));
// Use product ID + URL hash for uniqueness
const urlHash = hashUrl(imageUrl);
const baseFilename = `${productId}-${urlHash}`;
// Process and save
// Process and save (original only)
const result = await processAndSaveImage(buffer, outputDir, baseFilename);
const url = pathToUrl(result.original);
return {
success: true,
urls: {
full: pathToUrl(result.full),
medium: pathToUrl(result.medium),
thumb: pathToUrl(result.thumb),
original: url,
full: url,
medium: url,
thumb: url,
},
localPaths: {
original: result.original,
full: result.original,
medium: result.original,
thumb: result.original,
},
bytesDownloaded: result.totalBytes,
};
@@ -188,33 +251,71 @@ export async function downloadProductImage(
}
/**
* Download and store a brand logo locally
* Legacy function - backwards compatible with old signature
* Maps to new hierarchy using dispensary_id as store identifier
*/
export async function downloadProductImageLegacy(
imageUrl: string,
dispensaryId: number,
productId: string | number
): Promise<DownloadResult> {
return downloadProductImage(imageUrl, {
stateCode: 'unknown',
storeSlug: `store-${dispensaryId}`,
brandSlug: 'unknown',
productId: String(productId),
dispensaryId,
});
}
/**
* Download and store a brand logo
*
* @param logoUrl - The brand logo URL
* @param brandId - The brand ID or slug
* @param brandSlug - The brand slug/ID
* @returns Download result with local URL
*/
export async function downloadBrandLogo(
logoUrl: string,
brandId: string
brandSlug: string,
options: { skipIfExists?: boolean } = {}
): Promise<DownloadResult> {
const { skipIfExists = true } = options;
try {
if (!logoUrl) {
return { success: false, error: 'No logo URL provided' };
}
const safeBrandSlug = slugify(brandSlug);
const outputDir = path.join(IMAGES_BASE_PATH, 'brands', safeBrandSlug);
const urlHash = hashUrl(logoUrl);
const baseFilename = `logo-${urlHash}`;
// Check if logo already exists
if (skipIfExists) {
const existingPath = path.join(outputDir, `${baseFilename}.webp`);
try {
await fs.access(existingPath);
return {
success: true,
skipped: true,
urls: {
original: pathToUrl(existingPath),
full: pathToUrl(existingPath),
medium: pathToUrl(existingPath),
thumb: pathToUrl(existingPath),
},
};
} catch {
// Logo doesn't exist, continue
}
}
// Download the image
const buffer = await downloadImage(logoUrl);
// Brand logos go in /images/brands/
const outputDir = path.join(IMAGES_BASE_PATH, 'brands');
// Sanitize brand ID for filename
const safeBrandId = brandId.replace(/[^a-zA-Z0-9-_]/g, '_');
const urlHash = hashUrl(logoUrl);
const baseFilename = `${safeBrandId}-${urlHash}`;
// Process and save (single size for logos)
// Brand logos in their own directory
await ensureDir(outputDir);
const logoPath = path.join(outputDir, `${baseFilename}.webp`);
@@ -228,6 +329,7 @@ export async function downloadBrandLogo(
return {
success: true,
urls: {
original: pathToUrl(logoPath),
full: pathToUrl(logoPath),
medium: pathToUrl(logoPath),
thumb: pathToUrl(logoPath),
@@ -243,20 +345,16 @@ export async function downloadBrandLogo(
}
/**
* Check if a local image already exists
* Check if a product image already exists
*/
export async function imageExists(
dispensaryId: number,
productId: string | number,
export async function productImageExists(
ctx: ProductImageContext,
imageUrl: string
): Promise<boolean> {
const outputDir = buildProductImagePath(ctx);
const urlHash = hashUrl(imageUrl);
const imagePath = path.join(
IMAGES_BASE_PATH,
'products',
String(dispensaryId),
`${productId}-${urlHash}.webp`
);
const imagePath = path.join(outputDir, `image-${urlHash}.webp`);
try {
await fs.access(imagePath);
return true;
@@ -266,24 +364,27 @@ export async function imageExists(
}
/**
* Delete a product's local images
* Get the local image URL for a product (if exists)
*/
export async function deleteProductImages(
dispensaryId: number,
productId: string | number,
imageUrl?: string
): Promise<void> {
const productDir = path.join(IMAGES_BASE_PATH, 'products', String(dispensaryId));
const prefix = imageUrl
? `${productId}-${hashUrl(imageUrl)}`
: String(productId);
export async function getProductImageUrl(
ctx: ProductImageContext,
imageUrl: string
): Promise<LocalImageSizes | null> {
const outputDir = buildProductImagePath(ctx);
const urlHash = hashUrl(imageUrl);
const imagePath = path.join(outputDir, `image-${urlHash}.webp`);
try {
const files = await fs.readdir(productDir);
const toDelete = files.filter(f => f.startsWith(prefix));
await Promise.all(toDelete.map(f => fs.unlink(path.join(productDir, f))));
await fs.access(imagePath);
const url = pathToUrl(imagePath);
return {
original: url,
full: url,
medium: url,
thumb: url,
};
} catch {
// Directory might not exist, that's fine
return null;
}
}
@@ -296,19 +397,17 @@ export function isImageStorageReady(): boolean {
/**
* Initialize the image storage directories
* Does NOT throw on failure - logs warning and continues
*/
export async function initializeImageStorage(): Promise<void> {
try {
await ensureDir(path.join(IMAGES_BASE_PATH, 'products'));
await ensureDir(path.join(IMAGES_BASE_PATH, 'brands'));
console.log(`Image storage initialized at ${IMAGES_BASE_PATH}`);
console.log(`[ImageStorage] Initialized at ${IMAGES_BASE_PATH}`);
imageStorageReady = true;
} catch (error: any) {
console.warn(`⚠️ WARNING: Could not initialize image storage at ${IMAGES_BASE_PATH}: ${error.message}`);
console.warn(' Image upload/processing is disabled. Server will continue without image features.');
console.warn(`[ImageStorage] WARNING: Could not initialize at ${IMAGES_BASE_PATH}: ${error.message}`);
console.warn(' Image features disabled. Server will continue without image downloads.');
imageStorageReady = false;
// Do NOT throw - server should still start
}
}
@@ -316,34 +415,43 @@ export async function initializeImageStorage(): Promise<void> {
* Get storage stats
*/
export async function getStorageStats(): Promise<{
productsDir: string;
brandsDir: string;
basePath: string;
productCount: number;
brandCount: number;
totalSizeBytes: number;
}> {
const productsDir = path.join(IMAGES_BASE_PATH, 'products');
const brandsDir = path.join(IMAGES_BASE_PATH, 'brands');
let productCount = 0;
let brandCount = 0;
let totalSizeBytes = 0;
try {
const productDirs = await fs.readdir(productsDir);
for (const dir of productDirs) {
const files = await fs.readdir(path.join(productsDir, dir));
productCount += files.filter(f => f.endsWith('.webp') && !f.includes('-')).length;
}
} catch { /* ignore */ }
async function countDir(dirPath: string): Promise<{ count: number; size: number }> {
let count = 0;
let size = 0;
try {
const entries = await fs.readdir(dirPath, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(dirPath, entry.name);
if (entry.isDirectory()) {
const sub = await countDir(fullPath);
count += sub.count;
size += sub.size;
} else if (entry.name.endsWith('.webp') && !entry.name.endsWith('-medium.webp') && !entry.name.endsWith('-thumb.webp')) {
count++;
const stat = await fs.stat(fullPath);
size += stat.size;
}
}
} catch { /* ignore */ }
return { count, size };
}
try {
const brandFiles = await fs.readdir(brandsDir);
brandCount = brandFiles.filter(f => f.endsWith('.webp')).length;
} catch { /* ignore */ }
const products = await countDir(path.join(IMAGES_BASE_PATH, 'products'));
const brands = await countDir(path.join(IMAGES_BASE_PATH, 'brands'));
return {
productsDir,
brandsDir,
productCount,
brandCount,
basePath: IMAGES_BASE_PATH,
productCount: products.count,
brandCount: brands.count,
totalSizeBytes: products.size + brands.size,
};
}
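A minimal usage sketch of the new hierarchy-aware helpers (URLs, context values, and the import path are illustrative):

import { downloadProductImage, downloadBrandLogo } from './local-image-storage';

async function storeImagesExample(): Promise<void> {
  const result = await downloadProductImage('https://example.com/img/abc123.jpg', {
    stateCode: 'AZ',
    storeSlug: 'deeply-rooted',
    brandSlug: 'high-west-farms',
    productId: 'abc123',
  });
  if (result.success && !result.skipped) {
    console.log(`Stored ${result.bytesDownloaded} bytes at ${result.urls?.original}`);
  }

  // Brand logos land under /images/brands/<brand_slug>/logo-<hash>.webp
  await downloadBrandLogo('https://example.com/logo.png', 'high-west-farms');
}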

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM code.cannabrands.app/creationshop/node:20-slim AS builder
WORKDIR /app
@@ -20,7 +20,7 @@ COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine
FROM code.cannabrands.app/creationshop/nginx:alpine
# Copy built assets from builder stage
COPY --from=builder /app/dist /usr/share/nginx/html

View File

@@ -7,8 +7,8 @@
<title>CannaIQ - Cannabis Menu Intelligence Platform</title>
<meta name="description" content="CannaIQ provides real-time cannabis dispensary menu data, product tracking, and analytics for dispensaries across Arizona." />
<meta name="keywords" content="cannabis, dispensary, menu, products, analytics, Arizona" />
<script type="module" crossorigin src="/assets/index-DTnhZh6X.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-9PqXc--D.css">
<script type="module" crossorigin src="/assets/index-BML8-px1.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-B2gR-58G.css">
</head>
<body>
<div id="root"></div>

View File

@@ -47,6 +47,7 @@ import StateDetail from './pages/StateDetail';
import { Discovery } from './pages/Discovery';
import { WorkersDashboard } from './pages/WorkersDashboard';
import { JobQueue } from './pages/JobQueue';
import TasksDashboard from './pages/TasksDashboard';
import { ScraperOverviewDashboard } from './pages/ScraperOverviewDashboard';
import { SeoOrchestrator } from './pages/admin/seo/SeoOrchestrator';
import { StatePage } from './pages/public/StatePage';
@@ -124,6 +125,8 @@ export default function App() {
<Route path="/workers" element={<PrivateRoute><WorkersDashboard /></PrivateRoute>} />
{/* Job Queue Management */}
<Route path="/job-queue" element={<PrivateRoute><JobQueue /></PrivateRoute>} />
{/* Task Queue Dashboard */}
<Route path="/tasks" element={<PrivateRoute><TasksDashboard /></PrivateRoute>} />
{/* Scraper Overview Dashboard (new primary) */}
<Route path="/scraper/overview" element={<PrivateRoute><ScraperOverviewDashboard /></PrivateRoute>} />
<Route path="*" element={<Navigate to="/dashboard" replace />} />

View File

@@ -0,0 +1,329 @@
import { useEffect, useState } from 'react';
import { api } from '../lib/api';
interface PipelineStep {
name: string;
state: 'pending' | 'running' | 'success' | 'failure' | 'skipped';
}
interface DeployStatusData {
running: {
sha: string;
sha_full: string;
build_time: string;
image_tag: string;
};
latest: {
sha: string;
sha_full: string;
message: string;
author: string;
timestamp: string;
} | null;
is_latest: boolean;
commits_behind: number;
pipeline: {
number: number;
status: string;
event: string;
branch: string;
message: string;
commit: string;
author: string;
created: number;
steps?: PipelineStep[];
} | null;
error?: string;
}
const statusColors: Record<string, string> = {
success: '#10b981',
running: '#f59e0b',
pending: '#6b7280',
failure: '#ef4444',
error: '#ef4444',
skipped: '#9ca3af',
};
const statusIcons: Record<string, string> = {
success: '\u2713',
running: '\u25B6',
pending: '\u25CB',
failure: '\u2717',
error: '\u2717',
skipped: '\u2212',
};
export function DeployStatus() {
const [data, setData] = useState<DeployStatusData | null>(null);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const fetchStatus = async () => {
try {
setLoading(true);
const { data: responseData } = await api.get<DeployStatusData>('/api/admin/deploy-status');
setData(responseData);
setError(null);
} catch (err: any) {
setError(err.message || 'Failed to fetch deploy status');
} finally {
setLoading(false);
}
};
useEffect(() => {
fetchStatus();
// Auto-refresh every 30 seconds
const interval = setInterval(fetchStatus, 30000);
return () => clearInterval(interval);
}, []);
const formatTime = (timestamp: string | number) => {
const date = typeof timestamp === 'number'
? new Date(timestamp * 1000)
: new Date(timestamp);
return date.toLocaleString();
};
const formatTimeAgo = (timestamp: string | number) => {
const date = typeof timestamp === 'number'
? new Date(timestamp * 1000)
: new Date(timestamp);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffMins = Math.floor(diffMs / 60000);
const diffHours = Math.floor(diffMins / 60);
const diffDays = Math.floor(diffHours / 24);
if (diffMins < 1) return 'just now';
if (diffMins < 60) return `${diffMins}m ago`;
if (diffHours < 24) return `${diffHours}h ago`;
return `${diffDays}d ago`;
};
if (loading && !data) {
return (
<div style={{ padding: '20px', background: '#1f2937', borderRadius: '8px', color: '#9ca3af' }}>
Loading deploy status...
</div>
);
}
if (error && !data) {
return (
<div style={{ padding: '20px', background: '#1f2937', borderRadius: '8px', color: '#ef4444' }}>
Error: {error}
</div>
);
}
if (!data) return null;
const pipelineStatus = data.pipeline?.status || 'unknown';
return (
<div style={{
background: '#1f2937',
borderRadius: '8px',
overflow: 'hidden',
border: '1px solid #374151'
}}>
{/* Header */}
<div style={{
padding: '16px 20px',
borderBottom: '1px solid #374151',
display: 'flex',
justifyContent: 'space-between',
alignItems: 'center'
}}>
<div style={{ display: 'flex', alignItems: 'center', gap: '12px' }}>
<span style={{ fontSize: '18px', fontWeight: '600', color: '#f3f4f6' }}>
Deploy Status
</span>
{data.is_latest ? (
<span style={{
background: '#10b981',
color: 'white',
padding: '2px 8px',
borderRadius: '4px',
fontSize: '12px',
fontWeight: '500'
}}>
Up to date
</span>
) : (
<span style={{
background: '#f59e0b',
color: 'white',
padding: '2px 8px',
borderRadius: '4px',
fontSize: '12px',
fontWeight: '500'
}}>
{data.commits_behind} commit{data.commits_behind !== 1 ? 's' : ''} behind
</span>
)}
</div>
<button
onClick={fetchStatus}
disabled={loading}
style={{
background: '#374151',
border: 'none',
padding: '6px 12px',
borderRadius: '4px',
color: '#9ca3af',
cursor: loading ? 'not-allowed' : 'pointer',
fontSize: '12px'
}}
>
{loading ? 'Refreshing...' : 'Refresh'}
</button>
</div>
{/* Version Info */}
<div style={{ padding: '16px 20px', display: 'grid', gridTemplateColumns: '1fr 1fr', gap: '20px' }}>
{/* Running Version */}
<div>
<div style={{ color: '#9ca3af', fontSize: '12px', marginBottom: '8px', textTransform: 'uppercase' }}>
Running Version
</div>
<div style={{ display: 'flex', alignItems: 'center', gap: '8px' }}>
<code style={{
background: '#374151',
padding: '4px 8px',
borderRadius: '4px',
color: '#10b981',
fontSize: '14px',
fontFamily: 'monospace'
}}>
{data.running.sha}
</code>
<span style={{ color: '#6b7280', fontSize: '12px' }}>
{formatTimeAgo(data.running.build_time)}
</span>
</div>
</div>
{/* Latest Commit */}
<div>
<div style={{ color: '#9ca3af', fontSize: '12px', marginBottom: '8px', textTransform: 'uppercase' }}>
Latest Commit
</div>
{data.latest ? (
<div>
<div style={{ display: 'flex', alignItems: 'center', gap: '8px' }}>
<code style={{
background: '#374151',
padding: '4px 8px',
borderRadius: '4px',
color: data.is_latest ? '#10b981' : '#f59e0b',
fontSize: '14px',
fontFamily: 'monospace'
}}>
{data.latest.sha}
</code>
<span style={{ color: '#6b7280', fontSize: '12px' }}>
{formatTimeAgo(data.latest.timestamp)}
</span>
</div>
<div style={{
color: '#9ca3af',
fontSize: '13px',
marginTop: '4px',
overflow: 'hidden',
textOverflow: 'ellipsis',
whiteSpace: 'nowrap',
maxWidth: '300px'
}}>
{data.latest.message}
</div>
</div>
) : (
<span style={{ color: '#6b7280' }}>Unable to fetch</span>
)}
</div>
</div>
{/* Pipeline Status */}
{data.pipeline && (
<div style={{
padding: '16px 20px',
borderTop: '1px solid #374151',
background: '#111827'
}}>
<div style={{
display: 'flex',
justifyContent: 'space-between',
alignItems: 'center',
marginBottom: '12px'
}}>
<div style={{ display: 'flex', alignItems: 'center', gap: '10px' }}>
<span style={{
color: statusColors[pipelineStatus] || '#6b7280',
fontSize: '16px'
}}>
{statusIcons[pipelineStatus] || '?'}
</span>
<span style={{ color: '#f3f4f6', fontWeight: '500' }}>
Pipeline #{data.pipeline.number}
</span>
<span style={{
color: statusColors[pipelineStatus] || '#6b7280',
fontSize: '13px',
textTransform: 'capitalize'
}}>
{pipelineStatus}
</span>
</div>
<span style={{ color: '#6b7280', fontSize: '12px' }}>
{data.pipeline.branch} {'\u2022'} {data.pipeline.commit}
</span>
</div>
{/* Pipeline Steps */}
{data.pipeline.steps && data.pipeline.steps.length > 0 && (
<div style={{
display: 'flex',
gap: '4px',
flexWrap: 'wrap'
}}>
{data.pipeline.steps.map((step, idx) => (
<div
key={idx}
style={{
display: 'flex',
alignItems: 'center',
gap: '4px',
background: '#1f2937',
padding: '4px 8px',
borderRadius: '4px',
fontSize: '12px'
}}
>
<span style={{ color: statusColors[step.state] || '#6b7280' }}>
{statusIcons[step.state] || '?'}
</span>
<span style={{ color: '#9ca3af' }}>{step.name}</span>
</div>
))}
</div>
)}
{/* Commit message */}
<div style={{
color: '#6b7280',
fontSize: '12px',
marginTop: '8px',
overflow: 'hidden',
textOverflow: 'ellipsis',
whiteSpace: 'nowrap'
}}>
{data.pipeline.message}
</div>
</div>
)}
</div>
);
}

View File

@@ -20,9 +20,11 @@ import {
Menu,
X,
Users,
UserCog,
ListOrdered,
Key,
Bot
Bot,
ListChecks
} from 'lucide-react';
interface LayoutProps {
@@ -30,6 +32,7 @@ interface LayoutProps {
}
interface VersionInfo {
version?: string;
build_version: string;
git_sha: string;
build_time: string;
@@ -124,7 +127,14 @@ export function Layout({ children }: LayoutProps) {
<path d="M3.5 6C2 8 1 10.5 1 13C1 18.5 6 22 12 22C18 22 23 18.5 23 13C23 10.5 22 8 20.5 6L12 12L3.5 6Z" opacity="0.7" />
</svg>
</div>
<span className="text-lg font-bold text-gray-900">CannaIQ</span>
<div>
<span className="text-lg font-bold text-gray-900">CannaIQ</span>
{versionInfo && (
<p className="text-xs text-gray-400">
v{versionInfo.version || versionInfo.build_version} ({versionInfo.git_sha}) {versionInfo.build_time !== 'unknown' && `- ${new Date(versionInfo.build_time).toLocaleDateString()}`}
</p>
)}
</div>
</div>
<p className="text-xs text-gray-500 mt-2 truncate">{user?.email}</p>
</div>
@@ -152,8 +162,10 @@ export function Layout({ children }: LayoutProps) {
<NavSection title="Admin">
<NavLink to="/admin/orchestrator" icon={<Activity className="w-4 h-4" />} label="Orchestrator" isActive={isActive('/admin/orchestrator')} />
<NavLink to="/users" icon={<UserCog className="w-4 h-4" />} label="Users" isActive={isActive('/users')} />
<NavLink to="/workers" icon={<Users className="w-4 h-4" />} label="Workers" isActive={isActive('/workers')} />
<NavLink to="/job-queue" icon={<ListOrdered className="w-4 h-4" />} label="Job Queue" isActive={isActive('/job-queue')} />
<NavLink to="/tasks" icon={<ListChecks className="w-4 h-4" />} label="Task Queue" isActive={isActive('/tasks')} />
<NavLink to="/admin/seo" icon={<FileText className="w-4 h-4" />} label="SEO Pages" isActive={isActive('/admin/seo')} />
<NavLink to="/proxies" icon={<Shield className="w-4 h-4" />} label="Proxies" isActive={isActive('/proxies')} />
<NavLink to="/api-permissions" icon={<Key className="w-4 h-4" />} label="API Keys" isActive={isActive('/api-permissions')} />
@@ -169,14 +181,6 @@ export function Layout({ children }: LayoutProps) {
<span>Logout</span>
</button>
</div>
{/* Version Footer */}
{versionInfo && (
<div className="px-3 py-2 border-t border-gray-200 bg-gray-50">
<p className="text-xs text-gray-500 text-center">{versionInfo.build_version} ({versionInfo.git_sha.slice(0, 7)})</p>
<p className="text-xs text-gray-400 text-center mt-0.5">{versionInfo.image_tag}</p>
</div>
)}
</>
);

View File

@@ -69,6 +69,13 @@ class ApiClient {
return { data };
}
async delete<T = any>(endpoint: string): Promise<{ data: T }> {
const data = await this.request<T>(endpoint, {
method: 'DELETE',
});
return { data };
}
// Auth
async login(email: string, password: string) {
return this.request<{ token: string; user: any }>('/api/auth/login', {
@@ -113,8 +120,21 @@ class ApiClient {
});
}
async getDispensaries() {
return this.request<{ dispensaries: any[] }>('/api/dispensaries');
async getDispensaries(params?: { limit?: number; offset?: number; search?: string; city?: string; state?: string; crawl_enabled?: string; status?: string }) {
const searchParams = new URLSearchParams();
if (params?.limit) searchParams.append('limit', params.limit.toString());
if (params?.offset) searchParams.append('offset', params.offset.toString());
if (params?.search) searchParams.append('search', params.search);
if (params?.city) searchParams.append('city', params.city);
if (params?.state) searchParams.append('state', params.state);
if (params?.crawl_enabled) searchParams.append('crawl_enabled', params.crawl_enabled);
if (params?.status) searchParams.append('status', params.status);
const queryString = searchParams.toString() ? `?${searchParams.toString()}` : '';
return this.request<{ dispensaries: any[]; total: number; limit: number; offset: number; hasMore: boolean }>(`/api/dispensaries${queryString}`);
}
async getDroppedStores() {
return this.request<{ dropped_count: number; dropped_stores: any[] }>('/api/dispensaries/stats/dropped');
}
async getDispensary(slug: string) {
@@ -2769,6 +2789,101 @@ class ApiClient {
sampleValues: Record<string, any>;
}>(`/api/seo/templates/variables/${encodeURIComponent(pageType)}`);
}
// ==========================================
// Task Queue API
// ==========================================
async getTasks(params?: {
role?: string;
status?: string;
dispensary_id?: number;
limit?: number;
offset?: number;
}) {
const query = new URLSearchParams();
if (params?.role) query.set('role', params.role);
if (params?.status) query.set('status', params.status);
if (params?.dispensary_id) query.set('dispensary_id', String(params.dispensary_id));
if (params?.limit) query.set('limit', String(params.limit));
if (params?.offset) query.set('offset', String(params.offset));
const qs = query.toString();
return this.request<{ tasks: any[]; count: number }>(`/api/tasks${qs ? '?' + qs : ''}`);
}
async getTask(id: number) {
return this.request<any>(`/api/tasks/${id}`);
}
async getTaskCounts() {
return this.request<{
pending: number;
claimed: number;
running: number;
completed: number;
failed: number;
stale: number;
}>('/api/tasks/counts');
}
async getTaskCapacity() {
return this.request<{ metrics: any[] }>('/api/tasks/capacity');
}
async getRoleCapacity(role: string) {
return this.request<any>(`/api/tasks/capacity/${role}`);
}
async createTask(params: {
role: string;
dispensary_id?: number;
platform?: string;
priority?: number;
scheduled_for?: string;
}) {
return this.request<any>('/api/tasks', {
method: 'POST',
body: JSON.stringify(params),
});
}
async generateResyncTasks(params?: { batches_per_day?: number; date?: string }) {
return this.request<{ success: boolean; tasks_created: number }>('/api/tasks/generate/resync', {
method: 'POST',
body: JSON.stringify(params ?? {}),
});
}
async generateDiscoveryTask(platform: string, stateCode?: string, priority?: number) {
return this.request<any>('/api/tasks/generate/discovery', {
method: 'POST',
body: JSON.stringify({ platform, state_code: stateCode, priority }),
});
}
async recoverStaleTasks(thresholdMinutes?: number) {
return this.request<{ success: boolean; tasks_recovered: number }>('/api/tasks/recover-stale', {
method: 'POST',
body: JSON.stringify({ threshold_minutes: thresholdMinutes }),
});
}
async getLastRoleCompletion(role: string) {
return this.request<{ role: string; last_completion: string | null; time_since: number | null }>(
`/api/tasks/role/${role}/last-completion`
);
}
async getRecentRoleCompletions(role: string, limit?: number) {
const qs = limit ? `?limit=${limit}` : '';
return this.request<{ tasks: any[] }>(`/api/tasks/role/${role}/recent${qs}`);
}
async checkStoreActiveTask(dispensaryId: number) {
return this.request<{ dispensary_id: number; has_active_task: boolean }>(
`/api/tasks/store/${dispensaryId}/active`
);
}
}
export const api = new ApiClient(API_URL);

119
cannaiq/src/lib/images.ts Normal file
View File

@@ -0,0 +1,119 @@
/**
* Image URL utilities for on-demand resizing
*
* Uses the backend's /img proxy endpoint for local images.
* Falls back to original URL for remote images.
*/
const API_BASE = import.meta.env.VITE_API_URL || '';
interface ImageOptions {
width?: number;
height?: number;
quality?: number;
fit?: 'cover' | 'contain' | 'fill' | 'inside' | 'outside';
}
/**
* Check if URL is a local image path
*/
function isLocalImage(url: string): boolean {
return url.startsWith('/images/') || url.startsWith('/img/');
}
/**
* Build an image URL with optional resize parameters
*
* @param imageUrl - Original image URL (local or remote)
* @param options - Resize options
* @returns Optimized image URL
*
* @example
* // Thumbnail (50px)
* getImageUrl(product.image_url, { width: 50 })
*
* // Card image (200px)
* getImageUrl(product.image_url, { width: 200 })
*
* // Detail view (600px)
* getImageUrl(product.image_url, { width: 600 })
*
* // Square crop
* getImageUrl(product.image_url, { width: 200, height: 200, fit: 'cover' })
*/
export function getImageUrl(
imageUrl: string | null | undefined,
options: ImageOptions = {}
): string | null {
if (!imageUrl) return null;
// For remote images (AWS, Dutchie CDN, etc.), return as-is
// These can't be resized by our proxy
if (imageUrl.startsWith('http://') || imageUrl.startsWith('https://')) {
return imageUrl;
}
// For local images, use the /img proxy with resize params
if (isLocalImage(imageUrl)) {
// Convert /images/ path to /img/ proxy path
let proxyPath = imageUrl;
if (imageUrl.startsWith('/images/')) {
proxyPath = imageUrl.replace('/images/', '/img/');
}
// Build query params
const params = new URLSearchParams();
if (options.width) params.set('w', String(options.width));
if (options.height) params.set('h', String(options.height));
if (options.quality) params.set('q', String(options.quality));
if (options.fit) params.set('fit', options.fit);
const queryString = params.toString();
const url = queryString ? `${proxyPath}?${queryString}` : proxyPath;
// Prepend API base if needed
return API_BASE ? `${API_BASE}${url}` : url;
}
// Unknown format, return as-is
return imageUrl;
}
/**
* Preset sizes for common use cases
*/
export const ImageSizes = {
/** Tiny thumbnail for lists (50px) */
thumb: { width: 50 },
/** Small card (100px) */
small: { width: 100 },
/** Medium card (200px) */
medium: { width: 200 },
/** Large card (400px) */
large: { width: 400 },
/** Detail view (600px) */
detail: { width: 600 },
/** Full size (no resize) */
full: {},
} as const;
/**
* Convenience function for thumbnail
*/
export function getThumbUrl(imageUrl: string | null | undefined): string | null {
return getImageUrl(imageUrl, ImageSizes.thumb);
}
/**
* Convenience function for card images
*/
export function getCardUrl(imageUrl: string | null | undefined): string | null {
return getImageUrl(imageUrl, ImageSizes.medium);
}
/**
* Convenience function for detail images
*/
export function getDetailUrl(imageUrl: string | null | undefined): string | null {
return getImageUrl(imageUrl, ImageSizes.detail);
}

View File

@@ -18,7 +18,11 @@ import {
Globe,
MapPin,
ArrowRight,
BarChart3
BarChart3,
ListChecks,
Play,
CheckCircle2,
XCircle
} from 'lucide-react';
import {
LineChart,
@@ -41,12 +45,34 @@ export function Dashboard() {
const [refreshing, setRefreshing] = useState(false);
const [pendingChangesCount, setPendingChangesCount] = useState(0);
const [showNotification, setShowNotification] = useState(false);
const [taskCounts, setTaskCounts] = useState<Record<string, number> | null>(null);
const [droppedStoresCount, setDroppedStoresCount] = useState(0);
const [showDroppedAlert, setShowDroppedAlert] = useState(false);
useEffect(() => {
loadData();
checkNotificationStatus();
checkDroppedStores();
}, []);
const checkDroppedStores = async () => {
try {
const data = await api.getDroppedStores();
setDroppedStoresCount(data.dropped_count);
// Check if notification was dismissed for this count
const dismissedCount = localStorage.getItem('dismissedDroppedStoresCount');
const isDismissed = dismissedCount && parseInt(dismissedCount) >= data.dropped_count;
setShowDroppedAlert(data.dropped_count > 0 && !isDismissed);
} catch (error) {
console.error('Failed to check dropped stores:', error);
}
};
const handleDismissDroppedAlert = () => {
localStorage.setItem('dismissedDroppedStoresCount', droppedStoresCount.toString());
setShowDroppedAlert(false);
};
const checkNotificationStatus = async () => {
try {
// Fetch real pending changes count from API
@@ -119,6 +145,15 @@ export function Dashboard() {
// National stats not critical, just skip
setNationalStats(null);
}
// Fetch task queue counts
try {
const counts = await api.getTaskCounts();
setTaskCounts(counts);
} catch {
// Task counts not critical, just skip
setTaskCounts(null);
}
} catch (error) {
console.error('Failed to load dashboard:', error);
} finally {
@@ -200,6 +235,40 @@ export function Dashboard() {
</div>
)}
{/* Dropped Stores Alert */}
{showDroppedAlert && (
<div className="mb-6 bg-red-50 border-l-4 border-red-500 rounded-lg p-3 sm:p-4">
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-3 sm:gap-4">
<div className="flex items-start sm:items-center gap-3 flex-1">
<Store className="w-5 h-5 text-red-600 flex-shrink-0 mt-0.5 sm:mt-0" />
<div className="flex-1">
<h3 className="text-sm font-semibold text-red-900">
{droppedStoresCount} dropped store{droppedStoresCount !== 1 ? 's' : ''} need{droppedStoresCount === 1 ? 's' : ''} review
</h3>
<p className="text-xs sm:text-sm text-red-700 mt-0.5">
These stores were not found in the latest Dutchie discovery and may have stopped using the platform
</p>
</div>
</div>
<div className="flex items-center gap-2 pl-8 sm:pl-0">
<button
onClick={() => navigate('/dispensaries?status=dropped')}
className="btn btn-sm bg-red-600 hover:bg-red-700 text-white border-none"
>
Review
</button>
<button
onClick={handleDismissDroppedAlert}
className="btn btn-sm btn-ghost text-red-900 hover:bg-red-100"
aria-label="Dismiss notification"
>
<X className="w-4 h-4" />
</button>
</div>
</div>
</div>
)}
<div className="space-y-8">
{/* Header */}
<div className="flex flex-col sm:flex-row sm:justify-between sm:items-center gap-4">
@@ -471,6 +540,60 @@ export function Dashboard() {
</div>
)}
{/* Task Queue Summary */}
{taskCounts && (
<div className="bg-white rounded-xl border border-gray-200 p-4 sm:p-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="p-2 bg-violet-50 rounded-lg">
<ListChecks className="w-5 h-5 text-violet-600" />
</div>
<div>
<h3 className="text-sm sm:text-base font-semibold text-gray-900">Task Queue</h3>
<p className="text-xs text-gray-500">Worker task processing status</p>
</div>
</div>
<button
onClick={() => navigate('/tasks')}
className="flex items-center gap-1 text-sm text-violet-600 hover:text-violet-700"
>
View Dashboard
<ArrowRight className="w-4 h-4" />
</button>
</div>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
<div className="p-3 bg-amber-50 rounded-lg">
<div className="flex items-center gap-2">
<Clock className="w-4 h-4 text-amber-600" />
<span className="text-xs text-gray-500">Pending</span>
</div>
<div className="text-xl font-bold text-amber-600 mt-1">{taskCounts.pending || 0}</div>
</div>
<div className="p-3 bg-blue-50 rounded-lg">
<div className="flex items-center gap-2">
<Play className="w-4 h-4 text-blue-600" />
<span className="text-xs text-gray-500">Running</span>
</div>
<div className="text-xl font-bold text-blue-600 mt-1">{(taskCounts.claimed || 0) + (taskCounts.running || 0)}</div>
</div>
<div className="p-3 bg-emerald-50 rounded-lg">
<div className="flex items-center gap-2">
<CheckCircle2 className="w-4 h-4 text-emerald-600" />
<span className="text-xs text-gray-500">Completed</span>
</div>
<div className="text-xl font-bold text-emerald-600 mt-1">{taskCounts.completed || 0}</div>
</div>
<div className="p-3 bg-red-50 rounded-lg">
<div className="flex items-center gap-2">
<XCircle className="w-4 h-4 text-red-600" />
<span className="text-xs text-gray-500">Failed</span>
</div>
<div className="text-xl font-bold text-red-600 mt-1">{(taskCounts.failed || 0) + (taskCounts.stale || 0)}</div>
</div>
</div>
</div>
)}
{/* Activity Lists */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4 sm:gap-6">
{/* Recent Scrapes */}

View File

@@ -1,33 +1,73 @@
import React, { useEffect, useState } from 'react';
import React, { useEffect, useState, useCallback } from 'react';
import { useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { Building2, Phone, Mail, MapPin, ExternalLink, Search, Eye, Pencil, X, Save } from 'lucide-react';
import { Building2, Phone, Mail, MapPin, ExternalLink, Search, Eye, Pencil, X, Save, ChevronLeft, ChevronRight } from 'lucide-react';
const PAGE_SIZE = 50;
export function Dispensaries() {
const navigate = useNavigate();
const [dispensaries, setDispensaries] = useState<any[]>([]);
const [loading, setLoading] = useState(true);
const [searchTerm, setSearchTerm] = useState('');
const [filterCity, setFilterCity] = useState('');
const [debouncedSearch, setDebouncedSearch] = useState('');
const [filterState, setFilterState] = useState('');
const [filterStatus, setFilterStatus] = useState('');
const [editingDispensary, setEditingDispensary] = useState<any | null>(null);
const [editForm, setEditForm] = useState<any>({});
const [total, setTotal] = useState(0);
const [offset, setOffset] = useState(0);
const [hasMore, setHasMore] = useState(false);
const [states, setStates] = useState<string[]>([]);
// Debounce search
useEffect(() => {
loadDispensaries();
const timer = setTimeout(() => {
setDebouncedSearch(searchTerm);
setOffset(0); // Reset to first page on search
}, 300);
return () => clearTimeout(timer);
}, [searchTerm]);
// Load states once for filter dropdown
useEffect(() => {
const loadStates = async () => {
try {
const data = await api.getDispensaries({ limit: 500, crawl_enabled: 'all' });
const uniqueStates = Array.from(new Set(data.dispensaries.map((d: any) => d.state).filter(Boolean))).sort() as string[];
setStates(uniqueStates);
} catch (error) {
console.error('Failed to load states:', error);
}
};
loadStates();
}, []);
const loadDispensaries = async () => {
const loadDispensaries = useCallback(async () => {
setLoading(true);
try {
const data = await api.getDispensaries();
const data = await api.getDispensaries({
limit: PAGE_SIZE,
offset,
search: debouncedSearch || undefined,
state: filterState || undefined,
status: filterStatus || undefined,
crawl_enabled: 'all'
});
setDispensaries(data.dispensaries);
setTotal(data.total);
setHasMore(data.hasMore);
} catch (error) {
console.error('Failed to load dispensaries:', error);
} finally {
setLoading(false);
}
};
}, [offset, debouncedSearch, filterState, filterStatus]);
useEffect(() => {
loadDispensaries();
}, [loadDispensaries]);
const handleEdit = (dispensary: any) => {
setEditingDispensary(dispensary);
@@ -59,17 +99,23 @@ export function Dispensaries() {
setEditForm({});
};
const filteredDispensaries = dispensaries.filter(disp => {
const searchLower = searchTerm.toLowerCase();
const matchesSearch = !searchTerm ||
disp.name.toLowerCase().includes(searchLower) ||
(disp.company_name && disp.company_name.toLowerCase().includes(searchLower)) ||
(disp.dba_name && disp.dba_name.toLowerCase().includes(searchLower));
const matchesCity = !filterCity || disp.city === filterCity;
return matchesSearch && matchesCity;
});
const currentPage = Math.floor(offset / PAGE_SIZE) + 1;
const totalPages = Math.ceil(total / PAGE_SIZE);
const cities = Array.from(new Set(dispensaries.map(d => d.city).filter(Boolean))).sort();
const goToPage = (page: number) => {
const newOffset = (page - 1) * PAGE_SIZE;
setOffset(newOffset);
};
const handleStateFilter = (state: string) => {
setFilterState(state);
setOffset(0); // Reset to first page
};
const handleStatusFilter = (status: string) => {
setFilterStatus(status);
setOffset(0); // Reset to first page
};
return (
<Layout>
@@ -78,13 +124,13 @@ export function Dispensaries() {
<div>
<h1 className="text-2xl font-bold text-gray-900">Dispensaries</h1>
<p className="text-sm text-gray-600 mt-1">
AZDHS official dispensary directory ({dispensaries.length} total)
USA and Canada Dispensary Directory ({total} total)
</p>
</div>
{/* Filters */}
<div className="bg-white rounded-lg border border-gray-200 p-4">
<div className="grid grid-cols-2 gap-4">
<div className="grid grid-cols-3 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Search
@@ -102,19 +148,36 @@ export function Dispensaries() {
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Filter by City
Filter by State
</label>
<select
value={filterCity}
onChange={(e) => setFilterCity(e.target.value)}
value={filterState}
onChange={(e) => handleStateFilter(e.target.value)}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
>
<option value="">All Cities</option>
{cities.map(city => (
<option key={city} value={city}>{city}</option>
<option value="">All States</option>
{states.map(state => (
<option key={state} value={state}>{state}</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Filter by Status
</label>
<select
value={filterStatus}
onChange={(e) => handleStatusFilter(e.target.value)}
className={`w-full px-3 py-2 border rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500 ${
filterStatus === 'dropped' ? 'border-red-300 bg-red-50' : 'border-gray-300'
}`}
>
<option value="">All Statuses</option>
<option value="open">Open</option>
<option value="dropped">Dropped (Needs Review)</option>
<option value="closed">Closed</option>
</select>
</div>
</div>
</div>
@@ -133,9 +196,6 @@ export function Dispensaries() {
<th className="px-4 py-3 text-left text-xs font-medium text-gray-700 uppercase tracking-wider">
Name
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-700 uppercase tracking-wider">
Company
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-700 uppercase tracking-wider">
Address
</th>
@@ -157,14 +217,14 @@ export function Dispensaries() {
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{filteredDispensaries.length === 0 ? (
{dispensaries.length === 0 ? (
<tr>
<td colSpan={8} className="px-4 py-8 text-center text-sm text-gray-500">
<td colSpan={7} className="px-4 py-8 text-center text-sm text-gray-500">
No dispensaries found
</td>
</tr>
) : (
filteredDispensaries.map((disp) => (
dispensaries.map((disp) => (
<tr key={disp.id} className="hover:bg-gray-50">
<td className="px-4 py-3">
<div className="flex items-center gap-2">
@@ -181,13 +241,10 @@ export function Dispensaries() {
</div>
</div>
</td>
<td className="px-4 py-3">
<span className="text-sm text-gray-600">{disp.company_name || '-'}</span>
</td>
<td className="px-4 py-3">
<div className="flex items-start gap-1">
<MapPin className="w-3 h-3 text-gray-400 flex-shrink-0 mt-0.5" />
<span className="text-sm text-gray-600">{disp.address || '-'}</span>
<span className="text-sm text-gray-600">{disp.address1 || '-'}</span>
</div>
</td>
<td className="px-4 py-3">
@@ -266,10 +323,33 @@ export function Dispensaries() {
</table>
</div>
{/* Footer */}
{/* Footer with Pagination */}
<div className="bg-gray-50 px-4 py-3 border-t border-gray-200">
<div className="text-sm text-gray-600">
Showing {filteredDispensaries.length} of {dispensaries.length} dispensaries
<div className="flex items-center justify-between">
<div className="text-sm text-gray-600">
Showing {offset + 1}-{Math.min(offset + dispensaries.length, total)} of {total} dispensaries
</div>
<div className="flex items-center gap-2">
<button
onClick={() => goToPage(currentPage - 1)}
disabled={currentPage === 1}
className="inline-flex items-center gap-1 px-3 py-1.5 text-sm font-medium text-gray-700 bg-white border border-gray-300 rounded-lg hover:bg-gray-50 disabled:opacity-50 disabled:cursor-not-allowed"
>
<ChevronLeft className="w-4 h-4" />
Prev
</button>
<span className="text-sm text-gray-600">
Page {currentPage} of {totalPages}
</span>
<button
onClick={() => goToPage(currentPage + 1)}
disabled={!hasMore}
className="inline-flex items-center gap-1 px-3 py-1.5 text-sm font-medium text-gray-700 bg-white border border-gray-300 rounded-lg hover:bg-gray-50 disabled:opacity-50 disabled:cursor-not-allowed"
>
Next
<ChevronRight className="w-4 h-4" />
</button>
</div>
</div>
</div>
</div>

View File

@@ -2,6 +2,7 @@ import { useEffect, useState } from 'react';
import { useParams, useNavigate, Link } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
import {
Building2,
Phone,
@@ -497,7 +498,7 @@ export function DispensaryDetail() {
<td className="whitespace-nowrap">
{product.image_url ? (
<img
src={product.image_url}
src={getImageUrl(product.image_url, ImageSizes.thumb) || product.image_url}
alt={product.name}
className="w-12 h-12 object-cover rounded"
onError={(e) => e.currentTarget.style.display = 'none'}
@@ -686,7 +687,7 @@ export function DispensaryDetail() {
<div className="flex items-start gap-3">
{special.image_url && (
<img
src={special.image_url}
src={getImageUrl(special.image_url, ImageSizes.small) || special.image_url}
alt={special.name}
className="w-16 h-16 object-cover rounded"
onError={(e) => e.currentTarget.style.display = 'none'}

File diff suppressed because it is too large

View File

@@ -23,6 +23,7 @@ import {
ArrowUpCircle,
} from 'lucide-react';
import { StoreOrchestratorPanel } from '../components/StoreOrchestratorPanel';
import { DeployStatus } from '../components/DeployStatus';
interface CrawlHealth {
status: 'ok' | 'degraded' | 'stale' | 'error';
@@ -286,6 +287,9 @@ export function OrchestratorDashboard() {
</div>
</div>
{/* Deploy Status Panel */}
<DeployStatus />
{/* Metrics Cards - Clickable - Responsive: 2→3→4→7 columns */}
{metrics && (
<div className="grid grid-cols-2 sm:grid-cols-3 md:grid-cols-4 xl:grid-cols-7 gap-3 md:gap-4">

View File

@@ -3,6 +3,7 @@ import { Layout } from '../components/Layout';
import { Package, ArrowLeft, TrendingUp, TrendingDown, DollarSign, Search, Filter, ChevronDown, X, LineChart } from 'lucide-react';
import { useNavigate, useSearchParams } from 'react-router-dom';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
interface Product {
id: number;
@@ -324,7 +325,7 @@ export function OrchestratorProducts() {
<div className="flex items-center gap-3">
{product.image_url ? (
<img
src={product.image_url}
src={getImageUrl(product.image_url, ImageSizes.thumb) || product.image_url}
alt={product.name}
className="w-10 h-10 rounded object-cover"
/>
@@ -395,7 +396,7 @@ export function OrchestratorProducts() {
<div className="flex items-center gap-4">
{selectedProduct.image_url ? (
<img
src={selectedProduct.image_url}
src={getImageUrl(selectedProduct.image_url, ImageSizes.small) || selectedProduct.image_url}
alt={selectedProduct.name}
className="w-16 h-16 rounded object-cover"
/>

View File

@@ -3,6 +3,7 @@ import { Layout } from '../components/Layout';
import { Scale, Search, Package, Store, Trophy, TrendingDown, TrendingUp, MapPin } from 'lucide-react';
import { useNavigate, useSearchParams } from 'react-router-dom';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
interface CompareResult {
product_id: number;
@@ -311,7 +312,7 @@ export function PriceCompare() {
<div className="flex items-center gap-3">
{item.image_url ? (
<img
src={item.image_url}
src={getImageUrl(item.image_url, ImageSizes.thumb) || item.image_url}
alt={item.product_name}
className="w-10 h-10 rounded object-cover"
/>

View File

@@ -2,6 +2,7 @@ import { useEffect, useState } from 'react';
import { useParams, useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
import { ArrowLeft, ExternalLink, Package, Code, Copy, CheckCircle, FileJson, TrendingUp, TrendingDown, Minus, BarChart3 } from 'lucide-react';
export function ProductDetail() {
@@ -114,14 +115,9 @@ export function ProductDetail() {
const metadata = product.metadata || {};
const getImageUrl = () => {
if (product.image_url_full) return product.image_url_full;
if (product.medium_path) return `/api/images/dutchie/${product.medium_path}`;
if (product.thumbnail_path) return `/api/images/dutchie/${product.thumbnail_path}`;
return null;
};
const imageUrl = getImageUrl();
// Use the centralized image URL helper for on-demand resizing
const productImageUrl = product.image_url_full || product.image_url || product.medium_path || product.thumbnail_path;
const imageUrl = getImageUrl(productImageUrl, ImageSizes.detail);
return (
<Layout>

View File

@@ -2,6 +2,7 @@ import { useEffect, useState } from 'react';
import { useSearchParams, useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
export function Products() {
const [searchParams, setSearchParams] = useSearchParams();
@@ -417,9 +418,9 @@ function ProductCard({ product, onViewDetails }: { product: any; onViewDetails:
onMouseEnter={(e) => e.currentTarget.style.transform = 'translateY(-4px)'}
onMouseLeave={(e) => e.currentTarget.style.transform = 'translateY(0)'}
>
{product.image_url_full ? (
{(product.image_url_full || product.image_url) ? (
<img
src={product.image_url_full}
src={getImageUrl(product.image_url_full || product.image_url, ImageSizes.medium) || product.image_url_full || product.image_url}
alt={product.name}
style={{
width: '100%',

View File

@@ -2,12 +2,13 @@ import { useEffect, useState } from 'react';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { Toast } from '../components/Toast';
import { Shield, CheckCircle, XCircle, RefreshCw, Plus, MapPin, Clock, TrendingUp, Trash2, AlertCircle, Upload, FileText, X } from 'lucide-react';
import { Shield, CheckCircle, XCircle, RefreshCw, Plus, MapPin, Clock, TrendingUp, Trash2, AlertCircle, Upload, FileText, X, Edit2 } from 'lucide-react';
export function Proxies() {
const [proxies, setProxies] = useState<any[]>([]);
const [loading, setLoading] = useState(true);
const [showAddForm, setShowAddForm] = useState(false);
const [editingProxy, setEditingProxy] = useState<any>(null);
const [testing, setTesting] = useState<{ [key: number]: boolean }>({});
const [activeJob, setActiveJob] = useState<any>(null);
const [notification, setNotification] = useState<{ message: string; type: 'success' | 'error' | 'info' } | null>(null);
@@ -342,6 +343,18 @@ export function Proxies() {
/>
)}
{/* Edit Proxy Modal */}
{editingProxy && (
<EditProxyModal
proxy={editingProxy}
onClose={() => setEditingProxy(null)}
onSuccess={() => {
setEditingProxy(null);
loadProxies();
}}
/>
)}
{/* Proxy List */}
<div className="space-y-3">
{proxies.map(proxy => (
@@ -360,6 +373,9 @@ export function Proxies() {
{proxy.is_anonymous && (
<span className="px-2 py-1 text-xs font-medium bg-blue-50 text-blue-700 rounded">Anonymous</span>
)}
{proxy.max_connections > 1 && (
<span className="px-2 py-1 text-xs font-medium bg-orange-50 text-orange-700 rounded">{proxy.max_connections} connections</span>
)}
{(proxy.city || proxy.state || proxy.country) && (
<span className="px-2 py-1 text-xs font-medium bg-purple-50 text-purple-700 rounded flex items-center gap-1">
<MapPin className="w-3 h-3" />
@@ -394,6 +410,13 @@ export function Proxies() {
</div>
<div className="flex gap-2">
<button
onClick={() => setEditingProxy(proxy)}
className="inline-flex items-center gap-1 px-3 py-1.5 bg-gray-50 text-gray-700 rounded-lg hover:bg-gray-100 transition-colors text-sm font-medium"
>
<Edit2 className="w-4 h-4" />
Edit
</button>
{!proxy.active ? (
<button
onClick={() => handleRetest(proxy.id)}
@@ -762,3 +785,157 @@ function AddProxyForm({ onClose, onSuccess }: { onClose: () => void; onSuccess:
</div>
);
}
function EditProxyModal({ proxy, onClose, onSuccess }: { proxy: any; onClose: () => void; onSuccess: () => void }) {
const [host, setHost] = useState(proxy.host || '');
const [port, setPort] = useState(proxy.port?.toString() || '');
const [protocol, setProtocol] = useState(proxy.protocol || 'http');
const [username, setUsername] = useState(proxy.username || '');
const [password, setPassword] = useState(proxy.password || '');
const [maxConnections, setMaxConnections] = useState(proxy.max_connections?.toString() || '1');
const [saving, setSaving] = useState(false);
const [notification, setNotification] = useState<{ message: string; type: 'success' | 'error' | 'info' } | null>(null);
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
setSaving(true);
try {
await api.updateProxy(proxy.id, {
host,
port: parseInt(port),
protocol,
username: username || undefined,
password: password || undefined,
max_connections: parseInt(maxConnections) || 1,
});
onSuccess();
} catch (error: any) {
setNotification({ message: 'Failed to update proxy: ' + error.message, type: 'error' });
} finally {
setSaving(false);
}
};
return (
<div className="fixed inset-0 bg-black/50 flex items-center justify-center z-50">
<div className="bg-white rounded-xl shadow-xl max-w-md w-full mx-4">
{notification && (
<Toast
message={notification.message}
type={notification.type}
onClose={() => setNotification(null)}
/>
)}
<div className="p-6">
<div className="flex justify-between items-center mb-6">
<h2 className="text-xl font-semibold text-gray-900">Edit Proxy</h2>
<button onClick={onClose} className="p-2 hover:bg-gray-100 rounded-lg transition-colors">
<X className="w-5 h-5 text-gray-500" />
</button>
</div>
<form onSubmit={handleSubmit}>
<div className="space-y-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Host</label>
<input
type="text"
value={host}
onChange={(e) => setHost(e.target.value)}
required
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
/>
</div>
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Port</label>
<input
type="number"
value={port}
onChange={(e) => setPort(e.target.value)}
required
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Protocol</label>
<select
value={protocol}
onChange={(e) => setProtocol(e.target.value)}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
>
<option value="http">HTTP</option>
<option value="https">HTTPS</option>
<option value="socks5">SOCKS5</option>
</select>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Username</label>
<input
type="text"
value={username}
onChange={(e) => setUsername(e.target.value)}
placeholder="Optional"
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Password</label>
<input
type="password"
value={password}
onChange={(e) => setPassword(e.target.value)}
placeholder="Optional"
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">Max Connections</label>
<input
type="number"
value={maxConnections}
onChange={(e) => setMaxConnections(e.target.value)}
min="1"
max="500"
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
/>
<p className="text-xs text-gray-500 mt-1">For rotating proxies - allows concurrent connections</p>
</div>
</div>
<div className="flex justify-end gap-3 mt-6 pt-6 border-t border-gray-200">
<button
type="button"
onClick={onClose}
className="px-4 py-2 text-gray-700 hover:bg-gray-50 rounded-lg transition-colors font-medium"
>
Cancel
</button>
<button
type="submit"
disabled={saving}
className="inline-flex items-center gap-2 px-4 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 transition-colors font-medium disabled:opacity-50"
>
{saving ? (
<>
<div className="w-4 h-4 border-2 border-white border-t-transparent rounded-full animate-spin"></div>
Saving...
</>
) : (
'Save Changes'
)}
</button>
</div>
</form>
</div>
</div>
</div>
);
}

View File

@@ -3,6 +3,7 @@ import { Layout } from '../components/Layout';
import { Tag, Package, Store, Percent, Search, Filter, ArrowUpDown, ExternalLink } from 'lucide-react';
import { useNavigate, useSearchParams } from 'react-router-dom';
import { api } from '../lib/api';
import { getImageUrl, ImageSizes } from '../lib/images';
interface Special {
variant_id: number;
@@ -284,7 +285,7 @@ export function Specials() {
<div className="relative">
{special.image_url ? (
<img
src={special.image_url}
src={getImageUrl(special.image_url, ImageSizes.medium) || special.image_url}
alt={special.product_name}
className="w-full h-32 object-cover"
/>

View File

@@ -2,6 +2,7 @@ import { useEffect, useState } from 'react';
import { useParams, useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { getImageUrl as getResizedImageUrl, ImageSizes } from '../lib/images';
import {
Package, Tag, Zap, Clock, ExternalLink, CheckCircle, XCircle,
AlertCircle, Building, MapPin, RefreshCw, Calendar, Activity
@@ -101,9 +102,10 @@ export function StoreDetail() {
};
const getImageUrl = (product: any) => {
if (product.image_url_full) return product.image_url_full;
if (product.medium_path) return `/api/images/dutchie/${product.medium_path}`;
if (product.thumbnail_path) return `/api/images/dutchie/${product.thumbnail_path}`;
const rawUrl = product.image_url_full || product.image_url || product.medium_path || product.thumbnail_path;
if (rawUrl) {
return getResizedImageUrl(rawUrl, ImageSizes.medium) || rawUrl;
}
return 'https://via.placeholder.com/300x300?text=No+Image';
};

View File

@@ -3,6 +3,7 @@ import { useParams, useNavigate } from 'react-router-dom';
import { Layout } from '../components/Layout';
import { api } from '../lib/api';
import { trackProductView } from '../lib/analytics';
import { getImageUrl, ImageSizes } from '../lib/images';
import {
Building2,
Phone,
@@ -470,7 +471,7 @@ export function StoreDetailPage() {
<td className="whitespace-nowrap">
{product.image_url ? (
<img
src={product.image_url}
src={getImageUrl(product.image_url, ImageSizes.thumb) || product.image_url}
alt={product.name}
className="w-12 h-12 object-cover rounded"
onError={(e) => e.currentTarget.style.display = 'none'}

View File

@@ -0,0 +1,525 @@
import { useState, useEffect } from 'react';
import { api } from '../lib/api';
import { Layout } from '../components/Layout';
import {
ListChecks,
Clock,
CheckCircle2,
XCircle,
AlertTriangle,
PlayCircle,
RefreshCw,
Search,
ChevronDown,
ChevronUp,
Gauge,
Users,
Calendar,
Zap,
} from 'lucide-react';
interface Task {
id: number;
role: string;
dispensary_id: number | null;
dispensary_name?: string;
platform: string | null;
status: string;
priority: number;
scheduled_for: string | null;
worker_id: string | null;
claimed_at: string | null;
started_at: string | null;
completed_at: string | null;
error_message: string | null;
retry_count: number;
created_at: string;
duration_sec?: number;
}
interface CapacityMetric {
role: string;
pending_tasks: number;
ready_tasks: number;
claimed_tasks: number;
running_tasks: number;
completed_last_hour: number;
failed_last_hour: number;
active_workers: number;
avg_duration_sec: number | null;
tasks_per_worker_hour: number | null;
estimated_hours_to_drain: number | null;
workers_needed?: {
for_1_hour: number;
for_4_hours: number;
for_8_hours: number;
};
}
interface TaskCounts {
pending: number;
claimed: number;
running: number;
completed: number;
failed: number;
stale: number;
}
const ROLES = [
'store_discovery',
'entry_point_discovery',
'product_discovery',
'product_refresh',
'analytics_refresh',
];
const STATUS_COLORS: Record<string, string> = {
pending: 'bg-yellow-100 text-yellow-800',
claimed: 'bg-blue-100 text-blue-800',
running: 'bg-indigo-100 text-indigo-800',
completed: 'bg-green-100 text-green-800',
failed: 'bg-red-100 text-red-800',
stale: 'bg-gray-100 text-gray-800',
};
const STATUS_ICONS: Record<string, React.ReactNode> = {
pending: <Clock className="w-4 h-4" />,
claimed: <PlayCircle className="w-4 h-4" />,
running: <RefreshCw className="w-4 h-4 animate-spin" />,
completed: <CheckCircle2 className="w-4 h-4" />,
failed: <XCircle className="w-4 h-4" />,
stale: <AlertTriangle className="w-4 h-4" />,
};
function formatDuration(seconds: number | null): string {
if (seconds === null) return '-';
if (seconds < 60) return `${Math.round(seconds)}s`;
if (seconds < 3600) return `${Math.floor(seconds / 60)}m ${Math.round(seconds % 60)}s`;
return `${Math.floor(seconds / 3600)}h ${Math.floor((seconds % 3600) / 60)}m`;
}
function formatTimeAgo(dateStr: string | null): string {
if (!dateStr) return '-';
const date = new Date(dateStr);
const now = new Date();
const diff = (now.getTime() - date.getTime()) / 1000;
if (diff < 60) return `${Math.round(diff)}s ago`;
if (diff < 3600) return `${Math.floor(diff / 60)}m ago`;
if (diff < 86400) return `${Math.floor(diff / 3600)}h ago`;
return `${Math.floor(diff / 86400)}d ago`;
}
export default function TasksDashboard() {
const [tasks, setTasks] = useState<Task[]>([]);
const [counts, setCounts] = useState<TaskCounts | null>(null);
const [capacity, setCapacity] = useState<CapacityMetric[]>([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
// Filters
const [roleFilter, setRoleFilter] = useState<string>('');
const [statusFilter, setStatusFilter] = useState<string>('');
const [searchQuery, setSearchQuery] = useState('');
const [showCapacity, setShowCapacity] = useState(true);
// Actions
const [actionLoading, setActionLoading] = useState(false);
const [actionMessage, setActionMessage] = useState<string | null>(null);
const fetchData = async () => {
try {
const [tasksRes, countsRes, capacityRes] = await Promise.all([
api.getTasks({
role: roleFilter || undefined,
status: statusFilter || undefined,
limit: 100,
}),
api.getTaskCounts(),
api.getTaskCapacity(),
]);
setTasks(tasksRes.tasks || []);
setCounts(countsRes);
setCapacity(capacityRes.metrics || []);
setError(null);
} catch (err: any) {
setError(err.message || 'Failed to load tasks');
} finally {
setLoading(false);
}
};
useEffect(() => {
fetchData();
const interval = setInterval(fetchData, 10000); // Refresh every 10 seconds
return () => clearInterval(interval);
}, [roleFilter, statusFilter]);
const handleGenerateResync = async () => {
setActionLoading(true);
try {
const result = await api.generateResyncTasks();
setActionMessage(`Generated ${result.tasks_created} resync tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const handleRecoverStale = async () => {
setActionLoading(true);
try {
const result = await api.recoverStaleTasks();
setActionMessage(`Recovered ${result.tasks_recovered} stale tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const filteredTasks = tasks.filter((task) => {
if (searchQuery) {
const query = searchQuery.toLowerCase();
return (
task.role.toLowerCase().includes(query) ||
task.dispensary_name?.toLowerCase().includes(query) ||
task.worker_id?.toLowerCase().includes(query) ||
String(task.id).includes(query)
);
}
return true;
});
const totalActive = (counts?.claimed || 0) + (counts?.running || 0);
const totalPending = counts?.pending || 0;
if (loading) {
return (
<Layout>
<div className="flex items-center justify-center h-64">
<RefreshCw className="w-8 h-8 animate-spin text-emerald-600" />
</div>
</Layout>
);
}
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<div>
<h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2">
<ListChecks className="w-7 h-7 text-emerald-600" />
Task Queue
</h1>
<p className="text-gray-500 mt-1">
{totalActive} active, {totalPending} pending tasks
</p>
</div>
<div className="flex gap-2">
<button
onClick={handleGenerateResync}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 disabled:opacity-50"
>
<Calendar className="w-4 h-4" />
Generate Resync
</button>
<button
onClick={handleRecoverStale}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-gray-600 text-white rounded-lg hover:bg-gray-700 disabled:opacity-50"
>
<Zap className="w-4 h-4" />
Recover Stale
</button>
<button
onClick={fetchData}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
</div>
</div>
{/* Action Message */}
{actionMessage && (
<div
className={`p-4 rounded-lg ${
actionMessage.startsWith('Error')
? 'bg-red-50 text-red-700'
: 'bg-green-50 text-green-700'
}`}
>
{actionMessage}
</div>
)}
{error && (
<div className="p-4 bg-red-50 text-red-700 rounded-lg">{error}</div>
)}
{/* Status Summary Cards */}
<div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-4">
{Object.entries(counts || {}).map(([status, count]) => (
<div
key={status}
className={`p-4 rounded-lg border ${
statusFilter === status ? 'ring-2 ring-emerald-500' : ''
} cursor-pointer hover:shadow-md transition-shadow`}
onClick={() => setStatusFilter(statusFilter === status ? '' : status)}
>
<div className="flex items-center gap-2 mb-2">
<span className={`p-1.5 rounded ${STATUS_COLORS[status]}`}>
{STATUS_ICONS[status]}
</span>
<span className="text-sm font-medium text-gray-600 capitalize">{status}</span>
</div>
<div className="text-2xl font-bold text-gray-900">{count}</div>
</div>
))}
</div>
{/* Capacity Planning Section */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<button
onClick={() => setShowCapacity(!showCapacity)}
className="w-full flex items-center justify-between p-4 hover:bg-gray-50"
>
<div className="flex items-center gap-2">
<Gauge className="w-5 h-5 text-emerald-600" />
<span className="font-medium text-gray-900">Capacity Planning</span>
</div>
{showCapacity ? (
<ChevronUp className="w-5 h-5 text-gray-400" />
) : (
<ChevronDown className="w-5 h-5 text-gray-400" />
)}
</button>
{showCapacity && (
<div className="p-4 border-t border-gray-200">
{capacity.length === 0 ? (
<p className="text-gray-500 text-center py-4">No capacity data available</p>
) : (
<div className="overflow-x-auto">
<table className="min-w-full divide-y divide-gray-200">
<thead>
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Role
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Pending
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Running
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Active Workers
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Avg Duration
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Tasks/Worker/Hr
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Est. Drain Time
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Completed/Hr
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Failed/Hr
</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{capacity.map((metric) => (
<tr key={metric.role} className="hover:bg-gray-50">
<td className="px-4 py-3 text-sm font-medium text-gray-900">
{metric.role.replace(/_/g, ' ')}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.pending_tasks}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.running_tasks}
</td>
<td className="px-4 py-3 text-sm text-right">
<span className="inline-flex items-center gap-1">
<Users className="w-4 h-4 text-gray-400" />
{metric.active_workers}
</span>
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{formatDuration(metric.avg_duration_sec)}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.tasks_per_worker_hour?.toFixed(1) || '-'}
</td>
<td className="px-4 py-3 text-sm text-right">
{metric.estimated_hours_to_drain ? (
<span
className={
metric.estimated_hours_to_drain > 4
? 'text-red-600 font-medium'
: 'text-gray-600'
}
>
{metric.estimated_hours_to_drain.toFixed(1)}h
</span>
) : (
'-'
)}
</td>
<td className="px-4 py-3 text-sm text-right text-green-600">
{metric.completed_last_hour}
</td>
<td className="px-4 py-3 text-sm text-right text-red-600">
{metric.failed_last_hour}
</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
)}
</div>
{/* Filters */}
<div className="flex flex-col sm:flex-row gap-4">
<div className="relative flex-1">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-5 h-5 text-gray-400" />
<input
type="text"
placeholder="Search tasks..."
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
className="w-full pl-10 pr-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500 focus:border-emerald-500"
/>
</div>
<select
value={roleFilter}
onChange={(e) => setRoleFilter(e.target.value)}
className="px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500"
>
<option value="">All Roles</option>
{ROLES.map((role) => (
<option key={role} value={role}>
{role.replace(/_/g, ' ')}
</option>
))}
</select>
<select
value={statusFilter}
onChange={(e) => setStatusFilter(e.target.value)}
className="px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500"
>
<option value="">All Statuses</option>
<option value="pending">Pending</option>
<option value="claimed">Claimed</option>
<option value="running">Running</option>
<option value="completed">Completed</option>
<option value="failed">Failed</option>
<option value="stale">Stale</option>
</select>
</div>
{/* Tasks Table */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="overflow-x-auto">
<table className="min-w-full divide-y divide-gray-200">
<thead className="bg-gray-50">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
ID
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Role
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Store
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Status
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Worker
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Duration
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Created
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Error
</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{filteredTasks.length === 0 ? (
<tr>
<td colSpan={8} className="px-4 py-8 text-center text-gray-500">
No tasks found
</td>
</tr>
) : (
filteredTasks.map((task) => (
<tr key={task.id} className="hover:bg-gray-50">
<td className="px-4 py-3 text-sm font-mono text-gray-600">#{task.id}</td>
<td className="px-4 py-3 text-sm text-gray-900">
{task.role.replace(/_/g, ' ')}
</td>
<td className="px-4 py-3 text-sm text-gray-600">
{task.dispensary_name || task.dispensary_id || '-'}
</td>
<td className="px-4 py-3">
<span
className={`inline-flex items-center gap-1 px-2 py-1 rounded-full text-xs font-medium ${
STATUS_COLORS[task.status]
}`}
>
{STATUS_ICONS[task.status]}
{task.status}
</span>
</td>
<td className="px-4 py-3 text-sm font-mono text-gray-600">
{task.worker_id?.split('-').slice(-1)[0] || '-'}
</td>
<td className="px-4 py-3 text-sm text-gray-600">
{formatDuration(task.duration_sec ?? null)}
</td>
<td className="px-4 py-3 text-sm text-gray-500">
{formatTimeAgo(task.created_at)}
</td>
<td className="px-4 py-3 text-sm text-red-600 max-w-xs truncate">
{task.error_message || '-'}
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
</div>
</Layout>
);
}

View File

@@ -141,13 +141,21 @@ export function Users() {
};
const canModifyUser = (user: User) => {
// Can't modify yourself
if (currentUser?.id === user.id) return false;
// Only superadmin can modify superadmin users
if (user.role === 'superadmin' && currentUser?.role !== 'superadmin') return false;
return true;
};
const canDeleteUser = (user: User) => {
// Can't delete yourself
if (currentUser?.id === user.id) return false;
// Only superadmin can delete superadmin users
if (user.role === 'superadmin' && currentUser?.role !== 'superadmin') return false;
return true;
};
const isEditingSelf = (user: User) => currentUser?.id === user.id;
return (
<Layout>
<div className="space-y-6">
@@ -236,15 +244,17 @@ export function Users() {
{new Date(user.created_at).toLocaleDateString()}
</td>
<td className="px-6 py-4 whitespace-nowrap text-right text-sm font-medium">
{canModifyUser(user) ? (
<div className="flex items-center justify-end gap-2">
<div className="flex items-center justify-end gap-2">
{canModifyUser(user) && (
<button
onClick={() => openEditModal(user)}
className="p-1.5 text-gray-400 hover:text-blue-600 hover:bg-blue-50 rounded transition-colors"
title="Edit user"
title={isEditingSelf(user) ? "Edit your profile" : "Edit user"}
>
<Pencil className="w-4 h-4" />
</button>
)}
{canDeleteUser(user) ? (
<button
onClick={() => handleDelete(user)}
className="p-1.5 text-gray-400 hover:text-red-600 hover:bg-red-50 rounded transition-colors"
@@ -252,10 +262,10 @@ export function Users() {
>
<Trash2 className="w-4 h-4" />
</button>
</div>
) : (
<span className="text-xs text-gray-400"></span>
)}
) : !canModifyUser(user) && (
<span className="text-xs text-gray-400"></span>
)}
</div>
</td>
</tr>
))}
@@ -349,11 +359,15 @@ export function Users() {
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Role
{editingUser && currentUser?.id === editingUser.id && (
<span className="ml-2 text-xs text-gray-400 font-normal">(cannot change your own role)</span>
)}
</label>
<select
value={formData.role}
onChange={(e) => setFormData({ ...formData, role: e.target.value })}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
disabled={editingUser !== null && currentUser?.id === editingUser.id}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-blue-500 disabled:bg-gray-100 disabled:cursor-not-allowed"
>
<option value="viewer">Viewer</option>
<option value="analyst">Analyst</option>

File diff suppressed because it is too large

353
docs/CRAWL_SYSTEM_V2.md Normal file
View File

@@ -0,0 +1,353 @@
# CannaiQ Crawl System V2
## Overview
The CannaiQ Crawl System is a GraphQL-based data pipeline that discovers and monitors cannabis dispensaries using the Dutchie platform. It operates in two phases:
1. **Phase 1: Store Discovery** - Weekly discovery of Dutchie-powered dispensaries
2. **Phase 2: Product Crawling** - Regular product/price/stock updates (documented separately)
---
## Phase 1: Store Discovery
### Purpose
Automatically discover and maintain a database of dispensaries that use Dutchie menus across all US states.
### Schedule
- **Frequency**: Weekly (typically Sunday night)
- **Duration**: ~2-4 hours for full US coverage
### Flow Diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│ PHASE 1: STORE DISCOVERY │
└─────────────────────────────────────────────────────────────────────┘
1. IDENTITY SETUP
┌──────────────────┐
│ getRandomProxy() │ ──► Random IP from proxy pool
└──────────────────┘
┌──────────────────┐
│ startSession() │ ──► Random UA + fingerprint + locale matching proxy location
└──────────────────┘
2. CITY DISCOVERY (per state)
┌──────────────────────────────┐
│ GraphQL: getAllCitiesByState │ ──► Returns cities with active dispensaries
└──────────────────────────────┘
┌──────────────────────────────┐
│ Upsert dutchie_discovery_ │
│ cities table │
└──────────────────────────────┘
3. STORE DISCOVERY (per city)
┌───────────────────────────────┐
│ GraphQL: ConsumerDispensaries │ ──► Returns store data for city
└───────────────────────────────┘
┌───────────────────────────────┐
│ Upsert dutchie_discovery_ │
│ locations table │
└───────────────────────────────┘
4. VALIDATION & PROMOTION
┌──────────────────────────┐
│ validateForPromotion() │ ──► Check required fields
└──────────────────────────┘
┌──────────────────────────┐
│ promoteLocation() │ ──► Upsert to dispensaries table
└──────────────────────────┘
┌──────────────────────────┐
│ ensureCrawlerProfile() │ ──► Create profile with status='sandbox'
└──────────────────────────┘
5. DROPPED STORE DETECTION
┌──────────────────────────┐
│ detectDroppedStores() │ ──► Find stores missing from discovery
└──────────────────────────┘
┌──────────────────────────┐
│ Mark status='dropped' │ ──► Dashboard alert for review
└──────────────────────────┘
```
---
## Key Files
| File | Purpose |
|------|---------|
| `backend/src/platforms/dutchie/client.ts` | HTTP client with proxy/fingerprint rotation |
| `backend/src/discovery/discovery-crawler.ts` | Main discovery orchestrator |
| `backend/src/discovery/location-discovery.ts` | City/store GraphQL fetching |
| `backend/src/discovery/promotion.ts` | Validation and promotion logic |
| `backend/src/scripts/run-discovery.ts` | CLI entry point |
---
## Identity Masking
Before any GraphQL queries, the system establishes a masked identity:
### 1. Proxy Selection
```typescript
// backend/src/platforms/dutchie/client.ts
// Get random proxy from active pool (NOT state-specific)
const proxy = await getRandomProxy();
setProxy(proxy.url);
```
The proxy is selected randomly from the active proxy pool. It is NOT geo-targeted to the state being crawled.
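A minimal sketch of what `getRandomProxy()` might look like is below. It assumes a pg `Pool` and a `proxies` table with `host`, `port`, `protocol`, `username`, `password`, and `active` columns; the real column names and helper signature may differ.
```typescript
// Hypothetical sketch of getRandomProxy(); not the actual implementation in client.ts.
// Assumes a `proxies` table with an `active` flag; real column names may differ.
import { Pool } from 'pg';

export async function getRandomProxy(pool: Pool): Promise<{ url: string }> {
  const { rows } = await pool.query(
    `SELECT host, port, protocol, username, password
       FROM proxies
      WHERE active = true
      ORDER BY random()
      LIMIT 1`
  );
  if (rows.length === 0) {
    // Matches the rule that workers refuse to start without active proxies
    throw new Error('No active proxies available');
  }
  const p = rows[0];
  const auth = p.username
    ? `${encodeURIComponent(p.username)}:${encodeURIComponent(p.password ?? '')}@`
    : '';
  return { url: `${p.protocol}://${auth}${p.host}:${p.port}` };
}
```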
### 2. Fingerprint + Locale Harmonization
```typescript
// backend/src/platforms/dutchie/client.ts
function startSession(stateCode: string, timezone: string) {
// 1. Random browser fingerprint (Chrome/Firefox/Safari/Edge variants)
const fingerprint = getRandomFingerprint();
// 2. Match Accept-Language to proxy's timezone/location
const locale = getLocaleForTimezone(timezone);
// 3. Set headers for this session
currentSession = {
userAgent: fingerprint.ua,
acceptLanguage: locale,
secChUa: fingerprint.secChUa,
// ... other fingerprint headers
};
}
```
### Fingerprint Pool
6 browser fingerprints rotate on each session and on 403 errors:
| Browser | Version | Platform |
|---------|---------|----------|
| Chrome | 120 | Windows |
| Chrome | 120 | macOS |
| Firefox | 121 | Windows |
| Firefox | 121 | macOS |
| Safari | 17.2 | macOS |
| Edge | 120 | Windows |
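A pool like this might be rotated along the following lines (an illustrative sketch, not the code in `client.ts`; `FINGERPRINTS` and `rotateFingerprint` are placeholder names):
```typescript
// Illustrative only: a minimal fingerprint pool with rotation on session start and on 403.
interface Fingerprint {
  ua: string;        // User-Agent header
  secChUa: string;   // sec-ch-ua header (Chromium-based browsers only)
  platform: string;  // sec-ch-ua-platform
}

const FINGERPRINTS: Fingerprint[] = [
  {
    ua: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    secChUa: '"Chromium";v="120", "Google Chrome";v="120", "Not?A_Brand";v="24"',
    platform: '"Windows"',
  },
  // ...one entry per row of the table above
];

export function getRandomFingerprint(): Fingerprint {
  return FINGERPRINTS[Math.floor(Math.random() * FINGERPRINTS.length)];
}

// On a 403, switch to a fingerprint different from the one currently in use.
export function rotateFingerprint(current: Fingerprint): Fingerprint {
  const others = FINGERPRINTS.filter((f) => f.ua !== current.ua);
  return others.length > 0
    ? others[Math.floor(Math.random() * others.length)]
    : current;
}
```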
### Timezone → Locale Mapping
```typescript
const TIMEZONE_TO_LOCALE: Record<string, string> = {
'America/New_York': 'en-US,en;q=0.9',
'America/Chicago': 'en-US,en;q=0.9',
'America/Denver': 'en-US,en;q=0.9',
'America/Los_Angeles': 'en-US,en;q=0.9',
'America/Phoenix': 'en-US,en;q=0.9',
// ...
};
```
---
## GraphQL Queries
### 1. getAllCitiesByState
Fetches cities with active dispensaries for a state.
```typescript
// backend/src/discovery/location-discovery.ts
const response = await executeGraphQL({
  operationName: 'getAllCitiesByState',
  variables: {
    state: 'AZ',
    countryCode: 'US'
  }
});

// Returns: { cities: [{ name: 'Phoenix', slug: 'phoenix' }, ...] }
```
**Hash**: `ae547a0466ace5a48f91e55bf6699eacd87e3a42841560f0c0eabed5a0a920e6`
### 2. ConsumerDispensaries
Fetches store data for a city/state.
```typescript
// backend/src/discovery/location-discovery.ts
const response = await executeGraphQL({
  operationName: 'ConsumerDispensaries',
  variables: {
    dispensaryFilter: {
      city: 'Phoenix',
      state: 'AZ',
      activeOnly: true
    }
  }
});

// Returns: [{ id, name, address, coords, menuUrl, ... }, ...]
```
**Hash**: `0a5bfa6ca1d64ae47bcccb7c8077c87147cbc4e6982c17ceec97a2a4948b311b`
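The hashes above suggest persisted GraphQL queries. A hedged sketch of how `executeGraphQL` could attach such a hash; the exact endpoint and `extensions` wire format are assumptions for illustration and are not confirmed by this document:

```typescript
// Illustrative only - the real client lives in backend/src/platforms/dutchie/client.ts
async function executeGraphQL<T>(opts: {
  operationName: string;
  variables: Record<string, unknown>;
  sha256Hash: string; // persisted query hash, e.g. the values listed above
}): Promise<T> {
  const res = await fetch('https://dutchie.com/graphql', { // endpoint assumed for illustration
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      // user-agent, accept-language, sec-ch-ua come from the current session (startSession)
    },
    body: JSON.stringify({
      operationName: opts.operationName,
      variables: opts.variables,
      extensions: { persistedQuery: { version: 1, sha256Hash: opts.sha256Hash } },
    }),
  });
  if (!res.ok) throw new Error(`GraphQL request failed: ${res.status}`);
  return res.json() as Promise<T>;
}
```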
---
## Database Tables
### Discovery Tables (Staging)
| Table | Purpose |
|-------|---------|
| `dutchie_discovery_cities` | Cities known to have dispensaries |
| `dutchie_discovery_locations` | Raw discovered store data |
### Canonical Tables
| Table | Purpose |
|-------|---------|
| `dispensaries` | Promoted stores ready for crawling |
| `dispensary_crawler_profiles` | Crawler configuration per store |
| `dutchie_promotion_log` | Audit trail for all discovery actions |
---
## Validation Rules
A discovery location must have these fields to be promoted:
| Field | Requirement |
|-------|-------------|
| `platform_location_id` | MongoDB ObjectId (24 hex chars) |
| `name` | Non-empty string |
| `city` | Non-empty string |
| `state_code` | Non-empty string |
| `platform_menu_url` | Valid URL |
Invalid records are marked `status='rejected'` with errors logged.
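A minimal sketch of how `validateForPromotion()` could enforce the table above (field names follow the table; the real implementation in `backend/src/discovery/promotion.ts` may differ):

```typescript
interface DiscoveryLocation {
  platform_location_id: string | null;
  name: string | null;
  city: string | null;
  state_code: string | null;
  platform_menu_url: string | null;
}

// Returns a list of validation errors; an empty list means the location can be promoted.
export function validateForPromotion(loc: DiscoveryLocation): string[] {
  const errors: string[] = [];

  // MongoDB ObjectId: exactly 24 hex characters
  if (!loc.platform_location_id || !/^[0-9a-f]{24}$/i.test(loc.platform_location_id)) {
    errors.push('platform_location_id must be a 24-char hex ObjectId');
  }
  if (!loc.name?.trim()) errors.push('name is required');
  if (!loc.city?.trim()) errors.push('city is required');
  if (!loc.state_code?.trim()) errors.push('state_code is required');

  // platform_menu_url must parse as a URL
  try {
    new URL(loc.platform_menu_url ?? '');
  } catch {
    errors.push('platform_menu_url must be a valid URL');
  }

  // Non-empty list -> caller marks the record status='rejected' and logs the errors
  return errors;
}
```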
---
## Dropped Store Detection
After discovery, the system identifies stores that may have left the Dutchie platform:
### Detection Criteria
A store is marked as "dropped" if:
1. It has a `platform_dispensary_id` (was previously verified)
2. It's currently `status='open'` and `crawl_enabled=true`
3. It was NOT seen in the latest discovery (no row in `dutchie_discovery_locations` with `last_seen_at` within the last 24 hours)
### Implementation
```typescript
// backend/src/discovery/discovery-crawler.ts
export async function detectDroppedStores(pool: Pool, stateCode?: string) {
  // 1. Find dispensaries not in recent discovery
  // 2. Mark status='dropped'
  // 3. Log to dutchie_promotion_log
  // 4. Return list for dashboard alert
}
```
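A hedged sketch of the lookup behind step 1, following the criteria listed above (table and column names follow this document; the join between `platform_dispensary_id` and `platform_location_id` is an assumption, and the production query may differ):

```typescript
import { Pool } from 'pg';

// Previously verified, currently open and crawl-enabled,
// but not seen in discovery within the last 24 hours.
export async function findDroppedStoreIds(pool: Pool, stateCode?: string): Promise<number[]> {
  const { rows } = await pool.query<{ id: number }>(
    `
    SELECT d.id
    FROM dispensaries d
    WHERE d.platform_dispensary_id IS NOT NULL
      AND d.status = 'open'
      AND d.crawl_enabled = true
      AND ($1::text IS NULL OR d.state_code = $1)
      AND NOT EXISTS (
        SELECT 1
        FROM dutchie_discovery_locations l
        WHERE l.platform_location_id = d.platform_dispensary_id
          AND l.last_seen_at > NOW() - INTERVAL '24 hours'
      )
    `,
    [stateCode ?? null]
  );
  return rows.map((r) => r.id);
}
```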
### Admin UI
- **Dashboard**: Red alert banner when dropped stores exist
- **Dispensaries page**: Filter by `status=dropped` to review
---
## CLI Usage
```bash
# Discover all stores in a state
npx tsx src/scripts/run-discovery.ts discover:state AZ
# Discover all US states
npx tsx src/scripts/run-discovery.ts discover:all
# Dry run (no DB writes)
npx tsx src/scripts/run-discovery.ts discover:state CA --dry-run
# Check stats
npx tsx src/scripts/run-discovery.ts stats
```
---
## Rate Limiting
- **2 seconds** between city requests
- **Exponential backoff** on 429/403 responses
- **Fingerprint rotation** on 403 errors
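
A minimal sketch of the retry policy those bullets describe. The delays and attempt count mirror this page; the error shape (`err.status`) and the `rotateFingerprint()` hook are placeholders for the client's real functions:

```typescript
// Small sleep helper and a placeholder for the client's real fingerprint rotation
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
function rotateFingerprint(): void {
  console.log('[client] rotating fingerprint');
}

export async function fetchWithBackoff<T>(
  doRequest: () => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let delayMs = 2_000; // base delay, matching the 2s pacing between city requests
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await doRequest();
    } catch (err: any) {
      if (attempt === maxAttempts) throw err;
      if (err?.status === 403) {
        rotateFingerprint();   // 403: new identity, retry promptly
      } else if (err?.status === 429) {
        await sleep(30_000);   // 429: rate limited, wait 30s
      } else {
        await sleep(delayMs);  // timeout/network error: exponential backoff
        delayMs *= 2;
      }
    }
  }
  throw new Error('unreachable');
}
```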
---
## Error Handling
| Error | Action |
|-------|--------|
| 403 Forbidden | Rotate fingerprint, retry |
| 429 Rate Limited | Wait 30s, retry |
| Network timeout | Retry up to 3 times |
| GraphQL error | Log and continue to next city |
---
## Monitoring
### Logs
Discovery progress is logged to stdout:
```
[Discovery] Starting discovery for state: AZ
[Discovery] Step 1: Initializing proxy...
[Discovery] Step 2: Fetching cities...
[Discovery] Found 45 cities for AZ
[Discovery] Step 3: Discovering locations...
[Discovery] City 1/45: Phoenix - found 28 stores
...
[Discovery] Step 4: Auto-promoting discovered locations...
[Discovery] Created: 5 new dispensaries
[Discovery] Updated: 40 existing dispensaries
[Discovery] Step 5: Detecting dropped stores...
[Discovery] Found 2 dropped stores
```
### Audit Log
All actions logged to `dutchie_promotion_log`:
| Action | Description |
|--------|-------------|
| `promoted_create` | New dispensary created |
| `promoted_update` | Existing dispensary updated |
| `rejected` | Validation failed |
| `dropped` | Store not found in discovery |
---
## Next: Phase 2
See `docs/PRODUCT_CRAWL_V2.md` for the product crawling phase (coming next).

docs/WORKER_SYSTEM.md Normal file
View File

@@ -0,0 +1,408 @@
# CannaiQ Worker System
## Overview
The Worker System is a role-based task queue that processes background jobs. All tasks go into a single pool, and workers claim tasks based on their assigned role.
---
## Design Pattern: Single Pool, Role-Based Claiming
```
         ┌──────────────────────────────────────────┐
         │         TASK POOL (worker_tasks)         │
         │                                          │
         │   ┌──────────────────────────────────┐   │
         │   │ role=store_discovery    pending  │   │
         │   │ role=product_resync     pending  │   │
         │   │ role=product_resync     pending  │   │
         │   │ role=product_resync     pending  │   │
         │   │ role=analytics_refresh  pending  │   │
         │   │ role=entry_point_disc   pending  │   │
         │   └──────────────────────────────────┘   │
         └──────────────────────────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         │                    │                    │
         ▼                    ▼                    ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ WORKER           │ │ WORKER           │ │ WORKER           │
│ role=product_    │ │ role=product_    │ │ role=store_      │
│ resync           │ │ resync           │ │ discovery        │
│                  │ │                  │ │                  │
│ Claims ONLY      │ │ Claims ONLY      │ │ Claims ONLY      │
│ product_resync   │ │ product_resync   │ │ store_discovery  │
│ tasks            │ │ tasks            │ │ tasks            │
└──────────────────┘ └──────────────────┘ └──────────────────┘
```
**Key Points:**
- All tasks go into ONE table (`worker_tasks`)
- Each worker is assigned ONE role at startup
- Workers only claim tasks matching their role
- Multiple workers can share the same role (horizontal scaling)
---
## Worker Roles
| Role | Purpose | Per-Store? | Schedule |
|------|---------|------------|----------|
| `store_discovery` | Find new dispensaries via GraphQL | No | Weekly |
| `entry_point_discovery` | Resolve platform IDs from menu URLs | Yes | On-demand |
| `product_discovery` | Initial product fetch for new stores | Yes | On-demand |
| `product_resync` | Regular price/stock updates | Yes | Every 4 hours |
| `analytics_refresh` | Refresh materialized views | No | Daily |
---
## Task Lifecycle
```
pending → claimed → running → completed
                        │
                        └─→ failed
                              (retry if < max_retries)
```
| Status | Meaning |
|--------|---------|
| `pending` | Waiting to be claimed |
| `claimed` | Worker has claimed, not yet started |
| `running` | Worker is actively processing |
| `completed` | Successfully finished |
| `failed` | Error occurred |
| `stale` | Worker died (heartbeat timeout) |
---
## Task Chaining
Tasks automatically create follow-up tasks:
```
store_discovery (finds new stores)
  └─ Returns newStoreIds[] in result
        │
        ▼
entry_point_discovery (for each new store)
  └─ Resolves platform_dispensary_id
        │
        ▼
product_discovery (initial crawl)
        │
        ▼
(store enters regular schedule)
        │
        ▼
product_resync (every 4 hours)
```
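A hedged sketch of how a handler result can drive the chain above: the `store_discovery` handler returns `newStoreIds`, and a follow-up `entry_point_discovery` task is inserted for each. Column names follow the `worker_tasks` schema shown later in this document; the real task service may differ:

```typescript
import { Pool } from 'pg';

interface StoreDiscoveryResult {
  newStoreIds: number[];
}

// Called by the worker after the store_discovery handler completes successfully.
export async function enqueueFollowUps(pool: Pool, result: StoreDiscoveryResult): Promise<void> {
  for (const dispensaryId of result.newStoreIds) {
    await pool.query(
      `INSERT INTO worker_tasks (role, dispensary_id, platform, status, priority)
       VALUES ('entry_point_discovery', $1, 'dutchie', 'pending', 10)`,
      [dispensaryId]
    );
  }
}
```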
---
## How Claiming Works
### 1. Worker starts with a role
```bash
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts
```
### 2. Worker loop polls for tasks
```typescript
// Simplified worker loop
while (running) {
  const task = await claimTask(this.role, this.workerId);
  if (!task) {
    await sleep(5000); // No tasks, wait 5 seconds
    continue;
  }
  await processTask(task);
}
```
### 3. SQL function claims atomically
```sql
-- claim_task(role, worker_id)
UPDATE worker_tasks
SET status = 'claimed', worker_id = $2, claimed_at = NOW()
WHERE id = (
  SELECT id FROM worker_tasks
  WHERE role = $1                            -- Filter by worker's role
    AND status = 'pending'
    AND (scheduled_for IS NULL OR scheduled_for <= NOW())
    AND dispensary_id NOT IN (               -- Per-store locking
      SELECT dispensary_id FROM worker_tasks
      WHERE status IN ('claimed', 'running')
    )
  ORDER BY priority DESC, created_at ASC     -- Priority ordering
  LIMIT 1
  FOR UPDATE SKIP LOCKED                     -- Atomic, no race conditions
)
RETURNING *;
```
**Key Features:**
- `FOR UPDATE SKIP LOCKED` - Prevents race conditions between workers
- Role filtering - Worker only sees tasks for its role
- Per-store locking - Only one active task per dispensary
- Priority ordering - Higher priority tasks first
- Scheduled tasks - Respects `scheduled_for` timestamp
---
## Heartbeat & Stale Recovery
Workers send heartbeats every 30 seconds while processing:
```typescript
// During task processing (async callback so the query can be awaited)
setInterval(async () => {
  await pool.query(
    'UPDATE worker_tasks SET last_heartbeat_at = NOW() WHERE id = $1',
    [taskId]
  );
}, 30000);
```
If a worker dies, its tasks are recovered:
```sql
-- recover_stale_tasks(threshold_minutes), shown here with the default 10-minute threshold
UPDATE worker_tasks
SET status = 'pending', worker_id = NULL, retry_count = retry_count + 1
WHERE status IN ('claimed', 'running')
  AND last_heartbeat_at < NOW() - INTERVAL '10 minutes'
  AND retry_count < max_retries;
```
---
## Scheduling
### Daily Resync Generation
```sql
SELECT generate_resync_tasks(6, CURRENT_DATE); -- 6 batches = every 4 hours
```
Creates staggered tasks:
| Batch | Time | Stores |
|-------|------|--------|
| 1 | 00:00 | 1-50 |
| 2 | 04:00 | 51-100 |
| 3 | 08:00 | 101-150 |
| 4 | 12:00 | 151-200 |
| 5 | 16:00 | 201-250 |
| 6 | 20:00 | 251-300 |
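
The stagger works out to 24 h / 6 batches = one batch every 4 hours. A minimal sketch of the same arithmetic, assuming stores are simply split into equal batches (the SQL function `generate_resync_tasks` remains the source of truth):

```typescript
// Split storeIds into `batches` equal groups and stagger them across the day.
export function scheduleResyncBatches(storeIds: number[], batches: number, day: Date) {
  const perBatch = Math.ceil(storeIds.length / batches);
  const minutesBetween = (24 * 60) / batches; // 6 batches -> 240 min -> every 4 hours
  return storeIds.map((dispensaryId, i) => {
    const batch = Math.floor(i / perBatch); // 0-based batch index
    const scheduledFor = new Date(day);
    scheduledFor.setHours(0, batch * minutesBetween, 0, 0); // 00:00, 04:00, 08:00, ...
    return { dispensaryId, batch: batch + 1, scheduledFor };
  });
}
```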
---
## Files
### Core
| File | Purpose |
|------|---------|
| `src/tasks/task-service.ts` | Task CRUD, claiming, capacity metrics |
| `src/tasks/task-worker.ts` | Worker loop, heartbeat, handler dispatch |
| `src/routes/tasks.ts` | REST API endpoints |
| `migrations/074_worker_task_queue.sql` | Database schema + SQL functions |
### Handlers
| File | Role |
|------|------|
| `src/tasks/handlers/store-discovery.ts` | `store_discovery` |
| `src/tasks/handlers/entry-point-discovery.ts` | `entry_point_discovery` |
| `src/tasks/handlers/product-discovery.ts` | `product_discovery` |
| `src/tasks/handlers/product-resync.ts` | `product_resync` |
| `src/tasks/handlers/analytics-refresh.ts` | `analytics_refresh` |
---
## Running Workers
### Local Development
```bash
# Start a single worker
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts
# Start multiple workers (different terminals)
WORKER_ROLE=product_resync WORKER_ID=resync-1 npx tsx src/tasks/task-worker.ts
WORKER_ROLE=product_resync WORKER_ID=resync-2 npx tsx src/tasks/task-worker.ts
WORKER_ROLE=store_discovery npx tsx src/tasks/task-worker.ts
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `WORKER_ROLE` | (required) | Which task role to process |
| `WORKER_ID` | auto-generated | Custom worker identifier |
| `POLL_INTERVAL_MS` | 5000 | How often to check for tasks |
| `HEARTBEAT_INTERVAL_MS` | 30000 | How often to update heartbeat |
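
A minimal sketch of reading that configuration on worker startup, using the same defaults as the table (the actual entry point is `src/tasks/task-worker.ts` and may differ):

```typescript
import { randomUUID } from 'crypto';

const role = process.env.WORKER_ROLE;
if (!role) {
  console.error('[TaskWorker] WORKER_ROLE is required');
  process.exit(1);
}

const config = {
  role,
  // e.g. worker-product_resync-a1b2c3d4 when WORKER_ID is not set
  workerId: process.env.WORKER_ID ?? `worker-${role}-${randomUUID().slice(0, 8)}`,
  pollIntervalMs: Number(process.env.POLL_INTERVAL_MS ?? 5000),
  heartbeatIntervalMs: Number(process.env.HEARTBEAT_INTERVAL_MS ?? 30000),
};

console.log(`[TaskWorker] Starting worker ${config.workerId} for role: ${config.role}`);
```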
### Kubernetes
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-worker-resync
spec:
  replicas: 5  # Scale horizontally
  template:
    spec:
      containers:
        - name: worker
          image: code.cannabrands.app/creationshop/dispensary-scraper:latest
          command: ["npx", "tsx", "src/tasks/task-worker.ts"]
          env:
            - name: WORKER_ROLE
              value: "product_resync"
```
---
## API Endpoints
### Task Management
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/tasks` | List tasks (with filters) |
| POST | `/api/tasks` | Create a task |
| GET | `/api/tasks/:id` | Get task by ID |
| GET | `/api/tasks/counts` | Counts by status |
| GET | `/api/tasks/capacity` | Capacity metrics |
| POST | `/api/tasks/recover-stale` | Recover dead worker tasks |
### Task Generation
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/tasks/generate/resync` | Generate daily resync batch |
| POST | `/api/tasks/generate/discovery` | Create store discovery task |
---
## Capacity Planning
The `v_worker_capacity` view provides metrics:
```sql
SELECT * FROM v_worker_capacity;
```
| Metric | Description |
|--------|-------------|
| `pending_tasks` | Tasks waiting |
| `ready_tasks` | Tasks ready now (scheduled_for passed) |
| `running_tasks` | Tasks being processed |
| `active_workers` | Workers with recent heartbeat |
| `tasks_per_worker_hour` | Throughput estimate |
| `estimated_hours_to_drain` | Time to clear queue |
### Scaling API
```bash
GET /api/tasks/capacity/product_resync
```
```json
{
"pending_tasks": 500,
"active_workers": 3,
"workers_needed": {
"for_1_hour": 10,
"for_4_hours": 3,
"for_8_hours": 2
}
}
```
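
The `workers_needed` figures fall out of simple division: with 500 pending tasks and an observed throughput of about 50 tasks per worker-hour, draining in 1 hour needs ceil(500 / (50 × 1)) = 10 workers, in 4 hours ceil(500 / 200) = 3, and in 8 hours ceil(500 / 400) = 2. A hedged sketch of that calculation (the 50/hour figure is assumed to match the example above, not a measured constant):

```typescript
export function workersNeeded(pendingTasks: number, tasksPerWorkerHour: number) {
  const need = (hours: number) =>
    Math.max(1, Math.ceil(pendingTasks / (tasksPerWorkerHour * hours)));
  return {
    for_1_hour: need(1),
    for_4_hours: need(4),
    for_8_hours: need(8),
  };
}

// workersNeeded(500, 50) -> { for_1_hour: 10, for_4_hours: 3, for_8_hours: 2 }
```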
---
## Database Schema
### worker_tasks
```sql
CREATE TABLE worker_tasks (
  id                 SERIAL PRIMARY KEY,

  -- Task identification
  role               VARCHAR(50) NOT NULL,
  dispensary_id      INTEGER REFERENCES dispensaries(id),
  platform           VARCHAR(20),

  -- State
  status             VARCHAR(20) DEFAULT 'pending',
  priority           INTEGER DEFAULT 0,
  scheduled_for      TIMESTAMPTZ,

  -- Ownership
  worker_id          VARCHAR(100),
  claimed_at         TIMESTAMPTZ,
  started_at         TIMESTAMPTZ,
  completed_at       TIMESTAMPTZ,
  last_heartbeat_at  TIMESTAMPTZ,

  -- Results
  result             JSONB,
  error_message      TEXT,
  retry_count        INTEGER DEFAULT 0,
  max_retries        INTEGER DEFAULT 3,

  created_at         TIMESTAMPTZ DEFAULT NOW(),
  updated_at         TIMESTAMPTZ DEFAULT NOW()
);
```
### Key Indexes
```sql
-- Fast claiming by role
CREATE INDEX idx_worker_tasks_pending
  ON worker_tasks(role, priority DESC, created_at ASC)
  WHERE status = 'pending';

-- Prevent duplicate active tasks per store
CREATE UNIQUE INDEX idx_worker_tasks_unique_active_store
  ON worker_tasks(dispensary_id)
  WHERE status IN ('claimed', 'running') AND dispensary_id IS NOT NULL;
```
---
## Monitoring
### Logs
```
[TaskWorker] Starting worker worker-product_resync-a1b2c3d4 for role: product_resync
[TaskWorker] Claimed task 123 (product_resync) for dispensary 456
[TaskWorker] Task 123 completed successfully
```
### Health Check
```sql
-- Active workers
SELECT worker_id, role, COUNT(*), MAX(last_heartbeat_at)
FROM worker_tasks
WHERE last_heartbeat_at > NOW() - INTERVAL '5 minutes'
GROUP BY worker_id, role;
-- Task counts by role/status
SELECT role, status, COUNT(*)
FROM worker_tasks
GROUP BY role, status;
```

View File

@@ -33,8 +33,8 @@ or overwrites of existing data.
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| `dispensaries` | Store locations | id, name, slug, city, state, platform_dispensary_id |
| `dutchie_products` | Canonical products | id, dispensary_id, external_product_id, name, brand_name, stock_status |
| `dutchie_product_snapshots` | Historical snapshots | dutchie_product_id, crawled_at, rec_min_price_cents |
| `store_products` | Canonical products | id, dispensary_id, external_product_id, name, brand_name, stock_status |
| `store_product_snapshots` | Historical snapshots | store_product_id, crawled_at, rec_min_price_cents |
| `brands` (view: v_brands) | Derived from products | brand_name, brand_id, product_count |
| `categories` (view: v_categories) | Derived from products | type, subcategory, product_count |
@@ -147,12 +147,10 @@ CREATE TABLE IF NOT EXISTS products_from_legacy (
---
### 3. Dutchie Products
### 3. Products (Legacy dutchie_products)
**Source:** `dutchie_legacy.dutchie_products`
**Target:** `cannaiq.dutchie_products`
These tables have nearly identical schemas. The mapping is direct:
**Target:** `cannaiq.store_products`
| Legacy Column | Canonical Column | Notes |
|---------------|------------------|-------|
@@ -180,15 +178,15 @@ ON CONFLICT (dispensary_id, external_product_id) DO NOTHING
---
### 4. Dutchie Product Snapshots
### 4. Product Snapshots (Legacy dutchie_product_snapshots)
**Source:** `dutchie_legacy.dutchie_product_snapshots`
**Target:** `cannaiq.dutchie_product_snapshots`
**Target:** `cannaiq.store_product_snapshots`
| Legacy Column | Canonical Column | Notes |
|---------------|------------------|-------|
| id | - | Generate new |
| dutchie_product_id | dutchie_product_id | Map via product lookup |
| dutchie_product_id | store_product_id | Map via product lookup |
| dispensary_id | dispensary_id | Map via dispensary lookup |
| crawled_at | crawled_at | Direct |
| rec_min_price_cents | rec_min_price_cents | Direct |
@@ -201,7 +199,7 @@ ON CONFLICT (dispensary_id, external_product_id) DO NOTHING
```sql
-- No unique constraint on snapshots - all are historical records
-- Just INSERT, no conflict handling needed
INSERT INTO dutchie_product_snapshots (...) VALUES (...)
INSERT INTO store_product_snapshots (...) VALUES (...)
```
---

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM code.cannabrands.app/creationshop/node:20-slim AS builder
WORKDIR /app
@@ -12,14 +12,15 @@ RUN npm install
# Copy source files
COPY . .
# Set build-time environment variable for API URL (CRA uses REACT_APP_ prefix)
ENV REACT_APP_API_URL=https://api.findadispo.com
# Note: REACT_APP_API_URL is intentionally NOT set here
# The frontend uses relative URLs (same domain) in production
# API calls go to /api/* which the ingress routes to the backend
# Build the app (CRA produces /build, not /dist)
RUN npm run build
# Production stage
FROM nginx:alpine
FROM code.cannabrands.app/creationshop/nginx:alpine
# Copy built assets from builder stage (CRA outputs to /build)
COPY --from=builder /app/build /usr/share/nginx/html

View File

@@ -288,3 +288,89 @@ export async function getStates() {
return [];
}
}
// ============================================================
// PRODUCTS
// ============================================================
/**
* Fetch products for a specific dispensary
* @param {number} dispensaryId - Dispensary ID
* @param {Object} params - Query parameters
* @returns {Promise<{products: Array, pagination: Object}>}
*/
export async function getDispensaryProducts(dispensaryId, params = {}) {
const queryParams = new URLSearchParams();
if (params.category) queryParams.append('category', params.category);
if (params.brand) queryParams.append('brand', params.brand);
if (params.search) queryParams.append('search', params.search);
if (params.inStockOnly) queryParams.append('in_stock_only', 'true');
if (params.limit) queryParams.append('limit', params.limit);
if (params.offset) queryParams.append('offset', params.offset);
const queryString = queryParams.toString();
const endpoint = `/api/v1/dispensaries/${dispensaryId}/products${queryString ? `?${queryString}` : ''}`;
return apiRequest(endpoint);
}
/**
* Get categories available at a dispensary
* @param {number} dispensaryId - Dispensary ID
* @returns {Promise<Array>}
*/
export async function getDispensaryCategories(dispensaryId) {
return apiRequest(`/api/v1/dispensaries/${dispensaryId}/categories`);
}
/**
* Get brands available at a dispensary
* @param {number} dispensaryId - Dispensary ID
* @returns {Promise<Array>}
*/
export async function getDispensaryBrands(dispensaryId) {
return apiRequest(`/api/v1/dispensaries/${dispensaryId}/brands`);
}
/**
* Map API product to UI format
* @param {Object} apiProduct - Product from API
* @returns {Object} - Product formatted for UI
*/
export function mapProductForUI(apiProduct) {
const p = apiProduct;
// Parse price from string or number
const parsePrice = (val) => {
if (val === null || val === undefined) return null;
const num = typeof val === 'string' ? parseFloat(val) : val;
return isNaN(num) ? null : num;
};
return {
id: p.id,
name: p.name || p.name_raw,
brand: p.brand || p.brand_name || p.brand_name_raw,
category: p.category || p.type || p.category_raw,
subcategory: p.subcategory || p.subcategory_raw,
strainType: p.strain_type,
image: p.image_url || p.primary_image_url,
thc: p.thc || p.thc_percent || p.thc_percentage,
cbd: p.cbd || p.cbd_percent || p.cbd_percentage,
price: parsePrice(p.price_rec) || parsePrice(p.regular_price) || parsePrice(p.price),
salePrice: parsePrice(p.price_rec_special) || parsePrice(p.sale_price),
inStock: p.in_stock !== undefined ? p.in_stock : p.stock_status === 'in_stock',
stockStatus: p.stock_status,
onSale: p.on_special || p.special || false,
updatedAt: p.updated_at || p.snapshot_at,
};
}
/**
* Get aggregate stats (product count, brand count, dispensary count)
* @returns {Promise<Object>}
*/
export async function getStats() {
return apiRequest('/api/v1/stats');
}

View File

@@ -1,10 +1,11 @@
import React, { useState, useEffect } from 'react';
import { useParams, Link } from 'react-router-dom';
import { MapPin, Phone, Clock, Star, Navigation, ArrowLeft, Share2, Heart, Loader2 } from 'lucide-react';
import { MapPin, Phone, Clock, Star, Navigation, ArrowLeft, Share2, Heart, Loader2, Search, Package } from 'lucide-react';
import { Button } from '../../components/ui/button';
import { Input } from '../../components/ui/input';
import { Card, CardContent, CardHeader, CardTitle } from '../../components/ui/card';
import { Badge } from '../../components/ui/badge';
import { getDispensaryBySlug, mapDispensaryForUI } from '../../api/client';
import { getDispensaryBySlug, mapDispensaryForUI, getDispensaryProducts, getDispensaryCategories, mapProductForUI } from '../../api/client';
import { formatDistance } from '../../lib/utils';
export function DispensaryDetail() {
@@ -13,6 +14,14 @@ export function DispensaryDetail() {
const [loading, setLoading] = useState(true);
const [error, setError] = useState(null);
// Products state
const [products, setProducts] = useState([]);
const [categories, setCategories] = useState([]);
const [productsLoading, setProductsLoading] = useState(false);
const [selectedCategory, setSelectedCategory] = useState('all');
const [searchQuery, setSearchQuery] = useState('');
const [productCount, setProductCount] = useState(0);
useEffect(() => {
const fetchDispensary = async () => {
try {
@@ -30,6 +39,35 @@ export function DispensaryDetail() {
fetchDispensary();
}, [slug]);
// Fetch products when dispensary is loaded
useEffect(() => {
if (!dispensary?.id) return;
const fetchProducts = async () => {
try {
setProductsLoading(true);
const [productsRes, categoriesRes] = await Promise.all([
getDispensaryProducts(dispensary.id, {
category: selectedCategory !== 'all' ? selectedCategory : undefined,
search: searchQuery || undefined,
limit: 50,
}),
getDispensaryCategories(dispensary.id),
]);
setProducts((productsRes.products || []).map(mapProductForUI));
setProductCount(productsRes.pagination?.total || productsRes.products?.length || 0);
setCategories(categoriesRes.categories || []);
} catch (err) {
console.error('Error fetching products:', err);
} finally {
setProductsLoading(false);
}
};
fetchProducts();
}, [dispensary?.id, selectedCategory, searchQuery]);
if (loading) {
return (
<div className="container mx-auto px-4 py-16 text-center">
@@ -158,16 +196,66 @@ export function DispensaryDetail() {
</Card>
)}
{/* Products Section Placeholder */}
{/* Products Section */}
<Card>
<CardHeader>
<CardTitle>Available Products</CardTitle>
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<CardTitle>Available Products ({productCount})</CardTitle>
<div className="relative w-full sm:w-64">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<Input
type="text"
placeholder="Search products..."
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
className="pl-10"
/>
</div>
</div>
{/* Category Filters */}
<div className="flex flex-wrap gap-2 mt-4">
<Button
size="sm"
variant={selectedCategory === 'all' ? 'default' : 'outline'}
onClick={() => setSelectedCategory('all')}
>
All
</Button>
{categories.map((cat) => (
<Button
key={cat.type || cat.name}
size="sm"
variant={selectedCategory === (cat.type || cat.name) ? 'default' : 'outline'}
onClick={() => setSelectedCategory(cat.type || cat.name)}
>
{cat.type || cat.name} ({cat.count || cat.product_count || 0})
</Button>
))}
</div>
</CardHeader>
<CardContent>
<div className="text-center py-8 text-gray-500">
<p>Product menu coming soon</p>
<p className="text-sm mt-2">Connect to API to view available products</p>
</div>
{productsLoading ? (
<div className="text-center py-8">
<Loader2 className="h-8 w-8 mx-auto animate-spin text-primary" />
<p className="text-gray-500 mt-2">Loading products...</p>
</div>
) : products.length === 0 ? (
<div className="text-center py-8 text-gray-500">
<Package className="h-12 w-12 mx-auto mb-4 text-gray-300" />
<p>No products found</p>
{searchQuery && (
<Button variant="link" onClick={() => setSearchQuery('')}>
Clear search
</Button>
)}
</div>
) : (
<div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 gap-4">
{products.map((product) => (
<ProductCard key={product.id} product={product} />
))}
</div>
)}
</CardContent>
</Card>
</div>
@@ -225,4 +313,63 @@ export function DispensaryDetail() {
);
}
// Product Card Component
function ProductCard({ product }) {
const formatPrice = (price) => {
if (price === null || price === undefined) return null;
return `$${parseFloat(price).toFixed(2)}`;
};
return (
<Card className="overflow-hidden hover:shadow-md transition-shadow">
<div className="aspect-square bg-gray-100 relative">
{product.image ? (
<img
src={product.image}
alt={product.name}
className="w-full h-full object-cover"
/>
) : (
<div className="w-full h-full flex items-center justify-center">
<Package className="h-12 w-12 text-gray-300" />
</div>
)}
{product.onSale && (
<Badge className="absolute top-2 right-2 bg-red-500">Sale</Badge>
)}
{!product.inStock && (
<div className="absolute inset-0 bg-black/50 flex items-center justify-center">
<Badge variant="secondary">Out of Stock</Badge>
</div>
)}
</div>
<CardContent className="p-4">
<p className="text-xs text-gray-500 uppercase tracking-wide mb-1">
{product.brand || 'Unknown Brand'}
</p>
<h4 className="font-medium text-gray-900 line-clamp-2 mb-2">{product.name}</h4>
<div className="flex items-center gap-2 text-sm text-gray-500 mb-2">
{product.category && <Badge variant="outline" className="text-xs">{product.category}</Badge>}
{product.strainType && <Badge variant="outline" className="text-xs">{product.strainType}</Badge>}
</div>
{product.thc && (
<p className="text-xs text-gray-500 mb-2">THC: {product.thc}%</p>
)}
<div className="flex items-baseline gap-2">
{product.salePrice ? (
<>
<span className="font-bold text-red-600">{formatPrice(product.salePrice)}</span>
<span className="text-sm text-gray-400 line-through">{formatPrice(product.price)}</span>
</>
) : product.price ? (
<span className="font-bold text-gray-900">{formatPrice(product.price)}</span>
) : (
<span className="text-sm text-gray-400">Price not available</span>
)}
</div>
</CardContent>
</Card>
);
}
export default DispensaryDetail;

findagram/FINDAGRAM.md Normal file
View File

@@ -0,0 +1,114 @@
# Findagram Development Notes
## Overview
Findagram (findagram.co) is a consumer-facing cannabis product discovery app. Users can search products across dispensaries, set price alerts, and save favorites.
## Architecture
- **Frontend**: React (Create React App) at `findagram/frontend/`
- **Backend**: Shared CannaiQ Express API at `backend/`
- **Auth**: JWT-based consumer auth via `/api/consumer/auth/*`
- **Domain**: `findagram.co` (passed in all auth requests)
## Key Files
| File | Purpose |
|------|---------|
| `src/context/AuthContext.js` | Global auth state, login/register, token management |
| `src/components/findagram/AuthModal.jsx` | Login/signup modal popup |
| `src/api/client.js` | API client for products, dispensaries, categories, brands |
| `src/api/consumer.js` | API client for favorites, alerts, saved searches (auth required) |
## Backend Consumer API Endpoints
All require JWT token in `Authorization: Bearer <token>` header.
### Auth (`/api/consumer/auth/*`)
- `POST /register` - Create account (requires `domain: 'findagram.co'`)
- `POST /login` - Login (requires `domain: 'findagram.co'`)
- `GET /me` - Get current user
- `PUT /me` - Update profile
### Favorites (`/api/consumer/favorites/*`)
- `GET /` - Get user's favorites
- `POST /` - Add favorite (`{ productId, dispensaryId? }`)
- `DELETE /:id` - Remove by favorite ID
- `DELETE /product/:productId` - Remove by product ID
- `GET /check/product/:id` - Check if product is favorited
### Alerts (`/api/consumer/alerts/*`)
- `GET /` - Get user's alerts
- `POST /` - Create alert (`{ alertType, productId, targetPrice }`)
- Alert types: `price_drop`, `back_in_stock`, `product_on_special`
- `PUT /:id` - Update alert
- `DELETE /:id` - Delete alert
- `POST /:id/toggle` - Toggle active status
### Saved Searches (`/api/consumer/saved-searches/*`)
- `GET /` - Get user's saved searches
- `POST /` - Create saved search
- `PUT /:id` - Update
- `DELETE /:id` - Delete
- `POST /:id/run` - Get search params for execution
## Database Tables (Consumer)
| Table | Purpose |
|-------|---------|
| `users` | User accounts (shared across domains via `domain` column) |
| `findagram_users` | Findagram-specific user profile data |
| `findagram_favorites` | Product favorites |
| `findagram_alerts` | Price/stock alerts |
| `findagram_saved_searches` | Saved search filters |
## Auth Flow
1. User clicks favorite/alert on a product
2. If not logged in → AuthModal opens
3. User logs in or creates account
4. JWT token stored in localStorage (`findagram_auth`)
5. Pending action (favorite/alert) executes automatically after auth
6. All subsequent API calls include `Authorization: Bearer <token>`
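
A minimal sketch (written in TypeScript for brevity; the frontend itself is plain JS) of what the AuthContext's `authFetch` wrapper might look like, assuming the token is stored under the `findagram_auth` localStorage key as described above. The exact shape of the stored value is an assumption:

```typescript
const API_BASE_URL = process.env.REACT_APP_API_URL || '';

export async function authFetch(endpoint: string, options: RequestInit = {}) {
  // Assumed storage shape: { token: "<jwt>" } under the findagram_auth key
  const stored = localStorage.getItem('findagram_auth');
  const token = stored ? (JSON.parse(stored).token as string) : null;

  const res = await fetch(`${API_BASE_URL}${endpoint}`, {
    ...options,
    headers: {
      'Content-Type': 'application/json',
      ...(token ? { Authorization: `Bearer ${token}` } : {}),
      ...(options.headers || {}),
    },
  });
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status}`);
  }
  return res.json();
}
```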
## Environment Variables
```bash
# Frontend (.env)
REACT_APP_API_URL=http://localhost:3010 # Local
REACT_APP_API_URL=https://cannaiq.co # Production
```
## Future: Migration to cannabrands.app
Currently uses CannaiQ backend. Later will migrate auth to cannabrands.app:
- Update `API_BASE_URL` for auth endpoints
- Keep product/dispensary API pointing to CannaiQ
- May need to sync user accounts between systems
## Important Notes
1. **Domain is critical** - All auth requests must include `domain: 'findagram.co'`
2. **Favorites are product-based** (unlike findadispo which is dispensary-based)
3. **Price alerts** require `targetPrice` for `price_drop` type
4. **Mock data** in `src/mockData.js` is no longer imported - can be safely deleted
5. **Token expiry** is 30 days (`JWT_EXPIRES_IN` in backend)
## Pages Using Real API
All pages are now wired to the real CannaiQ API:
| Page | API Endpoint | Notes |
|------|--------------|-------|
| Home | `/api/products`, `/api/dispensaries` | Featured products, deals |
| Products | `/api/products` | Search, filters, pagination |
| ProductDetail | `/api/products/:id` | Single product with dispensaries |
| Deals | `/api/products?hasSpecial=true` | Products on sale |
| Brands | `/api/brands` | Brand listing |
| BrandDetail | `/api/brands/:name` | Brand products |
| Categories | `/api/categories` | Category listing |
| CategoryDetail | `/api/products?category=...` | Category products |
| Dashboard | `/api/consumer/favorites`, alerts, searches | User dashboard (auth) |
| Favorites | `/api/consumer/favorites` | User favorites (auth) |
| Alerts | `/api/consumer/alerts` | Price alerts (auth) |
| SavedSearches | `/api/consumer/saved-searches` | Saved searches (auth) |

View File

@@ -1,5 +1,5 @@
# Build stage
FROM node:20-slim AS builder
FROM code.cannabrands.app/creationshop/node:20-slim AS builder
WORKDIR /app
@@ -12,14 +12,15 @@ RUN npm install
# Copy source files
COPY . .
# Set build-time environment variable for API URL (CRA uses REACT_APP_ prefix)
ENV REACT_APP_API_URL=https://api.findagram.co
# Note: REACT_APP_API_URL is intentionally NOT set here
# The frontend uses relative URLs (same domain) in production
# API calls go to /api/* which the ingress routes to the backend
# Build the app (CRA produces /build, not /dist)
RUN npm run build
# Production stage
FROM nginx:alpine
FROM code.cannabrands.app/creationshop/nginx:alpine
# Copy built assets from builder stage (CRA outputs to /build)
COPY --from=builder /app/build /usr/share/nginx/html

View File

@@ -1,7 +1,9 @@
import React, { useState } from 'react';
import React from 'react';
import { BrowserRouter as Router, Routes, Route } from 'react-router-dom';
import { AuthProvider } from './context/AuthContext';
import Header from './components/findagram/Header';
import Footer from './components/findagram/Footer';
import AuthModal from './components/findagram/AuthModal';
// Pages
import Home from './pages/findagram/Home';
@@ -12,6 +14,7 @@ import Brands from './pages/findagram/Brands';
import BrandDetail from './pages/findagram/BrandDetail';
import Categories from './pages/findagram/Categories';
import CategoryDetail from './pages/findagram/CategoryDetail';
import DispensaryDetail from './pages/findagram/DispensaryDetail';
import About from './pages/findagram/About';
import Contact from './pages/findagram/Contact';
import Login from './pages/findagram/Login';
@@ -23,32 +26,11 @@ import SavedSearches from './pages/findagram/SavedSearches';
import Profile from './pages/findagram/Profile';
function App() {
const [isLoggedIn, setIsLoggedIn] = useState(false);
const [user, setUser] = useState(null);
// Mock login function
const handleLogin = (email, password) => {
// In a real app, this would make an API call
setUser({
id: 1,
name: 'John Doe',
email: email,
avatar: null,
});
setIsLoggedIn(true);
return true;
};
// Mock logout function
const handleLogout = () => {
setUser(null);
setIsLoggedIn(false);
};
return (
<Router>
<div className="flex flex-col min-h-screen">
<Header isLoggedIn={isLoggedIn} user={user} onLogout={handleLogout} />
<AuthProvider>
<Router>
<div className="flex flex-col min-h-screen">
<Header />
<main className="flex-grow">
<Routes>
@@ -61,12 +43,13 @@ function App() {
<Route path="/brands/:slug" element={<BrandDetail />} />
<Route path="/categories" element={<Categories />} />
<Route path="/categories/:slug" element={<CategoryDetail />} />
<Route path="/dispensaries/:slug" element={<DispensaryDetail />} />
<Route path="/about" element={<About />} />
<Route path="/contact" element={<Contact />} />
{/* Auth Routes */}
<Route path="/login" element={<Login onLogin={handleLogin} />} />
<Route path="/signup" element={<Signup onLogin={handleLogin} />} />
<Route path="/login" element={<Login />} />
<Route path="/signup" element={<Signup />} />
{/* Dashboard Routes */}
<Route path="/dashboard" element={<Dashboard />} />
@@ -77,9 +60,11 @@ function App() {
</Routes>
</main>
<Footer />
</div>
</Router>
<Footer />
<AuthModal />
</div>
</Router>
</AuthProvider>
);
}

View File

@@ -1,11 +1,11 @@
/**
* Findagram API Client
*
* Connects to the backend /api/az/* endpoints which are publicly accessible.
* Connects to the backend /api/v1/* public endpoints.
* Uses REACT_APP_API_URL environment variable for the base URL.
*
* Local development: http://localhost:3010
* Production: https://findagram.co (proxied to backend via ingress)
* Production: https://cannaiq.co (shared API backend)
*/
const API_BASE_URL = process.env.REACT_APP_API_URL || '';
@@ -70,14 +70,14 @@ export async function getProducts(params = {}) {
offset: params.offset || 0,
});
return request(`/api/az/products${queryString}`);
return request(`/api/v1/products${queryString}`);
}
/**
* Get a single product by ID
*/
export async function getProduct(id) {
return request(`/api/az/products/${id}`);
return request(`/api/v1/products/${id}`);
}
/**
@@ -103,7 +103,7 @@ export async function getProductAvailability(productId, params = {}) {
max_radius_miles: maxRadiusMiles,
});
return request(`/api/az/products/${productId}/availability${queryString}`);
return request(`/api/v1/products/${productId}/availability${queryString}`);
}
/**
@@ -113,7 +113,7 @@ export async function getProductAvailability(productId, params = {}) {
* @returns {Promise<{similarProducts: Array<{productId: number, name: string, brandName: string, imageUrl: string, price: number}>}>}
*/
export async function getSimilarProducts(productId) {
return request(`/api/az/products/${productId}/similar`);
return request(`/api/v1/products/${productId}/similar`);
}
/**
@@ -130,7 +130,7 @@ export async function getStoreProducts(storeId, params = {}) {
offset: params.offset || 0,
});
return request(`/api/az/stores/${storeId}/products${queryString}`);
return request(`/api/v1/dispensaries/${storeId}/products${queryString}`);
}
// ============================================================
@@ -149,47 +149,49 @@ export async function getStoreProducts(storeId, params = {}) {
export async function getDispensaries(params = {}) {
const queryString = buildQueryString({
city: params.city,
state: params.state,
hasPlatformId: params.hasPlatformId,
has_products: params.hasProducts ? 'true' : undefined,
limit: params.limit || 100,
offset: params.offset || 0,
});
return request(`/api/az/stores${queryString}`);
return request(`/api/v1/dispensaries${queryString}`);
}
/**
* Get a single dispensary by ID
*/
export async function getDispensary(id) {
return request(`/api/az/stores/${id}`);
return request(`/api/v1/dispensaries/${id}`);
}
/**
* Get dispensary by slug or platform ID
*/
export async function getDispensaryBySlug(slug) {
return request(`/api/az/stores/slug/${slug}`);
return request(`/api/v1/dispensaries/slug/${slug}`);
}
/**
* Get dispensary summary (product counts, categories, brands)
*/
export async function getDispensarySummary(id) {
return request(`/api/az/stores/${id}/summary`);
return request(`/api/v1/dispensaries/${id}/summary`);
}
/**
* Get brands available at a specific dispensary
*/
export async function getDispensaryBrands(id) {
return request(`/api/az/stores/${id}/brands`);
return request(`/api/v1/dispensaries/${id}/brands`);
}
/**
* Get categories available at a specific dispensary
*/
export async function getDispensaryCategories(id) {
return request(`/api/az/stores/${id}/categories`);
return request(`/api/v1/dispensaries/${id}/categories`);
}
// ============================================================
@@ -200,7 +202,7 @@ export async function getDispensaryCategories(id) {
* Get all categories with product counts
*/
export async function getCategories() {
return request('/api/az/categories');
return request('/api/v1/categories');
}
// ============================================================
@@ -220,33 +222,52 @@ export async function getBrands(params = {}) {
offset: params.offset || 0,
});
return request(`/api/az/brands${queryString}`);
return request(`/api/v1/brands${queryString}`);
}
// ============================================================
// STATS
// ============================================================
/**
* Get aggregate stats (product count, brand count, dispensary count)
*/
export async function getStats() {
return request('/api/v1/stats');
}
// ============================================================
// DEALS / SPECIALS
// Note: The /api/az routes don't have a dedicated specials endpoint yet.
// For now, we can filter products with sale prices or use dispensary-specific specials.
// ============================================================
/**
* Get products on sale (products where sale_price exists)
* This is a client-side filter until a dedicated endpoint is added.
* Get products on special/sale
* Uses the on_special filter parameter on the products endpoint
*
* @param {Object} params
* @param {string} [params.type] - Category type filter
* @param {string} [params.brandName] - Brand name filter
* @param {number} [params.limit=100] - Page size
* @param {number} [params.offset=0] - Offset for pagination
*/
export async function getDeals(params = {}) {
// For now, get products and we'll need to filter client-side
// or we could use the /api/dispensaries/:slug/specials endpoint if we have a dispensary context
const result = await getProducts({
...params,
export async function getSpecials(params = {}) {
const queryString = buildQueryString({
on_special: 'true',
type: params.type,
brandName: params.brandName,
stockStatus: params.stockStatus || 'in_stock',
limit: params.limit || 100,
offset: params.offset || 0,
});
// Filter to only products with a sale price
// Note: This is a temporary solution - ideally the backend would support this filter
return {
...result,
products: result.products.filter(p => p.sale_price || p.med_sale_price),
};
return request(`/api/v1/products${queryString}`);
}
/**
* Alias for getSpecials for backward compatibility
*/
export async function getDeals(params = {}) {
return getSpecials(params);
}
// ============================================================
@@ -278,27 +299,40 @@ export function mapProductForUI(apiProduct) {
// Handle both direct product and transformed product formats
const p = apiProduct;
// Helper to parse price (API returns strings like "29.99" or null)
const parsePrice = (val) => {
if (val === null || val === undefined) return null;
const num = typeof val === 'string' ? parseFloat(val) : val;
return isNaN(num) ? null : num;
};
const regularPrice = parsePrice(p.regular_price);
const salePrice = parsePrice(p.sale_price);
const medPrice = parsePrice(p.med_price);
const medSalePrice = parsePrice(p.med_sale_price);
const regularPriceMax = parsePrice(p.regular_price_max);
return {
id: p.id,
name: p.name,
brand: p.brand || p.brand_name,
category: p.type || p.category,
subcategory: p.subcategory,
category: p.type || p.category || p.category_raw,
subcategory: p.subcategory || p.subcategory_raw,
strainType: p.strain_type || null,
// Images
image: p.image_url || p.primary_image_url || null,
// Potency
thc: p.thc_percentage || p.thc_content || null,
cbd: p.cbd_percentage || p.cbd_content || null,
// Prices (API returns dollars as numbers or null)
price: p.regular_price || null,
priceRange: p.regular_price_max && p.regular_price
? { min: p.regular_price, max: p.regular_price_max }
// Prices (parsed to numbers)
price: regularPrice,
priceRange: regularPriceMax && regularPrice
? { min: regularPrice, max: regularPriceMax }
: null,
onSale: !!(p.sale_price || p.med_sale_price),
salePrice: p.sale_price || null,
medPrice: p.med_price || null,
medSalePrice: p.med_sale_price || null,
onSale: !!(salePrice || medSalePrice),
salePrice: salePrice,
medPrice: medPrice,
medSalePrice: medSalePrice,
// Stock
inStock: p.in_stock !== undefined ? p.in_stock : p.stock_status === 'in_stock',
stockStatus: p.stock_status,
@@ -354,23 +388,41 @@ export function mapBrandForUI(apiBrand) {
* Map API dispensary to UI-compatible format
*/
export function mapDispensaryForUI(apiDispensary) {
// Handle location object from API (location.latitude, location.longitude)
const lat = apiDispensary.location?.latitude || apiDispensary.latitude;
const lng = apiDispensary.location?.longitude || apiDispensary.longitude;
return {
id: apiDispensary.id,
name: apiDispensary.dba_name || apiDispensary.name,
slug: apiDispensary.slug,
city: apiDispensary.city,
state: apiDispensary.state,
address: apiDispensary.address,
address: apiDispensary.address1 || apiDispensary.address,
zip: apiDispensary.zip,
latitude: apiDispensary.latitude,
longitude: apiDispensary.longitude,
latitude: lat,
longitude: lng,
website: apiDispensary.website,
menuUrl: apiDispensary.menu_url,
// Summary data (if fetched with summary)
productCount: apiDispensary.totalProducts,
imageUrl: apiDispensary.image_url,
rating: apiDispensary.rating,
reviewCount: apiDispensary.review_count,
// Product data from API
productCount: apiDispensary.product_count || apiDispensary.totalProducts || 0,
inStockCount: apiDispensary.in_stock_count || apiDispensary.inStockCount || 0,
brandCount: apiDispensary.brandCount,
categoryCount: apiDispensary.categoryCount,
inStockCount: apiDispensary.inStockCount,
// Services
services: apiDispensary.services || {
pickup: false,
delivery: false,
curbside: false
},
// License type
licenseType: apiDispensary.license_type || {
medical: false,
recreational: false
},
};
}
@@ -386,6 +438,68 @@ function formatCategoryName(type) {
.replace(/\b\w/g, c => c.toUpperCase());
}
// ============================================================
// CLICK TRACKING
// ============================================================
/**
* Get cached visitor location from sessionStorage
*/
function getCachedVisitorLocation() {
try {
const cached = sessionStorage.getItem('findagram_location');
if (cached) {
return JSON.parse(cached);
}
} catch (err) {
// Ignore errors
}
return null;
}
/**
* Track a product click event
* Fire-and-forget - doesn't block UI
*
* @param {Object} params
* @param {string} params.productId - Product ID (required)
* @param {string} [params.storeId] - Store/dispensary ID
* @param {string} [params.brandId] - Brand name/ID
* @param {string} [params.dispensaryName] - Dispensary name
* @param {string} params.action - Action type: view, open_product, open_store, compare
* @param {string} params.source - Source identifier (e.g., 'findagram')
* @param {string} [params.pageType] - Page type (e.g., 'home', 'dispensary', 'deals')
*/
export function trackProductClick(params) {
// Get visitor's cached location
const visitorLocation = getCachedVisitorLocation();
const payload = {
product_id: String(params.productId),
store_id: params.storeId ? String(params.storeId) : undefined,
brand_id: params.brandId || undefined,
dispensary_name: params.dispensaryName || undefined,
action: params.action || 'view',
source: params.source || 'findagram',
page_type: params.pageType || undefined,
url_path: window.location.pathname,
// Visitor location from IP geolocation
visitor_city: visitorLocation?.city || undefined,
visitor_state: visitorLocation?.state || undefined,
visitor_lat: visitorLocation?.lat || undefined,
visitor_lng: visitorLocation?.lng || undefined,
};
// Fire and forget - don't await
fetch(`${API_BASE_URL}/api/events/product-click`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
}).catch(() => {
// Silently ignore errors - analytics shouldn't break UX
});
}
// Default export for convenience
const api = {
// Products
@@ -405,13 +519,18 @@ const api = {
// Categories & Brands
getCategories,
getBrands,
// Stats
getStats,
// Deals
getDeals,
getSpecials,
// Mappers
mapProductForUI,
mapCategoryForUI,
mapBrandForUI,
mapDispensaryForUI,
// Tracking
trackProductClick,
};
export default api;

View File

@@ -0,0 +1,302 @@
/**
* Consumer API Client for Findagram
*
* Handles authenticated requests for:
* - Favorites
* - Price Alerts
* - Saved Searches
*
* All methods require auth token (use with AuthContext's authFetch)
*/
// ============================================================
// FAVORITES
// ============================================================
/**
* Get all user's favorites
* @param {Function} authFetch - Authenticated fetch from AuthContext
*/
export async function getFavorites(authFetch) {
return authFetch('/api/consumer/favorites');
}
/**
* Add product to favorites
* @param {Function} authFetch
* @param {number} productId
* @param {number} [dispensaryId] - Optional dispensary context
*/
export async function addFavorite(authFetch, productId, dispensaryId = null) {
return authFetch('/api/consumer/favorites', {
method: 'POST',
body: JSON.stringify({ productId, dispensaryId }),
});
}
/**
* Remove favorite by favorite ID
* @param {Function} authFetch
* @param {number} favoriteId
*/
export async function removeFavorite(authFetch, favoriteId) {
return authFetch(`/api/consumer/favorites/${favoriteId}`, {
method: 'DELETE',
});
}
/**
* Remove favorite by product ID
* @param {Function} authFetch
* @param {number} productId
*/
export async function removeFavoriteByProduct(authFetch, productId) {
return authFetch(`/api/consumer/favorites/product/${productId}`, {
method: 'DELETE',
});
}
/**
* Check if product is favorited
* @param {Function} authFetch
* @param {number} productId
* @returns {Promise<{isFavorited: boolean}>}
*/
export async function checkFavorite(authFetch, productId) {
return authFetch(`/api/consumer/favorites/check/product/${productId}`);
}
// ============================================================
// ALERTS
// ============================================================
/**
* Get all user's alerts
* @param {Function} authFetch
*/
export async function getAlerts(authFetch) {
return authFetch('/api/consumer/alerts');
}
/**
* Get alert statistics
* @param {Function} authFetch
*/
export async function getAlertStats(authFetch) {
return authFetch('/api/consumer/alerts/stats');
}
/**
* Create a price drop alert
* @param {Function} authFetch
* @param {Object} params
* @param {number} params.productId - Product to track
* @param {number} params.targetPrice - Price to alert at
* @param {number} [params.dispensaryId] - Optional dispensary context
*/
export async function createPriceAlert(authFetch, { productId, targetPrice, dispensaryId }) {
return authFetch('/api/consumer/alerts', {
method: 'POST',
body: JSON.stringify({
alertType: 'price_drop',
productId,
targetPrice,
dispensaryId,
}),
});
}
/**
* Create a back-in-stock alert
* @param {Function} authFetch
* @param {Object} params
* @param {number} params.productId - Product to track
* @param {number} [params.dispensaryId] - Optional dispensary context
*/
export async function createStockAlert(authFetch, { productId, dispensaryId }) {
return authFetch('/api/consumer/alerts', {
method: 'POST',
body: JSON.stringify({
alertType: 'back_in_stock',
productId,
dispensaryId,
}),
});
}
/**
* Create a brand/category alert
* @param {Function} authFetch
* @param {Object} params
* @param {string} [params.brand] - Brand to track
* @param {string} [params.category] - Category to track
*/
export async function createBrandCategoryAlert(authFetch, { brand, category }) {
return authFetch('/api/consumer/alerts', {
method: 'POST',
body: JSON.stringify({
alertType: 'product_on_special',
brand,
category,
}),
});
}
/**
* Update an alert
* @param {Function} authFetch
* @param {number} alertId
* @param {Object} updates
* @param {boolean} [updates.isActive]
* @param {number} [updates.targetPrice]
*/
export async function updateAlert(authFetch, alertId, updates) {
return authFetch(`/api/consumer/alerts/${alertId}`, {
method: 'PUT',
body: JSON.stringify(updates),
});
}
/**
* Toggle alert active status
* @param {Function} authFetch
* @param {number} alertId
*/
export async function toggleAlert(authFetch, alertId) {
return authFetch(`/api/consumer/alerts/${alertId}/toggle`, {
method: 'POST',
});
}
/**
* Delete an alert
* @param {Function} authFetch
* @param {number} alertId
*/
export async function deleteAlert(authFetch, alertId) {
return authFetch(`/api/consumer/alerts/${alertId}`, {
method: 'DELETE',
});
}
// ============================================================
// SAVED SEARCHES
// ============================================================
/**
* Get all user's saved searches
* @param {Function} authFetch
*/
export async function getSavedSearches(authFetch) {
return authFetch('/api/consumer/saved-searches');
}
/**
* Create a saved search
* @param {Function} authFetch
* @param {Object} params
* @param {string} params.name - Display name
* @param {string} [params.query] - Search query
* @param {string} [params.category] - Category filter
* @param {string} [params.brand] - Brand filter
* @param {string} [params.strainType] - Strain type filter
* @param {number} [params.minPrice] - Min price filter
* @param {number} [params.maxPrice] - Max price filter
* @param {number} [params.minThc] - Min THC filter
* @param {number} [params.maxThc] - Max THC filter
* @param {boolean} [params.notifyOnNew] - Notify on new products
* @param {boolean} [params.notifyOnPriceDrop] - Notify on price drops
*/
export async function createSavedSearch(authFetch, params) {
return authFetch('/api/consumer/saved-searches', {
method: 'POST',
body: JSON.stringify(params),
});
}
/**
* Update a saved search
* @param {Function} authFetch
* @param {number} searchId
* @param {Object} updates
*/
export async function updateSavedSearch(authFetch, searchId, updates) {
return authFetch(`/api/consumer/saved-searches/${searchId}`, {
method: 'PUT',
body: JSON.stringify(updates),
});
}
/**
* Delete a saved search
* @param {Function} authFetch
* @param {number} searchId
*/
export async function deleteSavedSearch(authFetch, searchId) {
return authFetch(`/api/consumer/saved-searches/${searchId}`, {
method: 'DELETE',
});
}
/**
* Run a saved search (get search params)
* @param {Function} authFetch
* @param {number} searchId
* @returns {Promise<{searchParams: Object, searchUrl: string}>}
*/
export async function runSavedSearch(authFetch, searchId) {
return authFetch(`/api/consumer/saved-searches/${searchId}/run`, {
method: 'POST',
});
}
// ============================================================
// HELPER: Generate search name from filters
// ============================================================
/**
* Generate a display name for a search based on filters
* @param {Object} filters
* @returns {string}
*/
export function generateSearchName(filters) {
const parts = [];
if (filters.query || filters.search) parts.push(`"${filters.query || filters.search}"`);
if (filters.category || filters.type) parts.push(filters.category || filters.type);
if (filters.brand || filters.brandName) parts.push(filters.brand || filters.brandName);
if (filters.strainType) parts.push(filters.strainType);
if (filters.maxPrice) parts.push(`Under $${filters.maxPrice}`);
if (filters.minThc) parts.push(`${filters.minThc}%+ THC`);
return parts.length > 0 ? parts.join(' - ') : 'All Products';
}
// Default export
const consumerApi = {
// Favorites
getFavorites,
addFavorite,
removeFavorite,
removeFavoriteByProduct,
checkFavorite,
// Alerts
getAlerts,
getAlertStats,
createPriceAlert,
createStockAlert,
createBrandCategoryAlert,
updateAlert,
toggleAlert,
deleteAlert,
// Saved Searches
getSavedSearches,
createSavedSearch,
updateSavedSearch,
deleteSavedSearch,
runSavedSearch,
// Helpers
generateSearchName,
};
export default consumerApi;

View File

@@ -0,0 +1,315 @@
/**
* AuthModal - Login/Signup modal for Findagram
*
* Shows when user tries to:
* - Favorite a product
* - Set a price alert
* - Save a search
* - Access dashboard features
*/
import React, { useState } from 'react';
import { useAuth } from '../../context/AuthContext';
import { Card, CardContent, CardHeader, CardTitle, CardDescription } from '../ui/card';
import { Button } from '../ui/button';
import { X, Mail, Lock, User, Phone, MapPin, Loader2, Eye, EyeOff } from 'lucide-react';
const AuthModal = () => {
const {
showAuthModal,
authModalMode,
setAuthModalMode,
closeAuthModal,
login,
register,
} = useAuth();
const [formData, setFormData] = useState({
email: '',
password: '',
firstName: '',
lastName: '',
phone: '',
city: '',
state: '',
});
const [showPassword, setShowPassword] = useState(false);
const [loading, setLoading] = useState(false);
const [error, setError] = useState('');
if (!showAuthModal) return null;
const handleChange = (e) => {
const { name, value } = e.target;
setFormData(prev => ({ ...prev, [name]: value }));
setError('');
};
const handleSubmit = async (e) => {
e.preventDefault();
setError('');
setLoading(true);
try {
if (authModalMode === 'login') {
await login(formData.email, formData.password);
} else {
// Validate signup fields
if (!formData.firstName || !formData.lastName) {
throw new Error('First and last name are required');
}
if (formData.password.length < 6) {
throw new Error('Password must be at least 6 characters');
}
await register(formData);
}
} catch (err) {
setError(err.message);
} finally {
setLoading(false);
}
};
const switchMode = () => {
setAuthModalMode(authModalMode === 'login' ? 'signup' : 'login');
setError('');
};
return (
<div className="fixed inset-0 z-50 flex items-center justify-center p-4">
{/* Backdrop */}
<div
className="absolute inset-0 bg-black/50 backdrop-blur-sm"
onClick={closeAuthModal}
/>
{/* Modal */}
<Card className="relative w-full max-w-md bg-white shadow-2xl animate-in fade-in zoom-in duration-200">
{/* Close button */}
<button
onClick={closeAuthModal}
className="absolute top-4 right-4 p-1 rounded-full hover:bg-gray-100 transition-colors"
>
<X className="h-5 w-5 text-gray-500" />
</button>
<CardHeader className="text-center pb-2">
<div className="mx-auto w-12 h-12 bg-gradient-to-br from-purple-500 to-pink-500 rounded-full flex items-center justify-center mb-4">
<User className="h-6 w-6 text-white" />
</div>
<CardTitle>
{authModalMode === 'login' ? 'Welcome Back' : 'Create Account'}
</CardTitle>
<CardDescription>
{authModalMode === 'login'
? 'Sign in to save favorites and set price alerts'
: 'Join Findagram to track products and get notified of deals'}
</CardDescription>
</CardHeader>
<CardContent>
<form onSubmit={handleSubmit} className="space-y-4">
{/* Error message */}
{error && (
<div className="p-3 bg-red-50 border border-red-200 rounded-lg text-red-600 text-sm">
{error}
</div>
)}
{/* Signup only fields */}
{authModalMode === 'signup' && (
<>
<div className="grid grid-cols-2 gap-3">
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
First Name
</label>
<div className="relative">
<User className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<input
type="text"
name="firstName"
value={formData.firstName}
onChange={handleChange}
className="w-full pl-10 pr-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder="John"
required
/>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Last Name
</label>
<input
type="text"
name="lastName"
value={formData.lastName}
onChange={handleChange}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder="Doe"
required
/>
</div>
</div>
</>
)}
{/* Email */}
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Email
</label>
<div className="relative">
<Mail className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<input
type="email"
name="email"
value={formData.email}
onChange={handleChange}
className="w-full pl-10 pr-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder="you@example.com"
required
/>
</div>
</div>
{/* Password */}
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Password
</label>
<div className="relative">
<Lock className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<input
type={showPassword ? 'text' : 'password'}
name="password"
value={formData.password}
onChange={handleChange}
className="w-full pl-10 pr-10 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder={authModalMode === 'signup' ? 'Min 6 characters' : 'Your password'}
required
minLength={authModalMode === 'signup' ? 6 : undefined}
/>
<button
type="button"
onClick={() => setShowPassword(!showPassword)}
className="absolute right-3 top-1/2 -translate-y-1/2 text-gray-400 hover:text-gray-600"
>
{showPassword ? <EyeOff className="h-4 w-4" /> : <Eye className="h-4 w-4" />}
</button>
</div>
</div>
{/* Signup only: Phone & Location */}
{authModalMode === 'signup' && (
<>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Phone <span className="text-gray-400">(optional)</span>
</label>
<div className="relative">
<Phone className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<input
type="tel"
name="phone"
value={formData.phone}
onChange={handleChange}
className="w-full pl-10 pr-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder="(555) 123-4567"
/>
</div>
<p className="text-xs text-gray-500 mt-1">For SMS alerts about price drops</p>
</div>
<div className="grid grid-cols-2 gap-3">
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
City <span className="text-gray-400">(optional)</span>
</label>
<div className="relative">
<MapPin className="absolute left-3 top-1/2 -translate-y-1/2 h-4 w-4 text-gray-400" />
<input
type="text"
name="city"
value={formData.city}
onChange={handleChange}
className="w-full pl-10 pr-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
placeholder="Phoenix"
/>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
State
</label>
<select
name="state"
value={formData.state}
onChange={handleChange}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-purple-500 focus:border-transparent"
>
<option value="">Select...</option>
<option value="AZ">Arizona</option>
<option value="CA">California</option>
<option value="CO">Colorado</option>
<option value="MI">Michigan</option>
<option value="NV">Nevada</option>
<option value="OR">Oregon</option>
<option value="WA">Washington</option>
</select>
</div>
</div>
</>
)}
{/* Submit button */}
<Button
type="submit"
disabled={loading}
className="w-full bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white py-2.5"
>
{loading ? (
<>
<Loader2 className="h-4 w-4 mr-2 animate-spin" />
{authModalMode === 'login' ? 'Signing in...' : 'Creating account...'}
</>
) : (
authModalMode === 'login' ? 'Sign In' : 'Create Account'
)}
</Button>
{/* Switch mode link */}
<div className="text-center text-sm text-gray-600">
{authModalMode === 'login' ? (
<>
Don't have an account?{' '}
<button
type="button"
onClick={switchMode}
className="text-purple-600 hover:text-purple-700 font-medium"
>
Sign up
</button>
</>
) : (
<>
Already have an account?{' '}
<button
type="button"
onClick={switchMode}
className="text-purple-600 hover:text-purple-700 font-medium"
>
Sign in
</button>
</>
)}
</div>
</form>
</CardContent>
</Card>
</div>
);
};
export default AuthModal;
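For orientation, a minimal sketch of how this modal is meant to be mounted: it renders nothing until AuthContext flips showAuthModal, so a single instance near the app root is enough. The App component and import paths below are illustrative, not part of this PR.

// Illustrative sketch (assumed file layout); AuthModal must sit inside AuthProvider.
import React from 'react';
import { AuthProvider } from './context/AuthContext';
import AuthModal from './components/auth/AuthModal';

function App() {
  return (
    <AuthProvider>
      {/* ...Header, routes, pages... */}
      <AuthModal />
    </AuthProvider>
  );
}

export default App;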


@@ -1,5 +1,6 @@
import React, { useState } from 'react';
import { Link, useLocation } from 'react-router-dom';
import { useAuth } from '../../context/AuthContext';
import { Button } from '../ui/button';
import { Input } from '../ui/input';
import {
@@ -27,7 +28,8 @@ import {
Store,
} from 'lucide-react';
const Header = ({ isLoggedIn = false, user = null }) => {
const Header = () => {
const { isAuthenticated, user, logout, openAuthModal } = useAuth();
const [mobileMenuOpen, setMobileMenuOpen] = useState(false);
const [searchQuery, setSearchQuery] = useState('');
const location = useLocation();
@@ -99,7 +101,7 @@ const Header = ({ isLoggedIn = false, user = null }) => {
{/* Right side actions */}
<div className="flex items-center space-x-4">
{isLoggedIn ? (
{isAuthenticated ? (
<>
{/* Favorites */}
<Link to="/dashboard/favorites" className="hidden sm:block">
@@ -121,9 +123,9 @@ const Header = ({ isLoggedIn = false, user = null }) => {
<DropdownMenuTrigger asChild>
<Button variant="ghost" className="relative h-10 w-10 rounded-full">
<Avatar className="h-10 w-10 border-2 border-primary">
<AvatarImage src={user?.avatar} alt={user?.name} />
<AvatarImage src={user?.avatar} alt={user?.firstName} />
<AvatarFallback className="bg-primary text-white">
{user?.name?.charAt(0) || 'U'}
{user?.firstName?.charAt(0) || 'U'}
</AvatarFallback>
</Avatar>
</Button>
@@ -131,9 +133,11 @@ const Header = ({ isLoggedIn = false, user = null }) => {
<DropdownMenuContent className="w-56" align="end" forceMount>
<DropdownMenuLabel className="font-normal">
<div className="flex flex-col space-y-1">
<p className="text-sm font-medium leading-none">{user?.name || 'User'}</p>
<p className="text-sm font-medium leading-none">
{user?.firstName} {user?.lastName}
</p>
<p className="text-xs leading-none text-muted-foreground">
{user?.email || 'user@example.com'}
{user?.email}
</p>
</div>
</DropdownMenuLabel>
@@ -169,7 +173,10 @@ const Header = ({ isLoggedIn = false, user = null }) => {
Settings
</Link>
</DropdownMenuItem>
<DropdownMenuItem className="text-red-600">
<DropdownMenuItem
className="text-red-600 cursor-pointer"
onClick={logout}
>
<LogOut className="mr-2 h-4 w-4" />
Log out
</DropdownMenuItem>
@@ -178,16 +185,19 @@ const Header = ({ isLoggedIn = false, user = null }) => {
</>
) : (
<>
<Link to="/login" className="hidden sm:block">
<Button variant="ghost" className="text-gray-600">
Log in
</Button>
</Link>
<Link to="/signup">
<Button className="gradient-purple text-white hover:opacity-90">
Sign up
</Button>
</Link>
<Button
variant="ghost"
className="hidden sm:block text-gray-600"
onClick={() => openAuthModal('login')}
>
Log in
</Button>
<Button
className="gradient-purple text-white hover:opacity-90"
onClick={() => openAuthModal('signup')}
>
Sign up
</Button>
</>
)}
@@ -241,7 +251,7 @@ const Header = ({ isLoggedIn = false, user = null }) => {
<span className="font-medium">{item.name}</span>
</Link>
))}
{isLoggedIn && (
{isAuthenticated && (
<>
<div className="border-t border-gray-200 my-2" />
<Link


@@ -1,16 +1,59 @@
import React from 'react';
import { Link } from 'react-router-dom';
import React, { useState, useEffect } from 'react';
import { Link, useLocation } from 'react-router-dom';
import { Card, CardContent } from '../ui/card';
import { Badge } from '../ui/badge';
import { Button } from '../ui/button';
import { Heart, Star, MapPin, TrendingDown } from 'lucide-react';
import { Heart, Star, MapPin, TrendingDown, Loader2 } from 'lucide-react';
import { useAuth } from '../../context/AuthContext';
import { addFavorite, removeFavoriteByProduct, checkFavorite } from '../../api/consumer';
import { trackProductClick } from '../../api/client';
const ProductCard = ({
product,
onFavorite,
isFavorite = false,
showDispensaryCount = true
onFavoriteChange,
initialIsFavorite,
showDispensaryCount = true,
pageType = 'browse'
}) => {
const location = useLocation();
const { isAuthenticated, requireAuth, authFetch } = useAuth();
const [isFavorite, setIsFavorite] = useState(initialIsFavorite || false);
const [favoriteLoading, setFavoriteLoading] = useState(false);
// Check favorite status on mount if authenticated
useEffect(() => {
if (isAuthenticated && product?.id && initialIsFavorite === undefined) {
checkFavorite(authFetch, product.id)
.then(data => setIsFavorite(data.isFavorited))
.catch(() => {}); // Ignore errors
}
}, [isAuthenticated, product?.id, authFetch, initialIsFavorite]);
const handleFavoriteClick = async (e) => {
e.preventDefault();
e.stopPropagation();
// If not authenticated, show auth modal with pending action
if (!requireAuth(() => handleFavoriteClick({ preventDefault: () => {}, stopPropagation: () => {} }))) {
return;
}
setFavoriteLoading(true);
try {
if (isFavorite) {
await removeFavoriteByProduct(authFetch, product.id);
setIsFavorite(false);
} else {
await addFavorite(authFetch, product.id, product.dispensaryId);
setIsFavorite(true);
}
onFavoriteChange?.(product.id, !isFavorite);
} catch (error) {
console.error('Failed to update favorite:', error);
} finally {
setFavoriteLoading(false);
}
};
const {
id,
name,
@@ -35,11 +78,24 @@ const ProductCard = ({
hybrid: 'bg-green-100 text-green-800',
};
const savings = onSale && salePrice ? ((price - salePrice) / price * 100).toFixed(0) : 0;
const savings = onSale && salePrice && price ? ((price - salePrice) / price * 100).toFixed(0) : 0;
// Track product click
const handleProductClick = () => {
trackProductClick({
productId: id,
storeId: product.dispensaryId,
brandId: brand,
dispensaryName: product.storeName,
action: 'open_product',
source: 'findagram',
pageType: pageType || location.pathname.split('/')[1] || 'home',
});
};
return (
<Card className="product-card group overflow-hidden">
<Link to={`/products/${id}`}>
<Link to={`/products/${id}`} onClick={handleProductClick}>
{/* Image Container */}
<div className="relative aspect-square overflow-hidden bg-gray-100">
<img
@@ -70,13 +126,14 @@ const ProductCard = ({
className={`absolute top-3 right-3 h-8 w-8 rounded-full bg-white/80 hover:bg-white ${
isFavorite ? 'text-red-500' : 'text-gray-400'
}`}
onClick={(e) => {
e.preventDefault();
e.stopPropagation();
onFavorite?.(id);
}}
onClick={handleFavoriteClick}
disabled={favoriteLoading}
>
<Heart className={`h-4 w-4 ${isFavorite ? 'fill-current' : ''}`} />
{favoriteLoading ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Heart className={`h-4 w-4 ${isFavorite ? 'fill-current' : ''}`} />
)}
</Button>
</div>
</Link>
@@ -88,7 +145,7 @@ const ProductCard = ({
</p>
{/* Product Name */}
<Link to={`/products/${id}`}>
<Link to={`/products/${id}`} onClick={handleProductClick}>
<h3 className="font-semibold text-gray-900 line-clamp-2 hover:text-primary transition-colors mb-2">
{name}
</h3>
@@ -124,27 +181,31 @@ const ProductCard = ({
</div>
)}
{/* Price */}
<div className="flex items-baseline gap-2 mb-3">
{onSale && salePrice ? (
<>
<span className="text-lg font-bold text-pink-600">
${salePrice.toFixed(2)}
{/* Price - only show if we have price data */}
{(price != null || salePrice != null || priceRange != null) && (
<div className="flex items-baseline gap-2 mb-3">
{onSale && salePrice ? (
<>
<span className="text-lg font-bold text-pink-600">
${salePrice.toFixed(2)}
</span>
{price && (
<span className="text-sm text-gray-400 line-through">
${price.toFixed(2)}
</span>
)}
</>
) : priceRange && priceRange.min != null && priceRange.max != null ? (
<span className="text-lg font-bold text-gray-900">
${priceRange.min.toFixed(2)} - ${priceRange.max.toFixed(2)}
</span>
<span className="text-sm text-gray-400 line-through">
) : price != null ? (
<span className="text-lg font-bold text-gray-900">
${price.toFixed(2)}
</span>
</>
) : priceRange ? (
<span className="text-lg font-bold text-gray-900">
${priceRange.min.toFixed(2)} - ${priceRange.max.toFixed(2)}
</span>
) : (
<span className="text-lg font-bold text-gray-900">
${price.toFixed(2)}
</span>
)}
</div>
) : null}
</div>
)}
{/* Dispensary Count */}
{showDispensaryCount && dispensaries.length > 0 && (
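The reworked price block above falls back through salePrice, then priceRange, then price, and renders nothing when none are present. Extracted as a plain function (the helper name is illustrative, not in the PR), the same fallback order looks roughly like this:

// Illustrative helper mirroring ProductCard's price fallback; not part of the diff.
// Returns a display string, or null when the card should omit the price row.
function formatPrice({ onSale, salePrice, price, priceRange }) {
  if (onSale && salePrice != null) {
    return price != null
      ? `$${salePrice.toFixed(2)} (was $${price.toFixed(2)})`
      : `$${salePrice.toFixed(2)}`;
  }
  if (priceRange && priceRange.min != null && priceRange.max != null) {
    return `$${priceRange.min.toFixed(2)} - $${priceRange.max.toFixed(2)}`;
  }
  if (price != null) {
    return `$${price.toFixed(2)}`;
  }
  return null;
}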


@@ -0,0 +1,258 @@
/**
* AuthContext - Global authentication state for Findagram
*
* Manages user login state, JWT token, and provides auth methods.
* Persists auth state in localStorage for session continuity.
*/
import React, { createContext, useContext, useState, useEffect, useCallback } from 'react';
const AuthContext = createContext(null);
const API_BASE_URL = process.env.REACT_APP_API_URL || '';
const STORAGE_KEY = 'findagram_auth';
const DOMAIN = 'findagram.co';
/**
* AuthProvider component - wrap your app with this
*/
export function AuthProvider({ children }) {
const [user, setUser] = useState(null);
const [token, setToken] = useState(null);
const [loading, setLoading] = useState(true);
const [showAuthModal, setShowAuthModal] = useState(false);
const [authModalMode, setAuthModalMode] = useState('login'); // 'login' or 'signup'
const [pendingAction, setPendingAction] = useState(null); // Action to perform after login
// Load auth state from localStorage on mount
useEffect(() => {
const stored = localStorage.getItem(STORAGE_KEY);
if (stored) {
try {
const { user: storedUser, token: storedToken } = JSON.parse(stored);
setUser(storedUser);
setToken(storedToken);
} catch (e) {
console.error('Failed to parse stored auth:', e);
localStorage.removeItem(STORAGE_KEY);
}
}
setLoading(false);
}, []);
// Save auth state to localStorage when it changes
useEffect(() => {
if (user && token) {
localStorage.setItem(STORAGE_KEY, JSON.stringify({ user, token }));
} else {
localStorage.removeItem(STORAGE_KEY);
}
}, [user, token]);
/**
* Make authenticated API request
*/
const authFetch = useCallback(async (endpoint, options = {}) => {
const url = `${API_BASE_URL}${endpoint}`;
const headers = {
'Content-Type': 'application/json',
...options.headers,
};
if (token) {
headers['Authorization'] = `Bearer ${token}`;
}
const response = await fetch(url, { ...options, headers });
// Handle 401 - token expired
if (response.status === 401) {
setUser(null);
setToken(null);
throw new Error('Session expired. Please log in again.');
}
if (!response.ok) {
const error = await response.json().catch(() => ({ error: 'Request failed' }));
throw new Error(error.error || `HTTP ${response.status}`);
}
return response.json();
}, [token]);
/**
* Register a new user
*/
const register = useCallback(async ({ firstName, lastName, email, password, phone, city, state }) => {
const response = await fetch(`${API_BASE_URL}/api/consumer/auth/register`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
firstName,
lastName,
email,
password,
phone,
city,
state,
domain: DOMAIN,
notificationPreference: phone ? 'both' : 'email',
}),
});
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Registration failed');
}
setUser(data.user);
setToken(data.token);
setShowAuthModal(false);
// Execute pending action if any
if (pendingAction) {
setTimeout(() => {
pendingAction();
setPendingAction(null);
}, 100);
}
return data;
}, [pendingAction]);
/**
* Login user
*/
const login = useCallback(async (email, password) => {
const response = await fetch(`${API_BASE_URL}/api/consumer/auth/login`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
email,
password,
domain: DOMAIN,
}),
});
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Login failed');
}
setUser(data.user);
setToken(data.token);
setShowAuthModal(false);
// Execute pending action if any
if (pendingAction) {
setTimeout(() => {
pendingAction();
setPendingAction(null);
}, 100);
}
return data;
}, [pendingAction]);
/**
* Logout user
*/
const logout = useCallback(() => {
setUser(null);
setToken(null);
localStorage.removeItem(STORAGE_KEY);
}, []);
/**
* Update user profile
*/
const updateProfile = useCallback(async (updates) => {
const data = await authFetch('/api/consumer/auth/me', {
method: 'PUT',
body: JSON.stringify(updates),
});
// Refresh user data
const meData = await authFetch('/api/consumer/auth/me');
setUser(meData.user);
return data;
}, [authFetch]);
/**
* Require auth - shows modal if not logged in
* Returns true if authenticated, false if modal was shown
*
* @param {Function} action - Optional action to perform after successful auth
*/
const requireAuth = useCallback((action = null) => {
if (user && token) {
return true;
}
setPendingAction(() => action);
setAuthModalMode('login');
setShowAuthModal(true);
return false;
}, [user, token]);
/**
* Open auth modal in specific mode
*/
const openAuthModal = useCallback((mode = 'login') => {
setAuthModalMode(mode);
setShowAuthModal(true);
}, []);
/**
* Close auth modal
*/
const closeAuthModal = useCallback(() => {
setShowAuthModal(false);
setPendingAction(null);
}, []);
const value = {
// State
user,
token,
loading,
isAuthenticated: !!user && !!token,
showAuthModal,
authModalMode,
// Auth methods
register,
login,
logout,
updateProfile,
authFetch,
// Modal control
requireAuth,
openAuthModal,
closeAuthModal,
setAuthModalMode,
};
return (
<AuthContext.Provider value={value}>
{children}
</AuthContext.Provider>
);
}
/**
* Hook to use auth context
*/
export function useAuth() {
const context = useContext(AuthContext);
if (!context) {
throw new Error('useAuth must be used within an AuthProvider');
}
return context;
}
export default AuthContext;
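A minimal consumer of this context, assuming a hypothetical FavoriteButton component and endpoint path; it shows the intended requireAuth flow: if the user is logged out the modal opens, the action is queued as pendingAction, and it replays once login or registration succeeds.

// Illustrative component (not part of this PR). The endpoint path is an assumption.
import React from 'react';
import { useAuth } from '../../context/AuthContext';

function FavoriteButton({ productId }) {
  const { requireAuth, authFetch } = useAuth();

  const saveFavorite = async () => {
    await authFetch('/api/consumer/favorites', {
      method: 'POST',
      body: JSON.stringify({ productId }),
    });
  };

  const handleClick = () => {
    // Logged out: the modal opens and saveFavorite is queued to run after auth succeeds.
    if (!requireAuth(saveFavorite)) return;
    saveFavorite();
  };

  return <button onClick={handleClick}>Favorite</button>;
}

export default FavoriteButton;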


@@ -0,0 +1,314 @@
import { useState, useEffect, useCallback } from 'react';
// Default location: Phoenix, AZ (fallback if all else fails)
const DEFAULT_LOCATION = {
lat: 33.4484,
lng: -112.0740,
city: 'Phoenix',
state: 'AZ'
};
const LOCATION_STORAGE_KEY = 'findagram_location';
const SESSION_ID_KEY = 'findagram_session_id';
const API_BASE_URL = process.env.REACT_APP_API_URL || '';
/**
* Get or create session ID
*/
function getSessionId() {
let sessionId = sessionStorage.getItem(SESSION_ID_KEY);
if (!sessionId) {
sessionId = `${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
sessionStorage.setItem(SESSION_ID_KEY, sessionId);
}
return sessionId;
}
/**
* Get cached location from sessionStorage
*/
function getCachedLocation() {
try {
const cached = sessionStorage.getItem(LOCATION_STORAGE_KEY);
if (cached) {
return JSON.parse(cached);
}
} catch (err) {
console.error('Error reading cached location:', err);
}
return null;
}
/**
* Save location to sessionStorage
*/
function cacheLocation(location) {
try {
sessionStorage.setItem(LOCATION_STORAGE_KEY, JSON.stringify(location));
} catch (err) {
console.error('Error caching location:', err);
}
}
/**
* Track visitor and get location from our backend API
* This logs the visit for analytics and returns location from IP
*/
async function trackVisitorAndGetLocation() {
try {
const response = await fetch(`${API_BASE_URL}/api/v1/visitor/track`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
domain: 'findagram.co',
page_path: window.location.pathname,
session_id: getSessionId(),
referrer: document.referrer || null,
}),
});
const data = await response.json();
if (data.success && data.location) {
return {
lat: data.location.lat,
lng: data.location.lng,
city: data.location.city,
state: data.location.state,
stateCode: data.location.stateCode,
source: 'api'
};
}
} catch (err) {
console.error('Visitor tracking error:', err);
}
return null;
}
// The visitor-tracking endpoint doubles as the IP-based location lookup used by the
// fallback calls below.
const getLocationFromIP = trackVisitorAndGetLocation;
/**
* Custom hook for getting user's geolocation
*
* @param {Object} options
* @param {boolean} options.autoRequest - Whether to request location automatically on mount
* @param {boolean} options.useIPFallback - Whether to use IP geolocation as fallback (default: true)
* @param {Object} options.defaultLocation - Default location if all methods fail
* @returns {Object} { location, loading, error, requestLocation, hasPermission, locationSource, isDefault, isFromIP, isFromGPS }
*/
export function useGeolocation(options = {}) {
const {
autoRequest = false,
useIPFallback = true,
defaultLocation = DEFAULT_LOCATION
} = options;
const [location, setLocation] = useState(null);
const [loading, setLoading] = useState(false);
const [error, setError] = useState(null);
const [hasPermission, setHasPermission] = useState(null);
const [locationSource, setLocationSource] = useState(null); // 'gps', 'ip', 'api', or 'default'
// Try IP geolocation first (no permission needed)
const getIPLocation = useCallback(async () => {
if (!useIPFallback) return null;
const ipLoc = await getLocationFromIP();
if (ipLoc) {
setLocation(ipLoc);
setLocationSource('ip');
return ipLoc;
}
return null;
}, [useIPFallback]);
// Request precise GPS location (requires permission)
const requestLocation = useCallback(async () => {
setLoading(true);
setError(null);
// First try browser geolocation
if (navigator.geolocation) {
return new Promise(async (resolve) => {
navigator.geolocation.getCurrentPosition(
(position) => {
const { latitude, longitude } = position.coords;
const loc = { lat: latitude, lng: longitude, source: 'gps' };
setLocation(loc);
setLocationSource('gps');
setHasPermission(true);
setLoading(false);
resolve(loc);
},
async (err) => {
console.error('Geolocation error:', err);
if (err.code === err.PERMISSION_DENIED) {
setHasPermission(false);
}
// Fall back to IP geolocation
if (useIPFallback) {
const ipLoc = await getLocationFromIP();
if (ipLoc) {
setLocation(ipLoc);
setLocationSource('ip');
setLoading(false);
resolve(ipLoc);
return;
}
}
// Last resort: default location
setError('Unable to determine location');
setLocation(defaultLocation);
setLocationSource('default');
setLoading(false);
resolve(defaultLocation);
},
{
enableHighAccuracy: false,
timeout: 5000,
maximumAge: 600000 // Cache for 10 minutes
}
);
});
}
// No browser geolocation, try IP
if (useIPFallback) {
const ipLoc = await getLocationFromIP();
if (ipLoc) {
setLocation(ipLoc);
setLocationSource('ip');
setLoading(false);
return ipLoc;
}
}
// Fallback to default
setLocation(defaultLocation);
setLocationSource('default');
setLoading(false);
return defaultLocation;
}, [defaultLocation, useIPFallback]);
// Auto-request location on mount if enabled
useEffect(() => {
if (autoRequest) {
const init = async () => {
// Check for cached location first
const cached = getCachedLocation();
if (cached) {
setLocation(cached);
setLocationSource(cached.source || 'api');
setLoading(false);
return;
}
setLoading(true);
// Track visitor and get location from our backend API
const apiLoc = await trackVisitorAndGetLocation();
if (apiLoc) {
setLocation(apiLoc);
setLocationSource('api');
cacheLocation(apiLoc); // Save for session
} else {
// Fallback to default
setLocation(defaultLocation);
setLocationSource('default');
}
setLoading(false);
};
init();
}
}, [autoRequest, defaultLocation]);
return {
location,
loading,
error,
requestLocation,
hasPermission,
locationSource,
isDefault: locationSource === 'default',
isFromIP: locationSource === 'ip',
isFromGPS: locationSource === 'gps'
};
}
/**
* Calculate distance between two points using Haversine formula
*
* @param {number} lat1 - First point latitude
* @param {number} lng1 - First point longitude
* @param {number} lat2 - Second point latitude
* @param {number} lng2 - Second point longitude
* @returns {number} Distance in miles
*/
export function calculateDistance(lat1, lng1, lat2, lng2) {
const R = 3959; // Earth's radius in miles
const dLat = toRad(lat2 - lat1);
const dLng = toRad(lng2 - lng1);
const a =
Math.sin(dLat / 2) * Math.sin(dLat / 2) +
Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
Math.sin(dLng / 2) * Math.sin(dLng / 2);
const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
return R * c;
}
function toRad(deg) {
return deg * (Math.PI / 180);
}
/**
* Sort items by distance from a location
*
* @param {Array} items - Array of items with location data
* @param {Object} userLocation - User's location { lat, lng }
* @param {Function} getItemLocation - Function to extract lat/lng from item
* @returns {Array} Items sorted by distance with distance property added
*/
export function sortByDistance(items, userLocation, getItemLocation = (item) => item.location) {
if (!userLocation || !items?.length) return items;
return items
.map(item => {
const itemLoc = getItemLocation(item);
if (itemLoc?.latitude == null || itemLoc?.longitude == null) {
return { ...item, distance: null };
}
const distance = calculateDistance(
userLocation.lat,
userLocation.lng,
itemLoc.latitude,
itemLoc.longitude
);
return { ...item, distance: Math.round(distance * 10) / 10 };
})
.sort((a, b) => {
if (a.distance === null) return 1;
if (b.distance === null) return -1;
return a.distance - b.distance;
});
}
/**
* Filter items within a radius from user location
*
* @param {Array} items - Array of items with location data
* @param {Object} userLocation - User's location { lat, lng }
* @param {number} radiusMiles - Max distance in miles
* @param {Function} getItemLocation - Function to extract lat/lng from item
* @returns {Array} Items within radius, sorted by distance
*/
export function filterByRadius(items, userLocation, radiusMiles = 50, getItemLocation = (item) => item.location) {
const sorted = sortByDistance(items, userLocation, getItemLocation);
return sorted.filter(item => item.distance !== null && item.distance <= radiusMiles);
}
export default useGeolocation;
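A sketch of the intended usage, with an assumed component name, import path, and dispensary shape: the hook auto-resolves a location through the visitor-tracking endpoint (falling back to the Phoenix default), and sortByDistance orders results by Haversine distance.

// Illustrative usage (component name, import path, and data shape are assumptions).
import React from 'react';
import { useGeolocation, sortByDistance } from '../hooks/useGeolocation';

function NearbyDispensaries({ dispensaries }) {
  const { location, loading, isDefault } = useGeolocation({ autoRequest: true });

  if (loading || !location) return <p>Finding dispensaries near you...</p>;

  // getItemLocation must return { latitude, longitude } for each item.
  const sorted = sortByDistance(dispensaries, location, (d) => ({
    latitude: d.latitude,
    longitude: d.longitude,
  }));

  return (
    <ul>
      {isDefault && <li>Showing results near Phoenix, AZ (default location)</li>}
      {sorted.map((d) => (
        <li key={d.id}>
          {d.name}
          {d.distance != null ? ` (${d.distance} mi)` : ''}
        </li>
      ))}
    </ul>
  );
}

export default NearbyDispensaries;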


@@ -0,0 +1,363 @@
/**
* localStorage helpers for user data persistence
*
* Manages favorites, price alerts, and saved searches without requiring authentication.
* All data is stored locally in the browser.
*/
const STORAGE_KEYS = {
FAVORITES: 'findagram_favorites',
ALERTS: 'findagram_alerts',
SAVED_SEARCHES: 'findagram_saved_searches',
};
// ============================================================
// FAVORITES
// ============================================================
/**
* Get all favorite product IDs
* @returns {number[]} Array of product IDs
*/
export function getFavorites() {
try {
const data = localStorage.getItem(STORAGE_KEYS.FAVORITES);
return data ? JSON.parse(data) : [];
} catch (e) {
console.error('Error reading favorites:', e);
return [];
}
}
/**
* Check if a product is favorited
* @param {number} productId
* @returns {boolean}
*/
export function isFavorite(productId) {
const favorites = getFavorites();
return favorites.includes(productId);
}
/**
* Add a product to favorites
* @param {number} productId
*/
export function addFavorite(productId) {
const favorites = getFavorites();
if (!favorites.includes(productId)) {
favorites.push(productId);
localStorage.setItem(STORAGE_KEYS.FAVORITES, JSON.stringify(favorites));
}
}
/**
* Remove a product from favorites
* @param {number} productId
*/
export function removeFavorite(productId) {
const favorites = getFavorites();
const updated = favorites.filter(id => id !== productId);
localStorage.setItem(STORAGE_KEYS.FAVORITES, JSON.stringify(updated));
}
/**
* Toggle a product's favorite status
* @param {number} productId
* @returns {boolean} New favorite status
*/
export function toggleFavorite(productId) {
if (isFavorite(productId)) {
removeFavorite(productId);
return false;
} else {
addFavorite(productId);
return true;
}
}
/**
* Clear all favorites
*/
export function clearFavorites() {
localStorage.setItem(STORAGE_KEYS.FAVORITES, JSON.stringify([]));
}
// ============================================================
// PRICE ALERTS
// ============================================================
/**
* @typedef {Object} PriceAlert
* @property {string} id - Unique alert ID
* @property {number} productId - Product ID to track
* @property {string} productName - Product name (for display when offline)
* @property {string} productImage - Product image URL
* @property {string} brandName - Brand name
* @property {number} targetPrice - Target price to alert at
* @property {number} originalPrice - Price when alert was created
* @property {boolean} active - Whether alert is active
* @property {string} createdAt - ISO date string
*/
/**
* Get all price alerts
* @returns {PriceAlert[]}
*/
export function getAlerts() {
try {
const data = localStorage.getItem(STORAGE_KEYS.ALERTS);
return data ? JSON.parse(data) : [];
} catch (e) {
console.error('Error reading alerts:', e);
return [];
}
}
/**
* Get alert for a specific product
* @param {number} productId
* @returns {PriceAlert|null}
*/
export function getAlertForProduct(productId) {
const alerts = getAlerts();
return alerts.find(a => a.productId === productId) || null;
}
/**
* Create a new price alert
* @param {Object} params
* @param {number} params.productId
* @param {string} params.productName
* @param {string} params.productImage
* @param {string} params.brandName
* @param {number} params.targetPrice
* @param {number} params.originalPrice
* @returns {PriceAlert}
*/
export function createAlert({ productId, productName, productImage, brandName, targetPrice, originalPrice }) {
const alerts = getAlerts();
// Check if alert already exists for this product
const existingIndex = alerts.findIndex(a => a.productId === productId);
const alert = {
id: existingIndex >= 0 ? alerts[existingIndex].id : `alert_${Date.now()}`,
productId,
productName,
productImage,
brandName,
targetPrice,
originalPrice,
active: true,
createdAt: existingIndex >= 0 ? alerts[existingIndex].createdAt : new Date().toISOString(),
};
if (existingIndex >= 0) {
alerts[existingIndex] = alert;
} else {
alerts.push(alert);
}
localStorage.setItem(STORAGE_KEYS.ALERTS, JSON.stringify(alerts));
return alert;
}
/**
* Update an existing alert
* @param {string} alertId
* @param {Partial<PriceAlert>} updates
*/
export function updateAlert(alertId, updates) {
const alerts = getAlerts();
const index = alerts.findIndex(a => a.id === alertId);
if (index >= 0) {
alerts[index] = { ...alerts[index], ...updates };
localStorage.setItem(STORAGE_KEYS.ALERTS, JSON.stringify(alerts));
}
}
/**
* Toggle alert active status
* @param {string} alertId
* @returns {boolean} New active status
*/
export function toggleAlertActive(alertId) {
const alerts = getAlerts();
const alert = alerts.find(a => a.id === alertId);
if (alert) {
alert.active = !alert.active;
localStorage.setItem(STORAGE_KEYS.ALERTS, JSON.stringify(alerts));
return alert.active;
}
return false;
}
/**
* Delete an alert
* @param {string} alertId
*/
export function deleteAlert(alertId) {
const alerts = getAlerts();
const updated = alerts.filter(a => a.id !== alertId);
localStorage.setItem(STORAGE_KEYS.ALERTS, JSON.stringify(updated));
}
/**
* Clear all alerts
*/
export function clearAlerts() {
localStorage.setItem(STORAGE_KEYS.ALERTS, JSON.stringify([]));
}
// ============================================================
// SAVED SEARCHES
// ============================================================
/**
* @typedef {Object} SavedSearch
* @property {string} id - Unique search ID
* @property {string} name - User-defined name for the search
* @property {Object} filters - Search filter parameters
* @property {string} [filters.search] - Search term
* @property {string} [filters.type] - Category type
* @property {string} [filters.brandName] - Brand filter
* @property {string} [filters.strainType] - Strain type filter
* @property {number} [filters.priceMax] - Max price filter
* @property {number} [filters.thcMin] - Min THC filter
* @property {string} createdAt - ISO date string
*/
/**
* Get all saved searches
* @returns {SavedSearch[]}
*/
export function getSavedSearches() {
try {
const data = localStorage.getItem(STORAGE_KEYS.SAVED_SEARCHES);
return data ? JSON.parse(data) : [];
} catch (e) {
console.error('Error reading saved searches:', e);
return [];
}
}
/**
* Create a new saved search
* @param {Object} params
* @param {string} params.name - Display name for the search
* @param {Object} params.filters - Search filters
* @returns {SavedSearch}
*/
export function createSavedSearch({ name, filters }) {
const searches = getSavedSearches();
const search = {
id: `search_${Date.now()}`,
name,
filters,
createdAt: new Date().toISOString(),
};
searches.push(search);
localStorage.setItem(STORAGE_KEYS.SAVED_SEARCHES, JSON.stringify(searches));
return search;
}
/**
* Update a saved search
* @param {string} searchId
* @param {Partial<SavedSearch>} updates
*/
export function updateSavedSearch(searchId, updates) {
const searches = getSavedSearches();
const index = searches.findIndex(s => s.id === searchId);
if (index >= 0) {
searches[index] = { ...searches[index], ...updates };
localStorage.setItem(STORAGE_KEYS.SAVED_SEARCHES, JSON.stringify(searches));
}
}
/**
* Delete a saved search
* @param {string} searchId
*/
export function deleteSavedSearch(searchId) {
const searches = getSavedSearches();
const updated = searches.filter(s => s.id !== searchId);
localStorage.setItem(STORAGE_KEYS.SAVED_SEARCHES, JSON.stringify(updated));
}
/**
* Clear all saved searches
*/
export function clearSavedSearches() {
localStorage.setItem(STORAGE_KEYS.SAVED_SEARCHES, JSON.stringify([]));
}
// ============================================================
// UTILITY FUNCTIONS
// ============================================================
/**
* Build a URL with search params from filters
* @param {Object} filters
* @returns {string}
*/
export function buildSearchUrl(filters) {
const params = new URLSearchParams();
Object.entries(filters).forEach(([key, value]) => {
if (value !== undefined && value !== null && value !== '') {
params.set(key, value);
}
});
return `/products?${params.toString()}`;
}
/**
* Generate a name for a search based on its filters
* @param {Object} filters
* @returns {string}
*/
export function generateSearchName(filters) {
const parts = [];
if (filters.search) parts.push(`"${filters.search}"`);
if (filters.type) parts.push(filters.type);
if (filters.brandName) parts.push(filters.brandName);
if (filters.strainType) parts.push(filters.strainType);
if (filters.priceMax) parts.push(`Under $${filters.priceMax}`);
if (filters.thcMin) parts.push(`${filters.thcMin}%+ THC`);
return parts.length > 0 ? parts.join(' - ') : 'All Products';
}
// Default export
const storage = {
// Favorites
getFavorites,
isFavorite,
addFavorite,
removeFavorite,
toggleFavorite,
clearFavorites,
// Alerts
getAlerts,
getAlertForProduct,
createAlert,
updateAlert,
toggleAlertActive,
deleteAlert,
clearAlerts,
// Saved Searches
getSavedSearches,
createSavedSearch,
updateSavedSearch,
deleteSavedSearch,
clearSavedSearches,
// Utilities
buildSearchUrl,
generateSearchName,
};
export default storage;
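A short usage sketch of these helpers; the product values are made up and the import path is assumed. These are the anonymous, local-only counterparts to the authenticated consumer API used elsewhere in this PR.

// Illustrative usage; values and import path are assumptions.
import storage from '../utils/storage';

// Favorites: toggle returns the new status.
const nowFavorited = storage.toggleFavorite(42); // true if it was just added

// Price alerts keep enough product data to render offline.
storage.createAlert({
  productId: 42,
  productName: 'Example Gummies 100mg',
  productImage: '/placeholder-product.jpg',
  brandName: 'Example Brand',
  targetPrice: 15,
  originalPrice: 22,
});

// Saved searches: generateSearchName and buildSearchUrl derive display/URL forms.
const filters = { type: 'edibles', priceMax: 20 };
storage.createSavedSearch({
  name: storage.generateSearchName(filters), // "edibles - Under $20"
  filters,
});
console.log(storage.buildSearchUrl(filters)); // "/products?type=edibles&priceMax=20"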


@@ -1,28 +1,90 @@
import React, { useState } from 'react';
import { Link } from 'react-router-dom';
import React, { useState, useEffect } from 'react';
import { Link, useNavigate } from 'react-router-dom';
import { Button } from '../../components/ui/button';
import { Card, CardContent } from '../../components/ui/card';
import { Badge } from '../../components/ui/badge';
import { mockAlerts, mockProducts } from '../../mockData';
import { Bell, Trash2, Pause, Play, TrendingDown } from 'lucide-react';
import { useAuth } from '../../context/AuthContext';
import { getAlerts, toggleAlert, deleteAlert } from '../../api/consumer';
import { Bell, Trash2, Pause, Play, TrendingDown, Loader2 } from 'lucide-react';
const Alerts = () => {
const [alerts, setAlerts] = useState(mockAlerts);
const { isAuthenticated, authFetch, requireAuth } = useAuth();
const navigate = useNavigate();
const toggleAlert = (alertId) => {
setAlerts((prev) =>
prev.map((alert) =>
alert.id === alertId ? { ...alert, active: !alert.active } : alert
)
const [alerts, setAlerts] = useState([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState(null);
const [togglingId, setTogglingId] = useState(null);
const [deletingId, setDeletingId] = useState(null);
// Redirect to home if not authenticated
useEffect(() => {
if (!isAuthenticated) {
requireAuth(() => navigate('/dashboard/alerts'));
}
}, [isAuthenticated, requireAuth, navigate]);
// Fetch alerts
useEffect(() => {
if (!isAuthenticated) return;
const fetchAlerts = async () => {
setLoading(true);
setError(null);
try {
const data = await getAlerts(authFetch);
setAlerts(data.alerts || []);
} catch (err) {
setError(err.message);
} finally {
setLoading(false);
}
};
fetchAlerts();
}, [isAuthenticated, authFetch]);
const handleToggleAlert = async (alertId) => {
setTogglingId(alertId);
try {
const result = await toggleAlert(authFetch, alertId);
setAlerts(prev =>
prev.map(a => a.id === alertId ? { ...a, isActive: result.isActive } : a)
);
} catch (err) {
setError(err.message);
} finally {
setTogglingId(null);
}
};
const handleDeleteAlert = async (alertId) => {
setDeletingId(alertId);
try {
await deleteAlert(authFetch, alertId);
setAlerts(prev => prev.filter(a => a.id !== alertId));
} catch (err) {
setError(err.message);
} finally {
setDeletingId(null);
}
};
if (!isAuthenticated) {
return null;
}
if (loading) {
return (
<div className="min-h-screen bg-gray-50 flex items-center justify-center">
<Loader2 className="h-8 w-8 animate-spin text-primary" />
</div>
);
};
}
const deleteAlert = (alertId) => {
setAlerts((prev) => prev.filter((alert) => alert.id !== alertId));
};
const activeAlerts = alerts.filter((a) => a.active);
const pausedAlerts = alerts.filter((a) => !a.active);
const activeAlerts = alerts.filter(a => a.isActive);
const pausedAlerts = alerts.filter(a => !a.isActive);
return (
<div className="min-h-screen bg-gray-50">
@@ -50,6 +112,12 @@ const Alerts = () => {
</section>
<div className="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
{error && (
<div className="mb-6 p-4 bg-red-50 border border-red-200 rounded-lg text-red-600">
{error}
</div>
)}
{alerts.length > 0 ? (
<div className="space-y-8">
{/* Active Alerts */}
@@ -61,40 +129,46 @@ const Alerts = () => {
</h2>
<div className="space-y-4">
{activeAlerts.map((alert) => {
const product = mockProducts.find((p) => p.id === alert.productId);
const priceDiff = product ? product.price - alert.targetPrice : 0;
const isTriggered = priceDiff <= 0;
const isTriggered = alert.isTriggered;
return (
<Card key={alert.id} className={isTriggered ? 'border-green-500 bg-green-50' : ''}>
<CardContent className="p-4">
<div className="flex items-center gap-4">
<Link to={`/products/${product?.id}`}>
<Link to={`/products/${alert.productId}`}>
<img
src={product?.image || '/placeholder-product.jpg'}
alt={product?.name}
src={alert.productImage || '/placeholder-product.jpg'}
alt={alert.productName}
className="w-16 h-16 rounded-lg object-cover"
/>
</Link>
<div className="flex-1 min-w-0">
<Link
to={`/products/${product?.id}`}
to={`/products/${alert.productId}`}
className="font-medium text-gray-900 hover:text-primary truncate block"
>
{product?.name}
{alert.productName}
</Link>
<p className="text-sm text-gray-500">{product?.brand}</p>
<p className="text-sm text-gray-500">{alert.productBrand}</p>
<div className="flex items-center gap-4 mt-1">
<span className="text-sm">
Current: <span className="font-medium">${product?.price.toFixed(2)}</span>
Current:{' '}
<span className="font-medium">
{alert.currentPrice
? `$${parseFloat(alert.currentPrice).toFixed(2)}`
: 'N/A'}
</span>
</span>
<span className="text-sm">
Target: <span className="font-medium text-primary">${alert.targetPrice.toFixed(2)}</span>
Target:{' '}
<span className="font-medium text-primary">
${parseFloat(alert.targetPrice).toFixed(2)}
</span>
</span>
</div>
</div>
{isTriggered && (
<Badge variant="success" className="flex items-center gap-1">
<Badge className="bg-green-500 text-white flex items-center gap-1">
<TrendingDown className="h-3 w-3" />
Price Dropped!
</Badge>
@@ -103,19 +177,29 @@ const Alerts = () => {
<Button
variant="ghost"
size="icon"
onClick={() => toggleAlert(alert.id)}
onClick={() => handleToggleAlert(alert.id)}
disabled={togglingId === alert.id}
title="Pause alert"
>
<Pause className="h-4 w-4" />
{togglingId === alert.id ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Pause className="h-4 w-4" />
)}
</Button>
<Button
variant="ghost"
size="icon"
onClick={() => deleteAlert(alert.id)}
onClick={() => handleDeleteAlert(alert.id)}
disabled={deletingId === alert.id}
className="text-red-600 hover:text-red-700 hover:bg-red-50"
title="Delete alert"
>
<Trash2 className="h-4 w-4" />
{deletingId === alert.id ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Trash2 className="h-4 w-4" />
)}
</Button>
</div>
</div>
@@ -135,52 +219,61 @@ const Alerts = () => {
Paused Alerts ({pausedAlerts.length})
</h2>
<div className="space-y-4">
{pausedAlerts.map((alert) => {
const product = mockProducts.find((p) => p.id === alert.productId);
return (
<Card key={alert.id} className="opacity-75">
<CardContent className="p-4">
<div className="flex items-center gap-4">
<img
src={product?.image || '/placeholder-product.jpg'}
alt={product?.name}
className="w-16 h-16 rounded-lg object-cover grayscale"
/>
<div className="flex-1 min-w-0">
<p className="font-medium text-gray-900 truncate">
{product?.name}
</p>
<p className="text-sm text-gray-500">{product?.brand}</p>
<span className="text-sm">
Target: <span className="font-medium">${alert.targetPrice.toFixed(2)}</span>
{pausedAlerts.map((alert) => (
<Card key={alert.id} className="opacity-75">
<CardContent className="p-4">
<div className="flex items-center gap-4">
<img
src={alert.productImage || '/placeholder-product.jpg'}
alt={alert.productName}
className="w-16 h-16 rounded-lg object-cover grayscale"
/>
<div className="flex-1 min-w-0">
<p className="font-medium text-gray-900 truncate">
{alert.productName}
</p>
<p className="text-sm text-gray-500">{alert.productBrand}</p>
<span className="text-sm">
Target:{' '}
<span className="font-medium">
${parseFloat(alert.targetPrice).toFixed(2)}
</span>
</div>
<Badge variant="secondary">Paused</Badge>
<div className="flex items-center gap-2">
<Button
variant="ghost"
size="icon"
onClick={() => toggleAlert(alert.id)}
title="Resume alert"
>
<Play className="h-4 w-4" />
</Button>
<Button
variant="ghost"
size="icon"
onClick={() => deleteAlert(alert.id)}
className="text-red-600 hover:text-red-700 hover:bg-red-50"
title="Delete alert"
>
<Trash2 className="h-4 w-4" />
</Button>
</div>
</span>
</div>
</CardContent>
</Card>
);
})}
<Badge variant="secondary">Paused</Badge>
<div className="flex items-center gap-2">
<Button
variant="ghost"
size="icon"
onClick={() => handleToggleAlert(alert.id)}
disabled={togglingId === alert.id}
title="Resume alert"
>
{togglingId === alert.id ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Play className="h-4 w-4" />
)}
</Button>
<Button
variant="ghost"
size="icon"
onClick={() => handleDeleteAlert(alert.id)}
disabled={deletingId === alert.id}
className="text-red-600 hover:text-red-700 hover:bg-red-50"
title="Delete alert"
>
{deletingId === alert.id ? (
<Loader2 className="h-4 w-4 animate-spin" />
) : (
<Trash2 className="h-4 w-4" />
)}
</Button>
</div>
</div>
</CardContent>
</Card>
))}
</div>
</div>
)}
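The Alerts page imports getAlerts, toggleAlert, and deleteAlert from ../../api/consumer, which is among the files not shown in this diff. A hypothetical sketch of thin wrappers around authFetch is below; the endpoint paths and HTTP methods are assumptions, not confirmed by the PR.

// Hypothetical sketch of ../../api/consumer (file not shown in this diff).
// authFetch comes from AuthContext and already parses the JSON response.
export function getAlerts(authFetch) {
  return authFetch('/api/consumer/alerts'); // assumed path
}

export function toggleAlert(authFetch, alertId) {
  return authFetch(`/api/consumer/alerts/${alertId}/toggle`, { method: 'PUT' }); // assumed
}

export function deleteAlert(authFetch, alertId) {
  return authFetch(`/api/consumer/alerts/${alertId}`, { method: 'DELETE' }); // assumed
}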

Some files were not shown because too many files have changed in this diff.