feat(tasks): Add unified task-based worker architecture

Replace fragmented job systems (job_schedules, dispensary_crawl_jobs, SyncOrchestrator)
with a single unified task queue:

- Add worker_tasks table with atomic task claiming via SELECT FOR UPDATE SKIP LOCKED
- Add TaskService for CRUD, claiming, and capacity metrics
- Add TaskWorker with role-based handlers (resync, discovery, analytics)
- Add /api/tasks endpoints for management and migration from legacy systems
- Add TasksDashboard UI and integrate task counts into main dashboard
- Add comprehensive documentation

Task roles: store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh

Run workers with: WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Kelly
2025-12-09 16:27:03 -07:00
parent 7f9cf559cf
commit 89c262ee20
18 changed files with 3167 additions and 2 deletions


@@ -0,0 +1,400 @@
# Worker Task Architecture
This document describes the unified task-based worker system that replaces the legacy fragmented job systems.
## Overview
The task worker architecture provides a single, unified system for managing all background work in CannaiQ:
- **Store discovery** - Find new dispensaries on platforms
- **Entry point discovery** - Resolve platform IDs from menu URLs
- **Product discovery** - Initial product fetch for new stores
- **Product resync** - Regular price/stock updates for existing stores
- **Analytics refresh** - Refresh materialized views and analytics
## Architecture
### Database Tables
**`worker_tasks`** - Central task queue
```sql
CREATE TABLE worker_tasks (
  id SERIAL PRIMARY KEY,
  role VARCHAR(50) NOT NULL,                -- What type of work
  dispensary_id INTEGER REFERENCES dispensaries(id) ON DELETE CASCADE, -- Which store (if applicable)
  platform VARCHAR(20),                     -- Which platform (dutchie, etc.)
  status VARCHAR(20) NOT NULL DEFAULT 'pending',
  priority INTEGER DEFAULT 0,               -- Higher = process first
  scheduled_for TIMESTAMPTZ,                -- Don't process before this time
  worker_id VARCHAR(100),                   -- Which worker claimed it
  claimed_at TIMESTAMPTZ,
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  last_heartbeat_at TIMESTAMPTZ,            -- For stale detection
  result JSONB,                             -- Output from handler
  error_message TEXT,
  retry_count INTEGER DEFAULT 0,
  max_retries INTEGER DEFAULT 3,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  CONSTRAINT valid_status CHECK (status IN ('pending', 'claimed', 'running', 'completed', 'failed', 'stale'))
);
```
**Key indexes:**
- `idx_worker_tasks_pending` - Partial index on `(role, priority DESC, created_at ASC)` for efficient task claiming
- `idx_worker_tasks_unique_active_store` - Prevents concurrent tasks per store (partial unique index)
### Task Roles
| Role | Purpose | Per-Store | Scheduled |
|------|---------|-----------|-----------|
| `store_discovery` | Find new stores on a platform | No | Daily |
| `entry_point_discovery` | Resolve platform IDs | Yes | On-demand |
| `product_discovery` | Initial product fetch | Yes | After entry_point |
| `product_resync` | Price/stock updates | Yes | Every 4 hours |
| `analytics_refresh` | Refresh MVs | No | Daily |
### Task Lifecycle
```
pending → claimed → running → completed
                       ↓
                    failed
```
1. **pending** - Task is waiting to be picked up
2. **claimed** - Worker has claimed it (atomic via SELECT FOR UPDATE SKIP LOCKED)
3. **running** - Worker is actively processing
4. **completed** - Task finished successfully
5. **failed** - Task encountered an error
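6. **stale** - Task lost its worker (heartbeat older than the threshold); `recover_stale_tasks()` returns it to `pending`, or marks it `failed` once `retry_count` reaches `max_retries`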
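The lifecycle above can be read as a small state machine. The following is a minimal sketch, assuming a transition table derived from this document and the migration; `ALLOWED_TRANSITIONS` and `canTransition` are hypothetical names, and the real `TaskService` may enforce transitions differently.

```typescript
// Status values taken from the valid_status CHECK constraint in migration 074.
type LifecycleStatus = 'pending' | 'claimed' | 'running' | 'completed' | 'failed' | 'stale';

// Hypothetical transition table (an assumption, not the actual TaskService logic).
const ALLOWED_TRANSITIONS: Record<LifecycleStatus, LifecycleStatus[]> = {
  pending: ['claimed'],
  claimed: ['running', 'pending', 'failed'], // pending/failed via stale recovery
  running: ['completed', 'failed', 'pending'],
  completed: [],
  failed: ['pending'], // manual retry
  stale: ['pending', 'failed'],
};

// Returns true when moving from one status to another is allowed by the table.
function canTransition(from: LifecycleStatus, to: LifecycleStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A guard like this could run before each status update so that, for example, a completed task is never moved back to `running`.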
## Files
### Core Files
| File | Purpose |
|------|---------|
| `src/tasks/task-service.ts` | TaskService - CRUD, claiming, capacity metrics |
| `src/tasks/task-worker.ts` | TaskWorker - Main worker loop |
| `src/tasks/index.ts` | Module exports |
| `src/routes/tasks.ts` | API endpoints |
| `migrations/074_worker_task_queue.sql` | Database schema |
### Task Handlers
| File | Role |
|------|------|
| `src/tasks/handlers/store-discovery.ts` | `store_discovery` |
| `src/tasks/handlers/entry-point-discovery.ts` | `entry_point_discovery` |
| `src/tasks/handlers/product-discovery.ts` | `product_discovery` |
| `src/tasks/handlers/product-resync.ts` | `product_resync` |
| `src/tasks/handlers/analytics-refresh.ts` | `analytics_refresh` |
## Running Workers
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `WORKER_ROLE` | (required) | Which task role to process |
| `WORKER_ID` | auto-generated | Custom worker identifier |
| `POLL_INTERVAL_MS` | 5000 | How often to check for tasks |
| `HEARTBEAT_INTERVAL_MS` | 30000 | How often to update heartbeat |
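As an illustration of how these variables might be read, here is a minimal sketch. `readWorkerConfig` and `parsePositiveInt` are hypothetical helpers, not functions from `task-worker.ts`; the auto-generated ID format is inferred from the sample log line `worker-product_resync-a1b2c3d4` and is an assumption.

```typescript
interface WorkerConfig {
  role: string;
  workerId: string;
  pollIntervalMs: number;
  heartbeatIntervalMs: number;
}

// Roles listed in the Task Roles table.
const KNOWN_ROLES = [
  'store_discovery',
  'entry_point_discovery',
  'product_discovery',
  'product_resync',
  'analytics_refresh',
];

// Hypothetical helper: builds a config from an env-like object, applying the documented defaults.
function readWorkerConfig(env: Record<string, string | undefined>): WorkerConfig {
  const role = env.WORKER_ROLE;
  if (!role || KNOWN_ROLES.indexOf(role) === -1) {
    throw new Error(`WORKER_ROLE must be one of: ${KNOWN_ROLES.join(', ')}`);
  }
  // Assumed ID format: worker-<role>-<random hex suffix>.
  const suffix = Math.random().toString(16).slice(2, 10);
  return {
    role,
    workerId: env.WORKER_ID || `worker-${role}-${suffix}`,
    pollIntervalMs: parsePositiveInt(env.POLL_INTERVAL_MS, 5000),
    heartbeatIntervalMs: parsePositiveInt(env.HEARTBEAT_INTERVAL_MS, 30000),
  };
}

// Falls back to the default when the value is missing, non-numeric, or not positive.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = raw === undefined ? NaN : parseInt(raw, 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}
```

In a worker entry point this would typically be called as `readWorkerConfig(process.env)`.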
### Starting a Worker
```bash
# Start a product resync worker
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts
# Start with custom ID
WORKER_ROLE=product_resync WORKER_ID=resync-1 npx tsx src/tasks/task-worker.ts
# Start multiple workers for different roles
WORKER_ROLE=store_discovery npx tsx src/tasks/task-worker.ts &
WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts &
```
### Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: task-worker-resync
spec:
replicas: 3
template:
spec:
containers:
- name: worker
image: code.cannabrands.app/creationshop/dispensary-scraper:latest
command: ["npx", "tsx", "src/tasks/task-worker.ts"]
env:
- name: WORKER_ROLE
value: "product_resync"
```
## API Endpoints
### Task Management
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks` | GET | List tasks with filters |
| `/api/tasks` | POST | Create a new task |
| `/api/tasks/:id` | GET | Get task by ID |
| `/api/tasks/counts` | GET | Get counts by status |
| `/api/tasks/capacity` | GET | Get capacity metrics |
| `/api/tasks/capacity/:role` | GET | Get role-specific capacity |
| `/api/tasks/recover-stale` | POST | Recover tasks from dead workers |
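For the list endpoint, a client needs to assemble the query string from its filters. The sketch below is one way to do that, assuming the query parameters described in `src/routes/tasks.ts` (`role`, comma-separated `status`, `dispensary_id`, `worker_id`, `limit`, `offset`); `buildTaskListPath` is a hypothetical helper.

```typescript
interface TaskListFilter {
  role?: string;
  status?: string[];
  dispensary_id?: number;
  worker_id?: string;
  limit?: number;
  offset?: number;
}

// Hypothetical helper: builds the path and query string for GET /api/tasks.
function buildTaskListPath(filter: TaskListFilter): string {
  const parts: string[] = [];
  const add = (key: string, value: string | number | undefined): void => {
    if (value !== undefined) {
      parts.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  };
  add('role', filter.role);
  // Multiple statuses are sent as one comma-separated value.
  if (filter.status && filter.status.length > 0) {
    add('status', filter.status.join(','));
  }
  add('dispensary_id', filter.dispensary_id);
  add('worker_id', filter.worker_id);
  add('limit', filter.limit);
  add('offset', filter.offset);
  return parts.length > 0 ? `/api/tasks?${parts.join('&')}` : '/api/tasks';
}
```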
### Task Generation
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/generate/resync` | POST | Generate daily resync tasks |
| `/api/tasks/generate/discovery` | POST | Create store discovery task |
### Migration (from legacy systems)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/migration/status` | GET | Compare old vs new systems |
| `/api/tasks/migration/disable-old-schedules` | POST | Disable job_schedules |
| `/api/tasks/migration/cancel-pending-crawl-jobs` | POST | Cancel old crawl jobs |
| `/api/tasks/migration/create-resync-tasks` | POST | Create tasks for all stores |
| `/api/tasks/migration/full-migrate` | POST | One-click migration |
### Role-Specific Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/tasks/role/:role/last-completion` | GET | Last completion time |
| `/api/tasks/role/:role/recent` | GET | Recent completions |
| `/api/tasks/store/:id/active` | GET | Check if store has active task |
## Capacity Planning
The `v_worker_capacity` view provides real-time metrics:
```sql
SELECT * FROM v_worker_capacity;
```
Returns:
- `pending_tasks` - Tasks waiting to be claimed
- `ready_tasks` - Tasks ready now (scheduled_for is null or past)
- `claimed_tasks` - Tasks claimed but not started
- `running_tasks` - Tasks actively processing
- `completed_last_hour` - Recent completions
- `failed_last_hour` - Recent failures
- `active_workers` - Distinct workers currently holding a claimed or running task
- `avg_duration_sec` - Average task duration
- `tasks_per_worker_hour` - Throughput estimate
- `estimated_hours_to_drain` - Time to clear queue
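The two derived metrics follow directly from the view definition in migration 074: throughput is `3600 / avg_duration_sec`, and drain time is `pending_tasks / (active_workers * tasks_per_worker_hour)`. A minimal sketch of the same arithmetic, with hypothetical function names:

```typescript
// Tasks one worker can finish per hour, or null when no duration data is available.
function tasksPerWorkerHour(avgDurationSec: number | null): number | null {
  if (avgDurationSec === null || avgDurationSec <= 0) return null;
  return 3600 / avgDurationSec;
}

// Hours needed to clear the pending queue at current throughput, or null when unknown.
function estimatedHoursToDrain(
  pendingTasks: number,
  activeWorkers: number,
  avgDurationSec: number | null,
): number | null {
  const perWorker = tasksPerWorkerHour(avgDurationSec);
  if (perWorker === null || activeWorkers <= 0) return null;
  return pendingTasks / (activeWorkers * perWorker);
}
```

Like the view, both functions return `null` rather than guessing when there are no recent completions or no active workers.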
### Scaling Recommendations
```javascript
// API: GET /api/tasks/capacity/:role
{
"role": "product_resync",
"pending_tasks": 500,
"active_workers": 3,
"workers_needed": {
"for_1_hour": 10,
"for_4_hours": 3,
"for_8_hours": 2
}
}
```
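The `workers_needed` figures are consistent with dividing the pending queue by what one worker can finish in the target window and rounding up. The sketch below assumes that formula; the actual `calculateWorkersNeeded` implementation is not shown here, and the throughput of 50 tasks per worker-hour is an assumed value chosen because it reproduces the example numbers.

```typescript
// Hypothetical formula: ceil(pending / (tasks per worker-hour * target hours)).
function workersNeeded(
  pendingTasks: number,
  tasksPerWorkerHour: number,
  targetHours: number,
): number {
  if (pendingTasks <= 0) return 0;
  if (tasksPerWorkerHour <= 0 || targetHours <= 0) {
    throw new Error('tasksPerWorkerHour and targetHours must be positive');
  }
  return Math.ceil(pendingTasks / (tasksPerWorkerHour * targetHours));
}

// With 500 pending tasks and an assumed 50 tasks per worker-hour:
const exampleWorkersNeeded = {
  for_1_hour: workersNeeded(500, 50, 1),
  for_4_hours: workersNeeded(500, 50, 4),
  for_8_hours: workersNeeded(500, 50, 8),
};
```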
## Task Chaining
Tasks can automatically create follow-up tasks:
```
store_discovery → entry_point_discovery → product_discovery
                                                 ↓
                                  (store has platform_dispensary_id)
                                                 ↓
                                         Daily resync tasks
```
The `chainNextTask()` method handles this automatically.
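The chain can be pictured as a lookup from the completed role to the follow-up task. This is only a sketch of that idea under stated assumptions: `followUpFor` is a hypothetical function, the real `chainNextTask()` is not shown here, and the priority of 10 is taken from the default priorities listed in this document.

```typescript
type ChainableRole =
  | 'store_discovery'
  | 'entry_point_discovery'
  | 'product_discovery'
  | 'product_resync'
  | 'analytics_refresh';

interface FollowUpTask {
  role: ChainableRole;
  dispensary_id: number;
  priority: number;
}

// dispensaryId is the store the follow-up should target (for store_discovery,
// a newly found store). hasPlatformId says whether the platform ID was resolved.
function followUpFor(
  completedRole: ChainableRole,
  dispensaryId: number | null,
  hasPlatformId: boolean,
): FollowUpTask | null {
  if (dispensaryId === null) return null;
  switch (completedRole) {
    case 'store_discovery':
      return { role: 'entry_point_discovery', dispensary_id: dispensaryId, priority: 10 };
    case 'entry_point_discovery':
      return hasPlatformId
        ? { role: 'product_discovery', dispensary_id: dispensaryId, priority: 10 }
        : null;
    default:
      // product_discovery has no direct follow-up here; stores with a
      // platform ID are picked up by daily resync generation.
      return null;
  }
}
```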
## Stale Task Recovery
Tasks are considered stale if `last_heartbeat_at` is older than the threshold (default 10 minutes).
```sql
SELECT recover_stale_tasks(10); -- 10 minute threshold
```
Or via API:
```bash
# API_URL is a placeholder for the API base URL, which this document does not specify
curl -X POST "$API_URL/api/tasks/recover-stale" \
-H 'Content-Type: application/json' \
-d '{"threshold_minutes": 10}'
```
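The recovery decision in `recover_stale_tasks()` comes down to two checks: is the heartbeat older than the threshold, and are retries left. A minimal sketch of that decision, with hypothetical names and the simplifying assumption that the task always has a heartbeat timestamp:

```typescript
interface ActiveTaskInfo {
  lastHeartbeatAt: Date;
  retryCount: number;
  maxRetries: number;
}

type RecoveryAction = 'healthy' | 'requeue' | 'fail';

// Mirrors the SQL logic: stale tasks go back to pending while retries remain,
// otherwise they are marked failed.
function classifyActiveTask(
  task: ActiveTaskInfo,
  now: Date,
  thresholdMinutes: number = 10,
): RecoveryAction {
  const cutoffMs = now.getTime() - thresholdMinutes * 60 * 1000;
  if (task.lastHeartbeatAt.getTime() >= cutoffMs) return 'healthy';
  return task.retryCount < task.maxRetries ? 'requeue' : 'fail';
}
```

With the default `HEARTBEAT_INTERVAL_MS` of 30 seconds, a 10-minute threshold tolerates roughly 20 missed heartbeats before a task is treated as stale.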
## Migration from Legacy Systems
### Legacy Systems Replaced
1. **job_schedules + job_run_logs** - Scheduled job definitions
2. **dispensary_crawl_jobs** - Per-dispensary crawl queue
3. **SyncOrchestrator + HydrationWorker** - Raw payload processing
### Migration Steps
**Option 1: One-Click Migration**
```bash
# API_URL is a placeholder for the API base URL, which this document does not specify
curl -X POST "$API_URL/api/tasks/migration/full-migrate"
```
This will:
1. Disable all job_schedules
2. Cancel pending dispensary_crawl_jobs
3. Generate resync tasks for all stores
4. Create discovery and analytics tasks
**Option 2: Manual Migration**
```bash
# API_URL is a placeholder for the API base URL, which this document does not specify
# 1. Check current status
curl "$API_URL/api/tasks/migration/status"
# 2. Disable old schedules
curl -X POST "$API_URL/api/tasks/migration/disable-old-schedules"
# 3. Cancel pending crawl jobs
curl -X POST "$API_URL/api/tasks/migration/cancel-pending-crawl-jobs"
# 4. Create resync tasks
curl -X POST "$API_URL/api/tasks/migration/create-resync-tasks" \
-H 'Content-Type: application/json' \
-d '{"state_code": "AZ"}'
# 5. Generate daily resync schedule
curl -X POST "$API_URL/api/tasks/generate/resync" \
-H 'Content-Type: application/json' \
-d '{"batches_per_day": 6}'
```
## Per-Store Locking
The system prevents concurrent tasks for the same store using a partial unique index:
```sql
CREATE UNIQUE INDEX idx_worker_tasks_unique_active_store
ON worker_tasks (dispensary_id)
WHERE dispensary_id IS NOT NULL
AND status IN ('claimed', 'running');
```
This ensures only one task can be active per store at any time.
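To make the rule concrete, here is a small in-memory sketch of the same check the index enforces: a task may become active only if no other task for the same store is already `claimed` or `running`. `canActivate` is a hypothetical helper for illustration; in the real system the database enforces this.

```typescript
interface StoreTask {
  id: number;
  dispensaryId: number | null;
  status: 'pending' | 'claimed' | 'running' | 'completed' | 'failed' | 'stale';
}

// A task counts as active while it is claimed or running.
function isActiveStoreTask(task: StoreTask): boolean {
  return task.status === 'claimed' || task.status === 'running';
}

// Tasks without a store (dispensaryId === null) are never blocked,
// matching the "dispensary_id IS NOT NULL" condition of the partial index.
function canActivate(tasks: StoreTask[], candidate: StoreTask): boolean {
  if (candidate.dispensaryId === null) return true;
  return !tasks.some(
    (t) =>
      t.id !== candidate.id &&
      isActiveStoreTask(t) &&
      t.dispensaryId === candidate.dispensaryId,
  );
}
```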
## Task Priority
Tasks are claimed in priority order (higher first), then by creation time:
```sql
ORDER BY priority DESC, created_at ASC
```
Default priorities:
- `store_discovery`: 0
- `entry_point_discovery`: 10 (high - new stores)
- `product_discovery`: 10 (high - new stores)
- `product_resync`: 0
- `analytics_refresh`: 0
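The same ordering can be expressed as a comparator, which may be useful when sorting tasks client-side (for example in a dashboard). This is a sketch with hypothetical names; the queue itself orders tasks in SQL.

```typescript
interface QueuedTask {
  id: number;
  priority: number;
  createdAt: Date;
}

// Higher priority first; ties broken by earlier creation time.
function byClaimOrder(a: QueuedTask, b: QueuedTask): number {
  if (a.priority !== b.priority) return b.priority - a.priority;
  return a.createdAt.getTime() - b.createdAt.getTime();
}

const sampleQueue: QueuedTask[] = [
  { id: 1, priority: 0, createdAt: new Date('2025-01-10T06:00:00Z') },
  { id: 2, priority: 10, createdAt: new Date('2025-01-10T07:00:00Z') },
  { id: 3, priority: 10, createdAt: new Date('2025-01-10T06:30:00Z') },
];

// Copy before sorting so the original array is left untouched.
const claimOrderIds = sampleQueue.slice().sort(byClaimOrder).map((t) => t.id);
```

Here the two priority-10 tasks come first, oldest first, followed by the priority-0 task.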
## Scheduled Tasks
Tasks can be scheduled for future execution:
```javascript
await taskService.createTask({
role: 'product_resync',
dispensary_id: 123,
scheduled_for: new Date('2025-01-10T06:00:00Z'),
});
```
The `generate_resync_tasks()` function creates staggered tasks throughout the day:
```sql
SELECT generate_resync_tasks(6, '2025-01-10'); -- 6 batches = every 4 hours
```
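The staggering can be sketched as two small calculations: when each batch starts, and which batch a store lands in. The version below is an illustration under assumptions, not the SQL function itself: it spaces batches evenly across 24 hours (which equals the 4-hour spacing for the default of 6 batches), treats the date as UTC, and uses hypothetical function names.

```typescript
// Start time of each batch, spaced evenly across the given UTC day.
function batchStartTimes(dateIso: string, batchesPerDay: number): Date[] {
  if (!Number.isInteger(batchesPerDay) || batchesPerDay < 1) {
    throw new Error('batchesPerDay must be a positive integer');
  }
  const midnight = new Date(`${dateIso}T00:00:00Z`);
  if (Number.isNaN(midnight.getTime())) {
    throw new Error('dateIso must look like YYYY-MM-DD');
  }
  const stepMs = (24 * 60 * 60 * 1000) / batchesPerDay;
  const times: Date[] = [];
  for (let i = 0; i < batchesPerDay; i++) {
    times.push(new Date(midnight.getTime() + i * stepMs));
  }
  return times;
}

// Batch index for a store at a zero-based position in the id-ordered store list,
// using the same ceil(store_count / batches) batch size as the SQL function.
function batchIndexFor(storePosition: number, storeCount: number, batchesPerDay: number): number {
  const storesPerBatch = Math.ceil(storeCount / batchesPerDay);
  return Math.floor(storePosition / storesPerBatch);
}
```

Because the batch size is rounded up, the last batches can be smaller than the others or empty when the store count does not divide evenly.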
## Dashboard Integration
The admin dashboard shows task queue status in the main overview:
```
Task Queue Summary
------------------
Pending: 45
Running: 3
Completed: 1,234
Failed: 12
```
Full task management is available at `/admin/tasks`.
## Error Handling
Failed tasks include the error message in `error_message` and can be retried:
```sql
-- View failed tasks
SELECT id, role, dispensary_id, error_message, retry_count
FROM worker_tasks
WHERE status = 'failed'
ORDER BY completed_at DESC
LIMIT 20;
-- Retry failed tasks
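UPDATE worker_tasks
SET status = 'pending',
    worker_id = NULL,      -- release ownership so any worker can claim it
    claimed_at = NULL,
    started_at = NULL,
    completed_at = NULL,
    retry_count = retry_count + 1
WHERE status = 'failed' AND retry_count < max_retries;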
```
## Monitoring
### Logs
Workers log to stdout:
```
[TaskWorker] Starting worker worker-product_resync-a1b2c3d4 for role: product_resync
[TaskWorker] Claimed task 123 (product_resync) for dispensary 456
[TaskWorker] Task 123 completed successfully
```
### Health Check
Check if workers are active:
```sql
SELECT worker_id, role, COUNT(*), MAX(last_heartbeat_at)
FROM worker_tasks
WHERE last_heartbeat_at > NOW() - INTERVAL '5 minutes'
GROUP BY worker_id, role;
```
### Metrics
```sql
-- Tasks by status
SELECT status, COUNT(*) FROM worker_tasks GROUP BY status;
-- Tasks by role
SELECT role, status, COUNT(*) FROM worker_tasks GROUP BY role, status;
-- Average duration by role
SELECT role, AVG(EXTRACT(EPOCH FROM (completed_at - started_at))) as avg_seconds
FROM worker_tasks
WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '24 hours'
GROUP BY role;
```


@@ -0,0 +1,322 @@
-- Migration 074: Worker Task Queue System
-- Implements role-based task queue with per-store locking and capacity tracking
-- Task queue table
CREATE TABLE IF NOT EXISTS worker_tasks (
id SERIAL PRIMARY KEY,
-- Task identification
role VARCHAR(50) NOT NULL, -- store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh
dispensary_id INTEGER REFERENCES dispensaries(id) ON DELETE CASCADE,
platform VARCHAR(20), -- dutchie, jane, treez, etc.
-- Task state
status VARCHAR(20) NOT NULL DEFAULT 'pending',
priority INTEGER DEFAULT 0, -- Higher = more urgent
-- Scheduling
scheduled_for TIMESTAMPTZ, -- For batch scheduling (e.g., every 4 hours)
-- Ownership
worker_id VARCHAR(100), -- Pod name or worker ID
claimed_at TIMESTAMPTZ,
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
last_heartbeat_at TIMESTAMPTZ,
-- Results
result JSONB, -- Task output data
error_message TEXT,
retry_count INTEGER DEFAULT 0,
max_retries INTEGER DEFAULT 3,
-- Metadata
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
-- Constraints
CONSTRAINT valid_status CHECK (status IN ('pending', 'claimed', 'running', 'completed', 'failed', 'stale'))
);
-- Indexes for efficient task claiming
CREATE INDEX IF NOT EXISTS idx_worker_tasks_pending
ON worker_tasks(role, priority DESC, created_at ASC)
WHERE status = 'pending';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_claimed
ON worker_tasks(worker_id, claimed_at)
WHERE status = 'claimed';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_running
ON worker_tasks(worker_id, last_heartbeat_at)
WHERE status = 'running';
CREATE INDEX IF NOT EXISTS idx_worker_tasks_dispensary
ON worker_tasks(dispensary_id)
WHERE dispensary_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_worker_tasks_scheduled
ON worker_tasks(scheduled_for)
WHERE status = 'pending' AND scheduled_for IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_worker_tasks_history
ON worker_tasks(role, completed_at DESC)
WHERE status IN ('completed', 'failed');
-- Partial unique index to prevent duplicate active tasks per store
-- Only one task can be claimed/running for a given dispensary at a time
CREATE UNIQUE INDEX IF NOT EXISTS idx_worker_tasks_unique_active_store
ON worker_tasks(dispensary_id)
WHERE status IN ('claimed', 'running') AND dispensary_id IS NOT NULL;
-- Worker registration table (tracks active workers)
CREATE TABLE IF NOT EXISTS worker_registry (
id SERIAL PRIMARY KEY,
worker_id VARCHAR(100) UNIQUE NOT NULL,
role VARCHAR(50) NOT NULL,
pod_name VARCHAR(100),
hostname VARCHAR(100),
started_at TIMESTAMPTZ DEFAULT NOW(),
last_heartbeat_at TIMESTAMPTZ DEFAULT NOW(),
tasks_completed INTEGER DEFAULT 0,
tasks_failed INTEGER DEFAULT 0,
status VARCHAR(20) DEFAULT 'active',
CONSTRAINT valid_worker_status CHECK (status IN ('active', 'idle', 'offline'))
);
CREATE INDEX IF NOT EXISTS idx_worker_registry_role
ON worker_registry(role, status);
CREATE INDEX IF NOT EXISTS idx_worker_registry_heartbeat
ON worker_registry(last_heartbeat_at)
WHERE status = 'active';
-- Task completion tracking (summarized history)
CREATE TABLE IF NOT EXISTS task_completion_log (
id SERIAL PRIMARY KEY,
role VARCHAR(50) NOT NULL,
date DATE NOT NULL DEFAULT CURRENT_DATE,
hour INTEGER NOT NULL DEFAULT EXTRACT(HOUR FROM NOW()),
tasks_created INTEGER DEFAULT 0,
tasks_completed INTEGER DEFAULT 0,
tasks_failed INTEGER DEFAULT 0,
avg_duration_sec NUMERIC(10,2),
min_duration_sec NUMERIC(10,2),
max_duration_sec NUMERIC(10,2),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(role, date, hour)
);
-- Capacity planning view
CREATE OR REPLACE VIEW v_worker_capacity AS
SELECT
role,
COUNT(*) FILTER (WHERE status = 'pending') as pending_tasks,
COUNT(*) FILTER (WHERE status = 'pending' AND (scheduled_for IS NULL OR scheduled_for <= NOW())) as ready_tasks,
COUNT(*) FILTER (WHERE status = 'claimed') as claimed_tasks,
COUNT(*) FILTER (WHERE status = 'running') as running_tasks,
COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') as completed_last_hour,
COUNT(*) FILTER (WHERE status = 'failed' AND completed_at > NOW() - INTERVAL '1 hour') as failed_last_hour,
COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) as active_workers,
AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') as avg_duration_sec,
-- Capacity planning metrics
CASE
WHEN COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') > 0
THEN 3600.0 / NULLIF(AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour'), 0)
ELSE NULL
END as tasks_per_worker_hour,
-- Estimated time to drain queue
CASE
WHEN COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) > 0
AND COUNT(*) FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour') > 0
THEN COUNT(*) FILTER (WHERE status = 'pending') / NULLIF(
COUNT(DISTINCT worker_id) FILTER (WHERE status IN ('claimed', 'running')) *
(3600.0 / NULLIF(AVG(EXTRACT(EPOCH FROM (completed_at - started_at)))
FILTER (WHERE status = 'completed' AND completed_at > NOW() - INTERVAL '1 hour'), 0)),
0
)
ELSE NULL
END as estimated_hours_to_drain
FROM worker_tasks
GROUP BY role;
-- Task history view (for UI)
CREATE OR REPLACE VIEW v_task_history AS
SELECT
t.id,
t.role,
t.dispensary_id,
d.name as dispensary_name,
t.platform,
t.status,
t.priority,
t.worker_id,
t.scheduled_for,
t.claimed_at,
t.started_at,
t.completed_at,
t.error_message,
t.retry_count,
t.created_at,
EXTRACT(EPOCH FROM (t.completed_at - t.started_at)) as duration_sec
FROM worker_tasks t
LEFT JOIN dispensaries d ON d.id = t.dispensary_id
ORDER BY t.created_at DESC;
-- Function to claim a task atomically
CREATE OR REPLACE FUNCTION claim_task(
p_role VARCHAR(50),
p_worker_id VARCHAR(100)
) RETURNS worker_tasks AS $$
DECLARE
claimed_task worker_tasks;
BEGIN
UPDATE worker_tasks
SET
status = 'claimed',
worker_id = p_worker_id,
claimed_at = NOW(),
updated_at = NOW()
WHERE id = (
SELECT id FROM worker_tasks
WHERE role = p_role
AND status = 'pending'
AND (scheduled_for IS NULL OR scheduled_for <= NOW())
-- Exclude stores that already have an active task
AND (dispensary_id IS NULL OR dispensary_id NOT IN (
SELECT dispensary_id FROM worker_tasks
WHERE status IN ('claimed', 'running')
AND dispensary_id IS NOT NULL
))
ORDER BY priority DESC, created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING * INTO claimed_task;
RETURN claimed_task;
END;
$$ LANGUAGE plpgsql;
-- Function to mark stale tasks (workers that died)
CREATE OR REPLACE FUNCTION recover_stale_tasks(
stale_threshold_minutes INTEGER DEFAULT 10
) RETURNS INTEGER AS $$
DECLARE
recovered_count INTEGER;
BEGIN
WITH stale AS (
UPDATE worker_tasks
SET
status = 'pending',
worker_id = NULL,
claimed_at = NULL,
started_at = NULL,
retry_count = retry_count + 1,
updated_at = NOW()
WHERE status IN ('claimed', 'running')
-- Fall back to claimed_at so tasks that never sent a heartbeat are still recovered
AND COALESCE(last_heartbeat_at, claimed_at) < NOW() - (stale_threshold_minutes || ' minutes')::INTERVAL
AND retry_count < max_retries
RETURNING id
)
SELECT COUNT(*) INTO recovered_count FROM stale;
-- Mark tasks that exceeded retries as failed
UPDATE worker_tasks
SET
status = 'failed',
error_message = 'Exceeded max retries after worker failures',
completed_at = NOW(),
updated_at = NOW()
WHERE status IN ('claimed', 'running')
AND COALESCE(last_heartbeat_at, claimed_at) < NOW() - (stale_threshold_minutes || ' minutes')::INTERVAL
AND retry_count >= max_retries;
RETURN recovered_count;
END;
$$ LANGUAGE plpgsql;
-- Function to generate daily resync tasks
CREATE OR REPLACE FUNCTION generate_resync_tasks(
p_batches_per_day INTEGER DEFAULT 6, -- Every 4 hours
p_date DATE DEFAULT CURRENT_DATE
) RETURNS INTEGER AS $$
DECLARE
  store_count INTEGER;
  stores_per_batch INTEGER;
  batch_num INTEGER;
  batch_count INTEGER;
  scheduled_time TIMESTAMPTZ;
  created_count INTEGER := 0;
BEGIN
  -- Count active stores that need resync
  SELECT COUNT(*) INTO store_count
  FROM dispensaries
  WHERE crawl_enabled = true
    AND menu_type = 'dutchie'
    AND platform_dispensary_id IS NOT NULL;
  IF store_count = 0 THEN
    RETURN 0;
  END IF;
  stores_per_batch := CEIL(store_count::NUMERIC / p_batches_per_day);
  FOR batch_num IN 0..(p_batches_per_day - 1) LOOP
    -- Spread batches evenly across the day (4 hours apart for the default of 6)
    scheduled_time := p_date + batch_num * (INTERVAL '24 hours' / p_batches_per_day);
    INSERT INTO worker_tasks (role, dispensary_id, platform, scheduled_for, priority)
    SELECT
      'product_resync',
      d.id,
      'dutchie',
      scheduled_time,
      0
    FROM (
      SELECT id, ROW_NUMBER() OVER (ORDER BY id) as rn
      FROM dispensaries
      WHERE crawl_enabled = true
        AND menu_type = 'dutchie'
        AND platform_dispensary_id IS NOT NULL
    ) d
    WHERE d.rn > (batch_num * stores_per_batch)
      AND d.rn <= ((batch_num + 1) * stores_per_batch)
    ON CONFLICT DO NOTHING;
    -- GET DIAGNOSTICS only accepts a plain item, so accumulate in a separate step
    GET DIAGNOSTICS batch_count = ROW_COUNT;
    created_count := created_count + batch_count;
  END LOOP;
  RETURN created_count;
END;
$$ LANGUAGE plpgsql;
-- Trigger to update timestamp
CREATE OR REPLACE FUNCTION update_worker_tasks_timestamp()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS worker_tasks_updated_at ON worker_tasks;
CREATE TRIGGER worker_tasks_updated_at
BEFORE UPDATE ON worker_tasks
FOR EACH ROW
EXECUTE FUNCTION update_worker_tasks_timestamp();
-- Comments
COMMENT ON TABLE worker_tasks IS 'Central task queue for all worker roles';
COMMENT ON TABLE worker_registry IS 'Registry of active workers and their stats';
COMMENT ON TABLE task_completion_log IS 'Hourly aggregated task completion metrics';
COMMENT ON VIEW v_worker_capacity IS 'Real-time capacity planning metrics per role';
COMMENT ON VIEW v_task_history IS 'Task history with dispensary details for UI';
COMMENT ON FUNCTION claim_task IS 'Atomically claim a task for a worker, respecting per-store locking';
COMMENT ON FUNCTION recover_stale_tasks IS 'Release tasks from dead workers back to pending';
COMMENT ON FUNCTION generate_resync_tasks IS 'Generate daily product resync tasks in batches';


@@ -139,6 +139,7 @@ import eventsRoutes from './routes/events';
import clickAnalyticsRoutes from './routes/click-analytics';
import seoRoutes from './routes/seo';
import priceAnalyticsRoutes from './routes/price-analytics';
import tasksRoutes from './routes/tasks';
// Mark requests from trusted domains (cannaiq.co, findagram.co, findadispo.com)
// These domains can access the API without authentication
@@ -211,6 +212,10 @@ app.use('/api/monitor', workersRoutes);
app.use('/api/job-queue', jobQueueRoutes);
console.log('[Workers] Routes registered at /api/workers, /api/monitor, and /api/job-queue');
// Task queue management - worker tasks with capacity planning
app.use('/api/tasks', tasksRoutes);
console.log('[Tasks] Routes registered at /api/tasks');
// Phase 3: Analytics V2 - Enhanced analytics with rec/med state segmentation
try {
const analyticsV2Router = createAnalyticsV2Router(getPool());

backend/src/routes/tasks.ts Normal file

@@ -0,0 +1,565 @@
/**
* Task Queue API Routes
*
* Endpoints for managing worker tasks, viewing capacity metrics,
* and generating batch tasks.
*/
import { Router, Request, Response } from 'express';
import {
taskService,
TaskRole,
TaskStatus,
TaskFilter,
} from '../tasks/task-service';
import { pool } from '../db/pool';
const router = Router();
/**
* GET /api/tasks
* List tasks with optional filters
*
* Query params:
* - role: Filter by role
* - status: Filter by status (comma-separated for multiple)
* - dispensary_id: Filter by dispensary
* - worker_id: Filter by worker
* - limit: Max results (default 100)
* - offset: Pagination offset
*/
router.get('/', async (req: Request, res: Response) => {
try {
const filter: TaskFilter = {};
if (req.query.role) {
filter.role = req.query.role as TaskRole;
}
if (req.query.status) {
const statuses = (req.query.status as string).split(',') as TaskStatus[];
filter.status = statuses.length === 1 ? statuses[0] : statuses;
}
if (req.query.dispensary_id) {
filter.dispensary_id = parseInt(req.query.dispensary_id as string, 10);
}
if (req.query.worker_id) {
filter.worker_id = req.query.worker_id as string;
}
if (req.query.limit) {
filter.limit = parseInt(req.query.limit as string, 10);
}
if (req.query.offset) {
filter.offset = parseInt(req.query.offset as string, 10);
}
const tasks = await taskService.listTasks(filter);
res.json({ tasks, count: tasks.length });
} catch (error: unknown) {
console.error('Error listing tasks:', error);
res.status(500).json({ error: 'Failed to list tasks' });
}
});
/**
* GET /api/tasks/counts
* Get task counts by status
*/
router.get('/counts', async (_req: Request, res: Response) => {
try {
const counts = await taskService.getTaskCounts();
res.json(counts);
} catch (error: unknown) {
console.error('Error getting task counts:', error);
res.status(500).json({ error: 'Failed to get task counts' });
}
});
/**
* GET /api/tasks/capacity
* Get capacity metrics for all roles
*/
router.get('/capacity', async (_req: Request, res: Response) => {
try {
const metrics = await taskService.getCapacityMetrics();
res.json({ metrics });
} catch (error: unknown) {
console.error('Error getting capacity metrics:', error);
res.status(500).json({ error: 'Failed to get capacity metrics' });
}
});
/**
* GET /api/tasks/capacity/:role
* Get capacity metrics for a specific role
*/
router.get('/capacity/:role', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const capacity = await taskService.getRoleCapacity(role);
if (!capacity) {
return res.status(404).json({ error: 'Role not found or no data' });
}
// Calculate workers needed for different SLAs
const workersFor1Hour = await taskService.calculateWorkersNeeded(role, 1);
const workersFor4Hours = await taskService.calculateWorkersNeeded(role, 4);
const workersFor8Hours = await taskService.calculateWorkersNeeded(role, 8);
res.json({
...capacity,
workers_needed: {
for_1_hour: workersFor1Hour,
for_4_hours: workersFor4Hours,
for_8_hours: workersFor8Hours,
},
});
} catch (error: unknown) {
console.error('Error getting role capacity:', error);
res.status(500).json({ error: 'Failed to get role capacity' });
}
});
/**
* GET /api/tasks/:id
* Get a specific task by ID
*/
router.get('/:id', async (req: Request, res: Response) => {
try {
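const taskId = parseInt(req.params.id, 10);
// Reject non-numeric IDs up front instead of passing NaN to the query
if (Number.isNaN(taskId)) {
return res.status(400).json({ error: 'Invalid task ID' });
}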
const task = await taskService.getTask(taskId);
if (!task) {
return res.status(404).json({ error: 'Task not found' });
}
res.json(task);
} catch (error: unknown) {
console.error('Error getting task:', error);
res.status(500).json({ error: 'Failed to get task' });
}
});
/**
* POST /api/tasks
* Create a new task
*
* Body:
* - role: TaskRole (required)
* - dispensary_id: number (optional)
* - platform: string (optional)
* - priority: number (optional, default 0)
* - scheduled_for: ISO date string (optional)
*/
router.post('/', async (req: Request, res: Response) => {
try {
const { role, dispensary_id, platform, priority, scheduled_for } = req.body;
if (!role) {
return res.status(400).json({ error: 'Role is required' });
}
// Check if store already has an active task
if (dispensary_id) {
const hasActive = await taskService.hasActiveTask(dispensary_id);
if (hasActive) {
return res.status(409).json({
error: 'Store already has an active task',
dispensary_id,
});
}
}
const task = await taskService.createTask({
role,
dispensary_id,
platform,
priority,
scheduled_for: scheduled_for ? new Date(scheduled_for) : undefined,
});
res.status(201).json(task);
} catch (error: unknown) {
console.error('Error creating task:', error);
res.status(500).json({ error: 'Failed to create task' });
}
});
/**
* POST /api/tasks/generate/resync
* Generate daily resync tasks for all active stores
*
* Body:
* - batches_per_day: number (optional, default 6 = every 4 hours)
* - date: ISO date string (optional, default today)
*/
router.post('/generate/resync', async (req: Request, res: Response) => {
try {
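const { batches_per_day, date } = req.body;
const batchesPerDay = batches_per_day ?? 6;
// A zero or negative batch count would cause a division by zero in generate_resync_tasks()
if (!Number.isInteger(batchesPerDay) || batchesPerDay < 1) {
return res.status(400).json({ error: 'batches_per_day must be a positive integer' });
}
const targetDate = date ? new Date(date) : new Date();
// An unparseable date would otherwise throw later in toISOString()
if (Number.isNaN(targetDate.getTime())) {
return res.status(400).json({ error: 'date must be a valid ISO date string' });
}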
const createdCount = await taskService.generateDailyResyncTasks(
batchesPerDay,
targetDate
);
res.json({
success: true,
tasks_created: createdCount,
batches_per_day: batchesPerDay,
date: targetDate.toISOString().split('T')[0],
});
} catch (error: unknown) {
console.error('Error generating resync tasks:', error);
res.status(500).json({ error: 'Failed to generate resync tasks' });
}
});
/**
* POST /api/tasks/generate/discovery
* Generate store discovery tasks for a platform
*
* Body:
* - platform: string (required, e.g., 'dutchie')
* - state_code: string (optional, e.g., 'AZ')
* - priority: number (optional)
*/
router.post('/generate/discovery', async (req: Request, res: Response) => {
try {
const { platform, state_code, priority } = req.body;
if (!platform) {
return res.status(400).json({ error: 'Platform is required' });
}
const task = await taskService.createStoreDiscoveryTask(
platform,
state_code,
priority ?? 0
);
res.status(201).json(task);
} catch (error: unknown) {
console.error('Error creating discovery task:', error);
res.status(500).json({ error: 'Failed to create discovery task' });
}
});
/**
* POST /api/tasks/recover-stale
* Recover stale tasks from dead workers
*
* Body:
* - threshold_minutes: number (optional, default 10)
*/
router.post('/recover-stale', async (req: Request, res: Response) => {
try {
const { threshold_minutes } = req.body;
const recovered = await taskService.recoverStaleTasks(threshold_minutes ?? 10);
res.json({
success: true,
tasks_recovered: recovered,
});
} catch (error: unknown) {
console.error('Error recovering stale tasks:', error);
res.status(500).json({ error: 'Failed to recover stale tasks' });
}
});
/**
* GET /api/tasks/role/:role/last-completion
* Get the last completion time for a role
*/
router.get('/role/:role/last-completion', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const lastCompletion = await taskService.getLastCompletion(role);
res.json({
role,
last_completion: lastCompletion?.toISOString() ?? null,
time_since: lastCompletion
? Math.floor((Date.now() - lastCompletion.getTime()) / 1000)
: null,
});
} catch (error: unknown) {
console.error('Error getting last completion:', error);
res.status(500).json({ error: 'Failed to get last completion' });
}
});
/**
* GET /api/tasks/role/:role/recent
* Get recent completions for a role
*/
router.get('/role/:role/recent', async (req: Request, res: Response) => {
try {
const role = req.params.role as TaskRole;
const limit = parseInt(req.query.limit as string, 10) || 10;
const tasks = await taskService.getRecentCompletions(role, limit);
res.json({ tasks });
} catch (error: unknown) {
console.error('Error getting recent completions:', error);
res.status(500).json({ error: 'Failed to get recent completions' });
}
});
/**
* GET /api/tasks/store/:dispensaryId/active
* Check if a store has an active task
*/
router.get('/store/:dispensaryId/active', async (req: Request, res: Response) => {
try {
const dispensaryId = parseInt(req.params.dispensaryId, 10);
const hasActive = await taskService.hasActiveTask(dispensaryId);
res.json({
dispensary_id: dispensaryId,
has_active_task: hasActive,
});
} catch (error: unknown) {
console.error('Error checking active task:', error);
res.status(500).json({ error: 'Failed to check active task' });
}
});
// ============================================================
// MIGRATION ROUTES - Disable old job systems
// ============================================================
/**
* GET /api/tasks/migration/status
* Get status of old job systems vs new task queue
*/
router.get('/migration/status', async (_req: Request, res: Response) => {
try {
// Get old job system counts
const [schedules, crawlJobs, rawPayloads, taskCounts] = await Promise.all([
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE enabled = true) as enabled
FROM job_schedules
`),
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE status = 'pending') as pending,
COUNT(*) FILTER (WHERE status = 'running') as running
FROM dispensary_crawl_jobs
`),
pool.query(`
SELECT
COUNT(*) as total,
COUNT(*) FILTER (WHERE processed = false) as unprocessed
FROM raw_payloads
`),
taskService.getTaskCounts(),
]);
res.json({
old_systems: {
job_schedules: {
total: parseInt(schedules.rows[0].total) || 0,
enabled: parseInt(schedules.rows[0].enabled) || 0,
},
dispensary_crawl_jobs: {
total: parseInt(crawlJobs.rows[0].total) || 0,
pending: parseInt(crawlJobs.rows[0].pending) || 0,
running: parseInt(crawlJobs.rows[0].running) || 0,
},
raw_payloads: {
total: parseInt(rawPayloads.rows[0].total) || 0,
unprocessed: parseInt(rawPayloads.rows[0].unprocessed) || 0,
},
},
new_task_queue: taskCounts,
// COUNT(*) comes back from pg as a string, so parse it before comparing
recommendation: parseInt(schedules.rows[0].enabled, 10) > 0
? 'Disable old job schedules before switching to new task queue'
: 'Ready to use new task queue',
});
} catch (error: unknown) {
console.error('Error getting migration status:', error);
res.status(500).json({ error: 'Failed to get migration status' });
}
});
/**
* POST /api/tasks/migration/disable-old-schedules
* Disable all old job schedules to prepare for new task queue
*/
router.post('/migration/disable-old-schedules', async (_req: Request, res: Response) => {
try {
const result = await pool.query(`
UPDATE job_schedules
SET enabled = false,
updated_at = NOW()
WHERE enabled = true
RETURNING id, job_name
`);
res.json({
success: true,
disabled_count: result.rowCount,
disabled_schedules: result.rows.map(r => ({ id: r.id, job_name: r.job_name })),
});
} catch (error: unknown) {
console.error('Error disabling old schedules:', error);
res.status(500).json({ error: 'Failed to disable old schedules' });
}
});
/**
* POST /api/tasks/migration/cancel-pending-crawl-jobs
* Cancel all pending crawl jobs from the old system
*/
router.post('/migration/cancel-pending-crawl-jobs', async (_req: Request, res: Response) => {
try {
const result = await pool.query(`
UPDATE dispensary_crawl_jobs
SET status = 'cancelled',
completed_at = NOW(),
updated_at = NOW()
WHERE status = 'pending'
RETURNING id
`);
res.json({
success: true,
cancelled_count: result.rowCount,
});
} catch (error: unknown) {
console.error('Error cancelling pending crawl jobs:', error);
res.status(500).json({ error: 'Failed to cancel pending crawl jobs' });
}
});
/**
* POST /api/tasks/migration/create-resync-tasks
* Create product_resync tasks for all crawl-enabled dispensaries
*/
router.post('/migration/create-resync-tasks', async (req: Request, res: Response) => {
try {
const { priority = 0, state_code } = req.body;
let query = `
SELECT id, name FROM dispensaries
WHERE crawl_enabled = true
AND platform_dispensary_id IS NOT NULL
`;
const params: any[] = [];
if (state_code) {
query += `
AND state_id = (SELECT id FROM states WHERE code = $1)
`;
params.push(state_code.toUpperCase());
}
query += ` ORDER BY id`;
const dispensaries = await pool.query(query, params);
let created = 0;
for (const disp of dispensaries.rows) {
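// Skip stores that already have a claimed/running task. hasActiveTask() does not
// look at 'pending' rows, so calling this endpoint twice can enqueue duplicates.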
const hasActive = await taskService.hasActiveTask(disp.id);
if (!hasActive) {
await taskService.createTask({
role: 'product_resync',
dispensary_id: disp.id,
platform: 'dutchie',
priority,
});
created++;
}
}
res.json({
success: true,
tasks_created: created,
dispensaries_checked: dispensaries.rows.length,
state_filter: state_code || 'all',
});
} catch (error: unknown) {
console.error('Error creating resync tasks:', error);
res.status(500).json({ error: 'Failed to create resync tasks' });
}
});
/**
* POST /api/tasks/migration/full-migrate
* One-click migration: disable old systems, create new tasks
*/
router.post('/migration/full-migrate', async (req: Request, res: Response) => {
try {
const results: any = {
success: true,
steps: [],
};
// Step 1: Disable old job schedules
const disableResult = await pool.query(`
UPDATE job_schedules
SET enabled = false, updated_at = NOW()
WHERE enabled = true
RETURNING id
`);
results.steps.push({
step: 'disable_job_schedules',
count: disableResult.rowCount,
});
// Step 2: Cancel pending crawl jobs
const cancelResult = await pool.query(`
UPDATE dispensary_crawl_jobs
SET status = 'cancelled', completed_at = NOW(), updated_at = NOW()
WHERE status = 'pending'
RETURNING id
`);
results.steps.push({
step: 'cancel_pending_crawl_jobs',
count: cancelResult.rowCount,
});
// Step 3: Generate initial resync tasks
const resyncCount = await taskService.generateDailyResyncTasks(6);
results.steps.push({
step: 'generate_resync_tasks',
count: resyncCount,
});
// Step 4: Create store discovery task
const discoveryTask = await taskService.createStoreDiscoveryTask('dutchie', undefined, 0);
results.steps.push({
step: 'create_discovery_task',
task_id: discoveryTask.id,
});
// Step 5: Create analytics refresh task
const analyticsTask = await taskService.createTask({
role: 'analytics_refresh',
priority: 0,
});
results.steps.push({
step: 'create_analytics_task',
task_id: analyticsTask.id,
});
results.message = 'Migration complete. New task workers will pick up tasks.';
res.json(results);
} catch (error: unknown) {
console.error('Error during full migration:', error);
res.status(500).json({ error: 'Failed to complete migration' });
}
});
export default router;

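The routes above read `limit` and `dispensaryId` from the request with a bare `parseInt`. A minimal sketch of a shared guard is shown below, assuming a helper named `parsePositiveInt` (hypothetical, not part of this commit) and an upper bound of 100 chosen only for illustration.

```typescript
// Hypothetical helper (not in the commit): parse a positive integer from an
// untrusted query/path value, falling back when it is missing or invalid and
// clamping it to an upper bound.
function parsePositiveInt(
  raw: unknown,
  fallback: number,
  max: number = Number.MAX_SAFE_INTEGER
): number {
  const parsed = typeof raw === 'string' ? Number.parseInt(raw, 10) : NaN;
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return fallback;
  }
  return Math.min(parsed, max);
}

// Example values as they might arrive in req.query.limit
const fromGarbage = parsePositiveInt('abc', 10, 100); // falls back to the default
const fromHuge = parsePositiveInt('5000', 10, 100);   // clamped to the maximum
const fromValid = parsePositiveInt('25', 10, 100);    // used as-is
console.log({ fromGarbage, fromHuge, fromValid });
```

Clamping keeps a caller from requesting an unbounded page size. Whether 100 is the right ceiling depends on how the dashboard uses these endpoints.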
@@ -0,0 +1,92 @@
/**
* Analytics Refresh Handler
*
* Refreshes materialized views and pre-computed analytics tables.
* Should run daily or on-demand after major data changes.
*/
import { TaskContext, TaskResult } from '../task-worker';
export async function handleAnalyticsRefresh(ctx: TaskContext): Promise<TaskResult> {
const { pool } = ctx;
console.log(`[AnalyticsRefresh] Starting analytics refresh...`);
const refreshed: string[] = [];
const failed: string[] = [];
// Candidate views to refresh; entries that are not materialized views are skipped
const materializedViews = [
'mv_state_metrics',
'mv_brand_metrics',
'mv_category_metrics',
'v_brand_summary',
'v_dashboard_stats',
];
for (const viewName of materializedViews) {
try {
// Heartbeat before each refresh
await ctx.heartbeat();
// Only materialized views can be refreshed; skip plain views and missing names
const existsResult = await pool.query(`
SELECT EXISTS (
SELECT 1 FROM pg_matviews WHERE matviewname = $1
) as exists
`, [viewName]);
if (!existsResult.rows[0].exists) {
console.log(`[AnalyticsRefresh] ${viewName} is not a materialized view, skipping`);
continue;
}
// CONCURRENTLY needs a unique index on the view; fall back to a blocking refresh
try {
await pool.query(`REFRESH MATERIALIZED VIEW CONCURRENTLY ${viewName}`);
refreshed.push(viewName);
console.log(`[AnalyticsRefresh] Refreshed ${viewName}`);
} catch (refreshError: any) {
// If this also throws, the outer catch records the view as failed
await pool.query(`REFRESH MATERIALIZED VIEW ${viewName}`);
refreshed.push(viewName);
console.log(`[AnalyticsRefresh] Refreshed ${viewName} (non-concurrent)`);
}
} catch (error: any) {
console.error(`[AnalyticsRefresh] Error refreshing ${viewName}:`, error.message);
failed.push(viewName);
}
}
// Run analytics capture functions if they exist
const captureFunctions = [
'capture_brand_snapshots',
'capture_category_snapshots',
];
for (const funcName of captureFunctions) {
try {
await pool.query(`SELECT ${funcName}()`);
console.log(`[AnalyticsRefresh] Executed ${funcName}()`);
} catch (error: any) {
// Most likely the function does not exist, but log the reason in case it is something else
console.log(`[AnalyticsRefresh] ${funcName}() not available: ${error.message}`);
}
}
console.log(`[AnalyticsRefresh] Complete: ${refreshed.length} refreshed, ${failed.length} failed`);
return {
success: failed.length === 0,
refreshed,
failed,
error: failed.length > 0 ? `Failed to refresh: ${failed.join(', ')}` : undefined,
};
}

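The handler above interpolates view names directly into the `REFRESH` statement, because SQL identifiers cannot be bound as query parameters. That is safe while the list is hard-coded. If the list ever becomes configurable, a guard along these lines could be added. `buildRefreshSql` and the `IDENTIFIER` pattern are hypothetical names used only for this sketch.

```typescript
// Hypothetical guard (not in the commit): only allow simple lowercase
// identifiers before placing a view name into a REFRESH statement.
const IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

function buildRefreshSql(viewName: string, concurrently: boolean): string {
  if (!IDENTIFIER.test(viewName)) {
    throw new Error(`Refusing to build SQL for unexpected view name: ${viewName}`);
  }
  const mode = concurrently ? ' CONCURRENTLY' : '';
  // Double quotes mark the name as an identifier; the pattern above has
  // already excluded quotes and uppercase letters.
  return `REFRESH MATERIALIZED VIEW${mode} "${viewName}"`;
}

console.log(buildRefreshSql('mv_state_metrics', true));
console.log(buildRefreshSql('mv_brand_metrics', false));
```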
@@ -0,0 +1,105 @@
/**
* Entry Point Discovery Handler
*
* Detects menu type and resolves platform IDs for a discovered store.
* This is the step between store_discovery and product_discovery.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { DutchieClient } from '../../platforms/dutchie/client';
export async function handleEntryPointDiscovery(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const dispensaryId = task.dispensary_id;
if (!dispensaryId) {
return { success: false, error: 'No dispensary_id specified for entry_point_discovery task' };
}
try {
// Get dispensary info
const dispResult = await pool.query(`
SELECT id, name, menu_url, platform_dispensary_id, menu_type
FROM dispensaries
WHERE id = $1
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
return { success: false, error: `Dispensary ${dispensaryId} not found` };
}
const dispensary = dispResult.rows[0];
// If already has platform_dispensary_id, we're done
if (dispensary.platform_dispensary_id) {
console.log(`[EntryPointDiscovery] Dispensary ${dispensaryId} already has platform ID`);
return {
success: true,
alreadyResolved: true,
platformId: dispensary.platform_dispensary_id,
};
}
const menuUrl = dispensary.menu_url;
if (!menuUrl) {
return { success: false, error: `Dispensary ${dispensaryId} has no menu_url` };
}
console.log(`[EntryPointDiscovery] Resolving platform ID for ${dispensary.name} from ${menuUrl}`);
// Extract cName from menu URL
// Format: https://dutchie.com/embedded-menu/<cName> or https://dutchie.com/dispensary/<slug>
let cName: string | null = null;
const embeddedMatch = menuUrl.match(/\/embedded-menu\/([^/?]+)/);
const dispensaryMatch = menuUrl.match(/\/dispensary\/([^/?]+)/);
if (embeddedMatch) {
cName = embeddedMatch[1];
} else if (dispensaryMatch) {
cName = dispensaryMatch[1];
}
if (!cName) {
return {
success: false,
error: `Could not extract cName from menu_url: ${menuUrl}`,
};
}
// Resolve platform ID using Dutchie API
const client = new DutchieClient();
const platformId = await client.resolveDispensaryId(cName);
if (!platformId) {
return {
success: false,
error: `Could not resolve platform ID for cName: ${cName}`,
};
}
// Update dispensary with platform ID and enable crawling
await pool.query(`
UPDATE dispensaries
SET platform_dispensary_id = $2,
menu_type = 'dutchie',
crawl_enabled = true,
updated_at = NOW()
WHERE id = $1
`, [dispensaryId, platformId]);
console.log(`[EntryPointDiscovery] Resolved ${dispensary.name}: platformId=${platformId}`);
return {
success: true,
platformId,
cName,
};
} catch (error: any) {
console.error(`[EntryPointDiscovery] Error for dispensary ${dispensaryId}:`, error.message);
return {
success: false,
error: error.message,
};
}
}

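The cName extraction in the handler above can be pulled out into a pure function, which makes the two supported URL shapes easy to check in isolation. This is a sketch using the same two regular expressions. The function name `extractCName` and the sample URLs are illustrative only.

```typescript
// Sketch of the handler's URL parsing as a standalone function.
// Supported shapes (from the handler's comment):
//   https://dutchie.com/embedded-menu/<cName>
//   https://dutchie.com/dispensary/<slug>
function extractCName(menuUrl: string): string | null {
  const embedded = menuUrl.match(/\/embedded-menu\/([^/?]+)/);
  if (embedded) {
    return embedded[1];
  }
  const dispensary = menuUrl.match(/\/dispensary\/([^/?]+)/);
  return dispensary ? dispensary[1] : null;
}

// Illustrative URLs, not real stores
console.log(extractCName('https://dutchie.com/embedded-menu/green-leaf?menuType=rec'));
console.log(extractCName('https://dutchie.com/dispensary/green-leaf/products'));
console.log(extractCName('https://example.com/menu'));
```

One caveat of the `[^/?]+` character class is that it stops at `/` and `?` but not at `#`, so a URL ending in a fragment would carry the fragment into the cName.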
@@ -0,0 +1,11 @@
/**
* Task Handlers Index
*
* Exports all task handlers for the task worker.
*/
export { handleProductResync } from './product-resync';
export { handleProductDiscovery } from './product-discovery';
export { handleStoreDiscovery } from './store-discovery';
export { handleEntryPointDiscovery } from './entry-point-discovery';
export { handleAnalyticsRefresh } from './analytics-refresh';

@@ -0,0 +1,16 @@
/**
* Product Discovery Handler
*
* Initial product fetch for stores that have 0 products.
* Same logic as product_resync, but for initial discovery.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { handleProductResync } from './product-resync';
export async function handleProductDiscovery(ctx: TaskContext): Promise<TaskResult> {
// Product discovery is essentially the same as resync for the first time
// The main difference is in when this task is triggered (new store vs scheduled)
console.log(`[ProductDiscovery] Starting initial product fetch for dispensary ${ctx.task.dispensary_id}`);
return handleProductResync(ctx);
}

@@ -0,0 +1,131 @@
/**
* Product Resync Handler
*
* Re-crawls a store that already has products to capture price/stock changes.
* Creates new snapshots for any changed products.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { DutchieClient } from '../../platforms/dutchie/client';
import { hydrateToCanonical } from '../../hydration/canonical-upsert';
import { DutchieNormalizer } from '../../hydration/normalizers/dutchie';
export async function handleProductResync(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const dispensaryId = task.dispensary_id;
if (!dispensaryId) {
return { success: false, error: 'No dispensary_id specified for product_resync task' };
}
try {
// Get dispensary info
const dispResult = await pool.query(`
SELECT id, name, platform_dispensary_id, menu_url, state
FROM dispensaries
WHERE id = $1 AND crawl_enabled = true
`, [dispensaryId]);
if (dispResult.rows.length === 0) {
return { success: false, error: `Dispensary ${dispensaryId} not found or not crawl_enabled` };
}
const dispensary = dispResult.rows[0];
const platformId = dispensary.platform_dispensary_id;
if (!platformId) {
return { success: false, error: `Dispensary ${dispensaryId} has no platform_dispensary_id` };
}
console.log(`[ProductResync] Crawling ${dispensary.name} (${dispensaryId})`);
// Send heartbeat before long operation
await ctx.heartbeat();
// Fetch products from Dutchie
const client = new DutchieClient();
const products = await client.fetchProducts(platformId);
if (!products || products.length === 0) {
// No products returned - could be a problem or could be empty menu
console.log(`[ProductResync] No products returned for ${dispensary.name}`);
return {
success: true,
productsProcessed: 0,
snapshotsCreated: 0,
message: 'No products returned from API',
};
}
console.log(`[ProductResync] Fetched ${products.length} products for ${dispensary.name}`);
// Heartbeat again
await ctx.heartbeat();
// Normalize products
const normalizer = new DutchieNormalizer();
const normResult = normalizer.normalize({
products,
dispensary_id: dispensaryId,
platform: 'dutchie',
});
// Create crawl run record
const crawlRunResult = await pool.query(`
INSERT INTO crawl_runs (dispensary_id, provider, started_at, status, trigger_type)
VALUES ($1, 'dutchie', NOW(), 'running', 'task')
RETURNING id
`, [dispensaryId]);
const crawlRunId = crawlRunResult.rows[0].id;
// Hydrate to canonical tables
const hydrateResult = await hydrateToCanonical(
pool,
dispensaryId,
normResult,
crawlRunId
);
// Update crawl run
await pool.query(`
UPDATE crawl_runs
SET status = 'completed',
completed_at = NOW(),
products_found = $2,
products_new = $3,
products_updated = $4,
snapshots_created = $5
WHERE id = $1
`, [
crawlRunId,
hydrateResult.productsUpserted,
hydrateResult.productsNew,
hydrateResult.productsUpdated,
hydrateResult.snapshotsCreated,
]);
// Update dispensary last_crawled_at
await pool.query(`
UPDATE dispensaries
SET last_crawled_at = NOW()
WHERE id = $1
`, [dispensaryId]);
console.log(`[ProductResync] Completed ${dispensary.name}: ${hydrateResult.productsUpserted} products, ${hydrateResult.snapshotsCreated} snapshots`);
return {
success: true,
productsProcessed: hydrateResult.productsUpserted,
productsNew: hydrateResult.productsNew,
productsUpdated: hydrateResult.productsUpdated,
snapshotsCreated: hydrateResult.snapshotsCreated,
brandsCreated: hydrateResult.brandsCreated,
};
} catch (error: any) {
console.error(`[ProductResync] Error for dispensary ${dispensaryId}:`, error.message);
return {
success: false,
error: error.message,
};
}
}

@@ -0,0 +1,67 @@
/**
* Store Discovery Handler
*
* Discovers new stores on a platform (e.g., Dutchie) by crawling
* location APIs and adding them to dutchie_discovery_locations.
*/
import { TaskContext, TaskResult } from '../task-worker';
import { DiscoveryCrawler } from '../../discovery/discovery-crawler';
export async function handleStoreDiscovery(ctx: TaskContext): Promise<TaskResult> {
const { pool, task } = ctx;
const platform = task.platform || 'dutchie';
console.log(`[StoreDiscovery] Starting discovery for platform: ${platform}`);
try {
// Get states to discover
const statesResult = await pool.query(`
SELECT code FROM states WHERE active = true ORDER BY code
`);
const stateCodes = statesResult.rows.map(r => r.code);
if (stateCodes.length === 0) {
return { success: true, storesDiscovered: 0, message: 'No active states to discover' };
}
let totalDiscovered = 0;
let totalPromoted = 0;
// Run discovery for each state
const crawler = new DiscoveryCrawler(pool);
for (const stateCode of stateCodes) {
// Heartbeat before each state
await ctx.heartbeat();
console.log(`[StoreDiscovery] Discovering stores in ${stateCode}...`);
try {
const result = await crawler.discoverState(stateCode);
totalDiscovered += result.locationsDiscovered || 0;
totalPromoted += result.locationsPromoted || 0;
console.log(`[StoreDiscovery] ${stateCode}: discovered ${result.locationsDiscovered}, promoted ${result.locationsPromoted}`);
} catch (error: any) {
console.error(`[StoreDiscovery] Error discovering ${stateCode}:`, error.message);
// Continue with other states
}
}
console.log(`[StoreDiscovery] Complete: ${totalDiscovered} discovered, ${totalPromoted} promoted`);
return {
success: true,
storesDiscovered: totalDiscovered,
storesPromoted: totalPromoted,
statesProcessed: stateCodes.length,
newStoreIds: [], // Would be populated with actual new store IDs for chaining
};
} catch (error: any) {
console.error(`[StoreDiscovery] Error:`, error.message);
return {
success: false,
error: error.message,
};
}
}

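The `newStoreIds` field returned by the store discovery handler is meant to feed task chaining, and the entry point handler describes itself as the step between `store_discovery` and `product_discovery`. A minimal sketch of that mapping as a pure function follows. `planFollowUps` and `FollowUp` are hypothetical names, and the priority of 10 mirrors the value the task service uses for new stores.

```typescript
type TaskRole =
  | 'store_discovery'
  | 'entry_point_discovery'
  | 'product_discovery'
  | 'product_resync'
  | 'analytics_refresh';

// Hypothetical shape for a task that should be enqueued next
interface FollowUp {
  role: TaskRole;
  dispensary_id: number;
  priority: number;
}

// Decide which follow-up tasks a completed task implies, without touching the DB.
function planFollowUps(
  role: TaskRole,
  result: { success?: boolean; newStoreIds?: number[] },
  dispensaryId: number | null
): FollowUp[] {
  switch (role) {
    case 'store_discovery':
      // One entry_point_discovery task per newly discovered store
      return (result.newStoreIds ?? []).map(
        (id): FollowUp => ({ role: 'entry_point_discovery', dispensary_id: id, priority: 10 })
      );
    case 'entry_point_discovery':
      // A resolved platform ID leads to the initial product fetch
      return result.success && dispensaryId !== null
        ? [{ role: 'product_discovery', dispensary_id: dispensaryId, priority: 10 }]
        : [];
    default:
      // product_discovery, product_resync and analytics_refresh chain nothing
      return [];
  }
}

console.log(planFollowUps('store_discovery', { success: true, newStoreIds: [7, 9] }, null));
```

With the handler as written, `newStoreIds` is always an empty array, so the `store_discovery` branch produces no follow-ups until the handler starts reporting real IDs.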
@@ -0,0 +1,25 @@
/**
* Task Queue Module
*
* Exports task service, worker, and types for use throughout the application.
*/
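// Types are re-exported with `export type` so single-file transpilers
// (such as the one tsx uses) can drop them safely at build time.
export { taskService } from './task-service';
export type {
TaskRole,
TaskStatus,
WorkerTask,
CreateTaskParams,
CapacityMetrics,
TaskFilter,
} from './task-service';
export { TaskWorker } from './task-worker';
export type { TaskContext, TaskResult } from './task-worker';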
export {
handleProductResync,
handleProductDiscovery,
handleStoreDiscovery,
handleEntryPointDiscovery,
handleAnalyticsRefresh,
} from './handlers';

@@ -0,0 +1,474 @@
/**
* Task Service
*
* Central service for managing worker tasks with:
* - Atomic task claiming (per-store locking)
* - Task lifecycle management
* - Auto-chaining of related tasks
* - Capacity planning metrics
*/
import { pool } from '../db/pool';
export type TaskRole =
| 'store_discovery'
| 'entry_point_discovery'
| 'product_discovery'
| 'product_resync'
| 'analytics_refresh';
export type TaskStatus =
| 'pending'
| 'claimed'
| 'running'
| 'completed'
| 'failed'
| 'stale';
export interface WorkerTask {
id: number;
role: TaskRole;
dispensary_id: number | null;
platform: string | null;
status: TaskStatus;
priority: number;
scheduled_for: Date | null;
worker_id: string | null;
claimed_at: Date | null;
started_at: Date | null;
completed_at: Date | null;
last_heartbeat_at: Date | null;
result: Record<string, unknown> | null;
error_message: string | null;
retry_count: number;
max_retries: number;
created_at: Date;
updated_at: Date;
}
export interface CreateTaskParams {
role: TaskRole;
dispensary_id?: number;
platform?: string;
priority?: number;
scheduled_for?: Date;
}
export interface CapacityMetrics {
role: string;
pending_tasks: number;
ready_tasks: number;
claimed_tasks: number;
running_tasks: number;
completed_last_hour: number;
failed_last_hour: number;
active_workers: number;
avg_duration_sec: number | null;
tasks_per_worker_hour: number | null;
estimated_hours_to_drain: number | null;
}
export interface TaskFilter {
role?: TaskRole;
status?: TaskStatus | TaskStatus[];
dispensary_id?: number;
worker_id?: string;
limit?: number;
offset?: number;
}
class TaskService {
/**
* Create a new task
*/
async createTask(params: CreateTaskParams): Promise<WorkerTask> {
const result = await pool.query(
`INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for)
VALUES ($1, $2, $3, $4, $5)
RETURNING *`,
[
params.role,
params.dispensary_id ?? null,
params.platform ?? null,
params.priority ?? 0,
params.scheduled_for ?? null,
]
);
return result.rows[0] as WorkerTask;
}
/**
* Create multiple tasks in a batch
*/
async createTasks(tasks: CreateTaskParams[]): Promise<number> {
if (tasks.length === 0) return 0;
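// Build one placeholder group per task, five parameters each, e.g. for two tasks:
// ($1, $2, $3, $4, $5), ($6, $7, $8, $9, $10)
const values = tasks.map((_t, i) => {
const base = i * 5;
return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4}, $${base + 5})`;
});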
const params = tasks.flatMap((t) => [
t.role,
t.dispensary_id ?? null,
t.platform ?? null,
t.priority ?? 0,
t.scheduled_for ?? null,
]);
const result = await pool.query(
`INSERT INTO worker_tasks (role, dispensary_id, platform, priority, scheduled_for)
VALUES ${values.join(', ')}
ON CONFLICT DO NOTHING`,
params
);
return result.rowCount ?? 0;
}
/**
* Claim a task atomically for a worker
* Uses the SQL function for proper locking
*/
async claimTask(role: TaskRole, workerId: string): Promise<WorkerTask | null> {
const result = await pool.query(
`SELECT * FROM claim_task($1, $2)`,
[role, workerId]
);
return (result.rows[0] as WorkerTask) || null;
}
/**
* Mark a task as running (worker started processing)
*/
async startTask(taskId: number): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET status = 'running', started_at = NOW(), last_heartbeat_at = NOW()
WHERE id = $1`,
[taskId]
);
}
/**
* Update heartbeat to prevent stale detection
*/
async heartbeat(taskId: number): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET last_heartbeat_at = NOW()
WHERE id = $1 AND status = 'running'`,
[taskId]
);
}
/**
* Mark a task as completed
*/
async completeTask(taskId: number, result?: Record<string, unknown>): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET status = 'completed', completed_at = NOW(), result = $2
WHERE id = $1`,
[taskId, result ? JSON.stringify(result) : null]
);
}
/**
* Mark a task as failed
*/
async failTask(taskId: number, errorMessage: string): Promise<void> {
await pool.query(
`UPDATE worker_tasks
SET status = 'failed', completed_at = NOW(), error_message = $2
WHERE id = $1`,
[taskId, errorMessage]
);
}
/**
* Get a task by ID
*/
async getTask(taskId: number): Promise<WorkerTask | null> {
const result = await pool.query(
`SELECT * FROM worker_tasks WHERE id = $1`,
[taskId]
);
return (result.rows[0] as WorkerTask) || null;
}
/**
* List tasks with filters
*/
async listTasks(filter: TaskFilter = {}): Promise<WorkerTask[]> {
const conditions: string[] = [];
const params: (string | number | string[])[] = [];
let paramIndex = 1;
if (filter.role) {
conditions.push(`role = $${paramIndex++}`);
params.push(filter.role);
}
if (filter.status) {
if (Array.isArray(filter.status)) {
conditions.push(`status = ANY($${paramIndex++})`);
params.push(filter.status);
} else {
conditions.push(`status = $${paramIndex++}`);
params.push(filter.status);
}
}
if (filter.dispensary_id) {
conditions.push(`dispensary_id = $${paramIndex++}`);
params.push(filter.dispensary_id);
}
if (filter.worker_id) {
conditions.push(`worker_id = $${paramIndex++}`);
params.push(filter.worker_id);
}
const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : '';
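const limit = filter.limit ?? 100;
const offset = filter.offset ?? 0;
// Bind LIMIT/OFFSET as parameters rather than interpolating them into the SQL
const result = await pool.query(
`SELECT * FROM worker_tasks
${whereClause}
ORDER BY created_at DESC
LIMIT $${paramIndex++} OFFSET $${paramIndex++}`,
[...params, limit, offset]
);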
return result.rows as WorkerTask[];
}
/**
* Get capacity metrics for all roles
*/
async getCapacityMetrics(): Promise<CapacityMetrics[]> {
const result = await pool.query(
`SELECT * FROM v_worker_capacity`
);
return result.rows as CapacityMetrics[];
}
/**
* Get capacity metrics for a specific role
*/
async getRoleCapacity(role: TaskRole): Promise<CapacityMetrics | null> {
const result = await pool.query(
`SELECT * FROM v_worker_capacity WHERE role = $1`,
[role]
);
return (result.rows[0] as CapacityMetrics) || null;
}
/**
* Recover stale tasks from dead workers
*/
async recoverStaleTasks(staleThresholdMinutes = 10): Promise<number> {
const result = await pool.query(
`SELECT recover_stale_tasks($1)`,
[staleThresholdMinutes]
);
return (result.rows[0] as { recover_stale_tasks: number })?.recover_stale_tasks ?? 0;
}
/**
* Generate daily resync tasks for all active stores
*/
async generateDailyResyncTasks(batchesPerDay = 6, date?: Date): Promise<number> {
const result = await pool.query(
`SELECT generate_resync_tasks($1, $2)`,
[batchesPerDay, date ?? new Date()]
);
return (result.rows[0] as { generate_resync_tasks: number })?.generate_resync_tasks ?? 0;
}
/**
* Chain next task after completion
* Called automatically when a task completes successfully
*/
async chainNextTask(completedTask: WorkerTask): Promise<WorkerTask | null> {
if (completedTask.status !== 'completed') {
return null;
}
switch (completedTask.role) {
case 'store_discovery': {
// New stores discovered -> create entry_point_discovery tasks
const newStoreIds = (completedTask.result as { newStoreIds?: number[] })?.newStoreIds;
if (newStoreIds && newStoreIds.length > 0) {
for (const storeId of newStoreIds) {
await this.createTask({
role: 'entry_point_discovery',
dispensary_id: storeId,
platform: completedTask.platform ?? undefined,
priority: 10, // High priority for new stores
});
}
}
break;
}
case 'entry_point_discovery': {
// Entry point resolved -> create product_discovery task
const success = (completedTask.result as { success?: boolean })?.success;
if (success && completedTask.dispensary_id) {
return this.createTask({
role: 'product_discovery',
dispensary_id: completedTask.dispensary_id,
platform: completedTask.platform ?? undefined,
priority: 10,
});
}
break;
}
case 'product_discovery': {
// Product discovery done -> store is now ready for regular resync
// No immediate chaining needed; will be picked up by daily batch generation
break;
}
}
return null;
}
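/**
* Create store discovery task for a platform.
* Note: stateCode is accepted but currently unused. CreateTaskParams has no
* field for it, and the handler iterates over all active states.
*/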
async createStoreDiscoveryTask(
platform: string,
stateCode?: string,
priority = 0
): Promise<WorkerTask> {
return this.createTask({
role: 'store_discovery',
platform,
priority,
});
}
/**
* Create entry point discovery task for a specific store
*/
async createEntryPointTask(
dispensaryId: number,
platform: string,
priority = 10
): Promise<WorkerTask> {
return this.createTask({
role: 'entry_point_discovery',
dispensary_id: dispensaryId,
platform,
priority,
});
}
/**
* Create product discovery task for a specific store
*/
async createProductDiscoveryTask(
dispensaryId: number,
platform: string,
priority = 10
): Promise<WorkerTask> {
return this.createTask({
role: 'product_discovery',
dispensary_id: dispensaryId,
platform,
priority,
});
}
/**
* Get task counts by status for dashboard
*/
async getTaskCounts(): Promise<Record<TaskStatus, number>> {
const result = await pool.query(
`SELECT status, COUNT(*) as count
FROM worker_tasks
GROUP BY status`
);
const counts: Record<TaskStatus, number> = {
pending: 0,
claimed: 0,
running: 0,
completed: 0,
failed: 0,
stale: 0,
};
for (const row of result.rows) {
const typedRow = row as { status: TaskStatus; count: string };
counts[typedRow.status] = parseInt(typedRow.count, 10);
}
return counts;
}
/**
* Get recent task completions for a role
*/
async getRecentCompletions(role: TaskRole, limit = 10): Promise<WorkerTask[]> {
const result = await pool.query(
`SELECT * FROM worker_tasks
WHERE role = $1 AND status = 'completed'
ORDER BY completed_at DESC
LIMIT $2`,
[role, limit]
);
return result.rows as WorkerTask[];
}
/**
* Check if a store has any active tasks
*/
async hasActiveTask(dispensaryId: number): Promise<boolean> {
const result = await pool.query(
`SELECT EXISTS(
SELECT 1 FROM worker_tasks
WHERE dispensary_id = $1
AND status IN ('claimed', 'running')
) as exists`,
[dispensaryId]
);
return (result.rows[0] as { exists: boolean })?.exists ?? false;
}
/**
* Get the last completion time for a role
*/
async getLastCompletion(role: TaskRole): Promise<Date | null> {
const result = await pool.query(
`SELECT MAX(completed_at) as completed_at
FROM worker_tasks
WHERE role = $1 AND status = 'completed'`,
[role]
);
return (result.rows[0] as { completed_at: Date | null })?.completed_at ?? null;
}
/**
* Calculate workers needed to complete tasks within SLA
*/
async calculateWorkersNeeded(role: TaskRole, slaHours: number): Promise<number> {
const capacity = await this.getRoleCapacity(role);
if (!capacity || !capacity.tasks_per_worker_hour) {
return 1; // Default to 1 worker if no data
}
const pendingTasks = capacity.pending_tasks;
const tasksPerWorkerHour = capacity.tasks_per_worker_hour;
const totalTaskCapacityNeeded = pendingTasks / slaHours;
return Math.ceil(totalTaskCapacityNeeded / tasksPerWorkerHour);
}
}
export const taskService = new TaskService();

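The arithmetic in `calculateWorkersNeeded` is: required throughput is `pending_tasks / slaHours`, and the worker count is that throughput divided by `tasks_per_worker_hour`, rounded up. A worked sketch with made-up numbers follows. `workersNeeded` is a hypothetical standalone version, and the guard for a non-positive SLA is an added assumption that the service method does not have.

```typescript
// Standalone sketch of the sizing formula used by calculateWorkersNeeded.
function workersNeeded(
  pendingTasks: number,
  tasksPerWorkerHour: number | null,
  slaHours: number
): number {
  // No throughput data yet: default to a single worker, as the service does.
  // The slaHours check is an assumption added here to avoid dividing by zero.
  if (!tasksPerWorkerHour || slaHours <= 0) {
    return 1;
  }
  const requiredTasksPerHour = pendingTasks / slaHours;
  return Math.ceil(requiredTasksPerHour / tasksPerWorkerHour);
}

// 1200 pending tasks, 4-hour SLA -> 300 tasks/hour required.
// At 60 tasks per worker-hour that is 300 / 60 = 5 workers.
console.log(workersNeeded(1200, 60, 4));
// One extra task pushes the requirement just past 5, so it rounds up to 6.
console.log(workersNeeded(1201, 60, 4));
```

Note that an empty queue yields 0 rather than 1, so a caller that always wants at least one worker would need to take the maximum with 1.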
@@ -0,0 +1,266 @@
/**
* Task Worker
*
* A unified worker that processes tasks from the worker_tasks queue.
* Replaces the fragmented job systems (job_schedules, dispensary_crawl_jobs, etc.)
*
* Usage:
* WORKER_ROLE=product_resync npx tsx src/tasks/task-worker.ts
*
* Environment:
* WORKER_ROLE - Which task role to process (required)
* WORKER_ID - Optional custom worker ID
* POLL_INTERVAL_MS - How often to check for tasks (default: 5000)
* HEARTBEAT_INTERVAL_MS - How often to update heartbeat (default: 30000)
*/
import { Pool } from 'pg';
import { v4 as uuidv4 } from 'uuid';
import { taskService, TaskRole, WorkerTask } from './task-service';
import { pool } from '../db/pool';
// Task handlers by role
import { handleProductResync } from './handlers/product-resync';
import { handleProductDiscovery } from './handlers/product-discovery';
import { handleStoreDiscovery } from './handlers/store-discovery';
import { handleEntryPointDiscovery } from './handlers/entry-point-discovery';
import { handleAnalyticsRefresh } from './handlers/analytics-refresh';
const POLL_INTERVAL_MS = parseInt(process.env.POLL_INTERVAL_MS || '5000', 10);
const HEARTBEAT_INTERVAL_MS = parseInt(process.env.HEARTBEAT_INTERVAL_MS || '30000', 10);
export interface TaskContext {
pool: Pool;
workerId: string;
task: WorkerTask;
heartbeat: () => Promise<void>;
}
export interface TaskResult {
success: boolean;
productsProcessed?: number;
snapshotsCreated?: number;
storesDiscovered?: number;
error?: string;
[key: string]: unknown;
}
type TaskHandler = (ctx: TaskContext) => Promise<TaskResult>;
const TASK_HANDLERS: Record<TaskRole, TaskHandler> = {
product_resync: handleProductResync,
product_discovery: handleProductDiscovery,
store_discovery: handleStoreDiscovery,
entry_point_discovery: handleEntryPointDiscovery,
analytics_refresh: handleAnalyticsRefresh,
};
export class TaskWorker {
private pool: Pool;
private workerId: string;
private role: TaskRole;
private isRunning: boolean = false;
private heartbeatInterval: NodeJS.Timeout | null = null;
private currentTask: WorkerTask | null = null;
constructor(role: TaskRole, workerId?: string) {
this.pool = pool;
this.role = role;
this.workerId = workerId || `worker-${role}-${uuidv4().slice(0, 8)}`;
}
/**
* Start the worker loop
*/
async start(): Promise<void> {
this.isRunning = true;
console.log(`[TaskWorker] Starting worker ${this.workerId} for role: ${this.role}`);
while (this.isRunning) {
try {
await this.processNextTask();
} catch (error: any) {
console.error(`[TaskWorker] Loop error:`, error.message);
await this.sleep(POLL_INTERVAL_MS);
}
}
console.log(`[TaskWorker] Worker ${this.workerId} stopped`);
}
/**
* Stop the worker
*/
stop(): void {
this.isRunning = false;
this.stopHeartbeat();
console.log(`[TaskWorker] Stopping worker ${this.workerId}...`);
}
/**
* Process the next available task
*/
private async processNextTask(): Promise<void> {
// Try to claim a task
const task = await taskService.claimTask(this.role, this.workerId);
if (!task) {
// No tasks available, wait and retry
await this.sleep(POLL_INTERVAL_MS);
return;
}
this.currentTask = task;
console.log(`[TaskWorker] Claimed task ${task.id} (${task.role}) for dispensary ${task.dispensary_id || 'N/A'}`);
// Start heartbeat
this.startHeartbeat(task.id);
try {
// Mark as running
await taskService.startTask(task.id);
// Get handler for this role
const handler = TASK_HANDLERS[task.role];
if (!handler) {
throw new Error(`No handler registered for role: ${task.role}`);
}
// Create context
const ctx: TaskContext = {
pool: this.pool,
workerId: this.workerId,
task,
heartbeat: async () => {
await taskService.heartbeat(task.id);
},
};
// Execute the task
const result = await handler(ctx);
if (result.success) {
// Mark as completed
await taskService.completeTask(task.id, result);
console.log(`[TaskWorker] Task ${task.id} completed successfully`);
// Chain next task if applicable
const chainedTask = await taskService.chainNextTask({
...task,
status: 'completed',
result,
});
if (chainedTask) {
console.log(`[TaskWorker] Chained new task ${chainedTask.id} (${chainedTask.role})`);
}
} else {
// Mark as failed
await taskService.failTask(task.id, result.error || 'Unknown error');
console.log(`[TaskWorker] Task ${task.id} failed: ${result.error}`);
}
} catch (error: any) {
// Mark as failed
await taskService.failTask(task.id, error.message);
console.error(`[TaskWorker] Task ${task.id} threw error:`, error.message);
} finally {
this.stopHeartbeat();
this.currentTask = null;
}
}
/**
* Start heartbeat interval
*/
private startHeartbeat(taskId: number): void {
this.heartbeatInterval = setInterval(async () => {
try {
await taskService.heartbeat(taskId);
} catch (error: any) {
console.warn(`[TaskWorker] Heartbeat failed:`, error.message);
}
}, HEARTBEAT_INTERVAL_MS);
}
/**
* Stop heartbeat interval
*/
private stopHeartbeat(): void {
if (this.heartbeatInterval) {
clearInterval(this.heartbeatInterval);
this.heartbeatInterval = null;
}
}
/**
* Sleep helper
*/
private sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
/**
* Get worker info
*/
getInfo(): { workerId: string; role: TaskRole; isRunning: boolean; currentTaskId: number | null } {
return {
workerId: this.workerId,
role: this.role,
isRunning: this.isRunning,
currentTaskId: this.currentTask?.id ?? null,
};
}
}
// ============================================================
// CLI ENTRY POINT
// ============================================================
async function main(): Promise<void> {
const role = process.env.WORKER_ROLE as TaskRole;
if (!role) {
console.error('Error: WORKER_ROLE environment variable is required');
console.error('Valid roles: store_discovery, entry_point_discovery, product_discovery, product_resync, analytics_refresh');
process.exit(1);
}
const validRoles: TaskRole[] = [
'store_discovery',
'entry_point_discovery',
'product_discovery',
'product_resync',
'analytics_refresh',
];
if (!validRoles.includes(role)) {
console.error(`Error: Invalid WORKER_ROLE: ${role}`);
console.error(`Valid roles: ${validRoles.join(', ')}`);
process.exit(1);
}
const workerId = process.env.WORKER_ID;
const worker = new TaskWorker(role, workerId);
// Handle graceful shutdown
process.on('SIGTERM', () => {
console.log('[TaskWorker] Received SIGTERM, shutting down...');
worker.stop();
});
process.on('SIGINT', () => {
console.log('[TaskWorker] Received SIGINT, shutting down...');
worker.stop();
});
await worker.start();
}
// Run if this is the main module
if (require.main === module) {
main().catch((error) => {
console.error('[TaskWorker] Fatal error:', error);
process.exit(1);
});
}
export { main };
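
The worker above sends a heartbeat on an interval while a task runs, and the API client exposes `recoverStaleTasks(thresholdMinutes)`. A minimal sketch of how a staleness check based on those heartbeats might look is given here. The `HeartbeatInfo` shape, the `lastHeartbeatAt` field, and the `findStaleTaskIds` helper are hypothetical names for illustration; the real TaskService query is not shown in this diff.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface HeartbeatInfo {
  id: number;
  status: 'claimed' | 'running';
  lastHeartbeatAt: Date; // assumed field name; the real column may differ
}

// Return the ids of tasks whose last heartbeat is older than the threshold.
function findStaleTaskIds(
  tasks: HeartbeatInfo[],
  now: Date,
  thresholdMinutes: number,
): number[] {
  const cutoffMs = now.getTime() - thresholdMinutes * 60000;
  return tasks
    .filter((t) => t.lastHeartbeatAt.getTime() < cutoffMs)
    .map((t) => t.id);
}

const checkTime = new Date('2025-01-01T12:00:00Z');
const staleIds = findStaleTaskIds(
  [
    // Heartbeat 30 seconds ago: still alive.
    { id: 1, status: 'running', lastHeartbeatAt: new Date('2025-01-01T11:59:30Z') },
    // Heartbeat 20 minutes ago: past the 10-minute threshold.
    { id: 2, status: 'running', lastHeartbeatAt: new Date('2025-01-01T11:40:00Z') },
  ],
  checkTime,
  10,
);
console.log(staleIds);
```

Under this assumption, a worker that stops sending heartbeats while its task is still running would see that task picked up by stale recovery, which is why the heartbeat should stay active until the handler returns.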

@@ -47,6 +47,7 @@ import StateDetail from './pages/StateDetail';
import { Discovery } from './pages/Discovery';
import { WorkersDashboard } from './pages/WorkersDashboard';
import { JobQueue } from './pages/JobQueue';
import TasksDashboard from './pages/TasksDashboard';
import { ScraperOverviewDashboard } from './pages/ScraperOverviewDashboard';
import { SeoOrchestrator } from './pages/admin/seo/SeoOrchestrator';
import { StatePage } from './pages/public/StatePage';
@@ -124,6 +125,8 @@ export default function App() {
<Route path="/workers" element={<PrivateRoute><WorkersDashboard /></PrivateRoute>} />
{/* Job Queue Management */}
<Route path="/job-queue" element={<PrivateRoute><JobQueue /></PrivateRoute>} />
{/* Task Queue Dashboard */}
<Route path="/tasks" element={<PrivateRoute><TasksDashboard /></PrivateRoute>} />
{/* Scraper Overview Dashboard (new primary) */}
<Route path="/scraper/overview" element={<PrivateRoute><ScraperOverviewDashboard /></PrivateRoute>} />
<Route path="*" element={<Navigate to="/dashboard" replace />} />

@@ -23,7 +23,8 @@ import {
UserCog,
ListOrdered,
Key,
Bot,
ListChecks
} from 'lucide-react';
interface LayoutProps {
@@ -164,6 +165,7 @@ export function Layout({ children }: LayoutProps) {
<NavLink to="/users" icon={<UserCog className="w-4 h-4" />} label="Users" isActive={isActive('/users')} />
<NavLink to="/workers" icon={<Users className="w-4 h-4" />} label="Workers" isActive={isActive('/workers')} />
<NavLink to="/job-queue" icon={<ListOrdered className="w-4 h-4" />} label="Job Queue" isActive={isActive('/job-queue')} />
<NavLink to="/tasks" icon={<ListChecks className="w-4 h-4" />} label="Task Queue" isActive={isActive('/tasks')} />
<NavLink to="/admin/seo" icon={<FileText className="w-4 h-4" />} label="SEO Pages" isActive={isActive('/admin/seo')} />
<NavLink to="/proxies" icon={<Shield className="w-4 h-4" />} label="Proxies" isActive={isActive('/proxies')} />
<NavLink to="/api-permissions" icon={<Key className="w-4 h-4" />} label="API Keys" isActive={isActive('/api-permissions')} />

@@ -2777,6 +2777,94 @@ class ApiClient {
sampleValues: Record<string, any>;
}>(`/api/seo/templates/variables/${encodeURIComponent(pageType)}`);
}
// ==========================================
// Task Queue API
// ==========================================
async getTasks(params?: {
role?: string;
status?: string;
dispensary_id?: number;
limit?: number;
offset?: number;
}) {
const query = new URLSearchParams();
if (params?.role) query.set('role', params.role);
if (params?.status) query.set('status', params.status);
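// Compare against undefined so explicit zero values (e.g. offset: 0) are still sent
if (params?.dispensary_id !== undefined) query.set('dispensary_id', String(params.dispensary_id));
if (params?.limit !== undefined) query.set('limit', String(params.limit));
if (params?.offset !== undefined) query.set('offset', String(params.offset));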
const qs = query.toString();
return this.request<{ tasks: any[]; count: number }>(`/api/tasks${qs ? '?' + qs : ''}`);
}
async getTask(id: number) {
return this.request<any>(`/api/tasks/${id}`);
}
async getTaskCounts() {
return this.request<Record<string, number>>('/api/tasks/counts');
}
async getTaskCapacity() {
return this.request<{ metrics: any[] }>('/api/tasks/capacity');
}
async getRoleCapacity(role: string) {
return this.request<any>(`/api/tasks/capacity/${role}`);
}
async createTask(params: {
role: string;
dispensary_id?: number;
platform?: string;
priority?: number;
scheduled_for?: string;
}) {
return this.request<any>('/api/tasks', {
method: 'POST',
body: JSON.stringify(params),
});
}
async generateResyncTasks(params?: { batches_per_day?: number; date?: string }) {
return this.request<{ success: boolean; tasks_created: number }>('/api/tasks/generate/resync', {
method: 'POST',
body: JSON.stringify(params ?? {}),
});
}
async generateDiscoveryTask(platform: string, stateCode?: string, priority?: number) {
return this.request<any>('/api/tasks/generate/discovery', {
method: 'POST',
body: JSON.stringify({ platform, state_code: stateCode, priority }),
});
}
async recoverStaleTasks(thresholdMinutes?: number) {
return this.request<{ success: boolean; tasks_recovered: number }>('/api/tasks/recover-stale', {
method: 'POST',
body: JSON.stringify({ threshold_minutes: thresholdMinutes }),
});
}
async getLastRoleCompletion(role: string) {
return this.request<{ role: string; last_completion: string | null; time_since: number | null }>(
`/api/tasks/role/${role}/last-completion`
);
}
async getRecentRoleCompletions(role: string, limit?: number) {
const qs = limit ? `?limit=${limit}` : '';
return this.request<{ tasks: any[] }>(`/api/tasks/role/${role}/recent${qs}`);
}
async checkStoreActiveTask(dispensaryId: number) {
return this.request<{ dispensary_id: number; has_active_task: boolean }>(
`/api/tasks/store/${dispensaryId}/active`
);
}
}
export const api = new ApiClient(API_URL);

@@ -18,7 +18,11 @@ import {
Globe,
MapPin,
ArrowRight,
BarChart3,
ListChecks,
Play,
CheckCircle2,
XCircle
} from 'lucide-react';
import {
LineChart,
@@ -41,6 +45,7 @@ export function Dashboard() {
const [refreshing, setRefreshing] = useState(false);
const [pendingChangesCount, setPendingChangesCount] = useState(0);
const [showNotification, setShowNotification] = useState(false);
const [taskCounts, setTaskCounts] = useState<Record<string, number> | null>(null);
useEffect(() => {
loadData();
@@ -119,6 +124,15 @@ export function Dashboard() {
// National stats not critical, just skip
setNationalStats(null);
}
// Fetch task queue counts
try {
const counts = await api.getTaskCounts();
setTaskCounts(counts);
} catch {
// Task counts not critical, just skip
setTaskCounts(null);
}
} catch (error) {
console.error('Failed to load dashboard:', error);
} finally {
@@ -471,6 +485,60 @@ export function Dashboard() {
</div>
)}
{/* Task Queue Summary */}
{taskCounts && (
<div className="bg-white rounded-xl border border-gray-200 p-4 sm:p-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="p-2 bg-violet-50 rounded-lg">
<ListChecks className="w-5 h-5 text-violet-600" />
</div>
<div>
<h3 className="text-sm sm:text-base font-semibold text-gray-900">Task Queue</h3>
<p className="text-xs text-gray-500">Worker task processing status</p>
</div>
</div>
<button
onClick={() => navigate('/tasks')}
className="flex items-center gap-1 text-sm text-violet-600 hover:text-violet-700"
>
View Dashboard
<ArrowRight className="w-4 h-4" />
</button>
</div>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
<div className="p-3 bg-amber-50 rounded-lg">
<div className="flex items-center gap-2">
<Clock className="w-4 h-4 text-amber-600" />
<span className="text-xs text-gray-500">Pending</span>
</div>
<div className="text-xl font-bold text-amber-600 mt-1">{taskCounts.pending || 0}</div>
</div>
<div className="p-3 bg-blue-50 rounded-lg">
<div className="flex items-center gap-2">
<Play className="w-4 h-4 text-blue-600" />
<span className="text-xs text-gray-500">Running</span>
</div>
<div className="text-xl font-bold text-blue-600 mt-1">{(taskCounts.claimed || 0) + (taskCounts.running || 0)}</div>
</div>
<div className="p-3 bg-emerald-50 rounded-lg">
<div className="flex items-center gap-2">
<CheckCircle2 className="w-4 h-4 text-emerald-600" />
<span className="text-xs text-gray-500">Completed</span>
</div>
<div className="text-xl font-bold text-emerald-600 mt-1">{taskCounts.completed || 0}</div>
</div>
<div className="p-3 bg-red-50 rounded-lg">
<div className="flex items-center gap-2">
<XCircle className="w-4 h-4 text-red-600" />
<span className="text-xs text-gray-500">Failed</span>
</div>
<div className="text-xl font-bold text-red-600 mt-1">{(taskCounts.failed || 0) + (taskCounts.stale || 0)}</div>
</div>
</div>
</div>
)}
{/* Activity Lists */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4 sm:gap-6">
{/* Recent Scrapes */}

@@ -0,0 +1,525 @@
import { useState, useEffect } from 'react';
import { api } from '../lib/api';
import { Layout } from '../components/Layout';
import {
ListChecks,
Clock,
CheckCircle2,
XCircle,
AlertTriangle,
PlayCircle,
RefreshCw,
Search,
ChevronDown,
ChevronUp,
Gauge,
Users,
Calendar,
Zap,
} from 'lucide-react';
interface Task {
id: number;
role: string;
dispensary_id: number | null;
dispensary_name?: string;
platform: string | null;
status: string;
priority: number;
scheduled_for: string | null;
worker_id: string | null;
claimed_at: string | null;
started_at: string | null;
completed_at: string | null;
error_message: string | null;
retry_count: number;
created_at: string;
duration_sec?: number;
}
interface CapacityMetric {
role: string;
pending_tasks: number;
ready_tasks: number;
claimed_tasks: number;
running_tasks: number;
completed_last_hour: number;
failed_last_hour: number;
active_workers: number;
avg_duration_sec: number | null;
tasks_per_worker_hour: number | null;
estimated_hours_to_drain: number | null;
workers_needed?: {
for_1_hour: number;
for_4_hours: number;
for_8_hours: number;
};
}
interface TaskCounts {
pending: number;
claimed: number;
running: number;
completed: number;
failed: number;
stale: number;
}
const ROLES = [
'store_discovery',
'entry_point_discovery',
'product_discovery',
'product_resync',
'analytics_refresh',
];
const STATUS_COLORS: Record<string, string> = {
pending: 'bg-yellow-100 text-yellow-800',
claimed: 'bg-blue-100 text-blue-800',
running: 'bg-indigo-100 text-indigo-800',
completed: 'bg-green-100 text-green-800',
failed: 'bg-red-100 text-red-800',
stale: 'bg-gray-100 text-gray-800',
};
const STATUS_ICONS: Record<string, React.ReactNode> = {
pending: <Clock className="w-4 h-4" />,
claimed: <PlayCircle className="w-4 h-4" />,
running: <RefreshCw className="w-4 h-4 animate-spin" />,
completed: <CheckCircle2 className="w-4 h-4" />,
failed: <XCircle className="w-4 h-4" />,
stale: <AlertTriangle className="w-4 h-4" />,
};
function formatDuration(seconds: number | null): string {
if (seconds === null) return '-';
// Round once up front so boundary values (e.g. 59.6s or 119.6s) never render as "60s" or "1m 60s"
const total = Math.round(seconds);
if (total < 60) return `${total}s`;
if (total < 3600) return `${Math.floor(total / 60)}m ${total % 60}s`;
return `${Math.floor(total / 3600)}h ${Math.floor((total % 3600) / 60)}m`;
}
function formatTimeAgo(dateStr: string | null): string {
if (!dateStr) return '-';
const date = new Date(dateStr);
const now = new Date();
const diff = (now.getTime() - date.getTime()) / 1000;
if (diff < 60) return `${Math.floor(diff)}s ago`;
if (diff < 3600) return `${Math.floor(diff / 60)}m ago`;
if (diff < 86400) return `${Math.floor(diff / 3600)}h ago`;
return `${Math.floor(diff / 86400)}d ago`;
}
export default function TasksDashboard() {
const [tasks, setTasks] = useState<Task[]>([]);
const [counts, setCounts] = useState<TaskCounts | null>(null);
const [capacity, setCapacity] = useState<CapacityMetric[]>([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
// Filters
const [roleFilter, setRoleFilter] = useState<string>('');
const [statusFilter, setStatusFilter] = useState<string>('');
const [searchQuery, setSearchQuery] = useState('');
const [showCapacity, setShowCapacity] = useState(true);
// Actions
const [actionLoading, setActionLoading] = useState(false);
const [actionMessage, setActionMessage] = useState<string | null>(null);
const fetchData = async () => {
try {
const [tasksRes, countsRes, capacityRes] = await Promise.all([
api.getTasks({
role: roleFilter || undefined,
status: statusFilter || undefined,
limit: 100,
}),
api.getTaskCounts(),
api.getTaskCapacity(),
]);
setTasks(tasksRes.tasks || []);
setCounts(countsRes);
setCapacity(capacityRes.metrics || []);
setError(null);
} catch (err: any) {
setError(err.message || 'Failed to load tasks');
} finally {
setLoading(false);
}
};
useEffect(() => {
fetchData();
const interval = setInterval(fetchData, 10000); // Refresh every 10 seconds
return () => clearInterval(interval);
}, [roleFilter, statusFilter]);
const handleGenerateResync = async () => {
setActionLoading(true);
try {
const result = await api.generateResyncTasks();
setActionMessage(`Generated ${result.tasks_created} resync tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const handleRecoverStale = async () => {
setActionLoading(true);
try {
const result = await api.recoverStaleTasks();
setActionMessage(`Recovered ${result.tasks_recovered} stale tasks`);
fetchData();
} catch (err: any) {
setActionMessage(`Error: ${err.message}`);
} finally {
setActionLoading(false);
setTimeout(() => setActionMessage(null), 5000);
}
};
const filteredTasks = tasks.filter((task) => {
if (searchQuery) {
const query = searchQuery.toLowerCase();
return (
task.role.toLowerCase().includes(query) ||
task.dispensary_name?.toLowerCase().includes(query) ||
task.worker_id?.toLowerCase().includes(query) ||
String(task.id).includes(query)
);
}
return true;
});
const totalActive = (counts?.claimed || 0) + (counts?.running || 0);
const totalPending = counts?.pending || 0;
if (loading) {
return (
<Layout>
<div className="flex items-center justify-center h-64">
<RefreshCw className="w-8 h-8 animate-spin text-emerald-600" />
</div>
</Layout>
);
}
return (
<Layout>
<div className="space-y-6">
{/* Header */}
<div className="flex flex-col sm:flex-row sm:items-center sm:justify-between gap-4">
<div>
<h1 className="text-2xl font-bold text-gray-900 flex items-center gap-2">
<ListChecks className="w-7 h-7 text-emerald-600" />
Task Queue
</h1>
<p className="text-gray-500 mt-1">
{totalActive} active, {totalPending} pending tasks
</p>
</div>
<div className="flex gap-2">
<button
onClick={handleGenerateResync}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-emerald-600 text-white rounded-lg hover:bg-emerald-700 disabled:opacity-50"
>
<Calendar className="w-4 h-4" />
Generate Resync
</button>
<button
onClick={handleRecoverStale}
disabled={actionLoading}
className="flex items-center gap-2 px-4 py-2 bg-gray-600 text-white rounded-lg hover:bg-gray-700 disabled:opacity-50"
>
<Zap className="w-4 h-4" />
Recover Stale
</button>
<button
onClick={fetchData}
className="flex items-center gap-2 px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
</div>
</div>
{/* Action Message */}
{actionMessage && (
<div
className={`p-4 rounded-lg ${
actionMessage.startsWith('Error')
? 'bg-red-50 text-red-700'
: 'bg-green-50 text-green-700'
}`}
>
{actionMessage}
</div>
)}
{error && (
<div className="p-4 bg-red-50 text-red-700 rounded-lg">{error}</div>
)}
{/* Status Summary Cards */}
<div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-4">
{Object.entries(counts || {}).map(([status, count]) => (
<div
key={status}
className={`p-4 rounded-lg border ${
statusFilter === status ? 'ring-2 ring-emerald-500' : ''
} cursor-pointer hover:shadow-md transition-shadow`}
onClick={() => setStatusFilter(statusFilter === status ? '' : status)}
>
<div className="flex items-center gap-2 mb-2">
<span className={`p-1.5 rounded ${STATUS_COLORS[status]}`}>
{STATUS_ICONS[status]}
</span>
<span className="text-sm font-medium text-gray-600 capitalize">{status}</span>
</div>
<div className="text-2xl font-bold text-gray-900">{count}</div>
</div>
))}
</div>
{/* Capacity Planning Section */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<button
onClick={() => setShowCapacity(!showCapacity)}
className="w-full flex items-center justify-between p-4 hover:bg-gray-50"
>
<div className="flex items-center gap-2">
<Gauge className="w-5 h-5 text-emerald-600" />
<span className="font-medium text-gray-900">Capacity Planning</span>
</div>
{showCapacity ? (
<ChevronUp className="w-5 h-5 text-gray-400" />
) : (
<ChevronDown className="w-5 h-5 text-gray-400" />
)}
</button>
{showCapacity && (
<div className="p-4 border-t border-gray-200">
{capacity.length === 0 ? (
<p className="text-gray-500 text-center py-4">No capacity data available</p>
) : (
<div className="overflow-x-auto">
<table className="min-w-full divide-y divide-gray-200">
<thead>
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Role
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Pending
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Running
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Active Workers
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Avg Duration
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Tasks/Worker/Hr
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Est. Drain Time
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Completed/Hr
</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-500 uppercase">
Failed/Hr
</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{capacity.map((metric) => (
<tr key={metric.role} className="hover:bg-gray-50">
<td className="px-4 py-3 text-sm font-medium text-gray-900">
{metric.role.replace(/_/g, ' ')}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.pending_tasks}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.running_tasks}
</td>
<td className="px-4 py-3 text-sm text-right">
<span className="inline-flex items-center gap-1">
<Users className="w-4 h-4 text-gray-400" />
{metric.active_workers}
</span>
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{formatDuration(metric.avg_duration_sec)}
</td>
<td className="px-4 py-3 text-sm text-right text-gray-600">
{metric.tasks_per_worker_hour?.toFixed(1) || '-'}
</td>
<td className="px-4 py-3 text-sm text-right">
{metric.estimated_hours_to_drain ? (
<span
className={
metric.estimated_hours_to_drain > 4
? 'text-red-600 font-medium'
: 'text-gray-600'
}
>
{metric.estimated_hours_to_drain.toFixed(1)}h
</span>
) : (
'-'
)}
</td>
<td className="px-4 py-3 text-sm text-right text-green-600">
{metric.completed_last_hour}
</td>
<td className="px-4 py-3 text-sm text-right text-red-600">
{metric.failed_last_hour}
</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
)}
</div>
{/* Filters */}
<div className="flex flex-col sm:flex-row gap-4">
<div className="relative flex-1">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-5 h-5 text-gray-400" />
<input
type="text"
placeholder="Search tasks..."
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
className="w-full pl-10 pr-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500 focus:border-emerald-500"
/>
</div>
<select
value={roleFilter}
onChange={(e) => setRoleFilter(e.target.value)}
className="px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500"
>
<option value="">All Roles</option>
{ROLES.map((role) => (
<option key={role} value={role}>
{role.replace(/_/g, ' ')}
</option>
))}
</select>
<select
value={statusFilter}
onChange={(e) => setStatusFilter(e.target.value)}
className="px-4 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-emerald-500"
>
<option value="">All Statuses</option>
<option value="pending">Pending</option>
<option value="claimed">Claimed</option>
<option value="running">Running</option>
<option value="completed">Completed</option>
<option value="failed">Failed</option>
<option value="stale">Stale</option>
</select>
</div>
{/* Tasks Table */}
<div className="bg-white rounded-lg border border-gray-200 overflow-hidden">
<div className="overflow-x-auto">
<table className="min-w-full divide-y divide-gray-200">
<thead className="bg-gray-50">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
ID
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Role
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Store
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Status
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Worker
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Duration
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Created
</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">
Error
</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{filteredTasks.length === 0 ? (
<tr>
<td colSpan={8} className="px-4 py-8 text-center text-gray-500">
No tasks found
</td>
</tr>
) : (
filteredTasks.map((task) => (
<tr key={task.id} className="hover:bg-gray-50">
<td className="px-4 py-3 text-sm font-mono text-gray-600">#{task.id}</td>
<td className="px-4 py-3 text-sm text-gray-900">
{task.role.replace(/_/g, ' ')}
</td>
<td className="px-4 py-3 text-sm text-gray-600">
{task.dispensary_name || task.dispensary_id || '-'}
</td>
<td className="px-4 py-3">
<span
className={`inline-flex items-center gap-1 px-2 py-1 rounded-full text-xs font-medium ${
STATUS_COLORS[task.status]
}`}
>
{STATUS_ICONS[task.status]}
{task.status}
</span>
</td>
<td className="px-4 py-3 text-sm font-mono text-gray-600">
{task.worker_id?.split('-').slice(-1)[0] || '-'}
</td>
<td className="px-4 py-3 text-sm text-gray-600">
{formatDuration(task.duration_sec ?? null)}
</td>
<td className="px-4 py-3 text-sm text-gray-500">
{formatTimeAgo(task.created_at)}
</td>
<td className="px-4 py-3 text-sm text-red-600 max-w-xs truncate">
{task.error_message || '-'}
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
</div>
</Layout>
);
}
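
The `CapacityMetric` interface used by the dashboard carries `estimated_hours_to_drain` and `workers_needed`, both computed on the server. A rough sketch of one way such figures could be derived from the pending count and per-worker throughput follows. The `estimateCapacity` helper and its formulas are assumptions for illustration only; the actual TaskService calculation is not part of this diff and may differ.

```typescript
// Hypothetical helper: the names and formulas are assumptions, not the TaskService implementation.
function estimateCapacity(
  pendingTasks: number,
  activeWorkers: number,
  tasksPerWorkerHour: number,
) {
  // Tasks completed per hour across all active workers.
  const throughput = activeWorkers * tasksPerWorkerHour;
  // With no throughput the queue never drains, so report null instead of Infinity.
  const estimatedHoursToDrain = throughput > 0 ? pendingTasks / throughput : null;
  // Workers required to clear the current backlog within the given number of hours.
  const workersFor = (hours: number): number =>
    tasksPerWorkerHour > 0 ? Math.ceil(pendingTasks / (tasksPerWorkerHour * hours)) : 0;
  return {
    estimatedHoursToDrain,
    workersNeeded: {
      for1Hour: workersFor(1),
      for4Hours: workersFor(4),
      for8Hours: workersFor(8),
    },
  };
}

// 240 pending tasks, 2 workers, each finishing 12 tasks per hour.
const capacityEstimate = estimateCapacity(240, 2, 12);
console.log(capacityEstimate);
```

Returning `null` when throughput is zero matches how the table renders a missing drain estimate as `-`.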