feat: Remove Run Now, add source tracking, optimize dashboard

- Remove /run-now endpoint (use task priority instead)
- Add source tracking to worker_tasks (source, source_schedule_id, source_metadata)
- Parallelize dashboard API calls (Promise.all)
- Add 1-5 min caching to /markets/dashboard and /national/summary
- Add performance indexes for dashboard queries

Migrations:
- 104: Task source tracking columns
- 105: Dashboard performance indexes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Kelly
2025-12-13 13:23:35 -07:00
parent a8fec97bcb
commit 8b3ae40089
13 changed files with 271 additions and 460 deletions

View File

@@ -1,5 +1,8 @@
# Claude Guidelines for CannaiQ
## CURRENT ENVIRONMENT: PRODUCTION
**We are working in PRODUCTION only.** All database queries and API calls should target the remote production environment, not localhost. Use kubectl port-forward or remote DB connections as needed.
## PERMANENT RULES (NEVER VIOLATE)
### 1. NO DELETE
@@ -247,14 +250,14 @@ These binaries mimic real browser TLS fingerprints to avoid detection.
---
## Staggered Task Workflow (Added 2025-12-12)
## Bulk Task Workflow (Updated 2025-12-13)
### Overview
When creating many tasks at once (e.g., product refresh for all AZ stores), staggered scheduling prevents resource contention, proxy assignment lag, and API rate limiting.
Tasks are created with `scheduled_for = NOW()` by default. Worker-level controls handle pacing - no task-level staggering needed.
### How It Works
```
1. Task created with scheduled_for = NOW() + (index * stagger_seconds)
1. Task created with scheduled_for = NOW()
2. Worker claims task only when scheduled_for <= NOW()
3. Worker runs preflight on EVERY task claim (proxy health check)
4. If preflight passes, worker executes task
@@ -263,57 +266,51 @@ When creating many tasks at once (e.g., product refresh for all AZ stores), stag
7. Repeat - preflight runs on each new task claim
```
### Worker-Level Throttling
These controls pace task execution - no staggering at task creation time:
| Control | Purpose |
|---------|---------|
| `MAX_CONCURRENT_TASKS` | Limits concurrent tasks per pod (default: 3) |
| Working hours | Restricts when tasks run (configurable per schedule) |
| Preflight checks | Ensures proxy health before each task |
| Per-store locking | Only one active task per dispensary |
### Key Points
- **Preflight is per-task, not per-startup**: Each task claim triggers a new preflight check
- **Stagger prevents thundering herd**: 15 seconds between tasks is default
- **Task assignment is the trigger**: Worker picks up task → runs preflight → executes if passed
- **Worker controls pacing**: Tasks scheduled for NOW() but claimed based on worker capacity
- **Optional staggering**: Pass `stagger_seconds > 0` if you need explicit delays
### API Endpoints
```bash
# Create staggered tasks for specific dispensary IDs
# Create bulk tasks for specific dispensary IDs
POST /api/tasks/batch/staggered
{
"dispensary_ids": [1, 2, 3, 4],
"role": "product_refresh", # or "product_discovery"
"stagger_seconds": 15, # default: 15
"stagger_seconds": 0, # default: 0 (all NOW)
"platform": "dutchie", # default: "dutchie"
"method": null # "curl" | "http" | null
}
# Create staggered tasks for AZ stores (convenience endpoint)
POST /api/tasks/batch/az-stores
# Create bulk tasks for all stores in a state
POST /api/tasks/crawl-state/:stateCode
{
"total_tasks": 24, # default: 24
"stagger_seconds": 15, # default: 15
"split_roles": true # default: true (12 refresh, 12 discovery)
"stagger_seconds": 0, # default: 0 (all NOW)
"method": "http" # default: "http"
}
```
### Example: 24 Tasks for AZ Stores
### Example: Tasks for AZ Stores
```bash
curl -X POST http://localhost:3010/api/tasks/batch/az-stores \
-H "Content-Type: application/json" \
-d '{"total_tasks": 24, "stagger_seconds": 15, "split_roles": true}'
```
Response:
```json
{
"success": true,
"total": 24,
"product_refresh": 12,
"product_discovery": 12,
"stagger_seconds": 15,
"total_duration_seconds": 345,
"estimated_completion": "2025-12-12T08:40:00.000Z",
"message": "Created 24 staggered tasks for AZ stores (12 refresh, 12 discovery)"
}
curl -X POST http://localhost:3010/api/tasks/crawl-state/AZ \
-H "Content-Type: application/json"
```
### Related Files
| File | Purpose |
|------|---------|
| `src/tasks/task-service.ts` | `createStaggeredTasks()` and `createAZStoreTasks()` methods |
| `src/tasks/task-service.ts` | `createStaggeredTasks()` method |
| `src/routes/tasks.ts` | API endpoints for batch task creation |
| `src/tasks/task-worker.ts` | Worker task claiming and preflight logic |