feat: Parallelized store discovery, modification tracking, and task deduplication

Store Discovery Parallelization:
- Add store_discovery_state handler for per-state parallel discovery
- Add POST /api/tasks/batch/store-discovery endpoint
- 8 workers can now process states in parallel (~30-45 min vs 3+ hours)

Modification Tracking (Migration 090):
- Add last_modified_at, last_modified_by_task, last_modified_task_id to dispensaries
- Add same columns to store_products
- Update all handlers to set tracking info on modifications

Stale Task Recovery:
- Add periodic stale cleanup every 10 minutes (worker-0 only)
- Prevents orphaned tasks from blocking queue after worker crashes

Task Deduplication:
- createStaggeredTasks now skips if pending/active task exists for same role
- Skips if same role completed within last 4 hours
- API responses include skipped count

🤖 Generated with [Claude Code](https://claude.com/claude-code)
This commit is contained in:
Kelly
2025-12-12 22:15:04 -07:00
parent e4e8438d8b
commit c62f8cbf06
11 changed files with 815 additions and 51 deletions

View File

@@ -326,13 +326,16 @@ export async function handleProductDiscoveryHttp(ctx: TaskContext): Promise<Task
console.log(`[ProductDiscoveryHTTP] Saved payload #${payloadResult.id} (${(payloadResult.sizeBytes / 1024).toFixed(1)}KB)`);
// ============================================================
// STEP 6: Update dispensary last_fetch_at
// STEP 6: Update dispensary last_fetch_at and tracking
// ============================================================
await pool.query(`
UPDATE dispensaries
SET last_fetch_at = NOW()
SET last_fetch_at = NOW(),
last_modified_at = NOW(),
last_modified_by_task = $2,
last_modified_task_id = $3
WHERE id = $1
`, [dispensaryId]);
`, [dispensaryId, task.role, task.id]);
// ============================================================
// STEP 7: Queue product_refresh task to process the payload