feat: Worker improvements and Run Now duplicate prevention

- Fix Run Now to prevent duplicate task creation
- Add loading state to Run Now button in UI
- Return early when no stores need refresh
- Worker dashboard improvements
- Browser pooling architecture updates
- K8s worker config updates (8 replicas, 3 concurrent tasks)
Kelly
2025-12-12 20:11:31 -07:00
parent c98c409f59
commit 63023a4061
12 changed files with 809 additions and 239 deletions


````diff
@@ -25,13 +25,26 @@ Never import `src/db/migrate.ts` at runtime. Use `src/db/pool.ts` for DB access.
 - **Worker** = Concurrent task runner INSIDE a pod (controlled by `MAX_CONCURRENT_TASKS` env var)
 - Formula: `8 pods × MAX_CONCURRENT_TASKS = total concurrent workers`
-**To increase workers:** Change `MAX_CONCURRENT_TASKS` env var, NOT replicas.
-```bash
-# CORRECT - increase workers per pod
-kubectl set env deployment/scraper-worker -n dispensary-scraper MAX_CONCURRENT_TASKS=5
+**Browser Task Memory Limits:**
+- Each Puppeteer/Chrome browser uses ~400 MB RAM
+- Pod memory limit is 2 GB
+- **MAX_CONCURRENT_TASKS=3** is the safe maximum for browser tasks
+- More than 3 concurrent browsers per pod = OOM crash
-# WRONG - never scale above 8 replicas
-kubectl scale deployment/scraper-worker --replicas=20 # NEVER DO THIS
+| Browsers | RAM Used | Status |
+|----------|----------|--------|
+| 3 | ~1.3 GB | Safe (recommended) |
+| 4 | ~1.7 GB | Risky |
+| 5+ | >2 GB | OOM crash |
+**To increase throughput:** Add more pods (up to 8), NOT more concurrent tasks per pod.
+```bash
+# CORRECT - scale pods (up to 8)
+kubectl scale deployment/scraper-worker -n dispensary-scraper --replicas=8
+# WRONG - will cause OOM crashes
+kubectl set env deployment/scraper-worker -n dispensary-scraper MAX_CONCURRENT_TASKS=10
 ```
 **If K8s API returns ServiceUnavailable:** STOP IMMEDIATELY. Do not retry. The cluster is overloaded.
````
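
The capacity formula and the memory table in this hunk can be checked with a little arithmetic. The following is a minimal TypeScript sketch, not code from this commit: every function and constant name is hypothetical, the 2 GB pod limit is treated as 2000 MB, and the 100 MB base overhead is an assumption chosen only because it reproduces the table's ~1.3 GB and ~1.7 GB figures.

```typescript
// Figures from the notes in the diff: ~400 MB per browser, 2 GB pod limit,
// at most 8 pods, and 3 concurrent tasks as the safe maximum.
const MB_PER_BROWSER = 400;
const POD_LIMIT_MB = 2000; // assumption: "2 GB" read as 2000 MB
const BASE_OVERHEAD_MB = 100; // assumption: not stated in the source
const SAFE_MAX_TASKS = 3;
const MAX_PODS = 8;

type Status = "safe" | "risky" | "oom";

// Estimated pod memory for a given number of concurrent browsers.
function estimatePodMemoryMb(browsers: number): number {
  return BASE_OVERHEAD_MB + browsers * MB_PER_BROWSER;
}

// Mirrors the Status column of the table.
function classify(browsers: number): Status {
  if (estimatePodMemoryMb(browsers) > POD_LIMIT_MB) return "oom";
  return browsers <= SAFE_MAX_TASKS ? "safe" : "risky";
}

// Formula from the notes: pods × MAX_CONCURRENT_TASKS = total concurrent workers.
function totalWorkers(pods: number, tasksPerPod: number): number {
  if (pods > MAX_PODS) {
    throw new RangeError(`never scale above ${MAX_PODS} pods`);
  }
  return pods * tasksPerPod;
}

for (const n of [3, 4, 5]) {
  console.log(`${n} browsers: ~${estimatePodMemoryMb(n)} MB (${classify(n)})`);
}
console.log(`total workers: ${totalWorkers(8, 3)}`);
```

Under these assumptions the configuration in the commit message (8 replicas, 3 concurrent tasks) gives 24 concurrent workers, with each pod at roughly 1.3 GB.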