feat: Worker improvements and Run Now duplicate prevention
- Fix Run Now to prevent duplicate task creation - Add loading state to Run Now button in UI - Return early when no stores need refresh - Worker dashboard improvements - Browser pooling architecture updates - K8s worker config updates (8 replicas, 3 concurrent tasks)
This commit is contained in:
@@ -504,6 +504,103 @@ The Workers Dashboard shows:
|
||||
| `src/routes/worker-registry.ts:148-195` | Heartbeat endpoint handling |
|
||||
| `cannaiq/src/pages/WorkersDashboard.tsx:233-305` | UI components for resources |
|
||||
|
||||
## Browser Task Memory Limits (Updated 2025-12)
|
||||
|
||||
Browser-based tasks (Puppeteer/Chrome) have strict memory constraints that limit concurrency.
|
||||
|
||||
### Why Browser Tasks Are Different
|
||||
|
||||
Each browser task launches a Chrome process. Unlike I/O-bound API calls, browsers consume significant RAM:
|
||||
|
||||
| Component | RAM Usage |
|
||||
|-----------|-----------|
|
||||
| Node.js runtime | ~150 MB |
|
||||
| Chrome browser (base) | ~200-250 MB |
|
||||
| Dutchie menu page (loaded) | ~100-150 MB |
|
||||
| **Per browser total** | **~350-450 MB** |
|
||||
|
||||
### Memory Math for Pod Limits
|
||||
|
||||
```
|
||||
Pod memory limit: 2 GB (2000 MB)
|
||||
Node.js runtime: -150 MB
|
||||
Safety buffer: -100 MB
|
||||
────────────────────────────────
|
||||
Available for browsers: 1750 MB
|
||||
|
||||
Per browser + page: ~400 MB
|
||||
|
||||
Max browsers: 1750 ÷ 400 = ~4 browsers
|
||||
|
||||
Recommended: 3 browsers (leaves headroom for spikes)
|
||||
```
|
||||
|
||||
### MAX_CONCURRENT_TASKS for Browser Tasks
|
||||
|
||||
| Browsers per Pod | RAM Used | Risk Level |
|
||||
|------------------|----------|------------|
|
||||
| 1 | ~500 MB | Very safe |
|
||||
| 2 | ~900 MB | Safe |
|
||||
| **3** | **~1.3 GB** | **Recommended** |
|
||||
| 4 | ~1.7 GB | Tight (may OOM) |
|
||||
| 5+ | >2 GB | Will OOM crash |
|
||||
|
||||
**CRITICAL**: `MAX_CONCURRENT_TASKS=3` is the maximum safe value for browser tasks with current pod limits.
|
||||
|
||||
### Scaling Strategy
|
||||
|
||||
Scale **horizontally** (more pods) rather than vertically (more concurrency per pod):
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Cluster: 8 pods × 3 browsers = 24 concurrent tasks │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
|
||||
│ │ Pod 0 │ │ Pod 1 │ │ Pod 2 │ │ Pod 3 │ │
|
||||
│ │ 3 browsers │ │ 3 browsers │ │ 3 browsers │ │ 3 browsers │ │
|
||||
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
|
||||
│ │ Pod 4 │ │ Pod 5 │ │ Pod 6 │ │ Pod 7 │ │
|
||||
│ │ 3 browsers │ │ 3 browsers │ │ 3 browsers │ │ 3 browsers │ │
|
||||
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Browser Lifecycle Per Task
|
||||
|
||||
Each task gets a fresh browser with fresh IP/identity:
|
||||
|
||||
```
|
||||
1. Claim task from queue
|
||||
2. Get fresh proxy from pool
|
||||
3. Launch browser with proxy
|
||||
4. Run preflight (verify IP)
|
||||
5. Execute scrape
|
||||
6. Close browser
|
||||
7. Repeat
|
||||
```
|
||||
|
||||
This ensures:
|
||||
- Fresh IP per task (proxy rotation)
|
||||
- Fresh fingerprint per task (UA rotation)
|
||||
- No cookie/session bleed between tasks
|
||||
- Predictable memory usage
|
||||
|
||||
### Increasing Capacity
|
||||
|
||||
To handle more concurrent tasks:
|
||||
|
||||
1. **Add more pods** (up to 8 per CLAUDE.md limit)
|
||||
2. **Increase pod memory** (allows 4 browsers per pod):
|
||||
```yaml
|
||||
resources:
|
||||
limits:
|
||||
memory: "2.5Gi" # from 2Gi
|
||||
```
|
||||
|
||||
**DO NOT** simply increase `MAX_CONCURRENT_TASKS` without also increasing pod memory limits.
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Logs
|
||||
|
||||
Reference in New Issue
Block a user