feat: Add stale process monitor, users route, landing page, archive old scripts
- Add backend stale process monitoring API (/api/stale-processes)
- Add users management route
- Add frontend landing page and stale process monitor UI on /scraper-tools
- Move old development scripts to backend/archive/
- Update frontend build with new features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
82
CLAUDE.md
@@ -211,3 +211,85 @@
- **Trigger schedules manually**: `curl -X POST /api/az/admin/schedules/{id}/trigger`
- **Check schedule status**: `curl /api/az/admin/schedules`
- **Worker logs**: `kubectl logs -f deployment/scraper-worker -n dispensary-scraper`

24) **Crawler Maintenance Procedure (Check Jobs, Requeue, Restart)**
When crawlers are stuck or jobs aren't processing, follow this procedure:

**Step 1: Check Job Status**
```bash
# Port-forward to production
kubectl port-forward -n dispensary-scraper deployment/scraper 3099:3010 &

# Check active/stuck jobs
curl -s http://localhost:3099/api/az/monitor/active-jobs | jq .

# Check recent job history
curl -s "http://localhost:3099/api/az/monitor/jobs?limit=20" | jq '.jobs[] | {id, job_type, status, dispensary_id, started_at, products_found, duration_min: (.duration_ms/60000 | floor)}'

# Check schedule status
curl -s http://localhost:3099/api/az/admin/schedules | jq '.schedules[] | {id, jobName, enabled, lastRunAt, lastStatus, nextRunAt}'
```
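
Because the port-forward in Step 1 is started in the background, the first `curl` can run before the tunnel is ready. One way to handle that is a small retry wrapper. This is only a sketch: the `retry` function is a hypothetical helper and is not part of this repository.

```bash
# Hypothetical helper (not in the repo): retry a command until it succeeds
# or the attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    # Give up once the last allowed attempt has failed.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (assumes the port-forward from Step 1 is running):
#   retry 10 1 curl -sf http://localhost:3099/api/az/monitor/active-jobs >/dev/null
```

The example uses `curl -f` so that an HTTP error status counts as a failed attempt.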

**Step 2: Reset Stuck Jobs**
A job is considered stuck if it has `status='running'` and its last heartbeat is more than 30 minutes old or was never recorded:
```bash
# Via API (if endpoint exists)
curl -s -X POST http://localhost:3099/api/az/admin/reset-stuck-jobs

# Via direct DB (if API not available)
# Note: $DATABASE_URL is expanded by your local shell, not inside the pod
kubectl exec -n dispensary-scraper deployment/scraper -- psql $DATABASE_URL -c "
UPDATE dispensary_crawl_jobs
SET status = 'failed',
    error_message = 'Job timed out - worker stopped sending heartbeats',
    completed_at = NOW()
WHERE status = 'running'
  AND (last_heartbeat_at < NOW() - INTERVAL '30 minutes' OR last_heartbeat_at IS NULL);
"
```
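
The stuck-job rule used in Step 2 can also be written as a small shell predicate, for example to sanity-check a single job before resetting it. This is an illustration only: the function name `is_stuck` and its epoch-seconds arguments are assumptions and do not exist in the scraper codebase.

```bash
# Hypothetical helper (not in the repo): mirrors the WHERE clause from Step 2.
# Usage: is_stuck STATUS LAST_HEARTBEAT_EPOCH NOW_EPOCH
# Pass an empty string for LAST_HEARTBEAT_EPOCH when last_heartbeat_at is NULL.
is_stuck() {
  local status=$1 last_heartbeat=$2 now=$3
  local threshold=$((30 * 60))   # 30 minutes, in seconds

  # Only running jobs can be stuck.
  [ "$status" = "running" ] || return 1

  # A running job that never sent a heartbeat counts as stuck.
  [ -n "$last_heartbeat" ] || return 0

  # Stuck when the last heartbeat is strictly older than the threshold.
  [ $((now - last_heartbeat)) -gt "$threshold" ]
}
```

For instance, `is_stuck running "" "$(date +%s)"` succeeds because the heartbeat is missing, while any job whose status is not `running` is never reported as stuck.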

**Step 3: Requeue Jobs (Trigger Fresh Crawl)**
```bash
# Trigger product crawl schedule (typically ID 1)
curl -s -X POST http://localhost:3099/api/az/admin/schedules/1/trigger

# Trigger menu detection schedule (typically ID 2)
curl -s -X POST http://localhost:3099/api/az/admin/schedules/2/trigger

# Or crawl a specific dispensary
curl -s -X POST http://localhost:3099/api/az/admin/crawl/112
```

**Step 4: Restart Crawler Workers**
```bash
# Restart scraper-worker pods (clears any stuck processes)
kubectl rollout restart deployment/scraper-worker -n dispensary-scraper

# Watch rollout progress
kubectl rollout status deployment/scraper-worker -n dispensary-scraper

# Optionally restart main scraper pod too
kubectl rollout restart deployment/scraper -n dispensary-scraper
```

**Step 5: Monitor Recovery**
```bash
# Watch worker logs
kubectl logs -f deployment/scraper-worker -n dispensary-scraper --tail=50

# Check dashboard for product counts
curl -s http://localhost:3099/api/az/dashboard | jq '{totalStores, totalProducts, storesByType}'

# Verify jobs are processing
curl -s http://localhost:3099/api/az/monitor/active-jobs | jq .
```

**Quick One-Liner for Full Reset:**
```bash
# Reset stuck jobs and restart workers
kubectl exec -n dispensary-scraper deployment/scraper -- psql $DATABASE_URL -c "UPDATE dispensary_crawl_jobs SET status='failed', completed_at=NOW() WHERE status='running' AND (last_heartbeat_at < NOW() - INTERVAL '30 minutes' OR last_heartbeat_at IS NULL);" && kubectl rollout restart deployment/scraper-worker -n dispensary-scraper && kubectl rollout status deployment/scraper-worker -n dispensary-scraper
```

**Cleanup port-forwards when done:**
```bash
pkill -f "port-forward.*dispensary-scraper"
```
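
As an alternative to the pattern-based `pkill`, the PID of the background port-forward can be recorded when it starts and then killed explicitly, so unrelated port-forwards are left alone. The sketch below is an assumption about how that could look: `start_forward`, `stop_forward`, and `FORWARD_PID` are hypothetical names, and `sleep 300` stands in for the real `kubectl port-forward` command so the example runs anywhere.

```bash
# Hypothetical helpers (not in the repo): track the background PID
# instead of matching processes by pattern.
start_forward() {
  # Run the given command in the background and remember its PID.
  "$@" >/dev/null 2>&1 &
  FORWARD_PID=$!
}

stop_forward() {
  # Kill only the process we started, then reap it.
  if [ -n "${FORWARD_PID:-}" ]; then
    kill "$FORWARD_PID" 2>/dev/null || true
    wait "$FORWARD_PID" 2>/dev/null || true
    FORWARD_PID=
  fi
  return 0
}

# Stand-in for:
#   start_forward kubectl port-forward -n dispensary-scraper deployment/scraper 3099:3010
start_forward sleep 300
# ... run the curl checks here ...
stop_forward
```

Calling `stop_forward` from a `trap ... EXIT` would also clean up when a script aborts early.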
@@ -412,3 +412,65 @@ HTTP Status Codes:
- `404` - Not Found
- `429` - Too Many Requests
- `500` - Server Error

---

## WordPress Plugin

### Plugin Download

The WordPress plugin is available for download from the admin dashboard landing page.

**Download URL Pattern:**
```
/downloads/cb-wpmenu-{version}.zip
```

Example: `/downloads/cb-wpmenu-1.5.1.zip`

### Building the Plugin

Use the build script to create a new plugin zip for distribution:

```bash
cd wordpress-plugin
./build-plugin.sh
```

The build script will:
1. Extract the version from `crawlsy-menus.php`
2. Create a zip file named `cb-wpmenu-{version}.zip`
3. Output the file to `backend/public/downloads/`
4. Display a reminder to update the landing page
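
The contents of `build-plugin.sh` are not shown in this diff. As a rough sketch of what steps 1 and 2 might involve, assuming the plugin header contains a line of the form `* Version: X.Y.Z`, the version lookup and zip naming could be written as follows. Both function names are hypothetical.

```bash
# Hypothetical sketch of steps 1-2; the real build-plugin.sh may differ.

# Print the version from the first "Version:" line of a plugin header file.
plugin_version() {
  sed -n 's/.*Version:[[:space:]]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Build the zip file name from the documented cb-wpmenu-{version}.zip pattern.
zip_name() {
  printf 'cb-wpmenu-%s.zip\n' "$1"
}

# Example usage (run from wordpress-plugin/):
#   version=$(plugin_version crawlsy-menus.php)
#   zip_name "$version"
```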

### Release Checklist

When releasing a new plugin version:

1. **Update the version** in `wordpress-plugin/crawlsy-menus.php`:
```php
* Version: 1.5.2
```

2. **Build the plugin**:
```bash
cd wordpress-plugin
./build-plugin.sh
```

3. **Update the landing page** (`frontend/src/pages/LandingPage.tsx`):
- Update download URLs: `href="/downloads/cb-wpmenu-{version}.zip"`
- Update button text: `Download Plugin v{version}`

4. **Deploy**:
- Local: The zip is served from `backend/public/downloads/`
- Remote: Commit and push changes, then deploy
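
Step 3 of the checklist is easy to forget, so a pre-release check could confirm that the landing page mentions the new version. The following is a minimal sketch under the assumption that the download URL and button text appear literally in the file. `landing_page_matches` is a hypothetical helper, not an existing script.

```bash
# Hypothetical pre-release check (not in the repo).
# Succeeds only if FILE contains both the download URL and the button text
# for VERSION, matched as fixed strings.
landing_page_matches() {
  local version=$1 file=$2
  grep -qF "/downloads/cb-wpmenu-${version}.zip" "$file" &&
    grep -qF "Download Plugin v${version}" "$file"
}

# Example usage (run from the repository root):
#   landing_page_matches 1.5.2 frontend/src/pages/LandingPage.tsx || echo "landing page is out of date"
```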

### Naming Convention

| Component | Format | Example |
|-----------|--------|---------|
| Plugin file | `crawlsy-menus.php` | - |
| Zip file | `cb-wpmenu-{version}.zip` | `cb-wpmenu-1.5.1.zip` |
| Download URL | `/downloads/cb-wpmenu-{version}.zip` | `/downloads/cb-wpmenu-1.5.1.zip` |
| Shortcodes | `[cb_products]`, `[cb_product]` | `[cb_products limit="12"]` |
@@ -43,7 +43,7 @@ ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
WORKDIR /app

COPY package*.json ./
-RUN npm ci --only=production
+RUN npm ci --omit=dev

COPY --from=builder /app/dist ./dist

Some files were not shown because too many files have changed in this diff.