feat: Stealth worker system with mandatory proxy rotation

## Worker System
- Role-agnostic workers that can handle any task type
- Pod-based architecture with StatefulSet (5-15 pods, 5 workers each)
- Custom pod names (Aethelgard, Xylos, Kryll, etc.)
- Worker registry with friendly names and resource monitoring
- Hub-and-spoke visualization on JobQueue page

## Stealth & Anti-Detection (REQUIRED)
- Proxies are MANDATORY - workers fail to start without active proxies
- CrawlRotator initializes on worker startup
- Loads proxies from `proxies` table
- Auto-rotates proxy + fingerprint on 403 errors
- 12 browser fingerprints (Chrome, Firefox, Safari, Edge)
- Locale/timezone matching for geographic consistency

## Task System
- Renamed product_resync → product_refresh
- Task chaining: store_discovery → entry_point → product_discovery
- Priority-based claiming with FOR UPDATE SKIP LOCKED
- Heartbeat and stale task recovery

## UI Updates
- JobQueue: Pod visualization, resource monitoring on hover
- WorkersDashboard: Simplified worker list
- Removed unused filters from task list

## Other
- IP2Location service for visitor analytics
- Findagram consumer features scaffolding
- Documentation updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Kelly
2025-12-10 00:44:59 -07:00
parent 0295637ed6
commit 56cc171287
61 changed files with 8591 additions and 2076 deletions

View File

@@ -0,0 +1,65 @@
#!/bin/bash
# Download IP2Location LITE DB3 (City-level) database
# Free for commercial use with attribution
# https://lite.ip2location.com/database/db3-ip-country-region-city
set -e
DATA_DIR="${1:-./data/ip2location}"
DB_FILE="IP2LOCATION-LITE-DB3.BIN"
mkdir -p "$DATA_DIR"
cd "$DATA_DIR"
echo "Downloading IP2Location LITE DB3 database..."
# IP2Location LITE DB3 - includes city, region, country, lat/lng
# You need to register at https://lite.ip2location.com/ to get a download token
# Then set IP2LOCATION_TOKEN environment variable
if [ -z "$IP2LOCATION_TOKEN" ]; then
echo ""
echo "ERROR: IP2LOCATION_TOKEN not set"
echo ""
echo "To download the database:"
echo "1. Register free at https://lite.ip2location.com/"
echo "2. Get your download token from the dashboard"
echo "3. Run: IP2LOCATION_TOKEN=your_token ./scripts/download-ip2location.sh"
echo ""
exit 1
fi
# Download DB3.LITE (IPv4 + City)
DOWNLOAD_URL="https://www.ip2location.com/download/?token=${IP2LOCATION_TOKEN}&file=DB3LITEBIN"
echo "Downloading from IP2Location..."
curl -L -o ip2location.zip "$DOWNLOAD_URL"
echo "Extracting..."
unzip -o ip2location.zip
# Rename to standard name
if [ -f "IP2LOCATION-LITE-DB3.BIN" ]; then
echo "Database ready: $DATA_DIR/IP2LOCATION-LITE-DB3.BIN"
elif [ -f "IP-COUNTRY-REGION-CITY.BIN" ]; then
mv "IP-COUNTRY-REGION-CITY.BIN" "$DB_FILE"
echo "Database ready: $DATA_DIR/$DB_FILE"
else
# Find whatever BIN file was extracted
BIN_FILE=$(ls *.BIN 2>/dev/null | head -1)
if [ -n "$BIN_FILE" ]; then
mv "$BIN_FILE" "$DB_FILE"
echo "Database ready: $DATA_DIR/$DB_FILE"
else
echo "ERROR: No BIN file found in archive"
ls -la
exit 1
fi
fi
# Cleanup
rm -f ip2location.zip *.txt LICENSE* README*
echo ""
echo "Done! Database saved to: $DATA_DIR/$DB_FILE"
echo "Update monthly by re-running this script."