451 Commits

Author SHA1 Message Date
Kelly
287627195c chore: trigger CI 2025-12-15 08:01:56 -07:00
Kelly
bfb965fa44 chore: trigger CI 2025-12-15 07:58:10 -07:00
Kelly
7bbc77a854 chore: trigger CI 2025-12-15 07:57:40 -07:00
Kelly
39ba522643 chore: trigger CI 2025-12-15 07:57:08 -07:00
Kelly
6ea4cd0d05 chore: trigger CI 2025-12-15 07:56:50 -07:00
Kelly
520cba9d31 chore: trigger CI 2025-12-15 07:56:44 -07:00
Kelly
331b6273ac chore: trigger build 2025-12-15 07:53:42 -07:00
Kelly
d4a18cc3ce chore: test CI agent 2025-12-15 07:51:47 -07:00
Kelly
977803d862 chore: trigger CI build 2025-12-15 07:48:39 -07:00
Kelly
48c640aae5 chore: trigger CI 2025-12-15 07:44:38 -07:00
Kelly
918a1c6b26 chore: trigger CI after Woodpecker activation 2025-12-15 07:26:25 -07:00
Kelly
c7541ec2eb chore: Rename all references from dispensary-scraper to cannaiq 2025-12-15 07:19:33 -07:00
Kelly
8676762d6b chore: trigger CI build 2025-12-15 06:51:58 -07:00
Kelly
3f393ef77f fix: Correct repo name in auto-merge URL 2025-12-15 06:42:16 -07:00
Kelly
a8360c7260 feat: Migrate to spdy.io infrastructure
- Namespace: dispensary-scraper → cannaiq
- Registry: code.cannabrands.app → git.spdy.io
- Database: External PostgreSQL at 10.100.6.50
- MinIO: Internal at 10.100.9.80:9000
- CI: ci.spdy.io

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:40:48 -07:00
Kelly
0979c9c37a Revert "feat(scheduler): Support sub-hour interval_minutes in task_schedules"
This reverts commit b607fd7f44.
2025-12-14 18:50:25 -07:00
Kelly
b607fd7f44 feat(scheduler): Support sub-hour interval_minutes in task_schedules
- Add interval_minutes column to TaskSchedule interface
- Prefer interval_minutes over interval_hours when calculating next_run_at
- Add jitter (0-20% of interval) for sub-hour schedules to prevent detection
- Update getSchedules() to include interval_minutes and dispensary_name
- Update updateSchedule() to allow setting interval_minutes
- Add migration 121 for interval_minutes column

Part of Real-Time Inventory Tracking feature.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 18:22:55 -07:00
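
The interval selection and jitter described in b607fd7f44 could look roughly like the sketch below. Only the column names `interval_minutes` and `interval_hours` and the 0-20% jitter range come from the commit message; the `ScheduleInterval` shape, the function names, and the injectable `random` parameter are assumptions made for illustration, not the repository's code.

```typescript
// Hypothetical shape: only the two column names are taken from the commit message.
interface ScheduleInterval {
  interval_minutes: number | null;
  interval_hours: number;
}

// Effective interval in minutes, preferring interval_minutes when it is set.
function effectiveIntervalMinutes(s: ScheduleInterval): number {
  return s.interval_minutes ?? s.interval_hours * 60;
}

// Computes the next run time. Sub-hour schedules get an extra delay of
// 0-20% of the interval; `random` is injectable so tests are deterministic.
function nextRunAt(
  s: ScheduleInterval,
  now: Date,
  random: () => number = Math.random,
): Date {
  const minutes = effectiveIntervalMinutes(s);
  const jitterMinutes = minutes < 60 ? minutes * 0.2 * random() : 0;
  return new Date(now.getTime() + (minutes + jitterMinutes) * 60_000);
}
```

With a 15-minute interval, the next run would land between 15 and 18 minutes after `now`; hourly-or-longer schedules are left without jitter in this sketch.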
Kelly
bf988529eb fix(ci): switch from buildx to regular docker plugin
BuildKit container driver has sysctl permission issues in LXC.
Using plugins/docker instead of woodpeckerci/plugin-docker-buildx.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 17:05:06 -07:00
Kelly
04153a2efa chore: retry CI after docker update 2025-12-14 17:01:05 -07:00
Kelly
a1a6876064 chore: retry CI after docker restart 2025-12-14 16:55:33 -07:00
Kelly
83466a03c3 chore: retry CI build 2025-12-14 16:40:52 -07:00
Kelly
35d6a17740 feat: Add daily baseline payload logic (12:01 AM - 3:00 AM window)
- Replace saveRawPayload with saveDailyBaseline in all handlers
- Full payloads only saved once per day per store during window
- Inventory snapshots still saved every crawl (lightweight tracking)
- Add last_baseline_at column to dispensaries table
- Show baseline status in Per-Store Schedules dashboard
- Display baseline window info (12:01 AM - 3:00 AM) in UI

Reduces storage ~95% for high-frequency stores while maintaining
full audit capability via daily baselines.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 16:24:41 -07:00
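
A minimal sketch of the once-per-day baseline decision from 35d6a17740. The window bounds (12:01 AM to 3:00 AM) and the idea of a `last_baseline_at` timestamp come from the commit message; the function names, the inclusive bounds, and the use of the process's local clock (rather than each store's timezone) are assumptions.

```typescript
// True when a wall-clock time falls inside the baseline window.
// Both ends (12:01 AM and 3:00 AM) are treated as inclusive in this sketch.
function inBaselineWindow(hour: number, minute: number): boolean {
  const minutesSinceMidnight = hour * 60 + minute;
  return minutesSinceMidnight >= 1 && minutesSinceMidnight <= 180;
}

// Save a full payload only inside the window, and only if no baseline
// was already saved on the same calendar date (local time, an assumption).
function shouldSaveBaseline(now: Date, lastBaselineAt: Date | null): boolean {
  if (!inBaselineWindow(now.getHours(), now.getMinutes())) return false;
  if (lastBaselineAt === null) return true;
  return lastBaselineAt.toDateString() !== now.toDateString();
}
```

Crawls outside the window, or later crawls inside it on the same date, would skip the full payload and save only the lightweight inventory snapshot.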
Kelly
294d3db7a2 fix: Remove NOW() from partial indexes (not immutable)
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 15:58:05 -07:00
Kelly
bbbd21ba94 chore: Ignore test scripts and .claude directory 2025-12-14 15:57:27 -07:00
Kelly
3496be3064 feat(treez): Fetch all products with match_all query (+19% more)
- Update buildProductQuery() to use match_all by default
- Captures hidden, below-threshold, and out-of-stock products
- Add extractPrimaryImage() and extractImages() to normalizer
- Add product_refresh_treez handler for platform-specific refresh
- Add product_refresh_treez to TaskRole type

Best Dispensary: 228 → 271 products (+43)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 15:56:06 -07:00
Kelly
af859a85f9 feat: Add Real-Time Inventory Tracking infrastructure
Implements per-store high-frequency crawl scheduling and inventory
snapshot tracking for sales velocity estimation (Hoodie Analytics parity).

Database migrations:
- 117: Per-store crawl_interval_minutes and next_crawl_at columns
- 118: inventory_snapshots table (30-day retention)
- 119: product_visibility_events table for OOS/brand alerts (90-day)

Backend changes:
- inventory-snapshots.ts: Shared utility normalizing Dutchie/Jane/Treez
- visibility-events.ts: Detects OOS, price changes, brand drops
- task-scheduler.ts: checkHighFrequencyStores() runs every 60s
- Handler updates: 2-line additions to save snapshots/events

API endpoints:
- GET /api/tasks/schedules/high-frequency
- PUT /api/tasks/schedules/high-frequency/:id
- DELETE /api/tasks/schedules/high-frequency/:id

Frontend:
- TasksDashboard: Per-Store Schedules section with stats

Features:
- Per-store intervals (15/30/60 min configurable)
- Jitter (0-20%) to avoid detection patterns
- Cross-platform support (Dutchie, Jane, Treez)
- No crawler core changes - scheduling/post-crawl only

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 15:53:04 -07:00
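
To illustrate the kind of comparison visibility-events.ts is described as doing in af859a85f9, here is a rough sketch that diffs two inventory snapshots. The `SnapshotItem` and `VisibilityEvent` types and the `detectEvents` function are hypothetical; the commit message does not show the actual data model, and brand-drop detection is left out.

```typescript
// Hypothetical, simplified snapshot row.
interface SnapshotItem {
  productId: string;
  price: number;
  inStock: boolean;
}

type VisibilityEvent =
  | { type: "oos"; productId: string }
  | { type: "price_change"; productId: string; from: number; to: number };

// Compares the previous snapshot to the current one and emits events.
function detectEvents(prev: SnapshotItem[], curr: SnapshotItem[]): VisibilityEvent[] {
  const currById = new Map(curr.map((item) => [item.productId, item] as const));
  const events: VisibilityEvent[] = [];
  for (const before of prev) {
    const after = currById.get(before.productId);
    // Was in stock, now missing from the menu or marked out of stock.
    if (before.inStock && (!after || !after.inStock)) {
      events.push({ type: "oos", productId: before.productId });
      continue;
    }
    if (after && after.price !== before.price) {
      events.push({
        type: "price_change",
        productId: before.productId,
        from: before.price,
        to: after.price,
      });
    }
  }
  return events;
}
```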
Kelly
d3f5e4ef4b feat(nav): Add Payloads menu item to admin sidebar
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 14:26:37 -07:00
Kelly
abef265ae9 feat(workers): Add platform badge (D/J/T) to active tasks display
- Add PlatformBadge component showing D=Dutchie, J=Jane, T=Treez
- Include platform field in worker-registry API response
- Fix null running_seconds displaying as "nulls"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 12:21:23 -07:00
Kelly
b28a91fca5 fix: Task completion result and null duration display bugs
1. task-worker.ts: Pass full result object to completeTask instead of
   non-existent result.data property (was causing {} to be stored)

2. WorkersDashboard.tsx: Handle null running_seconds in formatSecondsToTime
   (was displaying "nulls" due to JS type coercion)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 12:02:05 -07:00
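
The "nulls" display bug from b28a91fca5 can be reproduced in a few lines. `formatSecondsToTime` is the function named in the commit message, but its body and output format here are assumptions; `formatNaive` is a hypothetical stand-in for the pre-fix behavior.

```typescript
// Reproduces the bug: template-literal coercion turns null into "null",
// so a null duration renders as "nulls".
function formatNaive(seconds: number | null): string {
  return `${seconds}s`;
}

// Guarded version: null or undefined durations render as a placeholder.
function formatSecondsToTime(seconds: number | null | undefined): string {
  if (seconds == null) return "-";
  const m = Math.floor(seconds / 60);
  const s = Math.floor(seconds % 60);
  return m > 0 ? `${m}m ${s}s` : `${s}s`;
}
```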
Kelly
60b221e7fb feat: Add payloads dashboard, disable snapshots, fix scheduler
Frontend:
- Add PayloadsDashboard page with search, filter, view, and diff
- Update TasksDashboard default sort: pending → claimed → completed
- Add payload API methods to api.ts

Backend:
- Disable snapshot creation in product-refresh handler
- Remove product_refresh from schedule role options
- Disable compression in payload-storage (plain JSON for debugging)
- Fix task-scheduler: map 'embedded' menu_type to 'dutchie' platform
- Fix task-scheduler: use schedule.interval_hours as skipRecentHours

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 11:54:25 -07:00
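
The default sort order mentioned in 60b221e7fb (pending → claimed → completed) could be implemented with a rank lookup, as in this sketch. The `TaskRow` shape, the tie-break on `id`, and placing unlisted statuses last are assumptions.

```typescript
// Hypothetical row shape for the tasks table.
interface TaskRow {
  id: number;
  status: string;
}

// Rank for each known status; anything else sorts last (an assumption).
const STATUS_RANK = new Map<string, number>([
  ["pending", 0],
  ["claimed", 1],
  ["completed", 2],
]);

// Returns a new array sorted by status rank, then by id.
function sortTasksByStatus(tasks: TaskRow[]): TaskRow[] {
  const rank = (status: string): number =>
    STATUS_RANK.get(status) ?? Number.MAX_SAFE_INTEGER;
  return [...tasks].sort(
    (a, b) => rank(a.status) - rank(b.status) || a.id - b.id,
  );
}
```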
Kelly
15cb657f13 fix(docker): Revert to libasound2 for Debian bookworm
- libasound2t64 is for Debian trixie (13), not bookworm (12)
- Keep build tools fix for native modules

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 11:07:03 -07:00
Kelly
f15920e508 fix(docker): Add build tools for native modules and fix Debian package name
- Add python3 and build-essential to builder stage for bcrypt/sharp compilation
- Change libasound2 to libasound2t64 for Debian bookworm compatibility
- Copy pre-built node_modules from builder instead of re-running npm install
- Prune dev dependencies in builder for smaller production image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 11:01:22 -07:00
Kelly
9518ca48a5 feat(tasks): Task tracking, IP-per-store, and schedule edit fixes
- Add task completion verification with DB and output layers
- Add reconciliation loop to sync worker memory with DB state
- Implement IP-per-store-per-platform conflict detection
- Add task ID hash to MinIO payload filenames for traceability
- Fix schedule edit modal with dispensary info in API responses
- Add task ID display after dispensary name in worker dashboard
- Add migrations for proxy_ip and source tracking columns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 10:49:21 -07:00
Kelly
3e9667571f fix(ui): Restore round worker slot circles with hover tooltips 2025-12-14 09:58:24 -07:00
Kelly
8f6efd377b fix(ui): Remove 'test' from fingerprint tooltip 2025-12-14 03:40:17 -07:00
Kelly
83e9718d78 fix(ui): Worker slot preflight checklist and fingerprint hover
- Fix fingerprint tooltip to use actual API field names (browserName, deviceCategory, detectedTimezone)
- Show real preflight steps: HTTP Preflight, Geo Session, Pool Ready
- Checkmarks appear as each step completes, spinners while in progress

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:38:46 -07:00
Kelly
f5cb17e1d4 feat(dutchie): Full payload with specials and all product statuses
- Set includeEnterpriseSpecials: true to get BOGO/sale deal names
- Set Status: 'All' to capture both Active and Inactive (sold out) products
- Make schedules query backward-compatible for missing pool_id column

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:35:25 -07:00
Kelly
f48a503e82 fix(tasks): Filter out disabled dispensaries in createStaggeredTasks
Tasks were being created for dispensaries with crawl_enabled=false
(duplicates, deprecated stores). Added EXISTS check to filter only
crawl_enabled=true stores before creating tasks.

This prevents errors like:
"Dispensary 207 not found or not crawl_enabled"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:07:03 -07:00
Kelly
e7b392141a feat(ui): Task pool toggle, sortable columns, worker slot visualization
Tasks Dashboard:
- Add clickable Pool Open/Paused toggle button in header
- Add sortable columns (ID, Role, Store, Status, Worker, Duration, Created)
- Show menu_type and pool badges under Store column
- Add Pool column to Schedules table
- Filter stores by platform in Create Task modal

Workers Dashboard:
- Redesign pod visualization to show 3 worker slots per pod
- Each slot shows preflight checklist (Overload? Terminating? Pool Query?)
- Once qualified, shows City/State, Proxy IP, Antidetect status
- Hover shows full fingerprint data (browser, platform, bot detection)

Backend:
- Add menu_type to listTasks query
- Add pool_id/pool_name to schedules query with task_pools JOIN
- Migration 114: Add pool_id column to task_schedules table

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:00:19 -07:00
Kelly
15a5a4239e fix(tasks): Make pool JOIN defensive when table doesn't exist
Auto-migrate fails early, so task_pools may not exist yet.
Check table existence before including pool columns/joins.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 02:29:07 -07:00
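
One way to make the pool JOIN optional, as 15a5a4239e describes, is to build the query from a table-existence flag. In this sketch the table names `task_schedules` and `task_pools` and the output columns `pool_id` and `pool_name` come from the commit log; the aliases, the `p.id` and `p.name` columns, and the `buildSchedulesQuery` helper are assumptions. How the flag is obtained (for example, a catalog lookup at startup) is left out.

```typescript
// Builds the schedules query, adding pool columns and the JOIN only when
// the task_pools table is known to exist.
function buildSchedulesQuery(hasTaskPools: boolean): string {
  const poolColumns = hasTaskPools ? ", s.pool_id, p.name AS pool_name" : "";
  const poolJoin = hasTaskPools
    ? " LEFT JOIN task_pools p ON p.id = s.pool_id"
    : "";
  return `SELECT s.id${poolColumns} FROM task_schedules s${poolJoin}`;
}
```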
Kelly
20d7534b93 fix(ci): prefix docker tags with sha- to prevent scientific notation parsing
Git SHAs like 1861e183 or 698995e4 get parsed as scientific notation
by JSON parsers, breaking Docker tag creation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 02:10:17 -07:00
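
The parsing hazard behind 20d7534b93 is easy to demonstrate: a short SHA made of digits with a single `e` is a valid JSON number literal. The sketch below uses the two SHAs cited in the commit message; where exactly the CI pipeline parsed the tag is not stated in the log, and `parseAsJson` and `dockerTagForSha` are hypothetical helpers.

```typescript
// Parses text as a bare JSON value; unquoted digits-with-exponent become numbers.
function parseAsJson(text: string): unknown {
  return JSON.parse(text);
}

// Prefixing the SHA guarantees the tag can no longer be read as a number.
function dockerTagForSha(shortSha: string): string {
  return `sha-${shortSha}`;
}

const asNumber = parseAsJson("698995e4"); // number 6989950000, not a string
const alsoNumber = parseAsJson("1861e183"); // a very large number
const safeTag = dockerTagForSha("698995e4"); // "sha-698995e4"
```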
Kelly
698995e46f chore: bump task worker version comment
Force new git SHA to avoid CI scientific notation bug.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 02:02:30 -07:00
Kelly
1861e18396 feat(workers): Implement geo-based task pools
Workers now follow the correct flow:
1. Check what pools have pending tasks
2. Claim a pool (e.g., Phoenix AZ)
3. Get Evomi proxy for that geo
4. Run preflight with geo proxy
5. Pull tasks from pool (up to 6 stores)
6. Execute tasks
7. Release pool when exhausted (6 stores visited)

Task pools group dispensaries by metro area (100mi radius):
- Phoenix AZ, Tucson AZ
- Los Angeles CA, San Francisco CA, San Diego CA, Sacramento CA
- Denver CO, Chicago IL, Boston MA, Detroit MI
- Las Vegas NV, Reno NV, Newark NJ, New York NY
- Oklahoma City OK, Tulsa OK, Portland OR, Seattle WA

Benefits:
- Workers know geo BEFORE getting proxy (no more "No geo assigned")
- IP diversity within metro area (Phoenix worker can use Tempe IP)
- Simpler worker logic - just match pool geo
- Pre-organized tasks, not grouped at claim time

New files:
- migrations/113_task_pools.sql - schema, seed data, functions
- src/services/task-pool.ts - TypeScript service

Env vars:
- USE_TASK_POOLS=true (new system)
- USE_IDENTITY_POOL=false (disabled)

🤖 Generated with [Claude Code](https://claude.ai/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-14 01:41:52 -07:00
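
The claim, pull, and release steps listed in 1861e18396 can be sketched with an in-memory model. The real implementation lives in migrations/113_task_pools.sql and src/services/task-pool.ts and is not shown in the log, so every type and function name below is hypothetical; the sketch also assumes one task per store and skips the proxy and preflight steps (3 and 4).

```typescript
interface PoolTask {
  id: number;
  storeId: number;
}

interface TaskPool {
  name: string; // e.g. "Phoenix AZ"
  pending: PoolTask[];
  claimedBy: string | null;
}

const MAX_STORES_PER_CLAIM = 6;

// Steps 1-2: find an unclaimed pool with pending tasks and claim it.
function claimPool(pools: TaskPool[], workerId: string): TaskPool | null {
  const pool = pools.find((p) => p.claimedBy === null && p.pending.length > 0);
  if (!pool) return null;
  pool.claimedBy = workerId;
  return pool;
}

// Step 5: pull up to six tasks from the claimed pool.
function pullTasks(pool: TaskPool): PoolTask[] {
  return pool.pending.splice(0, MAX_STORES_PER_CLAIM);
}

// Step 7: release the pool so another worker can claim what remains.
function releasePool(pool: TaskPool): void {
  pool.claimedBy = null;
}
```

A database-backed version would need the claim to be atomic so that two workers cannot take the same pool; this sketch ignores concurrency.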
Kelly
eedc027ff6 fix(workers): Report geo to worker_registry when identity claimed
Workers were showing "No geo assigned" on dashboard because geo info
was set internally but never reported to worker_registry after
identity pool claim.

Now updates current_state and current_city columns when identity
is claimed, so dashboard shows correct geo assignment.

Also documents CI/CD batching rule to minimize build time.

🤖 Generated with [Claude Code](https://claude.ai/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-14 01:14:31 -07:00
Kelly
ec5fcd9bc4 fix(proxy): Use rotating IPs instead of sticky sessions
Removes session parameter from Evomi proxy URL so each request
gets a different IP. Prevents all workers from sharing same IP.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 00:48:04 -07:00
Kelly
58150dafa6 docs: Add CI/CD workflow rule - commit and wait
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 00:46:43 -07:00
Kelly
06adab7225 fix(preflight): Add state fallback when IP lookup fails
- Try ip-api.com first, then ipapi.co as fallback
- If both fail, use state coords from targetState param
- Prevents workers from getting stuck in preflight loop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 00:24:31 -07:00
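
The fallback chain in 06adab7225 (first provider, second provider, then state coordinates) follows a common pattern, sketched here with injected lookup functions instead of real HTTP calls. The `Coords` type, `GeoLookup` type, and `resolveCoords` function are assumptions; in the real code the lookups would presumably wrap requests to ip-api.com and ipapi.co.

```typescript
interface Coords {
  lat: number;
  lon: number;
  source: string;
}

// A lookup either resolves with coordinates or rejects on failure.
type GeoLookup = () => Promise<Coords>;

// Tries each lookup in order; if all of them fail, returns the state-level
// fallback so the caller never gets stuck retrying.
async function resolveCoords(
  lookups: GeoLookup[],
  stateFallback: Coords,
): Promise<Coords> {
  for (const lookup of lookups) {
    try {
      return await lookup();
    } catch {
      // Ignore the failure and move on to the next provider.
    }
  }
  return stateFallback;
}
```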
Kelly
38d7678a2e feat(antidetect): Use actual proxy IP location for browser fingerprint
- Replace hardcoded state coords with IP geolocation lookup via ip-api.com
- Browser timezone and geolocation now match actual proxy IP location
- City-level proxy targeting already in place via Evomi _city- parameter
- Add browser-factory.ts shared utility for antidetect setup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 23:49:25 -07:00
Kelly
aac1181f3d perf(analytics): Fix 7.5s national summary endpoint
- Use denormalized d.product_count instead of JOIN to store_products
- Remove expensive per-product aggregations (avg_price, brand counts, stock)
- Query now runs in <100ms instead of 7.5s

The massive JOIN between dispensaries and store_products was causing
the slow load. State metrics now use pre-computed product_count column.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 23:31:10 -07:00
Kelly
4eaf7e50d7 feat(ui): Add dropdown for Add User/Origin button
- Single dropdown button shows both options
- Selecting an option switches to that tab and opens modal
- Cleaner UX than separate buttons per tab

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 23:06:42 -07:00