Commit Graph

176 Commits

Author SHA1 Message Date
Kelly
83f629fec4 feat: Add identity pool for diverse IP/fingerprint rotation
- Add worker_identities table and metro_areas for city groupings
- Create IdentityPoolService for claiming/releasing identities
- Each identity used for 3-5 tasks, then 2-3 hour cooldown
- Integrate with task-worker via USE_IDENTITY_POOL feature flag
- Update puppeteer-preflight to accept custom proxy URLs
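
The usage budget and cooldown described above could be sketched roughly as follows. This is a minimal in-memory illustration, not the repository's IdentityPoolService: the `Identity` shape and the `claimIdentity`/`releaseIdentity` names are assumptions, and only the 3-5 task budget and 2-3 hour cooldown come from the commit message.

```typescript
// Hypothetical sketch: field and function names are illustrative assumptions,
// not the repository's IdentityPoolService API.
interface Identity {
  id: number;
  tasksUsed: number;     // tasks completed in the current active period
  taskBudget: number;    // 3-5 tasks allowed before cooldown
  cooldownUntil: number; // epoch ms; 0 when the identity is available
}

// Inclusive random integer in [min, max]; rng is injectable for testing.
function randomInt(min: number, max: number, rng: () => number = Math.random): number {
  return min + Math.floor(rng() * (max - min + 1));
}

// Claim the first identity that is not cooling down, or null if none is free.
function claimIdentity(pool: Identity[], now: number, rng: () => number = Math.random): Identity | null {
  for (const identity of pool) {
    if (identity.cooldownUntil <= now) {
      // Draw a fresh 3-5 task budget at the start of each active period.
      if (identity.tasksUsed === 0) identity.taskBudget = randomInt(3, 5, rng);
      return identity;
    }
  }
  return null;
}

// Record one finished task; start a 2-3 hour cooldown once the budget is spent.
function releaseIdentity(identity: Identity, now: number, rng: () => number = Math.random): void {
  identity.tasksUsed += 1;
  if (identity.tasksUsed >= identity.taskBudget) {
    const cooldownMinutes = randomInt(120, 180, rng);
    identity.cooldownUntil = now + cooldownMinutes * 60 * 1000;
    identity.tasksUsed = 0;
  }
}
```

A database-backed version would also need to lock the claimed row so that two workers cannot hold the same identity at once; that part is omitted here.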

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 18:46:58 -07:00
Kelly
d02c347ef6 fix(proxy): Use correct Evomi host rp.evomi.com
The code was using rpc.evomi.com, which doesn't exist.
Residential proxies use rp.evomi.com:1000

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 17:35:43 -07:00
Kelly
d779a08bbf fix(preflight): Use Evomi proxy API before falling back to DB pool
The preflight was only checking DB proxy pool which was empty.
Now checks Evomi API first (when configured), then falls back to DB.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 17:14:23 -07:00
Kelly
1490c60d2a fix(ui): Update TasksDashboard badges for consistency
- Platform badge now shows green (emerald) for dutchie even when null/undefined
- State badge shows "ALL" (uppercase) with indigo color when no state specified
- Remove "(HTTP transport)" from store discovery description

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 17:10:57 -07:00
Kelly
ba15802a77 perf(puppeteer): Block analytics/tracking domains to save proxy bandwidth
Block requests to non-essential domains:
- googletagmanager.com, google-analytics.com (analytics)
- launchdarkly.com (feature flags)
- assets2.dutchie.com (CDN assets - we only need GraphQL)
- sentry.io (error tracking)
- segment.io/segment.com, amplitude.com, mixpanel.com (analytics)
- hotjar.com, fullstory.com (session recording)

Applied to both product-discovery-dutchie.ts and puppeteer-preflight.ts
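
A minimal sketch of the host check behind this kind of blocking, assuming the list above is matched against the request hostname including subdomains. The `shouldBlockHost` name and the suffix-matching rule are assumptions; the actual handlers may match differently.

```typescript
// Domains listed in the commit message; the matching logic is an assumption.
const BLOCKED_DOMAINS: string[] = [
  "googletagmanager.com",
  "google-analytics.com",
  "launchdarkly.com",
  "assets2.dutchie.com",
  "sentry.io",
  "segment.io",
  "segment.com",
  "amplitude.com",
  "mixpanel.com",
  "hotjar.com",
  "fullstory.com",
];

// True when the hostname is a blocked domain or one of its subdomains.
function shouldBlockHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  return BLOCKED_DOMAINS.some(
    (domain) => host === domain || host.endsWith("." + domain)
  );
}
```

With Puppeteer, a check like this would typically run inside a `request` event handler after `page.setRequestInterception(true)`, calling `request.abort()` for blocked hosts and `request.continue()` otherwise.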

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 16:50:28 -07:00
Kelly
cf99ef9e09 fix(worker): Use Evomi API first, DB proxies as fallback
- Check Evomi API availability before waiting for DB proxies
- If EVOMI_USER/EVOMI_PASS configured, proceed immediately
- Only fall back to DB proxy polling if Evomi not configured
- Added clear comments explaining proxy initialization order

This fixes workers getting stuck waiting for DB proxies when
Evomi API is available for on-demand geo-targeted proxies.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 16:45:52 -07:00
Kelly
023cfc127f fix(preflight): Apply stored fingerprint to task browser
- Add WorkerFingerprint interface with timezone, city, state, ip, locale
- Store fingerprint in TaskWorker after preflight passes
- Pass fingerprint through TaskContext to handlers
- Apply timezone via CDP and locale via Accept-Language header
- Ensures browser fingerprint matches proxy IP location

This fixes Cloudflare blocks triggered when the browser's timezone/locale
did not match the proxy IP's location.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 16:40:52 -07:00
Kelly
d7da0b938d feat(jane): Direct Algolia product fetch and multi-platform product-refresh
- Add fetchProductsByStoreIdDirect() for reliable Algolia product fetching
- Update product-discovery-jane to use direct Algolia instead of network interception
- Fix product-refresh handler to support both Dutchie and Jane payloads
  - Handle both `products` (Dutchie) and `hits` (Jane) formats
  - Use platform-appropriate raw_json structure for normalizers
  - Fix consecutive_misses tracking to use correct provider
  - Extract product IDs correctly (Dutchie _id vs Jane product_id)
- Add store discovery deduplication (prefer REC over MED at same location)
- Add storeTypes field to DiscoveredStore interface
- Add scripts: run-jane-store-discovery.ts, run-jane-product-discovery.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 16:05:50 -07:00
Kelly
88e590d026 feat: Worker geo sessions for state-based task assignment
Workers are now geo-locked to a specific state for their session:
- Session = 60 minutes OR 7 store visits (whichever comes first)
- Workers ONLY claim tasks matching their assigned state
- State assignment prioritizes: most pending tasks, fewest workers

Changes:
- Migration 108: geo session columns, claim_task with geo filter,
  assign_worker_geo(), check_worker_geo_session(), worker_state_capacity view
- task-worker.ts: ensureGeoSession() method before task claiming
- worker-registry.ts: /state-capacity and /geo-sessions API endpoints
- WorkersDashboard: Show qualified icon + geo state in Preflight column
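
The session limits and assignment priority above can be illustrated with a small sketch. The real logic lives in SQL functions (assign_worker_geo(), check_worker_geo_session()); the TypeScript types and function names here are hypothetical, and treating "fewest workers" as a tie-breaker after "most pending tasks" is an assumption about how the two priorities combine.

```typescript
// Hypothetical shapes for illustration only.
interface GeoSession {
  state: string;
  startedAt: number;   // epoch ms
  storeVisits: number;
}

interface StateLoad {
  state: string;
  pendingTasks: number;
  activeWorkers: number;
}

const SESSION_MAX_MS = 60 * 60 * 1000; // 60 minutes
const SESSION_MAX_STORE_VISITS = 7;

// A session ends at 60 minutes or 7 store visits, whichever comes first.
function isGeoSessionExpired(session: GeoSession, now: number): boolean {
  return (
    now - session.startedAt >= SESSION_MAX_MS ||
    session.storeVisits >= SESSION_MAX_STORE_VISITS
  );
}

// Pick the state with the most pending tasks; break ties by fewest workers.
function pickState(loads: StateLoad[]): string | null {
  const candidates = loads.filter((load) => load.pendingTasks > 0);
  if (candidates.length === 0) return null;
  candidates.sort(
    (a, b) => b.pendingTasks - a.pendingTasks || a.activeWorkers - b.activeWorkers
  );
  return candidates[0].state;
}
```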

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 16:00:09 -07:00
Kelly
c215d11a84 feat: Platform isolation, Evomi geo-targeting, proxy management
Platform isolation:
- Rename handlers to {task}-{platform}.ts convention
- Deprecate -curl variants (now _deprecated-*)
- Platform-based routing in task-worker.ts
- Add Jane platform handlers and client

Evomi geo-targeting:
- Add dynamic proxy URL builder with state/city targeting
- Session stickiness per worker per state (30 min)
- Fallback to static proxy table when API unavailable
- Add proxy tracking columns to worker_tasks

Proxy management:
- New /proxies admin page for visibility
- Track proxy_ip, proxy_geo, proxy_source per task
- Show active sessions and task history

Validation filtering:
- Filter by validated stores (platform_dispensary_id + menu_url)
- Mark incomplete stores as deprecated
- Update all dashboard/stats queries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 15:16:48 -07:00
Kelly
59e0e45f8f feat(discovery): Add self-healing and rename schedule
- Rename 'store_discovery_dutchie' to 'Store Discovery' (platform badge via platform field)
- Add self-healing: scan for stores missing payloads and queue product_discovery
- Catches stores added before chaining was implemented
- Limits to 50 stores per run to avoid overwhelming the system

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 14:14:21 -07:00
Kelly
e9a688fbb3 feat(api): Add stage field to dispensary PUT endpoint
Allows updating dispensary stage via API for better data management.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:37:47 -07:00
Kelly
8b3ae40089 feat: Remove Run Now, add source tracking, optimize dashboard
- Remove /run-now endpoint (use task priority instead)
- Add source tracking to worker_tasks (source, source_schedule_id, source_metadata)
- Parallelize dashboard API calls (Promise.all)
- Add 1-5 min caching to /markets/dashboard and /national/summary
- Add performance indexes for dashboard queries

Migrations:
- 104: Task source tracking columns
- 105: Dashboard performance indexes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:23:35 -07:00
Kelly
a8fec97bcb feat: Support per-dispensary schedules (not just per-state)
- Add dispensary_id column to task_schedules table
- Update scheduler to handle single-dispensary schedules
- Update run-now endpoint to handle single-dispensary schedules
- Update frontend modal to pass dispensary_id when 1 store selected
- Fix existing "Deeply Rooted Hourly" schedule with dispensary_id=112

Now when you select ONE store and check "Make recurring", it creates
a schedule that runs for that specific store every interval.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 12:03:08 -07:00
Kelly
c969c7385b fix: Handle product_refresh and payload_fetch in run-now endpoint
The run-now endpoint only fanned out to stores for product_discovery
schedules, not product_refresh or payload_fetch. This caused single
tasks to be created without dispensary_id, which then failed.

Now all crawl roles (product_discovery, product_refresh, payload_fetch)
with state_code properly fan out to individual store tasks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 03:49:10 -07:00
Kelly
5084cb1a85 fix: Block images/fonts/media in Puppeteer to save bandwidth
Add request interception to all Puppeteer handlers to block unnecessary
resources (images, fonts, media, stylesheets). We only need HTML/JS for
the session cookie, then the GraphQL JSON response.

This was causing 2.4GB of bandwidth from assets2.dutchie.com - every
page visit downloaded all product thumbnails, logos, etc.

Files updated:
- product-discovery-http.ts
- entry-point-discovery.ts
- store-discovery-http.ts
- store-discovery-state.ts
- puppeteer-preflight.ts

Note: Product images from payload are still downloaded once to MinIO
via image-storage.ts - this only blocks browser-rendered page images.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 03:28:12 -07:00
Kelly
ec6843dfd6 feat: Add working hours for natural traffic patterns
Workers check their timezone (from preflight IP geolocation) and current
hour's weight probability to determine availability. This creates natural
traffic patterns - more workers active during peak hours, fewer during
off-peak. Tasks queue up at night and drain during the day.

Migrations:
- 099: working_hours table with hourly weights by profile
- 100: Add timezone column to worker_registry
- 101: Store timezone from preflight IP geolocation
- 102: check_working_hours() function with probability roll
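
As an illustration of the probability roll, here is a minimal sketch. The actual check is the SQL function check_working_hours(); the TypeScript function below is hypothetical and assumes each hourly weight is a probability between 0 and 1, indexed by the worker's local hour.

```typescript
// Hypothetical sketch of an hourly availability roll.
// hourlyWeights: 24 probabilities (assumed 0..1), one per local hour.
// localHour: the worker's current hour (0-23) in its own timezone.
// roll: a uniform random number in [0, 1), injectable for testing.
function isWithinWorkingHours(
  hourlyWeights: number[],
  localHour: number,
  roll: number = Math.random()
): boolean {
  if (hourlyWeights.length !== 24) {
    throw new Error("expected 24 hourly weights");
  }
  if (localHour < 0 || localHour > 23 || Math.floor(localHour) !== localHour) {
    throw new Error("localHour must be an integer from 0 to 23");
  }
  // The worker is available when the roll lands under this hour's weight.
  return roll < hourlyWeights[localHour];
}
```

Under this assumption, a weight of 0.9 at 14:00 keeps roughly nine in ten workers active, while 0.1 at 03:00 leaves most idle so tasks queue until morning.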

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 03:24:42 -07:00
Kelly
268429b86c feat: Use MinIO for permanent product image storage
- Rewrite image-storage.ts to use MinIO instead of ephemeral local filesystem
- Images downloaded ONCE from Dutchie CDN, stored permanently in MinIO
- Check MinIO before downloading (skipIfExists) to avoid re-downloads
- Convert images to webp before storage
- Storage path: images/products/<state>/<store>/<brand>/<product>/image-<hash>.webp
- Public URL: https://cdn.cannabrands.app/cannaiq/images/...

This fixes the 2.4GB bandwidth issue from repeatedly downloading images
that were lost when K8s pods restarted.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 03:24:22 -07:00
Kelly
5c08135007 feat(plugin): Add Elementor dynamic tags and product loop widget v1.7.0
WordPress Plugin:
- Add dynamic tags for all product payload fields (name, brand, price, THC, effects, etc.)
- Add Product Loop widget with filtering, sorting, and layout options
- Register CannaIQ widget category in Elementor
- Update build script to auto-upload to MinIO CDN
- Remove legacy dutchie references
- Bump version to 1.7.0

Backend:
- Redirect /downloads/* to CDN instead of serving from local filesystem

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 03:10:01 -07:00
Kelly
9f0d68d4c9 Revert "feat: Store full Dutchie payload in latest_raw_payload"
This reverts commit e11400566e.
2025-12-13 02:33:30 -07:00
Kelly
e11400566e feat: Store full Dutchie payload in latest_raw_payload
Now stores the complete raw product JSON from Dutchie on every
product refresh. This enables querying any Dutchie field
(terpenes, effects, description, etc.) without schema changes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 02:31:00 -07:00
Kelly
983cd71fc2 feat: Performance optimizations and preflight improvements
- Add missing /api/analytics/national/summary endpoint
- Optimize dashboard activity queries (subquery vs JOIN+GROUP BY)
- Add PreflightSummary component to Workers page with gold qualified badge
- Add preflight retry logic - workers retry every 30s until qualified
- Run stale task cleanup on ALL workers (not just worker-0)
- Add preflight fields to worker-registry API (ip, fingerprint, is_qualified)

Database indexes added:
- idx_store_products_created_at (for recent products)
- idx_dispensaries_last_crawl_at (for recent scrapes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 02:06:33 -07:00
Kelly
7849ee0256 feat: Add POST /api/tasks/fix-null-methods endpoint
Updates null method tasks to 'http' for proper worker qualification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:33:09 -07:00
Kelly
432842f442 fix: Ensure all crawl tasks use method='http' transport
- product_discovery → product_refresh now sets method: 'http'
- product_refresh → entry_point_discovery now sets method: 'http'
- All crawl tasks now require HTTP preflight to claim

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:31:22 -07:00
Kelly
94ebbb2497 fix: State dropdown and locked platform in schedule modal
- State Code → State dropdown with available states
- Platform field locked to 'dutchie' (read-only)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:26:58 -07:00
Kelly
b7e96359ef feat: Auto-retry failed tasks with exponential backoff
- Hard failures now auto-retry up to 3 times
- Exponential backoff: 5, 10, 20 minutes
- Only permanently fails after max retries exceeded
- Soft failures still requeue immediately
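
The retry schedule above follows a doubling pattern, sketched here under the assumption that the delay is computed from the number of retries already attempted. The function and constant names are illustrative and not taken from the repository.

```typescript
const MAX_RETRIES = 3;
const BASE_DELAY_MINUTES = 5;

// Returns the delay before the next retry, or null when the task
// has used all its retries and should fail permanently.
// retryCount is the number of retries already attempted (0, 1, 2, ...).
function nextRetryDelayMinutes(retryCount: number): number | null {
  if (retryCount >= MAX_RETRIES) return null;
  // 5 * 2^0 = 5, 5 * 2^1 = 10, 5 * 2^2 = 20
  return BASE_DELAY_MINUTES * Math.pow(2, retryCount);
}
```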

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:19:48 -07:00
Kelly
b1c1955082 feat: Add POST /api/tasks/retry-failed endpoint
Resets failed tasks back to pending for retry.
Options: role (filter), max_age_hours (default 24), limit (default 100)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:18:07 -07:00
Kelly
7067db68fc feat: Add server-side brand search to Intelligence page
- Backend: Add 'search' param to /api/admin/intelligence/brands
- Frontend: Debounced search triggers server-side query
- Now searches ALL brands, not just top 500

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:14:02 -07:00
Kelly
271faf0f00 perf: Optimize dashboard queries for faster load times
- Use pg_stat for approximate product count (instant vs full scan)
- LIMIT on DISTINCT queries for brand/category counts
- Single combined query (reduces round trips)
- Add index on store_product_snapshots.captured_at
- Add index on worker_tasks.worker_id and created_at

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 01:09:02 -07:00
Kelly
291a8279bd fix(entry-point-discovery): Self-healing duplicate detection
When resolving platform_dispensary_id, check if it already exists on
another dispensary. If so, mark current dispensary as duplicate instead
of failing with unique constraint violation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:46:10 -07:00
Kelly
b69d03c02f feat: Add stage checkpoints to task handlers and fix worker name display
Stage checkpoints (observational, non-blocking):
- product_refresh: success → 'production', failure tracking → 'failing' after 3
- product_discovery: success → 'hydrating', failure tracking
- entry_point_discovery: success → 'promoted', failure tracking

Worker name fix:
- Join worker_registry in tasks query to get friendly_name directly
- Update TasksDashboard to use worker_name from joined query
- Fallback to registry lookup then pod ID suffix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:43:00 -07:00
Kelly
54f59c6082 fix(analytics): Fix market-summary store count and add search indexes
- market-summary now counts from store_products table (not product_variants)
- Added trigram indexes for fast ILIKE product searches

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:35:17 -07:00
Kelly
24dd301d84 fix: State stores endpoint returns only Dutchie stores with products
- Filter by menu_type = 'dutchie'
- Use INNER JOIN + HAVING to only return stores with products
- Stores without product discovery are excluded

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:15:23 -07:00
Kelly
1d6211db19 perf: Add store_intelligence_cache for fast /intelligence/stores
- Remove costly correlated subquery (snapshot_count) from /stores endpoint
- Add migration 092 for store_intelligence_cache table
- Update analytics_refresh to populate cache with pre-computed metrics
- Add /intelligence/stores/cached endpoint using cache table

Performance: O(n*m) → O(1) for snapshot counts, ~10x faster response

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-13 00:13:41 -07:00
Kelly
e62f927218 feat: Auto-retry failed proxies after cooldown period
- Add last_failed_at column to track failure time
- Failed proxies auto-retry after 4 hours (configurable)
- Proxies permanently failed after 10 failures
- Add /retry-stats and /reenable-failed API endpoints
- markProxySuccess() re-enables recovered proxies
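
One way to express the cooldown and permanent-failure rules is sketched below. The `ProxyRecord` shape and `isProxyUsable` name are hypothetical, and the sketch assumes any recorded failure puts the proxy into cooldown; only the 4-hour default and the 10-failure limit come from the commit message.

```typescript
// Hypothetical record shape; the column names in the real table may differ.
interface ProxyRecord {
  failureCount: number;
  lastFailedAt: number | null; // epoch ms of the most recent failure
}

const HOUR_MS = 60 * 60 * 1000;

// A proxy is usable if it has never failed, or if it has failed fewer than
// maxFailures times and its cooldown has elapsed.
function isProxyUsable(
  proxy: ProxyRecord,
  now: number,
  cooldownHours: number = 4,
  maxFailures: number = 10
): boolean {
  if (proxy.failureCount >= maxFailures) return false; // permanently failed
  if (proxy.lastFailedAt === null) return true;        // never failed
  return now - proxy.lastFailedAt >= cooldownHours * HOUR_MS;
}
```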

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 00:08:44 -07:00
Kelly
470097eb19 fix: Intelligence stores endpoint and UI consistency
- Fix stores endpoint to only show stores with actual products (INNER JOIN + HAVING)
- Update badge colors to match Workers/Tasks dashboard style
- Use emerald/amber/red/gray color scheme consistently
- Chain badge now uses purple (bg-purple-100)
- Add migration 092 to fix Trulieve store URLs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 23:37:28 -07:00
Kelly
5af86edf83 feat: Update last_payload_at and last_store_discovery_at timestamps
- payload-storage.ts: Update dispensaries.last_payload_at when saving payload
- promotion.ts: Update dispensaries.last_store_discovery_at on INSERT/UPDATE

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-12 23:30:57 -07:00
Kelly
55b26e9153 feat: Auto-healing entry_point_discovery with browser-first transport
- Rewrote entry_point_discovery with auto-healing scheme:
  1. Check dutchie_discovery_locations for existing platform_location_id
  2. Browser-based GraphQL with 5x network retries
  3. Mark as needs_investigation on hard failure
- Browser (Puppeteer) is now DEFAULT transport - curl only when explicit
- Added migration 091 for tracking columns:
  - last_store_discovery_at: When store_discovery updated record
  - last_payload_at: When last product payload was saved
- Updated CODEBASE_MAP.md with transport rules documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-12 22:55:21 -07:00
Kelly
a6f09ee6e3 fix: Calculate stale task count from heartbeat age
2025-12-12 22:15:16 -07:00
Kelly
c62f8cbf06 feat: Parallelized store discovery, modification tracking, and task deduplication
Store Discovery Parallelization:
- Add store_discovery_state handler for per-state parallel discovery
- Add POST /api/tasks/batch/store-discovery endpoint
- 8 workers can now process states in parallel (~30-45 min vs 3+ hours)

Modification Tracking (Migration 090):
- Add last_modified_at, last_modified_by_task, last_modified_task_id to dispensaries
- Add same columns to store_products
- Update all handlers to set tracking info on modifications

Stale Task Recovery:
- Add periodic stale cleanup every 10 minutes (worker-0 only)
- Prevents orphaned tasks from blocking queue after worker crashes

Task Deduplication:
- createStaggeredTasks now skips if pending/active task exists for same role
- Skips if same role completed within last 4 hours
- API responses include skipped count
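
The deduplication rules can be sketched as a single predicate. This is not the createStaggeredTasks implementation: the `ExistingTask` shape is hypothetical, and scoping the check per dispensary as well as per role is an assumption the commit message does not state.

```typescript
type TaskStatus = "pending" | "active" | "completed" | "failed";

// Hypothetical view of an existing task row.
interface ExistingTask {
  dispensaryId: number;
  role: string;
  status: TaskStatus;
  completedAt: number | null; // epoch ms, set when status is "completed"
}

const RECENT_COMPLETION_MS = 4 * 60 * 60 * 1000; // 4 hours

// Skip creating a task when one is already queued or running for the same
// store and role, or when the same role completed within the last 4 hours.
function shouldSkipTask(
  existing: ExistingTask[],
  dispensaryId: number,
  role: string,
  now: number
): boolean {
  return existing.some((task) => {
    if (task.dispensaryId !== dispensaryId || task.role !== role) return false;
    if (task.status === "pending" || task.status === "active") return true;
    return (
      task.status === "completed" &&
      task.completedAt !== null &&
      now - task.completedAt < RECENT_COMPLETION_MS
    );
  });
}
```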

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2025-12-12 22:15:04 -07:00
kelly
e4e8438d8b Merge pull request 'feat: Worker improvements and Run Now duplicate prevention' (#64) from feat/minio-payload-storage into master
2025-12-13 03:35:48 +00:00
Kelly
822d2b0609 feat: Idempotent entry_point_discovery with bulk endpoint
- Track id_resolution_status, attempts, and errors in handler
- Add POST /api/tasks/batch/entry-point-discovery endpoint
- Skip already-resolved stores, retry failed with force flag
2025-12-12 20:27:36 -07:00
Kelly
4ea7139ed5 feat: Add step reporting to all task handlers
Added updateStep() calls to:
- payload-fetch-curl: loading → preflight → fetching → saving
- product-refresh: loading → normalizing → upserting
- store-discovery-http: starting → preflight → navigating → fetching

This enables real-time visibility of worker progress in the dashboard.
2025-12-12 20:14:00 -07:00
Kelly
63023a4061 feat: Worker improvements and Run Now duplicate prevention
- Fix Run Now to prevent duplicate task creation
- Add loading state to Run Now button in UI
- Return early when no stores need refresh
- Worker dashboard improvements
- Browser pooling architecture updates
- K8s worker config updates (8 replicas, 3 concurrent tasks)
2025-12-12 20:11:31 -07:00
kelly
13a80e893e Merge pull request 'feat: Add MinIO/S3 support for payload storage' (#63) from feat/minio-payload-storage into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/63
2025-12-12 19:00:29 +00:00
Kelly
c98c409f59 feat: Add MinIO/S3 support for payload storage
- Update payload-storage.ts to use MinIO when configured
- Payloads stored at: cannaiq/payloads/{year}/{month}/{day}/store_{id}_{ts}.json.gz
- Falls back to local filesystem when MINIO_* env vars not set
- Enables shared storage across all worker pods
- Fixes ephemeral storage issue where payloads were lost on pod restart
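
For illustration, the object key layout above could be built as follows. The `buildPayloadKey` helper is hypothetical, and the sketch assumes UTC dates, zero-padded month and day, and an epoch-millisecond `{ts}`; the commit message does not specify any of these.

```typescript
// Zero-pad a number to two digits (assumed formatting for month and day).
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Build a key of the form
// cannaiq/payloads/{year}/{month}/{day}/store_{id}_{ts}.json.gz
// Assumptions: UTC date parts and {ts} as epoch milliseconds.
function buildPayloadKey(storeId: number, when: Date): string {
  const year = when.getUTCFullYear();
  const month = pad2(when.getUTCMonth() + 1);
  const day = pad2(when.getUTCDate());
  const ts = when.getTime();
  return `cannaiq/payloads/${year}/${month}/${day}/store_${storeId}_${ts}.json.gz`;
}
```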

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 11:30:57 -07:00
kelly
6c8993f7bd Merge pull request 'fix(workers): Increase max concurrent tasks to 15' (#62) from feat/proxy-reload-and-bulk-import into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/62
2025-12-12 18:19:04 +00:00
Kelly
92f88fdcd6 fix(workers): Increase max concurrent tasks to 15 and add K8s permission rule
- Change MAX_CONCURRENT_TASKS default from 3 to 15
- Add CLAUDE.md rule requiring explicit permission before kubectl commands

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 10:54:33 -07:00
kelly
fd4a9b1434 Merge pull request 'feat(scheduler): Immutable schedules and HTTP-only pipeline' (#61) from feat/proxy-reload-and-bulk-import into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/61
2025-12-12 16:37:16 +00:00
Kelly
832ef1cf83 feat(scheduler): Immutable schedules and HTTP-only pipeline
## Changes
- **Migration 089**: Add is_immutable and method columns to task_schedules
  - Per-state product_discovery schedules (4h default)
  - Store discovery weekly (168h)
  - All schedules use HTTP transport (Puppeteer/browser)
- **Task Scheduler**: HTTP-only product discovery with per-state scheduling
  - Each state has its own immutable schedule
  - Schedules can be edited (interval/priority) but not deleted
- **TasksDashboard UI**: Full immutability support
  - Lock icon for immutable schedules
  - State and Method columns in schedules table
  - Disabled delete for immutable, restricted edit fields
- **Store Discovery HTTP**: Auto-queue product_discovery for new stores
- **Migration 088**: Discovery payloads storage schema

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 09:24:08 -07:00