Commit Graph

13 Commits

Author SHA1 Message Date
Kelly
56cc171287 feat: Stealth worker system with mandatory proxy rotation
## Worker System
- Role-agnostic workers that can handle any task type
- Pod-based architecture with StatefulSet (5-15 pods, 5 workers each)
- Custom pod names (Aethelgard, Xylos, Kryll, etc.)
- Worker registry with friendly names and resource monitoring
- Hub-and-spoke visualization on JobQueue page

## Stealth & Anti-Detection (REQUIRED)
- Proxies are MANDATORY - workers fail to start without active proxies
- CrawlRotator initializes on worker startup
- Loads proxies from `proxies` table
- Auto-rotates proxy + fingerprint on 403 errors
- 12 browser fingerprints (Chrome, Firefox, Safari, Edge)
- Locale/timezone matching for geographic consistency
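
A minimal sketch of the rotation rule described above, assuming a simple round-robin policy. `RotatorSketch`, `Proxy`, and `Fingerprint` are hypothetical names for illustration; the actual CrawlRotator and the `proxies` table schema are not shown in this log.

```typescript
// Hypothetical shapes; the real CrawlRotator API may differ.
interface Proxy { id: number; url: string; }
interface Fingerprint { userAgent: string; locale: string; timezone: string; }

class RotatorSketch {
  private proxies: Proxy[];
  private fingerprints: Fingerprint[];
  private proxyIndex = 0;
  private fingerprintIndex = 0;

  constructor(proxies: Proxy[], fingerprints: Fingerprint[]) {
    // Mirrors the "proxies are mandatory" rule: refuse to start without any.
    if (proxies.length === 0) throw new Error("no active proxies; worker cannot start");
    if (fingerprints.length === 0) throw new Error("no fingerprints configured");
    this.proxies = proxies;
    this.fingerprints = fingerprints;
  }

  // The proxy and fingerprint the next request should use.
  current(): { proxy: Proxy; fingerprint: Fingerprint } {
    return {
      proxy: this.proxies[this.proxyIndex],
      fingerprint: this.fingerprints[this.fingerprintIndex],
    };
  }

  // Advance both proxy and fingerprint on a 403; any other status is a no-op.
  // Returns true when a rotation happened.
  onResponse(status: number): boolean {
    if (status !== 403) return false;
    this.proxyIndex = (this.proxyIndex + 1) % this.proxies.length;
    this.fingerprintIndex = (this.fingerprintIndex + 1) % this.fingerprints.length;
    return true;
  }
}
```

Keeping locale and timezone on the fingerprint object means they rotate together with the user agent, which is one way to keep the geographic signals consistent.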

## Task System
- Renamed product_resync → product_refresh
- Task chaining: store_discovery → entry_point → product_discovery
- Priority-based claiming with FOR UPDATE SKIP LOCKED
- Heartbeat and stale task recovery
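
The claiming and chaining bullets above can be sketched as follows. The table name `tasks`, its columns, and the "higher priority value wins" ordering are assumptions for illustration; only the role names and the `FOR UPDATE SKIP LOCKED` clause come from this commit message.

```typescript
// Hypothetical schema: a `tasks` table with status, priority, worker_id columns.
// SKIP LOCKED lets concurrent workers each pick a different pending row
// instead of blocking on a row another worker has already locked.
const CLAIM_TASK_SQL = `
  UPDATE tasks
     SET status = 'claimed', worker_id = $1, claimed_at = now()
   WHERE id = (
     SELECT id
       FROM tasks
      WHERE status = 'pending'
      ORDER BY priority DESC, created_at ASC
      LIMIT 1
      FOR UPDATE SKIP LOCKED
   )
  RETURNING id, role;
`;

// Task chaining as stated in the commit: each role enqueues the next one.
type TaskRole = "store_discovery" | "entry_point" | "product_discovery";

const NEXT_ROLE: Record<TaskRole, TaskRole | null> = {
  store_discovery: "entry_point",
  entry_point: "product_discovery",
  product_discovery: null, // end of the chain
};

// Returns the role to enqueue after `role` completes, or null if none.
function nextRole(role: TaskRole): TaskRole | null {
  return NEXT_ROLE[role];
}
```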

## UI Updates
- JobQueue: Pod visualization, resource monitoring on hover
- WorkersDashboard: Simplified worker list
- Removed unused filters from task list

## Other
- IP2Location service for visitor analytics
- Findagram consumer features scaffolding
- Documentation updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 00:44:59 -07:00
Kelly
0295637ed6 fix: Public API column mappings and OOS detection
- Fix store_products column references (name_raw, brand_name_raw, category_raw)
- Fix v_product_snapshots column references (crawled_at, *_cents pricing)
- Fix dispensaries column references (zipcode, logo_image, remove hours/amenities)
- Add services and license_type to dispensary API response
- Add consecutive_misses OOS tracking to product-resync handler
- Add migration 075 for consecutive_misses column
- Add CRAWL_PIPELINE.md documentation
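
One possible reading of the `consecutive_misses` OOS tracking, as a hedged sketch: a product is marked out of stock only after it is absent from several crawls in a row. The threshold of 3 and the names `ProductState` and `applyCrawlResult` are assumptions; the actual handler logic is not shown in this log.

```typescript
// Hypothetical threshold; the real value is not stated in the commit.
const OOS_MISS_THRESHOLD = 3;

interface ProductState {
  consecutiveMisses: number;
  inStock: boolean;
}

// Update a product's state after one crawl of its store.
function applyCrawlResult(state: ProductState, seenInCrawl: boolean): ProductState {
  // Seeing the product again resets the counter and restores stock status.
  if (seenInCrawl) return { consecutiveMisses: 0, inStock: true };

  const misses = state.consecutiveMisses + 1;
  // Flip to out of stock only once the miss streak reaches the threshold.
  const inStock = misses >= OOS_MISS_THRESHOLD ? false : state.inStock;
  return { consecutiveMisses: misses, inStock };
}
```

Counting consecutive misses rather than reacting to a single one avoids flagging products as out of stock because of one incomplete crawl.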

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 20:44:53 -07:00
Kelly
1d6e67d837 feat(api): Add store metrics endpoints with localhost bypass
New public API v1 endpoints for third-party integrations:
- GET /api/v1/stores/:id/metrics - Store performance metrics
- GET /api/v1/stores/:id/product-metrics - Product-level price changes
- GET /api/v1/stores/:id/competitor-snapshot - Competitive intelligence

Also adds localhost IP bypass for local development testing.
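
A minimal sketch of what a localhost check might look like. `isLocalhostIp` is a hypothetical helper, not the function from this commit; it assumes the caller passes the socket's remote address rather than a client-supplied header.

```typescript
// Returns true for loopback addresses only.
function isLocalhostIp(ip: string): boolean {
  // Node can report IPv4 peers as IPv4-mapped IPv6, e.g. "::ffff:127.0.0.1".
  const normalized = ip.indexOf("::ffff:") === 0 ? ip.slice(7) : ip;
  if (normalized === "::1") return true; // IPv6 loopback
  // 127.0.0.0/8 is the IPv4 loopback range.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(normalized);
}
```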

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 12:14:13 -07:00
Kelly
aa776226b0 fix(consumer): Wire findagram/findadispo to public API
- Update Dockerfiles to use cannaiq.co as API base URL
- Change findagram API client from /api/az to /api/v1 endpoints
- Add trusted origin bypass in public-api middleware for consumer sites
- Consumer sites (findagram.co, findadispo.com) can now access /api/v1
  endpoints without API key authentication
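
The trusted-origin bypass could be sketched like this. The exact-match and HTTPS-only rules are assumptions, and `isTrustedOrigin` is a hypothetical name; only the two consumer domains come from this commit message.

```typescript
// Consumer sites named in the commit message.
const TRUSTED_HOSTS = new Set<string>(["findagram.co", "findadispo.com"]);

// Decide whether a request's Origin header belongs to a trusted consumer site.
function isTrustedOrigin(origin: string | undefined): boolean {
  if (!origin) return false;
  // An Origin header has the form scheme://host[:port]; require https (assumption).
  const match = /^https:\/\/([a-z0-9.-]+)(?::\d+)?$/i.exec(origin);
  if (!match) return false;
  // Exact host match, so "findagram.co.evil.example" is rejected.
  return TRUSTED_HOSTS.has(match[1].toLowerCase());
}
```

Non-browser clients can send any Origin value they like, so a check of this kind identifies well-behaved browsers rather than authenticating the caller.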

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 11:04:50 -07:00
Kelly
2f483b3084 feat: SEO template library, discovery pipeline, and orchestrator enhancements
## SEO Template Library
- Add complete template library with 7 page types (state, city, category, brand, product, search, regeneration)
- Add Template Library tab in SEO Orchestrator with accordion-based editors
- Add template preview, validation, and variable injection engine
- Add API endpoints: /api/seo/templates, preview, validate, generate, regenerate
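
As a rough illustration of variable injection with validation, assuming a `{{name}}` placeholder syntax (the real template syntax is not shown in this log; `injectVariables` is a hypothetical helper):

```typescript
interface InjectionResult {
  output: string;    // template with known variables substituted
  missing: string[]; // placeholder names that had no value
}

// Replace {{name}} placeholders with values and report any that are undefined.
function injectVariables(
  template: string,
  vars: Record<string, string | number>
): InjectionResult {
  const missing: string[] = [];
  const output = template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (placeholder: string, name: string): string => {
      if (Object.prototype.hasOwnProperty.call(vars, name)) {
        return String(vars[name]);
      }
      // Leave unknown placeholders in place so validation can surface them.
      if (missing.indexOf(name) === -1) missing.push(name);
      return placeholder;
    }
  );
  return { output, missing };
}
```

For example, `injectVariables("Dispensaries in {{city}}, {{state}}", { city: "Phoenix", state: "AZ" })` yields the output `Dispensaries in Phoenix, AZ` with an empty `missing` list.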

## Discovery Pipeline
- Add promotion.ts for discovery location validation and promotion
- Add discover-all-states.ts script for multi-state discovery
- Add promotion log migration (067)
- Enhance discovery routes and types

## Orchestrator & Admin
- Add crawl_enabled filter to stores page
- Add API permissions page
- Add job queue management
- Add price analytics routes
- Add markets and intelligence routes
- Enhance dashboard and worker monitoring

## Infrastructure
- Add migrations for worker definitions, SEO settings, field alignment
- Add canonical pipeline for scraper v2
- Update hydration and sync orchestrator
- Enhance multi-state query service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 00:05:34 -07:00
Kelly
b7cfec0770 feat: AZ dispensary harmonization with Dutchie source of truth
Major changes:
- Add harmonize-az-dispensaries.ts script to sync dispensaries with Dutchie API
- Add migration 057 for crawl_enabled and dutchie_verified fields
- Remove legacy dutchie-az module (replaced by platforms/dutchie)
- Clean up deprecated crawlers, scrapers, and orchestrator code
- Update location-discovery to not fall back to slug when ID is missing
- Add crawl-rotator service for proxy rotation
- Add types/index.ts for shared type definitions
- Add woodpecker-agent k8s manifest

Harmonization script:
- Queries ConsumerDispensaries API for all 32 AZ cities
- Matches dispensaries by platform_dispensary_id (not slug)
- Updates existing records with full Dutchie data
- Creates new records for unmatched Dutchie dispensaries
- Disables dispensaries not found in Dutchie
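
The matching steps above can be sketched as a pure planning function. The record shapes and the name `planHarmonization` are hypothetical; the actual script and the ConsumerDispensaries response format are not shown in this log.

```typescript
// Hypothetical record shapes for illustration only.
interface LocalDispensary { id: number; platformDispensaryId: string | null; }
interface RemoteDispensary { platformDispensaryId: string; name: string; }

interface HarmonizePlan {
  toUpdate: Array<{ local: LocalDispensary; remote: RemoteDispensary }>;
  toCreate: RemoteDispensary[];
  toDisable: LocalDispensary[];
}

// Match by platform ID only (never by slug) and decide what to do with each record.
function planHarmonization(
  local: LocalDispensary[],
  remote: RemoteDispensary[]
): HarmonizePlan {
  // Index remote records by ID; this also de-duplicates stores that
  // appear in the results of more than one city query.
  const remoteById = new Map<string, RemoteDispensary>();
  for (const r of remote) remoteById.set(r.platformDispensaryId, r);

  const matched = new Set<string>();
  const plan: HarmonizePlan = { toUpdate: [], toCreate: [], toDisable: [] };

  for (const l of local) {
    const r = l.platformDispensaryId !== null
      ? remoteById.get(l.platformDispensaryId)
      : undefined;
    if (r) {
      plan.toUpdate.push({ local: l, remote: r });
      matched.add(r.platformDispensaryId);
    } else {
      plan.toDisable.push(l); // not found upstream
    }
  }

  remoteById.forEach((r, id) => {
    if (!matched.has(id)) plan.toCreate.push(r); // new upstream record
  });
  return plan;
}
```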

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 10:19:49 -07:00
Kelly
3bc0effa33 feat: Responsive admin UI, SEO pages, and click analytics
## Responsive Admin UI
- Layout.tsx: Mobile sidebar drawer with hamburger menu
- Dashboard.tsx: 2-col grid on mobile, responsive stats cards
- OrchestratorDashboard.tsx: Responsive table with hidden columns
- PagesTab.tsx: Responsive filters and table

## SEO Pages
- New /admin/seo section with state landing pages
- SEO page generation and management
- State page content with dispensary/product counts

## Click Analytics
- Product click tracking infrastructure
- Click analytics dashboard

## Other Changes
- Consumer features scaffolding (alerts, deals, favorites)
- Health panel component
- Workers dashboard improvements
- Legacy DutchieAZ pages removed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:48:21 -07:00
Kelly
b4a2fb7d03 feat: Add v2 architecture with multi-state support and orchestrator services
Major additions:
- Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare
- Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator
- Discovery system: dutchie discovery service, geo validation, city seeding scripts
- Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages
- Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram)
- Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata

Frontend pages added:
- Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores
- StateHeatmap, CrossStateCompare, SyncInfoPanel

Components added:
- StateSelector, OrchestratorTraceModal, WorkflowStepper

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:30:57 -07:00
Kelly
a0f8d3911c feat: Add Findagram and FindADispo consumer frontends
- Add findagram.co React frontend with product search, brands, categories
- Add findadispo.com React frontend with dispensary locator
- Wire findagram to backend /api/az/* endpoints
- Update category/brand links to route to /products with filters
- Add k8s manifests for both frontends
- Add multi-domain user support migrations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 16:10:15 -07:00
Kelly
85e69ef6ad feat(api): Add API key scoping for /api/v1 endpoints
- Add key_type column to wp_dutchie_api_permissions (internal/wordpress)
- Create apiScope middleware with scope types and helpers
- Internal keys: full access to ALL dispensaries
- WordPress keys: restricted to single dispensary
- Update all /api/v1 handlers to honor scope
- Add /dispensaries and /search endpoints to public API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 06:13:20 -07:00
Kelly
d91c55a344 feat: Add stale process monitor, users route, landing page, archive old scripts
- Add backend stale process monitoring API (/api/stale-processes)
- Add users management route
- Add frontend landing page and stale process monitor UI on /scraper-tools
- Move old development scripts to backend/archive/
- Update frontend build with new features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 04:07:31 -07:00
Kelly
66e07b2009 fix(monitor): remove non-existent worker columns from job_run_logs query
The job_run_logs table tracks scheduled job orchestration, not individual
worker jobs. Worker info (worker_id, worker_hostname) belongs on
dispensary_crawl_jobs, not job_run_logs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 18:45:05 -07:00
Kelly
917e91297e Add Dutchie AZ data pipeline and public API v1
- Add dutchie-az module with GraphQL product crawler, scheduler, and admin UI
- Add public API v1 endpoints (/api/v1/products, /categories, /brands, /specials, /menu)
- API key auth maps dispensary to dutchie_az store for per-dispensary data access
- Add frontend pages for Dutchie AZ stores, store details, and schedule management
- Update Layout with Dutchie AZ navigation section

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 09:43:26 -07:00