Commit Graph

346 Commits

Author SHA1 Message Date
Kelly
2f483b3084 feat: SEO template library, discovery pipeline, and orchestrator enhancements
## SEO Template Library
- Add complete template library with 7 page types (state, city, category, brand, product, search, regeneration)
- Add Template Library tab in SEO Orchestrator with accordion-based editors
- Add template preview, validation, and variable injection engine
- Add API endpoints: /api/seo/templates, preview, validate, generate, regenerate

## Discovery Pipeline
- Add promotion.ts for discovery location validation and promotion
- Add discover-all-states.ts script for multi-state discovery
- Add promotion log migration (067)
- Enhance discovery routes and types

## Orchestrator & Admin
- Add crawl_enabled filter to stores page
- Add API permissions page
- Add job queue management
- Add price analytics routes
- Add markets and intelligence routes
- Enhance dashboard and worker monitoring

## Infrastructure
- Add migrations for worker definitions, SEO settings, field alignment
- Add canonical pipeline for scraper v2
- Update hydration and sync orchestrator
- Enhance multi-state query service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 00:05:34 -07:00
Kelly
9711d594db feat(orchestrator): Add crawl_enabled filter to stores page
- Backend: Filter stores by crawl_enabled (default: enabled only)
- API: Support crawl_enabled param in getOrchestratorStores
- UI: Add Enabled/Disabled/All filter toggle buttons
- UI: Show crawl status icon in stores table

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 14:18:28 -07:00
Kelly
39aebfcb82 fix: Static file paths and crawl_enabled API filters
- Fix static file paths for local development (./public/* instead of /app/public/*)
- Add crawl_enabled and dutchie_verified filters to /api/stores and /api/dispensaries
- Default API to return only enabled stores (crawl_enabled=true)
- Add ?crawl_enabled=false to show disabled, ?crawl_enabled=all to show all

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 14:07:17 -07:00
Kelly
5415cac2f3 feat(seo): Add SEO tables to migration and ingress config
- Add seo_pages and seo_page_contents tables to migrate.ts for
  automatic creation on deployment
- Update Home.tsx with minor formatting
- Add ingress configuration updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 12:58:38 -07:00
kelly
70d2364a6f Merge pull request 'feat: Rename WordPress plugin to CannaIQ Menus v1.5.3' (#3) from feature/cannaiq-menus-plugin-rename into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/3
2025-12-08 18:47:15 +00:00
Kelly
b1ab45f662 fix: Fix TypeScript errors for CI
- Add export {} to cli.ts and discover-and-import-store.ts to treat as modules
- Remove scrapeStore reference in crawler-jobs.ts (legacy scraper removed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 10:45:47 -07:00
Kelly
20300edbb8 fix: Remove platform_id_source from harmonization INSERT
Production DB doesn't have this column - removing to allow creates.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 10:23:19 -07:00
Kelly
b7cfec0770 feat: AZ dispensary harmonization with Dutchie source of truth
Major changes:
- Add harmonize-az-dispensaries.ts script to sync dispensaries with Dutchie API
- Add migration 057 for crawl_enabled and dutchie_verified fields
- Remove legacy dutchie-az module (replaced by platforms/dutchie)
- Clean up deprecated crawlers, scrapers, and orchestrator code
- Update location-discovery to not fallback to slug when ID is missing
- Add crawl-rotator service for proxy rotation
- Add types/index.ts for shared type definitions
- Add woodpecker-agent k8s manifest

Harmonization script:
- Queries ConsumerDispensaries API for all 32 AZ cities
- Matches dispensaries by platform_dispensary_id (not slug)
- Updates existing records with full Dutchie data
- Creates new records for unmatched Dutchie dispensaries
- Disables dispensaries not found in Dutchie

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 10:19:49 -07:00
Kelly
948a732dd5 feat: Rename WordPress plugin to CannaIQ Menus v1.5.3
- Rename plugin from Crawlsy Menus to CannaIQ Menus
- Update version to 1.5.3
- Update text domain to cannaiq-menus
- Rename all CSS classes from crawlsy-* to cannaiq-*
- Update shortcodes to [cannaiq_products] and [cannaiq_product]
- Add backward compatibility for legacy shortcodes
- Update download links on Home and LandingPage
- Fix health panel Redis timeout issue
- Add clear error message when backend not running

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 00:24:43 -07:00
Kelly
bf4ceaf09e fix: Make all migrations idempotent
- 008: Add IF NOT EXISTS to ALTER TABLE ADD COLUMN
- 011: Add IF NOT EXISTS to CREATE TABLE and INDEX
- 012: Add IF NOT EXISTS, DROP TRIGGER IF EXISTS
- 013: Add ON CONFLICT (azdhs_id) DO NOTHING
- 014: Add IF NOT EXISTS to ALTER TABLE ADD COLUMN

All migrations can now be safely re-run without errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 23:48:35 -07:00
kelly
fda688b11a Merge pull request 'feat: Responsive UI, SEO pages, AI content generation' (#2) from feature/workers-dashboard into master 2025-12-08 06:43:36 +00:00
Kelly
f7081838cf feat: Add AI settings to database (provider/model configurable via UI)
- ai_provider and ai_model stored in settings table
- Editable via /settings page in admin UI
- API keys remain in env vars for security
- Falls back to env vars if settings not in DB

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 23:18:52 -07:00
Kelly
4a5af1bd5f feat: Add AI_MODEL env var for custom model selection
Set AI_MODEL to any model name (e.g., gpt-4.1, claude-opus-4-20250514)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 23:03:09 -07:00
Kelly
d813874b3a feat: Add Claude and OpenAI support for SEO content generation
Configure via env vars:
- AI_PROVIDER=claude|openai (default: claude)
- ANTHROPIC_API_KEY=sk-ant-...
- OPENAI_API_KEY=sk-...

Falls back to template generation if no API key configured.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:57:59 -07:00
Kelly
a3b7ae9802 fix: Fix SEO generator - lazy pool init, fallback queries
- Move pool initialization inside functions (lazy loading)
- Fix page_key parsing for state-XX format
- Add fallback query if mv_state_metrics doesn't exist

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:56:45 -07:00
Kelly
7a1835778b fix: Add missing SEO pages list and sync endpoints
- GET /api/seo/pages - List all SEO pages with filters
- POST /api/seo/sync-state-pages - Create pages for states with dispensaries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:51:35 -07:00
Kelly
3bc0effa33 feat: Responsive admin UI, SEO pages, and click analytics
## Responsive Admin UI
- Layout.tsx: Mobile sidebar drawer with hamburger menu
- Dashboard.tsx: 2-col grid on mobile, responsive stats cards
- OrchestratorDashboard.tsx: Responsive table with hidden columns
- PagesTab.tsx: Responsive filters and table

## SEO Pages
- New /admin/seo section with state landing pages
- SEO page generation and management
- State page content with dispensary/product counts

## Click Analytics
- Product click tracking infrastructure
- Click analytics dashboard

## Other Changes
- Consumer features scaffolding (alerts, deals, favorites)
- Health panel component
- Workers dashboard improvements
- Legacy DutchieAZ pages removed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:48:21 -07:00
Kelly
38d3ea1408 chore: Remove build artifacts from tracking
These files are now in .gitignore and should not be committed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 22:45:23 -07:00
kelly
414b97b3c0 Merge pull request 'feature/workers-dashboard' (#1) from feature/workers-dashboard into master
Reviewed-on: https://code.cannabrands.app/Creationshop/dispensary-scraper/pulls/1
2025-12-08 02:54:26 +00:00
Kelly
05c55809b6 fix: Regenerate findagram package-lock.json for npm ci compatibility
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 19:16:27 -07:00
Kelly
705bb57a23 fix: Remove unused imports and fix ESLint errors in findadispo
- Remove unused MapPin import from Dashboard.jsx
- Remove unused Clock import from DashboardHome.jsx
- Remove unused Mail import from Profile.jsx
- Remove unused CardHeader, CardTitle imports from SavedSearches.jsx
- Add eslint-disable for heading-has-content in card.jsx (shadcn pattern)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 19:02:40 -07:00
Kelly
17ca0bd3ee fix: Sync findadispo package-lock.json with package.json 2025-12-07 15:31:37 -07:00
Kelly
112e127b5d ci: trigger build 2025-12-07 15:20:22 -07:00
Kelly
c5a8ef84bf chore: Add package-lock.json for findadispo 2025-12-07 14:11:42 -07:00
Kelly
dd299e0d4c fix: Remove broken llm-scraper submodule reference 2025-12-07 13:53:21 -07:00
Kelly
a91565ca5a ci: Use woodpeckerci/plugin-docker-buildx and fix secrets syntax 2025-12-07 13:47:36 -07:00
Kelly
9e30e806f9 ci: Rename to .ci.yml to match woodpecker convention 2025-12-07 13:38:17 -07:00
Kelly
c779e6919f ci: Fix woodpecker syntax - use steps instead of pipeline 2025-12-07 13:21:00 -07:00
Kelly
566872eae8 ci: trigger pipeline test 2025-12-07 13:19:14 -07:00
Kelly
2d82cf9323 ci: Move pipeline to .woodpecker/ci.yml 2025-12-07 13:06:23 -07:00
Kelly
861201290a ci: Switch to Woodpecker CI pipeline
Replaces Gitea Actions with Woodpecker CI config.

Pipeline:
- CI: typecheck backend, build all 3 frontends (all branches)
- CD: build 4 Docker images, deploy to k8s (master only)

Required secrets in Woodpecker:
- registry_username
- registry_password
- kubeconfig_data (base64 encoded)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 12:18:13 -07:00
Kelly
84cdc1c12c chore: trigger CI test 2025-12-07 12:14:26 -07:00
Kelly
c6ab066d25 ci: Add Gitea Actions CI/CD pipeline
- ci.yml: Runs on all branches - typecheck backend, build all 3 frontends
- deploy.yml: Runs on master only after CI passes
  - Builds and pushes 4 Docker images to Gitea registry
  - Deploys to Kubernetes (scraper, scraper-worker, 3 frontends)

Required secrets:
- REGISTRY_USERNAME: Gitea username
- REGISTRY_PASSWORD: Gitea password/token
- KUBECONFIG: Base64-encoded kubeconfig

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 12:10:19 -07:00
Kelly
b4a2fb7d03 feat: Add v2 architecture with multi-state support and orchestrator services
Major additions:
- Multi-state expansion: states table, StateSelector, NationalDashboard, StateHeatmap, CrossStateCompare
- Orchestrator services: trace service, error taxonomy, retry manager, proxy rotator
- Discovery system: dutchie discovery service, geo validation, city seeding scripts
- Analytics infrastructure: analytics v2 routes, brand/pricing/stores intelligence pages
- Local development: setup-local.sh starts all 5 services (postgres, backend, cannaiq, findadispo, findagram)
- Migrations 037-056: crawler profiles, states, analytics indexes, worker metadata

Frontend pages added:
- Discovery, ChainsDashboard, IntelligenceBrands, IntelligencePricing, IntelligenceStores
- StateHeatmap, CrossStateCompare, SyncInfoPanel

Components added:
- StateSelector, OrchestratorTraceModal, WorkflowStepper

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:30:57 -07:00
Kelly
8ac64ba077 feat(cannaiq): Add Workers Dashboard and visibility tracking
Workers Dashboard:
- New /workers route with two-pane layout
- Workers table showing Alice, Henry, Bella, Oscar with role badges
- Run history with visibility stats (lost/restored counts)
- "Run Now" action to trigger workers immediately

Migrations:
- 057: Add visibility tracking columns (visibility_lost, visibility_lost_at, visibility_restored_at)
- 058: Add ID resolution columns for Henry worker
- 059: Add job queue columns (max_retries, retry_count, worker_id, locked_at, locked_by)

Backend fixes:
- Add httpStatus to CrawlResult interface for error classification
- Fix pool.ts typing for event listener
- Update completeJob to accept visibility stats in metadata

Frontend fixes:
- Fix NationalDashboard crash with safe formatMoney helper
- Fix OrchestratorDashboard/Stores StoreInfo type mismatches
- Add workerName/workerRole to getDutchieAZSchedules API type

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 11:04:12 -07:00
Kelly
1d1263afc6 docs: Add mandatory local mode checklist for crawls and tests
CLAUDE.md now requires explicit local mode confirmation before
running any crawler, orchestrator, sandbox test, or image scrape.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 12:37:48 -07:00
Kelly
e63329457c feat: Add local storage adapter and update CLAUDE.md with permanent rules
- Add local-storage.ts with smart folder structure:
  /storage/products/{brand}/{state}/{product_id}/
- Add storage-adapter.ts unified abstraction
- Add docker-compose.local.yml (NO MinIO)
- Add start-local.sh convenience script
- Update CLAUDE.md with:
  - PERMANENT RULES section (no data deletion)
  - DEPLOYMENT AUTHORIZATION requirements
  - LOCAL DEVELOPMENT defaults
  - STORAGE BEHAVIOR documentation
  - FORBIDDEN ACTIONS list
  - UI ANONYMIZATION rules

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 12:36:10 -07:00
Kelly
a0f8d3911c feat: Add Findagram and FindADispo consumer frontends
- Add findagram.co React frontend with product search, brands, categories
- Add findadispo.com React frontend with dispensary locator
- Wire findagram to backend /api/az/* endpoints
- Update category/brand links to route to /products with filters
- Add k8s manifests for both frontends
- Add multi-domain user support migrations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 16:10:15 -07:00
Kelly
d120a07ed7 fix(cannaiq): Fix hardcoded localhost URLs in product image paths
- Add .dockerignore to exclude .env.local from Docker builds
- Replace http://localhost:9020/dutchie/ with /api/images/dutchie/ in:
  - StoreDetail.tsx
  - ProductDetail.tsx
  - StoreView.tsx

This fixes production builds connecting to localhost for images.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 15:32:27 -07:00
Kelly
85e69ef6ad feat(api): Add API key scoping for /api/v1 endpoints
- Add key_type column to wp_dutchie_api_permissions (internal/wordpress)
- Create apiScope middleware with scope types and helpers
- Internal keys: full access to ALL dispensaries
- WordPress keys: restricted to single dispensary
- Update all /api/v1 handlers to honor scope
- Add /dispensaries and /search endpoints to public API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 06:13:20 -07:00
Kelly
d91c55a344 feat: Add stale process monitor, users route, landing page, archive old scripts
- Add backend stale process monitoring API (/api/stale-processes)
- Add users management route
- Add frontend landing page and stale process monitor UI on /scraper-tools
- Move old development scripts to backend/archive/
- Update frontend build with new features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 04:07:31 -07:00
Kelly
d2d44d2aeb Improve ScraperMonitor tab loading efficiency and add New/Updated columns
- Tab-specific data loading: Only fetch APIs needed for the active tab
- AZ Live tab fetches only AZ monitor APIs
- Dispensary Jobs tab fetches only legacy job APIs
- Crawl History tab fetches only scraper history APIs
- Auto-refresh now respects active tab
- Added New and Updated columns to Crawl History table with color coding

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 18:26:56 -07:00
Kelly
b082b2cf05 Add curaleaf/sol dutchie detection, update batch crawl script with all 57 store IDs
- Add curaleaf.com and livewithsol.com to dutchie detection patterns
- Update crawl-five-sequential.ts with all 57 dutchie store IDs for batch crawling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 01:00:56 -07:00
Kelly
ff1510f475 feat(api): add bulk crawl endpoints for all Dutchie stores
- GET /api/az/admin/dutchie-stores - Lists all Dutchie stores with crawl status
- POST /api/az/admin/crawl-all - Enqueues product crawl jobs for all ready stores

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 00:07:20 -07:00
Kelly
1083b51e6d feat(detection): extract reactEnv dispensaryId and prefer Dutchie on page 2025-12-03 22:31:40 -07:00
Kelly
129d318314 fix(detection): crawl websites to find Dutchie menus and retry missing platform IDs 2025-12-03 22:16:31 -07:00
Kelly
33e12de3f2 Remove domain-based shortcuts for Curaleaf/Sol detection
Menu detection now always crawls websites to find actual embedded menu
providers instead of marking stores as proprietary based on domain alone.
This fixes detection for stores like Curaleaf that may use Dutchie embeds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:55:31 -07:00
Kelly
95fc8bb4cc Improve menu detection to extract platform ID from URL and crawl proprietary domains
- Add extractFromMenuUrl() to discovery.ts that extracts either cName or platformId directly
  from Dutchie URLs (handles /api/v2/embedded-menu/<id>.js pattern)
- Add isObjectId() helper to identify MongoDB ObjectIds in URLs
- Update menu-detection.ts to skip GraphQL resolution when URL contains platformId directly
- For proprietary domains (curaleaf, sol), crawl website to find actual menu provider
  instead of blindly marking as not_crawlable
- If website crawl finds Dutchie embedded menu, set menu_type='dutchie' and resolve platform ID
- Tested successfully with consumeaz.com which discovers Dutchie embedded menu JS URL

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:41:45 -07:00
Kelly
202a3b92bf Add dbaName to Dispensary type to fix TypeScript build error
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:12:05 -07:00
Kelly
05f4de86d3 Fix detection query to include dutchie stores missing platform_id
When onlyUnknown=true and onlyMissingPlatformId=true, the query now
uses OR logic instead of AND to include:
- Stores with unknown menu_type (new stores needing detection)
- Stores with menu_type='dutchie' but no platform_dispensary_id

This allows re-detection to:
1. Resolve platform IDs for actual Dutchie stores
2. Reclassify stores that migrated away from Dutchie (e.g. to Curaleaf)

Also changed default for onlyMissingPlatformId to true so scheduled
detection jobs always attempt to resolve missing platform IDs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:09:37 -07:00