- Add backend stale process monitoring API (/api/stale-processes)
- Add users management route
- Add frontend landing page and stale process monitor UI on /scraper-tools
- Move old development scripts to backend/archive/
- Update frontend build with new features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
71 lines
2.4 KiB
TypeScript
import { createStealthBrowser, createStealthContext, waitForPageLoad, isCloudflareChallenge, waitForCloudflareChallenge } from './src/utils/stealthBrowser';
import { getRandomProxy } from './src/utils/proxyManager';
import { dutchieTemplate } from './src/scrapers/templates/dutchie';
import { pool } from './src/db/migrate';

async function testDutchieTemplate() {
  console.log('🧪 Testing Dutchie scraper template\n');

  const baseUrl = 'https://dutchie.com/dispensary/sol-flower-dispensary';
  const category = 'Flower';

  // Build category URL using template
  const categoryUrl = dutchieTemplate.buildCategoryUrl(baseUrl, category);
  console.log(`📂 Category: ${category}`);
  console.log(`🔗 URL: ${categoryUrl}\n`);

  // Get proxy
  const proxy = await getRandomProxy();
  console.log(`🔍 Using proxy: ${proxy?.server || 'none'}\n`);

  const browser = await createStealthBrowser({ proxy: proxy || undefined, headless: true });

  try {
    const context = await createStealthContext(browser, { state: 'Arizona' });
    const page = await context.newPage();

    console.log('🌐 Loading page...');
    await page.goto(categoryUrl, { waitUntil: 'domcontentloaded', timeout: 60000 });

    // Check for Cloudflare
    if (await isCloudflareChallenge(page)) {
      console.log('🛡️ Cloudflare detected, waiting...');
      const passed = await waitForCloudflareChallenge(page, 60000);
      if (!passed) {
        console.log('❌ Failed to pass Cloudflare');
        // The finally block closes the browser and the pool; closing them
        // here as well would call pool.end() a second time.
        return;
      }
    }

    await waitForPageLoad(page);

    // Wait for content
    await page.waitForTimeout(3000);

    console.log('\n📦 Extracting products using Dutchie template...');
    const products = await dutchieTemplate.extractProducts(page);

    console.log(`\n✅ Found ${products.length} products!\n`);

    if (products.length > 0) {
      console.log('First 10 products:');
      products.slice(0, 10).forEach((product, i) => {
        console.log(`\n${i + 1}. ${product.name}`);
        if (product.brand) console.log(` Brand: ${product.brand}`);
        if (product.price) console.log(` Price: $${product.price}`);
        console.log(` URL: ${product.product_url}`);
      });
    }

  } catch (error) {
    console.error('❌ Error:', error);
  } finally {
    await browser.close();
    await pool.end();
  }
}

testDutchieTemplate();