Scraping Bars & Restaurants Paris + 92/93/94/95 (Google Maps & Uber Eats)

Simon Rochwerg

Large-scale extraction and enrichment of bars & restaurants in Paris and the Île-de-France départements 92/93/94/95, from Google Maps and Uber Eats. The focus is on extracting and verifying emails, capturing all phone numbers (including 06/07 mobile numbers), deduplicating across sources, and delivering clean XLSX/CSV files.

🎯 Context & Goals

Build a Bars & Restaurants prospecting dataset for Paris + 92/93/94/95 that maximizes:

  1. verified, high-quality emails;
  2. all phone numbers, including 06/07 mobiles;
  3. coverage, by combining Google Maps + Uber Eats.

🤖 What the AI does in the pipeline

  1. Website discovery & selection (AI + SERP API)

    • Connect Google Maps entities to SERP API for targeted Google queries.
    • An AI ranker automatically selects the best official website (disambiguates homonyms, ignores marketplaces/directories).
  2. Structured extraction (Parser + AI normalization)

    • Our parser scrapes the selected website and extracts emails, phones (fixed + 06/07), addresses, social links (IG/FB/TikTok), and legal mentions.
    • An AI normalizer standardizes formats (email/phone/address), fixes common anomalies, and fills in missing fields from publicly available information.
  3. Smart email prioritization (AI scoring)

    • An AI classifier scores each email (e.g., contact@, info@, firstname.lastname@domain, etc.).
    • It prioritizes direct/personalized addresses (outreach-friendly) and de-prioritizes generic, no-reply, or platform addresses.
  4. Anti-bounce verification (technical validation)

    • Syntax (RFC) and MX/DNS checks.
    • Optional SMTP routing check (no real send) to reduce bounce rate.
    • Deduplication within and across sources (GMaps ↔ Uber Eats) using email/phone/URL keys.
  5. Consolidation & quality controls

    • Merge Google Maps (address, category, hours, reviews, site) with Uber Eats (often more 06/07 and additional emails).
    • Paris by arrondissement + communes in 92/93/94/95 for maximum coverage.
    • Filter out non-prospect domains (marketplaces/platforms).
  6. Production-ready deliverables

    • XLSX + CSV (stable schema), recap sheet, field dictionary, source IDs, and extraction timestamps.
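Steps 2 to 4 can be illustrated with a minimal, rule-based sketch. It is a stand-in, not the pipeline's actual code: the project uses an AI normalizer and an AI classifier, whereas this example uses plain rules. The function names, the record fields (`source`, `name`, `email`, `phone`, `website`), the word lists, and the 0 to 3 score scale are all assumptions made for illustration. The email pattern is deliberately simplified and is not a full RFC validator, and the merge does not handle transitive matches.

```python
import re
from urllib.parse import urlparse

# Simplified pattern for the sketch, NOT a full RFC-compliant validator.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
GENERIC = {"contact", "info", "hello", "bonjour", "accueil"}  # hypothetical list
NO_REPLY = {"noreply", "no-reply", "donotreply"}              # hypothetical list


def normalize_fr_phone(raw):
    """Return a 10-digit French national number (e.g. 0612345678), or None."""
    digits = re.sub(r"\D", "", raw or "")
    if digits.startswith("0033"):
        digits = "0" + digits[4:]
    elif digits.startswith("33") and len(digits) == 11:
        digits = "0" + digits[2:]
    return digits if len(digits) == 10 and digits.startswith("0") else None


def is_mobile(phone):
    """French mobile numbers start with 06 or 07."""
    return phone.startswith(("06", "07"))


def score_email(email):
    """Hypothetical priority: 0 invalid, 1 no-reply, 2 generic, 3 direct."""
    email = (email or "").strip().lower()
    if not EMAIL_RE.match(email):
        return 0
    local = email.split("@", 1)[0]
    if local in NO_REPLY:
        return 1
    return 2 if local in GENERIC else 3


def website_host(url):
    """Lowercased host without a leading 'www.', or '' if there is no URL."""
    url = (url or "").strip()
    if not url:
        return ""
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def record_keys(rec):
    """Build the email/phone/URL keys used to match records across sources."""
    keys = set()
    email = (rec.get("email") or "").strip().lower()
    if EMAIL_RE.match(email):
        keys.add(("email", email))
    phone = normalize_fr_phone(rec.get("phone"))
    if phone:
        keys.add(("phone", phone))
    host = website_host(rec.get("website"))
    if host:
        keys.add(("url", host))
    return keys


def deduplicate(records):
    """Merge records that share at least one key with an earlier record."""
    index, merged = {}, []
    for rec in records:
        keys = record_keys(rec)
        entry = next((index[k] for k in keys if k in index), None)
        if entry is None:
            entry = {"name": rec.get("name"), "sources": set(),
                     "emails": set(), "phones": set()}
            merged.append(entry)
        entry["sources"].add(rec.get("source"))
        for kind, value in keys:
            if kind == "email":
                entry["emails"].add(value)
            elif kind == "phone":
                entry["phones"].add(value)
            index[(kind, value)] = entry
    return merged
```

In this sketch, a Google Maps row and an Uber Eats row that share the same email collapse into one entry that keeps both phone numbers, which is how the Uber Eats source can add a 06/07 number to an existing Google Maps record.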

🔎 Anonymized sample (domains visible)

(5 rows from the exports; names/coordinates partially masked — domains kept visible)

source | name | email | phone | city
Google Maps | So p. | a***@gmail.com | ********90 | Valenton
Google Maps | C***r … Cr | i***@orange.fr | ********50 | Chennevières-sur-Marne
Google Maps | C***n St *… | c***@yahoo.fr | ********50 | Puteaux
Google Maps | T**k * P****t … | i***@laposte.net | ********40 | Ivry-sur-Seine
Uber Eats | Bo R***t | c***@restaurant-paris.fr | ********21 | Paris 11ᵉ

Final deliveries contain full (unmasked) emails and phone numbers. Masking here is only for the example.


📈 Key Figures (highlights)

  • Paris (Google Maps): Bars 1,205 emails / 2,947 rows (40.9%); Restaurants 1,422 / 3,711 (38.3%).
  • 92/93/94/95 (Google Maps): 3,708 unique emails delivered.
  • Uber Eats (Paris): ~3,000 restaurants in Paris + ~1,800 nearby; strong 06/07 coverage and additional emails.
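The percentages quoted for Paris are email hit rates, that is, emails found divided by rows scraped. As a quick check, the short sketch below recomputes them from the counts given above; the function and variable names are illustrative only.

```python
def hit_rate(emails, rows):
    """Share of rows with an email, as a percentage rounded to one decimal."""
    return round(100 * emails / rows, 1)


# Counts taken from the Paris (Google Maps) figures above: (emails, rows).
paris_google_maps = {
    "Bars": (1205, 2947),
    "Restaurants": (1422, 3711),
}

rates = {category: hit_rate(emails, rows)
         for category, (emails, rows) in paris_google_maps.items()}

for category, rate in rates.items():
    print(f"{category}: {rate}%")
```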

Why Uber Eats? It’s more “direct” on the merchant side: you frequently find more mobile numbers (06/07) and emails that don’t appear on Google Maps — a powerful complement for outreach.


🧰 Stack & Practices

  • Python, SERP API, parallel workers, robust retry logic.
  • AI for ranking, normalization, and email scoring (prompting + business rules).
  • Email validation (syntax, MX, optional SMTP check), cross-source deduplication.
  • XLSX/CSV exports + field dictionary; execution logs for auditability.
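As an illustration of the "parallel workers, robust retry logic" item, here is a minimal sketch built on the standard library only. It is not the project's actual code: `with_retries`, `run_parallel`, the attempt count, the backoff delays, and the worker count are assumptions chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Wrap func so that failures are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == attempts:
                    raise  # give up after the last attempt
                # Wait 0.5s, 1s, 2s, ... between attempts (with the defaults).
                sleep(base_delay * 2 ** (attempt - 1))
    return wrapper


def run_parallel(func, items, max_workers=8):
    """Apply func to every item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

A typical use would wrap a fetch function once, for example `safe_fetch = with_retries(fetch_page)` where `fetch_page` is a hypothetical function taking a URL, and then call `run_parallel(safe_fetch, urls)`. Threads suit this kind of workload because most of the time is spent waiting on the network.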

✅ Outcomes

  • A clean, verified, consolidated dataset ready for multi-channel outreach.
  • Significant increase in mobile (06/07) coverage thanks to Uber Eats.
  • Immediate import into CRM / n8n; lower bounce rate and less manual qualification.

💰 Budget & Milestones

  • Google Maps (Paris + 92/93/94/95): €600 incl. VAT.
  • Uber Eats (Paris): €250 incl. VAT (discount applied).
  • Total: €850 incl. VAT — deliveries over ~3 weeks.
