How a Canadian Grocery Aggregator Used Metro.ca Data for Catalog Enrichment

At a Glance

Client

  • Canadian grocery technology platform (catalog aggregation focus)

Geography

  • Canada — primary focus on Quebec and Ontario markets

Platforms Scraped

  • Metro.ca (full grocery catalog and product images)

Project Duration

  • 6 weeks initial build

  • Ongoing weekly refreshes thereafter

The Challenge

The client was building a grocery technology platform that needed a clean, comprehensive catalog of Canadian grocery items, with structured attributes (brand, size, category, nutrition where applicable) and high-quality product images. Building this catalog from scratch through manual entry or supplier feeds was estimated to take 9-12 months and require significant headcount.

Metro.ca, one of Canada's largest grocery retailers, had a well-structured online catalog with tens of thousands of SKUs across categories, but no public API. The product images were particularly valuable: high-resolution, consistently framed, professional photography that would have taken months to commission.

The client needed a way to extract both Metro.ca's catalog structure and its product images at scale, respectfully and within reasonable rate limits.
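To make "within reasonable rate limits" concrete, here is a minimal sketch of a request throttle that enforces a minimum gap between fetches. The class name `Throttle`, its parameters, and the 2-second interval in the usage note are illustrative assumptions; the actual limits used on this project are not stated here.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock               # injectable so the logic can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A crawler would call `throttle.wait()` before each page or image request, for example with `Throttle(2.0)` to keep requests at least two seconds apart.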

The Approach

Actowiz Solutions designed a grocery catalog extraction pipeline tailored to Metro.ca's structure:

  • Category tree mapping — full extraction of Metro.ca's category hierarchy across all departments (produce, dairy, frozen, packaged, etc.)

  • SKU-level data extraction — product name, brand, size/weight, price, category tags, and structured attributes for tens of thousands of products

  • Image extraction — high-quality product images downloaded with appropriate attribution metadata

  • Schema normalization — output structured into a clean schema the client could ingest directly into their database

  • Weekly refresh — to catch new SKU launches, discontinuations, and image updates
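As an illustration of the schema normalization step, the sketch below maps one raw scraped record onto a fixed target schema. The field names (`sku`, `name`, `brand`, `size`, `price`, `category`), the `>` breadcrumb separator, and the English-style price format are assumptions for the example, not the client's actual schema.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple


@dataclass(frozen=True)
class CatalogRecord:
    """Target schema for one product (hypothetical field set)."""
    sku: str
    name: str
    brand: str
    size: str
    price: Optional[Decimal]
    category_path: Tuple[str, ...]


def _clean(text):
    """Collapse runs of whitespace and trim both ends; None becomes ''."""
    return re.sub(r"\s+", " ", text or "").strip()


def normalize_record(raw):
    """Map one raw scraped dict (assumed keys) onto CatalogRecord."""
    # Keep only digits and the decimal point, e.g. "$4.99" -> "4.99".
    price_text = re.sub(r"[^0-9.]", "", raw.get("price") or "")
    # A breadcrumb such as "Dairy > Milk" becomes ("Dairy", "Milk").
    parts = (_clean(p) for p in (raw.get("category") or "").split(">"))
    return CatalogRecord(
        sku=_clean(raw.get("sku")),
        name=_clean(raw.get("name")),
        brand=_clean(raw.get("brand")),
        size=_clean(raw.get("size")),
        price=Decimal(price_text) if price_text else None,
        category_path=tuple(p for p in parts if p),
    )
```

Storing the price as a `Decimal` rather than a float avoids rounding drift when the records are later written to CSV or loaded into a database.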

The Solution Architecture

Grocery catalogs are deceptively complex — variant handling (12oz vs 24oz of the same product), brand-name normalization, and category taxonomy each require careful logic. The extraction pipeline handled these systematically. For image data, the pipeline downloaded images at appropriate resolutions and tagged them with the source SKU for the client's downstream content workflow.
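Variant handling can be sketched roughly as follows: parse each size string into a quantity and unit, then group products that share a brand and base name so that sizes of the same item sit together. The unit table, the regular expression, and the dictionary keys are assumptions for illustration; ounces are deliberately left unconverted because weight and fluid ounces cannot be told apart from the string alone.

```python
import re
from collections import defaultdict

# Map each recognised unit to (base unit, multiplier). Illustrative subset only.
_UNIT_FACTORS = {
    "g": ("g", 1.0),
    "kg": ("g", 1000.0),
    "ml": ("ml", 1.0),
    "l": ("ml", 1000.0),
    "oz": ("oz", 1.0),  # not converted: weight vs fluid ounce is ambiguous
}

_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(kg|g|ml|l|oz)\b", re.IGNORECASE)


def parse_size(text):
    """Return (quantity in base unit, base unit), or None if no size is found."""
    match = _SIZE_RE.search(text)
    if match is None:
        return None
    unit, factor = _UNIT_FACTORS[match.group(2).lower()]
    return (float(match.group(1)) * factor, unit)


def group_variants(products):
    """Group products by (brand, name), each group sorted by parsed quantity."""
    groups = defaultdict(list)
    for product in products:
        key = (product["brand"].strip().lower(), product["name"].strip().lower())
        groups[key].append(product)
    for items in groups.values():
        # Products with no parseable size sort first.
        items.sort(key=lambda p: (parse_size(p["size"]) or (0.0, ""))[0])
    return dict(groups)
```

Grouping on a lowercased, trimmed brand and name is the simplest possible form of brand-name normalization; a production pipeline would likely need an alias table for brands spelled differently across listings.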

Output was delivered as structured CSVs plus a separate image archive, weekly, with a change-log showing additions and removals from the prior week's snapshot.
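The weekly change-log can be derived by comparing SKU identifiers between two snapshots. The sketch below assumes each CSV has a `sku` column that uniquely identifies a product; the column name and function names are hypothetical.

```python
import csv
import io


def load_skus(csv_text, key="sku"):
    """Parse CSV text into a dict keyed by the SKU column (assumed unique)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def build_change_log(previous, current):
    """List SKUs added to and removed from the catalog since the prior snapshot."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    return {"added": added, "removed": removed}
```

For example, if last week's snapshot held SKUs 1 and 2 and this week's holds 2 and 3, the change-log reports SKU 3 as added and SKU 1 as removed.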

Results

  • 40,000+ SKUs extracted from Metro.ca with full attribute data

  • 35,000+ product images downloaded and tagged for client catalog use

  • Catalog build timeline reduced from an estimated 9-12 months to 6 weeks for the Canadian grocery vertical

  • Weekly refresh kept the catalog current with Metro.ca's actual SKU evolution

  • Quality benchmark established — the Metro.ca data became the reference catalog against which the client validated other Canadian grocery data sources

Why This Matters For You

If you're building a grocery, recipe, meal-planning, or food-tech platform that needs a clean Canadian product catalog, building from scratch is inherently slow. Major Canadian grocery retailers (Metro, Loblaws, Sobeys) maintain comprehensive online catalogs with structured data and high-quality images. Automated extraction, done within rate limits and respecting platform terms, turns an estimated 9-12 months of catalog work into 6 weeks of data engineering.

The same approach works for US grocery (Kroger, Walmart, Safeway, Whole Foods), UK grocery (Tesco, Sainsbury's, Waitrose, Ocado), and Australian grocery (Coles, Woolworths): anywhere a structured online grocery catalog exists.

