Overview
An automated n8n workflow that intelligently scrapes, analyzes, and scores rental listings from 51.ca for the Toronto area, helping identify high-quality housing opportunities based on personalized criteria.
Key Features

- Intelligent Web Scraping: Automated daily collection of over 300 rental listings across multiple Toronto-area neighborhoods (North York, Scarborough, Downtown, Etobicoke, Markham, Pickering) with built-in rate limiting and error handling
- AI-Powered Scoring System: Custom LLM-based evaluation engine using Google Gemini and Grok models that scores listings 0-10 based on:
  - Unit type (whole apartment vs. room)
  - Proximity to transit and amenities
  - Tenant requirements and preferences
  - Price-to-value ratio
  - Listing freshness
- Duplicate Detection: Multi-layered deduplication using URL, title, address, and detailed requirements to prevent redundant entries
- Smart Update Tracking: Monitors listing modifications (e.g. update frequency) and price changes over time, maintaining historical data
- Data Enrichment: Extracts detailed requirements and property information using Cheerio for HTML parsing
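
The multi-layered deduplication described above could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the listing shape (`url`, `title`, `address`, `requirements`), the helper names, and the choice of which fields are combined into each layer are all assumptions.

```javascript
// Hypothetical listing shape: { url, title, address, requirements }.
// Normalize text so differences in case or spacing do not defeat matching.
function normalize(text) {
  return String(text ?? '')
    .toLowerCase()
    .replace(/\s+/g, ' ')
    .trim();
}

// Build one key per layer (assumed layers: URL, title+address, address+requirements).
function dedupKeys(listing) {
  const keys = [];
  if (listing.url) {
    // Drop the query string and fragment so tracking parameters are ignored.
    keys.push('url:' + String(listing.url).trim().replace(/[?#].*$/, ''));
  }
  const title = normalize(listing.title);
  const address = normalize(listing.address);
  const requirements = normalize(listing.requirements);
  if (title && address) keys.push('ta:' + title + '|' + address);
  if (address && requirements) keys.push('ar:' + address + '|' + requirements);
  return keys;
}

// Keep the first listing seen for each key; skip any later listing that
// shares at least one key with a listing already kept.
function dedupe(listings) {
  const seen = new Set();
  const unique = [];
  for (const listing of listings) {
    const keys = dedupKeys(listing);
    if (keys.some((key) => seen.has(key))) continue;
    keys.forEach((key) => seen.add(key));
    unique.push(listing);
  }
  return unique;
}
```

Matching on any single layer is deliberately aggressive; a stricter variant might require two layers to agree before treating a listing as a duplicate.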
Technical Stack
- Workflow Automation: n8n, JavaScript
- Web Scraping: HTTP requests with Cheerio for parsing Chinese/English content
- AI/ML: Google Gemini 2.5 Flash, Grok 4 Fast (via OpenRouter API)
- Data Storage: Google Sheets for persistent storage and easy analysis
- Language Processing: Structured output parsing for consistent AI responses
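
As a rough sketch of what structured output parsing can look like, the function below validates a model reply before it is stored. The reply shape (`score`, `reason`) and the function name `parseScoreReply` are assumptions for illustration; the workflow's actual schema is not shown in this document.

```javascript
// Assumed reply shape: {"score": <number from 0 to 10>, "reason": "<string>"}.
function parseScoreReply(raw) {
  const text = String(raw);
  // Models sometimes surround the JSON with extra prose, so take the span
  // from the first "{" to the last "}".
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return { ok: false, error: 'no JSON object found' };
  }
  let data;
  try {
    data = JSON.parse(text.slice(start, end + 1));
  } catch (err) {
    return { ok: false, error: 'invalid JSON' };
  }
  // Reject missing, non-numeric, or out-of-range scores.
  if (typeof data.score !== 'number' || !Number.isFinite(data.score)) {
    return { ok: false, error: 'score is not a number' };
  }
  if (data.score < 0 || data.score > 10) {
    return { ok: false, error: 'score out of range' };
  }
  return { ok: true, score: data.score, reason: String(data.reason ?? '') };
}
```

Returning an `ok` flag instead of throwing lets a downstream branch route bad replies to a retry or a manual-review path.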
Workflow Architecture
- Scheduled Scraping: Daily automated runs across configured Toronto neighborhoods
- Data Extraction: Parses listing details including price, location, requirements, and amenities
- Validation Pipeline: Checks for 404 errors and invalid links caused by deleted or renewed posts
- AI Analysis: Scores listings based on complex multi-criteria evaluation rules
- Deduplication & Updates: Intelligent matching to update existing records or create new entries
- Historical Tracking: Maintains price history and modification dates
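
The update-or-create step with price history might look something like the sketch below. It uses an in-memory `Map` keyed by URL as a stand-in for the Google Sheets rows, and the record fields (`priceHistory`, `firstSeen`, `lastSeen`) and return values are hypothetical names chosen for illustration.

```javascript
// Hypothetical record shape:
// { url, price, priceHistory: [{ price, date }], firstSeen, lastSeen }
function upsertListing(records, scraped, today) {
  const existing = records.get(scraped.url);
  if (!existing) {
    // New listing: start its price history with the current price.
    records.set(scraped.url, {
      ...scraped,
      priceHistory: [{ price: scraped.price, date: today }],
      firstSeen: today,
      lastSeen: today,
    });
    return 'created';
  }
  existing.lastSeen = today;
  if (existing.price !== scraped.price) {
    // Price changed: append to the history rather than overwriting it.
    existing.priceHistory.push({ price: scraped.price, date: today });
    existing.price = scraped.price;
    return 'price_changed';
  }
  return 'unchanged';
}

// Usage with made-up data:
const records = new Map();
upsertListing(records, { url: 'https://example.com/1', price: 2000 }, '2025-01-01');
upsertListing(records, { url: 'https://example.com/1', price: 1900 }, '2025-01-05');
```

Appending to the history instead of replacing the price keeps earlier values available for trend analysis.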
Technical Highlights
- Implements complex conditional logic with 50+ interconnected nodes
- Handles bilingual content (Chinese/English)
- Custom JavaScript functions for data transformation
- Rate-limited API calls with randomized delays
- Batch processing with split-and-merge patterns
- Robust error handling and retry mechanisms
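
One way to combine randomized delays with retries is sketched below. The function names, default delay bounds, and retry count are illustrative assumptions, not values taken from the workflow; each task is any function that returns a promise, such as one HTTP request.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Uniform random integer in [minMs, maxMs], so requests are not evenly spaced.
function randomDelayMs(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Run tasks one at a time with a random pause between them. A failed task
// is retried up to `retries` times, waiting longer before each new attempt.
async function runPolitely(tasks, { minMs = 1000, maxMs = 3000, retries = 2 } = {}) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(randomDelayMs(minMs, maxMs));
    let outcome = null;
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        outcome = { ok: true, value: await tasks[i]() };
        break;
      } catch (err) {
        outcome = { ok: false, error: String(err) };
        if (attempt < retries) {
          await sleep(randomDelayMs(minMs, maxMs) * (attempt + 1));
        }
      }
    }
    // Record failures instead of throwing so one bad page does not stop the run.
    results.push(outcome);
  }
  return results;
}
```

Collecting a per-task outcome, rather than rejecting on the first error, lets later steps decide how to handle listings that could not be fetched.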
Skills Demonstrated
Workflow Automation, JavaScript, Web Scraping, LLM Integration, Data Engineering, API Integration, Chinese Language Processing, Google Sheets API, Error Handling, Batch Processing, Real Estate Tech
