Shijun's portfolio

Data Scientist, Engineer, AI, Educator

    Shijun's Programming and Teaching Portfolio

Data Science, Educational Technology, Programming, Python

(Python) Web Scraping and Data Collection: Transforming Educational Information with Python

Project time: 2020

This project showcases my work in web scraping and data analysis. It uses Python and Scrapy, an open-source web crawling and scraping framework, to collect educational information and turn it into usable insights.

Project Overview:

This project revolves around the collection and organization of educational data, with a particular focus on UK universities. Using Python's Scrapy library, I harnessed the capabilities of web scraping to gather crucial information from various online sources, ultimately creating a comprehensive database of educational details.

Key Data Collection Areas:

  1. UK University Majors: The project involved scraping data related to the majors and academic programs offered by UK universities. This information is invaluable for prospective students looking to explore their academic options and make informed decisions about their educational paths.

  2. University News: Beyond simple data collection, I took this project a step further by organizing, summarizing, and translating university news into Chinese. This added layer of analysis and translation provides a more accessible and informative resource for Chinese-speaking audiences interested in UK universities.

  3. University Rankings: Another critical aspect of the project was the collection of university rankings. By aggregating data on rankings from reputable sources, I created a resource that helps students and educators understand the standing of UK universities in the academic landscape.
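The extraction step behind the collection areas above can be sketched roughly as follows. The project itself used Scrapy; to keep this example self-contained, the sketch swaps in the standard library's html.parser instead. The HTML snippet, the "course" class name, and the course titles are all invented for illustration, and real university pages will be structured differently.

```python
from html.parser import HTMLParser


class CourseListParser(HTMLParser):
    """Collect the text of every <li class="course"> element."""

    def __init__(self):
        super().__init__()
        self.courses = []
        self._in_course = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # "course" is a hypothetical class name, not taken from a real site.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "course" in classes:
            self._in_course = True
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a course list item.
        if self._in_course:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_course:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.courses.append(text)
            self._in_course = False


def extract_courses(html):
    """Return the cleaned course titles found in an HTML string."""
    parser = CourseListParser()
    parser.feed(html)
    parser.close()
    return parser.courses


# Made-up sample page fragment.
SAMPLE_HTML = """
<ul>
  <li class="course">BSc   Computer Science</li>
  <li class="course">MSc <b>Data Science</b></li>
  <li class="nav">Contact us</li>
</ul>
"""

courses = extract_courses(SAMPLE_HTML)
print(courses)
```

In a Scrapy spider, the same idea would normally be expressed with CSS or XPath selectors inside a parse callback; the whitespace cleanup step would stay much the same.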

Benefits and Insights:

This project offers several notable benefits and insights:

  • Comprehensive Information: The collected data provides a comprehensive overview of UK universities, their majors, and their rankings, making it a valuable resource for educational research and decision-making.
  • Multilingual Accessibility: By translating university news into Chinese, I aimed to bridge language barriers and make educational information more accessible to a global audience.
  • Data-Driven Decision-Making: The availability of university rankings allows users to make data-driven decisions when choosing educational institutions.
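One simple way to combine rankings gathered from several sources is to average each university's position across the sources that list it. The sketch below assumes that approach, which may not match what the project actually did; the source names, university names, and ranks are placeholders rather than real data.

```python
from collections import defaultdict
from statistics import mean


def aggregate_rankings(rankings_by_source):
    """Average each university's rank across the sources that list it.

    Returns (university, average_rank, number_of_sources) tuples,
    sorted by average rank and then by name to break ties.
    """
    positions = defaultdict(list)
    for ranking in rankings_by_source.values():
        for university, rank in ranking.items():
            positions[university].append(rank)

    averaged = [(name, mean(ranks), len(ranks)) for name, ranks in positions.items()]
    return sorted(averaged, key=lambda row: (row[1], row[0]))


# Placeholder data: two hypothetical ranking sources.
sample = {
    "source_a": {"University X": 1, "University Y": 2, "University Z": 3},
    "source_b": {"University X": 2, "University Y": 1},
}

table = aggregate_rankings(sample)
for name, avg_rank, n_sources in table:
    print(f"{name}: average rank {avg_rank} from {n_sources} source(s)")
```

Keeping the number of sources next to the average makes it visible when a university's position rests on a single ranking, which matters when users compare institutions.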

Technological Proficiency:

This project demonstrates my proficiency in Python and in web scraping with Scrapy, along with my ability to extract data from the web and organize it into actionable insights.

About Me

I’m Shijun Ju, currently living in Toronto. I’m a data analyst, educator and programmer.
