SHIJUN JU
jushijun@gmail.com | 647-649-8130 | North York, ON
shijunju.com | medium.com/@jushijun | github.com/shj37 | linkedin.com/in/shijunju
SUMMARY
Data Analyst, Data Scientist, and Data Engineer with hands-on experience in data analytics, machine learning, and AI. Proficient in Python, with a strong background in statistical modeling. Skilled in managing data analytics projects and developing practical solutions that support business decision-making. Has worked with AWS, Azure, and GCP across data science and analytics projects.
RELEVANT EXPERIENCE AND PROJECTS
Data Analytics Projects
Power BI Analysis of Forex Volatility
Power BI, DAX
Power BI Report, Medium Article, YouTube Video
- Analyzed EURUSD, USDJPY, and GBPUSD volatility using Dukascopy and Forex Factory data.
- Created an interactive dashboard to visualize trends and macro event impacts (e.g., NFP, FOMC).
- Used Polars, ArcticDB, Power Query and DAX for transformation and analysis.
- Narrated findings in a video with an AI voice-over.
Employee Churn Prediction Pipeline
Looker Studio, BigQuery, Python
Medium Article
- Built a churn prediction pipeline with BigQuery, PyCaret, and Looker Studio.
- Used Random Forest to identify turnover risks and visualized them in dashboards.
Predictive Modeling of LendingClub Loan Defaults
Python, AutoML
Kaggle Article
- Performed EDA to compare fully paid vs. charged-off loans.
- Developed an AutoML pipeline with LightAutoML.
- Highlighted key risk features using feature importance.
AI, MLOps, and LLMOps Projects
Course-Specific AI Study Assistant
Python, RAG with Pinecone, AWS, GitHub CI/CD, Docker
Medium Article, GitHub
- Built a RAG-based assistant for course materials.
- Used AWS, Docker, and GitHub CI/CD for deployment.
MLOps Automation: CI/CD for ML Deployment
Jenkins, Docker, AWS ECS
Medium Article, GitHub
- Created a Jenkins-based pipeline to deploy ML models using AWS ECS and Docker.
- Ensured quality via linting, testing, and security checks.
LLM Fine-tuning with PyTorch and Keras
KaggleX Fellowship
Presentation Video
- Fine-tuned LLMs (Gemma-2b/7b) for financial compliance Q&A using LoRA and QLoRA.
- Achieved 78.6% accuracy on a benchmark dataset.
Data Engineering Projects (More on Medium)
AWS
- Spark & ELT with EMR/S3/IAM, Redshift warehouse models with dbt, Airflow ETL into RDS, Snowflake ELT over S3.
Azure & Fabric
- Terraform CI/CD for Data Factory & Storage, Azure VM Spark clusters, medallion architecture ELT, real-time pipelines via Event Hubs, Functions, Databricks.
GCP
- Healthcare pipelines: Airflow, BigQuery, dbt, GitHub Actions.
TECHNICAL SKILLS
Programming Languages: Python, R, SQL, SAS, LaTeX, JavaScript
Data Analytics / AI Tools: Power BI (DAX), Looker Studio, AutoML (AutoGluon, LightAutoML, MLJAR), PyTorch, Keras, LoRA, QLoRA, RAG (LlamaIndex, LangChain, Pinecone)
MLOps / LLMOps: Kubernetes, Kubeflow, Jenkins, AWS ECS, Docker, MLflow, GitHub Actions
Big Data / Cloud:
- Azure: AI Services, Data Factory, Data Lake, Databricks, Synapse Analytics, OpenAI, Power BI
- AWS: EC2, ECR, Glue, Redshift, EMR, Lambda, IAM
- GCP: BigQuery
- Others: Spark, PySpark, Flink, dbt, Elasticsearch
EDUCATION
Ph.D. in Economics, University of Pittsburgh, USA (Nov 2015)
B.A. with Honorary M.A. in Economics, University of Edinburgh, Scotland (Jun 2006)
PROFESSIONAL CERTIFICATIONS & TRAINING
Graduate Certificate in Artificial Intelligence, Georgian College (Dec 2024)
Graduate Certificate in Marketing Analytics, Centennial College (May 2024)
Passed all three levels of the CFA exam
LANGUAGES
English (fluent), Mandarin Chinese (native)
HOBBIES & INTERESTS
- Exploring and applying new technologies, especially in edtech and AI
- Teaching: Economics, Math, Physics, Accounting, Business, Programming (college-level)