This project implements an automated MLOps pipeline for deploying a machine learning model, focusing on operational efficiency and security. It integrates Jenkins for CI/CD, Docker for containerization, and AWS ECS for scalable deployment, ensuring a repeatable and reliable workflow.
Project Overview
The pipeline automates the deployment of a Flask-based ML application, emphasizing Continuous Integration and Continuous Deployment (CI/CD) practices. A Docker-in-Docker setup enables Jenkins to build and manage containerized applications, while AWS ECS handles production-grade orchestration.
Key Components
- Technologies:
  - Jenkins: Automates CI/CD workflows.
  - Docker: Ensures consistent environments.
  - AWS ECS: Manages scalable container deployment.
  - Flask: Serves the ML model via a web interface.
  - Trivy: Scans for security vulnerabilities.
  - Pylint, Flake8, Black: Enforce Python code quality.
  - Pytest: Validates model and application functionality.
- Pipeline Stages:
  - Clone GitHub repository to retrieve the latest code.
  - Lint and test Python code using Pylint, Flake8, Black, and Pytest.
  - Scan filesystem and Docker image with Trivy for vulnerabilities.
  - Build and push Docker image to Docker Hub.
  - Deploy to AWS ECS using Fargate for serverless orchestration.
- Key Files:
  - train.py: Trains a simple Iris classification model.
  - app.py: Flask application for model serving.
  - Jenkinsfile: Defines the CI/CD pipeline.
  - Dockerfile: Containerizes the Flask app.
  - custom_jenkins/Dockerfile: Sets up Jenkins with Docker-in-Docker.
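As a rough illustration of the lint, test, scan, build, and push stages listed above, the following Python sketch assembles the command each stage might run, without executing anything. The `build_stage_commands` helper, the stage names, and the image name are hypothetical and not taken from the project; the actual Jenkinsfile may use different flags and ordering. The image scan is placed after the build here on the assumption that the image must exist locally before Trivy can scan it.

```python
import shlex


def build_stage_commands(image: str, source_dir: str = "."):
    """Return (stage name, argv) pairs in the order this sketch would run them.

    Hypothetical helper: it only describes the commands, it does not run them.
    """
    return [
        ("lint-pylint", ["pylint", "app.py", "train.py"]),
        ("lint-flake8", ["flake8", source_dir]),
        # --check makes Black report unformatted files instead of rewriting them.
        ("format-check", ["black", "--check", source_dir]),
        ("test", ["pytest"]),
        ("scan-filesystem", ["trivy", "fs", source_dir]),
        ("build-image", ["docker", "build", "-t", image, source_dir]),
        # Assumption: the image is scanned after it is built and before it is pushed.
        ("scan-image", ["trivy", "image", image]),
        ("push-image", ["docker", "push", image]),
    ]


if __name__ == "__main__":
    # "example/iris-flask:latest" is a placeholder image name.
    for stage, argv in build_stage_commands("example/iris-flask:latest"):
        print(f"{stage}: {shlex.join(argv)}")
```

Keeping the commands as data makes the stage order easy to review; in the real pipeline each entry would correspond to a shell step inside a Jenkinsfile stage.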
Implementation Details
- Jenkins Setup: Configured a custom Jenkins image with Docker-in-Docker to manage container builds within the pipeline.
- Security: Integrated Trivy for filesystem and image scans to identify vulnerabilities before deployment.
- AWS ECS: Deployed the containerized app using Fargate, with a load balancer for public access.
- Code Quality: Automated linting and formatting ensure maintainable, consistent code.
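To make the Fargate deployment more concrete, here is a minimal sketch of the kind of task definition ECS expects for a single-container Flask service. The `fargate_task_definition` helper, the family name, the image name, port 5000 (Flask's default), and the CPU and memory sizes are all assumptions for illustration, not values from the project. An execution role, logging configuration, and the load balancer wiring are omitted.

```python
import json


def fargate_task_definition(family: str, image: str, container_port: int = 5000,
                            cpu: str = "256", memory: str = "512") -> dict:
    """Build a task definition document for one container on Fargate.

    Hypothetical helper; the keys follow the ECS RegisterTaskDefinition request shape.
    """
    return {
        "family": family,
        "networkMode": "awsvpc",  # Fargate tasks must use awsvpc networking
        "requiresCompatibilities": ["FARGATE"],
        "cpu": cpu,        # task-level CPU units, passed as a string
        "memory": memory,  # task-level memory in MiB, passed as a string
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder names; replace with the real family and Docker Hub image.
    task_def = fargate_task_definition("iris-flask", "example/iris-flask:latest")
    print(json.dumps(task_def, indent=2))
```

The printed JSON could be saved to a file and registered with ECS, after which a service would run the task behind the load balancer.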
Outcomes
The project delivers a production-ready ML application through an automated pipeline, demonstrating proficiency in MLOps practices, containerization, and cloud deployment. The full pipeline is documented in a Medium article, and the source code is available on GitHub.
Link: Read the full article on Medium
Code: https://github.com/shj37/AWS-ECS-Jenkins-ML-CICD
All credit to iQuant for the original project design and code.
References
- YouTube: iQuant tutorial.
- GitHub: https://github.com/iQuantC/MLOps01