This repository demonstrates a complete MLOps pipeline using MLflow, DagsHub, DVC, and Evidently for comprehensive model lifecycle management, data versioning, and experiment tracking.
MLflow_demo/
│
├── data/
│   ├── data_loader.py
│   ├── raw/                    # Original dataset (DVC tracked)
│   ├── processed/              # Cleaned data (pipeline output)
│   └── drift_baseline/         # Drift detection reports
│
├── notebooks/
│   ├── 01_data_cleaning.ipynb
│   ├── 02_drift.ipynb
│   └── 03_model_training.ipynb
│
├── src/
│   ├── data_preprocessing.py   # Data cleaning pipeline
│   ├── drift_detection.py      # Evidently drift detection
│   ├── train.py                # Model training with MLflow
│   ├── evaluate.py             # Model evaluation
│   └── pipeline.py             # Complete end-to-end pipeline
│
├── dvc.yaml                    # DVC pipeline configuration
├── dvc.lock                    # Pipeline lock file
├── metrics.json                # Pipeline metrics output
├── requirements.txt
└── README.md
Clone the repository:
git clone https://github.com/yahiaehab10/MLFlow_demo.git
cd MLFlow_demo
Install dependencies:
pip install -r requirements.txt
Configure DagsHub authentication (for data push):
# Get your token from: https://dagshub.com/user/settings/tokens
# --local stores the token in .dvc/config.local, which is not committed to Git
dvc remote modify origin --local password <your-dagshub-token>
Option 1: DVC Pipeline (Recommended)
dvc repro # Runs the complete reproducible pipeline
Option 2: Direct Python Execution
python -m src.pipeline # Runs with MLflow logging to DagsHub
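python -m src.data_preprocessing # Runs the data cleaning step only
python -m src.train # Runs model training only
python -m src.drift_detection # Runs drift detection only
dvc dag # Shows the pipeline stage graph
dvc status # Shows which stages or data are out of date
dvc push # Uploads DVC-tracked data to the remote
dvc pull # Downloads DVC-tracked data from the remote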
All experiments are automatically tracked and logged to DagsHub. To browse runs locally, run
mlflow ui
and visit http://localhost:5000
The trained model is registered as IrisRandomForest in the MLflow Model Registry (stages: Staging, Production).
The pipeline includes comprehensive drift detection using Evidently.
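To make the idea of drift detection concrete, here is a minimal standard-library sketch that compares a reference sample of one feature with a current sample using the two-sample Kolmogorov-Smirnov statistic. This is not Evidently's API and not the code in src/drift_detection.py; the function names, the sample values, and the 0.3 threshold are assumptions chosen for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two numeric samples."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    # Both ECDFs are step functions, so the largest gap occurs at an observed value
    return max(abs(ecdf(ref, x) - ecdf(cur, x)) for x in ref + cur)


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, current) > threshold


# Illustrative values only, not taken from data/raw/iris.csv
baseline = [5.1, 4.9, 4.7, 5.0, 5.4, 4.6]
shifted = [6.3, 6.5, 6.1, 6.7, 6.4, 6.2]
print(has_drifted(baseline, baseline), has_drifted(baseline, shifted))
```

A library such as Evidently adds per-column test selection, p-values, and an HTML report on top of this kind of comparison, so the sketch is only meant to show what "drift" measures.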
data/raw/iris.csv - Version controlled with DVC
data/processed/iris_clean.csv - Pipeline output
data/drift_baseline/ - Drift analysis artifacts
The pipeline uses the following configuration:
DagsHub repository: yahiaehab10/MLFlow_demo
MLflow tracking URI: https://dagshub.com/yahiaehab10/MLFlow_demo.mlflow
Registered model name: IrisRandomForest
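The tracking URI above follows the pattern https://dagshub.com/<owner>/<repo>.mlflow. As a hedged sketch of how a script could point MLflow at it, the helper below builds that URI and sets the environment variables MLflow reads for its tracking server and HTTP basic auth (MLFLOW_TRACKING_URI, MLFLOW_TRACKING_USERNAME, MLFLOW_TRACKING_PASSWORD). The helper name and its parameters are hypothetical, and how src/pipeline.py actually configures tracking is not shown in this README.

```python
import os


def configure_tracking_env(owner, repo, username, token, env=None):
    """Set MLflow tracking variables for a DagsHub repo and return the URI."""
    # Default to the real process environment; pass a dict to try it safely
    env = os.environ if env is None else env
    uri = f"https://dagshub.com/{owner}/{repo}.mlflow"
    env["MLFLOW_TRACKING_URI"] = uri
    # DagsHub accepts the username plus an access token as basic-auth credentials
    env["MLFLOW_TRACKING_USERNAME"] = username
    env["MLFLOW_TRACKING_PASSWORD"] = token
    return uri


# Demonstration against a plain dict so no real credentials are touched
demo_env = {}
demo_uri = configure_tracking_env(
    "yahiaehab10", "MLFlow_demo", "<your-username>", "<your-dagshub-token>", demo_env
)
print(demo_uri)
```

Calling the helper without the env argument before importing the training code would make MLflow log to the DagsHub server instead of a local mlruns directory. Keep the token out of source control, for example by reading it from a variable set in your shell.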
# dvc.yaml
stages:
  full_pipeline:
    cmd: python -m src.pipeline
    deps:
      - data/raw/iris.csv
      - src/pipeline.py
      - src/data_preprocessing.py
      - src/train.py
      - src/drift_detection.py
    outs:
      - data/processed/iris_clean.csv
      - data/drift_baseline/iris_drift_baseline.html
    metrics:
      - metrics.json
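The metrics entry above means the stage is expected to produce metrics.json on every run. The following is a minimal sketch of how a pipeline script might write such a file as flat JSON; the function name and the metric keys are hypothetical, since the actual contents of metrics.json are not shown here.

```python
import json
import tempfile
from pathlib import Path


def write_metrics(metrics, path="metrics.json"):
    """Write a flat dict of numeric metrics as JSON and return the file path."""
    out = Path(path)
    # Round values so tiny numeric noise does not show up as a metrics change
    cleaned = {name: round(float(value), 4) for name, value in metrics.items()}
    # Sorted keys keep the file stable between runs, which keeps diffs readable
    out.write_text(json.dumps(cleaned, indent=2, sort_keys=True) + "\n")
    return out


# Demonstration in a temporary directory; the values are illustrative only
demo_dir = Path(tempfile.mkdtemp())
demo_file = write_metrics({"accuracy": 0.96666, "f1_macro": 0.9665}, demo_dir / "metrics.json")
print(demo_file.read_text())
```

After a run, dvc metrics show prints the values in the file, and dvc metrics diff compares them against another Git revision.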
DVC Push Authentication Error:
# --local stores the token in .dvc/config.local, which is not committed to Git
dvc remote modify origin --local password <your-dagshub-token>
MLflow Tracking URI Error:
Check that the tracking URI is set to https://dagshub.com/yahiaehab10/MLFlow_demo.mlflow
Pipeline Dependencies:
pip install -r requirements.txt
This project is open source and available under the MIT License.