This repo shows how to incorporate popular machine learning tools such as DVC, MLflow, and Hydra into a machine learning project, using my project on predicting aggressive tweets as the example.
Find the article on how to use MLflow and Hydra here
Find the article on how to use DVC here
DVC is a data version control tool. To install DVC, run
pip install dvc
With Hydra, you can compose your configuration dynamically. To install Hydra, simply run
pip install hydra-core --upgrade
MLflow is a platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment. Install MLflow with
pip install mlflow
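import mlflow
import hydra
from hydra import utils

# Hydra can change the working directory for each run, so build the tracking URI
# from the original working directory. Call this inside the @hydra.main function.
mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')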
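The tracking URI above is built by string concatenation, which works for simple POSIX paths. As a minimal sketch, assuming you would rather let the standard library handle spaces and Windows drive letters, the same URI can be built with `pathlib`. The helper name `tracking_uri` is hypothetical and is not part of this repo, MLflow, or Hydra.

```python
from pathlib import Path


def tracking_uri(original_cwd: str) -> str:
    """Return a file:// URI for the mlruns folder under original_cwd.

    `tracking_uri` is a hypothetical helper name used for illustration only.
    """
    # as_uri() requires an absolute path, so resolve() the input first.
    return (Path(original_cwd).resolve() / "mlruns").as_uri()


# Possible usage inside a function decorated with @hydra.main:
#   mlflow.set_tracking_uri(tracking_uri(utils.get_original_cwd()))
```

On a POSIX system with a plain path this should produce the same `file:///.../mlruns` string as the concatenation, while also percent-encoding characters such as spaces.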
src/preprocessing.py: file for preprocessing
src/train_pipeline.py: training's pipeline
src/train.py: file for training and saving model
src/predict.py: file for prediction and loading model
Pull the data from Google Drive
dvc pull
To run the configs and see how these experiments are displayed on MLflow's server, clone this repo and run
python src/train.py
Once the run is completed, start the MLflow UI with
mlflow ui
Run this command from the same directory in which you ran the training script, then open http://localhost:5000/ in your browser. You should be able to see your experiments there.
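What appears in the UI depends on what the training script logs. One common pattern when combining Hydra and MLflow, which may or may not match what `src/train.py` does, is to flatten the nested configuration into dotted keys and log it as run parameters. The sketch below assumes the config has already been converted to a plain `dict` (for a Hydra config, for example with `OmegaConf.to_container(cfg, resolve=True)`). The names `flatten_config` and `log_config` are hypothetical.

```python
from typing import Any, Dict


def flatten_config(cfg: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into dotted keys, e.g. {"model": {"C": 1}} -> {"model.C": 1}."""
    flat: Dict[str, Any] = {}
    for key, value in cfg.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the dotted prefix along.
            flat.update(flatten_config(value, name))
        else:
            flat[name] = value
    return flat


def log_config(cfg: Dict[str, Any]) -> None:
    """Log every config value as an MLflow parameter (hypothetical helper)."""
    # Imported lazily so flatten_config can be used without MLflow installed.
    import mlflow

    mlflow.log_params(flatten_config(cfg))
```

Calling `log_config` inside an active MLflow run would make each config value show up as a parameter column in the experiment table, which makes runs with different Hydra overrides easier to compare.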