
README.md


Model Performance Report

Project Repository: Winemaker's Dilemma - Predictive Modeling

Tools and Infrastructure: The project uses Python for data preprocessing and modeling, with Pandas and NumPy for data handling and Scikit-learn for machine learning. DVC handles data versioning, and DagsHub hosts the repository and tracks progress. Future infrastructure needs might include a deployment platform, monitoring tools, and MLflow for managing the machine learning lifecycle, including experimentation, reproducibility, and deployment.

Model Quality: Three machine learning models were trained: Logistic Regression, Random Forest, and Gradient Boosting. Each was evaluated on accuracy, precision, and recall. Logistic Regression proved to be the most balanced of the three, with moderate accuracy and high precision but lower recall. Random Forest and Gradient Boosting showed more varied performance, each trading precision against recall.
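
As a reminder of what the three evaluation metrics measure, here is a minimal sketch in plain Python. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the project's pipeline, which presumably relies on Scikit-learn's `accuracy_score`, `precision_score`, and `recall_score` instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / len(pairs),
        # Share of predicted positives that were real positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels for illustration only: 1 = storm, 0 = no storm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this made-up case every predicted storm is real (precision 1.0) but half of the actual storms are missed (recall 0.5), which is the same high-precision, lower-recall pattern described for Logistic Regression.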

Trade-offs: Logistic Regression is less computationally intensive and more interpretable, but its lower recall means it may miss some storm occurrences. Random Forest and Gradient Boosting are more complex: they can offer richer insights, at the cost of more computational resources and a higher risk of overfitting.

Deployment and Monitoring Recommendations: For deployment, a lightweight framework such as Flask could serve the model via an API, with Docker for containerization to ensure consistency across environments. Monitoring could combine Prometheus for system metrics with custom logging for model performance metrics. It is vital to set up alerts for model drift and to retrain the model periodically with fresh data.
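
One simple way a drift alert could work is to compare recall on recently labeled data against the recall measured at training time. The sketch below assumes ground-truth labels become available after the fact. The function name `recall_drift_alert`, the baseline value, and the tolerance are all hypothetical choices for illustration, not values from this project.

```python
def recall_drift_alert(baseline_recall, recent_true, recent_pred,
                       tolerance=0.10, positive=1):
    """Return True when recent recall is more than `tolerance` below baseline."""
    # Keep only the cases where the positive event actually occurred.
    positives = [(t, p) for t, p in zip(recent_true, recent_pred) if t == positive]
    if not positives:
        # Recall is undefined without positive cases, so raise no alert.
        return False
    recent_recall = sum(1 for _, p in positives if p == positive) / len(positives)
    return (baseline_recall - recent_recall) > tolerance


# Hypothetical monitoring window: 1 = storm, 0 = no storm.
baseline = 0.60  # assumed recall from the original evaluation
recent_true = [1, 1, 1, 1, 0, 0]

degraded = recall_drift_alert(baseline, recent_true, [1, 0, 0, 0, 0, 0])
stable = recall_drift_alert(baseline, recent_true, [1, 1, 1, 0, 0, 0])
print(degraded, stable)
```

In a real deployment the boolean would feed whatever alerting channel is in place, and a triggered alert would be the cue to retrain on fresh data.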

