
🌦️ Weather Disease Prediction

👤 Author

Project Description

This project aims to predict the likelihood of weather-sensitive diseases using machine learning. By analyzing historical climate and health records, it provides early warnings for disease outbreaks, empowering public health systems to respond proactively.

🧩 Problem Statement

Weather patterns influence the prevalence and spread of many diseases such as asthma, flu, and other respiratory conditions. The challenge is to build a robust prediction system that can:

  • Accurately classify disease categories based on environmental conditions.
  • Offer explainability of the predictions for healthcare stakeholders.
  • Generalize well to unseen data from other regions or time periods.

🗃️ Dataset Overview

The dataset contains features related to weather and environmental measurements along with disease labels. Typical columns include: Age, Gender, Temperature (C), Humidity, Wind Speed (km/h), nausea, joint_pain, abdominal_pain, high_fever, chills,...,back_pain, knee_ache.

Dataset Source: Additional info about the data can be found on Kaggle.

🧠 Features

  • End-to-end scikit-learn pipeline
  • Hyperparameter optimization using Hyperopt
  • Multiclass classification support
  • Evaluation metrics and plots
  • Feature importance for model interpretability
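As a rough illustration of the first and last items, the sketch below chains preprocessing and a tree-based classifier in one scikit-learn `Pipeline`, then reads `feature_importances_` from the fitted model. It is not the project's actual pipeline: the data is synthetic, only a handful of the column names from the dataset overview are used, and the scaler and `RandomForestClassifier` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: the names echo the dataset overview, but the
# values and the three class labels are made up for illustration.
feature_names = ["Age", "Temperature (C)", "Humidity",
                 "Wind Speed (km/h)", "high_fever", "chills"]
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(feature_names)))
# The label depends only on two columns, so the model has a signal to find.
y = np.digitize(X[:, 1] + X[:, 4], bins=[-0.5, 0.5])  # classes 0, 1, 2

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model live in one object, so the same steps run at
# training time and at prediction time.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipeline.fit(X_train, y_train)
test_accuracy = pipeline.score(X_test, y_test)

# Tree ensembles expose feature_importances_ after fitting.
importances = dict(
    zip(feature_names, pipeline.named_steps["model"].feature_importances_)
)
top_features = sorted(importances, key=importances.get, reverse=True)[:2]
```

Because the synthetic label is built from `Temperature (C)` and `high_fever`, those two columns should rank highest in `importances`, which is a quick sanity check that the importance scores behave as expected.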

📊 Evaluation Outputs

| Metric    | Description                           |
|-----------|---------------------------------------|
| Accuracy  | Overall correct predictions           |
| Precision | Correctness among positive predictions |
| Recall    | Coverage of actual positives          |
| F1-Score  | Harmonic mean of precision and recall |
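To make these definitions concrete for the multiclass case, here is a small standard-library sketch that computes accuracy plus macro-averaged precision, recall, and F1. Macro averaging is an assumption, since the README does not say which averaging the project uses, and the label `cold` and the example predictions are invented.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Invented labels and predictions, for illustration only.
y_true = ["flu", "flu", "asthma", "asthma", "cold", "cold"]
y_pred = ["flu", "asthma", "asthma", "asthma", "cold", "flu"]
scores = macro_metrics(y_true, y_pred)
```

Here four of six predictions are correct, so accuracy is 4/6; the per-class scores are averaged with equal weight, which keeps rare disease classes from being drowned out by common ones.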

✅ Requirements

  • Python 3.10+
  • pandas
  • scikit-learn
  • matplotlib
  • seaborn
  • numpy
  • pickle (part of the Python standard library)
  • hyperopt
  • prefect
  • evidently

🚀 Getting Started

1. Clone the repository

git clone https://github.com/Danselem/weather-health.git
cd weather-health

The project uses a Makefile and Astral uv. See the Astral uv documentation for details on the package and how to install it.

2. Create and activate a virtual environment

To create and activate an environment:

make init

3. Install dependencies

make install

4. Set up MLflow server

There are two options for setting up MLflow:

  1. Use AWS EC2 and S3. Ensure Terraform is installed and that your AWS credentials are configured with aws configure. Then cd infra and follow the instructions there for a complete setup of the AWS resources, including EC2, RDS, S3, Kinesis, and Lambda.

  2. Use DagsHub. Sign up at DagsHub, obtain an API key, and create a project repo. Then run the following command to create a .env file:

make env

Next, fill the .env file with the right information.
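Before moving on, it can help to confirm that no entry in the file was left blank. The sketch below parses simple `KEY=VALUE` lines with the standard library and lists any empty values. The key names `EXAMPLE_USER` and `EXAMPLE_TOKEN` are placeholders, not the project's real variable names, and the project itself may load the file differently (for example with a dotenv library).

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder contents; the real keys are whatever `make env` generates.
example = """
# DagsHub credentials (placeholder names and values)
EXAMPLE_USER=your-username
EXAMPLE_TOKEN="your-api-key"
"""

settings = parse_env(example)
missing = [key for key, value in settings.items() if not value]
```

To check a real file, pass `Path(".env").read_text()` to `parse_env` and make sure `missing` comes back empty.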

5. Start the orchestrator.

This project uses Prefect for running the ML pipeline. To start the prefect server, run the command:

make prefect

This starts a Prefect server at http://127.0.0.1:4200.

6. Run the ML pipeline

To run the pipeline:

make pipeline

This loads the data, cleans it, transforms it, and starts the hyperparameter tuning. It also logs the ML experiments to DagsHub.

[Figure: Prefect modeling pipeline]

[Figure: ML experiments logged in DagsHub]

All experiments run for this project can be accessed here.

7. Fetch and serve the best model

make fetch-best-model

The above command fetches the best model from the DagsHub MLflow server and saves it in the models directory. With this, we are ready to serve the model.
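Conceptually, this step picks the top-scoring run and stores its artifact locally. The sketch below shows that idea with the standard library only. It assumes "best" means the highest F1 score and that the artifact is pickled; the run IDs, scores, and file name are hypothetical, and the project's real command talks to the tracking server instead of a hard-coded list.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical run records; real ones would come from the tracking server.
runs = [
    {"run_id": "run-a", "f1": 0.81, "params": {"max_depth": 4}},
    {"run_id": "run-b", "f1": 0.88, "params": {"max_depth": 6}},
    {"run_id": "run-c", "f1": 0.84, "params": {"max_depth": 8}},
]


def pick_best(run_records, metric="f1"):
    """Return the run with the highest value for the given metric."""
    return max(run_records, key=lambda run: run[metric])


def save_artifact(obj, path):
    """Pickle obj to path, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(obj, handle)


def load_artifact(path):
    """Load a pickled object back from path."""
    with path.open("rb") as handle:
        return pickle.load(handle)


best_run = pick_best(runs)
with tempfile.TemporaryDirectory() as tmp:
    # "best_model.pkl" is a made-up file name for this example.
    artifact_path = Path(tmp) / "models" / "best_model.pkl"
    save_artifact(best_run, artifact_path)
    restored = load_artifact(artifact_path)
```

Only unpickle files you produced yourself or otherwise trust, since loading a pickle can execute arbitrary code.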

Generate sample data for testing the serve service.

make sample
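For readers who want to see what such test input might look like, here is a hedged sketch that builds a few random records using the columns named in the dataset overview. The value ranges, the 0/1 encoding for `Gender` and the symptom flags, and the JSON output are all assumptions; columns elided in the overview are not filled in, and the project's own sample generator may differ.

```python
import json
import random

# Symptom columns named in the dataset overview (the elided ones are omitted).
SYMPTOMS = ["nausea", "joint_pain", "abdominal_pain", "high_fever",
            "chills", "back_pain", "knee_ache"]


def make_sample(rng):
    """Build one record with plausible, made-up values."""
    record = {
        "Age": rng.randint(1, 90),
        "Gender": rng.randint(0, 1),  # assumed binary encoding
        "Temperature (C)": round(rng.uniform(-5.0, 40.0), 1),
        "Humidity": round(rng.uniform(0.2, 1.0), 2),
        "Wind Speed (km/h)": round(rng.uniform(0.0, 30.0), 1),
    }
    # Symptoms are treated as 0/1 flags in this sketch.
    record.update({symptom: rng.randint(0, 1) for symptom in SYMPTOMS})
    return record


rng = random.Random(42)  # fixed seed so the samples are reproducible
samples = [make_sample(rng) for _ in range(3)]
payload = json.dumps(samples, indent=2)
```

The resulting `payload` string could be written to a file or posted to a prediction endpoint, depending on how the serve script expects its input.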

Test the local deployment

make serve_local

Test the [Docker](/Dockerfile) deployment

Build the Docker image

make build

Start the docker container

make run

Then test the serve script:

make serve

8. Monitoring

A simulated inference was performed in this project for testing observability with Evidently. See the observability directory.

Start the containers

make start-monitoring

This starts a Docker Compose stack with Postgres, Adminer, and Grafana.

Adminer can be accessed at http://127.0.0.1:8080. Grafana can be accessed at http://127.0.0.1:3000.
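If a page does not load, a quick way to see which local services are actually listening is to try a TCP connection to each port. The sketch below uses only the standard library and checks ports 4200 (Prefect), 8080 (Adminer), and 3000 (Grafana) on 127.0.0.1. It only confirms that something accepts connections on the port, not that the service behind it is healthy.

```python
import socket

# Ports taken from this README; adjust if your setup maps them differently.
SERVICES = {"Prefect": 4200, "Adminer": 8080, "Grafana": 3000}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in SERVICES.items()}


status = check_services()
for name, is_up in status.items():
    print(f"{name}: {'listening' if is_up else 'not reachable'}")
```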

Simulate the inference with the command:

make observe

For example:

[Figure: Adminer]

[Figure: Grafana]

🧪 Testing

To test your setup, or to run unit tests you have added:

pytest tests/

📌 Notes

  • Label encoding is required for correct ROC/metric computation.
  • Only models with .feature_importances_ are supported for feature explanation.
  • SHAP and PDP (partial dependence plots) are excluded for simplicity and clarity.
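To illustrate the first note, the sketch below maps string disease labels to integer codes and back using plain Python. It mirrors the convention of scikit-learn's `LabelEncoder`, which assigns codes 0..n-1 to the sorted class names, but it is a stand-in for illustration and not the project's code. The label `cold` is invented.

```python
def fit_label_encoder(labels):
    """Return the sorted class list and a label-to-code mapping."""
    classes = sorted(set(labels))
    mapping = {label: code for code, label in enumerate(classes)}
    return classes, mapping


def encode(labels, mapping):
    """Convert string labels to integer codes."""
    return [mapping[label] for label in labels]


def decode(codes, classes):
    """Convert integer codes back to string labels."""
    return [classes[code] for code in codes]


# Invented labels, for illustration only.
raw_labels = ["flu", "asthma", "flu", "cold"]
classes, mapping = fit_label_encoder(raw_labels)
codes = encode(raw_labels, mapping)
round_trip = decode(codes, classes)
```

Fitting the mapping once on the training labels and reusing it for predictions keeps the codes consistent, so metrics compare like with like. For the second note, `hasattr(model, "feature_importances_")` is a simple way to check whether a fitted model supports feature explanation.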

📜 License

This project is licensed under the MIT License.


🙋🏽‍♀️ Contact

Created by Daniel Egbo. Feel free to reach out with questions, issues, or suggestions.
