
MIIA Pothole Image Classification Challenge

This knowledge challenge was designed for Leaderex and AI Expo, taking place on 3 and 4 September 2019, respectively. This competition is open to anyone and will remain open to allow the Zindi community to learn and test their skills.

Potholes have become a huge problem for most drivers. The South African government has spent over R22 billion on pothole repair programs over the past 3 years, and the Automobile Association (AA) attributes more than 5% of road deaths to unmaintained road structure (potholes).

The objective of this challenge is to create a machine learning model to accurately predict the likelihood that an image contains a pothole.

Evaluation

The evaluation metric for this challenge is the Area Under the Curve.

The label is the likelihood that the image contains a pothole. Values can be from 0 to 1.
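As a rough illustration of the metric, the sketch below computes the area under the curve in plain Python, assuming the curve is the ROC curve (the usual meaning of AUC for this kind of challenge). The function name `roc_auc` is hypothetical and is not part of this repository.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: sequence of 0/1 ground-truth values (1 = image contains a pothole).
    scores: sequence of predicted likelihoods, one per label.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Sum the 1-based ranks of the positives; tied scores share their average rank.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because the score depends only on how the predictions are ranked, any monotonic rescaling of the likelihoods leaves it unchanged.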

Your submission file should look like:

id                   label
AEJGkTGsvGnwBVQ       .543254
AEPaZSgFfneYkLS       0
AEjDuKGztTuzjDC       1
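A minimal sketch of writing a file in this shape is shown below. It assumes the submission is a comma-separated file with an `id,label` header; the helper name `write_submission` is hypothetical and the ids are the ones from the example above.

```python
import csv
import io


def write_submission(predictions, out):
    """Write (image_id, likelihood) pairs as CSV rows under an id,label header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "label"])
    for image_id, likelihood in predictions:
        # Keep every label inside the allowed range of 0 to 1.
        likelihood = min(1.0, max(0.0, float(likelihood)))
        writer.writerow([image_id, likelihood])


# Write to an in-memory buffer here; pass an open file object to write to disk.
buffer = io.StringIO()
write_submission(
    [("AEJGkTGsvGnwBVQ", 0.543254), ("AEPaZSgFfneYkLS", 0), ("AEjDuKGztTuzjDC", 1)],
    buffer,
)
print(buffer.getvalue())
```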

Folder Structure

.
├── config
│   └── config.yaml
├── Dockerfile
├── dvc.yaml
├── flaskapp.py
├── logs
│   └── logfile.log
├── main.py
├── mlflow.py
├── params.yaml
├── Procfile
├── __pycache__
│   └── mlflow.cpython-38.pyc
├── README.md
├── requirements.txt
├── research
│   ├── 01_data_ingestion.ipynb
│   ├── 02_prepare_base_model.ipynb
│   ├── 03_model_training.ipynb
│   ├── 04_model_evaluation.ipynb
│   └── trials.ipynb
├── scores.json
├── setup.py
├── setup.sh
├── src
│   ├── potholeClassifier
│   │   ├── components
│   │   │   ├── data_ingestion.py
│   │   │   ├── __init__.py
│   │   │   ├── model_evaluation.py
│   │   │   ├── model_training.py
│   │   │   └── prepare_base_model.py
│   │   ├── config
│   │   │   ├── configuration.py
│   │   │   └── __init__.py
│   │   ├── constants
│   │   │   ├── __init__.py
│   │   │   └── __pycache__
│   │   │       └── __init__.cpython-38.pyc
│   │   ├── entity
│   │   │   ├── config_entity.py
│   │   │   └── __init__.py
│   │   ├── __init__.py
│   │   ├── pipeline
│   │   │   ├── __init__.py
│   │   │   ├── stage_01_data_ingestion.py
│   │   │   ├── stage_02_prepare_base_model.py
│   │   │   ├── stage_03_model_training.py
│   │   │   ├── stage_04_model_evaluation.py
│   │   │   └── stage_05_prediction.py
│   │   ├── __pycache__
│   │   │   └── __init__.cpython-38.pyc
│   │   └── utils
│   │       ├── common.py
│   │       ├── _init__.py
│   │       └── __pycache__
│   │           └── common.cpython-38.pyc
│   └── potholeClassifier.egg-info
│       ├── dependency_links.txt
│       ├── PKG-INFO
│       ├── SOURCES.txt
│       └── top_level.txt
├── static
│   ├── CNN workings.png
│   └── mlops.png
├── template.py
└── templates
    └── index.html

Pipeline

This project uses the MLOps Level 0: Manual Process.

Methodology

We developed a custom CNN (1 input layer, 3 Conv2D layers, a flatten layer, a dense layer and a softmax layer) for this task. We used keras-tuner to search for the hyperparameters that gave the best validation loss. Given compute constraints, the lowest validation loss was 0.04 and the accuracy achieved was 97%.

How to install

Clone the repository

git clone https://github.com/MartinKalema/MIIA-Pothole-Image-classification.git

Create a conda environment after opening the repository and activate it

conda create -n classifier python=3.8 -y
conda activate classifier

Install the requirements

pip install -r requirements.txt

This project is connected to DagsHub, so all experiments are sent to DagsHub and can be viewed on DagsHub itself or in the MLflow interface integrated there.

MLflow is a production-grade experiment tracker for managing the end-to-end machine learning lifecycle. It helps with experiment tracking, packaging code into reproducible runs, and sharing and deploying models.

Local Experiment tracking

To track experiments locally, do not set the tracking URI with the line of code below. All experiments will then be stored inside an auto-generated folder called mlruns.

mlflow.set_tracking_uri()

Use the command below to view the experiments in the MLflow web interface.

mlflow ui

Online Experiment tracking with Dagshub

Connect your GitHub project to your DagsHub account. Visit DagsHub at https://dagshub.com

Add the tracking URI, username and password (your own DagsHub access token) to your environment by running the commands below.

export MLFLOW_TRACKING_URI=https://dagshub.com/kalema3502/MIIA-Pothole-Image-classification.mlflow
export MLFLOW_TRACKING_USERNAME=kalema3502
export MLFLOW_TRACKING_PASSWORD=your_dagshub_access_token

If these are not added to your working environment, the experiment data will be stored in your project root in a folder called mlruns.
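The rule above can be checked before starting a run. The sketch below does not call MLflow; it only inspects the three environment variables named above and reports where the experiment data would end up. The helper name `tracking_target` is hypothetical.

```python
import os


def tracking_target(env=None):
    """Describe where experiment data will go, based on the environment."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    if not uri:
        # No tracking URI set: runs are stored locally in the mlruns folder.
        return "local: ./mlruns"
    has_credentials = bool(env.get("MLFLOW_TRACKING_USERNAME")) and bool(
        env.get("MLFLOW_TRACKING_PASSWORD")
    )
    suffix = "" if has_credentials else " (credentials missing)"
    return "remote: " + uri + suffix


print(tracking_target({}))  # local: ./mlruns
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.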

DVC (Data Version Control) Setup

DVC is the tool we used to automate our pipeline. Initialize DVC inside your project using the command below.

dvc init

Add the pipeline stages to the dvc.yaml file, then run the command below to execute the pipeline.

dvc repro

To view the pipeline structure as a graph of stages, use the command below.

dvc dag
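DVC links two stages when an output of one is a dependency of the other, and that is the graph the command above displays. The sketch below reproduces the idea in plain Python. The stage names follow the pipeline modules in the folder structure, but every `deps` and `outs` path except config/config.yaml, params.yaml and scores.json is a hypothetical placeholder, not the content of this project's dvc.yaml.

```python
def stage_order(stages):
    """Return stage names so that each stage comes after the stages it depends on."""
    # Map every output path to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # For each stage, collect the stages whose outputs it consumes.
    upstream = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    order = []
    remaining = dict(upstream)
    while remaining:
        done = set(order)
        ready = sorted(name for name, ups in remaining.items() if ups <= done)
        if not ready:
            raise ValueError("stages form a cycle")
        for name in ready:
            order.append(name)
            del remaining[name]
    return order


# Hypothetical stage definitions, for illustration only.
stages = {
    "model_evaluation": {
        "deps": ["artifacts/data_ingestion", "artifacts/training/model.h5"],
        "outs": ["scores.json"],
    },
    "model_training": {
        "deps": ["artifacts/data_ingestion", "artifacts/prepare_base_model"],
        "outs": ["artifacts/training/model.h5"],
    },
    "prepare_base_model": {
        "deps": ["config/config.yaml", "params.yaml"],
        "outs": ["artifacts/prepare_base_model"],
    },
    "data_ingestion": {
        "deps": ["config/config.yaml"],
        "outs": ["artifacts/data_ingestion"],
    },
}

print(stage_order(stages))
```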

Tests

To run the unit tests, use the command below, replacing test_file_name.py with the test file you want to run.

pytest test/test_file_name.py

AWS CI/CD Deployment with GitHub Actions

  • Log in to the AWS console

  • Create an IAM user for deployment

  • The user should have EC2 and ECR access. The deployment steps are:

    1. Build the docker image of the source code
    2. Push the docker image to ECR
    3. Launch an EC2 instance
    4. Pull the image from ECR onto the EC2 instance
    5. Run the docker image on the EC2 instance
  • Policies to attach to the IAM user

    1. Amazon EC2 Container Registry Access
    2. Amazon EC2 Full Access
  • Create an ECR repo to store the docker image

    save the URI: 566373416292.dkr.ecr.us-east-1.amazonaws.com/chicken
    
  • Create Ubuntu EC2 VM

  • Install Docker on the EC2 Machine

        #optional
    
        sudo apt-get update -y
    
        sudo apt-get upgrade
    
        #required
    
        curl -fsSL https://get.docker.com -o get-docker.sh
    
        sudo sh get-docker.sh
    
        sudo usermod -aG docker ubuntu
    
        newgrp docker
    
  • Configure EC2 as self-hosted runner

    In the GitHub repository: Settings > Actions > Runners > New self-hosted runner > choose OS >
    
  • Setup github secrets

        AWS_ACCESS_KEY_ID=
    
        AWS_SECRET_ACCESS_KEY=
    
        AWS_REGION = us-east-1
    
        AWS_ECR_LOGIN_URI = demo >>  566373416292.dkr.ecr.us-east-1.amazonaws.com
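
The build, push, pull and run steps listed above can be sketched as the shell commands a workflow would run. The helper below only assembles the command strings and executes nothing. The function name `deployment_commands`, the `latest` tag and port 8080 are assumptions for illustration and are not taken from this project's workflow file.

```python
def deployment_commands(ecr_login_uri, repository, region, tag="latest", port=8080):
    """Assemble the shell commands for the build, push, pull and run steps."""
    image = f"{ecr_login_uri}/{repository}:{tag}"
    # Docker must be logged in to ECR before a push (and before a pull on the EC2 instance).
    login = (
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {ecr_login_uri}"
    )
    return [
        f"docker build -t {image} .",               # step 1: build the image
        login,                                      # authenticate against ECR
        f"docker push {image}",                     # step 2: push to ECR
        f"docker pull {image}",                     # step 4: pull on the EC2 instance
        f"docker run -d -p {port}:{port} {image}",  # step 5: run the container
    ]


for command in deployment_commands(
    "566373416292.dkr.ecr.us-east-1.amazonaws.com", "chicken", "us-east-1"
):
    print(command)
```

Step 3, launching the EC2 instance, is done in the AWS console and has no command here.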
    