
CHEST DISEASE CLASSIFICATION USING CT-SCAN

Overview

This project aims to develop an AI model capable of classifying and diagnosing chest cancer, with a specific focus on adenocarcinoma, the most prevalent form of lung cancer. Leveraging deep learning techniques, particularly Convolutional Neural Networks (CNNs), the model utilizes the pretrained VGG-16 architecture to analyze medical images for cancer detection. The primary objective is to assist healthcare professionals in achieving early and accurate diagnoses, ultimately leading to improved patient outcomes.

Model Architecture

The VGG-16 model is a deep convolutional neural network renowned for its effectiveness in image classification tasks. Pretrained on large-scale image datasets such as ImageNet, VGG-16 possesses a deep network architecture consisting of 16 layers, including convolutional layers with small 3x3 filters and max-pooling layers. By fine-tuning the pretrained VGG-16 model on chest cancer images, we aim to capitalize on its robust feature extraction capabilities for accurate cancer classification.

Usage

Healthcare professionals can utilize the developed AI model as a supplementary tool in chest cancer diagnosis. By inputting medical images into the pretrained VGG-16 model, clinicians can receive automated predictions regarding the presence of adenocarcinoma, facilitating timely intervention and treatment planning.

Benefits

  • Early Detection: The AI model enables early detection of chest cancer, particularly adenocarcinoma, which is crucial for improving patient prognosis.
  • Accuracy: By leveraging deep learning techniques and pretrained models like VGG-16, the model achieves high levels of accuracy in cancer classification.
  • Efficiency: Automated classification of medical images streamlines the diagnostic process, allowing healthcare professionals to focus on patient care and treatment decisions.

MLOps Implemented

  • MLflow: experiment tracking
  • DVC (Data Version Control): pipeline tracking

Deployment

Using Jenkins with AWS EC2 and ECR


Project Implementation

  • Create a local repository and connect it to GitHub
  • Then create the README.md, .gitignore, and LICENSE files
  • Then create the template.py file
  • Create and activate the virtual environment
  • Constants are written in YAML files instead of being hard-coded
  • Edit the requirements.txt and setup.py files
  • python-box is also used to manage exceptions
  • Custom logging is written in the constructor file of src/cnnclassifier, so there is no need for a separate logger folder (first approach)
  • Alternatively, create a logging folder, add a constructor file inside it, and write the logging code there (second approach)
  • Create a common.py file inside the utils → code
  • ConfigBox: makes configuration entries easily callable as attributes (refer to trails.ipynb and the sketch below)
  • ensure_annotations: keeps functions bug free by enforcing type annotations (refer to trails.ipynb)
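A minimal sketch of how these two helpers are typically used (assuming the python-box and ensure packages; the actual code lives in trails.ipynb and utils/common.py):

from box import ConfigBox
from ensure import ensure_annotations

# ConfigBox lets YAML-style dictionaries be read with attribute access
cfg = ConfigBox({"model": {"name": "vgg16", "classes": 2}})
print(cfg.model.name)             # "vgg16" instead of cfg["model"]["name"]

# ensure_annotations raises an error when an argument violates its type hint
@ensure_annotations
def get_message(name: str) -> str:
    return f"hello {name}"

get_message("chest-classifier")   # fine
# get_message(123)                # would fail fast instead of causing a silent bug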

Workflows

  • Update the config.yaml
  • Update params.yaml
  • Update the entity
  • Update the configuration manager in src config (see the sketch after this list)
  • Update the components
  • Update the pipeline
  • Update the main.py
  • Update the dvc.yaml
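A hedged sketch of how the entity and the configuration manager fit together; the config keys and attribute names below are illustrative, not copied from the repo:

from dataclasses import dataclass
from pathlib import Path
import yaml
from box import ConfigBox

@dataclass(frozen=True)
class DataIngestionConfig:        # the "entity": the return type used by the pipeline
    root_dir: Path
    source_url: str
    local_data_file: Path
    unzip_dir: Path

class ConfigurationManager:
    def __init__(self, config_path: Path = Path("config/config.yaml")):
        with open(config_path) as f:
            self.config = ConfigBox(yaml.safe_load(f))   # values come from config.yaml

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        c = self.config.data_ingestion                   # key name is illustrative
        return DataIngestionConfig(
            root_dir=Path(c.root_dir),
            source_url=c.source_url,
            local_data_file=Path(c.local_data_file),
            unzip_dir=Path(c.unzip_dir),
        )

The component receives this entity, the pipeline stage wires the two together, and main.py and dvc.yaml then call the stage.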

MLflow → Experiment Tracking

Example:

ElasticNet with parameters (alpha, l1_ratio):
  • exp 1: alpha=0.7, l1_ratio=0.9 → 70%
  • exp 2: alpha=0.5, l1_ratio=0.5 → 80%
  • exp 3: alpha=0.4, l1_ratio=0.6 → 50%
Instead of recording Exp 1, Exp 2, and Exp 3 by hand in a CSV, MLflow tracks each experiment's parameters and metrics automatically.
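A minimal MLflow sketch of this idea, using the illustrative numbers from the example above rather than real results:

import mlflow

# illustrative (alpha, l1_ratio) -> accuracy results from the example above
experiments = [((0.7, 0.9), 0.70), ((0.5, 0.5), 0.80), ((0.4, 0.6), 0.50)]

for (alpha, l1_ratio), accuracy in experiments:
    with mlflow.start_run():
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("accuracy", accuracy)

Each run appears in the MLflow UI, so the three experiments can be compared side by side instead of being collected by hand.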

DagsHub

  1. Create a repo in GitHub
  2. Go to DagsHub
  3. Create a new repository
  4. Choose "Connect to a repo"
  5. Select GitHub
  6. Connect to the repository

Copy the experiment tracking snippet from DagsHub into the README file in GitHub:

MLFLOW_TRACKING_URI=https://dagshub.com/-------/mlflow_demo.mlflow \
MLFLOW_TRACKING_USERNAME=------ \
MLFLOW_TRACKING_PASSWORD=------3a547cee2bfa163992db880d6b571b70 \
python script.py
export MLFLOW_TRACKING_URI=https://dagshub.com/-----/mlflow_demo.mlflow 
export MLFLOW_TRACKING_USERNAME=-------- 
export MLFLOW_TRACKING_PASSWORD=------63a547cee2bfa163992db880d6b571b70

How to select the best experiment among many?

  1. Select all the experiments
  2. Click Compare
  3. A parallel coordinates plot is displayed

Criteria for selecting the best model:

  • Accuracy-wise: R² score (should be high)
  • Mean Absolute Error (MAE): should be low
  • Root Mean Squared Error (RMSE): should be low

Thus, instead of performing hyperparameter tuning on the deep learning model (a costly and time-consuming task), we can simply compare the runs in MLflow via DagsHub and pick the best model.

Components

Data Ingestion

  • Upload the data to Google Drive and download it using gdown (see the download sketch after this list)
  • Now in config → config.yaml: code
  • Create a data ingestion file in the research folder
  • Now update the entity, which is the return type of the configuration function → create config_entity.py
  • Update the config → configuration.py
  • Now the component → create a file: data_ingestion.py
  • Now update the pipeline
  • Update the endpoint → main.py
  • Update the utils → common.py
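A sketch of the data-ingestion download step with gdown; the Drive file ID and output paths are placeholders:

import os
import zipfile
import gdown

def download_and_extract(file_id: str, zip_path: str, unzip_dir: str) -> None:
    os.makedirs(os.path.dirname(zip_path), exist_ok=True)
    # gdown resolves a Google Drive file ID into a direct download
    gdown.download(f"https://drive.google.com/uc?id={file_id}", zip_path, quiet=False)
    with zipfile.ZipFile(zip_path, "r") as z:
        z.extractall(unzip_dir)

# example call with placeholder values:
# download_and_extract("<drive-file-id>", "artifacts/data_ingestion/data.zip",
#                      "artifacts/data_ingestion")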

Create a base model

  • Update the config.yaml file: prepare_base_model
  • Update params.yaml
  • Update the preparebasemodel.ipynb
  • Entity → config entity
  • Then update the config → configuration manager
  • Components → create a file prepare_base_model.py (see the sketch after this list)
  • Update the pipeline
  • Then the endpoint: main.py
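A hedged sketch of preparing the base model from the pretrained VGG-16; the image size, class count, and optimizer settings are assumptions, not values taken from params.yaml:

import tensorflow as tf

def prepare_base_model(num_classes: int = 2, image_size=(224, 224, 3), learning_rate: float = 0.01):
    # VGG-16 pretrained on ImageNet, without its top classification layers
    base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                       input_shape=image_size)
    base.trainable = False                     # freeze the convolutional layers

    x = tf.keras.layers.Flatten()(base.output)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs=base.input, outputs=out)

    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model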

Model Trainer

  • Create a file modeltrainer.ipynb and update it
  • Entity → configentity.py
  • Config → configuration.py
  • Components → create a file: modeltrainer.py (see the training sketch after this list)
  • Update the pipeline
  • Then the endpoint: main.py
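A sketch of the training component, assuming Keras ImageDataGenerator reads the CT-scan images from a class-labelled directory; the paths and hyperparameters are placeholders:

import os
import tensorflow as tf

def train(model: tf.keras.Model, data_dir: str, epochs: int = 1, batch_size: int = 16):
    datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255,
                                                              validation_split=0.2)
    train_gen = datagen.flow_from_directory(data_dir, target_size=(224, 224),
                                            batch_size=batch_size, subset="training")
    valid_gen = datagen.flow_from_directory(data_dir, target_size=(224, 224),
                                            batch_size=batch_size, subset="validation")
    model.fit(train_gen, validation_data=valid_gen, epochs=epochs)

    os.makedirs("artifacts/training", exist_ok=True)
    model.save("artifacts/training/model.h5")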

Model Evaluation using MLflow

  • Config is not required here
  • Create an ipynb file for model evaluation with MLflow
  • Connect the repo to DagsHub (see the evaluation sketch at the end of this section)
  • Then set the MLflow tracking environment variables in bash

In Jupyter Notebook:

os.environ["MLFLOW_TRACKING_URI"]="https://dagshub.com/------/Chest-Disease-Classification-using-CT-Scan-Image.mlflow"
os.environ["MLFLOW_TRACKING_USERNAME"]="------"
os.environ["MLFLOW_TRACKING_PASSWORD"]="------63a547cee2bfa163992db880d6b571b70"
  • Update the entity
  • Config → configuration
  • Components → model evaluation
  • Pipeline
  • End point → main.py
  • dvc.yaml → update the file
  • Now execute the commands
  • Execute dvc init (a .dvc folder is created)
  • Execute dvc repro
  • The dvc.lock file saves all of the pipeline metadata
  • If you execute dvc repro again without changes, it will show:
Stage 'data_ingestion' didn't change, skipping
Stage 'prepare_base_model' didn't change, skipping
Stage 'training' didn't change, skipping
Stage 'evaluation' didn't change, skipping
Data and pipelines are up to date.
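A sketch of how the evaluation component might log its scores to the DagsHub MLflow server (using the tracking URI set in the environment variables above); the metric names and registered model name are assumptions:

import os
import mlflow
import tensorflow as tf

def evaluate_and_log(model_path: str, valid_generator) -> None:
    model = tf.keras.models.load_model(model_path)
    loss, accuracy = model.evaluate(valid_generator)

    mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
    with mlflow.start_run():
        mlflow.log_metrics({"loss": loss, "accuracy": accuracy})
        # a remote tracking server such as DagsHub also allows model registration
        mlflow.keras.log_model(model, "model", registered_model_name="VGG16Model")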

Prediction pipeline

  • prediction.py file added (see the sketch after this list)
  • Create a folder → model → copy the trained model into it
  • Push the changes to GitHub
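A sketch of what prediction.py might look like, assuming the trained Keras model is copied into the model folder; the class-label mapping is an assumption:

import numpy as np
import tensorflow as tf

class PredictionPipeline:
    def __init__(self, model_path: str = "model/model.h5"):
        self.model = tf.keras.models.load_model(model_path)

    def predict(self, image_path: str) -> str:
        img = tf.keras.preprocessing.image.load_img(image_path, target_size=(224, 224))
        arr = np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0)
        result = int(np.argmax(self.model.predict(arr), axis=1)[0])
        # label order depends on how the training generator indexed the classes
        return "Normal" if result == 1 else "Adenocarcinoma Cancer"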

User App

  • Create the index.html
  • And app.py using Flask (see the sketch after this list)
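A minimal Flask app.py sketch that serves index.html and exposes a predict route; the import path, form field name, and port are assumptions:

from flask import Flask, render_template, request, jsonify
from cnnClassifier.pipeline.prediction import PredictionPipeline   # hypothetical import path

app = Flask(__name__)
pipeline = PredictionPipeline()

@app.route("/", methods=["GET"])
def home():
    return render_template("index.html")

@app.route("/predict", methods=["POST"])
def predict():
    # assumes the client uploads the CT-scan image as a form file named "image"
    request.files["image"].save("inputImage.jpg")
    return jsonify({"prediction": pipeline.predict("inputImage.jpg")})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)   # port is an assumption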

Deployment

  • Create a Dockerfile
  • .dockerignore
  • Docker-compose.yml
  • Create a .jenkins folder → then inside that create a Jenkinsfile
  • Scripts → ec2_instance.sh, Jenkins.sh files created


AWS

AWS login

  • Create an IAM user:
  • IAM → Users → Create user → User name: chest-user → Permission policy: AdministratorAccess → Create user

Set up the security credentials

Security credentials → Create access key → Command Line Interface → confirm → Create access key → download as CSV

Now we need to launch a Jenkins server, here on EC2:

  • EC2 → Launch instance
  • Name: jenkin-machine → Amazon Machine Image: Ubuntu → Instance type → Create key pair → Launch instance
  • Select the created instance → Connect

Now we need to set up Jenkins:

#!/bin/bash

sudo apt update

sudo apt install openjdk-8-jdk -y

wget -q -O - https://pkg.jenkins.io/debian-stable/jenkins.io.key | sudo apt-key add -

sudo sh -c 'echo deb https://pkg.jenkins.io/debian-stable binary/ > /etc/apt/sources.list.d/jenkins.list'

sudo apt-get update

sudo apt-get install jenkins -y

sudo systemctl start jenkins

sudo systemctl enable jenkins

sudo systemctl status jenkins

## Installing Docker

curl -fsSL https://get.docker.com -o get-docker.sh

sudo sh get-docker.sh

sudo usermod -aG docker $USER

sudo usermod -aG docker jenkins

newgrp docker

sudo apt install awscli -y

sudo usermod -a -G docker jenkins

## AWS configuration & restart jenkins

aws configure

sudo systemctl restart jenkins

## Now setup elastic IP on AWS

## For getting the admin password for jenkins

sudo cat /var/lib/jenkins/secrets/initialAdminPassword

Setting the Elastic IP

  • Allocate Elastic IP address → keep the defaults → Allocate
  • Associate this Elastic IP address → Instance → select the instance → Associate
  • Select the instance → Security → Security groups → Edit inbound rules → Add rule → port 8080, source 0.0.0.0/0
  • Copy the public IP and open it on port 8080 in a new tab → Jenkins will be running → paste the administrator password → Continue
  • For the administrator password, execute sudo cat /var/lib/jenkins/secrets/initialAdminPassword in the EC2 terminal
  • Install suggested plugins → the required items are installed automatically → create the first admin user
  • Username, password, full name, email → Save and Finish → note the Jenkins URL → log into the Jenkins server

Now we have to set the secret variables in the Jenkins server:

  • Manage Jenkins → credentials → system → global credentials → add credential → secret text → ECR_REPOSITORY:

Create an ECR repo in AWS

  • AWS → ECR → create repo → private → name → create
  • Copy the URI → paste it as the ECR_REPOSITORY secret in the Jenkins server

Next secret variable:

  • Secret text → Global → AWS_ACCOUNT_ID: copy the ID from AWS account
  • Secret text → Global → AWS_ACCESS_KEY_ID: copy from the downloaded CSV
  • Secret text → Global → AWS_SECRET_ACCESS_KEY: copy from the downloaded CSV
  • SSH Username with private key → Global → ID: ssh_key → Enter directly → Add → paste the contents of the PEM file

Dashboard → Manage Jenkins → Plugins → Available plugins → SSH Agent → Install → install and restart → log into the Jenkins server again → verify that the credentials were added

Now create a pipeline:

  • New Item → enter the pipeline name → Pipeline → OK → Pipeline script from SCM → SCM: Git → paste the repo URL → branch: main → path to the Jenkinsfile → Save

Now we have to create another EC2 instance for the application:

  • EC2 → Instances → Launch instance → Name → Ubuntu → instance type: t2.large → key pair created for EC2-1 → 32 GB storage → Launch instance

Select the instance → Connect, then run the following setup script in the EC2-2 instance terminal:

#!/bin/bash

sudo apt update

sudo apt-get update

sudo apt upgrade -y

curl -fsSL https://get.docker.com -o get-docker.sh

sudo sh get-docker.sh

sudo usermod -aG docker $USER

newgrp docker

sudo apt install awscli -y

## AWS configuration

aws configure

## Now setup elastic IP on AWS


Now create an Elastic IP for the EC2-2 instance:

  • Allocate Elastic IP address → allocate → Associate Elastic IP address → Associate for the EC2-2 instance

Now open the Jenkinsfile → update the public IP used with the ssh_key to the Elastic IP of the EC2-2 instance

Now create a .github folder → inside it create a workflows folder → create a file main.yaml → copy the workflow code into it

Now in GitHub → settings → secrets and variables → actions → create new repository secret:

  • URL: Jenkins URL
  • USER: Jenkins username
  • TOKEN: Jenkins dashboard → profile → configure → API token → add new Token → Generate → copy the Token
  • JOB: job created in Jenkins: pipeline

Now we can push the code to GitHub. The GitHub repo will trigger the Jenkins server; the pipeline/workflow can also be triggered manually:

  • GitHub → Actions → Trigger Jenkins Job → Run workflow → run workflow → the workflow starts and triggers the Jenkins server → the build starts on the Jenkins server → it builds the Docker image, pushes it to ECR, and runs the application