An end-to-end CNN image-classification model was developed using transfer learning to identify food items in images. The popular EfficientNetB1 architecture was fine-tuned on the Food101 dataset, and the resulting model outperforms the model from the DeepFood paper, which achieved 77.4% accuracy on Food101. The project uses DVC (Data Version Control) for managing data, is built on a microservices architecture, and covers the pipeline end to end. The dataset can be downloaded from this link.
Dataset: Food101

Model: EfficientNetB1 & VGG16
The project's model will be built using all of the data from the Food101 dataset, comprising 75,750 training images and 25,250 testing images.
Two methods to significantly improve the speed of model training:

- Prefetching
- Mixed precision training
For this project we will be working with mixed precision, which works best with a GPU of compute capability 7.0 or higher.
At the time of writing, Colab offers the following GPUs:

Colab allocates a random GPU every time we factory-reset the runtime, so you can keep resetting the runtime until you get a Tesla T4, which has a compute capability rating of 7.5. If you are using local hardware, use a GPU with compute capability 7.0+ for best results.
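The two speed-ups above can be sketched in TensorFlow as follows; the dataset here is a placeholder standing in for the real Food101 pipeline:

```python
import tensorflow as tf

# Enable mixed precision training (requires a GPU with compute capability
# 7.0+, e.g. a Tesla T4; without one, compute still runs but in float32).
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Prefetching: overlap data preprocessing with model execution.
# `train_data` is a placeholder tf.data.Dataset, not the real Food101 data.
train_data = (
    tf.data.Dataset.from_tensor_slices(
        (tf.zeros([8, 224, 224, 3]), tf.zeros([8], dtype=tf.int64))
    )
    .batch(4)
    .prefetch(tf.data.AUTOTUNE)
)

print(tf.keras.mixed_precision.global_policy().name)  # mixed_float16
```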
Since we've downloaded the data from TensorFlow Datasets, there are a couple of preprocessing steps we have to take before it's ready to model.
More specifically, our data is currently:

- in `uint8` data type
- comprised of images of different sizes

Whereas, models like data to be:

- in `float32` data type
- tensors of the same shape, e.g. `(224, 224, 3)`

To take care of these, we'll create a `preprocess_img()` function which:

- resizes the images using `tf.image.resize()`
- converts the data type to `tf.float32` using `tf.cast()`
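A minimal sketch of such a `preprocess_img()` function; the target size of 224×224 is an assumption matching EfficientNetB1's usual input shape:

```python
import tensorflow as tf

def preprocess_img(image, label, img_shape=224):
    """Resize `image` to (img_shape, img_shape) and cast it to float32."""
    image = tf.image.resize(image, [img_shape, img_shape])  # fixed spatial size
    image = tf.cast(image, tf.float32)                      # uint8 -> float32
    return image, label

# Example: a fake uint8 image of arbitrary size, as TFDS would yield
img = tf.cast(
    tf.random.uniform([300, 512, 3], maxval=256, dtype=tf.int32), tf.uint8
)
out, lbl = preprocess_img(img, label=0)
print(out.shape, out.dtype)  # (224, 224, 3) <dtype: 'float32'>
```

In a `tf.data` pipeline this would be applied with `dataset.map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)`.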
Mixed precision training and prefetching were implemented to decrease the time taken for the model to train.
As we are dealing with a complex neural network (EfficientNetB1), it's good practice to have a few callbacks set up. The callbacks used throughout this notebook are:

- TensorBoard callback: TensorBoard provides the visualization and tooling needed for machine learning experimentation.
- EarlyStopping callback: used to stop training when a monitored metric has stopped improving.
- ReduceLROnPlateau: reduces the learning rate when a metric has stopped improving.
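The three callbacks above can be set up as in the sketch below; the monitored metric, patience values, and log directory are illustrative choices, not the project's exact settings:

```python
import tensorflow as tf

callbacks = [
    # Log training metrics for visualization in TensorBoard.
    tf.keras.callbacks.TensorBoard(log_dir="logs/food_vision"),
    # Stop training once val_loss stops improving for 3 epochs.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True
    ),
    # Reduce the learning rate by 5x when val_loss plateaus.
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.2, patience=2, min_lr=1e-7
    ),
]
# These would then be passed to model.fit(..., callbacks=callbacks).
```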
The complete project data pipeline is available at the DagsHub Data Pipeline.
1. Python
2. Shell scripting
3. AWS cloud provider
4. DVC
1. AWS S3
2. GitHub
3. DagsHub
```shell
conda create --prefix ./env python=3.9
conda activate ./env
pip install -r requirements.txt
dvc init
```
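After initializing DVC, the S3 bucket used for data storage can be wired up as the default remote; the bucket name and path below are placeholders, not the project's actual bucket:

```shell
# Add an S3 bucket as the default DVC remote (bucket name is a placeholder)
dvc remote add -d storage s3://<your-bucket-name>/dvc-store
# Push DVC-tracked data to the remote
dvc push
```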
This project is production-ready for similar use cases and provides automated, orchestrated production pipelines (training & serving).