

Email Classifier

Random forest classifier for spam or ham emails (deployed on a Linode server)

This classifier was created as part of a home assignment at the 'Israeli Tech Challenge' Bootcamp. Its main purpose is to determine whether an email is spam or ham.

The model predictions are based on the 'Enron' database provided by the NLP group at the Athens University of Economics and Business (AUEB). I used this data to train a spam filter, working from a processed version of the Enron dataset that includes labels for "ham" (non-spam) and spam emails. In this case I used the AUEB labels as the true labels of the data and classified the emails as ham or spam myself.

First, I used 'CountVectorizer' from 'Sklearn' to vectorize the words in the dataset into 500 features, each built from 1-2 words. After trying different prediction models, the one that produced the best score, with 97% precision, was the 'Random Forest Classifier'. To perfect the classifier I used 'GridSearchCV' from 'Sklearn' to find the best parameters on the train dataset. Then, to deploy the classifier to an online server, I used the 'Pickle' package to dump ('zip') the models. When the application is activated, the models are loaded and can produce predictions in less than 1 second! One of the latest features added to the application is an API request option. It can be used as a single request with a parameter or as a multi request using a JSON file.
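The training steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy corpus, the parameter grid, the `precision_macro` scoring choice, and the single-pipeline layout are all assumptions made for the example. Only the 500-feature limit, the 1-2 word range, and the use of 'CountVectorizer', 'Random Forest Classifier', 'GridSearchCV' and 'Pickle' come from the description.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy stand-in corpus (hypothetical); the project trains on the Enron dataset.
emails = [
    "win free money now",
    "free prize claim now",
    "cheap pills free offer",
    "claim your free money prize",
    "meeting agenda for monday",
    "please review the attached report",
    "lunch on friday with the team",
    "quarterly report meeting notes",
]
labels = ["spam"] * 4 + ["ham"] * 4

pipeline = Pipeline([
    # At most 500 features, built from single words and word pairs.
    ("vectorizer", CountVectorizer(max_features=500, ngram_range=(1, 2))),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the parameters actually searched are not stated.
param_grid = {
    "forest__n_estimators": [10, 50],
    "forest__max_depth": [None, 5],
}

# Assumption: macro-averaged precision as the selection metric.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="precision_macro")
search.fit(emails, labels)

# Serialize the fitted pipeline and load it back, as an app would at startup.
blob = pickle.dumps(search.best_estimator_)
model = pickle.loads(blob)
prediction = model.predict(["free money prize"])[0]
```

Keeping the vectorizer and the forest in one pipeline means a single pickled object is enough to turn raw email text into a prediction after loading.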

Moreover, I created an SQLite database for user accounts, classified email archives, and API statistics. For that, I mainly used 'flask' extensions.
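As a rough picture of what such a database could look like, here is a sketch using Python's built-in `sqlite3` module instead of the 'flask' extensions the application relies on. The table names, columns, and helper functions are hypothetical and chosen only to mirror the three kinds of data mentioned above.

```python
import sqlite3

# Hypothetical schema: one table per kind of data the application stores.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS classified_emails (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    body TEXT NOT NULL,
    label TEXT NOT NULL CHECK (label IN ('ham', 'spam')),
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS api_stats (
    id INTEGER PRIMARY KEY,
    endpoint TEXT NOT NULL,
    n_emails INTEGER NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path=":memory:"):
    """Open the database and create the tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def archive_email(conn, user_id, body, label):
    """Store one classified email in the archive."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO classified_emails (user_id, body, label) VALUES (?, ?, ?)",
            (user_id, body, label),
        )


def label_counts(conn):
    """Return how many archived emails carry each label."""
    rows = conn.execute(
        "SELECT label, COUNT(*) FROM classified_emails GROUP BY label"
    )
    return dict(rows.fetchall())
```

For example, after `conn = init_db()` and a few `archive_email(...)` calls, `label_counts(conn)` gives a per-label summary of the archive.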

I deployed the model to a Linux server provided by 'Linode'. To do so I used 'Nginx', 'Gunicorn', 'flask' extensions and bash scripting.
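To show the kind of object 'Gunicorn' serves behind 'Nginx', here is a minimal plain-WSGI sketch of the two API request modes (a single request with a parameter, or a multi request with JSON). It is a stand-in, not the project's Flask code: the `text` parameter name, the JSON-list payload, the response shape, and the keyword-based `classify` function are all assumptions made for illustration.

```python
import json
from urllib.parse import parse_qs


def classify(text):
    """Hypothetical stand-in for the loaded model's predict call."""
    return "spam" if "free" in text.lower() else "ham"


def application(environ, start_response):
    """WSGI callable: GET ?text=... for one email, POST a JSON list for many."""
    if environ["REQUEST_METHOD"] == "POST":
        # Multi request: the body is assumed to be a JSON list of email texts.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        emails = json.loads(environ["wsgi.input"].read(length) or b"[]")
    else:
        # Single request: the email text is assumed to arrive as ?text=...
        emails = parse_qs(environ.get("QUERY_STRING", "")).get("text", [])

    body = json.dumps({"predictions": [classify(e) for e in emails]}).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

Saved as a hypothetical `wsgi_sketch.py`, a callable like this could be started with `gunicorn wsgi_sketch:application`, with Nginx configured as a reverse proxy in front of it.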

Hope you enjoy my application and wish you good luck,

yours, Nir Barazida

Application Screenshots

  • Homepage for visitors:

screenshot_1

  • Homepage for users:

screenshot_2

  • Classifier:

screenshot_3

