Nir Barazida 6e8ae6c5a6
Update data set - Run preprocessing without punc
3 years ago

README.md


First Repo Project

This project is a simple 'Ham or Spam' email classifier built on the Enron data set. It contains two Python code files, five data files, and one constants file.

  • code directory - holds the data-preprocessing and modeling files:
    • data-preprocessing.py - processes the raw data (enron.csv), splits it into train and test sets, and saves them to the data directory.
    • modeling.py - trains a simple Random Forest classifier.
  • data directory - contains the raw and processed data.
  • src - contains the constants file.
  • requirements.txt - Python dependencies required to run the Python files.
  • README.md - this README file.
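
The preprocessing step described above (read the raw CSV, clean the text, split into train and test sets, save the results) could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual data-preprocessing.py: the column name `text`, the output file names `train.csv` and `test.csv`, the 80/20 split, the seed, and the lowercase-only cleaning are all hypothetical choices. Only the standard library is used so the sketch runs without installing anything.

```python
import csv
import random
from pathlib import Path


def clean_text(text: str) -> str:
    """Lowercase and collapse whitespace; punctuation is left in place (assumed cleaning)."""
    return " ".join(text.lower().split())


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` reproducibly and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def preprocess(raw_csv: Path, out_dir: Path, text_col="text",
               test_fraction=0.2, seed=42):
    """Clean the text column of `raw_csv` and write train.csv / test.csv to `out_dir`."""
    with raw_csv.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = [{**row, text_col: clean_text(row[text_col])} for row in reader]

    train, test = split_rows(rows, test_fraction, seed)

    out_dir.mkdir(parents=True, exist_ok=True)
    for name, part in (("train.csv", train), ("test.csv", test)):
        with (out_dir / name).open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(part)
    return len(train), len(test)
```

A call such as `preprocess(Path("data/enron.csv"), Path("data"))` would write both splits next to the raw file, assuming that path layout. Fixing the seed keeps the split reproducible, which matters when the resulting files are versioned alongside the code.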