Commit History
Message   Author   Date
Refactor out duplicated code to create helper function "load_data" in data.__init__.py. Remove unused variables from replace_nan.py. Update DVC stages.   Jeff Nirschl 3 years ago
Add Drange to data dictionary. Reformat file according to PEP   Jeff Nirschl 3 years ago
Add Drange to data dictionary. Revert TableOne to output mean (SD) for Age and Fare.   Jeff Nirschl 3 years ago
Run DVC with force to over-write cached output   Jeff Nirschl 3 years ago
Update .gitignore following running DVC stage   Jeff Nirschl 3 years ago
Add script to replace missing age values using mean imputation. Added 3rd stage of DVC pipeline "impute_nan"   Jeff Nirschl 3 years ago
Update files to read str "nan" in CSV as np.nan and save np.nan as "nan" when writing to CSV   Jeff Nirschl 3 years ago
Change Table one output to give [min-max] for float variables instead of SD. Re-run dvc repro   Jeff Nirschl 3 years ago
Update encode_labels.py to accept "drop_cols" from params.yaml to indicate which columns to drop during label encoding.   Jeff Nirschl 3 years ago
Refactor encode_labels.py to read dtypes from params.yaml.   Jeff Nirschl 3 years ago
Refactor make_dataset.py to only include functions to download data and save data dictionary/summary table. Keep encode_labels.py separate.   Jeff Nirschl 3 years ago
Add stage 1 = make dataset   Jeff Nirschl 3 years ago
Rename file download_data.py to make_dataset.py. Moving all functions associated with TDSP stage 1 into this file.   Jeff Nirschl 3 years ago
Add pytest to requirements.txt   Jeff Nirschl 3 years ago
File renamed. Minor refactoring of package imports and argument inputs   Jeff Nirschl 3 years ago
Run first stage of updated dvc pipeline: download_data   Jeff Nirschl 3 years ago
Deleting previous DVC pipeline to create new pipeline   Jeff Nirschl 3 years ago
Add function for parameter tuning using hyperopt.   Jeff Nirschl 3 years ago
Add function for parameter tuning using hyperopt.   Jeff Nirschl 3 years ago
Re-configure Stage train_model to send outputs to results directory   Jeff Nirschl 3 years ago
update train_model pipeline   Jeff Nirschl 3 years ago
Add script to train RandomForest model and create DVC stage   Jeff Nirschl 3 years ago
Correct documentation in function encode_labels   Jeff Nirschl 3 years ago
Update make dataset to encode categorical variables, optionally remove nan from training, and save categorized data as well as yaml encoding categorical classes   Jeff Nirschl 3 years ago
Adding DVC Stage 1: Prepare dataset   Jeff Nirschl 3 years ago
add train/test DVC files   Jeff Nirschl 3 years ago
add data folder structure   Jeff Nirschl 3 years ago
Remove train/test dvc   Jeff Nirschl 3 years ago
Data cleaning to remove indices with nan values   Jeff Nirschl 3 years ago
Adding original data files   Jeff Nirschl 3 years ago