Commit History
Message Author SHA1 Date
Move function create_data_dictionary out of make_dataset.py to reduce code complexity. Create new script data_dictionary.py to manage data dictionary and data summary table. DVC stage 1 working but other stages currently broken.   Jeff Nirschl 3 years ago
Add Drange to data dictionary. Reformat file according to PEP   Jeff Nirschl 3 years ago
Add Drange to data dictionary. Revert TableOne to output mean (SD) for Age and Fare.   Jeff Nirschl 3 years ago
Update files to read str "nan" in CSV as np.nan and save np.nan as "nan" when writing to CSV   Jeff Nirschl 3 years ago
Change Table one output to give [min-max] for float variables instead of SD. Re-run dvc repro   Jeff Nirschl 3 years ago
Refactor make_dataset.py to only include functions to download data and save data dictionary/summary table. Keep encode_labels.py separate.   Jeff Nirschl 3 years ago
Rename file download_data.py to make_dataset.py. Moving all functions associated with TDSP stage 1 into this file.   Jeff Nirschl 3 years ago
File renamed. Minor refactoring of package imports and argument inputs   Jeff Nirschl 3 years ago
Correct documentation in function encode_labels   Jeff Nirschl 3 years ago
Update make dataset to encode categorical variables, optionally remove nan from training, and save categorized data as well as yaml encoding categorical classes   Jeff Nirschl 3 years ago
Adding DVC Stage 1: Prepare dataset   Jeff Nirschl 3 years ago
Data cleaning to remove indices with nan values   Jeff Nirschl 3 years ago
initial commit using cookiecutter data science   Jeff Nirschl 3 years ago