View on GitHub View on Medium Daily Data Science Tips

Data Science Topics

Collection of useful data science topics along with code and articles in my data science blog.

If you want to receive updates on these blog posts in your mailbox, you can subscribe to my Medium newsletter. To receive bite-sized Python and daily data science tips in your mailbox, you can subscribe to Data Science Simplified.

How to Download the Code in This Repository to Your Local Machine

To download the code in this repo, you can simply use git clone:

git clone https://github.com/khuyentran1401/Data-science

However, because this repository contains a large number of files, cloning takes around 5 minutes. To clone in a couple of seconds, use git-lfs:

git-lfs clone https://github.com/khuyentran1401/Data-science

Contents

  1. Data Science Tools
  2. Testing
  3. Productive Tools
  4. Tools for Deployment
  5. Speed-up Tools
  6. Math Tools
  7. Machine Learning
  8. Natural Language Processing
  9. Computer Vision
  10. Time Series
  11. Feature Engineering
  12. Visualization
  13. Mathematical Programming
  14. Scraping
  15. Python
  16. Terminal
  17. Linear Algebra
  18. Data Structure
  19. Statistics
  20. Applications
  21. Learning Tips
  22. Productive Tips
  23. VSCode
  24. Book Review
  25. Data Science Portfolio

Data Science Tools

Title Article Repository
How to Create Fake Data with Faker 🔗 🔗
Introduction to DVC: Data Version Control Tool for Machine Learning Projects 🔗 🔗
Introduction to Datasette: Explore and Publish Your Data in One Line of Code 🔗
Introduction to Datapane: A Python Library to Build Interactive Reports 🔗
Datapane’s New Features: Create a Beautiful Dashboard in Python in a Few Lines of Code 🔗 🔗
Introduction to Hydra.cc: A Powerful Framework to Configure your Data Science Projects 🔗
Introduction to Weight & Biases: Track and Visualize your Machine Learning Experiments in 3 Lines of Code 🔗 🔗
Kedro - A Python Framework for Reproducible Data Science Project 🔗 🔗
Orchestrate a Data Science Project in Python With Prefect 🔗 🔗
Introduction to Deepnote: Real-time Collaboration on Jupyter Notebook 🔗

Testing

Title Article Repository
Pytest for Data Scientists 🔗 🔗
4 Lesser-Known Yet Awesome Tips for Pytest 🔗 🔗
Great Expectations: Always Know What to Expect From Your Data 🔗 🔗
Introduction to Schema: A Python Library to Validate your Data 🔗 🔗
DeepDiff - Recursively Find and Ignore Trivial Differences Using Python 🔗 🔗
Checklist - Behavioral Testing of NLP Models 🔗 🔗

Productive Tools

Title Article Repository
How to Share your Python Objects Across Different Environments in One Line of Code 🔗 🔗
How to Share your Jupyter Notebook in 3 Lines of Code with Ngrok 🔗
3 Tools to Track and Visualize the Execution of your Python Code 🔗 🔗
2 Tools to Automatically Reload when Python Files Change 🔗 🔗
How to Strip Outputs and Execute Interactive Code in a Python Script 🔗 🔗
Pydash: A Kitchen Sink of Missing Python Utilities 🔗 🔗
4 pre-commit Plugins to Automate Code Reviewing and Formatting in Python 🔗 🔗
Write Clean Python Code Using Pipes 🔗 🔗
Introducing FugueSQL - SQL for Pandas, Spark, and Dask DataFrames 🔗 🔗

Tools for Deployment

Title Article Repository
How to Effortlessly Publish your Python Package to PyPI Using Poetry 🔗 🔗
Typer: Build Powerful CLIs in One Line of Code using Python 🔗 🔗

Speed-up Tools

Title Article Repository
Cython - A Speed-Up Tool for your Python Function 🔗 🔗
Train your Machine Learning Model 150x Faster with cuML 🔗 🔗

Math Tools

Title Article Repository
SymPy: Symbolic Computation in Python 🔗 🔗
How to Create Mathematical Animations like 3Blue1Brown Using Python 🔗 🔗

Machine Learning

Title Article Repository
How to Monitor And Log your Machine Learning Experiment Remotely with HyperDash 🔗 🔗
How to Efficiently Fine-Tune your Machine Learning Models 🔗 🔗
How to Learn Non-linear Dataset with Support Vector Machines 🔗 🔗
Introduction to IBM Federated Learning: A Collaborative Approach to Train ML Models on Private Data 🔗 🔗
3 Steps to Improve your Efficiency when Hypertuning ML Models 🔗
human-learn: Create a Human Learning Model by Drawing 🔗 🔗
Patsy: Build Powerful Features with Arbitrary Python Code 🔗 🔗
SHAP: Explain Any Machine Learning Model in Python 🔗 🔗
BentoML: Create an ML Powered Prediction Service in Minutes 🔗 🔗

Natural Language Processing

Title Article Repository
Sentiment Analysis of LinkedIn Messages 🔗 🔗
Find Common Words in Article with Python Module Newspaper and NLTK 🔗 🔗
How to Tokenize Tweets with Python 🔗 🔗
How to Solve Analogies with Word2Vec 🔗 🔗
What is PyTorch 🔗 🔗
Convolutional Neural Network in Natural Language Processing 🔗 🔗
Supercharge your Python String with TextBlob 🔗 🔗
pyLDAvis: Topic Modelling Exploration Tool That Every NLP Data Scientist Should Know 🔗 🔗
Streamlit and spaCy: Create an App to Predict Sentiment and Word Similarities with Minimal Domain Knowledge 🔗 🔗
Build a Robust Conversational Assistant with Rasa 🔗 🔗
I Analyzed 2k Data Scientist and Data Engineer Jobs and This is What I Found 🔗 🔗
Checklist - Behavioral Testing of NLP Models 🔗 🔗

Computer Vision

Title Article Repository
How to Create an App to Classify Dogs Using fastai and Streamlit 🔗 🔗

Time Series

Title Article Repository
Kats: a Generalizable Framework to Analyze Time Series Data in Python 🔗 🔗
How to Detect Seasonality, Outliers, and Changepoints in Your Time Series 🔗 🔗

Feature Engineering

Title Article Repository
3 Ways to Extract Features from Dates with Python 🔗 🔗
Similarity Encoding for Dirty Categories Using dirty_cat 🔗 🔗
Snorkel - Programmatically Build Training Data in Python 🔗 🔗

Visualization

Title Article Repository
How to Embed Interactive Charts on your Articles and Personal Website 🔗 🔗
What I Learned from Scraping 15k Data Science Articles on Medium 🔗 🔗
How to Create Interactive Plots with Altair 🔗 🔗
How to Create a Drop-Down Menu and a Slide Bar for your Favorite Visualization Tool 🔗 🔗
I Scraped more than 1k Top Machine Learning Github Profiles and this is what I Found 🔗 🔗
Top 6 Python Libraries for Visualization: Which one to Use? 🔗 🔗
Introduction to Yellowbrick: A Python Library to Visualize the Prediction of your Machine Learning Model 🔗 🔗
Visualize Gender-Specific Tweets with Scattertext 🔗 🔗
Visualize Your Team’s Projects Using Python Gantt Chart 🔗 🔗
How to Create Bindings and Conditions Between Multiple Plots Using Altair 🔗 🔗
How to Sketch your Data Science Ideas With Excalidraw 🔗
Pyvis: Visualize Interactive Network Graphs in Python 🔗 🔗
Build and Analyze Knowledge Graphs with Diffbot 🔗
Observe The Friend Paradox in Facebook Data Using Python 🔗 🔗
What skills and backgrounds do data scientists have in common? 🔗 🔗
Visualize Similarities Between Companies With Graph Database 🔗 🔗
Visualize GitHub Social Network with PyGraphistry 🔗 🔗
Find the Top Bootcamps for Data Professionals From Over 5k Profiles 🔗 🔗
floWeaver - Turn Flow Data Into a Sankey Diagram In Python 🔗 🔗
atoti - Build a BI Platform in Python 🔗 🔗

Mathematical Programming

Title Article Repository
How to choose stocks to invest in with Python 🔗 🔗
Maximize your Productivity with Python 🔗 🔗
How to Find a Good Match with Python 🔗 🔗
How to Solve a Staff Scheduling Problem with Python 🔗 🔗
How to Find Best Locations for your Restaurants with Python 🔗 🔗
How to Schedule Flights in Python 🔗 🔗
How to Solve a Production Planning and Inventory Problem in Python 🔗 🔗

Scraping

Title Article Repository
Web Scrape Movie Database with Beautiful Soup 🔗 🔗
top-github-scraper: Scrape Top Github Users and Repositories Based On a Keyword in One Line of Code 🔗 🔗

Python

Title Article Repository
Numpy Tricks for your Data Science Projects 🔗 🔗
Timing for Efficient Python Code 🔗 🔗
How to Use Lambda for Efficient Python Code 🔗 🔗
Python Tricks for Keeping Track of Your Data 🔗 🔗
Boost Your Efficiency With Specialized Dictionary Implementations in Python 🔗 🔗
Dictionary as an Alternative to If-Else 🔗 🔗
How to Use Zip to Manipulate a List of Tuples 🔗 🔗
Get the Most out of Your Array With These Four Numpy Methods 🔗 🔗
3 Python Tricks to Read, Create, and Run Multiple Files Automatically 🔗 🔗
How to Exclude the Outliers in Pandas DataFrame 🔗 🔗
Python Clean Code: 6 Best Practices to Make Your Python Functions More Readable 🔗 🔗
3 Techniques to Effortlessly Import and Execute Python Modules 🔗 🔗
Simplify Your Functions with Functools’ Partial and Singledispatch 🔗 🔗

Terminal

Title Article Repository
How to Create and View Interactive Cheatsheets on the Command-line 🔗
Understand CSV Files from your Terminal with XSV 🔗
Prettify your Terminal Text With Termcolor and Pyfiglet 🔗 🔗
Stop Using Print to Debug in Python. Use Icecream Instead 🔗
Rich: Generate Rich and Beautiful Text in the Terminal with Python 🔗 🔗
Create a Beautiful Dashboard in your Terminal with Wtfutil 🔗 🔗
3 Tools to Monitor and Optimize your Linux System 🔗
Ptpython: A Better Python REPL 🔗 🔗
fd: a Simple but Powerful Tool to Find and Execute Files on the Command Line 🔗
Speed Up your Command-Line Navigation with These 3 Tools 🔗
Python and Data Science Snippets on the Command Line 🔗 🔗

Linear Algebra

Title Article Repository
How to Build a Matrix Module from Scratch 🔗 🔗
Linear Algebra for Machine Learning: Solve a System of Linear Equations 🔗 🔗

Data Structure

Title Article Repository
Convex Hull: An Innovative Approach to Gift-Wrap your Data 🔗 🔗
How to Visualize Social Network With Graph Theory 🔗 🔗
How to Search Data with KDTree 🔗 🔗
How to Find the Nearest Hospital with a Voronoi Diagram 🔗 🔗

Statistics

Title Article Repository
Can Datasets of a Dinosaur and a Circle have Identical Statistics? 🔗 🔗
Introduction to One-Way ANOVA: A Test to Compare the Means between More than Two Groups 🔗 🔗
Bayes’ Theorem, Clearly Explained with Visualization 🔗 🔗
Detect Change Points with Bayesian Inference and PyMC3 🔗 🔗
Bayesian Linear Regression with Bambi 🔗 🔗
Earn More Salary as a Coder - Higher Degree or More Years of Experience? 🔗 🔗

Applications

Title Article Repository
How to Create an Interactive Startup Growth Calculator with Python 🔗 🔗
Streamlit and spaCy: Create an App to Predict Sentiment and Word Similarities with Minimal Domain Knowledge 🔗 🔗
PyWebIO: Write Interactive Web App in Script Way Using Python 🔗 🔗
PyWebIO 1.3.0: Add Tabs, Pin Input, and Update an Input Based on Another Input 🔗 🔗
Simulate Real-life Events in Python Using SimPy 🔗 🔗
Create an App to Deal with Boredom Using PyWebIO 🔗 🔗

Learning Tips

Title Article Repository
How to Learn Data Science when Life does not Give You a Break 🔗
How to Accelerate your Data Science Career by Putting yourself in the Right Environment 🔗
To become a Better Data Scientist, you need to Think like a Programmer 🔗
How not to be Overwhelmed with Data Science 🔗

Productive Tips

Title Article Repository
How to Organize your Data Science Articles with Github 🔗 🔗
How to Create Reusable Command-Line 🔗
5 Reasons why you should Switch from Jupyter Notebook to Scripts 🔗
3 Ways to Get Notified with Python 🔗 🔗
7 Reasons Why you Should Start Documenting your Code 🔗

VSCode

Title Article Repository
How to Leverage Visual Studio Code for your Data Science Projects 🔗
Top 4 Code Viewers for Data Scientist in VSCode 🔗
Incorporate the Best Practices for Python with These Top 4 VSCode Extensions 🔗
Boost Your Efficiency with Customized Code Snippets on VSCode 🔗
Top 9 Keyboard Shortcuts in VSCode for Data Scientists 🔗

Book Review

Title Article Repository
Python Machine Learning: A Comprehensive Handbook for Machine Learning 🔗

Data Science Portfolio

Title Article Repository
How to Create an Elegant Website for your Data Science Portfolio in 10 minutes 🔗
Build an Impressive Github Profile in 3 Steps 🔗

Supporters

Special thanks to the supporters of this project!