DataHack Resources 2019

This is a list of resources for DataHack, including links to repositories and guides for machine learning and data science. Want to join DataHack 2019? Go to https://registration.datahack.org.il

The Basics 🛠️

  • Pandas - pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
  • Numpy - NumPy is the fundamental package for scientific computing with Python. It contains, among other things:
    • a powerful N-dimensional array object
    • sophisticated (broadcasting) functions
    • tools for integrating C/C++ and Fortran code
    • useful linear algebra, Fourier transform, and random number capabilities
  • Scikit-Learn - Machine Learning in Python
    • Simple and efficient tools for data mining and data analysis
    • Accessible to everybody, and reusable in various contexts
    • Built on NumPy, SciPy, and matplotlib
  • TensorFlow - Google's machine learning (and deep learning) framework
  • PyTorch - Facebook's machine learning framework - Tensors and Dynamic neural networks in Python with strong GPU acceleration.
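
The NumPy capabilities listed above (N-dimensional arrays, broadcasting, linear algebra, Fourier transforms, random numbers) can be tried out in a few lines. This is a minimal sketch rather than an official example: the variable names, the sample values, and the choice of a 3 Hz sine wave are assumptions made for illustration only.

```python
import numpy as np

# N-dimensional array object: a 2x3 matrix of floats (illustrative values)
a = np.arange(6, dtype=float).reshape(2, 3)

# Broadcasting: the length-3 row is stretched across both rows of `a`
row = np.array([10.0, 20.0, 30.0])
shifted = a + row

# Linear algebra: solve the 2x2 system M @ x = b
M = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(M, b)

# Fourier transform: find the dominant frequency bin of a sampled sine wave
t = np.arange(32) / 32                  # 32 samples over one second
signal = np.sin(2 * np.pi * 3 * t)      # assumed 3 Hz test signal
spectrum = np.abs(np.fft.rfft(signal))
dominant = int(np.argmax(spectrum))

# Random numbers: a seeded generator gives reproducible draws
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)

print(shifted)
print(x, dominant, samples.shape)
```

Seeding the generator keeps the random draws reproducible between runs, which is handy when comparing experiments during a hackathon.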

General Resources 🤖

Blog Posts

Awesome Lists

Repositories

  • Tensor2Tensor GitHub repository - A library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research

Vision & Image 👁️ + 🖼️

Awesome Lists

Datasets

NLP 💬

Awesome Lists

  • Awesome NLP - 📖 A curated list of resources dedicated to Natural Language Processing (NLP)

Repositories

  • Fairseq - A sequence modeling toolkit enabling training custom models for translation, summarization, language modeling and other text generation tasks. It provides reference implementations of various sequence-to-sequence models
  • OpenAI GPT-2 - A large pretrained language model (code for the paper "Language Models are Unsupervised Multitask Learners")

Datasets

Voice & Audio 📣

Awesome Lists

Tabular Data 📊

Blog Posts

Repositories

  • XGBoost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Flink and DataFlow
