

Open Source Data Science Datasets

Path: .

World Mortality Dataset: international data on all-cause mortality.

dataset tabular git github

Path: .

Showcasing DagsHub Annotations, Label Studio integration, Discussions, and other related features

dataset nlp audio computer vision tabular label studio

DagsHub / IMDb

Updated 1 year ago

Path: .

Subsets of IMDb data are available for access to customers for personal and non-commercial use

dataset nlp tabular dvc git

Dean / RPPP

Updated 2 years ago

Path: raw

RPPP – Reddit Post Popularity Predictor. A project with two goals: 1. Given a Reddit post, predict how popular it's going to be (what its score will be). 2. Showcase a remote working file system with DVC.

dataset model nlp tabular dvc git

Path: .

Designing your first machine learning pipeline with a few lines of code and simple drag and drop using Orchest. In this project, we will train a binary classification model to predict epitopes, which are used in vaccine development.

dataset classification tabular scikit-learn git github

Path: .

I experimented with multiple traditional models, including LightGBM, CatBoost, and BiLSTM, but the results were quite poor compared to triple GRU layers. The final model uses three simple bidirectional GRU layers with linear activation. It is quite simple and is derived from xhlulu's initial model.

dataset model tabular tensorflow

Path: .

We will be using the Internet News and Consumer Engagement dataset from Kaggle to predict the top article and popularity score.

dataset model classification tabular

Path: .

A repo for the tutorial explaining the benefits of DVC and DAGsHub, using the classification of questions for the Cross Validated statistics Stack Exchange as an example problem

dataset model classification tabular