

 Open Source Data Science Datasets

DagsHub / IMDb

Updated 1 year ago

Path: .

Subsets of IMDb data are available for access to customers for personal and non-commercial use

dataset nlp tabular dvc git

DagsHub / enwiki

Updated 1 year ago

Path: .

The test data for the Large Text Compression Benchmark is the first 10^9 bytes of the English Wikipedia

dataset nlp dvc git

DagsHub / SQuAD

Updated 1 year ago

Path: .

SQuAD (Stanford Question Answering Dataset) is a dataset for reading comprehension. It consists of a list of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is a segment of text, or span, from the corresponding Wikipedia reading passage. Alternatively, the question may be unanswerable.

dataset nlp question answering reading comprehension dvc git

DagsHub / LAMBADA

Updated 1 year ago

Path: .

This archive contains the LAMBADA dataset (LAnguage Modeling Broadened to Account for Discourse Aspects)

dataset nlp language modelling dvc git

DagsHub / hotpot

Updated 1 year ago

Path: .

A Dataset for Diverse, Explainable Multi-hop Question Answering

dataset nlp dvc git github

achamug645 / ML_Opt

Updated 1 year ago

Path: README.md

This repository mainly discusses the application of machine learning and optimization approaches to the decision-making process

dataset model nlp classification tensorflow dvc git

Dean / RPPP

Updated 2 years ago

Path: raw

RPPP – Reddit Post Popularity Predictor. A project with two goals: 1. Given a Reddit post, predict how popular it is going to be (what its score will be). 2. Showcase a remote working file system with DVC.

dataset model nlp tabular dvc git

Path: .

A repo for the tutorial explaining the benefits of DVC and DAGsHub, using the classification of questions for the Cross Validated statistics Stack Exchange as an example problem

dataset nlp classification dvc git

Path: .

Design your first machine learning pipeline using simple steps on Orchest cloud.

dataset nlp classification scikit-learn git github

Path: .

Creating a bot using a pretrained model and Rick and Morty data from Kaggle

dataset model nlp huggingface transformers text generation

Path: .

Using a text classifier with SMOTE and SGDClassifier to predict the categories of Malawi News articles.

dataset nlp scikit-learn text classification

Path: .

This repository holds the dataset for the VotesMigration project

dataset nlp

Path: data

T5 Summarisation Using PyTorch Lightning, DVC, DagsHub, and HuggingFace Spaces

dataset model nlp

nirbarazida / Enron

Updated 3 years ago

Path: .

No description

dataset nlp