Open Source Data Science Datasets

Open-source audio datasets hosted on DagsHub

dataset audio git github

The CHiME-Home dataset is a collection of annotated domestic environment audio recordings.

dataset audio dvc git

WARBLRB10k is a collection of 10,000 smartphone audio recordings from around the UK, crowdsourced by users of Warblr, the bird recognition app.

dataset audio dvc git

The FSL4 dataset contains ~4000 user-contributed loops uploaded to Freesound.

dataset audio dvc git

FSDnoisy18k is an open dataset containing 42.5 hours of audio across 20 sound event classes, including a small amount of manually labeled data and a larger quantity of real-world noisy data.

dataset audio dvc git

Urban Sound 8K is an audio dataset that contains 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes.

dataset audio dvc git

The LEGOv2 database is a parameterized and annotated version of the CMU Let’s Go database. This spoken dialogue corpus contains interactions captured by the CMU Let’s Go (LG) System at Carnegie Mellon University in 2006 and 2007, and it is based on raw log files from the LG system. The corpus was parameterized and annotated by the Dialogue Systems Group at Ulm University, Germany.

dataset audio dvc git
