Open Source Data Science Datasets

The Flickr 8k Audio Caption Corpus contains 40,000 spoken captions of 8,000 natural images. It was collected in 2015 to investigate multimodal learning schemes for unsupervised speech pattern discovery.

dataset audio

L-theorist / Golos

Updated 2 years ago

Russian ASR dataset, see https://github.com/sberdevices/golos

dataset audio

hazalkl / RSC

Updated 2 years ago

RuneScape Classic sounds

dataset audio

hazalkl / MS-SNSD

Updated 2 years ago

Microsoft Scalable Noisy Speech Dataset (MS-SNSD)

dataset audio

michizhou / CMU-MOSI

Updated 2 years ago

CMU Multimodal Opinion Sentiment Intensity (CMU-MOSI) is a dataset of opinion-level sentiment intensity in online videos. It contains 2,199 opinion utterances, each annotated for sentiment on a seven-step Likert scale from very negative to very positive.

dataset audio

The dataset of the Zero Resource Speech Challenge 2021: http://www.zerospeech.com/

dataset audio

L-theorist / Att-HACK

Updated 2 years ago

Att-HACK: an expressive speech database with social attitudes

dataset audio

hazalkl / JL-Corpus

Updated 2 years ago

Emotional speech corpus with primary and secondary emotions.

dataset audio

Dataset from OpenSLR: http://www.openslr.org/99/

dataset audio

Speech Emotion Recognition (SER) is the process of extracting emotional paralinguistic information from speech.

dataset audio

26 text passages read by 10 speakers; 4 main emotions: joy, sadness, anger and neutral.

dataset audio

Parallel English speech samples from 177 countries

dataset audio

This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.

dataset audio

347 dialogs with 9,083 system-user exchanges; emotions classified as garbage, non-angry, slightly angry and very angry.

dataset audio

4 speakers, 2,000 recordings (50 of each digit per speaker), English pronunciations.

dataset audio

The DAPS (Device and Produced Speech) dataset is a collection of aligned versions of professionally produced studio speech recordings and recordings of the same speech on common consumer devices (tablet and smartphone) in real-world environments.

dataset audio

Children's Song Dataset is an open-source dataset for singing voice research. It contains 50 Korean and 50 English songs sung by one Korean female professional pop singer.

dataset audio

kinkusuma / emo-db

Updated 2 years ago

800 recordings spoken by 10 actors (5 male and 5 female); 7 emotions: anger, neutral, fear, boredom, happiness, sadness, disgust.

dataset audio