Open Source Data Science Datasets

The Flickr 8k Audio Caption Corpus contains 40,000 spoken captions of 8,000 natural images. It was collected in 2015 to investigate multimodal learning schemes for unsupervised speech pattern discovery.

Russian ASR dataset, see https://github.com/sberdevices/golos

RuneScape Classic sounds

Microsoft Scalable Noisy Speech Dataset (MS-SNSD)

CMU Multimodal Opinion Sentiment Intensity (CMU-MOSI) is a dataset of opinion-level sentiment intensity in online videos. It contains 2,199 opinion utterances, each annotated for sentiment on a seven-step Likert scale from very negative to very positive.

The dataset of the Zero Resource Speech Challenge 2021; see http://www.zerospeech.com/

Att-hack: an expressive speech database with social attitudes

Emotional speech corpus with primary and secondary emotions.

Dataset from OpenSLR; see http://www.openslr.org/99/

Speech Emotion Recognition (SER) is the process of extracting emotional paralinguistic information from speech.

26 text passages read by 10 speakers; 4 main emotions: joy, sadness, anger, and neutral.

Parallel English speech samples from 177 countries

This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.

347 dialogs with 9,083 system-user exchanges; emotions classified as garbage, non-angry, slightly angry and very angry.

4 speakers, 2,000 recordings (50 of each digit per speaker), English pronunciations.

The DAPS (Device and Produced Speech) dataset is a collection of aligned versions of professionally produced studio speech recordings and recordings of the same speech on common consumer devices (tablet and smartphone) in real-world environments.

Children's Song Dataset is an open-source dataset for singing voice research. It contains 50 Korean and 50 English songs sung by one Korean female professional pop singer.

800 recordings spoken by 10 actors (5 male and 5 female); 7 emotions: anger, neutral, fear, boredom, happiness, sadness, and disgust.
