michizhou

Flickr-Audio-Caption-Corpus

Updated 2 years ago

The Flickr 8k Audio Caption Corpus contains 40,000 spoken captions of 8,000 natural images. It was collected in 2015 to investigate multimodal learning schemes for unsupervised speech pattern discovery.

dataset audio

CMU-MOSI

Updated 2 years ago

CMU Multimodal Opinion Sentiment Intensity (CMU-MOSI) is a dataset of opinion-level sentiment intensity in online videos. It contains 2,199 opinion utterances, each annotated for sentiment on a seven-step Likert scale ranging from very negative to very positive.

dataset audio