lj-speech-dataset Dataset for Machine Learning

Install DagsHub:

pip install dagshub

To stream this data directly from DagsHub:

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/lj-speech-dataset")

fs.listdir("wavs")
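The listing returned by fs.listdir("wavs") can be narrowed down to the audio clips before further processing. The following is a minimal sketch, not part of the DagsHub API: the helper name wav_clips and the sample file names are assumptions made for illustration, and the sample list stands in for a real listing so the snippet runs without network access.

```python
from typing import Iterable, List


def wav_clips(names: Iterable[str]) -> List[str]:
    """Return only the .wav entries from a directory listing, sorted by name."""
    return sorted(name for name in names if name.lower().endswith(".wav"))


# Hypothetical listing standing in for the result of fs.listdir("wavs").
# The file names here are illustrative, not taken from the repository.
sample_listing = ["LJ001-0002.wav", "notes.txt", "LJ001-0001.wav"]

clips = wav_clips(sample_listing)
print(clips)
```

With a live connection, passing fs.listdir("wavs") in place of sample_listing should give the same kind of filtered, sorted list.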

Description

This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.
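As a quick sanity check on the figures above, the average clip length follows from the clip count and the total duration. This sketch uses only the numbers stated in the description; the total of 24 hours is approximate, so the result is approximate too. The function name average_clip_seconds is an illustrative choice.

```python
CLIP_COUNT = 13_100   # number of clips, from the description
TOTAL_HOURS = 24      # approximate total duration, from the description


def average_clip_seconds(total_hours: float, clip_count: int) -> float:
    """Average clip duration in seconds, given total hours and clip count."""
    return total_hours * 3600 / clip_count


avg = average_clip_seconds(TOTAL_HOURS, CLIP_COUNT)
print(f"About {avg:.1f} seconds per clip on average")
```

The result falls inside the stated 1 to 10 second range, which is consistent with the description.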

Additional information

Related datasets

CREMA-D

daps-dataset

UrbanSounds

UrbanSound8K-Labeled Urban Sound Excerpts Dataset
