This dataset is uploaded to a DAGsHub repository
If you use the data in a published academic work, we would appreciate it if you cited the following article:
Ardila, R., Branson, M., Davis, K., Henretty, M., Kohler, M., Meyer, J., Morais, R., Saunders, L., Tyers, F. M. and Weber, G. (2020) "Common Voice: A Massively-Multilingual Speech Corpus". Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). pp. 4211–4215
This dataset is released under the Mozilla Public License (MPL), version 2.0.
Here is a brief description of what is included in the Common Voice audio data:
An open-source, multi-language dataset of voices that anyone can use to train speech-enabled applications.
Each entry in the dataset consists of a unique MP3 and a corresponding text file. Many of the 13,905 recorded hours in the dataset also include demographic metadata such as age, sex, and accent, which can help improve the accuracy of speech recognition engines.
The dataset currently consists of 11,192 validated hours in 76 languages, but we’re always adding more voices and languages.
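Since each entry pairs an audio file with its text and, for many clips, demographic metadata, one common first step is to select the clips that carry that metadata. The sketch below is only an illustration: it assumes the clip listing is tab-separated and has columns named `path`, `sentence`, `age`, `gender`, and `accent`. Those names, the helper `clips_with_demographics`, and the sample rows are assumptions made for this example, so check the header of the file you actually downloaded.

```python
import csv
import io

# Assumed column names for the demographic fields; verify against your file.
DEMOGRAPHIC_FIELDS = ("age", "gender", "accent")


def clips_with_demographics(tsv_text):
    """Return the rows that have at least one non-empty demographic field."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        row
        for row in reader
        if any((row.get(field) or "").strip() for field in DEMOGRAPHIC_FIELDS)
    ]


# Made-up sample rows for illustration; not real data from the corpus.
SAMPLE = (
    "path\tsentence\tage\tgender\taccent\n"
    "clip_0001.mp3\tHello world.\ttwenties\tfemale\t\n"
    "clip_0002.mp3\tGood morning.\t\t\t\n"
)

if __name__ == "__main__":
    for row in clips_with_demographics(SAMPLE):
        print(row["path"], row["sentence"])
```

To use it on a real listing, read the file's contents as UTF-8 text and pass the string to `clips_with_demographics`.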
The official website for Mozilla Common Voice (you can download the uncompressed dataset and past/newer ones here)
The DAGsHub Repository (this repository is at Common-Voice-Corpus 7.0, version en_2637h_2021-07-21)
This open source contribution is part of DagsHub x Hacktoberfest