The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification.
The dataset consists of 5-second-long recordings organized into 50 semantic classes (with 40 examples per class) loosely arranged into 5 major categories:
Animals | Natural soundscapes & water sounds | Human, non-speech sounds | Interior/domestic sounds | Exterior/urban noises |
---|---|---|---|---|
Dog | Rain | Crying baby | Door knock | Helicopter |
Rooster | Sea waves | Sneezing | Mouse click | Chainsaw |
Pig | Crackling fire | Clapping | Keyboard typing | Siren |
Cow | Crickets | Breathing | Door, wood creaks | Car horn |
Frog | Chirping birds | Coughing | Can opening | Engine |
Cat | Water drops | Footsteps | Washing machine | Train |
Hen | Wind | Laughing | Vacuum cleaner | Church bells |
Insects (flying) | Pouring water | Brushing teeth | Clock alarm | Airplane |
Sheep | Toilet flush | Snoring | Clock tick | Fireworks |
Crow | Thunderstorm | Drinking, sipping | Glass breaking | Hand saw |
Clips in this dataset have been manually extracted from public field recordings gathered by the Freesound.org project. The dataset has been prearranged into 5 folds for comparable cross-validation, making sure that fragments from the same original source file are contained in a single fold.
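The prearranged folds can be turned into cross-validation splits by holding out one fold at a time. The following is a minimal sketch, not part of the original dataset tooling: `fold_splits` is a hypothetical helper, and it assumes you already have one fold label per clip (for example, read from the filenames or the metadata file). It takes the fold labels from the data itself rather than assuming a particular numbering.

```python
from typing import Iterator, List, Sequence, Tuple


def fold_splits(folds: Sequence[int]) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_fold, train_indices, test_indices) for every fold.

    `folds[i]` is the fold label of clip i. Each distinct label is held out
    once as the test set while all remaining clips form the training set.
    """
    for held_out in sorted(set(folds)):
        train = [i for i, fold in enumerate(folds) if fold != held_out]
        test = [i for i, fold in enumerate(folds) if fold == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Toy fold labels for five clips (illustrative values only).
    example_folds = [1, 1, 2, 3, 3]
    for held_out, train, test in fold_splits(example_folds):
        print(f"fold {held_out}: train={train} test={test}")
```

Splitting by the predefined folds, rather than shuffling clips at random, keeps fragments of the same source recording on one side of the split.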
A more thorough description of the dataset is available in the original paper with some supplementary materials on GitHub: ESC: Dataset for Environmental Sound Classification - paper replication data.
2000 audio recordings in WAV format (5 seconds, 44.1 kHz, mono) with the following naming convention:

{FOLD}-{CLIP_ID}-{TAKE}-{TARGET}.wav

- {FOLD} - index of the cross-validation fold,
- {CLIP_ID} - ID of the original Freesound clip,
- {TAKE} - letter disambiguating between different fragments from the same Freesound clip,
- {TARGET} - class in numeric format [0, 49].

CSV file with the following structure:

filename | fold | target | category | esc10 | src_file | take |
---|---|---|---|---|---|---|

The esc10 column indicates if a given file belongs to the ESC-10 subset (10 selected classes, CC BY license).
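The naming convention and the CSV columns described above can be read with the Python standard library. This is a minimal sketch under stated assumptions: `parse_clip_name` and `esc10_filenames` are hypothetical helpers, the clip ID is assumed to be numeric, the `esc10` column is assumed to hold the text `True` or `False`, and the sample rows are illustrative rather than copied from the dataset.

```python
import csv
import io
import re
from typing import List, NamedTuple

# {FOLD}-{CLIP_ID}-{TAKE}-{TARGET}.wav
# Assumption: FOLD, CLIP_ID and TARGET are digits; TAKE is a single letter.
_CLIP_NAME = re.compile(
    r"^(?P<fold>\d+)-(?P<clip_id>\d+)-(?P<take>[A-Za-z])-(?P<target>\d+)\.wav$"
)


class ClipInfo(NamedTuple):
    fold: int
    clip_id: str
    take: str
    target: int


def parse_clip_name(filename: str) -> ClipInfo:
    """Split a clip filename into its fold, clip ID, take and target class."""
    match = _CLIP_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    target = int(match["target"])
    if not 0 <= target <= 49:
        raise ValueError(f"target out of range [0, 49]: {target}")
    return ClipInfo(int(match["fold"]), match["clip_id"], match["take"], target)


def esc10_filenames(csv_text: str) -> List[str]:
    """Return filenames whose esc10 column reads 'True' (assumed encoding)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["filename"] for row in reader
            if row["esc10"].strip().lower() == "true"]


if __name__ == "__main__":
    # Illustrative metadata rows using the documented column order.
    sample = (
        "filename,fold,target,category,esc10,src_file,take\n"
        "1-100032-A-0.wav,1,0,dog,True,100032,A\n"
        "1-100038-A-14.wav,1,14,chirping_birds,False,100038,A\n"
    )
    print(parse_clip_name("1-100032-A-0.wav"))
    print(esc10_filenames(sample))
```

If the metadata file encodes the `esc10` flag differently, only the comparison in `esc10_filenames` needs to change.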
The dataset is available under the terms of the Creative Commons Attribution Non-Commercial license.
A smaller subset (clips tagged as ESC-10) is distributed under CC BY (Attribution).
Attributions for each clip are available in the LICENSE file.
If you find this dataset useful in an academic setting please cite:
K. J. Piczak. ESC: Dataset for Environmental Sound Classification. Proceedings of the 23rd Annual ACM Conference on Multimedia, Brisbane, Australia, 2015.
[DOI: http://dx.doi.org/10.1145/2733373.2806390]

    @inproceedings{piczak2015dataset,
      title = {{ESC}: {Dataset} for {Environmental Sound Classification}},
      author = {Piczak, Karol J.},
      booktitle = {Proceedings of the 23rd {Annual ACM Conference} on {Multimedia}},
      date = {2015-10-13},
      url = {http://dl.acm.org/citation.cfm?doid=2733373.2806390},
      doi = {10.1145/2733373.2806390},
      location = {{Brisbane, Australia}},
      isbn = {978-1-4503-3459-4},
      publisher = {{ACM Press}},
      pages = {1015--1018}
    }
Please be aware of potential information leakage while training models on ESC-50, as some of the original Freesound recordings were already preprocessed in a manner that might be class dependent (mostly bandlimiting). Unfortunately, this issue went unnoticed when creating the original version of the dataset. Due to the number of methods already evaluated on ESC-50, no changes rectifying this issue will be made in order to preserve comparability.
This dataset was cloned from the following repository: @karolpiczak/ESC-50