ESC-50: Dataset for Environmental Sound Classification

The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification.

The dataset consists of 5-second-long recordings organized into 50 semantic classes (with 40 examples per class) loosely arranged into 5 major categories:

| Animals | Natural soundscapes & water sounds | Human, non-speech sounds | Interior/domestic sounds | Exterior/urban noises |
|---|---|---|---|---|
| Dog | Rain | Crying baby | Door knock | Helicopter |
| Rooster | Sea waves | Sneezing | Mouse click | Chainsaw |
| Pig | Crackling fire | Clapping | Keyboard typing | Siren |
| Cow | Crickets | Breathing | Door, wood creaks | Car horn |
| Frog | Chirping birds | Coughing | Can opening | Engine |
| Cat | Water drops | Footsteps | Washing machine | Train |
| Hen | Wind | Laughing | Vacuum cleaner | Church bells |
| Insects (flying) | Pouring water | Brushing teeth | Clock alarm | Airplane |
| Sheep | Toilet flush | Snoring | Clock tick | Fireworks |
| Crow | Thunderstorm | Drinking, sipping | Glass breaking | Hand saw |

Clips in this dataset have been manually extracted from public field recordings gathered by the Freesound.org project. The dataset has been prearranged into 5 folds for comparable cross-validation, making sure that fragments from the same original source file are contained in a single fold.
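The predefined folds lend themselves to leave-one-fold-out cross-validation. The following is a minimal sketch in Python, assuming clip metadata is already available as (filename, fold) pairs; the helper name `fold_splits` and the sample filenames are hypothetical and not part of the dataset tooling.

```python
from typing import Iterator, List, Sequence, Tuple

Clip = Tuple[str, int]  # (filename, fold index as given by the dataset)


def fold_splits(clips: Sequence[Clip]) -> Iterator[Tuple[int, List[str], List[str]]]:
    """Yield (held_out_fold, train_files, test_files) for each predefined fold.

    Folds are taken from the data itself, so no fold numbering is assumed.
    """
    for held_out in sorted({fold for _, fold in clips}):
        train = [name for name, fold in clips if fold != held_out]
        test = [name for name, fold in clips if fold == held_out]
        yield held_out, train, test


# Made-up records for illustration only.
example_clips = [
    ("clip_a.wav", 1),
    ("clip_b.wav", 1),
    ("clip_c.wav", 2),
    ("clip_d.wav", 3),
]

for held_out, train_files, test_files in fold_splits(example_clips):
    print(held_out, len(train_files), len(test_files))
```

Keeping the predefined fold assignment, rather than reshuffling clips at random, preserves the property that fragments of one source recording never appear in both the training and test sets.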

A more thorough description of the dataset is available in the original paper with some supplementary materials on GitHub: ESC: Dataset for Environmental Sound Classification - paper replication data.

Content

  • audio/*.wav

    2000 audio recordings in WAV format (5 seconds, 44.1 kHz, mono) with the following naming convention:

    {FOLD}-{CLIP_ID}-{TAKE}-{TARGET}.wav

    • {FOLD} - index of the cross-validation fold,
    • {CLIP_ID} - ID of the original Freesound clip,
    • {TAKE} - letter disambiguating between different fragments from the same Freesound clip,
    • {TARGET} - class in numeric format [0, 49].
  • esc50.csv

    CSV file with the following structure:

    | filename | fold | target | category | esc10 | src_file | take |

The esc10 column indicates if a given file belongs to the ESC-10 subset (10 selected classes, CC BY license).
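The naming convention and the CSV columns described above can be read with the Python standard library. This is a rough sketch under several assumptions: the CSV is comma-separated with a header row, the `esc10` column holds the text `True` or `False`, and the sample rows are made up for illustration. The names `parse_clip_name` and `esc10_filenames` are hypothetical helpers.

```python
import csv
import io
import re
from typing import List, NamedTuple

# {FOLD}-{CLIP_ID}-{TAKE}-{TARGET}.wav
FILENAME_RE = re.compile(
    r"^(?P<fold>\d+)-(?P<clip_id>[^-]+)-(?P<take>[A-Za-z]+)-(?P<target>\d+)\.wav$"
)


class ClipName(NamedTuple):
    fold: int
    clip_id: str  # kept as text; the Freesound ID is only an identifier here
    take: str
    target: int


def parse_clip_name(filename: str) -> ClipName:
    """Split a clip filename into its four documented components."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    target = int(match["target"])
    if not 0 <= target <= 49:
        raise ValueError(f"target out of range [0, 49]: {target}")
    return ClipName(int(match["fold"]), match["clip_id"], match["take"], target)


def esc10_filenames(csv_text: str) -> List[str]:
    """Return filenames whose esc10 flag is set (assumes 'True'/'False' text)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["filename"] for row in reader if row["esc10"].strip().lower() == "true"]


# Made-up rows that follow the documented column order.
SAMPLE_CSV = """filename,fold,target,category,esc10,src_file,take
1-1000-A-0.wav,1,0,dog,True,1000,A
2-2000-B-36.wav,2,36,vacuum_cleaner,False,2000,B
"""

print(parse_clip_name("1-1000-A-0.wav"))
print(esc10_filenames(SAMPLE_CSV))
```

For the real file, the same function can be fed the contents of `esc50.csv` read from disk.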

License

The dataset is available under the terms of the Creative Commons Attribution Non-Commercial license.

A smaller subset (clips tagged as ESC-10) is distributed under CC BY (Attribution).

Attributions for each clip are available in the LICENSE file.

Citing

If you find this dataset useful in an academic setting, please cite:

K. J. Piczak. ESC: Dataset for Environmental Sound Classification. Proceedings of the 23rd Annual ACM Conference on Multimedia, Brisbane, Australia, 2015.

[DOI: http://dx.doi.org/10.1145/2733373.2806390]

@inproceedings{piczak2015dataset,
    title = {{ESC}: {Dataset} for {Environmental Sound Classification}},
    author = {Piczak, Karol J.},
    booktitle = {Proceedings of the 23rd {Annual ACM Conference} on {Multimedia}},
    date = {2015-10-13},
    url = {http://dl.acm.org/citation.cfm?doid=2733373.2806390},
    doi = {10.1145/2733373.2806390},
    location = {{Brisbane, Australia}},
    isbn = {978-1-4503-3459-4},
    publisher = {{ACM Press}},
    pages = {1015--1018}
}

Caveats

Please be aware of potential information leakage while training models on ESC-50, as some of the original Freesound recordings were already preprocessed in a manner that might be class dependent (mostly bandlimiting). Unfortunately, this issue went unnoticed when creating the original version of the dataset. Due to the number of methods already evaluated on ESC-50, no changes rectifying this issue will be made in order to preserve comparability.
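One rough way to look for bandlimited clips is to measure how much spectral energy a short frame carries above some cutoff frequency. The sketch below is an illustration only and is not the procedure used by the dataset author. The function name, the 8 kHz cutoff and the synthetic test tones are all assumptions, and the naive DFT is only practical for short frames. Reading actual WAV files is left out.

```python
import cmath
import math
from typing import List, Sequence


def high_band_energy_fraction(
    samples: Sequence[float], sample_rate: int, cutoff_hz: float
) -> float:
    """Fraction of spectral energy at or above cutoff_hz in one short frame."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # A Hann window reduces spectral leakage between frequency bins.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins, DC skipped
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed)
        )
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0


def sine_frame(freq_hz: float, sample_rate: int = 44100, n: int = 256) -> List[float]:
    """Synthetic test tone standing in for a frame of real audio."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


low_tone = high_band_energy_fraction(sine_frame(1000.0), 44100, 8000.0)
high_tone = high_band_energy_fraction(sine_frame(15000.0), 44100, 8000.0)
print(low_tone < 0.01, high_tone > 0.9)
```

A clip whose frames consistently show a near-zero fraction above the cutoff may have been bandlimited before upload. If such clips cluster in particular classes, a model could learn the preprocessing instead of the sound itself.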

This dataset was cloned from the original repository: @karolpiczak/ESC-50
