nEMO is a simulated dataset of emotional speech in the Polish language. The corpus contains over 3 hours of samples recorded with the participation of nine actors portraying six emotional states: anger, fear, happiness, sadness, surprise, and a neutral state. The text material used was carefully selected to represent the phonetics of the Polish language. The corpus is available for free under the Creative Commons license (CC BY-NC-SA 4.0).
The dataset is available on Hugging Face and GitHub.
Each sample has the following fields:
file_id - filename, i.e. {speaker_id}_{emotion}_{sentence_id},
audio (audio) - dictionary containing audio array, path and sampling rate (available when accessed via datasets library),
emotion - label corresponding to emotional state,
raw_text - original (orthographic) transcription of the audio,
normalized_text - normalized transcription of the audio,
speaker_id - id of speaker,
gender - gender of the speaker,
age - age of the speaker.
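Because file_id follows the pattern {speaker_id}_{emotion}_{sentence_id}, it can be split back into its parts. The following is a minimal sketch, assuming that none of the three components contains an underscore, which is not confirmed here. The helper name parse_file_id and the example ID are hypothetical and not taken from the dataset.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    speaker_id: str
    emotion: str
    sentence_id: str


def parse_file_id(file_id: str) -> FileId:
    """Split a file_id of the form {speaker_id}_{emotion}_{sentence_id}."""
    parts = file_id.split("_")
    if len(parts) != 3:
        # Fail loudly instead of guessing when the pattern does not match.
        raise ValueError(f"unexpected file_id format: {file_id!r}")
    return FileId(*parts)


# Hypothetical ID, for illustration only.
print(parse_file_id("SPK1_anger_12"))
```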
The nEMO dataset can be loaded and processed using the datasets library:
from datasets import load_dataset
nemo = load_dataset("amu-cai/nEMO", split="train")
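Once loaded, the audio field of each sample can be used directly. As a rough sketch, the helper below computes a clip's duration from the dictionary form described in the field list (keys array and sampling_rate). The name duration_seconds is illustrative, and the stand-in sample is synthetic so that the snippet runs without downloading anything; its sampling rate is arbitrary and says nothing about the real recordings.

```python
def duration_seconds(audio: dict) -> float:
    """Length of a clip in seconds: number of samples / sampling rate."""
    return len(audio["array"]) / audio["sampling_rate"]


# Synthetic stand-in for nemo[0]["audio"]; real values come from the dataset.
fake_audio = {"array": [0.0] * 16000, "sampling_rate": 16000, "path": None}
print(duration_seconds(fake_audio))
```

With the dataset loaded, duration_seconds(nemo[0]["audio"]) would give the length of the first recording, assuming the installed datasets version returns the audio in this dictionary form.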
To work with the nEMO dataset on GitHub, you may clone the repository and access the files directly within the samples folder. Corresponding metadata can be found in the data.tsv file.
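The metadata file can be read with the Python standard library. This is a minimal sketch, assuming data.tsv is tab-separated and has a header row whose column names match the field names listed earlier; neither is confirmed here. The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_emotion(tsv_file: TextIO) -> Counter:
    """Count metadata rows per emotion label in a tab-separated file."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row["emotion"] for row in reader)


# Made-up rows standing in for the contents of data.tsv.
sample = io.StringIO(
    "file_id\temotion\tspeaker_id\n"
    "A_anger_1\tanger\tA\n"
    "A_fear_1\tfear\tA\n"
    "B_anger_1\tanger\tB\n"
)
print(count_by_emotion(sample))
```

For the cloned repository, pass open("data.tsv", encoding="utf-8", newline="") in place of the in-memory sample.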
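The nEMO dataset is provided as a whole, without predefined training and test splits. When loaded with the datasets library, all samples are exposed under a single split named train. This gives researchers and developers the flexibility to create their own splits based on their specific needs.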
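One common choice for emotion recognition is a speaker-independent split, in which no speaker appears in both sets. The sketch below is one possible approach, not a split recommended by the dataset authors; the function name and the speaker IDs are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def speaker_independent_split(
    speaker_ids: Sequence[str], test_speakers: Iterable[str]
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) so that the held-out
    speakers appear only in the test set."""
    held_out = set(test_speakers)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for i, speaker in enumerate(speaker_ids):
        (test_idx if speaker in held_out else train_idx).append(i)
    return train_idx, test_idx


# Made-up speaker IDs, one per sample.
speakers = ["A", "B", "A", "C", "B"]
train_idx, test_idx = speaker_independent_split(speakers, ["B"])
print(train_idx, test_idx)
```

With the Hugging Face dataset, nemo["speaker_id"] can serve as the input, and the resulting indices can be passed to nemo.select(train_idx) and nemo.select(test_idx).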
The dataset is available under the Creative Commons license (CC BY-NC-SA 4.0).
You can access the nEMO paper at arXiv. Please cite the paper when referencing the nEMO dataset as:
@misc{christop2024nemo,
title={nEMO: Dataset of Emotional Speech in Polish},
author={Iwona Christop},
year={2024},
eprint={2404.06292},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Thanks to @iwonachristop for adding this dataset.