The E-MASAC dataset is a collection of code-mixed Hindi-English conversations sourced from an Indian TV series. It is derived from the MASAC dataset and annotated specifically for Emotion Recognition in Conversations (ERC). The dataset comprises 8,607 dialogues with 11,440 utterances and includes instances of sarcasm and humor. Three linguists fluent in both English and Hindi annotated each utterance with an emotion label, such as anger, fear, joy, sadness, surprise, contempt, or neutral, reaching an inter-annotator agreement of 0.85.