The dataset is taken from the First shared task on Information Extractor for Conversational Systems in Indian Languages (IECSIL) . It consists of 15,48,570 Hindi words in Devanagari script and corresponding NER labels. Each sentence end is marked by \newline" tag. Fig. 1 shows a snapshot of one sentence in the dataset. Our Dataset has nine classes, namely, Datenum, Event, Location, Name, Number, Occupation, Organization, Other, Things.
Image taken from paper: https://www.researchgate.net/publication/349190662_Analysis_of_Contextual_and_Non-contextual_Word_Embedding_Models_for_Hindi_NER_with_Web_Application_for_Data_Collection
Paper | Code | Results | Date | Stars |
---|