TuPyE, an enhanced version of TuPy, is a collection of 43,668 annotated documents selected for hate speech detection across diverse social network contexts. It adds supplementary annotations and merges datasets from Fortuna et al. (2019), Leite et al. (2020), and Vargas et al. (2022) with 10,000 original documents from the TuPy-Dataset.
Because annotated data in Portuguese is scarce compared with English, TuPyE aims to expand and enhance existing datasets. This supports the development of hate speech detection models that use machine learning (ML) and natural language processing (NLP) techniques.