Please use this identifier to cite or link to this item:
https://dspace.univ-ouargla.dz/jspui/handle/123456789/40046

| Title: | ALG-Misogyny: Dataset Creation for Misogyny Detection in Algerian Dialect |
| Authors: | Toumi, Chahrazad; Bouafia, Rania; Naam, Lidya |
| Keywords: | Misogyny; Algerian dialect; Hate speech; Social media; YouTube comments |
| Issue Date: | 2025 |
| Publisher: | UNIVERSITY OF KASDI MERBAH OUARGLA |
| Citation: | FACULTY OF NEW TECHNOLOGIES OF INFORMATION AND COMMUNICATION |
| Abstract: | Social media platforms have become central to global communication but are increasingly exploited for hate speech and antisocial behavior, including misogyny and harassment. Addressing this issue requires robust tools for detecting harmful content, particularly in underrepresented languages and dialects that lack adequate computational resources. Algerian Arabic, a linguistically rich but low-resource dialect, exemplifies this gap. In this work, we present ALG-Misogyny, the first annotated dataset of misogynistic YouTube comments in Algerian Arabic, designed to enable machine learning models to detect misogynistic content. We collected and manually labeled thousands of comments from diverse Algerian YouTube channels (e.g., cooking, entertainment, news), categorizing them as misogynistic or non-misogynistic. To validate the dataset’s utility, we evaluated multiple deep learning models, including LSTM and Bidirectional LSTM (BiLSTM) architectures. Our experiments demonstrate that the BiLSTM model achieves superior performance (F1-score: 0.89) compared to traditional LSTM (F1-score: 0.82). |
| Description: | Industrial Computing |
| URI: | https://dspace.univ-ouargla.dz/jspui/handle/123456789/40046 |
| Appears in Collections: | Département d'informatique et technologie de l'information - Master |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| BOUAFIA-NAAM.pdf | Industrial Computing | 1.1 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.