Biological gender identification in Turkish news text using deep learning models

dc.authorid: BEKTAS KOSESOY, Melike/0000-0002-1944-1928
dc.authorid: TUFEKCI, PINAR/0000-0003-4842-2635
dc.contributor.author: Tufekci, Pinar
dc.contributor.author: Kosesoy, Melike Bektas
dc.date.accessioned: 2024-10-29T17:58:20Z
dc.date.available: 2024-10-29T17:58:20Z
dc.date.issued: 2023
dc.department: Tekirdağ Namık Kemal Üniversitesi
dc.description.abstract: Identifying the biological gender of authors based on the content of their written work is a crucial task in Natural Language Processing (NLP). Accurate biological gender identification finds numerous applications in fields such as linguistics, sociology, and marketing. However, achieving high accuracy in identifying the biological gender of the author is heavily dependent on the quality of the collected data and its proper splitting. Therefore, determining the best-performing model necessitates experimental evaluation. This study aimed to develop and evaluate four learning algorithms for biological gender identification in news texts. To this end, a comprehensive dataset, IAG-TNKU, was created from a Turkish newspaper, comprising 43,292 news articles. Four models were developed and evaluated rigorously: two using popular machine learning algorithms (Naive Bayes and Random Forest) and two using deep learning algorithms (Long Short-Term Memory and Convolutional Neural Networks). The results indicated that the Long Short-Term Memory (LSTM) model outperformed the other three models, reaching an accuracy of 88.51%. This model's performance underscores the importance of utilizing deep learning algorithms for biological gender identification tasks in NLP. The present study contributes to the extant literature by developing a new dataset for biological gender identification in news texts and evaluating four learning algorithms. Our findings highlight the significance of utilizing innovative techniques for biological gender identification tasks. The dataset and deep learning algorithm can be applied in many areas such as sociolinguistics, marketing research, and journalism, where the identification of biological gender in written content plays a pivotal role.
dc.identifier.doi: 10.1007/s11042-023-17622-w
dc.identifier.issn: 1380-7501
dc.identifier.issn: 1573-7721
dc.identifier.scopus: 2-s2.0-85176121911
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1007/s11042-023-17622-w
dc.identifier.uri: https://hdl.handle.net/20.500.11776/14236
dc.identifier.wos: WOS:001101114600014
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Springer
dc.relation.ispartof: Multimedia Tools and Applications
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Biological Gender Identification
dc.subject: Text Classification
dc.subject: Turkish News Text
dc.subject: Machine Learning
dc.subject: Deep Learning
dc.title: Biological gender identification in Turkish news text using deep learning models
dc.type: Article
