Author and genre identification of Turkish news texts using deep learning algorithms

Authors: Tüfekçi, Pınar; Bektaş, Melike
Dates: 2023-04-20; 2023-04-20; 2022
ISSN: 0256-2499; 0973-7677
DOI: https://doi.org/10.1007/s12046-022-01975-3
Handle: https://hdl.handle.net/20.500.11776/10987

Abstract: The growing volume of data has created a need to classify it. Text classification is the process of grouping similar text data into categories. This paper presents a modeling study of author and genre identification, one of the important challenges in text classification, for Turkish news texts using machine learning and deep learning algorithms. To this end, a total of 13 new large-scale, multi-class datasets were first built. In the modeling stage, the Multinomial Naive Bayes (MNB), Random Forest (RF), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) algorithms were applied to the datasets. For author identification, the CNN algorithm achieved the highest accuracy, 95.81%, on the AI-TNKU-7 dataset. For genre identification, the LSTM algorithm achieved the highest accuracy, 96.73%, on the GI-TNKU-6 dataset.

Language: en
Access: info:eu-repo/semantics/closedAccess
Keywords: Author Identification; Genre Identification; Deep Learning; Text Classification; Turkish News Datasets; Machine Learning; Categorization
Type: Article
Volume: 47
Issue: 4
Quartile: Q3
Web of Science ID: WOS:000855450800001
Scopus ID: 2-s2.0-85138352526
Quartile: Q2