Text classification of web based news articles by using Turkish grammatical features
dc.authorscopusid | 11539603200 | |
dc.authorscopusid | 54783608800 | |
dc.authorscopusid | 55292742900 | |
dc.contributor.author | Tüfekçi, Pınar | |
dc.contributor.author | Uzun, Erdinç | |
dc.contributor.author | Sevinç, Burak | |
dc.date.accessioned | 2022-05-11T14:15:46Z | |
dc.date.available | 2022-05-11T14:15:46Z | |
dc.date.issued | 2012 | |
dc.department | Faculties, Çorlu Faculty of Engineering, Department of Computer Engineering | |
dc.description | 2012 20th Signal Processing and Communications Applications Conference, SIU 2012 -- 18 April 2012 through 20 April 2012 -- Fethiye, Mugla -- 90786 | |
dc.description.abstract | The dimension of the feature vectors used by classification methods in the literature directly affects time performance. This study explains how to reduce the dimension of the feature vector by using Turkish grammar rules without compromising success rates. Word stems are selected as features, and the feature vector is weighted on the basis of word frequency. During this selection, the effects on classification of choosing word stems of different lengths and types are investigated; the success rate is highest when noun-type word stems of maximum length are selected as features. When this selection is combined with other dimension-reduction methods, the dimension of the feature vector is decreased to 97.46%. With the reduced feature vector, better success rates are generally obtained from the Naive Bayes, SVM, C4.5 and RF classification methods, and the best performance achieved is 92.73%, obtained with the Naive Bayes method. © 2012 IEEE. | |
dc.identifier.doi | 10.1109/SIU.2012.6204565 | |
dc.identifier.isbn | 978-1467300568 | |
dc.identifier.scopus | 2-s2.0-84863443667 | |
dc.identifier.uri | https://doi.org/10.1109/SIU.2012.6204565 | |
dc.identifier.uri | https://hdl.handle.net/20.500.11776/6067 | |
dc.indekslendigikaynak | Scopus | |
dc.institutionauthor | Tüfekçi, Pınar | |
dc.institutionauthor | Uzun, Erdinç | |
dc.institutionauthor | Sevinç, Burak | |
dc.language.iso | tr | |
dc.relation.ispartof | 2012 20th Signal Processing and Communications Applications Conference, SIU 2012, Proceedings | |
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | |
dc.subject | Classification methods | |
dc.subject | Feature vectors | |
dc.subject | Grammar rules | |
dc.subject | Naive bayes | |
dc.subject | News articles | |
dc.subject | Text classification | |
dc.subject | Time performance | |
dc.subject | Turkish |
dc.subject | Web based | |
dc.subject | Word frequencies | |
dc.subject | Signal processing | |
dc.subject | Classifiers | |
dc.title | Text classification of web based news articles by using Turkish grammatical features | |
dc.title.alternative | Türkçe dilbilgisi özelliklerini kullanarak web tabanlı haber metinlerinin sınıflandırılması |
dc.type | Conference Object |
Files
Original bundle
- Name:
- 6067.pdf
- Size:
- 416.52 KB
- Format:
- Adobe Portable Document Format
- Description:
- Full Text