Item: A new word-based compression model allowing compressed pattern matching (Tubitak Scientific & Technical Research Council Turkey, 2017) Buluş, Halil Nusret; Carus, Aydın; Mesut, Altan
This study introduces a new semistatic data compression model that has a fast coding process and allows compressed pattern matching (CPM). The proposed model is named the tagged word-based compression algorithm (TWBCA) because both its coding and its compressed matching are word based. The model has two phases. In the first phase, a dictionary is constructed by adding phrases while respecting word boundaries; in the second phase, the text is compressed using the codewords of the phrases in this dictionary. The first byte of the codeword determines whether the word is compressed or not. Because of this rule, the CPM process can be conducted on a word basis. In addition, the proposed method makes it possible to search for groups of consecutively compressed words. Any existing pattern matching algorithm can be used as a black box for compressed pattern matching. The CPM process always takes less time than the same process on texts coded by the Gzip tool. When matching longer patterns, compressed pattern matching on texts coded by our approach takes more time than on texts coded by compress and end-tagged dense code (ETDC). However, searching for shorter patterns takes less time on texts coded by our approach than on texts compressed with compress. Moreover, our algorithm achieves a better compression ratio than ETDC only on a file written in Turkish.
The compression performance of TWBCA is stable and varies by no more than 6% across different text files.

Item: Analyzing the Performance Differences Between Pattern Matching and Compressed Pattern Matching on Texts (IEEE, 2013) Erdoğan, Cihat; Buluş, Halil Nusret; Diri, Banu
This study compares the statistics of pattern matching on text data with those of compressed pattern matching on the compressed form of the same data. A new application was developed to count the number of character comparisons in compressed and uncompressed texts separately. The paper also presents a new text compression algorithm that allows compressed pattern matching with unmodified classical pattern matching algorithms. The presented compression algorithm, based on digram and trigram substitution, gives a compression factor of about 30-35%, and compressed pattern matching on the compressed text was measured to take less time than pattern matching on the uncompressed text. It is also confirmed that compressed pattern matching performs fewer character comparisons on compressed texts than pattern matching does on uncompressed texts. Thus, the aim of the developed compression algorithm is to point out the difference in text processing between compressed and uncompressed text and to inform other applications.

Item: Automatically Discovering Relevant Images From Web Pages (Ieee-Inst Electrical Electronics Engineers Inc, 2020) Uzun, Erdinç; Ozhan, Erkan; Agun, Hayri Volkan; Yerlikaya, Tarık; Buluş, Halil Nusret
Web pages contain irrelevant images along with relevant images. The classification of these images is an error-prone process due to the number of design variations of web pages. Using multiple web pages provides additional features that improve the performance of relevant image extraction. Traditional studies use the features extracted from a single web page.
However, in this study, we enhance the performance of relevant image extraction by employing the features extracted from different web pages consisting of standard news, galleries, video pages, and link pages. The dataset obtained from these web pages contains 100 different web pages for each of 200 online news websites from 58 different countries. For discovering relevant images, the most straightforward approach extracts the largest image on the web page. This approach achieves a 0.451 F-Measure score as a baseline. Then, we apply several machine learning methods using features in this dataset to find the most suitable machine learning method. The best F-Measure score is 0.822 using the AdaBoost classifier. Some of these features have been utilized in previous web data extraction studies. To the best of our knowledge, 15 new features are proposed for the first time in this study for discovering the relevant images. We compare the performance of the AdaBoost classifier on different feature sets. The proposed features improve the F-Measure by 35 percent. Moreover, the cache feature alone, which is the most prominent feature, accounts for 7 percent of this improvement.

Item: Mikrodalga Bantlı Kurutucunun Gıda Kurutmada Kullanılabilirliği ve Modellenmesi [Usability and Modelling of a Microwave Belt Dryer in Food Drying] (Namık Kemal Üniversitesi, Ziraat Fakültesi, 2016) Çelen, Soner; Buluş, Halil Nusret; Moralar, Aytaç; Haksever, Ayşen; Özsoy, Erhan
In this study, the usability of the microwave belt dryer, a new technology, for food products was investigated on potato samples sliced to 1 mm, 2 mm, and 3 mm. Drying was carried out at powers of 1500 W and 2100 W and at belt speeds of 0.175, 0.210, and 0.245 m/min. Drying time, energy consumption, and color criteria were considered in assessing usability. Drying continued until the potato, with an initial moisture content of 80±1% on a wet basis, reached a final value of 12±0.5%. The measured moisture values were fitted to twelve drying models from the literature. Based on the correlation coefficient (r), the root mean square error (RMSE), and the χ² values, the Modified Henderson and Pabis model was found to be the most suitable. In terms of drying time, the shortest drying was observed for 1 mm slices at 1500 W microwave power and 0.175 m/min belt speed, while the lowest energy consumption, 1.121 kWh, was measured at 2100 W and 0.175 m/min. According to the color criterion, regarded as one of the quality criteria for dried foods, the total color change (ΔE) closest to the color of fresh potato was found for 1 mm slice thickness at 1500 W and 0.175 m/min.

Item: Object-Based FlowChart Drawing Library (IEEE, 2017) Uzun, Erdinç; Buluş, Halil Nusret
While flow charts are one of the best ways to describe a computer program, drawing them is a laborious task for developers. This study describes an open source JavaScript library named obfc.js, which we have developed to facilitate this task. This library generates SVG output on the client side for modern browsers and allows easy creation of diagrams and links. Moreover, it allows you to link click events to objects and links. This library will allow you to design sophisticated flow charts using a very small amount of text data instead of large image or SVG data.

Item: Veritabanı Tasarımının Yazılım Performansına Etkisi: Normalizasyona karşı Denormalizasyon [The Effect of Database Design on Software Performance: Normalization versus Denormalization] (2018) Uzun, Erdinç; Buluş, Halil Nusret; Erdoğan, Ahmet Cihat
One of the most important factors affecting software performance is the improvements that can be made in database design. Normalization, which comes from relational database theory, is frequently used in database design. However, as the amount of data grows, performance problems caused by normalization begin to appear. To eliminate these problems, denormalization, which has no established theory, is used. In this study, a performance-enhancing database design is introduced for a survey application, and the performance gain of this design is examined on three relational database management systems: MySQL, PostgreSQL, and Oracle. In addition, when to move to NoSQL, one of today's popular database systems, is explained through the CAP theorem, and the place of normalization and denormalization within this theorem is indicated.
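
To make the normalization versus denormalization trade-off in the last record concrete, the following is a minimal sketch and not the authors' design. It assumes a hypothetical three-table survey schema (`question`, `option_`, `answer`) and a hypothetical flattened table (`answer_flat`), and it uses SQLite through Python's standard `sqlite3` module in place of MySQL, PostgreSQL, or Oracle so that it runs without a server. It only shows that the same report can be produced with joins or from redundant columns; it does not measure performance.

```python
import sqlite3


def build_demo():
    """Create an in-memory database holding both designs (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Normalized design: each fact is stored once and joined at query time.
        CREATE TABLE question (id INTEGER PRIMARY KEY, text TEXT);
        CREATE TABLE option_ (id INTEGER PRIMARY KEY,
                              question_id INTEGER REFERENCES question(id),
                              label TEXT);
        CREATE TABLE answer (id INTEGER PRIMARY KEY,
                             option_id INTEGER REFERENCES option_(id));
        -- Denormalized design: question text and option label are copied
        -- into every answer row, so the report needs no joins.
        CREATE TABLE answer_flat (id INTEGER PRIMARY KEY,
                                  question_text TEXT,
                                  option_label TEXT);
    """)
    con.execute("INSERT INTO question VALUES (1, 'Favourite DBMS?')")
    con.executemany("INSERT INTO option_ VALUES (?, 1, ?)",
                    [(1, "MySQL"), (2, "PostgreSQL"), (3, "Oracle")])
    for option_id in [1, 2, 2, 3, 2]:
        con.execute("INSERT INTO answer (option_id) VALUES (?)", (option_id,))
    # Fill the flat table from the normalized data (the redundancy is deliberate).
    con.execute("""
        INSERT INTO answer_flat (question_text, option_label)
        SELECT q.text, o.label
        FROM answer a
        JOIN option_ o ON o.id = a.option_id
        JOIN question q ON q.id = o.question_id
    """)
    return con


def counts_normalized(con):
    """Answer counts per option, computed with two joins."""
    return con.execute("""
        SELECT o.label, COUNT(*)
        FROM answer a
        JOIN option_ o ON o.id = a.option_id
        JOIN question q ON q.id = o.question_id
        WHERE q.text = 'Favourite DBMS?'
        GROUP BY o.label ORDER BY o.label
    """).fetchall()


def counts_denormalized(con):
    """The same report read from the flat table without any join."""
    return con.execute("""
        SELECT option_label, COUNT(*)
        FROM answer_flat
        WHERE question_text = 'Favourite DBMS?'
        GROUP BY option_label ORDER BY option_label
    """).fetchall()
```

Calling `build_demo()` and then both query functions returns the same rows from either design. The cost of the flat table is that the question text and option label are duplicated and must be kept in sync whenever they change.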