A new approach for determining SARS-CoV-2 epitopes using machine learning-based in silico methods

dc.authorscopusid: 56539994200
dc.authorscopusid: 55808009200
dc.contributor.author: Cihan, Pınar
dc.contributor.author: Özger, Zeynep Banu
dc.date.accessioned: 2023-04-20T08:04:16Z
dc.date.available: 2023-04-20T08:04:16Z
dc.date.issued: 2022
dc.department: Faculties, Çorlu Faculty of Engineering, Department of Computer Engineering
dc.description.abstract: The emergence of machine learning-based in silico tools has enabled rapid and high-quality predictions in the biomedical field. During the COVID-19 pandemic, machine learning methods have been applied to many tasks, such as predicting patient mortality, modeling the spread of infection, determining future effects, diagnosis with medical image analysis, and forecasting the vaccination rate. However, there is a gap in the literature on using machine learning methods and bioinformatics tools to identify epitopes for fast, useful, and effective vaccine design. Machine learning methods can give medical biotechnologists an advantage in designing a faster and more successful vaccine. The motivation of this study is to propose a successful hybrid machine learning method for SARS-CoV-2 epitope prediction and, using bioinformatics tools, to identify nonallergenic, nontoxic, antigenic peptides among the predicted epitopes that can be used in vaccine design. The identified epitopes will be effective not only in the design of the COVID-19 vaccine but also against viruses from the SARS family that may be encountered in the future. For this purpose, the epitope prediction performances of random forest, support vector machine, logistic regression, bagging with decision tree, k-nearest neighbor, and decision tree methods were examined. Because epitope class samples were in the minority compared to the nonepitope class in the SARS-CoV and B-cell datasets used for training, epitope prediction was repeated after the datasets were balanced with the synthetic minority oversampling technique (SMOTE). The experimental results were compared, and the most successful predictions were obtained with the random forest (RF) method.
The epitope prediction performance on the balanced datasets was higher than on the original datasets (94.0% AUC and 94.4% PRC for the SMOTE-SARS-CoV dataset; 95.6% AUC and 95.3% PRC for the SMOTE-B-cell dataset). In this study, 252 of 20312 peptides were determined to be epitopes with the SMOTE-RF-SVM hybrid method proposed for SARS-CoV-2 epitope prediction. The determined epitopes were analyzed with the AllerTOP 2.0, VaxiJen 2.0, and ToxinPred tools, and allergenic, nonantigenic, and toxic epitopes were eliminated. As a result, 11 possible nonallergenic, highly antigenic, and nontoxic epitope candidates were proposed that could be used in protein-based COVID-19 vaccine design (“VGGNYNY”, “VNFNFNGLTG”, “RQIAPGQTGKI”, “QIAPGQTGKIA”, “SYECDIPIGAGI”, “STFKCYGVSPTKL”, “GVVFLHVTYVPAQ”, “KNHTSPDVDLGDI”, “NHTSPDVDLGDIS”, “AGAAAYYVGYLQPR”, “KKSTNLVKNKCVNF”). It is predicted that the small set of epitopes determined by machine learning-based in silico methods will help biotechnologists design vaccines quickly and accurately by reducing the number of trials in the laboratory environment. © 2022 Elsevier Ltd
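The class-balancing step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the toy feature vectors are assumptions made only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors (NOT taken from the study's datasets).
epitope = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [0.95, 0.85]]
non_epitope = [[0.1 * i, 0.05 * i] for i in range(10)]

# Oversample the minority (epitope) class until both classes have equal size.
extra = smote(epitope, n_synthetic=len(non_epitope) - len(epitope), k=3)
balanced_epitope = epitope + extra
print(len(balanced_epitope), len(non_epitope))  # prints: 10 10
```

In practice a maintained library implementation would normally be preferred over a hand-written one; the sketch only shows why the synthetic samples stay inside the region already occupied by the minority class.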
dc.description.sponsorship: Türkiye Bilimsel ve Teknolojik Araştirma Kurumu, TÜBITAK: 121E326
dc.description.sponsorship: This study was supported by Turkish Scientific and Technical Research Council, Turkey-TÜBİTAK (Project Number: 121E326).
dc.identifier.doi: 10.1016/j.compbiolchem.2022.107688
dc.identifier.issn: 1476-9271
dc.identifier.pmid: 35561658
dc.identifier.scopus: 2-s2.0-85129965508
dc.identifier.scopusquality: Q2
dc.identifier.uri: https://doi.org/10.1016/j.compbiolchem.2022.107688
dc.identifier.uri: https://hdl.handle.net/20.500.11776/11065
dc.identifier.volume: 98
dc.identifier.wos: WOS:000942459600002
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: PubMed
dc.institutionauthor: Cihan, Pınar
dc.language.iso: en
dc.publisher: Elsevier Ltd
dc.relation.ispartof: Computational Biology and Chemistry
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: B-cell
dc.subject: In silico
dc.subject: Machine learning
dc.subject: SARS-CoV
dc.subject: SARS-CoV-2
dc.subject: Vaccine design
dc.subject: Bioinformatics
dc.subject: Cytology
dc.subject: Decision trees
dc.subject: Diagnosis
dc.subject: Diseases
dc.subject: Epitopes
dc.subject: Forecasting
dc.subject: Medical imaging
dc.subject: Nearest neighbor search
dc.subject: Peptides
dc.subject: Support vector machines
dc.subject: Vaccines
dc.subject: B cells
dc.subject: Bioinformatic tools
dc.subject: Epitope predictions
dc.subject: In-silico
dc.subject: Machine learning methods
dc.subject: Prediction performance
dc.subject: Synthetic minority over-sampling techniques
dc.subject: SARS
dc.subject: epitope
dc.subject: peptide
dc.subject: vaccine
dc.subject: diagnosis
dc.subject: human
dc.subject: machine learning
dc.subject: pandemic
dc.subject: COVID-19
dc.subject: COVID-19 Vaccines
dc.subject: Epitopes, B-Lymphocyte
dc.subject: Epitopes, T-Lymphocyte
dc.subject: Humans
dc.subject: Machine Learning
dc.subject: Pandemics
dc.title: A new approach for determining SARS-CoV-2 epitopes using machine learning-based in silico methods
dc.type: Article

Files

Original bundle
Name: 11065.pdf
Size: 1.26 MB
Format: Adobe Portable Document Format
Description: Full Text