A new approach for determining SARS-CoV-2 epitopes using machine learning-based in silico methods
dc.authorscopusid | 56539994200 | |
dc.authorscopusid | 55808009200 | |
dc.contributor.author | Cihan, Pınar | |
dc.contributor.author | Özger, Zeynep Banu | |
dc.date.accessioned | 2023-04-20T08:04:16Z | |
dc.date.available | 2023-04-20T08:04:16Z | |
dc.date.issued | 2022 | |
dc.department | Faculties, Çorlu Faculty of Engineering, Department of Computer Engineering | |
dc.description.abstract | The emergence of machine learning-based in silico tools has enabled rapid and high-quality predictions in the biomedical field. During the COVID-19 pandemic, machine learning methods have been applied to many tasks, such as predicting patient mortality, modeling the spread of infection, determining future effects, diagnosis through medical image analysis, and forecasting the vaccination rate. However, there is a gap in the literature on using machine learning methods and bioinformatics tools to identify epitopes for fast, useful, and effective vaccine design. Machine learning methods can give medical biotechnologists an advantage in designing a vaccine faster and more successfully. The motivation of this study is to propose a successful hybrid machine learning method for SARS-CoV-2 epitope prediction and, using bioinformatics tools, to identify nonallergenic, nontoxic, antigenic peptides among the predicted epitopes that can be used in vaccine design. The identified epitopes will be effective not only in the design of the COVID-19 vaccine but also against viruses from the SARS family that may be encountered in the future. For this purpose, the epitope prediction performance of random forest, support vector machine, logistic regression, bagging with decision tree, k-nearest neighbor, and decision tree methods was examined. Because epitope samples were in the minority relative to nonepitope samples in the SARS-CoV and B-cell datasets used for training, epitope prediction was repeated after the datasets were balanced with the synthetic minority oversampling technique (SMOTE). The experimental results were compared, and the most successful predictions were obtained with the random forest (RF) method.
Epitope prediction performance was higher on the balanced datasets than on the original datasets (94.0% AUC and 94.4% PRC for the SMOTE-SARS-CoV dataset; 95.6% AUC and 95.3% PRC for the SMOTE-B-cell dataset). In this study, the SMOTE-RF-SVM hybrid method proposed for SARS-CoV-2 epitope prediction identified 252 of 20312 peptides as epitopes. The identified epitopes were analyzed with the AllerTOP 2.0, VaxiJen 2.0, and ToxinPred tools, and allergenic, nonantigenic, and toxic epitopes were eliminated. As a result, 11 nonallergenic, highly antigenic, nontoxic epitope candidates were proposed for use in protein-based COVID-19 vaccine design (“VGGNYNY”, “VNFNFNGLTG”, “RQIAPGQTGKI”, “QIAPGQTGKIA”, “SYECDIPIGAGI”, “STFKCYGVSPTKL”, “GVVFLHVTYVPAQ”, “KNHTSPDVDLGDI”, “NHTSPDVDLGDIS”, “AGAAAYYVGYLQPR”, “KKSTNLVKNKCVNF”). It is predicted that the small number of epitopes determined by machine learning-based in silico methods will help biotechnologists design vaccines quickly and accurately by reducing the number of laboratory trials. © 2022 Elsevier Ltd | |
dc.description.sponsorship | Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK: 121E326 | |
dc.description.sponsorship | This study was supported by Turkish Scientific and Technical Research Council, Turkey-TÜBİTAK (Project Number: 121E326). | |
dc.identifier.doi | 10.1016/j.compbiolchem.2022.107688 | |
dc.identifier.issn | 1476-9271 | |
dc.identifier.pmid | 35561658 | |
dc.identifier.scopus | 2-s2.0-85129965508 | |
dc.identifier.scopusquality | Q2 | |
dc.identifier.uri | https://doi.org/10.1016/j.compbiolchem.2022.107688 | |
dc.identifier.uri | https://hdl.handle.net/20.500.11776/11065 | |
dc.identifier.volume | 98 | |
dc.identifier.wos | WOS:000942459600002 | |
dc.identifier.wosquality | Q2 | |
dc.indekslendigikaynak | Web of Science | |
dc.indekslendigikaynak | Scopus | |
dc.indekslendigikaynak | PubMed | |
dc.institutionauthor | Cihan, Pınar | |
dc.language.iso | en | |
dc.publisher | Elsevier Ltd | |
dc.relation.ispartof | Computational Biology and Chemistry | |
dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.subject | B-cell | |
dc.subject | In silico | |
dc.subject | Machine learning | |
dc.subject | SARS-CoV | |
dc.subject | SARS-CoV-2 | |
dc.subject | Vaccine design | |
dc.subject | Bioinformatics | |
dc.subject | Cytology | |
dc.subject | Decision trees | |
dc.subject | Diagnosis | |
dc.subject | Diseases | |
dc.subject | Epitopes | |
dc.subject | Forecasting | |
dc.subject | Medical imaging | |
dc.subject | Nearest neighbor search | |
dc.subject | Peptides | |
dc.subject | Support vector machines | |
dc.subject | Vaccines | |
dc.subject | B cells | |
dc.subject | Bioinformatic tools | |
dc.subject | Epitope predictions | |
dc.subject | In-silico | |
dc.subject | Machine learning methods | |
dc.subject | Prediction performance | |
dc.subject | SARS-CoV | |
dc.subject | SARS-CoV-2 | |
dc.subject | Synthetic minority over-sampling techniques | |
dc.subject | Vaccine design | |
dc.subject | SARS | |
dc.subject | epitope | |
dc.subject | peptide | |
dc.subject | vaccine | |
dc.subject | diagnosis | |
dc.subject | human | |
dc.subject | machine learning | |
dc.subject | pandemic | |
dc.subject | COVID-19 | |
dc.subject | COVID-19 Vaccines | |
dc.subject | Epitopes, B-Lymphocyte | |
dc.subject | Epitopes, T-Lymphocyte | |
dc.subject | Humans | |
dc.subject | Machine Learning | |
dc.subject | Pandemics | |
dc.subject | Peptides | |
dc.subject | SARS-CoV-2 | |
dc.subject | Vaccines | |
dc.title | A new approach for determining SARS-CoV-2 epitopes using machine learning-based in silico methods | |
dc.type | Article |
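The abstract describes balancing the training data with SMOTE before classification. The record does not give the authors' implementation or parameters, so the following is only a minimal, standard-library sketch of the general SMOTE idea: each synthetic minority sample is interpolated between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the value of `k`, and the toy feature vectors are all hypothetical and chosen for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(majority, minority, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    n_new = max(0, len(majority) - len(minority))
    return minority + smote(minority, n_new, k=k, seed=seed)


# Toy data (made up for illustration): 8 non-epitope vs 3 epitope vectors.
non_epitope = [[float(x), float(x % 3)] for x in range(8)]
epitope = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]
balanced_epitope = balance(non_epitope, epitope, k=2)
print(len(non_epitope), len(balanced_epitope))
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic samples do not leak into the data used to compute metrics such as AUC and PRC.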
Files
Original bundle
- Name: 11065.pdf
- Size: 1.26 MB
- Format: Adobe Portable Document Format
- Description: Full Text