Noise robust voice activity detection based on multi-layer feed-forward neural network

dc.authorscopusid: 57203165669
dc.authorscopusid: 7801396079
dc.contributor.author: Arslan, Özkan
dc.contributor.author: Engin, Erkan Zeki
dc.date.accessioned: 2022-05-11T14:03:04Z
dc.date.available: 2022-05-11T14:03:04Z
dc.date.issued: 2019
dc.department: Faculties, Çorlu Faculty of Engineering, Department of Electronics and Communication Engineering
dc.description.abstract: This paper proposes a voice activity detection (VAD) method for various noisy conditions that feeds time-domain and spectral-domain features to a multi-layer feed-forward neural network (MLF-NN). The time-domain features are short-time energy and zero-crossing rate; the spectral features are the entropy, centroid, roll-off, and flux of the speech signal. The MLF-NN was trained on clean speech and tested on noisy speech. The method was evaluated with six noise types (white, car, babble, airport, street, and train) at four signal-to-noise ratio (SNR) levels. It was tested on the core TIMIT database, and its performance was compared with the SOHN, G.729B, and Long-Term Spectral Flatness Measure (LSFM) VAD methods in terms of correct speech rate, false alarm rate, and overall accuracy rate. Extensive simulation results show that the proposed method achieves the best average correct speech rate, false alarm rate, and overall accuracy rate in most low- and high-SNR conditions across the noise environments. © 2019 Istanbul University. All rights reserved.
dc.identifier.doi: 10.26650/electrica.2019.18042
dc.identifier.endpage: 100
dc.identifier.issn: 2619-9831
dc.identifier.issue: 2 [en_US]
dc.identifier.scopus: 2-s2.0-85072693867
dc.identifier.scopusquality: Q3
dc.identifier.startpage: 91
dc.identifier.uri: https://doi.org/10.26650/electrica.2019.18042
dc.identifier.uri: https://hdl.handle.net/20.500.11776/4593
dc.identifier.volume: 19
dc.identifier.wos: WOS:000474421400001
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.institutionauthor: Engin, Erkan Zeki
dc.language.iso: en
dc.publisher: Istanbul University
dc.relation.ispartof: Electrica
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Multi-layer feed-forward neural network
dc.subject: Time and spectral features
dc.subject: Voice activity detection
dc.subject: Errors
dc.subject: Feature extraction
dc.subject: Feedforward neural networks
dc.subject: Image resolution
dc.subject: Signal to noise ratio
dc.subject: Speech
dc.subject: Speech communication
dc.subject: Speech recognition
dc.subject: Extensive simulations
dc.subject: Multilayer feedforward neural networks
dc.subject: Noise environments
dc.subject: Overall accuracies
dc.subject: Short-time energy
dc.subject: Spectral feature
dc.subject: Zero crossing rate
dc.subject: Multilayer neural networks
dc.title: Noise robust voice activity detection based on multi-layer feed-forward neural network
dc.type: Article
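
The abstract names six frame-level features (short-time energy, zero-crossing rate, and spectral entropy, centroid, roll-off, and flux) without giving their formulas. The following is a minimal Python sketch of one common way to compute them, not the authors' implementation. The function name `frame_features`, the 25 ms frame and 10 ms hop at 16 kHz, the Hamming window, the 85% roll-off threshold, and the base-2 entropy are all assumptions made for illustration; this record does not specify any of them. The MLF-NN classifier that would consume these features is not shown.

```python
import numpy as np

def frame_features(signal, sample_rate=16000, frame_len=400, hop=160, rolloff_pct=0.85):
    """Return an (n_frames, 6) array with columns:
    energy, zero-crossing rate, spectral entropy, centroid, roll-off, flux.
    All default parameter values are illustrative assumptions."""
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    eps = 1e-12  # guards against division by zero on silent frames
    prev_mag = None
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]

        # Time-domain features
        energy = np.sum(frame ** 2)
        signs = np.signbit(frame)
        zcr = np.mean(signs[1:] != signs[:-1])  # fraction of sign changes

        # Spectral features from the windowed magnitude spectrum
        mag = np.abs(np.fft.rfft(frame * window))
        power = mag ** 2
        prob = power / (power.sum() + eps)
        entropy = -np.sum(prob * np.log2(prob + eps))
        centroid = np.sum(freqs * mag) / (mag.sum() + eps)

        # Roll-off: lowest frequency below which rolloff_pct of the power lies
        cumulative = np.cumsum(power)
        idx = np.searchsorted(cumulative, rolloff_pct * cumulative[-1])
        rolloff = freqs[min(idx, len(freqs) - 1)]

        # Flux: squared change in the normalized spectrum between frames
        norm_mag = mag / (mag.sum() + eps)
        flux = 0.0 if prev_mag is None else np.sum((norm_mag - prev_mag) ** 2)
        prev_mag = norm_mag

        rows.append([energy, zcr, entropy, centroid, rolloff, flux])
    return np.array(rows)

# Usage: one second of a 1 kHz tone sampled at 16 kHz
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 1000.0 * t)
features = frame_features(tone)
print(features.shape)  # (98, 6)
```

In a pipeline like the one the abstract describes, each row would serve as one input vector to the classifier, typically after per-feature normalization, since energy and the frequency-valued features live on very different scales.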

Files

Original bundle
Name: 4593.pdf
Size: 940.29 KB
Format: Adobe Portable Document Format
Description: Full Text