Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks

Kaya, Heysem; Karpov, Alexey A.

dc.authorid	0000-0003-3424-652X
dc.authorid	0000-0001-7947-5508
dc.authorscopusid	36241785000
dc.authorscopusid	57219469958
dc.authorwosid	Karpov, Alexey A/A-8905-2012
dc.authorwosid	KAYA, Heysem/V-4493-2019
dc.contributor.author	Kaya, Heysem
dc.contributor.author	Karpov, Alexey A.
dc.date.accessioned	2022-05-11T14:15:49Z
dc.date.available	2022-05-11T14:15:49Z
dc.date.issued	2016
dc.department	Faculties, Çorlu Faculty of Engineering, Department of Computer Engineering
dc.description	17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), September 8-12, 2016, San Francisco, CA
dc.description.abstract	The field of Computational Paralinguistics is growing rapidly and is of interest in application domains ranging from biomedical engineering to forensics. The INTERSPEECH ComParE challenge series plays a field-leading role, introducing novel problems with a common benchmark protocol for comparability. In this work, we tackle all three ComParE 2016 Challenge corpora (Native Language, Sincerity and Deception) using multi-level feature normalization followed by fast and robust kernel learning methods. Moreover, we employ low-level descriptor representation methods inspired by computer vision, such as the Fisher vector encoding. After nonlinear preprocessing, the obtained Fisher vectors are kernelized and mapped to target variables by classifiers based on Kernel Extreme Learning Machines and Partial Least Squares regression. Finally, we combine the predictions of models trained on the widely used functional-based descriptor encoding (openSMILE features) with those obtained from the Fisher vector encoding. In preliminary experiments, our approach significantly outperformed the baseline systems for the Native Language and Sincerity sub-challenges on both the development and test sets.
dc.description.sponsorship	apple, amazon alexa, Google, Microsoft, ebay, facebook, YAHOO JAPAN, Baidu Res, IBM Res, CIRRUS LOGIC, DATATANG, NUANCE, Speechocean Ltd, Yandex, Raytheon Technol
dc.description.sponsorship	Russian Foundation for Basic Research (RFBR) [16-37-60100]
dc.description.sponsorship	This research is financially supported by the Russian Foundation for Basic Research (project No. 16-37-60100).
dc.identifier.doi	10.21437/Interspeech.2016-995
dc.identifier.endpage	2050
dc.identifier.isbn	978-1-5108-3313-5
dc.identifier.issn	2308-457X
dc.identifier.scopus	2-s2.0-84994384423
dc.identifier.scopusquality	N/A
dc.identifier.startpage	2046
dc.identifier.uri	https://doi.org/10.21437/Interspeech.2016-995
dc.identifier.uri	https://hdl.handle.net/20.500.11776/6081
dc.identifier.wos	WOS:000409394401111
dc.identifier.wosquality	N/A
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.institutionauthor	Kaya, Heysem
dc.language.iso	en
dc.publisher	Isca-Int Speech Communication Assoc
dc.relation.ispartof	17th Annual Conference of the International Speech Communication Association (Interspeech 2016), Vols 1-5: Understanding Speech Processing in Humans and Machines
dc.relation.publicationcategory	Conference Item - International - Institutional Academic Staff	en_US
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	ComParE
dc.subject	computational paralinguistics
dc.subject	Native Language
dc.subject	Sincerity
dc.subject	Fisher vector
dc.subject	PLS
dc.subject	ELM
dc.subject	Extreme Learning-Machine
dc.subject	Emotion
dc.title	Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks
dc.type	Conference Object

Files

Original bundle

Name: 6081.PDF
Size: 237.61 KB
Format: Adobe Portable Document Format
Description: Full Text

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
Çorlu Faculty of Engineering Collection