Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips
Date
2018
Publisher
Association for Computing Machinery
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Perceptual understanding of media content has many applications, including content-based retrieval, marketing, content optimization, psychological assessment, and affect-based learning. In this paper, we model audio-visual features extracted from videos via machine learning approaches to estimate the affective responses of the viewers. We use the LIRIS-ACCEDE dataset and the MediaEval 2017 Challenge setting to evaluate the proposed methods. This dataset is composed of movies of professional or amateur origin, annotated with viewers' arousal, valence, and fear scores. We extract a number of audio features, such as Mel-frequency cepstral coefficients, and visual features, such as dense SIFT, hue-saturation histograms, and features from a deep neural network trained for object recognition. We contrast two different approaches in the paper and report experiments with different fusion and smoothing strategies. We demonstrate the benefit of feature selection and multimodal fusion in estimating affective responses to movie segments.
Description
8th ACM International Conference on Multimedia Retrieval (ACM ICMR), June 11-14, 2018, Yokohama, Japan
Keywords
Affective computing, multimodal interaction, emotion estimation, audio-visual features, movie analysis, face analysis, extreme learning machine
Source
ICMR '18: Proceedings of the 2018 ACM International Conference on Multimedia Retrieval
WoS Q Value
N/A