Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips

Date

2018

Publisher

Association for Computing Machinery

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Perceptual understanding of media content has many applications, including content-based retrieval, marketing, content optimization, psychological assessment, and affect-based learning. In this paper, we model audio-visual features extracted from videos via machine learning approaches to estimate the affective responses of the viewers. We use the LIRIS-ACCEDE dataset and the MediaEval 2017 Challenge setting to evaluate the proposed methods. This dataset is composed of movies of professional or amateur origin, annotated with viewers' arousal, valence, and fear scores. We extract a number of audio features, such as Mel-frequency cepstral coefficients, and visual features, such as dense SIFT, hue-saturation histograms, and features from a deep neural network trained for object recognition. We contrast two different approaches in the paper, and report experiments with different fusion and smoothing strategies. We demonstrate the benefit of feature selection and multimodal fusion for estimating affective responses to movie segments.

Description

8th ACM International Conference on Multimedia Retrieval (ACM ICMR), June 11-14, 2018, Yokohama, Japan

Keywords

Affective computing, multimodal interaction, emotion estimation, audio-visual features, movie analysis, face analysis, Extreme Learning Machine

Source

ICMR '18: Proceedings of the 2018 ACM International Conference on Multimedia Retrieval

WoS Q Value

N/A
