Affective Action and Interaction Recognition by Multi-View Representation Learning from Handcrafted Low-Level Skeleton Features.

Affective action; affective interaction; bag-of-visual-words; handcrafted low-level skeleton features; multi-view representation learning

Journal

International journal of neural systems
ISSN: 1793-6462
Abbreviated title: Int J Neural Syst
Country: Singapore
NLM ID: 9100527

Publication information

Publication date:
Oct 2022
History:
pubmed: 27 7 2022
medline: 24 9 2022
entrez: 26 7 2022
Status: ppublish

Abstract

Human feelings expressed through verbal (e.g. voice) and non-verbal (e.g. face or body) communication channels can influence human actions or interactions. In the literature, most attention has been given to facial expressions when analyzing emotions conveyed through non-verbal behaviors. Nevertheless, psychology highlights that the body is an important indicator of the human affective state during daily life activities. Therefore, this paper presents a novel method for affective action and interaction recognition from videos, exploiting multi-view representation learning and only full-body handcrafted characteristics selected following psychological and proxemic studies. Specifically, 2D skeletal data are extracted from RGB video sequences to derive diverse low-level skeleton features, i.e. multi-views, modeled through the bag-of-visual-words clustering approach, which generates a condition-related codebook. In this way, each affective action and interaction within a video can be represented as a frequency histogram of codewords. During the learning phase, for each affective class, training samples are used to compute its global histogram of codewords, which is stored in a database and later used for the recognition task. In the recognition phase, the frequency histogram of a video is matched against the database of class histograms, and the video is assigned to the closest affective class in terms of Euclidean distance. The effectiveness of the proposed system is evaluated on a specifically collected dataset containing 6 emotions for both actions and interactions, on which the proposed system obtains 93.64% and 90.83% accuracy, respectively. In addition, the devised strategy achieves performance in line with deep learning-based works from the literature when tested on a public collection containing 6 emotions plus a neutral state, demonstrating the effectiveness of the presented approach and confirming the findings of psychological and proxemic studies.
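
The recognition pipeline described in the abstract (codeword histograms per video, one global histogram per class, nearest class by Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2D features, the hand-written codebook, and the choice to build each class histogram as the mean of its training histograms are all assumptions made for illustration. The sketch also uses a single feature view, whereas the paper combines several.

```python
import numpy as np

def quantize(features, codebook):
    """Map each per-frame feature vector to the index of its nearest codeword."""
    # Pairwise Euclidean distances, shape (n_frames, n_codewords).
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def video_histogram(features, codebook):
    """Normalized frequency histogram of codewords for one video."""
    words = quantize(features, codebook)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

def fit_class_histograms(videos, labels, codebook):
    """One histogram per class (assumption: mean of the class's training histograms)."""
    class_hists = {}
    for label in sorted(set(labels)):
        hists = [video_histogram(v, codebook)
                 for v, l in zip(videos, labels) if l == label]
        class_hists[label] = np.mean(hists, axis=0)
    return class_hists

def classify(features, codebook, class_hists):
    """Return the class whose histogram is closest in Euclidean distance."""
    h = video_histogram(features, codebook)
    return min(class_hists, key=lambda label: np.linalg.norm(h - class_hists[label]))

# Toy example: a hypothetical 2-word codebook, assumed to be already learned
# (e.g. by clustering training features), and two hypothetical classes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
happy = np.array([[0.9, 1.1], [1.0, 0.9], [0.1, 0.0]])   # mostly codeword 1
sad = np.array([[0.0, 0.1], [0.1, -0.1], [1.0, 1.0]])    # mostly codeword 0

class_hists = fit_class_histograms([happy, sad], ["happy", "sad"], codebook)
query = np.array([[1.1, 1.0], [0.9, 0.9], [1.0, 1.2], [0.0, 0.0]])
predicted = classify(query, codebook, class_hists)
print(predicted)
```

In a multi-view setting, one plausible extension would be to build one histogram per feature view and concatenate them before matching; the abstract does not specify how the views are combined, so that step is left out here.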

Identifiers

pubmed: 35881015
doi: 10.1142/S012906572250040X

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

2250040

Authors

Danilo Avola (D)

Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy.

Marco Cascio (M)

Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy.

Luigi Cinque (L)

Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy.

Alessio Fagioli (A)

Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome 00198, Italy.

Gian Luca Foresti (GL)

Department of Computer Science, Mathematics and Physics, University of Udine, Via delle Scienze 206, Udine 33100, Italy.
