Open Set Audio Classification Using Autoencoders Trained on Few Data.

audio classification; autoencoders; few-shot learning; open set classification; open set recognition

Journal

Sensors (Basel, Switzerland)
ISSN: 1424-8220
Abbreviated title: Sensors (Basel)
Country: Switzerland
NLM ID: 101204366

Publication information

Publication date:
03 Jul 2020
History:
received: 22 May 2020
revised: 29 Jun 2020
accepted: 01 Jul 2020
entrez: 09 Jul 2020
pubmed: 09 Jul 2020
medline: 09 Jul 2020
Status: epublish

Abstract

Open-set recognition (OSR) is a challenging machine learning problem that arises when a classifier faces test instances from classes not seen during training. The task is to correctly identify instances of known classes (seen during training) while rejecting unknown or unwanted samples (those belonging to unseen classes). A second problem in practical scenarios is few-shot learning (FSL), which arises when only a small number of positive samples is available for training a recognition system. With both limitations in mind, a new audio dataset for OSR and FSL was recently released to promote research on solutions that address them. This paper proposes an audio OSR/FSL system divided into three steps: a high-level audio representation; feature embedding using two different autoencoder architectures; and a multi-layer perceptron (MLP) trained on latent space representations to detect known classes and reject unwanted ones. An extensive set of experiments covers multiple combinations of openness factors (OSR condition) and numbers of shots (FSL condition). The results show the validity of the proposed approach and confirm superior performance with respect to a baseline system based on transfer learning.
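The final step described above (detecting known classes and rejecting unwanted ones) can be illustrated with a minimal sketch. The abstract does not state the rejection rule, so the confidence threshold used here is an assumption, not the authors' method. The function names `softmax` and `classify_open_set`, the `threshold` value, and the `unknown_label` of -1 are all hypothetical, and the example logits merely stand in for the outputs of an MLP applied to autoencoder latent vectors.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def classify_open_set(logits, threshold=0.5, unknown_label=-1):
    """Return the most likely known class per sample, or `unknown_label`
    when the top class probability is below `threshold`.

    ASSUMPTION: thresholding the top probability is one common rejection
    heuristic; the source abstract does not specify the rule it uses.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    best = probs.argmax(axis=-1)
    confident = probs.max(axis=-1) >= threshold
    return np.where(confident, best, unknown_label)


# Hypothetical classifier outputs for two audio clips and three known classes.
example_logits = np.array([
    [4.0, 0.0, 0.0],  # clearly class 0
    [0.1, 0.0, 0.1],  # ambiguous, expected to be rejected
])
labels = classify_open_set(example_logits, threshold=0.6)
```

In this sketch a higher `threshold` rejects more samples as unknown, at the cost of also rejecting some instances of known classes; in practice the value would have to be tuned on held-out data.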

Identifiers

pubmed: 32635378
pii: s20133741
doi: 10.3390/s20133741
pmc: PMC7374438

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: Horizon 2020
ID: 779158
Agency: Spanish Ministry of Science, Innovation and Universities
ID: DIN2018-009982
Agency: Spanish Ministry of Science, Innovation and Universities
ID: PTQ-17-09106
Agency: Spanish Ministry of Science, Innovation and Universities
ID: RTI2018-097045-B-C21
Agency: FEDER
ID: RTI2018-097045-B-C21

Authors

Javier Naranjo-Alcazar (J)

Visualfy, 46181 Benisanó, Spain.
Computer Science Department, Universitat de València, 46100 Burjassot, Spain.

Sergi Perez-Castanos (S)

Visualfy, 46181 Benisanó, Spain.

Pedro Zuccarello (P)

Visualfy, 46181 Benisanó, Spain.

Fabio Antonacci (F)

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, 20133 Milan, Italy.

Maximo Cobos (M)

Computer Science Department, Universitat de València, 46100 Burjassot, Spain.
