Unobserved classes and extra variables in high-dimensional discriminant analysis.

Keywords: Adaptive supervised classification; Conditional estimation; Model-based discriminant analysis; Unobserved classes; Variable selection

Journal

Advances in data analysis and classification
ISSN: 1862-5347
Abbreviated title: Adv Data Anal Classif
Country: Germany
NLM ID: 101562922

Publication information

Publication date:
2022
History:
received: 2021-01-29
revised: 2021-07-15
accepted: 2021-10-03
entrez: 2022-03-21
pubmed: 2022-03-22
medline: 2022-03-22
Status: ppublish

Abstract

In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a full inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations.
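The inductive idea mentioned in the abstract can be illustrated with a minimal univariate sketch. This is not D-AMDA itself: it assumes one-dimensional Gaussian classes, exactly one extra component, and no additional variables or variable selection. All function names, the initialisation rule, and the simulated data are hypothetical choices made for illustration only. The known-class parameters are estimated in the learning phase and then held fixed, while EM on the unlabelled test data updates only the mixing proportions and the parameters of the extra component.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_known_classes(xs, labels):
    """Learning phase: per-class mean and variance from labelled data."""
    params = {}
    for c in sorted(set(labels)):
        pts = [x for x, lab in zip(xs, labels) if lab == c]
        mean = sum(pts) / len(pts)
        var = sum((p - mean) ** 2 for p in pts) / len(pts)
        params[c] = (mean, var)
    return params


def inductive_em(test_xs, known, new_label="new", n_iter=100):
    """Discovery phase: EM on the unlabelled test data only.

    Known-class means and variances stay fixed at their learning-phase
    values; only the mixing proportions and the single extra component
    are updated.
    """
    labels = list(known) + [new_label]
    params = dict(known)
    n = len(test_xs)
    # Hypothetical initialisation: centre the extra component on the test
    # point that the known classes explain worst, with the overall variance.
    worst = min(test_xs,
                key=lambda x: max(normal_pdf(x, m, v) for m, v in known.values()))
    overall_mean = sum(test_xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in test_xs) / n
    params[new_label] = (worst, overall_var)
    props = {c: 1.0 / len(labels) for c in labels}

    for _ in range(n_iter):
        # E-step: posterior class probabilities for every test point.
        resp = []
        for x in test_xs:
            w = {c: props[c] * normal_pdf(x, *params[c]) for c in labels}
            total = sum(w.values())
            resp.append({c: w[c] / total for c in labels})
        # M-step: update all mixing proportions ...
        for c in labels:
            props[c] = sum(r[c] for r in resp) / n
        # ... but only the extra component's mean and variance.
        nk = sum(r[new_label] for r in resp)
        mean = sum(r[new_label] * x for r, x in zip(resp, test_xs)) / nk
        var = sum(r[new_label] * (x - mean) ** 2 for r, x in zip(resp, test_xs)) / nk
        params[new_label] = (mean, max(var, 1e-6))

    predicted = [max(r, key=r.get) for r in resp]
    return params, props, predicted


# Simulated data (an assumption for illustration): classes A and B are
# labelled in the learning set; the test set also contains a third class.
random.seed(0)
train_x = ([random.gauss(0, 1) for _ in range(100)]
           + [random.gauss(5, 1) for _ in range(100)])
train_y = ["A"] * 100 + ["B"] * 100
test_x = ([random.gauss(0, 1) for _ in range(50)]
          + [random.gauss(5, 1) for _ in range(50)]
          + [random.gauss(12, 1) for _ in range(50)])

known = fit_known_classes(train_x, train_y)
params, props, predicted = inductive_em(test_x, known)
```

In this sketch the number of extra components is fixed at one; choosing it from the data, and handling variables observed only in the test set, are the parts of the problem that the sketch does not attempt.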

Identifiers

pubmed: 35308632
doi: 10.1007/s11634-021-00474-3
pii: 474
pmc: PMC8924148

Publication types

Journal Article

Languages

eng

Pagination

55-92

Copyright information

© The Author(s) 2021.

Authors

Michael Fop (M)

School of Mathematics & Statistics, University College Dublin, Dublin, Ireland.

Pierre-Alexandre Mattei (PA)

Université Côte d'Azur, Inria, CNRS, Laboratoire J.A. Dieudonné, Maasai team, Nice, France.

Charles Bouveyron (C)

Université Côte d'Azur, Inria, CNRS, Laboratoire J.A. Dieudonné, Maasai team, Nice, France.

Thomas Brendan Murphy (TB)

Université Côte d'Azur, Inria, CNRS, Laboratoire J.A. Dieudonné, Maasai team, Nice, France.