Unobserved classes and extra variables in high-dimensional discriminant analysis.
Adaptive supervised classification
Conditional estimation
Model-based discriminant analysis
Unobserved classes
Variable selection
Journal
Advances in data analysis and classification
ISSN: 1862-5347
Abbreviated title: Adv Data Anal Classif
Country: Germany
NLM ID: 101562922
Publication information
Publication date: 2022
History:
received: 2021-01-29
revised: 2021-07-15
accepted: 2021-10-03
entrez: 2022-03-21
pubmed: 2022-03-22
medline: 2022-03-22
Status:
ppublish
Abstract
In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a full inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations.
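The inductive idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the D-AMDA procedure itself: it assumes one-dimensional Gaussian classes, assumes exactly one unobserved class (the abstract does not say how the number of hidden classes is chosen), and ignores the extra variables. The function names `fit_observed_classes` and `inductive_em` and the simulated data are hypothetical. The parameters of the observed classes are estimated on the labelled learning sample and then held fixed, while EM on the unlabelled test sample updates only the mixing proportions and the extra component.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_observed_classes(xs, labels):
    """Learning phase: per-class mean and variance from labelled data."""
    params = {}
    for c in sorted(set(labels)):
        pts = [x for x, lab in zip(xs, labels) if lab == c]
        mean = sum(pts) / len(pts)
        var = sum((p - mean) ** 2 for p in pts) / len(pts)
        params[c] = (mean, var)
    return params


def inductive_em(test_xs, known, n_iter=100):
    """Discovery phase (assumption: exactly one unobserved class).

    Means and variances of the known classes stay fixed; EM updates the
    mixing proportions and the parameters of the extra component only.
    """
    names = list(known) + ["new"]
    means = [known[c][0] for c in known]
    variances = [known[c][1] for c in known]
    # Start the extra component at the test point the known classes explain worst.
    start = min(
        test_xs,
        key=lambda x: max(normal_pdf(x, m, v) for m, v in zip(means, variances)),
    )
    means.append(start)
    variances.append(sum(variances) / len(variances))
    weights = [1.0 / len(names)] * len(names)
    new = len(names) - 1
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every test point.
        resp = []
        for x in test_xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: all mixing proportions, but only the extra component's parameters.
        weights = [sum(r[k] for r in resp) / len(test_xs) for k in range(len(names))]
        n_new = sum(r[new] for r in resp)
        means[new] = sum(r[new] * x for r, x in zip(resp, test_xs)) / n_new
        var_new = sum(r[new] * (x - means[new]) ** 2 for r, x in zip(resp, test_xs)) / n_new
        variances[new] = max(var_new, 1e-6)
    labels = [names[max(range(len(names)), key=lambda j: r[j])] for r in resp]
    return dict(zip(names, zip(weights, means, variances))), labels


# Simulated data (hypothetical): classes A and B are observed, a third is not.
rng = random.Random(0)
learn_x = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(5, 1) for _ in range(200)]
learn_y = ["A"] * 200 + ["B"] * 200
test_x = [rng.gauss(m, 1) for m in (0, 5, 10) for _ in range(100)]

known = fit_observed_classes(learn_x, learn_y)
estimates, labels = inductive_em(test_x, known)
print({name: round(values[1], 2) for name, values in estimates.items()})
```

Each entry of `estimates` is a `(proportion, mean, variance)` tuple. In this toy setting the extra component should settle near the simulated hidden class, while the fixed parameters of `A` and `B` are reused unchanged, which is the sense in which the learning-phase classifier is adapted without being refitted.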
Identifiers
pubmed: 35308632
doi: 10.1007/s11634-021-00474-3
pii: 474
pmc: PMC8924148
Publication types
Journal Article
Languages
eng
Pagination
55-92
Copyright information
© The Author(s) 2021.