Geometric anomaly detection in data.
persistent cohomology
singularities
stratification inference
Journal
Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876
Publication information
Publication date:
18 08 2020
History:
pubmed: 5 8 2020
medline: 5 8 2020
entrez: 5 8 2020
Status:
ppublish
Abstract
The quest for low-dimensional models which approximate high-dimensional data is pervasive across the physical, natural, and social sciences. The dominant paradigm underlying most standard modeling techniques assumes that the data are concentrated near a single unknown manifold of relatively small intrinsic dimension. Here, we present a systematic framework for detecting interfaces and related anomalies in data which may fail to satisfy the manifold hypothesis. By computing the local topology of small regions around each data point, we are able to partition a given dataset into disjoint classes, each of which can be individually approximated by a single manifold. Since these manifolds may have different intrinsic dimensions, local topology discovers singular regions in data even when none of the points have been sampled precisely from the singularities. We showcase this method by identifying the intersection of two surfaces in the 24-dimensional space of cyclo-octane conformations and by locating all of the self-intersections of a Henneberg minimal surface immersed in 3-dimensional space. Due to the local nature of the topological computations, the algorithmic burden of performing such data stratification is readily distributable across several processors.
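The idea of inspecting a small region around each data point can be illustrated with a much simpler local statistic than the one used in the article. The sketch below is not the persistent cohomology computation the abstract describes; it swaps in a neighbor-count scaling estimate, log2(N(2r)/N(r)), which is close to the intrinsic dimension where the data look like a single manifold and drifts away from an integer near an interface. The names `sheet_and_pin`, `local_dimension`, and `is_anomalous`, the toy dataset (a flat sheet pierced by a line), the radius, and the tolerance are all assumptions made for illustration.

```python
import math


def sheet_and_pin(h=0.05):
    """Toy data (hypothetical): a flat sheet in the plane z = 0 pierced by a line along z."""
    sheet = [(i * h, j * h, 0.0) for i in range(-20, 21) for j in range(-20, 21)]
    # Skip k = 0 so the intersection point is not duplicated.
    pin = [(0.0, 0.0, k * h / 2) for k in range(-40, 41) if k != 0]
    return sheet + pin


def local_dimension(points, center, r):
    """Estimate the local dimension at `center` as log2(N(2r) / N(r)).

    N(s) is the number of points within distance s of `center`.
    On a d-dimensional manifold N(s) grows roughly like s**d,
    so the ratio is close to 2**d.
    """
    near = sum(1 for p in points if math.dist(p, center) <= r)
    far = sum(1 for p in points if math.dist(p, center) <= 2 * r)
    if near == 0:
        raise ValueError("no points within radius r of center")
    return math.log2(far / near)


def is_anomalous(points, center, r, tol=0.15):
    """Flag a point whose local estimate is far from any integer dimension.

    The tolerance is an assumed value chosen for this toy dataset only.
    """
    estimate = local_dimension(points, center, r)
    return abs(estimate - round(estimate)) > tol
```

On this toy dataset with `r = 0.21`, a point in the interior of the sheet gives an estimate near 2, a point on the line away from the sheet gives an estimate near 1, and the point where the two meet is flagged by `is_anomalous`. Because each estimate depends only on a small neighborhood, the per-point calls are independent of one another and could be split across processors, which is the property the abstract highlights.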
Identifiers
pubmed: 32747569
pii: 2001741117
doi: 10.1073/pnas.2001741117
pmc: PMC7443892
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
19664-19669
Grants
Agency: Medical Research Council
Country: United Kingdom
Copyright information
Copyright © 2020 the Author(s). Published by PNAS.
Conflict of interest statement
The authors declare no competing interest.