Centered Partition Processes: Informative Priors for Clustering (with Discussion).

Bayesian clustering; Bayesian nonparametrics; Dirichlet Process; centered process; exchangeable probability partition function; mixture model; product partition model

Journal

Bayesian analysis
ISSN: 1936-0975
Abbreviated title: Bayesian Anal
Country: United States
NLM ID: 101307045

Publication information

Publication date:
Mar 2021
History:
entrez: 12 8 2022
pubmed: 1 3 2021
medline: 1 3 2021
Status: ppublish

Abstract

A rich literature proposes Bayesian approaches for clustering that start with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable Partition Probability Functions (EPPFs). Gibbs-type priors encompass a broad class of such cases, including Dirichlet and Pitman-Yor processes. Although there have been some proposals to relax the exchangeability assumption, allowing covariate dependence and partial exchangeability, limited consideration has been given to how to include concrete prior knowledge on the partition. For example, we are motivated by an epidemiological application in which we wish to cluster birth defects into groups and have prior knowledge of an initial clustering provided by experts. As a general approach for including such prior knowledge, we propose a Centered Partition (CP) process that modifies the EPPF to favor partitions close to an initial one. We describe some properties of the CP prior, develop a general algorithm for posterior computation, and illustrate the methodology through simulation examples and an application to the motivating epidemiology study of birth defects.
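
The idea of modifying an EPPF to favor partitions close to an initial one can be sketched on a toy problem. The abstract does not state the functional form, so the sketch below assumes, purely for illustration, a Dirichlet process EPPF as the baseline, an exponential penalty exp(-psi * d(c, c0)), and the Variation of Information as the distance d. The function names and the parameters `alpha` and `psi` are hypothetical, and the brute-force enumeration is only feasible for a handful of items.

```python
import math
from collections import Counter


def set_partitions(n):
    """Yield every partition of items 0..n-1 as a tuple of cluster labels
    in restricted-growth form (labels appear in order of first use)."""
    def grow(labels, n_clusters):
        if len(labels) == n:
            yield tuple(labels)
            return
        # The next item joins an existing cluster or opens a new one.
        for k in range(n_clusters + 1):
            yield from grow(labels + [k], max(n_clusters, k + 1))
    yield from grow([], 0)


def dp_eppf(labels, alpha):
    """Dirichlet process EPPF: alpha^K * prod_j (n_j - 1)! / prod_i (alpha + i)."""
    sizes = Counter(labels).values()
    num = alpha ** len(sizes)
    for s in sizes:
        num *= math.factorial(s - 1)
    den = 1.0
    for i in range(len(labels)):
        den *= alpha + i
    return num / den


def entropy(counts, n):
    """Shannon entropy (base 2) of the block proportions counts / n."""
    return -sum((m / n) * math.log2(m / n) for m in counts)


def variation_of_information(a, b):
    """VI(a, b) = 2 H(a, b) - H(a) - H(b); zero when a and b coincide."""
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)
    return 2 * h_ab - h_a - h_b


def centered_partition_prior(n, base, alpha, psi):
    """Assumed form: p(c) proportional to EPPF(c) * exp(-psi * VI(c, base)),
    normalized by enumerating all partitions of n items."""
    parts = list(set_partitions(n))
    weights = [
        dp_eppf(c, alpha) * math.exp(-psi * variation_of_information(c, base))
        for c in parts
    ]
    total = sum(weights)
    return {c: w / total for c, w in zip(parts, weights)}


# Hypothetical expert clustering of four items into two pairs;
# it must be written in restricted-growth form to match the dictionary keys.
base = (0, 0, 1, 1)
baseline = centered_partition_prior(4, base, alpha=1.0, psi=0.0)
centered = centered_partition_prior(4, base, alpha=1.0, psi=2.0)
```

Under these assumptions, `psi = 0` recovers the baseline EPPF, and larger values of `psi` move prior mass toward the initial partition `base`.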

Identifiers

pubmed: 35958029
doi: 10.1214/20-BA1197
pmc: PMC9364237
mid: NIHMS1815470

Publication types

Journal Article

Languages

eng

Pagination

301-370

Grants

Agency: NIEHS NIH HHS
ID: R01 ES027498
Country: United States
Agency: NCBDD CDC HHS
ID: U01 DD001231
Country: United States

Authors

Sally Paganin (S)

Department of Environmental Science, Policy, and Management, University of California, Berkeley.

Amy H Herring (AH)

Department of Statistical Science, Duke University, Durham.

Andrew F Olshan (AF)

Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill.

David B Dunson (DB)

Department of Statistical Science, Duke University, Durham.
