CausNet-partial : 'Partial Generational Orderings' based search for optimal sparse Bayesian networks via dynamic programming with parent set constraints.

Optimal Bayesian Network; dynamic programming; generational orderings

Journal

Research Square
Abbreviated title: Res Sq
Country: United States
NLM ID: 101768035

Publication information

Publication date:
07 Mar 2024
History:
pubmed: 18 Mar 2024
medline: 18 Mar 2024
entrez: 18 Mar 2024
Status: epublish

Abstract

In our recent work, we developed a novel dynamic programming algorithm to find optimal Bayesian networks (BNs) with parent set constraints. This 'generational orderings' based dynamic programming search algorithm, CausNet, efficiently searches the space of possible BNs given the possible parent sets. The algorithm supports both continuous and categorical data, as well as continuous, binary, and survival outcomes. In the present work, we develop a variant of CausNet, CausNet-partial, which searches the space of 'partial generational orderings'. This further reduces the search space, is suited to finding smaller sparse optimal Bayesian networks, and can be applied to thousands of variables. We test this method on both synthetic and real data. Our algorithm performs better than three state-of-the-art algorithms that are currently used extensively to find optimal BNs. We apply it to simulated continuous data and to ALARM, a benchmark discrete Bayesian network designed to provide an alarm message system for patient monitoring. We first apply the original CausNet and then CausNet-partial, varying the partial order from 5 to 2. CausNet-partial discovers small sparse networks with drastically reduced runtime, as expected from theory. Our partial generational orderings based search is an efficient and highly scalable approach for finding small, sparse optimal Bayesian networks and can be applied to thousands of variables. Using specifiable parameters (correlation, FDR cutoffs, in-degree, and partial order), one can increase or decrease the number of nodes and the density of the networks. The availability of two scoring options, BIC and BGe, together with support for survival outcomes and mixed data types, makes our algorithm well suited to many types of high-dimensional data in a variety of fields.
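As a rough illustration of what a dynamic programming search for an optimal BN under parent set constraints can look like, the sketch below runs a generic exact search over node subsets with a Gaussian BIC score. It is not the CausNet or CausNet-partial implementation and does not use generational orderings; the function names `bic_score` and `optimal_network`, the `candidates` dictionary, and the `max_indegree` parameter are all hypothetical, and the data are simulated for the example only.

```python
from itertools import combinations
import numpy as np

def bic_score(data, child, parents):
    """Gaussian BIC (higher is better) for regressing `child` on `parents`."""
    n = data.shape[0]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    y = data[:, child]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    n_params = X.shape[1] + 1  # regression coefficients plus residual variance
    return loglik - 0.5 * n_params * np.log(n)

def optimal_network(data, candidates, max_indegree=2):
    """Exact DP over node subsets (assumed generic scheme, not CausNet itself).

    candidates[v] is the set of nodes allowed to be parents of v.
    Returns (total score, {node: frozenset of parents}).
    """
    d = data.shape[1]
    # Local scores for every allowed parent set of every node.
    local = {v: {} for v in range(d)}
    for v in range(d):
        for k in range(max_indegree + 1):
            for ps in combinations(sorted(candidates[v]), k):
                local[v][frozenset(ps)] = bic_score(data, v, ps)

    def best_parents(v, allowed):
        # Best-scoring allowed parent set drawn from the nodes in `allowed`.
        options = [(s, ps) for ps, s in local[v].items() if ps <= allowed]
        return max(options, key=lambda t: t[0])

    # best[mask] = best network over the nodes in `mask`, built by choosing
    # one node as the sink and taking its parents from the remaining nodes.
    best = {0: (0.0, {})}
    for mask in range(1, 1 << d):
        members = [v for v in range(d) if (mask >> v) & 1]
        top = None
        for sink in members:
            rest = mask & ~(1 << sink)
            allowed = frozenset(v for v in members if v != sink)
            s, ps = best_parents(sink, allowed)
            total = best[rest][0] + s
            if top is None or total > top[0]:
                top = (total, {**best[rest][1], sink: ps})
        best[mask] = top
    return best[(1 << d) - 1]

# Simulated chain x0 -> x1 -> x2 (illustrative data only).
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = 2 * x0 + rng.normal(size=n)
x2 = 2 * x1 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2])
candidates = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
score, parents = optimal_network(data, candidates, max_indegree=2)
print(score, parents)
```

The table of subsets grows as 2^d, so this plain version is only practical for a few dozen nodes at most; restricting candidate parent sets and in-degree, as in the sketch, shrinks the local score table but not the subset table.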

Identifiers

pubmed: 38496505
doi: 10.21203/rs.3.rs-4021074/v1
pmc: PMC10942557

Publication types

Preprint

Languages

eng

Grants

Agency: NIA NIH HHS
ID: P01 AG055367
Country: United States
Agency: NCI NIH HHS
ID: P01 CA196569
Country: United States
Agency: NICHD NIH HHS
ID: R01 HD098161
Country: United States

Conflict of interest statement

Competing interests: The authors have no competing interests as defined by BMC, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Authors

Nand Sharma (N)

Division of Biostatistics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, USA.

Joshua Millstein (J)

Division of Biostatistics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, USA.

MeSH classifications