Uncovering Effective Explanations for Interactive Genomic Data Analysis.

Keywords: explanation; feature pair; optimization; separability problem

Journal

Patterns (New York, N.Y.)
ISSN: 2666-3899
Abbreviated title: Patterns (N Y)
Country: United States
NLM ID: 101767765

Publication information

Publication date:
11 Sep 2020
History:
received: 10 05 2020
revised: 13 07 2020
accepted: 05 08 2020
entrez: 18 11 2020
pubmed: 19 11 2020
medline: 19 11 2020
Status: epublish

Abstract

Better tools are needed to enable researchers to quickly identify and explore effective and interpretable feature-based explanations for discriminating multi-class genomic datasets, e.g., healthy versus diseased samples. We develop an interactive exploration tool, GENVISAGE, which rapidly discovers the most discriminative feature pairs that separate two classes of genomic objects and then displays the corresponding visualizations. Since quickly finding top feature pairs is computationally challenging, especially for large numbers of objects and features, we propose a suite of optimizations to make GENVISAGE responsive at scale and demonstrate that our optimizations lead to a 400× speedup over competitive baselines for multiple biological datasets. We apply our rapid and interpretable tool to identify literature-supported pairs of genes whose transcriptomic responses significantly discriminate several chemotherapy drug treatments. With its generalizable optimizations and framework, GENVISAGE opens up real-time feature-based explanation generation to data from massive sequencing efforts, as well as many other scientific domains.
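The search described in the abstract can be illustrated with a minimal brute-force sketch: score every feature pair by how well it separates two classes in two dimensions, then keep the best pairs. The abstract does not specify the separability measure or the optimizations, so the nearest-centroid scoring rule used here is a stand-in assumption, and all names (`pair_score`, `top_feature_pairs`, `rows`, `labels`) are hypothetical. This is the unoptimized baseline idea, not the GENVISAGE implementation.

```python
from itertools import combinations


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def pair_score(pos, neg):
    """Fraction of points strictly closer to their own class centroid.

    Assumed stand-in measure; ties count as misclassified.
    """
    cp, cn = centroid(pos), centroid(neg)

    def d2(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    correct = sum(d2(p, cp) < d2(p, cn) for p in pos)
    correct += sum(d2(p, cn) < d2(p, cp) for p in neg)
    return correct / (len(pos) + len(neg))


def top_feature_pairs(rows, labels, k=1):
    """Exhaustively score all feature pairs and return the k best.

    rows: list of equal-length numeric feature vectors (one per object).
    labels: list of booleans marking the two classes (both must be present).
    Returns a list of (score, (i, j)) tuples, best first.
    """
    n_features = len(rows[0])
    scored = []
    for i, j in combinations(range(n_features), 2):
        pos = [(r[i], r[j]) for r, y in zip(rows, labels) if y]
        neg = [(r[i], r[j]) for r, y in zip(rows, labels) if not y]
        scored.append((pair_score(pos, neg), (i, j)))
    # Highest score first; break ties by feature indices for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:k]


# Toy data: features 0 and 1 separate the classes jointly (by a diagonal
# line) but not individually; feature 2 is uninformative.
rows = [
    (0, 1, 0), (1, 2, 1), (2, 3, 0), (3, 4, 1),  # class True
    (1, 0, 0), (2, 1, 1), (3, 2, 0), (4, 3, 1),  # class False
]
labels = [True] * 4 + [False] * 4

best = top_feature_pairs(rows, labels, k=1)
print(best)
```

The exhaustive loop visits a number of pairs that grows quadratically with the number of features and touches every object for each pair, which is why this kind of search becomes expensive for large numbers of objects and features.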

Identifiers

pubmed: 33205133
doi: 10.1016/j.patter.2020.100093
pii: S2666-3899(20)30121-5
pmc: PMC7660438

Publication types

Journal Article

Languages

eng

Pagination

100093

Grants

Agency: NIBIB NIH HHS
ID: U54 EB020406
Country: United States
Agency: NIGMS NIH HHS
ID: U54 GM114838
Country: United States

Copyright information

© 2020 The Authors.

Conflict of interest statement

The authors declare no competing interests.

Authors

Silu Huang (S)

Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.

Charles Blatti (C)

Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.

Saurabh Sinha (S)

Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.

Aditya Parameswaran (A)

Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
School of Information and Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA 94704, USA.

MeSH classifications