Representation transfer for differentially private drug sensitivity prediction.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
15 July 2019
History:
entrez: 13 September 2019
pubmed: 13 September 2019
medline: 13 June 2020
Status: ppublish

Abstract

Human genomic datasets often contain sensitive information that limits the use and sharing of the data. In particular, simple anonymization strategies fail to provide a sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine learning can help by guaranteeing that the published results do not leak too much information about any individual data point. Recent research has reached promising results on differentially private drug sensitivity prediction using gene expression data. Differentially private learning with genomic data is challenging because it is more difficult to guarantee privacy in high dimensions. Dimensionality reduction can help, but if the dimension reduction mapping is learned from the private data, then the mapping must be differentially private too, which can carry a significant privacy cost. Furthermore, the selection of any hyperparameters (such as the target dimensionality) must also avoid leaking private information.

We study an approach that uses a large public dataset of a similar type to learn a compact representation for differentially private learning. We compare three representation learning methods: variational autoencoders, principal component analysis and random projection. We solve two machine learning tasks on gene expression of cancer cell lines: cancer type classification and drug sensitivity prediction. The experiments demonstrate significant benefit from all representation learning methods, with variational autoencoders providing the most accurate predictions most often. Our results significantly improve on the previous state of the art in the accuracy of differentially private drug sensitivity prediction.

Code used in the experiments is available at https://github.com/DPBayes/dp-representation-transfer.
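
The general idea described in the abstract (learn the representation on public data, then spend the privacy budget only on the low-dimensional private learning step) can be illustrated with a minimal sketch. This is not the authors' implementation; their code is at the URL above. The sketch assumes principal component analysis as the representation, a ridge regression fitted from Gaussian-perturbed sufficient statistics as the private learner, add/remove-one-record neighbouring datasets, and eps < 1 for the classical Gaussian mechanism calibration. The function names fit_public_pca, clip_rows and dp_ridge, and the clipping bounds of 1, are hypothetical choices made for illustration.

```python
import numpy as np


def fit_public_pca(x_public, k):
    """Learn a k-dimensional PCA map from PUBLIC data only (no privacy cost)."""
    mean = x_public.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x_public - mean, full_matrices=False)
    return mean, vt[:k].T  # shapes (d,) and (d, k)


def clip_rows(z, bound=1.0):
    """Scale each row so that its L2 norm is at most `bound`."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z * np.minimum(1.0, bound / np.maximum(norms, 1e-12))


def dp_ridge(z, y, eps, delta, lam=1.0, rng=None):
    """Ridge regression from Gaussian-perturbed sufficient statistics.

    Assumption: neighbouring datasets differ by adding or removing one row.
    With ||z_i|| <= 1 and |y_i| <= 1, one row changes Z^T Z by at most 1
    (Frobenius norm) and Z^T y by at most 1, so the joint L2 sensitivity
    is sqrt(2).
    """
    rng = np.random.default_rng() if rng is None else rng
    k = z.shape[1]
    z = clip_rows(z, 1.0)
    y = np.clip(y, -1.0, 1.0)
    sensitivity = np.sqrt(2.0)
    # Classical Gaussian mechanism calibration (assumes eps < 1).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    zz = z.T @ z + rng.normal(0.0, sigma, size=(k, k))
    zy = z.T @ y + rng.normal(0.0, sigma, size=k)
    # Post-processing only: symmetrise and project onto the PSD cone.
    zz = (zz + zz.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(zz)
    zz = (eigvecs * np.maximum(eigvals, 0.0)) @ eigvecs.T
    return np.linalg.solve(zz + lam * np.eye(k), zy)
```

In use, the private expression matrix would be mapped with (x_private - mean) @ components before calling dp_ridge. The target dimensionality k is fixed in advance or tuned on public data, so that choosing it does not consume any of the privacy budget.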

Identifiers

pubmed: 31510659
pii: 5529143
doi: 10.1093/bioinformatics/btz373
pmc: PMC6612875

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

i218-i224

Copyright information

© The Author(s) 2019. Published by Oxford University Press.

Authors

Teppo Niinimäki (T)

Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.

Mikko A Heikkilä (MA)

Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.

Antti Honkela (A)

Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.
Department of Public Health, University of Helsinki, Helsinki, Finland.
Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland.

Samuel Kaski (S)

Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
