Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling, and cloud-native open-source tools.


Journal

bioRxiv : the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187

Publication information

Publication date:
28 Apr 2023
History:
pubmed: 10 May 2023
medline: 10 May 2023
entrez: 10 May 2023
Status: epublish

Abstract

Pose estimation algorithms are shedding new light on animal behavior and intelligence. Most existing models are only trained with labeled frames (supervised learning). Although effective in many cases, the fully supervised approach requires extensive image labeling, struggles to generalize to new videos, and produces noisy outputs that hinder downstream analyses. We address each of these limitations with a semi-supervised approach that leverages the spatiotemporal statistics of unlabeled videos in two different ways. First, we introduce unsupervised training objectives that penalize the network whenever its predictions violate smoothness of physical motion, multiple-view geometry, or depart from a low-dimensional subspace of plausible body configurations. Second, we design a new network architecture that predicts pose for a given frame using temporal context from surrounding unlabeled frames. These context frames help resolve brief occlusions or ambiguities between nearby and similar-looking body parts. The resulting pose estimation networks achieve better performance with fewer labels, generalize better to unseen videos, and provide smoother and more reliable pose trajectories for downstream analysis; for example, these improved pose trajectories exhibit stronger correlations with neural activity. We also propose a Bayesian post-processing approach based on deep ensembling and Kalman smoothing that further improves tracking accuracy and robustness. We release a deep learning package that adheres to industry best practices, supporting easy model development and accelerated training and prediction. Our package is accompanied by a cloud application that allows users to annotate data, train networks, and predict new videos at scale, directly from the browser.
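The temporal-smoothness objective mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation or the package's API: the function name `temporal_smoothness_penalty`, the `(T, K, 2)` array layout, and the tolerance `epsilon` (in pixels) are assumptions made for illustration only. The idea shown is simply that frame-to-frame keypoint jumps larger than a tolerance incur a penalty, which can be computed on unlabeled video because it needs no ground-truth labels.

```python
import numpy as np


def temporal_smoothness_penalty(keypoints, epsilon=0.0):
    """Mean excess frame-to-frame displacement of predicted keypoints.

    keypoints: array of shape (T, K, 2) holding (x, y) predictions for
               K body parts over T consecutive frames (assumed layout).
    epsilon:   hypothetical tolerance in pixels; jumps at or below it
               are treated as plausible motion and are not penalized.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    # Euclidean distance each keypoint moves between consecutive frames: (T-1, K)
    jumps = np.linalg.norm(np.diff(keypoints, axis=0), axis=-1)
    # Only the part of each jump that exceeds the tolerance is penalized
    excess = np.maximum(jumps - epsilon, 0.0)
    return float(excess.mean())


# One keypoint moving 1 px per frame: no penalty with a 2 px tolerance.
smooth = np.stack([np.array([[float(t), 0.0]]) for t in range(5)])

# Same trajectory with a single-frame glitch at t = 2.
jumpy = smooth.copy()
jumpy[2, 0] = [50.0, 0.0]

print(temporal_smoothness_penalty(smooth, epsilon=2.0))
print(temporal_smoothness_penalty(jumpy, epsilon=2.0))
```

In a training setting, a term of this kind would be added to the supervised loss with a weighting factor, so that predictions on unlabeled frames are discouraged from jumping implausibly between neighboring frames.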

Identifiers

pubmed: 37162966
doi: 10.1101/2023.04.28.538703
pmc: PMC10168383

Publication types

Preprint

Languages

eng

Grants

Agency: NINDS NIH HHS
ID: U19 NS104649
Country: United States

Authors

Dan Biderman (D)

Columbia University, New York, USA.

Matthew R Whiteway (MR)

Columbia University, New York, USA.

Cole Hurwitz (C)

Columbia University, New York, USA.

Nicholas Greenspan (N)

Columbia University, New York, USA.

Robert S Lee (RS)

Work done while at Lightning.ai, New York, USA.

Ankit Vishnubhotla (A)

Columbia University, New York, USA.

Richard Warren (R)

Columbia University, New York, USA.

Federico Pedraja (F)

Columbia University, New York, USA.

Dillon Noone (D)

Columbia University, New York, USA.

Michael Schartner (M)

Champalimaud Centre for the Unknown, Lisbon, Portugal.

Julia M Huntenburg (JM)

Max Planck Institute for Biological Cybernetics, Tübingen, Germany.

Anup Khanal (A)

University of California Los Angeles, Los Angeles, USA.

Guido T Meijer (GT)

Champalimaud Centre for the Unknown, Lisbon, Portugal.

Jean-Paul Noel (JP)

New York University, New York, USA.

Alejandro Pan-Vazquez (A)

Princeton University, Princeton, USA.

Karolina Z Socha (KZ)

University College London, London, United Kingdom.

Anne E Urai (AE)

Leiden University, Leiden, The Netherlands.

John P Cunningham (JP)

Columbia University, New York, USA.

Nathaniel Sawtell (N)

Columbia University, New York, USA.

Liam Paninski (L)

Columbia University, New York, USA.

MeSH classifications