Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl.

Keywords: Anatidae; Animal behavior; Automated model pipeline; Biologging; Classification; Daily activity; Daily activity routine; Global positioning system; Life history state; Supervised machine learning; Telemetry; Waterfowl

Journal

Movement ecology
ISSN: 2051-3933
Abbreviated title: Mov Ecol
Country: England
NLM ID: 101635009

Publication information

Publication date:
16 May 2022
History:
received: 05 Jan 2022
accepted: 03 May 2022
entrez: 16 May 2022
pubmed: 17 May 2022
medline: 17 May 2022
Status: epublish

Abstract

Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and for effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high-frequency location or accelerometry data. However, patterns reflecting life history across longer time scales can have greater relevance to species biology or management needs, especially when available in near real-time. Given limitations in collecting and using such data to accurately classify complex behaviors over the long term, we used hourly GPS data from five waterfowl species to produce daily activity classifications with machine-learned models using "automated modelling pipelines". Automated pipelines are computer-generated code that completes many tasks, including feature engineering, multi-framework model development, training, validation, and hyperparameter tuning, to produce daily classifications from eight activity patterns reflecting waterfowl life history or movement states. We developed several input features for modelling, grouped into three broad categories, hereafter "feature sets": GPS locations, habitat information, and movement history. Each feature set used different data sources or data collected across different time intervals to develop the "features" (independent variables) used in models. Automated modelling pipelines rapidly developed easily reproducible data preprocessing and analysis steps, identified and optimized the best-performing model, and provided outputs for interpreting feature importance. Unequal expression of life history states caused unbalanced classes, so we evaluated feature set importance using a weighted F1-score to balance model recall and precision among individual classes. Although the best model using the least restrictive feature set (only 24 hourly relocations in a day) produced effective classifications (weighted F1 = 0.887), models using all feature sets performed substantially better (weighted F1 = 0.95), particularly for rarer but demographically more impactful life history states (i.e., nesting). Automated pipelines generated models producing highly accurate classifications of complex daily activity patterns using relatively low-frequency GPS data and incorporating more classes than previous GPS studies. Near real-time classification is possible, which is ideal for time-sensitive needs such as identifying reproduction. Including habitat and longer sequences of spatial information produced more accurate classifications but incurred slight delays in processing.
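
The abstract reports a weighted F1-score but does not give its formula. The following minimal sketch assumes the common definition: the F1-score is computed per class and the per-class scores are averaged with weights proportional to each class's number of true instances (its support). The class labels `forage` and `nest` and the toy data are hypothetical illustrations, not values from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: days correctly assigned to this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += (n_true / total) * f1
    return score


# Hypothetical toy data: 8 common-class days and 2 rare-class days,
# with one rare-class day misclassified as the common class.
y_true = ["forage"] * 8 + ["nest"] * 2
y_pred = ["forage"] * 8 + ["forage", "nest"]
print(round(weighted_f1(y_true, y_pred), 3))
```

In this toy case the rare class has an F1 of 2/3 while the common class has 16/17. Because the weights follow class frequency, the overall score stays high even when a rare class is classified poorly, so per-class scores remain worth inspecting alongside the weighted value.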

Identifiers

pubmed: 35578372
doi: 10.1186/s40462-022-00324-7
pii: 10.1186/s40462-022-00324-7
pmc: PMC9109391

Publication types

Journal Article

Languages

eng

Pagination

23

Copyright information

© 2022. The Author(s).

Authors

Cory Overton (C)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA. coverton@usgs.gov.

Michael Casazza (M)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Joseph Bretz (J)

Cloud Hosting Solutions, U.S. Geological Survey, Bozeman, MT, USA.

Fiona McDuie (F)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.
Moss Landing Laboratories, San Jose State University Research Foundation, San Jose, CA, USA.

Elliott Matchett (E)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Desmond Mackell (D)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Austen Lorenz (A)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Andrea Mott (A)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Mark Herzog (M)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

Josh Ackerman (J)

Western Ecological Research Center, U.S. Geological Survey, Dixon Field Station, Dixon, CA, USA.

MeSH classifications