Evaluation of home detection algorithms on mobile phone data using individual-level ground truth.

Data science; Home location detection; Human mobility; Mobile phone data

Journal

EPJ data science
ISSN: 2193-1127
Abbreviated title: EPJ Data Sci
Country: Germany
NLM ID: 101686785

Publication information

Publication date:
2021
History:
received: 05 11 2020
accepted: 12 05 2021
entrez: 7 6 2021
pubmed: 8 6 2021
medline: 8 6 2021
Status: ppublish

Abstract

Inferring mobile phone users' home location, i.e., assigning a location in space to a user based on data generated by the mobile phone network, is a central task in leveraging mobile phone data to study social and urban phenomena. Despite its widespread use, home detection relies on assumptions that are difficult to check without ground truth, i.e., knowledge of where the individual who owns the device resides. In this paper, we present a dataset that comprises the mobile phone activity of sixty-five participants for whom the geographical coordinates of their residence location are known. The mobile phone activity consists of Call Detail Records (CDRs), eXtended Detail Records (XDRs), and Control Plane Records (CPRs), which vary in their temporal granularity and differ in their data generation mechanism. We provide an unprecedented evaluation of the accuracy of home detection algorithms and quantify the amount of data each stream requires for successful home detection. Our work is useful for researchers and practitioners who want to minimize data requests and maximize the accuracy of the home antenna location. The online version contains supplementary material available at 10.1140/epjds/s13688-021-00284-9.
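
As an illustration of what a home detection algorithm does, the following is a minimal sketch of one common heuristic: pick the antenna a user connects to most often during night hours. It is not taken from the paper. The function name `detect_home_antenna`, the record layout (timestamp, antenna identifier), and the 19:00 to 07:00 night window are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime


def detect_home_antenna(records, night_start=19, night_end=7):
    """Return the antenna seen most often at night, or None.

    records: iterable of (timestamp, antenna_id) pairs, where timestamp
    is a datetime. This layout is hypothetical; real CDR, XDR, and CPR
    streams carry more fields.
    night_start / night_end: assumed night window, in local hours.
    """
    night_counts = Counter(
        antenna
        for ts, antenna in records
        if ts.hour >= night_start or ts.hour < night_end
    )
    if not night_counts:
        # No night-time activity: the heuristic cannot decide.
        return None
    return night_counts.most_common(1)[0][0]


# Toy records for a single hypothetical user.
records = [
    (datetime(2021, 1, 1, 23, 0), "A"),
    (datetime(2021, 1, 2, 2, 0), "A"),
    (datetime(2021, 1, 2, 12, 0), "B"),
    (datetime(2021, 1, 2, 13, 0), "B"),
    (datetime(2021, 1, 2, 14, 0), "B"),
    (datetime(2021, 1, 2, 22, 0), "C"),
]
home = detect_home_antenna(records)
```

With ground truth such as the residence coordinates described above, the detected antenna could be compared against the antenna nearest to the known residence to score accuracy. Antenna "B" is the most active overall in the toy data, but it is ignored because all of its records fall in the daytime.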

Identifiers

pubmed: 34094810
doi: 10.1140/epjds/s13688-021-00284-9
pii: 284
pmc: PMC8170634

Publication types

Journal Article

Languages

eng

Pagination

29

Copyright information

© The Author(s) 2021.

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.

Authors

Luca Pappalardo (L)

Institute of Information Science and Technologies (ISTI), National Research Council (CNR), Pisa, Italy.

Leo Ferres (L)

Faculty of Engineering, Universidad del Desarrollo, Santiago, Chile.
Telefónica R&D, Santiago, Chile.
ISI Foundation, Turin, Italy.

Manuel Sacasa (M)

Telefónica R&D, Santiago, Chile.

Ciro Cattuto (C)

University of Turin, Turin, Italy.
ISI Foundation, Turin, Italy.

Loreto Bravo (L)

Faculty of Engineering, Universidad del Desarrollo, Santiago, Chile.
Telefónica R&D, Santiago, Chile.
