Clustering of Small Territories Based on Axes of Inequality.
big data
classifiers
clustering
e-cohort
hierarchical k-means
inequalities
machine learning
Journal
International journal of environmental research and public health
ISSN: 1660-4601
Abbreviated title: Int J Environ Res Public Health
Country: Switzerland
NLM ID: 101238455
Publication information
Publication date: 2022-03-12
History:
received: 2022-02-07
revised: 2022-03-07
accepted: 2022-03-10
entrez: 2022-03-25
pubmed: 2022-03-26
medline: 2022-04-20
Status: epublish
Abstract
This paper reports a preliminary study carried out to design the sample for a planned e-cohort. The e-cohort must represent the province of Girona well enough to allow the province to be studied along the axes of inequality. The territory under study is divided into municipalities, which are characterized along these axes. The study compares 14 clustering algorithms on 3 data sets of municipal information to identify the most consistent grouping. Before clustering, a variable selection step discarded the variables that were not useful. The comparison followed two axes: results and graphical representation. Intra-cluster results were also analyzed to assess the coherence of the grouping. Finally, we studied the probability of belonging to a cluster, such as the one containing the county capital. The resulting clustering can serve as the basis for working with a sample that is significant and representative of the territory.
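The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins for municipal indicators, only two algorithms (k-means and average-linkage agglomerative clustering) stand in for the 14 that were compared, and the mean silhouette coefficient is an assumed consistency criterion that the abstract does not name. All function and variable names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            group = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster happens to become empty.
            updated.append(tuple(sum(x) / len(group) for x in zip(*group)) if group else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def agglomerative(points, k):
    """Average-linkage agglomerative clustering, merged down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(math.dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for idx in members:
            labels[idx] = c
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient (assumes at least two non-empty clusters)."""
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)
    scores = []
    for i, p in enumerate(points):
        own = [j for j in members[labels[i]] if j != i]
        if not own:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(math.dist(p, points[j]) for j in own) / len(own)
        b = min(sum(math.dist(p, points[j]) for j in idx) / len(idx)
                for lab, idx in members.items() if lab != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic data: three well-separated groups of 15 "municipalities",
# each described by two made-up indicators.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
          for cx, cy in blob_centers for _ in range(15)]

results = {
    "k-means": silhouette(points, kmeans(points, 3)),
    "agglomerative": silhouette(points, agglomerative(points, 3)),
}
best = max(results, key=results.get)
for name, score in results.items():
    print(f"{name}: mean silhouette = {score:.3f}")
print("most consistent grouping under this criterion:", best)
```

A higher mean silhouette indicates tighter, better-separated clusters, which is one possible reading of "most consistent"; a real comparison would also repeat the procedure over each data set and inspect the groupings graphically, as the abstract states.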
Identifiers
pubmed: 35329047
pii: ijerph19063359
doi: 10.3390/ijerph19063359
pmc: PMC8955561
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM