Automated Coding of Job Descriptions From a General Population Study: Overview of Existing Tools, Their Application and Comparison.
automatic job coding tool
free-text job description
general population studies
reliability
Journal
Annals of work exposures and health
ISSN: 2398-7316
Abbreviated title: Ann Work Expo Health
Country: England
NLM ID: 101698454
Publication information
Publication date:
06 06 2023
History:
received: 22 09 2022
accepted: 09 01 2023
medline: 8 6 2023
pubmed: 4 2 2023
entrez: 3 2 2023
Status:
ppublish
Abstract
Automatic job coding tools were developed to reduce the laborious task of manually assigning job codes based on free-text job descriptions in census and survey data sources, including large occupational health studies. This study provides a case study comparing existing coding tools on agreement in job coding and in exposures assigned by job-exposure matrices (JEMs). We compared three automatic job coding tools [AUTONOC, CASCOT (Computer-Assisted Structured Coding Tool), and LabourR], which were selected based on availability, coding of English free-text into coding systems closely related to the 1988 version of the International Standard Classification of Occupations (ISCO-88), and capability to perform batch coding. To assess their performance, we used manually coded job histories from the AsiaLymph case-control study, which were translated into English prior to auto-coding. We applied two general population JEMs to assess agreement at the exposure level. Percent agreement and PABAK (Prevalence-Adjusted Bias-Adjusted Kappa) were used to quantify agreement between the manual coders and the automatic coding tools. Across the three tools, coding percent agreement ranged from 17.7 to 26.0% for exact matches at the most detailed 4-digit ISCO-88 level. The agreement was better at a more general level of job coding (e.g. 43.8-58.1% in 1-digit ISCO-88), and in exposure assignments (median values of PABAK coefficient ranging 0.69-0.78 across 12 JEM-assigned exposures). Based on our testing data, CASCOT outperformed the other tools, with better agreement in both job coding (26% 4-digit agreement) and exposure assignment (median kappa 0.61). In this study, we observed that agreement on job coding was generally low for the three tools but noted a higher degree of agreement in assigned exposures.
The results indicate the need for study-specific evaluations prior to their automatic use in general population studies, as well as improvements in the evaluated automatic coding tools.
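The two agreement measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the job codes and exposure flags are invented toy data, codes are assumed to be stored as 4-character strings, and exposures are assumed to be binary (0/1). For two categories, PABAK reduces to 2 x Po - 1, where Po is the observed proportion of agreement.

```python
def percent_agreement(manual, auto, digits=4):
    """Proportion of jobs whose manual and automatic codes match
    on the first `digits` characters (4 = exact match, 1 = major group)."""
    if len(manual) != len(auto) or not manual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(m[:digits] == a[:digits] for m, a in zip(manual, auto))
    return matches / len(manual)


def pabak_binary(manual, auto):
    """Prevalence-adjusted bias-adjusted kappa for a binary rating:
    PABAK = 2 * Po - 1, with Po the observed proportion of agreement."""
    if len(manual) != len(auto) or not manual:
        raise ValueError("inputs must be non-empty and of equal length")
    po = sum(m == a for m, a in zip(manual, auto)) / len(manual)
    return 2 * po - 1


# Invented toy data (assumption): 4-character job codes for four jobs.
manual_codes = ["7212", "8211", "5122", "9313"]
auto_codes = ["7212", "8290", "5123", "7129"]

# Invented toy data (assumption): one binary JEM-assigned exposure for ten jobs,
# derived once from the manual codes and once from the automatic codes.
manual_exposed = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
auto_exposed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

if __name__ == "__main__":
    for d in (4, 3, 2, 1):
        print(f"{d}-digit agreement:", percent_agreement(manual_codes, auto_codes, d))
    print("PABAK:", pabak_binary(manual_exposed, auto_exposed))
```

Truncating codes to fewer digits mirrors the comparison at coarser levels of the classification, which is why agreement can only stay the same or rise as the number of digits falls.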
Identifiers
pubmed: 36734402
pii: 7025461
doi: 10.1093/annweh/wxad002
pmc: PMC10243927
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Intramural
Languages
eng
Citation subsets
IM
Pagination
663-672
Copyright information
© The Author(s) 2023. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.