Health-Related Content in Transformer-Based Deep Neural Network Language Models: Exploring Cross-Linguistic Syntactic Bias.

COVID-19; Corpora; Knowledge Reproduction; Language Models; Natural Language Processing

Journal

Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582

Publication information

Publication date:
29 Jun 2022
History:
entrez: 1 Jul 2022
pubmed: 2 Jul 2022
medline: 6 Jul 2022
Status: ppublish

Abstract

This paper explores a methodology for bias quantification in transformer-based deep neural network language models for Chinese, English, and French. When queried with health-related mythbusters on COVID-19, we observe a bias that is not of a semantic/encyclopaedical knowledge nature, but rather a syntactic one, as predicted by theoretical insights of structural complexity. Our results highlight the need for the creation of health-communication corpora as training sets for deep learning.

Identifiers

pubmed: 35773848
pii: SHTI220702
doi: 10.3233/SHTI220702

Publication types

Journal Article

Languages

eng

Pagination

221-225

Authors

Giuseppe Samo (G)

Beijing Language and Culture University.

Caterina Bonan (C)

University of Cambridge.

Fuzhen Si (F)

Beijing Language and Culture University.
