Health-Related Content in Transformer-Based Deep Neural Network Language Models: Exploring Cross-Linguistic Syntactic Bias.

COVID-19; Humans; Language; Linguistics; Neural Networks, Computer; Semantics

COVID-19; Corpora; Knowledge Reproduction; Language Models; Natural Language Processing

Journal

Studies in health technology and informatics

ISSN: 1879-8365

Abbreviated title: Stud Health Technol Inform

Country: Netherlands

NLM ID: 9214582

Publication information

Publication date:
29 Jun 2022

History:

entrez: 1 Jul 2022

pubmed: 2 Jul 2022

medline: 6 Jul 2022

Status: ppublish

Abstract

This paper explores a methodology for bias quantification in transformer-based deep neural network language models for Chinese, English, and French. When queried with health-related mythbusters on COVID-19, we observe a bias that is not of a semantic/encyclopaedical knowledge nature, but rather a syntactic one, as predicted by theoretical insights of structural complexity. Our results highlight the need for the creation of health-communication corpora as training sets for deep learning.

Identifiers

DOI: 10.3233/SHTI220702 PMID: 35773848

pii: SHTI220702

Publication types

Journal Article

Languages

eng

Pagination

221-225

Authors

Giuseppe Samo (G)

Caterina Bonan (C)

Fuzhen Si (F)
