The Potential of Using Generative AI/NLP to Identify and Analyse Critical Incidents in a Critical Incident Reporting System (CIRS): A Feasibility Case-Control Study.

Keywords: healthcare quality improvement; human error; human factors; patient safety; safety culture

Abstract

BACKGROUND

To enhance patient safety in healthcare, it is crucial to address the underreporting of issues in Critical Incident Reporting Systems (CIRSs). This study aims to evaluate the effectiveness of generative Artificial Intelligence and Natural Language Processing (AI/NLP) in reviewing CIRS cases by comparing its performance with that of human reviewers and by categorising these cases into relevant topics.

METHODS

A case-control feasibility study was conducted using CIRS cases from the German CIRS-Anaesthesiology subsystem. Each case was reviewed by a human expert and by an AI/NLP model (ChatGPT-3.5). Two CIRS experts blindly assessed these reviews, rating them on linguistic quality, recognisable expertise, logical derivability, and overall quality using six-point Likert scales.

RESULTS

On average, the CIRS experts correctly classified 80% of human CIRS reviews as created by a human and misclassified 45.8% of AI reviews as written by a human. Ratings on a scale of 1 (very good) to 6 (failed) revealed a comparable performance between human- and AI-generated reviews across the dimensions of linguistic expression (

CONCLUSIONS

This feasibility study demonstrates the potential of generative AI/NLP in analysing and categorising cases from the CIRS. This could have implications for improving incident reporting in healthcare. Further research is required to verify and extend these findings.

Identifiers

DOI: 10.3390/healthcare12191964

PMID: 39408144

PII: healthcare12191964

Publication types

Journal Article

Language

English

Authors

Carlos Ramon Hölzing (CR)

Sebastian Rumpf (S)

Stephan Huber (S)

Nathalie Papenfuß (N)

Patrick Meybohm (P)

Oliver Happel (O)

MeSH classifications