Large-scale analysis of interobserver agreement and reliability in cardiotocography interpretation during labor using an online tool.

Cardiotocography; Fetal heart rate; Fetal hypoxia; Interobserver agreement; Intrapartum; Labor

Journal

BMC pregnancy and childbirth
ISSN: 1471-2393
Abbreviated title: BMC Pregnancy Childbirth
Country: England
NLM ID: 100967799

Publication information

Publication date:
14 Feb 2024
History:
received: 27 Sep 2023
accepted: 05 Feb 2024
medline: 15 Feb 2024
pubmed: 15 Feb 2024
entrez: 14 Feb 2024
Status: epublish

Abstract

While the effectiveness of cardiotocography in reducing neonatal morbidity is still debated, it remains the primary method for assessing fetal well-being during labor. Evaluating how accurately professionals interpret cardiotocography signals is essential for its effective use. The objective was to evaluate the accuracy of fetal hypoxia prediction by practitioners through the interpretation of cardiotocography signals and clinical variables during labor.

We conducted a cross-sectional online survey involving 120 obstetric healthcare providers from several countries. One hundred cases, including fifty cases of fetal hypoxia, were randomly assigned to participants, who were invited to predict the fetal outcome (binary criterion of pH with a threshold of 7.15) based on the cardiotocography signals and clinical variables. After describing the participants, we calculated (with a 95% confidence interval) the success rate, sensitivity and specificity of fetal outcome prediction for the whole population and according to pH ranges, professional groups and number of years of experience. Interobserver agreement and reliability were evaluated using the proportion of agreement and Cohen's kappa, respectively.

The overall ability to predict a pH level below 7.15 yielded a success rate of 0.58 (95% CI 0.56-0.60), a sensitivity of 0.58 (95% CI 0.56-0.60) and a specificity of 0.63 (95% CI 0.61-0.65). No significant difference in the success rates was observed with respect to profession or number of years of experience. The success rate was higher for cases with a pH level below 7.05 (0.69) or above 7.20 (0.66) than for those falling between 7.05 and 7.20 (0.48). The proportion of agreement between participants was good (0.82), with an overall kappa coefficient indicating substantial reliability (0.63).

The use of an online tool enabled us to collect a large amount of data to analyze how practitioners interpret cardiotocography data during labor. Despite a good level of agreement and reliability among practitioners, the overall accuracy is poor, particularly for cases with a neonatal pH between 7.05 and 7.20. Factors such as profession and experience level have no notable impact on the accuracy of the annotations. The implementation and use of computerized cardiotocography analysis software has the potential to improve the accuracy of fetal hypoxia detection, especially for ambiguous cardiotocography tracings.
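The metrics named in the abstract (success rate, sensitivity, specificity with 95% confidence intervals, proportion of agreement, and Cohen's kappa) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names, the toy data, and the choice of a Wald (normal-approximation) interval are assumptions made for illustration; the abstract does not state which interval method was used or how kappa was pooled across participants.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Proportion with a Wald (normal-approximation) 95% CI.

    Assumption: the study's actual CI method is not given in the abstract.
    Returns (estimate, lower, upper), clipped to [0, 1].
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def accuracy_metrics(truth, predicted):
    """Success rate, sensitivity and specificity for binary labels.

    1 = hypoxia (pH below the threshold), 0 = no hypoxia.
    Assumes both classes are present in `truth`.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "success_rate": proportion_ci(tp + tn, len(pairs)),
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
    }


def agreement_and_kappa(rater_a, rater_b):
    """Proportion of agreement and Cohen's kappa for two raters, binary labels."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    pos_a = sum(rater_a) / n  # share of cases rater A labels as hypoxia
    pos_b = sum(rater_b) / n  # share of cases rater B labels as hypoxia
    # Agreement expected by chance from each rater's marginal rates.
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    if expected == 1.0:
        return observed, 1.0  # degenerate case: both raters use a single label
    return observed, (observed - expected) / (1 - expected)
```

With hypothetical labels `truth = [1, 1, 1, 1, 0, 0, 0, 0]` and two raters `[1, 1, 1, 0, 0, 0, 0, 1]` and `[1, 1, 0, 0, 0, 0, 1, 1]`, the first rater's sensitivity and specificity are both 0.75, and the two raters agree on 0.75 of cases with a kappa of 0.5. Kappa is lower than raw agreement because it discounts the agreement expected by chance, which is why the abstract reports the two separately.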

Identifiers

pubmed: 38355457
doi: 10.1186/s12884-024-06322-4
pii: 10.1186/s12884-024-06322-4

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

136

Copyright information

© 2024. The Author(s).

Authors

Imane Ben M'Barek (I)

Service de Gynécologie Obstétrique, Assistance Publique Hôpitaux de Paris - Hôpital Beaujon, 100 boulevard du Général Leclerc, Clichy La Garenne, France. imane.benmbarek@aphp.fr.
Université Paris Cité, 75006, Paris, France.
Health Simulation Department, iLumens, Université Paris Cité, Paris, France.

Badr Ben M'Barek (B)

Genos Care, Paris, France.

Grégoire Jauvion (G)

Genos Care, Paris, France.

Emilia Holmström (E)

Service de Gynécologie Obstétrique, Assistance Publique Hôpitaux de Paris - Hôpital Beaujon, 100 boulevard du Général Leclerc, Clichy La Garenne, France.
Université Paris Cité, 75006, Paris, France.

Antoine Agman (A)

Service de Gynécologie Obstétrique, Assistance Publique Hôpitaux de Paris - Hôpital Beaujon, 100 boulevard du Général Leclerc, Clichy La Garenne, France.

Jade Merrer (J)

AP-HP.Nord-Université Paris Cité, Hôpital Universitaire Robert Debré, Unité d'épidémiologie clinique, Inserm, CIC 1426, Paris, France.

Pierre-François Ceccaldi (PF)

Service de Gynécologie-Obstétrique et Médecine de la reproduction, Hôpital Foch, 40 Rue Worth, 92150, Suresnes, France.

MeSH classifications