Searching PubMed to Retrieve Publications on the COVID-19 Pandemic: Comparative Analysis of Search Strings.
COVID-19
PubMed
coronavirus
literature
literature searching
pandemic
performance
research
scientific publishing
search
Journal
Journal of medical Internet research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882
Publication information
Publication date:
26 11 2020
History:
received: 17 08 2020
accepted: 24 10 2020
revised: 10 10 2020
pubmed: 17 11 2020
medline: 22 12 2020
entrez: 16 11 2020
Status:
epublish
Abstract
Abstract sections
BACKGROUND
Since it was declared a pandemic on March 11, 2020, COVID-19 has dominated headlines around the world and researchers have generated thousands of scientific articles about the disease. The fast speed of publication has challenged researchers and other stakeholders to keep up with the volume of published articles. To search the literature effectively, researchers use databases such as PubMed.
OBJECTIVE
The aim of this study is to evaluate the performance of different searches for COVID-19 records in PubMed and to assess the complexity of searches required.
METHODS
We tested PubMed searches for COVID-19 to identify which search string performed best according to standard metrics (sensitivity, precision, and F-score). We evaluated the performance of 8 different searches in PubMed during the first 10 weeks of the COVID-19 pandemic to investigate how complex a search string is needed. We also tested omitting hyphens and space characters as well as applying quotation marks.
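The three metrics named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `search_metrics` and the toy record IDs are hypothetical, and the F-score is assumed to be the usual F1 (harmonic mean of sensitivity and precision), which the abstract does not state explicitly.

```python
def search_metrics(retrieved, relevant):
    """Return (sensitivity, precision, f_score) for one search.

    retrieved: record IDs returned by the search string
    relevant:  record IDs in the reference set of true COVID-19 records
    Assumption: F-score is F1, the harmonic mean of the other two metrics.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)

    # Sensitivity (recall): share of relevant records that the search found.
    sensitivity = true_positives / len(relevant) if relevant else 0.0
    # Precision: share of retrieved records that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0

    total = sensitivity + precision
    f_score = 2 * sensitivity * precision / total if total else 0.0
    return sensitivity, precision, f_score


# Toy example with made-up record IDs (not real PMIDs).
sens, prec, f1 = search_metrics(retrieved={1, 2, 3, 4}, relevant={2, 3, 4, 5, 6})
print(f"sensitivity={sens:.3f} precision={prec:.3f} F-score={f1:.3f}")
```

In this toy case three of the five relevant records are found and three of the four retrieved records are relevant, so sensitivity is 0.600 and precision is 0.750.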
RESULTS
The two most comprehensive search strings combining several free-text and indexed search terms performed best in terms of sensitivity (98.4%/98.7%) and F-score (96.5%/95.7%), but the single-term search COVID-19 performed best in terms of precision (95.3%) and well in terms of sensitivity (94.4%) and F-score (94.8%). The term Wuhan virus performed the worst: 7.7% for sensitivity, 78.1% for precision, and 14.0% for F-score. We found that deleting a hyphen or space character could omit a substantial number of records, especially when searching with SARS-CoV-2 as a single term.
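As a quick arithmetic check, the reported F-scores for the two single-term searches can be recomputed from the reported sensitivity and precision. The sketch below assumes the F-score is F1 (harmonic mean); the helper name `f1` is hypothetical, and only figures quoted in the results above are used.

```python
def f1(sensitivity, precision):
    """Harmonic mean of sensitivity and precision (assumed F-score definition)."""
    return 2 * sensitivity * precision / (sensitivity + precision)


# Values as reported in the abstract, expressed as fractions.
covid_19 = f1(sensitivity=0.944, precision=0.953)
wuhan_virus = f1(sensitivity=0.077, precision=0.781)

print(f"COVID-19:    {covid_19:.1%}")     # reported as 94.8%
print(f"Wuhan virus: {wuhan_virus:.1%}")  # reported as 14.0%
```

Both recomputed values agree with the reported F-scores to one decimal place, which is consistent with the F1 assumption. The Wuhan virus case also shows how the harmonic mean is pulled toward the weaker metric: a precision of 78.1% cannot compensate for a sensitivity of 7.7%.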
CONCLUSIONS
Comprehensive search strings combining free-text and indexed search terms performed better than single-term searches in PubMed, but not by a large margin compared to the single term COVID-19. For everyday searches, certain single-term searches that are entered correctly are probably sufficient, whereas more comprehensive searches should be used for systematic reviews. Still, we suggest additional measures that the US National Library of Medicine could take to support all PubMed users in searching the COVID-19 literature.
Identifiers
pubmed: 33197230
pii: v22i11e23449
doi: 10.2196/23449
pmc: PMC7695541
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e23449
Comments and corrections
Type: CommentIn
Copyright information
©Jeffrey V Lazarus, Adam Palayew, Lauge Neimann Rasmussen, Tue Helms Andersen, Joey Nicholson, Ole Norgaard. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 26.11.2020.