Code4ML: a large-scale dataset of annotated Machine Learning code.
Jupyter code snippets
ML code dataset
Journal
PeerJ. Computer science
ISSN: 2376-5992
Abbreviated title: PeerJ Comput Sci
Country: United States
NLM ID: 101660598
Publication information
Publication date:
2023
History:
received: 2022-10-05
accepted: 2023-01-09
medline: 2023-06-22
pubmed: 2023-06-22
entrez: 2023-06-22
Status:
epublish
Abstract
The use of program code as a data source is increasingly expanding among data scientists. The purpose of the usage varies from the semantic classification of code to the automatic generation of programs. However, the machine learning model application is somewhat limited without annotating the code snippets. To address the lack of annotated datasets, we present the Code4ML
Identifiers
pubmed: 37346615
doi: 10.7717/peerj-cs.1230
pii: cs-1230
pmc: PMC10280557
Publication types
Journal Article
Languages
eng
Pagination
e1230
Copyright information
© 2023 Drozdova et al.
Conflict of interest statement
The authors declare that they have no competing interests.