DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding.
Journal
Scientific data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192
Publication information
Publication date:
07 11 2023
History:
received: 31 01 2023
accepted: 17 10 2023
medline: 9 11 2023
pubmed: 8 11 2023
entrez: 7 11 2023
Status:
epublish
Abstract
Recent advances in computer vision (CV) and natural language processing have been driven by exploiting big data on practical applications. However, these research fields are still limited by the sheer volume, versatility, and diversity of the available datasets. CV tasks, such as image captioning, which has primarily been carried out on natural images, still struggle to produce accurate and meaningful captions on sketched images often included in scientific and technical documents. The advancement of other tasks such as 3D reconstruction from 2D images requires larger datasets with multiple viewpoints. We introduce DeepPatent2, a large-scale dataset, providing more than 2.7 million technical drawings with 132,890 object names and 22,394 viewpoints extracted from 14 years of US design patent documents. We demonstrate the usefulness of DeepPatent2 with conceptual captioning. We further provide the potential usefulness of our dataset to facilitate other research areas such as 3D image reconstruction and image retrieval.
Identifiers
pubmed: 37935698
doi: 10.1038/s41597-023-02653-7
pii: 10.1038/s41597-023-02653-7
pmc: PMC10630310
Publication types
Dataset
Journal Article
Languages
eng
Citation subsets
IM
Pagination
772
Grants
Agency: DOE | LDRD | Los Alamos National Laboratory (Los Alamos Lab)
ID: 20200041ER
Agency: DOE | LDRD | Los Alamos National Laboratory (Los Alamos Lab)
ID: BA601958
Copyright information
© 2023. The Author(s).