Knowledge mining of unstructured information: application to cyber domain.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
31 Jan 2023
History:
received: 16 Aug 2022
accepted: 24 Jan 2023
entrez: 31 Jan 2023
pubmed: 1 Feb 2023
medline: 1 Feb 2023
Status:
epublish
Abstract
Information on cyber-related crimes, incidents, and conflicts is abundantly available in numerous open online sources. However, processing large volumes and streams of data is a challenging task for the analysts and experts, and entails the need for newer methods and techniques. In this article we present and implement a novel knowledge graph and knowledge mining framework for extracting the relevant information from free-form text about incidents in the cyber domain. The computational framework includes a machine learning-based pipeline for generating graphs of organizations, countries, industries, products and attackers with a non-technical cyber-ontology. The extracted knowledge graph is utilized to estimate the incidence of cyberattacks within a given graph configuration. We use publicly available collections of real cyber-incident reports to test the efficacy of our methods. The knowledge extraction is found to be sufficiently accurate, and the graph-based threat estimation demonstrates a level of correlation with the actual records of attacks. In practical use, an analyst utilizing the presented framework can infer additional information from the current cyber-landscape in terms of the risk to various entities and its propagation between industries and countries.
Identifiers
pubmed: 36720897
doi: 10.1038/s41598-023-28796-6
pii: 10.1038/s41598-023-28796-6
pmc: PMC9889742
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1714
Copyright information
© 2023. The Author(s).
References
Sci Rep. 2017 Jul 20;7(1):5994
pubmed: 28729710