A comprehensive guide to CAN IDS data and introduction of the ROAD dataset.
Journal
PloS one
ISSN: 1932-6203
Titre abrégé: PLoS One
Pays: United States
ID NLM: 101285081
Informations de publication
Date de publication:
2024
2024
Historique:
received:
20
07
2023
accepted:
19
12
2023
medline:
22
1
2024
pubmed:
22
1
2024
entrez:
22
1
2024
Statut:
epublish
Résumé
Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions or anomalies on CANs. Producing vehicular CAN data with a variety of intrusions is a difficult task for most researchers as it requires expensive assets and deep expertise. To illuminate this task, we introduce the first comprehensive guide to the existing open CAN intrusion detection system (IDS) datasets. We categorize attacks on CANs including fabrication (adding frames, e.g., flooding or targeting and ID), suspension (removing an ID's frames), and masquerade attacks (spoofed frames sent in lieu of suspended ones). We provide a quality analysis of each dataset; an enumeration of each datasets' attacks, benefits, and drawbacks; categorization as real vs. simulated CAN data and real vs. simulated attacks; whether the data is raw CAN data or signal-translated; number of vehicles/CANs; quantity in terms of time; and finally a suggested use case of each dataset. State-of-the-art public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, lacking fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but is missing a corresponding "raw" binary version. This issue pigeon-holes CAN IDS research into testing on limited and often inappropriate data (usually with attacks that are too easily detectable to truly test the method). The scarcity of appropriate data has stymied comparability and reproducibility of results for researchers. As our primary contribution, we present the Real ORNL Automotive Dynamometer (ROAD) CAN IDS dataset, consisting of over 3.5 hours of one vehicle's CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real (i.e. non-simulated) fuzzing, fabrication, unique advanced attacks, and simulated masquerade attacks. To facilitate a benchmark for CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS research field.
Identifiants
pubmed: 38252659
doi: 10.1371/journal.pone.0296879
pii: PONE-D-23-22932
doi:
Types de publication
Journal Article
Langues
eng
Sous-ensembles de citation
IM
Pagination
e0296879Informations de copyright
Copyright: © 2024 Verma et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Déclaration de conflit d'intérêts
The authors have declared that no competing interests exist.