
Visualizing Realistic Benchmarked IDS Dataset: CIRA-CIC-DoHBrw-2020



An Intrusion Detection System (IDS) dataset is crucial for detecting the lateral movement of cyber-attacks, because it is used to train the IDS classifier model to achieve the earliest possible detection. A good, near-realistic public dataset is essential to the development of advanced IDS classifier models. However, the available public IDS datasets have long been under scrutiny. They are questioned for how well they reflect real low-footprint cyber threats, render real-time network scenarios, and reflect recent malware attacks over the newly developed DNS over HTTPS (DoH) protocol. They also disregard layer 3 information, and studies based on them publish contradictory classification and analysis results, which makes those results non-reproducible and hard to share. This problem can be addressed by visualizing a new realistic, real-time, low-footprint, and up-to-date benchmarked dataset. Visualization helps to detect data deformation before an optimized, highly accurate classifier model is designed. Therefore, this study reviews a new realistic benchmarked IDS dataset and applies sophisticated techniques to visualize it. The review starts by carefully examining production network features, which are then compared with various well-established public IDS datasets. Many of these datasets consist of static, unrealistic meta-features and disregard source and destination Internet Protocol (IP) information; the CIRA-CIC-DoHBrw-2020 dataset is the exception. The study then applies the Eigen Centrality (EC) technique from graph theory to visualize this layer 3 (L3) information. Finally, using techniques such as Principal Component Analysis (PCA) and the Gaussian Mixture Model (GMM), the study further analyzes and visualizes the data. Results show that CIRA-CIC-DoHBrw-2020 simulates recent malware attacks and is highly imbalanced, which reflects realistic low-footprint cyber-attacks. The centrality graph clearly visualizes the IPs compromised by a recent DoH attack in real time, and the study concludes that a smaller packet length, between 1000 and 2000 bytes, fits the attack trait.
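
The abstract does not say how the centrality scores were computed, so the following is only a minimal sketch of eigenvector centrality over source and destination IP pairs, not the authors' implementation. It assumes an undirected graph with one edge per distinct IP pair. The function name `eigenvector_centrality`, the sample `flows` list, and the IP addresses are all hypothetical. Iterating with the adjacency matrix plus the identity is a choice made here so that bipartite client/resolver graphs converge instead of oscillating.

```python
from collections import defaultdict
from math import sqrt


def eigenvector_centrality(flows, max_iter=1000, tol=1e-10):
    """Eigenvector centrality of an undirected IP graph via power iteration."""
    # Build an undirected adjacency map: one edge per distinct (src, dst) pair.
    neighbors = defaultdict(set)
    for src, dst in flows:
        if src != dst:
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    nodes = sorted(neighbors)
    score = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(max_iter):
        # Multiply by (A + I): each node keeps its own score and adds its
        # neighbors' scores. The shift avoids oscillation on bipartite graphs.
        nxt = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in nodes}
        # Rescale to unit Euclidean length so the values stay bounded.
        norm = sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}
        if sum(abs(nxt[n] - score[n]) for n in nodes) < tol:
            return nxt
        score = nxt
    return score


# Hypothetical flows (illustrative addresses, not taken from the dataset):
# three clients talk to one resolver, and one client also contacts another host.
flows = [
    ("10.0.0.1", "192.0.2.53"),
    ("10.0.0.2", "192.0.2.53"),
    ("10.0.0.3", "192.0.2.53"),
    ("10.0.0.3", "198.51.100.7"),
]
ranks = eigenvector_centrality(flows)
for ip, value in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ip:>15}  {value:.3f}")
```

In this toy graph the shared resolver receives the highest score, which is the kind of signal a centrality plot would surface for heavily contacted IPs. On real flow records, a graph library would likely be a better fit than this hand-rolled loop.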

Keywords: cira cic; ids dataset; cic dohbrw; dohbrw 2020

Journal Title: IEEE Access
Year Published: 2022
