NHIDS Dataset Requirements
In the cyber domain, situational awareness of critical assets is extremely important. Achieving comprehensive situational awareness requires accurate sensor information. An important class of sensors in the cyber domain is Intrusion Detection Systems (IDS), especially anomaly-based Intrusion Detection Systems that apply artificial intelligence or machine learning to anomaly detection. This millennium has seen industries transformed by developments in data-based modelling methods, the computational resources they require, and the availability of training data. Applying novel modelling methods has been straightforward in industries where the data is both readily representative and relatively easy to accumulate. Unfortunately, this is not the case for IDS modelling in the context of cyber security. The most crucial bottleneck is the absence of publicly available datasets that reflect modern equipment, system design standards, and the cyber threat landscape. The predominant dataset, KDD Cup 1999, is still actively used in IDS modelling research despite the criticism it has received. Other, more recent datasets tend to record either only the network traffic at the perimeter of the testbed environment or only the effects that malware has on a single host machine. Our study focuses on synthesizing the requirements for a holistic network and host intrusion detection system (NHIDS) dataset by reviewing existing and previously studied datasets within the field of IDS modelling. As a result, the requirements for a state-of-the-art NHIDS dataset are identified and can be utilized in the research and development of future NHIDS applying artificial intelligence.