| dc.description.abstract |
The increase in interconnectivity and advancement in network technologies have
influenced a parallel rise in Distributed Denial of Service (DDoS) attacks, and the
perpetrators have become sophisticated such that previously dependable tools and
techniques have become ineffective. The purpose of the study was to design an intrusion
detection model based on K-Means and CART algorithms, and train and test it using the
CICDDoS2019 dataset, which represents application-layer DDoS attacks. The
objectives of the study were to: determine the existing application-layer intrusion
detection techniques and models; explore the weaknesses of existing intrusion detection
models; classify the dataset using individual K-Means and CART algorithms; develop a
hybrid intrusion detection model for application-layer DDoS attacks by combining
K-Means and CART algorithms; and evaluate the performance of the hybrid model. The
study was designed as a quantitative experimental simulation. It adopted the empirical
positivist paradigm. Machine learning theory and network security theory formed the
theoretical framework. The analysis was performed in Python using the Scikit-Learn
library. The study utilised secondary data obtained from
the CICDDoS2019 dataset, containing 49.59 million records of 12 unlabelled DDoS
attack types including NTP, DNS, LDAP, MSSQL, NetBIOS, SNMP, SSDP, UDP,
UDP-Lag, WebDDoS, SYN, and TFTP. This research used simple random sampling to
select 30,000 records from each attack type, yielding a dataframe of 110,000 rows and 88
columns. The unsupervised (K-Means) component of the experiment required no training
and testing sets. For the supervised component using the CART algorithm, the dataset was split into
67% for training and 33% for testing. Individually, the K-Means algorithm achieved
homogeneity, completeness, and V-measure scores of 50.76%, 51.95%, and 51.35%
respectively. CART, in turn, was evaluated on accuracy, precision,
recall (sensitivity), and F1-score, and it achieved 74% on each. The hybrid
model was fundamentally a CART algorithm enhanced with K-Means cluster features
and was therefore scored on the same metrics as CART. It scored 78% on
accuracy, 79% on precision, 78% on recall, and 78.5% on F1-score. The dataset proved
to have high dimensionality and complexity with multiple overlapping clusters. K-Means
performed only moderately, which suggests it is poorly suited to this type of dataset. The
CART algorithm was relatively successful at identifying application-layer DDoS attacks.
The hybrid model achieved a better performance score than its constituent
models, as shown by the difference between the chosen metrics and their averages. This
study concludes that the hybrid intrusion detection model can outperform the existing
K-Means and CART algorithms in terms of accuracy, precision, recall, and F1-score. The
study recommends that future studies investigate a similar setup using density-based
clustering algorithms such as DBSCAN in place of K-Means. |
en_US |
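
The reported scores can be sanity-checked with a few lines of Python. This is a minimal sketch, not the study's own code: it assumes that the V-measure uses its usual equal weighting (the harmonic mean of homogeneity and completeness) and that the hybrid model's F1-score is the harmonic mean of the reported precision and recall. With per-class averaging over the attack types, the F1 relationship need not hold exactly. The helper name harmonic_mean is hypothetical.

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 if both are zero."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


# Scores as reported in the abstract, expressed as fractions.
kmeans_homogeneity, kmeans_completeness = 0.5076, 0.5195
hybrid_precision, hybrid_recall = 0.79, 0.78

# Assumption: equal-weight V-measure, i.e. the harmonic mean of
# homogeneity and completeness.
v_measure = harmonic_mean(kmeans_homogeneity, kmeans_completeness)

# Assumption: F1 computed directly from the aggregate precision and recall.
f1 = harmonic_mean(hybrid_precision, hybrid_recall)

print(f"V-measure: {v_measure:.2%}")  # V-measure: 51.35%
print(f"F1-score:  {f1:.2%}")         # F1-score:  78.50%
```

Under these assumptions both derived values agree with the figures stated in the abstract (51.35% and 78.5%).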