Comparative analysis of machine learning algorithms for DDoS attack detection in IoT networks
Keywords:
DDoS, machine learning, XGBoost, random forest, attack detection
Abstract
Distributed Denial of Service (DDoS) attacks are a serious threat to network security: they can take services offline and cause substantial losses. This study analyzes and compares the performance of six machine learning algorithms for detecting DDoS attacks: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and K-Nearest Neighbors (KNN). The dataset combines CICDDoS2019 for attack traffic with CICIDS2017 for normal traffic, giving a total of 1,000,000 balanced samples (50% attack and 50% normal) with 78 network features. Preprocessing consisted of data cleaning, handling of missing values, and normalization with StandardScaler. The data were split 80:20 into training and testing sets using stratified sampling. In the experiments, XGBoost achieved the best performance, with an F1-Score of 99.99%, an AUC of 0.999999, and the fastest training time (5.34 seconds), followed by Random Forest with an F1-Score of 99.99% and an AUC of 0.999984. Validation with 5-fold cross-validation confirmed that the models are stable, with no sign of overfitting. These findings indicate that boosting-based ensemble algorithms, XGBoost in particular, are the most effective and efficient approach for DDoS attack detection systems.
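The evaluation protocol described in the abstract (scaling with StandardScaler, an 80:20 stratified split, 5-fold cross-validation, F1-Score, AUC, and training time) can be sketched roughly as follows. This is a minimal illustration and not the authors' code. It runs on synthetic data from `make_classification` rather than CICDDoS2019/CICIDS2017, and the median imputation, the hyperparameters, the random seed, and the helper names `build_models` and `compare_models` are all assumptions. To keep the sketch dependent on scikit-learn only, XGBoost is left out; `xgboost.XGBClassifier` could be added to the model dictionary in the same way if that package is installed.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=42):
    """Five of the six compared algorithms (hyperparameters are assumptions)."""
    return {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=seed),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }


def compare_models(X, y, seed=42):
    """Evaluate each model with an 80:20 stratified split and 5-fold CV."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    results = {}
    for name, clf in build_models(seed).items():
        # Imputer and scaler live inside the pipeline, so they are fitted on
        # training folds only and test data never leaks into preprocessing.
        # Median imputation is an assumed strategy, not one named in the abstract.
        pipe = make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), clf)
        cv_f1 = cross_val_score(pipe, X_train, y_train, cv=cv, scoring="f1")

        start = time.perf_counter()
        pipe.fit(X_train, y_train)
        train_seconds = time.perf_counter() - start

        attack_proba = pipe.predict_proba(X_test)[:, 1]
        results[name] = {
            "f1": f1_score(y_test, pipe.predict(X_test)),
            "auc": roc_auc_score(y_test, attack_proba),
            "cv_f1_mean": cv_f1.mean(),
            "cv_f1_std": cv_f1.std(),
            "train_seconds": train_seconds,
        }
    return results


# Synthetic, balanced stand-in for the real traffic data (an assumption).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=42
)
results = compare_models(X, y)
for name, m in sorted(results.items(), key=lambda kv: kv[1]["f1"], reverse=True):
    print(
        f"{name:20s} F1={m['f1']:.4f} AUC={m['auc']:.4f} "
        f"CV-F1={m['cv_f1_mean']:.4f}±{m['cv_f1_std']:.4f} "
        f"train={m['train_seconds']:.2f}s"
    )
```

A small gap between the held-out F1 and the cross-validated mean, together with a low standard deviation across folds, is the kind of evidence typically read as stability. Scores on the synthetic data say nothing about the figures reported in the abstract.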