Network Traffic Anomaly Detection Method Based on Deep Features Learning
doi: 10.11999/JEIT190266
1. PLA SSF Information Engineering University, Zhengzhou 450001, China
2. Henan Key Laboratory of Information Security, Zhengzhou 450001, China
Abstract: In network traffic anomaly detection, extracted traffic features with poor accuracy and robustness lead to low attack detection rates and high false positive rates. To address this, a network traffic anomaly detection method based on deep feature learning is proposed, combining Stacked Denoising Autoencoders (SDA) with softmax. First, a two-stage algorithm based on particle swarm optimization is designed to optimize the SDA structure: the number of hidden layers and then the number of nodes in each layer are optimized in turn according to the traffic detection accuracy, which determines the optimal SDA structure within the search space and improves the accuracy of the features the SDA extracts. Second, the optimized SDA is trained with the mini-batch gradient descent algorithm; by minimizing the difference between the reconstruction of the corrupted data and the original input vector, it extracts traffic features with strong robustness. Finally, softmax is trained on the extracted traffic features to build an anomaly detection classifier for high-performance detection of traffic attacks. The experimental results show that the proposed method can adjust the SDA structure dynamically according to the experimental data and the classification task, that the extracted traffic features are more accurate and more robust, and that traffic attacks are detected with a high detection rate and a low false positive rate.
Key words: Traffic anomaly detection / Deep learning / Stacked Denoising Autoencoders (SDA) / Particle swarm optimization
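The denoising step described in the abstract (reconstruct the clean input from a corrupted copy, trained by mini-batch gradient descent) can be sketched for a single autoencoder layer as follows. This is a minimal illustration, not the authors' implementation: the function name `train_denoising_layer`, the masking-noise corruption, the sigmoid activations, the squared-error loss, and all hyperparameter values are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_denoising_layer(X, n_hidden, noise_ratio=0.1, lr=0.5,
                          epochs=50, batch_size=32, seed=0):
    """Train one denoising autoencoder layer on X (values in [0, 1]).

    Returns the parameters and the clean-input reconstruction error
    recorded after each epoch. All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
    b2 = np.zeros(n_in)
    history = []
    for _ in range(epochs):
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            x = X[order[start:start + batch_size]]
            # Masking noise (assumed): zero a random fraction of the features.
            x_noisy = x * (rng.random(x.shape) >= noise_ratio)
            h = sigmoid(x_noisy @ W1 + b1)      # encode the corrupted input
            x_hat = sigmoid(h @ W2 + b2)        # decode
            # Gradient of 1/(2B) * sum ||x_hat - x||^2, measured against
            # the ORIGINAL input x rather than the corrupted copy.
            d_out = (x_hat - x) * x_hat * (1.0 - x_hat) / len(x)
            d_hid = (d_out @ W2.T) * h * (1.0 - h)
            W2 -= lr * (h.T @ d_out)
            b2 -= lr * d_out.sum(axis=0)
            W1 -= lr * (x_noisy.T @ d_hid)
            b1 -= lr * d_hid.sum(axis=0)
        recon = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
        history.append(float(np.mean((recon - X) ** 2)))
    return (W1, b1, W2, b2), history
```

In a stacked arrangement, the hidden activations of one trained layer would serve as the input to the next, and the top-level features would be passed to the softmax classifier; that stacking is omitted here to keep the sketch short.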
Table 1 Optimization algorithm for the number of hidden layers
Input: traffic anomaly detection dataset, NP, $t_{\max}$, $w$, $c_1$, $c_2$, $l_{\max}$, $l_{\min}$, $v_{l,\max}$, $v_{l,\min}$, $n_{\max}$, $n_{\min}$, $v_{n,\max}$, $v_{n,\min}$
Output: an SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer
for $i = 1$ to NP do
    Initialize the particle swarm using Eqs. (5)-(8), and initialize $l_{i,{\rm pbest}}$ and $n_{i,{\rm pbest}}$ to $l_i(0)$ and $n_i(0)$, respectively;
    Compute the fitness value of particle $i$ on the experimental data using Eq. (9);
Set the $l$ and $n$ with the minimum fitness value as the initial values of $l_{\rm gbest}$ and $n_{\rm gbest}$;
for $t = 1$ to $t_{\max}$ do
    for $i = 1$ to NP do
        Update the velocity and value of $l_i(t)$ and the velocity and value of $n_i(t)$ for particle $i$ using Eqs. (1)-(4);
        if $v_{l_i}(t)$, $l_i(t)$, $v_{n_i}(t)$ or $n_i(t)$ exceeds its search range
            Randomly re-initialize $v_{l_i}(t)$, $l_i(t)$, $v_{n_i}(t)$ or $n_i(t)$;
        Generate an SDA with $l_i(t)$ hidden layers and $n_i(t)$ nodes in each layer;
        Compute the fitness value of particle $i$ on the experimental data using Eq. (9);
        if ${\rm fit}(l_i(t), n_i(t)) < {\rm fit}(l_{i,{\rm pbest}}, n_{i,{\rm pbest}})$ // the fitness of particle $i$ is smaller than that of its local best, so update the local best
            Assign $l_i(t)$ and $n_i(t)$ to $l_{i,{\rm pbest}}$ and $n_{i,{\rm pbest}}$, respectively;
        if ${\rm fit}(l_i(t), n_i(t)) < {\rm fit}(l_{\rm gbest}, n_{\rm gbest})$ // the fitness of particle $i$ is smaller than that of the global best, so update the global best
            Assign $l_i(t)$ and $n_i(t)$ to $l_{\rm gbest}$ and $n_{\rm gbest}$, respectively;
After the iterations finish, generate an SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer;
return the SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer.
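The search loop of Table 1 can be sketched in Python as below. Eqs. (1)-(9) are not reproduced in this section, so the sketch assumes the standard inertia-weight particle swarm update and takes the fitness as a caller-supplied function to minimize; in the paper's setting that function would build and evaluate an SDA, which is not attempted here. The function name `pso_layers_and_nodes`, the velocity bound, and the default parameter values are hypothetical.

```python
import random

def pso_layers_and_nodes(fitness, l_range, n_range, num_particles=10,
                         t_max=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search (l, n): hidden-layer count and a shared per-layer node count.

    `fitness(l, n)` is minimized. Dimension 0 is l, dimension 1 is n.
    """
    rng = random.Random(seed)
    lo = (l_range[0], n_range[0])
    hi = (l_range[1], n_range[1])
    # Assumed velocity bound: half the width of each search range.
    v_max = tuple((h - l_) / 2 for l_, h in zip(lo, hi))

    def rand_pos(d):
        return rng.randint(lo[d], hi[d])

    def rand_vel(d):
        return rng.uniform(-v_max[d], v_max[d])

    pos = [[rand_pos(0), rand_pos(1)] for _ in range(num_particles)]
    vel = [[rand_vel(0), rand_vel(1)] for _ in range(num_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(*p) for p in pos]
    g = min(range(num_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(t_max):
        for i in range(num_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Out-of-range values are re-initialized at random,
                # as in Table 1, rather than clipped.
                if abs(vel[i][d]) > v_max[d]:
                    vel[i][d] = rand_vel(d)
                pos[i][d] = round(pos[i][d] + vel[i][d])
                if not lo[d] <= pos[i][d] <= hi[d]:
                    pos[i][d] = rand_pos(d)
            fit = fitness(*pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
            if fit < gbest_fit:
                gbest, gbest_fit = pos[i][:], fit
    return tuple(gbest), gbest_fit
```

Positions are rounded to integers because layer and node counts are discrete; how the original method handles this rounding is not stated in the table, so that choice is also an assumption.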
Table 2 Optimization algorithm for the number of nodes in each hidden layer
Input: traffic anomaly detection dataset, NP, $t_{\max}$, $w$, $c_1$, $c_2$, $v_{\max}$, $v_{\min}$, $l_{\rm gbest}$, $n_{\rm gbest}$
Output: the optimal SDA structure
for $i = 1$ to NP do
    for $h = 1$ to $l_{\rm gbest}$ do
        Initialize the particle position $n_i^{(h)}(0) = n_{\rm gbest}$, initialize the particle velocity using Eq. (12), and initialize the element $n_{i,{\rm pbest}}^{(h)}$ of the local best vector $\boldsymbol{n}_{i,{\rm pbest}}$ to $n_{\rm gbest}$;
Set the global best vector $\boldsymbol{n}_{\rm gbest} = \min\{\boldsymbol{n}_{1,{\rm pbest}}, \boldsymbol{n}_{2,{\rm pbest}}, \cdots, \boldsymbol{n}_{{\rm NP},{\rm pbest}}\} = [n_{\rm gbest}\ n_{\rm gbest}\ \cdots\ n_{\rm gbest}]^{\rm T}$;
for $t = 1$ to $t_{\max}$ do
    for $i = 1$ to NP do
        for $h = 1$ to $l_{\rm gbest}$ do
            Update the velocity and value of element $n_i^{(h)}(t)$ of the position vector $\boldsymbol{n}_i(t)$ of particle $i$ using Eqs. (10) and (11);
            if $v_i^{(h)}(t)$ or $n_i^{(h)}(t)$ exceeds its search range
                Randomly re-initialize $v_i^{(h)}(t)$ or $n_i^{(h)}(t)$;
        According to the updated $\boldsymbol{n}_i(t)$, set the number of nodes in each hidden layer of the SDA to $n_i^{(1)}(t), n_i^{(2)}(t), \cdots, n_i^{(l_{\rm gbest})}(t)$;
        Compute the fitness value of particle $i$ on the experimental data using Eq. (13);
        if ${\rm fit}(\boldsymbol{n}_i(t)) < {\rm fit}(\boldsymbol{n}_{i,{\rm pbest}})$ // the fitness of particle $i$ is smaller than that of its local best vector, so update the local best vector
            $\boldsymbol{n}_{i,{\rm pbest}} \leftarrow \boldsymbol{n}_i(t)$;
    $\boldsymbol{n}_{\rm gbest} \leftarrow \min\{\boldsymbol{n}_{1,{\rm pbest}}, \boldsymbol{n}_{2,{\rm pbest}}, \cdots, \boldsymbol{n}_{{\rm NP},{\rm pbest}}\}$; // update the global best vector with the minimum among the local best vectors
After the iterations finish, set the number of nodes in each hidden layer of the SDA to $n_{\rm gbest}^{(1)}, n_{\rm gbest}^{(2)}, \cdots, n_{\rm gbest}^{(l_{\rm gbest})}$ according to the final $\boldsymbol{n}_{\rm gbest}$;
return the optimal SDA structure.
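The second stage in Table 2 differs from the first mainly in that each particle is a vector with one node count per hidden layer, every particle starts from the stage-one result, and the global best is refreshed once per iteration from the local bests. The sketch below illustrates that structure under stated assumptions: Eqs. (10)-(13) are not shown in this section, so the standard inertia-weight update is used, the fitness is a caller-supplied function to minimize, and the name `pso_per_layer_nodes` and all default values are hypothetical.

```python
import numpy as np

def pso_per_layer_nodes(fitness, n_layers, n_start, n_min, n_max,
                        v_min=-4.0, v_max=4.0, num_particles=10,
                        t_max=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Refine per-layer node counts, starting every layer at n_start."""
    rng = np.random.default_rng(seed)
    shape = (num_particles, n_layers)
    pos = np.full(shape, n_start, dtype=int)   # all particles start at n_gbest
    vel = rng.uniform(v_min, v_max, shape)
    pbest = pos.copy()
    # All starting positions are identical, so one evaluation suffices.
    pbest_fit = np.full(num_particles, fitness(pos[0]), dtype=float)
    gbest, gbest_fit = pos[0].copy(), pbest_fit[0]

    for _ in range(t_max):
        for i in range(num_particles):
            r1, r2 = rng.random(n_layers), rng.random(n_layers)
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            # Re-initialize only the elements that left their range.
            bad_v = (vel[i] < v_min) | (vel[i] > v_max)
            vel[i, bad_v] = rng.uniform(v_min, v_max, bad_v.sum())
            new_pos = np.rint(pos[i] + vel[i]).astype(int)
            bad_n = (new_pos < n_min) | (new_pos > n_max)
            new_pos[bad_n] = rng.integers(n_min, n_max + 1, bad_n.sum())
            pos[i] = new_pos
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), fit
        # Global best is taken from the local bests once per iteration.
        best = int(np.argmin(pbest_fit))
        gbest, gbest_fit = pbest[best].copy(), pbest_fit[best]
    return gbest.tolist(), float(gbest_fit)
```

Because the local bests never get worse and the global best is always the best of them, the returned fitness can be no higher than the fitness of the stage-one starting point.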
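Table 3 Detection performance of different models in the binary classification scenario
| Metric | SAE-based anomaly detection model | Traditional SDA-based anomaly detection model | One-stage optimized SDA-based anomaly detection model | Two-stage optimized SDA-based anomaly detection model |
| Model structure | [28, 3, 2, 2, 2, 1, 3, 3, 3, 2] | [28, 28, 28, 28, 2] | [28, 2, 2, 2, 2, 2, 2, 2, 2, 2] | [28, 3, 2, 2, 2, 1, 3, 3, 3, 2] |
| Acc (%) | 86.29 | 86.52 | 86.58 | 92.68 |
| DR (%) | 92.85 | 96.10 | 94.75 | 96.80 |
| Rec (%) | 90.04 | 92.68 | 89.26 | 94.48 |
| FPR (%) | 4.96 | 3.38 | 3.51 | 2.72 |
| $T_{\rm tr}$ (min) | 8.24 | 8.52 | 7.45 | 8.50 |
| $T_{\rm te}$ (s) | 0.18 | 0.18 | 0.18 | 0.18 |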
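Table 4 Detection performance of different models in the multi-class classification scenario
| Metric | SAE-based anomaly detection model | Traditional SDA-based anomaly detection model | One-stage optimized SDA-based anomaly detection model | Two-stage optimized SDA-based anomaly detection model |
| Model structure | [28, 24, 5] | [28, 28, 28, 28, 5] | [28, 25, 5] | [28, 24, 5] |
| Acc (%) | 84.12 | 84.31 | 84.96 | 85.37 |
| Normal DR (%) | 84.58 | 85.37 | 85.87 | 86.34 |
| Normal Rec (%) | 96.74 | 96.88 | 97.01 | 97.28 |
| Normal FPR (%) | 17.98 | 18.89 | 18.06 | 17.25 |
| DoS DR (%) | 94.08 | 94.74 | 94.92 | 95.59 |
| DoS Rec (%) | 83.65 | 84.51 | 82.63 | 85.88 |
| DoS FPR (%) | 2.05 | 2.04 | 2.02 | 1.72 |
| Probe DR (%) | 79.42 | 75.58 | 79.71 | 83.27 |
| Probe Rec (%) | 65.14 | 67.29 | 63.78 | 68.28 |
| Probe FPR (%) | 1.78 | 2.21 | 1.70 | 1.34 |
| R2L DR (%) | 90.96 | 92.06 | 83.78 | 90.50 |
| R2L Rec (%) | 58.23 | 60.99 | 58.34 | 60.23 |
| R2L FPR (%) | 0.27 | 0.21 | 0.57 | 0.30 |
| U2R DR (%) | 88.05 | 28.60 | 72.58 | 76.19 |
| U2R Rec (%) | 2.50 | 2.00 | 4.50 | 3.00 |
| U2R FPR (%) | 0.01 | 0.03 | 0.01 | 0.01 |
| $T_{\rm tr}$ (min) | 3.94 | 6.32 | 6.54 | 5.36 |
| $T_{\rm te}$ (s) | 0.20 | 0.40 | 0.41 | 0.26 |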
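Per-class figures like those in the multi-class table are typically obtained by treating each class as "positive" in turn against all the others. The sketch below shows that one-vs-rest computation from a confusion matrix. The metric definitions are not given in this section, so the standard formulas for accuracy, recall (Rec) and false positive rate (FPR) are assumed; DR is left out because its definition cannot be confirmed from the text here. The function name `per_class_metrics` is hypothetical.

```python
import numpy as np

def per_class_metrics(conf):
    """One-vs-rest metrics from a confusion matrix.

    conf[i, j] is the number of samples whose true class is i and whose
    predicted class is j. Assumes every class has at least one true
    sample and at least one true negative (no zero denominators).
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    tp = np.diag(conf)               # correctly predicted, per class
    fn = conf.sum(axis=1) - tp       # true class c, predicted otherwise
    fp = conf.sum(axis=0) - tp       # predicted c, true class otherwise
    tn = total - tp - fn - fp
    return {
        "accuracy": tp.sum() / total,    # overall, not per class
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }
```

For a two-class matrix the same function reproduces the binary case, with one entry per class in the `recall` and `fpr` arrays.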
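Table 5 Accuracy of different models in detecting noisy traffic in the multi-class classification scenario
| Model type | Acc (%), 0.1 | Acc (%), 0.2 | Acc (%), 0.3 |
| SAE-based anomaly detection model | 81.57 | 79.31 | 76.69 |
| Traditional SDA-based anomaly detection model | 83.63 | 83.54 | 83.48 |
| One-stage optimized SDA-based anomaly detection model | 84.71 | 84.52 | 84.23 |
| Two-stage optimized SDA-based anomaly detection model | 85.08 | 85.01 | 85.02 |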