Network Traffic Anomaly Detection Method Based on Deep Features Learning

Shuqin DONG, Bin ZHANG

1. PLA SSF Information Engineering University, Zhengzhou 450001, China
2. Henan Key Laboratory of Information Security, Zhengzhou 450001, China

doi: 10.11999/JEIT190266 cstr: 32379.14.JEIT190266

Funds: The Foundation and Frontier Technology Research Project of Henan Province (142300413201), The New Research Direction Cultivation Fund of Information Engineering University (2016604703), The Research Project of Information Engineering University (2019f3303)

Author biographies:
Shuqin DONG: male, born in 1990, Ph.D. candidate; research interest: network security situation awareness
Bin ZHANG: male, born in 1969, professor and doctoral supervisor; research interest: cyberspace security

Corresponding author: Shuqin DONG, dongshuqin377@126.com

CLC number: TP393.08

Publication history
- Received: 2019-04-18
- Revised: 2019-10-09
- Published online: 2019-10-16
- Issue date: 2020-03-19

Abstract:
To address the low attack detection rate and high false positive rate caused by inaccurate and non-robust traffic features in network traffic anomaly detection, this paper proposes a detection method based on deep feature learning that combines Stacked Denoising Autoencoders (SDA) with softmax. First, a two-stage algorithm based on Particle Swarm Optimization (PSO) is designed to optimize the SDA structure: the number of hidden layers and then the number of nodes in each layer are optimized in turn according to the traffic detection accuracy, which determines the optimal SDA structure in the search space and improves the accuracy of the features the SDA extracts. Second, the optimized SDA is trained with mini-batch gradient descent; by minimizing the difference between the reconstruction of the corrupted data and the original input vector, it extracts traffic features with strong robustness. Finally, softmax is trained on the extracted traffic features to build an anomaly detection classifier that detects traffic attacks with high performance. The experimental results show that the proposed method can adjust the SDA structure dynamically according to the experimental data and the classification task, that the extracted traffic features are more accurate and more robust, and that traffic attacks are detected with a high detection rate and a low false positive rate.

Keywords:
- Traffic anomaly detection /
- Deep learning /
- Stacked Denoising Autoencoders (SDA) /
- Particle Swarm Optimization (PSO)
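
The denoising objective summarized in the abstract (reconstruct the original input from a corrupted copy, trained by mini-batch gradient descent) can be illustrated with a minimal single-layer sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the masking-noise corruption, the sigmoid activations, the squared-error loss, and the hypothetical function name `train_dae_layer` with its default hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae_layer(X, n_hidden, noise=0.2, lr=0.5, epochs=50, batch=32, seed=0):
    """Train one denoising autoencoder layer (illustrative sketch).

    Returns the encoder weights W1, encoder bias b1, and the mean
    reconstruction loss recorded for each epoch.
    """
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
    b2 = np.zeros(n_in)
    losses = []
    for _ in range(epochs):
        order = rng.permutation(len(X))
        epoch_loss = 0.0
        for start in range(0, len(X), batch):
            x = X[order[start:start + batch]]
            # Assumed corruption: masking noise zeroes a random fraction of inputs.
            x_noisy = x * (rng.random(x.shape) >= noise)
            h = sigmoid(x_noisy @ W1 + b1)        # encode the corrupted input
            x_hat = sigmoid(h @ W2 + b2)          # decode
            err = x_hat - x                       # compare with the CLEAN input
            epoch_loss += 0.5 * np.sum(err ** 2)
            # Backpropagation for squared error with sigmoid units.
            d_out = err * x_hat * (1.0 - x_hat)
            d_hid = (d_out @ W2.T) * h * (1.0 - h)
            m = len(x)
            W2 -= lr * (h.T @ d_out) / m
            b2 -= lr * d_out.sum(axis=0) / m
            W1 -= lr * (x_noisy.T @ d_hid) / m
            b1 -= lr * d_hid.sum(axis=0) / m
        losses.append(epoch_loss / len(X))
    return W1, b1, losses
```

In a stacked arrangement, the hidden activations of one trained layer would serve as the input to the next layer, and the final features would feed a softmax classifier; that stacking step is not shown here.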

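Figure 1 Traffic anomaly detection model based on two-stage optimized SDA

Figure 2 Flow of the PSO-based two-stage optimization algorithm for the SDA structure

Figure 3 SDA structure optimization process in the binary classification scenario

Figure 4 SDA structure optimization process in the multi-class classification scenario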

Table 1 Optimization algorithm for the number of hidden layers

Input: traffic anomaly detection dataset, $\mathrm{NP}$, $t_{\max}$, $w$, $c_1$, $c_2$, $l_{\max}$, $l_{\min}$, $v_{l,\max}$, $v_{l,\min}$, $n_{\max}$, $n_{\min}$, $v_{n,\max}$, $v_{n,\min}$
Output: an SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer
for $i = 1$ to $\mathrm{NP}$ do
    Initialize the particle swarm using Eqs. (5) to (8), and initialize $l_{i,\rm pbest}$ and $n_{i,\rm pbest}$ to $l_i(0)$ and $n_i(0)$, respectively;
    Compute the fitness of particle $i$ on the experimental data using Eq. (9);
Set the $l$ and $n$ with the smallest fitness as the initial values of $l_{\rm gbest}$ and $n_{\rm gbest}$;
for $t = 1$ to $t_{\max}$ do
    for $i = 1$ to $\mathrm{NP}$ do
        Update the velocity and value of $l_i(t)$ and the velocity and value of $n_i(t)$ for particle $i$ using Eqs. (1) to (4);
        if $v_{l_i}(t)$, $l_i(t)$, $v_{n_i}(t)$ or $n_i(t)$ exceeds its search range
            Randomly re-initialize $v_{l_i}(t)$, $l_i(t)$, $v_{n_i}(t)$ or $n_i(t)$;
        Generate an SDA with $l_i(t)$ hidden layers and $n_i(t)$ nodes in each layer;
        Compute the fitness of particle $i$ on the experimental data using Eq. (9);
        if (${\rm fit}(l_i(t), n_i(t)) < {\rm fit}(l_{i,\rm pbest}, n_{i,\rm pbest})$) // if the fitness of particle $i$ is smaller than that of its local best, update the local best
            Assign $l_i(t)$ and $n_i(t)$ to $l_{i,\rm pbest}$ and $n_{i,\rm pbest}$, respectively;
        if (${\rm fit}(l_i(t), n_i(t)) < {\rm fit}(l_{\rm gbest}, n_{\rm gbest})$) // if the fitness of particle $i$ is smaller than that of the global best, update the global best
            Assign $l_i(t)$ and $n_i(t)$ to $l_{\rm gbest}$ and $n_{\rm gbest}$, respectively;
After the iterations end, generate an SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer;
return the SDA with $l_{\rm gbest}$ hidden layers and $n_{\rm gbest}$ nodes in each layer.
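
The first-stage search in Table 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Eqs. (1) to (9) are not reproduced in this section, so the sketch substitutes the standard PSO velocity and position update, rounds positions to integers, and takes a hypothetical `fitness(l, n)` callable in which smaller values are better, matching the comparisons in Table 1. In practice `fitness` would train an SDA with `l` hidden layers of `n` nodes and score it on the data. The function name, argument names and default values are all hypothetical.

```python
import random

def pso_stage_one(fitness, l_rng, n_rng, vl_rng, vn_rng,
                  NP=10, t_max=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search integer (l, n) that minimizes fitness(l, n).

    l_rng, n_rng: (min, max) search ranges for layer count and nodes per layer.
    vl_rng, vn_rng: (min, max) velocity ranges for the two dimensions.
    Returns (l_gbest, n_gbest, best_fitness).
    """
    rnd = random.Random(seed)
    bounds = [l_rng, n_rng]
    vbounds = [vl_rng, vn_rng]
    # Random initialization of positions and velocities (assumed form).
    pos = [[rnd.randint(*b) for b in bounds] for _ in range(NP)]
    vel = [[rnd.uniform(*vb) for vb in vbounds] for _ in range(NP)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(*p) for p in pos]
    g = min(range(NP), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(t_max):
        for i in range(NP):
            for d in range(2):
                r1, r2 = rnd.random(), rnd.random()
                # Standard PSO update, used here in place of Eqs. (1) to (4).
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Out-of-range values are re-initialized at random, as in Table 1.
                if not vbounds[d][0] <= v <= vbounds[d][1]:
                    v = rnd.uniform(*vbounds[d])
                x = round(pos[i][d] + v)
                if not bounds[d][0] <= x <= bounds[d][1]:
                    x = rnd.randint(*bounds[d])
                vel[i][d], pos[i][d] = v, x
            f = fitness(*pos[i])
            if f < pfit[i]:          # update the local best
                pbest[i], pfit[i] = pos[i][:], f
            if f < gfit:             # update the global best
                gbest, gfit = pos[i][:], f
    return gbest[0], gbest[1], gfit
```

Because the global best is only replaced by a strictly smaller fitness, the returned fitness can never be worse than the best particle of the initial swarm.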

Table 2 Optimization algorithm for the number of nodes in each hidden layer

Input: traffic anomaly detection dataset, $\mathrm{NP}$, $t_{\max}$, $w$, $c_1$, $c_2$, $v_{\max}$, $v_{\min}$, $l_{\rm gbest}$, $n_{\rm gbest}$
Output: the optimal SDA structure
for $i = 1$ to $\mathrm{NP}$ do
    for $h = 1$ to $l_{\rm gbest}$ do
        Initialize the particle position $n_i^{(h)}(0) = n_{\rm gbest}$, initialize the particle velocity using Eq. (12), and initialize $n_{i,\rm pbest}^{(h)}$ in the local best vector $\boldsymbol{n}_{i,\rm pbest}$ to $n_{\rm gbest}$;
Set the global best vector $\boldsymbol{n}_{\rm gbest} = \min\{\boldsymbol{n}_{1,\rm pbest}, \boldsymbol{n}_{2,\rm pbest}, \cdots, \boldsymbol{n}_{\rm NP,pbest}\} = [n_{\rm gbest}\ n_{\rm gbest}\ \cdots\ n_{\rm gbest}]^{\rm T}$;
for $t = 1$ to $t_{\max}$ do
    for $i = 1$ to $\mathrm{NP}$ do
        for $h = 1$ to $l_{\rm gbest}$ do
            Update the velocity and value of element $n_i^{(h)}(t)$ of the position vector $\boldsymbol{n}_i(t)$ of particle $i$ using Eqs. (10) and (11);
            if $v_i^{(h)}(t)$ or $n_i^{(h)}(t)$ exceeds its search range
                Randomly re-initialize $v_i^{(h)}(t)$ or $n_i^{(h)}(t)$;
        According to the updated $\boldsymbol{n}_i(t)$, set the number of nodes in each hidden layer of the SDA to $n_i^{(1)}(t), n_i^{(2)}(t), \cdots, n_i^{(l_{\rm gbest})}(t)$, respectively;
        Compute the fitness of particle $i$ on the experimental data using Eq. (13);
        if (${\rm fit}(\boldsymbol{n}_i(t)) < {\rm fit}(\boldsymbol{n}_{i,\rm pbest})$) // if the fitness of particle $i$ is smaller than that of its local best vector, update the local best vector
            $\boldsymbol{n}_{i,\rm pbest} \leftarrow \boldsymbol{n}_i(t)$;
        $\boldsymbol{n}_{\rm gbest} \leftarrow \min\{\boldsymbol{n}_{1,\rm pbest}, \boldsymbol{n}_{2,\rm pbest}, \cdots, \boldsymbol{n}_{\rm NP,pbest}\}$; // update the global best vector with the minimum of the local best vectors
After the iterations end, set the number of nodes in each hidden layer of the SDA to $n_{\rm gbest}^{(1)}, n_{\rm gbest}^{(2)}, \cdots, n_{\rm gbest}^{(l_{\rm gbest})}$ according to the final $\boldsymbol{n}_{\rm gbest}$;
return the optimal SDA structure.
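
The second-stage search in Table 2 can be sketched in the same spirit. The assumptions are again loud ones: Eqs. (10) to (13) are not reproduced in this section, so the standard PSO update is used in their place, initial velocities are drawn uniformly from the velocity range, and the node-count search range `n_rng` is a hypothetical parameter that Table 2 does not list among its inputs. `fitness` is a hypothetical callable that takes the list of per-layer node counts and returns a value where smaller is better.

```python
import random

def pso_stage_two(fitness, l_gbest, n_gbest, n_rng, v_rng,
                  NP=10, t_max=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Refine the node count of each of the l_gbest hidden layers.

    Every particle starts at [n_gbest] * l_gbest, as in Table 2.
    Returns (best_node_counts, best_fitness).
    """
    rnd = random.Random(seed)
    pos = [[n_gbest] * l_gbest for _ in range(NP)]
    vel = [[rnd.uniform(*v_rng) for _ in range(l_gbest)] for _ in range(NP)]
    pbest = [p[:] for p in pos]
    f_start = fitness(pos[0])            # all particles share the start point
    pfit = [f_start] * NP
    gbest, gfit = pos[0][:], f_start
    for _ in range(t_max):
        for i in range(NP):
            for h in range(l_gbest):
                r1, r2 = rnd.random(), rnd.random()
                # Standard PSO update, used here in place of Eqs. (10) and (11).
                v = (w * vel[i][h]
                     + c1 * r1 * (pbest[i][h] - pos[i][h])
                     + c2 * r2 * (gbest[h] - pos[i][h]))
                # Out-of-range values are re-initialized at random, as in Table 2.
                if not v_rng[0] <= v <= v_rng[1]:
                    v = rnd.uniform(*v_rng)
                n = round(pos[i][h] + v)
                if not n_rng[0] <= n <= n_rng[1]:
                    n = rnd.randint(*n_rng)
                vel[i][h], pos[i][h] = v, n
            f = fitness(pos[i])
            if f < pfit[i]:              # update the local best vector
                pbest[i], pfit[i] = pos[i][:], f
            if pfit[i] < gfit:           # keep the best local vector as global best
                gbest, gfit = pbest[i][:], pfit[i]
    return gbest, gfit
```

Since all particles start from the stage-one result, the returned structure is never worse, under the given fitness, than the uniform structure found in stage one.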

Table 3 Detection performance of different models in the binary classification scenario

Model type	SAE-based anomaly detection model	Traditional SDA-based anomaly detection model	One-stage optimized SDA-based anomaly detection model	Two-stage optimized SDA-based anomaly detection model
Model structure	[28, 3, 2, 2, 2, 1, 3, 3, 3, 2]	[28, 28, 28, 28, 2]	[28, 2, 2, 2, 2, 2, 2, 2, 2, 2]	[28, 3, 2, 2, 2, 1, 3, 3, 3, 2]
Acc (%)	86.29	86.52	86.58	92.68
DR (%)	92.85	96.10	94.75	96.80
Rec (%)	90.04	92.68	89.26	94.48
FPR (%)	4.96	3.38	3.51	2.72
$T_{\rm tr}$ (min)	8.24	8.52	7.45	8.50
$T_{\rm te}$ (s)	0.18	0.18	0.18	0.18
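
Table 3 reports Acc, DR, Rec and FPR, but this section does not define them. As a hedged illustration, the sketch below computes accuracy, recall and false positive rate for binary labels using their common textbook definitions; DR is deliberately left out because its exact definition is not given here. The function name `binary_metrics` and the convention that label 1 marks attack traffic are assumptions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall and false positive rate from binary labels.

    Uses the common definitions:
      Acc = (TP + TN) / total
      Rec = TP / (TP + FN)
      FPR = FP / (FP + TN)
    """
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "Acc": (tp + tn) / total if total else 0.0,
        "Rec": tp / (tp + fn) if (tp + fn) else 0.0,
        "FPR": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Example: 3 attack samples and 5 normal samples.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```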

Table 4 Detection performance of different models in the multi-class classification scenario

Model type		SAE-based anomaly detection model	Traditional SDA-based anomaly detection model	One-stage optimized SDA-based anomaly detection model	Two-stage optimized SDA-based anomaly detection model
Model structure		[28, 24, 5]	[28, 28, 28, 28, 5]	[28, 25, 5]	[28, 24, 5]
Acc (%)		84.12	84.31	84.96	85.37
Normal	DR (%)	84.58	85.37	85.87	86.34
	Rec (%)	96.74	96.88	97.01	97.28
	FPR (%)	17.98	18.89	18.06	17.25
DoS	DR (%)	94.08	94.74	94.92	95.59
	Rec (%)	83.65	84.51	82.63	85.88
	FPR (%)	2.05	2.04	2.02	1.72
Probe	DR (%)	79.42	75.58	79.71	83.27
	Rec (%)	65.14	67.29	63.78	68.28
	FPR (%)	1.78	2.21	1.70	1.34
R2L	DR (%)	90.96	92.06	83.78	90.50
	Rec (%)	58.23	60.99	58.34	60.23
	FPR (%)	0.27	0.21	0.57	0.30
U2R	DR (%)	88.05	28.60	72.58	76.19
	Rec (%)	2.50	2.00	4.50	3.00
	FPR (%)	0.01	0.03	0.01	0.01
$T_{\rm tr}$ (min)		3.94	6.32	6.54	5.36
$T_{\rm te}$ (s)		0.20	0.40	0.41	0.26

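Table 5 Accuracy of different models in detecting noisy traffic in the multi-class classification scenario

Model type	Acc (%) at noise level 0.1	Acc (%) at noise level 0.2	Acc (%) at noise level 0.3
SAE-based anomaly detection model	81.57	79.31	76.69
Traditional SDA-based anomaly detection model	83.63	83.54	83.48
One-stage optimized SDA-based anomaly detection model	84.71	84.52	84.23
Two-stage optimized SDA-based anomaly detection model	85.08	85.01	85.02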
