An SDN Routing Optimization Mechanism Based on Deep Reinforcement Learning

LAN Julong; YU Changhe; HU Yuxiang; LI Ziyong

doi: 10.11999/JEIT180870 cstr: 32379.14.JEIT180870

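National Digital Switching System Engineering & Technological Research Center, Zhengzhou 450002, China

Funds: Innovative Research Groups program of the National Natural Science Foundation of China (61521003); National Natural Science Foundation of China (61502530)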

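Author biographies:
LAN Julong: male, born in 1962, professor and doctoral supervisor. His main research interests are new network architectures and network security.

YU Changhe: male, born in 1993, master's degree. His research interests are new network architectures and network security.

Corresponding author:
YU Changhe, yu_changhe@hotmail.com

CLC number: TP393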
Publication history
- Received: 2018-09-06
- Revised: 2019-05-12
- Published online: 2019-05-27
- Issue date: 2019-11-01

Abstract

To optimize route selection in Software Defined Networks (SDN), this paper introduces deep reinforcement learning into the SDN route-selection process and proposes a routing optimization mechanism based on it. The mechanism aims to reduce network delay and increase throughput, performs black-box optimization in continuous time, and lowers network operation and maintenance costs. The mechanism is evaluated through experiments. The results show that it converges well and is effective, and that it provides better routing configurations and more stable performance than traditional routing protocols.

Keywords:
- Software Defined Network (SDN)
- Routing optimization
- Deep reinforcement learning
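
The abstract describes routing as a black-box optimization problem: the optimizer proposes a routing configuration, the network reports a performance figure such as delay, and the optimizer uses only that feedback. The Python sketch below illustrates this loop under several assumptions that are not taken from the paper: the 4-node topology, the demands, the link-weight encoding of a routing configuration, and the delay formula are all hypothetical, and plain random search stands in for the deep reinforcement learning agent to keep the example short.

```python
import heapq
import random

# Hypothetical topology: directed links mapped to capacity (arbitrary units).
LINKS = {
    ("A", "B"): 10.0, ("B", "D"): 10.0,
    ("A", "C"): 10.0, ("C", "D"): 10.0,
    ("B", "C"): 10.0,
}
# Hypothetical traffic demands: (source, destination) mapped to rate.
DEMANDS = {("A", "D"): 6.0, ("B", "D"): 6.0}


def shortest_path(weights, src, dst):
    """Dijkstra over positive link weights; returns the links on the path."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for (a, b), w in weights.items():
            if a == u and d + w < dist.get(b, float("inf")):
                dist[b] = d + w
                prev[b] = (a, b)
                heapq.heappush(heap, (d + w, b))
    path, node = [], dst
    while node != src:
        link = prev[node]
        path.append(link)
        node = link[0]
    return path[::-1]


def mean_delay(weights):
    """Black-box evaluation: route all demands, return a delay proxy."""
    paths = {k: shortest_path(weights, *k) for k in DEMANDS}
    load = {link: 0.0 for link in LINKS}
    for k, path in paths.items():
        for link in path:
            load[link] += DEMANDS[k]

    def link_delay(link):
        spare = LINKS[link] - load[link]
        # Assumed queueing-style proxy; overloaded links get a fixed penalty.
        return 1.0 / spare if spare > 0 else 1e3

    total = sum(DEMANDS[k] * sum(link_delay(l) for l in p)
                for k, p in paths.items())
    return total / sum(DEMANDS.values())


def random_search(steps=200, seed=0):
    """Stand-in for the learning agent: sample weights, keep the best."""
    rng = random.Random(seed)
    best_w = {link: 1.0 for link in LINKS}  # uniform weights as baseline
    best_delay = mean_delay(best_w)
    for _ in range(steps):
        # The "action" is a continuous vector with one weight per link.
        w = {link: rng.uniform(0.1, 1.0) for link in LINKS}
        d = mean_delay(w)  # a learning agent would receive -d as reward
        if d < best_delay:
            best_w, best_delay = w, d
    return best_w, best_delay


if __name__ == "__main__":
    baseline = mean_delay({link: 1.0 for link in LINKS})
    _, best = random_search()
    print(f"uniform weights: {baseline:.3f}, after search: {best:.3f}")
```

With uniform weights both demands share the link from B to D and overload it, which is the kind of outcome a fixed shortest-path metric can produce. The search only ever sees the delay returned by `mean_delay`, which is what makes the optimization black-box. A reinforcement learning agent would replace `random_search` with a policy that is updated from the same feedback.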

Figure 1 SDN architecture with an added machine learning mechanism

Figure 2 Training and execution framework of DDPG

Figure 3 Framework design for DDPG-based optimization of SDN route selection

Figure 4 Network delay versus training steps under different traffic intensities

Figure 5 Comparison between the DDPG agent and random routing

Figure 6 Comparison of network delay between DDPG and OSPF
