Virtual Network Function Migration Algorithm Based on Reinforcement Learning for 5G Network Slicing

Lun TANG, Yu ZHOU, Qi TAN, Yannan WEI, Qianbin CHEN

Citation: Lun TANG, Yu ZHOU, Qi TAN, Yannan WEI, Qianbin CHEN. Virtual Network Function Migration Algorithm Based on Reinforcement Learning for 5G Network Slicing[J]. Journal of Electronics & Information Technology, 2020, 42(3): 669-677. doi: 10.11999/JEIT190290

doi: 10.11999/JEIT190290
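Funds: The National Natural Science Foundation of China (61571073), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M201800601)
More information
    Author biographies:

    Lun TANG: male, born in 1973, professor and doctoral supervisor. Research interests: next-generation wireless communication networks, heterogeneous cellular networks, and software-defined wireless networks

    Yu ZHOU: male, born in 1993, master's student. Research interests: resource allocation for 5G network slicing and deep learning

    Qi TAN: female, born in 1995, master's student. Research interests: 5G network slicing, resource allocation, and stochastic optimization theory

    Yannan WEI: male, born in 1995, master's student. Research interests: 5G network slicing, virtual resource allocation, and reliability

    Qianbin CHEN: male, born in 1967, professor and doctoral supervisor. Research interests: personal communications, multimedia information processing and transmission, and next-generation mobile communication networks

    Corresponding author:

    Yu ZHOU 137068966@qq.com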

  • CLC number: TN929.5

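  • Abstract:

    To address the virtual network function (VNF) migration optimization problem caused by the dynamics of service requests under the 5G network slicing architecture, this paper first establishes a stochastic optimization model based on a constrained Markov decision process (CMDP) to realize the dynamic deployment of multiple types of service function chains (SFCs). The model minimizes the average operating energy consumption of general-purpose servers, subject to an average delay constraint for each slice and to constraints on average buffer and bandwidth resource consumption. Second, because the system state transition probabilities are hard to obtain accurately and the state space is too large, the paper proposes an intelligent VNF migration learning algorithm based on a reinforcement learning framework. The algorithm approximates the action-value function with a convolutional neural network (CNN), and in each discrete time slot it derives a suitable VNF migration policy and CPU resource allocation scheme for each network slice from the current system state. Simulation results show that the proposed algorithm meets the QoS requirements of each slice while reducing the average energy consumption of the infrastructure.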
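    As a rough illustration of the constrained model described in the abstract, a CMDP of this kind is commonly written as a cost minimization with average-cost constraints and then relaxed with Lagrange multipliers. The symbols $\bar{E}$, $\bar{D}_i$, $\bar{Q}_h$, $\bar{X}_{h,l}$ and the bounds with superscript $\max$ are placeholders introduced here for illustration only; the multiplier names $\beta_i^d$, $\beta_h^q$, $\beta_{h,l}^x$ follow the algorithm table on this page. The exact formulation in the paper may differ.

```latex
% Sketch of a generic CMDP: minimize average energy subject to average delay,
% buffer, and bandwidth constraints (placeholder symbols, not the paper's own).
\min_{\pi}\ \bar{E}(\pi)
\quad \text{s.t.} \quad
\bar{D}_i(\pi) \le D_i^{\max}\ \ \forall i \in I, \qquad
\bar{Q}_h(\pi) \le Q_h^{\max}\ \ \forall h \in H, \qquad
\bar{X}_{h,l}(\pi) \le X_{h,l}^{\max}\ \ \forall h,l \in H

% Lagrangian relaxation with non-negative multipliers
L(\pi,\beta) = \bar{E}(\pi)
  + \sum_{i \in I} \beta_i^d \left( \bar{D}_i(\pi) - D_i^{\max} \right)
  + \sum_{h \in H} \beta_h^q \left( \bar{Q}_h(\pi) - Q_h^{\max} \right)
  + \sum_{h,l \in H} \beta_{h,l}^x \left( \bar{X}_{h,l}(\pi) - X_{h,l}^{\max} \right),
\qquad \beta \ge 0
```

    Under this reading, the per-slot Lagrangian reward $g^\beta(r,a)$ used by the learning algorithm would be the per-slot version of $L$, which is why the policy is obtained by minimizing rather than maximizing the Q value.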

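  • Figure 1  System scenario of VNF migration under the 5G network slicing architecture

    Figure 2  Architecture of DQN-based intelligent migration learning for virtual network functions

    Figure 3  Average total packet delay of each slice

    Figure 4  Average utilization of buffer resources and link bandwidth resources

    Figure 5  Average total power consumption of general-purpose servers

    Figure 6  Average total slice delay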

    Table 1  DQN-based value function approximation

     (1) Initialize the Q network with Xavier[14] weight initialization, i.e., draw the weights from the uniform distribution $W \sim U\left[ - \dfrac{\sqrt 6}{\sqrt{\upsilon_l + \upsilon_{l+1}}}, \dfrac{\sqrt 6}{\sqrt{\upsilon_l + \upsilon_{l+1}}} \right]$; initialize the target Q network with weights $w^- = w$, where $l$ is the layer index and $\upsilon_l$ is the number of neurons in layer $l$
     (2) Initialize the Lagrange multipliers $\beta_i^d \leftarrow 0, \beta_h^q \leftarrow 0, \beta_{h,l}^x \leftarrow 0$, $\forall i \in I, \forall h,l \in H$, and initialize the experience replay pool
     (3) for episode $k = 1,2,\cdots,K$ do
     (4)   Randomly select a state to initialize $r_1$
     (5)   for $t = 1,2,\cdots,T$ do
     (6)     Draw a random probability $p$; if $p \ge \varepsilon$
     (7)       compute the VNF migration and CPU resource allocation policy $a_t^* = \arg \min\limits_{a \in A} Q(r_t,a,w)$
     (8)     else select a random action $a_t \ne a_t^*$
     (9)     Execute action $a_t$, obtain the Lagrangian reward $g^\beta(r_t,a_t)$, and observe the next state $r_{t+1}$
     (10)    Store the experience sample $\left( r_t, a_t, g^\beta(r_t,a_t), r_{t+1} \right)$ in the experience replay pool
     (11)    Randomly draw a mini-batch of experience samples $\left( r_k, a_k, g^\beta(r_k,a_k), r_{k+1} \right)$ from the replay pool
     (12)    Use the target Q network to obtain $\min\limits_{a' \in A} Q(r_{k+1},a',w^-)$ and compute $y_k = g^\beta(r_k,a_k) + \gamma \min\limits_{a' \in A} Q(r_{k+1},a',w^-)$
     (13)    Update $w$ by gradient descent on $\left( y_k - Q(r_k,a_k,w) \right)^2$
     (14)    Every $T_q$ time steps, update the target Q network, i.e., $w^- = w$
     (15)    Update the Lagrange multipliers $\beta$ with the stochastic subgradient method, keeping $\beta \ge 0$
     (16)   end for
     (17) end for
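    The individual building blocks of the table above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list-based Q values, and the multiplier step size are all hypothetical, and the gradient step on the CNN weights (step 13) is left out.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Step (1): sample weights from U[-b, b] with b = sqrt(6) / sqrt(fan_in + fan_out)."""
    bound = math.sqrt(6.0) / math.sqrt(fan_in + fan_out)
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)] for _ in range(fan_in)]


def select_action(q_values, epsilon, rng):
    """Steps (6)-(8): take the minimum-cost action if p >= epsilon, else a random other action."""
    greedy = min(range(len(q_values)), key=lambda a: q_values[a])
    if rng.random() >= epsilon or len(q_values) == 1:
        return greedy
    return rng.choice([a for a in range(len(q_values)) if a != greedy])


def td_target(cost, next_q_values_target, gamma):
    """Step (12): y = g + gamma * min over a' of the target network's Q values."""
    return cost + gamma * min(next_q_values_target)


def update_multiplier(beta, observed, bound, step_size):
    """Step (15): subgradient step on one multiplier, projected back onto beta >= 0.

    `observed - bound` is the constraint violation; `step_size` is an assumed parameter.
    """
    return max(0.0, beta + step_size * (observed - bound))


rng = random.Random(0)
weights = xavier_uniform(4, 3, rng)                        # 4 inputs, 3 outputs (toy sizes)
action = select_action([2.0, 0.5, 1.0], epsilon=0.0, rng=rng)  # always greedy when epsilon is 0
target = td_target(1.0, [3.0, 2.0], gamma=0.9)
```

    Note that the roles of $p$ and $\varepsilon$ follow the table as written: the greedy action is taken when $p \ge \varepsilon$, so $\varepsilon$ is the exploration probability.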

    Table 2  DQN-based online VNF migration algorithm

     (1) for $t = 1,2,\cdots,T$ do
     (2)   /* Monitor the network state */
     (3)   Monitor the global state $r(t)$ in the current time slot $t$, including the global queue state $Q(t)$, the global node state $\zeta(t)$, and the global link state $\eta(t)$
     (4)   if $\zeta_h(t) = 0$ or $\eta_{h,l}(t) = 0$
     (5)     After migrating every $f \in F$ that satisfies $B(h,f) = 1$ or $P(f_p|f_j)B(f_j,h)B(f_p,l) \ne 0$ to other nodes, compute the optimal VNF migration policy and CPU resource allocation policy $a_t^* = \arg \min\limits_{a \in A} Q(r_t,a,w)$
     (6)   else
     (7)     Directly compute the optimal VNF migration policy and CPU resource allocation policy $a_t^* = \arg \min\limits_{a \in A} Q(r_t,a,w)$
     (8)   Execute the VNF migration based on the optimal action $a_t^*$ and allocate resources
     (9)   $t = t + 1$
     (10) end for
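    One possible reading of the online loop above is action masking: placements that leave a VNF on a failed node are removed from the candidate set before taking the arg min of Q. The sketch below assumes this reading and covers node failures only, not link failures. The action encoding (a tuple giving the node of each VNF), `toy_q`, and all function names are hypothetical stand-ins, not the paper's definitions.

```python
from itertools import product


def feasible(action, node_up):
    """An action assigns each VNF to a node; it is feasible if every chosen node is up."""
    return all(node_up[node] for node in action)


def online_step(q_function, state, actions, node_up):
    """One time slot: discard placements on failed nodes, then pick the minimum-Q action."""
    candidates = [a for a in actions if feasible(a, node_up)]
    if not candidates:
        raise RuntimeError("no feasible placement: every action uses a failed node")
    return min(candidates, key=lambda a: q_function(state, a))


# Toy setup: 2 VNFs, 3 nodes, node 0 has failed.
actions = list(product(range(3), repeat=2))


def toy_q(state, action):
    """Stand-in for the trained Q network: lower node indices are cheaper."""
    return sum(action) + 0.1 * action[0]


best = online_step(toy_q, state=None, actions=actions, node_up=[False, True, True])
```

    When every node is up, no action is masked and the function reduces to the direct arg min of step (7).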

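    Table 3  Simulation parameters

    | Parameter | Value | Parameter | Value |
    |---|---|---|---|
    | Number of network slice services $I$ | 3 | Total number of servers $H$ | 8 |
    | Number of VNF types $J$ | 10 | Node failure rate | Uniform distribution with mean in [0.01, 0.02] |
    | Time slot length $T_s$ | 10 s | Link failure rate | Uniform distribution with mean in [0.02, 0.04] |
    | Packet arrival process | Independent and identically distributed Poisson process | Link transmission delay $\delta$ | 0.5 ms |
    | Average packet size $\overline P$ | 500 kbit/packet | Maximum server power $P_h$ | 800 W |
    | Node buffer space $\chi$ | 300 MB | Server power consumption percentage $u_h$ | 0.3 |
    | Number of CPUs per node $\kappa$ | 8 | Maximum number of iteration rounds | 2000 |
    | Maximum service rate of a single CPU $\xi$ | 25 MB/s | Total training steps | 200000 |
    | Link bandwidth capacity $\Delta$ | 640 Mbps | Learning rate $\alpha$ | 0.0001 |
    | Discount factor $\gamma$ | 0.9 | Mini-batch | 8 |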
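    To get a feel for the scale of these parameters, the sketch below converts the node, buffer, and link capacities into units of average-size packets. It assumes decimal units (1 MB = 8000 kbit, 1 Mbps = 1000 kbit/s) and that every packet has the average size; the class and method names are hypothetical and the conversion is not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """A subset of the simulation parameters listed in the table above."""
    mean_packet_kbit: float = 500.0
    node_buffer_mb: float = 300.0
    cpus_per_node: int = 8
    cpu_rate_mb_per_s: float = 25.0
    link_bandwidth_mbps: float = 640.0

    def packet_mb(self):
        # Assumption: decimal units, so 1 MB = 8000 kbit.
        return self.mean_packet_kbit / 8000.0

    def node_rate_packets_per_s(self):
        # All CPUs of one node serving average-size packets at full rate.
        return self.cpus_per_node * self.cpu_rate_mb_per_s / self.packet_mb()

    def buffer_packets(self):
        # Number of average-size packets that fit in one node buffer.
        return self.node_buffer_mb / self.packet_mb()

    def link_packets_per_s(self):
        # Assumption: 1 Mbps = 1000 kbit/s.
        return self.link_bandwidth_mbps * 1000.0 / self.mean_packet_kbit


params = SimParams()
node_rate = params.node_rate_packets_per_s()
buffer_size = params.buffer_packets()
link_rate = params.link_packets_per_s()
```

    Under these assumptions a single link carries fewer average-size packets per second than a fully provisioned node can serve, which suggests that link bandwidth rather than CPU capacity may be the tighter resource in this setup.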

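    Table 4  CNN parameters

    | Layer | Kernel size | Stride | Number of kernels (units for fully connected layers) | Activation function |
    |---|---|---|---|---|
    | Convolutional layer 1 | $7 \times 7$ | 2 | 32 | ReLU |
    | Convolutional layer 2 | $5 \times 5$ | 2 | 64 | ReLU |
    | Convolutional layer 3 | $3 \times 3$ | 1 | 64 | ReLU |
    | Fully connected layer 1 | – | – | 512 | ReLU |
    | Fully connected layer 2 | – | – | 122 | Linear |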
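    The layer sizes in this table determine the feature map shape once an input size is fixed. The page does not state the input size or the padding, so the sketch below assumes no padding and uses a hypothetical 84 x 84 input purely for illustration; the function names are likewise made up for this example.

```python
# (kernel, stride, filters) for the three convolutional layers in the table
CONV_LAYERS = [(7, 2, 32), (5, 2, 64), (3, 1, 64)]


def conv_out(size, kernel, stride, padding=0):
    """Output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_shape(height, width):
    """Shape (channels, height, width) after the three convolutional layers, no padding assumed."""
    channels = None
    for kernel, stride, filters in CONV_LAYERS:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        channels = filters
    return channels, height, width


def fc_parameter_count(flat_features, hidden=512, outputs=122):
    """Weights plus biases of the two fully connected layers (512 and 122 units)."""
    return flat_features * hidden + hidden + hidden * outputs + outputs


channels, height, width = feature_shape(84, 84)  # hypothetical input size
flat = channels * height * width
fc_params = fc_parameter_count(flat)
```

    With a linear output layer of 122 units, each output would correspond to the estimated Q value of one candidate action, which is the usual DQN layout.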
  • [1] GE Xiaohu, TU Song, MAO Guoqiang, et al. 5G ultra-dense cellular networks[J]. IEEE Wireless Communications, 2016, 23(1): 72–79. doi: 10.1109/mwc.2016.7422408
    [2] SUGISONO K, FUKUOKA A, and YAMAZAKI H. Migration for VNF instances forming service chain[C]. The 7th IEEE International Conference on Cloud Networking, Tokyo, Japan, 2018: 1–3. doi: 10.1109/CloudNet.2018.8549194
    [3] ZHENG Qinghua, LI Rui, LI Xiuqi, et al. Virtual machine consolidated placement based on multi-objective biogeography-based optimization[J]. Future Generation Computer Systems, 2016, 54: 95–122. doi: 10.1016/j.future.2015.02.010
    [4] ZHANG Xiaoqing, YUE Qiang, and HE Zhongtang. Dynamic Energy-efficient Virtual Machine Placement Optimization for Virtualized Clouds[M]. JIA Limin, LIU Zhigang, QIN Yong, et al. Proceedings of the 2013 International Conference on Electrical and Information Technologies for Rail Transportation (EITRT2013)-Volume II. Berlin, Heidelberg: Springer, 2014, 288: 439–448. doi: 10.1007/978-3-642-53751-6_47
    [5] ERAMO V, AMMAR M, and LAVACCA F G. Migration energy aware reconfigurations of virtual network function instances in NFV architectures[J]. IEEE Access, 2017, 5: 4927–4938. doi: 10.1109/ACCESS.2017.2685437
    [6] ERAMO V, MIUCCI E, AMMAR M, et al. An approach for service function chain routing and virtual function network instance migration in network function virtualization architectures[J]. IEEE/ACM Transactions on Networking, 2017, 25(4): 2008–2025. doi: 10.1109/TNET.2017.2668470
    [7] WEN Tao, YU Hongfang, SUN Gang, et al. Network function consolidation in service function chaining orchestration[C]. 2016 IEEE International Conference on Communications, Kuala Lumpur, Malaysia, 2016: 1–6. doi: 10.1109/ICC.2016.7510679
    [8] YANG Jian, ZHANG Shuben, WU Xiaomin, et al. Online learning-based server provisioning for electricity cost reduction in data center[J]. IEEE Transactions on Control Systems Technology, 2017, 25(3): 1044–1051. doi: 10.1109/TCST.2016.2575801
    [9] CHENG Aolin, LI Jian, YU Yuling, et al. Delay-sensitive user scheduling and power control in heterogeneous networks[J]. IET Networks, 2015, 4(3): 175–184. doi: 10.1049/iet-net.2014.0026
    [10] LI Rongpeng, ZHAO Zhifeng, CHEN Xianfu, et al. TACT: A transfer actor-critic learning framework for energy saving in cellular radio access networks[J]. IEEE Transactions on Wireless Communications, 2014, 13(4): 2000–2011. doi: 10.1109/TWC.2014.022014.130840
    [11] WANG Shangxing, LIU Hanpeng, GOMES P H, et al. Deep reinforcement learning for dynamic multichannel access in wireless networks[J]. IEEE Transactions on Cognitive Communications and Networking, 2018, 4(2): 257–265. doi: 10.1109/TCCN.2018.2809722
    [12] HUANG Xiaohong, YUAN Tingting, QIAO Guanghua, et al. Deep reinforcement learning for multimedia traffic control in software defined networking[J]. IEEE Network, 2018, 32(6): 35–41. doi: 10.1109/MNET.2018.1800097
    [13] HE Ying, ZHANG Zheng, YU F R, et al. Deep-reinforcement-learning-based optimization for cache-enabled opportunistic interference alignment wireless networks[J]. IEEE Transactions on Vehicular Technology, 2017, 66(11): 10433–10445. doi: 10.1109/TVT.2017.2751641
    [14] GLOROT X and BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]. The International Conference on Artificial Intelligence and Statistics, Sardinia, 2010: 249–256.
    [15] PERUMAL V and SUBBIAH S. Power-conservative server consolidation based resource management in cloud[J]. International Journal of Network Management, 2014, 24(6): 415–432. doi: 10.1002/nem.1873
    [16] QU Long, ASSI C, SHABAN K, et al. Delay-aware scheduling and resource optimization with network function virtualization[J]. IEEE Transactions on Communications, 2016, 64(9): 3746–3758. doi: 10.1109/TCOMM.2016.2580150
Publication history
  • Received:  2019-04-25
  • Revised:  2019-09-11
  • Published online:  2019-09-19
  • Issue date:  2020-03-19
