

Federated Slicing Resource Management in Edge Computing Networks based on GAN-Assisted Multi-Agent Reinforcement Learning

LIN Yan, XIA Kaiyuan, ZHANG Yijin

Citation: LIN Yan, XIA Kaiyuan, ZHANG Yijin. Federated Slicing Resource Management in Edge Computing Networks based on GAN-Assisted Multi-Agent Reinforcement Learning[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT240773

doi: 10.11999/JEIT240773
Funds: The National Natural Science Foundation of China (62001225, 62071236)
Details
    Author biographies:

    LIN Yan: Female, Associate Professor. Her research interests include intelligent wireless resource allocation for 6G networks, such as vehicular networks and UAV communications.

    XIA Kaiyuan: Male, Master's student. His research interests include network slicing resource allocation and reinforcement learning.

    ZHANG Yijin: Male, Professor. His research interests include sequence design, wireless networks, and artificial intelligence.

    Corresponding author:

    ZHANG Yijin, yijin.zhang@gmail.com

  • CLC number: TN929.5

  • Abstract: To meet the differentiated service requirements of users in dynamic edge computing network scenarios, this paper proposes a federated slicing resource management scheme based on Generative Adversarial Network (GAN)-assisted multi-agent Reinforcement Learning (RL). First, considering unknown time-varying channels and random user traffic arrivals, a joint bandwidth and computing slicing resource management problem is formulated to jointly optimize the long-term average service waiting latency and the service satisfaction rate, and is further modeled as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Then, the multi-agent Dueling Double Deep Q-Network (D3QN) method is adopted, exploiting the strength of GANs in multi-modal learning of state-value distributions, and a federated learning framework is used to foster cooperative learning among agents, so that coordinated slicing resource management decisions are reached by sharing only each agent's generator-network weight parameters. Simulation results show that, compared with benchmark schemes, the proposed scheme reduces the average service waiting latency by more than 28% and simultaneously improves the average service satisfaction rate by more than 8%, while preserving user privacy.
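As a rough illustration of the distributional dueling idea described above (used in steps (9)–(11) of Algorithm 1 below), the following minimal Python sketch assembles per-action Q-values from GAN-generated state-value particles. It assumes the common dueling aggregation $Q(\boldsymbol{o},\boldsymbol{a}) = V(\boldsymbol{o}) + A(\boldsymbol{o},\boldsymbol{a}) - \tfrac{1}{|\mathcal{A}|}\sum\nolimits_{\boldsymbol{a}'} A(\boldsymbol{o},\boldsymbol{a}')$ with $V$ estimated as the mean of the $N$ particles; the function name and shapes are illustrative, not the paper's exact Eq. (10).

```python
import numpy as np

def dueling_q_from_particles(v_particles, advantages):
    # Point-estimate the state value V as the mean of the N generated
    # particles, then apply the standard dueling aggregation
    # Q(o, a) = V(o) + A(o, a) - mean_a' A(o, a').
    v = v_particles.mean()
    return v + advantages - advantages.mean()

# Illustrative usage: N = 30 particles (Table 1) and 4 candidate actions.
rng = np.random.default_rng(0)
q_values = dueling_q_from_particles(rng.normal(size=30), rng.normal(size=4))
best_action = int(np.argmax(q_values))  # greedy selection, cf. step (11)
```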
  • Figure 1  System model of the network-slicing-assisted edge computing network

    Figure 2  Schematic of the multi-agent federated D3QN-GAN based slicing resource management algorithm for edge computing networks

    Figure 3  Convergence comparison

    Figure 4  Trade-off between average service satisfaction rate and average service waiting latency

    Figure 5  Comparison of average user satisfaction rate under different numbers of users

    Figure 6  Comparison of average service waiting latency under different numbers of APs

    Algorithm 1  GAN-assisted multi-agent reinforcement learning based federated slicing resource management algorithm for edge computing networks

     (1) Each AP agent initializes its generator network ${G_b}$ and discriminator network ${D_b}$;
     (2) Each AP agent initializes its target generator network ${\hat G_b}$, local experience replay buffer ${\mathcal{M}_b}$, and the number of particles $N$;
     (3) ${T^{{\text{train}}}} \leftarrow 0$;
     (4) for episode $v = 1,2, \cdots ,V$ do:
     (5)  Reset the environment;
     (6)  for TS $t = 1,2, \cdots ,T$ do:
     (7)   for AP agent $b = 1,2, \cdots ,B$ do:
     (8)    Sample noise ${{\boldsymbol{\tau}} _{b,t}} \sim U\left( {0,1} \right)$, obtain the local observation ${{\boldsymbol{o}}_{b,t}}$, and feed both into the generator network ${G_b}$;
     (9)    Obtain the state-value particles $\left\{ {G_{b,t}^{\text{V}}\left( {{{\boldsymbol{o}}_{b,t}},{{\boldsymbol{\tau}} _{b,t}}} \right)} \right\}$ and the action advantage values $G_{b,t,{{\boldsymbol{a}}_{b,t}}}^{\text{A}}\left( {{{\boldsymbol{o}}_{b,t}},{{\boldsymbol{\tau}} _{b,t}}} \right)$;
     (10)   Compute the state-action value function ${Q_{b,t}}\left( {{{\boldsymbol{o}}_{b,t}},{{\boldsymbol{a}}_{b,t}}} \right)$ according to Eq. (10);
     (11)   Execute the action ${\boldsymbol{a}}_{b,t}^* \leftarrow {\text{argmax}}\,{Q_{b,t}}\left( {{{\boldsymbol{o}}_{b,t}},{{\boldsymbol{a}}_{b,t}}} \right)$;
     (12)   Obtain the environment reward ${r_t}$ and the next-slot observation ${{\boldsymbol{o}}_{b,t + 1}}$;
     (13)   Store the training tuple $\left\{ {{{\boldsymbol{o}}_{b,t}},{{\boldsymbol{a}}_{b,t}},{{\boldsymbol{o}}_{b,t + 1}},{r_t}} \right\}$ in the local experience replay buffer ${\mathcal{M}_b}$;
     (14)  end for;
     (15)  if ${T^{{\text{train}}}} \ge {T^{{\text{update}}}}$:
     (16)   for AP agent $b = 1,2, \cdots ,B$ do:
     (17)    Randomly sample $\left\{ {{{\boldsymbol{o}}_{b,k}},{{\boldsymbol{a}}_{b,k}},{{\boldsymbol{o}}_{b,k + 1}},{r_k}} \right\}_{k = 1}^K \sim {\mathcal{M}_b}$, and sample noise $\left\{ {{{\boldsymbol{\tau}} _{b,k}}} \right\}_{k = 1}^K$ and $ {\left\{{\varepsilon }_{b,k}\right\}}_{k=1}^{K} $;
     (18)    Compute the loss $J_{b,k}^D$ according to Eqs. (14)–(16) and update the discriminator network ${D_b}$ via $\theta _{b,t + 1}^D \leftarrow \theta _{b,t}^D - {\eta ^D}{\nabla _\theta }J_{b,t}^D$;
     (19)    Compute ${Q_{b,k}}\left( {{{\boldsymbol{o}}_{b,k}},{{\boldsymbol{a}}_{b,k}}} \right)$ and ${\hat Q_{b,k}}\left( {{{\boldsymbol{o}}_{b,k}},{{\boldsymbol{a}}_{b,k}}} \right)$;
     (20)    Compute the loss $J_{b,k}^G$ according to Eqs. (12)–(13);
     (21)    Update the main generator network ${G_b}$ via $\theta _{b,t + 1}^G \leftarrow \theta _{b,t}^G - {\eta ^G}{\nabla _\theta }J_{b,t}^G$, and update the target generator network ${\hat G_b}$ via $\hat \theta _{b,t}^G \leftarrow \theta _{b,t}^G$;
     (22)   end for;
     (23)  end if;
     (24) end for;
     (25) Perform federated aggregation according to Eqs. (17)–(18) and broadcast the generator network parameters $\theta _{b,1,v + 1}^G$ to all agents;
     (26) ${T^{{\text{train}}}} \leftarrow {T^{{\text{train}}}} + 1$;
     (27) end for;
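Equations (17)–(18), which define the federated aggregation in step (25), are not reproduced on this page. The sketch below assumes a FedAvg-style weighted average over the generator parameters shared by the $B$ AP agents; the uniform weights and toy layer structure are illustrative, not the paper's exact rule.

```python
import numpy as np

def federated_aggregate(agent_params, weights=None):
    # agent_params: list of B parameter sets, each a list of np.ndarray
    # (one array per generator layer). Returns the aggregated parameter set
    # that is broadcast back to every agent, as in step (25) of Algorithm 1.
    num_agents = len(agent_params)
    if weights is None:
        weights = [1.0 / num_agents] * num_agents  # uniform, FedAvg-style
    return [
        sum(w * params[i] for w, params in zip(weights, agent_params))
        for i in range(len(agent_params[0]))
    ]

# Illustrative usage: B = 4 AP agents (Table 1), a toy two-layer generator.
rng = np.random.default_rng(1)
local_params = [[rng.normal(size=(10, 8)), rng.normal(size=8)] for _ in range(4)]
global_params = federated_aggregate(local_params)
```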

    Table 1  Simulation parameter settings

    System parameters
    AP transmit power ${P^{\text{A}}}$ 46 dBm
    User transmit power ${P^{\text{U}}}$ 23 dBm
    Time slot duration $\tau $ 10 ms
    Maximum tolerable latency $l_i^{{\text{max}}}$ {5, 8, 9} ms
    Uplink task packet size ${x_{{u_b},t}}$ {2.4, 12, 30} kbit
    Ratio of post-processing to pre-processing packet size $\beta $ 0.25
    Computing task workload ${s_{{u_b},t}}$ {0.1, 0.2, 1} kMc
    Number of users ${U_b}$ 20
    Number of slices $I$ 3
    AP coverage radius 40 m
    Number of APs $B$ 4
    Total AP bandwidth ${W_b}$ 36 MHz
    Total AP computing resources ${C_b}$ 900 kMc/s
    Bandwidth resource block size ${\rho ^{\text{B}}}$ 2 MHz
    Computing resource block size ${\rho ^{\text{C}}}$ 50 kMc
    Training parameters
    Generator learning rate ${\eta ^G}$ 1e–3
    Discriminator learning rate ${\eta ^D}$ 1e–3
    Reward discount factor $\gamma $ 0.8
    Steps per episode 100
    Number of state-value particles $N$ 30
    Weight coefficient $\alpha $ 0.5
    Batch size 32
    Experience replay buffer size 50 000
    Target network update frequency 10$ \tau $
    Input noise dimension 10
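For reference, the Table 1 training hyperparameters can be gathered into a single configuration object. The sketch below is illustrative; the key names are assumptions, not identifiers from the authors' code.

```python
# Key names are illustrative; the values follow Table 1.
TRAIN_CONFIG = {
    "generator_lr": 1e-3,        # eta^G
    "discriminator_lr": 1e-3,    # eta^D
    "reward_discount": 0.8,      # gamma
    "steps_per_episode": 100,
    "num_value_particles": 30,   # N
    "weight_coefficient": 0.5,   # alpha
    "batch_size": 32,
    "replay_buffer_size": 50_000,
    "target_update_period": 10,  # in units of the slot duration tau
    "noise_dim": 10,
}
```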
Publication history
  • Received: 2024-09-09
  • Revised: 2025-02-19
  • Published online: 2025-02-28
