Collaborative Air-Ground Computation Offloading and Resource Optimization in Dynamic Vehicular Network Scenarios
doi: 10.11999/JEIT240464
1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
2. College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
3. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China
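Abstract: To address the challenges posed by the rapid growth in the number of mobile users and the sparse distribution of ground infrastructure, this paper proposes an energy-harvesting-assisted air-ground collaborative computation offloading architecture. The architecture exploits the flexible mobility of Unmanned Aerial Vehicles (UAVs) and the strong computing power of RoadSide Units (RSUs) and the Base Station (BS) to distribute task computation dynamically and in real time. In particular, the UAVs rely on energy harvesting to sustain continuous operation and stable computing performance. Considering the high mobility of UAVs and ground vehicles, the randomness of vehicular computing tasks, and the time-varying channel model, a long-term optimization problem with energy constraints is formulated to reduce the average delay of the whole system from a global perspective. To solve this complex Mixed Integer Programming (MIP) problem, a computation offloading strategy based on an Improved Actor-Critic reinforcement learning Algorithm (IACA) is proposed. The algorithm applies Lyapunov optimization to decompose the long-term system delay optimization problem into a series of tractable frame-level subproblems. A genetic algorithm then computes the target Q-value, replacing the output of the target neural network, to steer the direction in which reinforcement learning evolves. This keeps the algorithm from becoming trapped in local optima and achieves efficient offloading and resource optimization in dynamic vehicular networks. Comprehensive simulations verify the feasibility and superiority of the proposed offloading architecture and algorithm.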
Key words:
- Air-ground integrated vehicular network
- Energy harvesting
- Computation offloading
- Reinforcement learning
- Genetic algorithm
Abstract:
Objective: In response to the rapid growth of mobile users and the limited distribution of ground infrastructure, this research addresses the challenges faced by vehicular networks. It emphasizes the need for efficient computation offloading and resource optimization, highlighting the role of Unmanned Aerial Vehicles (UAVs), RoadSide Units (RSUs), and Base Stations (BSs) in enhancing overall system performance.

Methods: This paper proposes an energy harvesting-assisted air-ground cooperative computation offloading architecture. An iterative process adjusts offloading decisions and computing resource allocations at low cost, aiming to optimize overall system performance. The approach is outlined as follows. Firstly, a framework for energy harvesting-assisted air-ground cooperative computation offloading is introduced. It integrates UAVs, RSUs, and BSs to collaboratively manage the dynamic task queues generated by vehicles. By incorporating Energy Harvesting (EH) technology, UAVs can capture and convert ambient renewable energy, which ensures a continuous power supply and stable computing capabilities and addresses the limitations of finite energy resources. Secondly, to address system complexities such as time-varying channel conditions, high node mobility, and dynamic task arrivals, a Mixed Integer Programming (MIP) problem is formulated. The objective is to optimize system performance by determining effective joint offloading decisions and resource allocation strategies, minimizing global service delays while meeting various dynamic and long-term energy constraints. Thirdly, an Improved Actor-Critic Algorithm (IACA), based on reinforcement learning principles, is introduced to solve the formulated MIP problem. The algorithm uses Lyapunov optimization to decompose the problem into frame-level deterministic optimizations, which makes it more manageable. In addition, a genetic algorithm computes target Q-values, which guides the reinforcement learning process and improves both solution efficiency and global optimality. IACA iteratively refines offloading decisions and resource allocations. Together, these methods provide a novel framework and algorithm for air-ground cooperative computation offloading under limited energy resources, fluctuating channel conditions, and high node mobility.

Results and Discussions: The effectiveness and efficiency of the proposed framework and algorithm are evaluated through extensive simulations. The results show that the proposed approach achieves dynamic and efficient offloading and resource optimization within vehicular networks. The IACA algorithm converges efficiently. Over the course of 4 000 training episodes, the agent continuously interacts with the environment, refining its decision-making strategy and updating network parameters. The loss function values of both the Actor and Critic networks progressively decrease, indicating improvements in their ability to model the real-world environment. Meanwhile, reward values rise as training episodes increase and ultimately stabilize, which signifies that the agent has discovered a more effective decision-making strategy. The average system delay and energy consumption are presented relative to time slots. As the number of slots increases, the average delay decreases for all algorithms except RA, which remains the highest due to random offloading. RLA2C outperforms RLASD owing to its advantage function. IACA, trained repeatedly in dynamic environments, achieves an average service delay that closely approximates CPLEX's optimal performance. It also significantly reduces average energy consumption by minimizing Lyapunov drift and penalties, outperforming both RA and RLASD. The impact of task input data size on system performance is examined. As the data size increases from 750 kbit to 1 000 kbit, both average delay and energy consumption rise. The IACA algorithm, with its effective interaction with the environment and enhanced genetic algorithm, consistently produces near-optimal solutions, demonstrating strong performance in both energy efficiency and delay management. In contrast, the performance gaps of RLASD and RLA2C relative to CPLEX widen due to unstable training environments for larger tasks. RA leads to significant fluctuations in average delay and energy consumption. The effect of the Lyapunov parameter V on average delay and energy consumption at T=200 is illustrated. Performance can be finely tuned through V: as V increases, average delay decreases while energy consumption rises, and both eventually stabilize. The IACA algorithm, with its enhanced Q-values, effectively optimizes both delay and energy. Furthermore, the impact of UAV energy thresholds and UAV counts on average system delay is demonstrated. IACA avoids local optima and adapts effectively to thresholds, outperforming RLA2C, RLASD, and RA. An increase in the number of UAVs initially reduces delay; however, an excess can lead to increased delay due to limited computing power.

Conclusions: The proposed EH-assisted collaborative air-ground computing offloading framework and IACA algorithm significantly improve the performance of vehicular networks by optimizing offloading decisions and resource allocations. Simulation results validate the effectiveness of the proposed methodology in reducing average delay, enhancing energy efficiency, and increasing system throughput. Future research could focus on integrating more advanced energy harvesting technologies and further refining the proposed algorithm to better address the complexities associated with large-scale vehicular networks. (Specific figures and tables are not referenced in this summary; the simulations conducted within the paper provide comprehensive quantitative results to support the findings discussed.)
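The Lyapunov decomposition mentioned in the abstract can be illustrated with a minimal drift-plus-penalty sketch. The paper's exact queue definition and per-frame objective are not given in this summary, so the virtual energy queue, the cost `V * delay + queue * energy`, and all function and field names below are assumptions chosen for illustration only.

```python
def update_virtual_queue(queue, energy_used, energy_budget):
    """Assumed virtual energy queue: grows when a frame overspends its budget."""
    return max(queue + energy_used - energy_budget, 0.0)


def frame_cost(delay, energy_used, queue, V):
    """Assumed per-frame drift-plus-penalty cost: V weights delay, queue weights energy."""
    return V * delay + queue * energy_used


def pick_action(candidates, queue, V):
    """Choose the candidate action with the lowest per-frame cost."""
    return min(candidates, key=lambda c: frame_cost(c["delay"], c["energy"], queue, V))


# Two hypothetical actions: one fast but energy-hungry, one slow but frugal.
candidates = [
    {"name": "fast", "delay": 1.0, "energy": 5.0},
    {"name": "frugal", "delay": 2.0, "energy": 1.0},
]
small_v_choice = pick_action(candidates, queue=1.0, V=1.0)
large_v_choice = pick_action(candidates, queue=1.0, V=10.0)
```

Under these assumptions a larger `V` shifts the choice toward the low-delay, high-energy action, which matches the reported trend that delay falls and energy consumption rises as V grows.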
Algorithm 1 Computation offloading strategy based on the improved Actor-Critic reinforcement learning algorithm
Input: system state $ {\boldsymbol{S}}_{t} $, parameter $ V $, reward discount factor $ \gamma $, Actor network structure, Critic network structure
Output: offloading decision $ {\hat{\boldsymbol{\alpha }}}^{t} $ and the optimal computing frequency allocation $ {\hat{\boldsymbol{f}}}^{t} $ for each time frame
(1) Initialize the experience pool, the network model parameters, and the system environment parameters;
(2) for episode $ \leftarrow 1,2,\cdots $ do
(3)     Obtain the initial state $ {\boldsymbol{S}}_{0} $ of the current environment;
(4)     The Actor generates a relaxed action in the range 0~1, $ {\hat{\alpha }}_{u,s}^{t},{\hat{f}}_{u}^{t} $;
(5)     Quantize $ {\hat{\alpha }}_{u,s}^{t} $ and $ {\hat{f}}_{u}^{t} $ into the binary action $ {\hat{\boldsymbol{\alpha }}}^{t} $ and the computing frequency $ {\hat{\boldsymbol{f}}}^{t} $ that satisfies the constraints, obtaining the action $ {\boldsymbol{A}}_{t} $;
(6)     Based on the action $ {\boldsymbol{A}}_{t} $, obtain the next state $ {\boldsymbol{S}}_{t+1} $ and the current reward $ {R}_{t} $;
(7)     The improved genetic algorithm generates the offloading decision $ {\bar{\alpha }}_{u,s}^{t} $ and the reward $ {R}'_{t} $;
(8)     if $ {R}'_{t} > {R}_{t} $ then
(9)         $ {\boldsymbol{A}}_{t}=\left\{{\bar{\alpha }}_{u,s}^{t},{f}_{u}^{t}\right\} $
(10)        $ {R}_{t}={R}'_{t} $
(11)    Store $ \left\{{\boldsymbol{S}}_{t},{\boldsymbol{A}}_{t},{R}_{t},{\boldsymbol{S}}_{t+1}\right\} $ in the experience pool;
(12)    for Agent do
(13)        Randomly sample a batch $ \left\{{\boldsymbol{S}}_{t},{\boldsymbol{A}}_{t},{R}_{t},{\boldsymbol{S}}_{t+1}\right\} $ from the experience pool;
(14)        Compute the TD target value via $ {\lambda }_{t}={R}_{t}+\gamma Q\left({\boldsymbol{S}}_{t+1},{\boldsymbol{A}}_{t+1}:{\omega }'\right) $;
(15)        Compute the loss $ \mathrm{Loss}\left(\omega \right)=\dfrac{1}{2}{\left[Q\left({\boldsymbol{S}}_{t},{\boldsymbol{A}}_{t}:\omega \right)-{\lambda }_{t}\right]}^{2} $ and update the Critic network;
(16)        Compute the loss $ \mathrm{Loss}\left(\theta \right)=\nabla_{\theta }\ln {\pi }_{\theta }\left({\boldsymbol{S}}_{t},{\boldsymbol{A}}_{t}\right)Q\left({\boldsymbol{S}}_{t},{\boldsymbol{A}}_{t}:\omega \right) $ and update the Actor network by policy gradient;
(17) for $ t=1,2,\cdots ,T $ do
(18)    Obtain the environment state of time slot $ t $;
(19)    Use the trained Actor-Critic model to obtain the optimal offloading decision $ {\hat{\boldsymbol{\alpha }}}^{t} $ and computing frequency $ {\hat{\boldsymbol{f}}}^{t} $ for time slot $ t $;
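Steps (4)-(5), (8)-(10), and (14)-(15) of the algorithm above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the listing does not say how the relaxed action is quantized, so the one-hot argmax rule, the scaling of frequencies by a hypothetical `f_max`, and every function name here are assumptions, and the Actor, Critic, and genetic algorithm themselves are left out.

```python
import numpy as np


def quantize_action(alpha_relaxed, f_relaxed, f_max):
    """Steps (4)-(5), assumed rule: one-hot server choice per task, frequency capped at f_max.

    alpha_relaxed has one row per task and one column per candidate server.
    """
    alpha = np.zeros_like(alpha_relaxed, dtype=int)
    rows = np.arange(alpha_relaxed.shape[0])
    alpha[rows, alpha_relaxed.argmax(axis=1)] = 1
    f = np.clip(f_relaxed, 0.0, 1.0) * f_max
    return alpha, f


def keep_better(action, reward, ga_alpha, ga_reward):
    """Steps (8)-(10): adopt the genetic algorithm's decision only if its reward is higher."""
    alpha, f = action
    if ga_reward > reward:
        return (ga_alpha, f), ga_reward
    return (alpha, f), reward


def td_target(reward, q_next, gamma=0.95):
    """Step (14): lambda_t = R_t + gamma * Q(S_{t+1}, A_{t+1})."""
    return reward + gamma * q_next


def critic_loss(q_value, target):
    """Step (15): half the squared TD error."""
    return 0.5 * (q_value - target) ** 2


# Hypothetical relaxed Actor output for two tasks and three candidate servers.
alpha, f = quantize_action(
    np.array([[0.2, 0.7, 0.1], [0.9, 0.05, 0.05]]),
    np.array([0.5, 1.2]),
    f_max=2e9,
)
```

The default `gamma=0.95` follows the reward discount factor listed in Table 1. In a full implementation the scalar functions would operate on batches sampled from the experience pool.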
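Table 1 Experimental parameters
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| UAV computing energy-efficiency coefficient $ {\kappa }_{u} $ | $ 10^{-28} $ | UAV flight speed $ {v}_{u}^{t} $ | 25 m/s |
| Available bandwidth $ {B}_{u,v} $ | 3 MHz | Available bandwidth $ {B}_{u,r} $ | 1 MHz |
| Available bandwidth $ {B}_{u,0} $ | 2.5 MHz | Reward discount factor $ \gamma $ | 0.95 |
| Model training optimizer | AdamOptimizer | Batch size | 512 |
| Actor learning rate | 0.001 | Critic learning rate | 0.002 |
| Antenna gain $ {A}_{d} $ | 3 | Carrier frequency $ {F}_{u,r} $ | 915 MHz |
| Path loss $ {g}_{0} $ | –40 dB | Reference distance $ {d}_{0} $ | 1 m |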
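To make the channel entries of Table 1 concrete, the sketch below converts the –40 dB path loss at the 1 m reference distance to a linear gain and extrapolates it with a reference-distance path-gain model. The channel model itself is not given in this excerpt, so the power-law form and the default exponent of 2 are assumptions, and the function names are hypothetical.

```python
import math

G0_DB = -40.0  # path loss g_0 at the reference distance (Table 1)
D0_M = 1.0     # reference distance d_0 in metres (Table 1)


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def channel_gain(distance_m, exponent=2.0):
    """Assumed reference-distance model: g = g_0 * (d_0 / d) ** exponent."""
    if distance_m < D0_M:
        raise ValueError("distance must be at least the reference distance")
    return db_to_linear(G0_DB) * (D0_M / distance_m) ** exponent


gain_at_100_m = channel_gain(100.0)
```

With the assumed exponent of 2, the linear gain drops by a factor of 10 000 between 1 m and 100 m. A different exponent can be passed in to reflect other propagation conditions.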