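Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles

HE Jiang, YU Wanxin, HUANG Hao, JIANG Weiheng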

Citation: HE Jiang, YU Wanxin, HUANG Hao, JIANG Weiheng. Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT240738

doi: 10.11999/JEIT240738
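Funding: The Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJQN202203101)
Details
    Author biographies:

    HE Jiang: male, engineer. Research interest: UAV swarm technology

    YU Wanxin: female, master's student. Research interests: UAV swarms, multi-agent technology

    HUANG Hao: male, master's student. Research interests: communication signal processing, deep reinforcement learning

    JIANG Weiheng: male, associate research fellow. Research interest: intelligence-enabled wireless communications

    Corresponding author:

    JIANG Weiheng whjiang@cqu.edu.cn

  • Chinese Library Classification (CLC) number: TN929.52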

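  • Abstract: This paper studies distributed sensing with multiple unmanned aerial vehicles (UAVs). To coordinate the behavior of the UAVs, a task sensing and data backhaul protocol is designed, and a mixed-integer nonlinear programming problem is formulated that jointly optimizes UAV task allocation, association with the base station (BS) used for data backhaul, and flight strategy. Because the problem has a complex mathematical structure, and because centralized optimization algorithms suffer from high computational complexity and heavy information-exchange overhead, the problem is recast as a cooperative Markov game (MG) with a reward function that combines cost and utility. To handle the tightly coupled continuous-discrete action space of the MG, an independent-learner (IL) based compound-action actor-critic algorithm (MA-IL-CA2C) is designed to solve it. Simulation results show that, compared with the baseline algorithms, the proposed algorithm significantly improves the system reward and reduces network energy consumption.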
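  • Figure 1  UAV network scenario for distributed sensing applications

    Figure 2  Time-slot structure of the sensing-transmission protocol

    Figure 3  Flight direction angles $\boldsymbol{\delta}_n^t = \left( \alpha_n^t, \beta_n^t \right)$ of $\mathrm{UAV}_n$

    Figure 4  $\mathrm{UAV}_n$ using the MA-IL-CA2C algorithm for joint optimization of communication strategy and flight strategy

    Figure 5  Comparison of system reward across algorithms

    Figure 6  Comparison of system cost across algorithms

    Figure 7  3D flight trajectories and tasks of the UAVs under different DRL algorithms

    Figure 8  Comparison of the average reward of the selected tasks across algorithms

    Figure 9  System reward of the MA-IL-CA2C algorithm with and without power allocation and speed control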

    Algorithm 1  MA-IL-CA2C algorithm

     (1) Initialization: set $t = 0$ and the maximum number of decision periods $T$; choose the experience replay capacity $N_{\mathrm{c}}$, the batch size $N_{\mathrm{b}}$, the network learning rates $\alpha_{\boldsymbol{\theta}_n^t}$ and $\alpha_{\boldsymbol{\omega}_n^t}$, and the soft update parameter $\rho$;
     (2) For each agent $n \in \mathcal{N}$:
      randomly initialize the network parameters $\boldsymbol{\theta}_n^t$, $\hat{\boldsymbol{\theta}}_n^t$, $\boldsymbol{\omega}_n^t$, $\hat{\boldsymbol{\omega}}_n^t$, and set the initial state $\boldsymbol{s}^0$;
     # Main loop
     (3) If $t \le T$:
      (a) For each agent $n \in \mathcal{N}$:
       according to Eq. (28), select the discrete action $\boldsymbol{a}_n^{\mathrm{dis},t}$ in state $\boldsymbol{s}_n^t$, i.e., select sensing task $m$ and $\mathrm{BS}_k$;
       # Cooperation phase
       report the decision $D_n^{\mathrm{c}} = \left\{ n, \boldsymbol{a}_n^{\mathrm{dis},t} \right\}$ on the control channel and receive the decision information of the other UAVs;
       from the discrete action $\boldsymbol{a}_n^{\mathrm{dis},t}$, determine the continuous action $\boldsymbol{a}_n^{\mathrm{con},t} = v_n^t\left( \boldsymbol{s}^t, \boldsymbol{a}_n^{\mathrm{dis},t} \right)$, i.e., the flight direction angles $\boldsymbol{\delta}_n^t$, the moving speed $v_n^t$, and the transmit power $P_n^t$;
       # Movement phase
       fly to the sensing position $\boldsymbol{x}_n^{\mathrm{s},t}$ using the flight direction angles $\boldsymbol{\delta}_n^t$ and the moving speed $v_n^t$;
       # Sensing phase
       execute the sensing task and collect the task data $D_n^{\mathrm{s},t}$;
       # Transmission phase
       send the task data back to $\mathrm{BS}_k$ with transmit power $P_n^t$;
       obtain the reward $r_n^{t+1}$ according to Eq. (23) and observe $\boldsymbol{s}^{t+1}$;
       store the experience tuple $\left( \boldsymbol{s}^t, \boldsymbol{a}_n^t, r_n^{t+1}, \boldsymbol{s}^{t+1} \right)$ in the experience replay module $\mathcal{D}_n$;
       if $t > N_{\mathrm{c}}$:
        remove old experience tuples from the experience replay module $\mathcal{D}_n$;
       # Network training
       randomly sample a batch of $N_{\mathrm{b}}$ experience tuples $\left( \boldsymbol{s}^t, \boldsymbol{a}_n^t, r_n^{t+1}, \boldsymbol{s}^{t+1} \right)$ from the experience replay module $\mathcal{D}_n$;
       update the current network parameters $\boldsymbol{\theta}_n^t$ and $\boldsymbol{\omega}_n^t$ according to Eqs. (29)-(34);
       update the target network parameters $\hat{\boldsymbol{\theta}}_n^t$ and $\hat{\boldsymbol{\omega}}_n^t$ according to Eqs. (36) and (37);
      (b) Set $t = t + 1$, $\boldsymbol{s}^t \leftarrow \boldsymbol{s}^{t+1}$;
     (4) Repeat step (3) until the algorithm terminates.
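
    The experience replay and discrete-action steps listed above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: Eq. (28) is not reproduced on this page, so an epsilon-greedy rule over per-action value estimates is assumed (suggested only by the greedy rate $\varepsilon$ among the hyperparameters), and the names `ReplayBuffer`, `select_discrete_action`, and `q_values` are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity experience replay; the oldest tuples are dropped first."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards the oldest item once the buffer is full,
        # which plays the role of "remove old experience tuples".
        self.storage = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.storage.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped by the buffer size.
        k = min(batch_size, len(self.storage))
        return random.sample(list(self.storage), k)

    def __len__(self):
        return len(self.storage)


def select_discrete_action(q_values, epsilon, rng=random):
    """Assumed epsilon-greedy choice of a (task m, BS k) pair.

    q_values is a hypothetical dict mapping each (task, bs) pair to an
    estimated value; it stands in for whatever Eq. (28) evaluates.
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)           # explore
    return max(actions, key=q_values.get)    # exploit
```

    In this sketch each UAV would own one `ReplayBuffer`, matching the per-agent module $\mathcal{D}_n$ of the independent-learner setting.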

    Table 1  Simulation parameters

    Parameter | Value
    Number of UAVs $N$, number of sensing tasks $M$, number of BSs $K$ | 3, 10, 2
    Network radius $r_{\mathrm{c}}$ | 500 m
    Channel bandwidth $W$ | 1 MHz
    BS height $H_0$ | 25 m
    UAV minimum and maximum altitude $h_{\min}, h_{\max}$ | 50 m, 100 m
    UAV maximum flight speed $v_{\max}$ | 15 m/s
    UAV maximum transmit power $P_{\max}$ | 30 dBm
    Sensing parameter $\lambda$ | 0.01
    Environment parameters $a, b$ | 9.61, 0.16
    LoS and NLoS additional path loss $\eta^{\mathrm{LoS}}, \eta^{\mathrm{NLoS}}$ | 1 dB, 20 dB
    Carrier frequency $f_{\mathrm{c}}$ | 2 GHz
    Noise power $N_0$ | -96 dBm
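
    To show how the simulation parameters above combine into a link budget, the sketch below computes a UAV-to-BS rate. It is a hedged illustration only: the paper's channel equations are not reproduced on this page, so the probabilistic LoS air-to-ground model common in the UAV literature is assumed (LoS probability $1/(1 + a\,e^{-b(\theta - a)})$ with elevation angle $\theta$ in degrees, free-space path loss plus the averaged additional loss, and the Shannon rate). All function names are hypothetical.

```python
import math

# Values taken from the simulation parameter table
A_ENV, B_ENV = 9.61, 0.16              # environment parameters a, b
ETA_LOS_DB, ETA_NLOS_DB = 1.0, 20.0    # additional path loss (dB)
F_C_HZ = 2e9                           # carrier frequency
BANDWIDTH_HZ = 1e6                     # channel bandwidth W
NOISE_DBM = -96.0                      # noise power N_0
BS_HEIGHT_M = 25.0                     # BS height H_0
LIGHT_SPEED = 299_792_458.0            # m/s


def los_probability(elevation_deg):
    """Assumed sigmoid LoS probability as a function of elevation angle."""
    return 1.0 / (1.0 + A_ENV * math.exp(-B_ENV * (elevation_deg - A_ENV)))


def mean_path_loss_db(horizontal_m, uav_height_m):
    """Free-space loss plus LoS/NLoS additional loss averaged by probability."""
    dh = uav_height_m - BS_HEIGHT_M
    dist = math.hypot(horizontal_m, dh)
    elevation = math.degrees(math.atan2(dh, horizontal_m))
    fspl = 20.0 * math.log10(4.0 * math.pi * F_C_HZ * dist / LIGHT_SPEED)
    p_los = los_probability(elevation)
    return fspl + p_los * ETA_LOS_DB + (1.0 - p_los) * ETA_NLOS_DB


def rate_bps(tx_dbm, horizontal_m, uav_height_m):
    """Shannon rate W * log2(1 + SNR) for a given transmit power in dBm."""
    snr_db = tx_dbm - mean_path_loss_db(horizontal_m, uav_height_m) - NOISE_DBM
    return BANDWIDTH_HZ * math.log2(1.0 + 10.0 ** (snr_db / 10.0))
```

    For example, `rate_bps(30.0, 100.0, 50.0)` evaluates the rate at the maximum transmit power for a UAV at the minimum altitude, 100 m (horizontally) from the BS. Under these assumptions the rate falls as the horizontal distance grows, because both the distance and the NLoS probability increase.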

    Table 2  Model hyperparameters

    Hyperparameter | Value
    Initial learning rates of the Actor and Critic networks $\alpha_{\boldsymbol{\theta}_n^t}$, $\alpha_{\boldsymbol{\omega}_n^t}$ | 0.001, 0.002
    Soft update weight $\rho$ | 0.01
    Greedy rate $\varepsilon$ | 0.1
    Activation function | ReLU
    Batch size $N_{\mathrm{b}}$ | 64
    Experience replay module size $N_{\mathrm{c}}$ | 20 000
    Initial learning rate of the DQN network | 0.01
    Update period of the DQN target network | 100
    Number of layers in the Actor and Critic networks | 4, 4
    Number of hidden-layer neurons | 128
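
    The soft update weight $\rho = 0.01$ listed above controls how slowly the target networks track the current networks. Eqs. (36) and (37) are not reproduced on this page, so the sketch below assumes the usual Polyak form $\hat{\boldsymbol{\theta}} \leftarrow \rho\,\boldsymbol{\theta} + (1 - \rho)\,\hat{\boldsymbol{\theta}}$; the function name and the flat-list parameter representation are hypothetical.

```python
def soft_update(target_params, current_params, rho=0.01):
    """Return rho * current + (1 - rho) * target, element by element.

    Assumed Polyak averaging; parameters are plain lists of floats here,
    standing in for the network weights.
    """
    return [rho * c + (1.0 - rho) * t
            for t, c in zip(target_params, current_params)]


# Usage: the target moves 1% of the way toward the current parameters per step.
target = [0.0, 1.0]
current = [1.0, 1.0]
for _ in range(100):
    target = soft_update(target, current, rho=0.01)
# After k steps the remaining gap is (1 - rho) ** k of the initial gap.
```

    With a small $\rho$ the gap shrinks geometrically by a factor of $1 - \rho$ per step, so the target values change smoothly during training.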
Publication history
  • Received: 2024-08-26
  • Revised: 2025-02-24
  • Published online: 2025-03-06
