路徑規(guī)劃算法的高層綜合設(shè)計(jì)研究
doi: 10.11999/JEIT240210
-
1.
汕頭大學(xué)電子系 汕頭 515063
-
2.
中國(guó)科學(xué)院計(jì)算技術(shù)研究所計(jì)算機(jī)體系結(jié)構(gòu)國(guó)家重點(diǎn)實(shí)驗(yàn)室 北京 100190
Case Study of High Level Synthesis on Path Planning Algorithm
-
1.
Department of Electronics and Information Engineering, Shantou University, Shantou 515063, China
-
2.
State key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
-
摘要: 隨著機(jī)器人自動(dòng)導(dǎo)航技術(shù)的快速發(fā)展,基于軟件實(shí)現(xiàn)的路徑規(guī)劃算法在實(shí)時(shí)性上已無(wú)法滿足許多應(yīng)用場(chǎng)景的需求,這就要求對(duì)算法進(jìn)行快速高效的硬件定制,從而獲得低延時(shí)的性能加速。該文以機(jī)器人路徑規(guī)劃中的經(jīng)典A*算法為對(duì)象,通過(guò)構(gòu)建面向硬件設(shè)計(jì)的C/C++數(shù)據(jù)結(jié)構(gòu)和函數(shù)流程優(yōu)化,采用高層綜合(HLS)實(shí)現(xiàn)快速的硬件架構(gòu)探索和選取較優(yōu)的設(shè)計(jì)方案,并完成硬件FPGA綜合。實(shí)驗(yàn)數(shù)據(jù)表明,相較于傳統(tǒng)寄存器傳輸級(jí)(RTL)開發(fā)模式,基于HLS開發(fā)模式的路徑規(guī)劃算法在FPGA實(shí)現(xiàn)上在開發(fā)效率、硬件性能和資源占用率上都有顯著提升,驗(yàn)證了高層綜合在硬件定制中的可行性和成本優(yōu)勢(shì)。
-
關(guān)鍵詞:
- 機(jī)器人自動(dòng)導(dǎo)航 /
- 路徑規(guī)劃算法 /
- 高層綜合 /
- 算法硬件加速
Abstract: With the advancement of robot automatic navigation technology, software-based path planning algorithms can no longer satisfy the needs in scenarios of many real-time applications. Fast and efficient hardware customization of the algorithm is required to achieve low-latency performance acceleration. In this work, High Level Synthesis (HLS) of classic A* algorithm is studied. Hardware-oriented data structure and function optimization, varying design constraints are explored to pick the right architecture, which is then followed by FPGA synthesis. Experimental results show that, compared to the conventional Register Transfer Level (RTL) method, the HLS-based FPGA implementation of the A* algorithm can achieve better productivity, improved hardware performance and resource utilization efficiency, which demonstrates the advantages of high level synthesis in hardware customization in algorithm-centric applications. -
1 A*算法的基本流程
(1)初始化開啟列表和關(guān)閉列表為空 (2)將七點(diǎn)插入開啟列表中 (3)while(開啟列表不為空) (4) 從開啟列表中取出最小F值的格點(diǎn)作為當(dāng)前格點(diǎn),并將之加
入關(guān)閉列表(5) foreach(當(dāng)前格點(diǎn)的相鄰格點(diǎn)) (6) if(該格點(diǎn)在關(guān)閉列表中或是故障格點(diǎn)) then(跳過(guò)該格點(diǎn)) (7) elsif(該格點(diǎn)在開啟列表中) then(比較通過(guò)當(dāng)前格點(diǎn)計(jì)算
得出的F值是否更小,若更小則更新該格點(diǎn)的F值,設(shè)其
父節(jié)點(diǎn)為當(dāng)前節(jié)點(diǎn),并將之插入開啟列表)(8) elsif(該格點(diǎn)不在開啟列表中) then(通過(guò)當(dāng)前節(jié)點(diǎn)計(jì)算該
節(jié)點(diǎn)的F值,設(shè)其父親節(jié)點(diǎn)為當(dāng)前節(jié)點(diǎn),并將之插入開啟
列表)(9) elsif(該格點(diǎn)為終點(diǎn)) then(設(shè)其父節(jié)點(diǎn)為當(dāng)前節(jié)點(diǎn),并通
過(guò)追蹤父節(jié)點(diǎn)鏈條至起點(diǎn),輸出起點(diǎn)到終點(diǎn)的路徑,算
法結(jié)束)(10) end_foreach (11)end_while (12)if(開啟列表為空) then(路徑搜索失敗) 下載: 導(dǎo)出CSV
表 1 SOC FPGA 芯片 PL 資源
資源名稱 總數(shù) 邏輯單元 LUT 87 840 LUTRAM 57 600 觸發(fā)器 Flip-flop(FF) 175 680 BRAM 128 I/O 引腳 256 全局時(shí)鐘緩沖器 BUFG 352 下載: 導(dǎo)出CSV
表 2 15×15地圖下不同開啟列表優(yōu)化比較
數(shù)據(jù)結(jié)構(gòu) LUT個(gè)數(shù) FF個(gè)數(shù) BRAM個(gè)數(shù) 運(yùn)行時(shí)長(zhǎng)(ns) 優(yōu)先隊(duì)列 2 485 2 807 3.0 18 598 隊(duì)列 2 137 2 755 2.5 44 325 堆棧 2 148 2 739 2.5 38 779 下載: 導(dǎo)出CSV
表 3 使用優(yōu)先隊(duì)列數(shù)據(jù)結(jié)構(gòu)的HLS和RTL開發(fā)模式比較
FPGA開發(fā)模式 運(yùn)行時(shí)間(ns) LUT個(gè)數(shù) FF個(gè)數(shù) BRAM個(gè)數(shù) HLS 483 445 3 208 3 303 34.5 RTL 523 046 7 253 7 161 59.5 下載: 導(dǎo)出CSV
表 4 使用堆棧數(shù)據(jù)結(jié)構(gòu)的HLS和RTL開發(fā)模式比較
FPGA開發(fā)模式 運(yùn)行時(shí)間(ns) LUT個(gè)數(shù) FF個(gè)數(shù) BRAM個(gè)數(shù) HLS 3 305 830 1 669 1 523 25 RTL 5 987 348 7 313 7 065 56 下載: 導(dǎo)出CSV
表 5 使用隊(duì)列數(shù)據(jù)結(jié)構(gòu)的HLS和RTL開發(fā)模式比較
FPGA開發(fā)模式 運(yùn)行時(shí)間(ns) LUT個(gè)數(shù) FF個(gè)數(shù) BRAM個(gè)數(shù) HLS 3 389 425 1 756 1 555 25 RTL 5 598 080 7 317 7 053 56 下載: 導(dǎo)出CSV
-
[1] 鄭巖. 改進(jìn)勢(shì)場(chǎng)蟻群算法的機(jī)器人自主導(dǎo)航應(yīng)用研究[D]. [碩士論文], 重慶三峽學(xué)院, 2020. doi: 10.27883/d.cnki.gcqsx.2020.000061.ZHENG Yan. Application of improved potential field ant colony algorithm for autonomous robot navigation[D]. [Master dissertation], Chongqing Three Gorges University, 2020. doi: 10.27883/d.cnki.gcqsx.2020.000061. [2] 郭煒, 魏繼增, 郭箏, 等. SoC設(shè)計(jì)方法與實(shí)現(xiàn)[M]. 3版. 北京: 電子工業(yè)出版社, 2017: 23–24.GUO Wei, WEI Jizeng, GUO Zheng, et al. SoC Design Methodology and Implementation[M]. 3rd ed. Beijing: Publishing House of Electronics Industry, 2017: 23–24. [3] 陳志盛, 朱予涵, 劉耿耿, 等. 考慮流端口數(shù)量約束下的連續(xù)微流控生物芯片流路徑規(guī)劃算法[J]. 電子與信息學(xué)報(bào), 2023, 45(9): 3321–3330. doi: 10.11999/JEIT221168.CHEN Zhisheng, ZHU Yuhan, LIU Genggeng, et al. Flow-path planning algorithm for continuous-flow microfluidic biochips with strictly constrained flow ports[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3321–3330. doi: 10.11999/JEIT221168. [4] CORMEN T H, LEISERSON C E, RIVEST R L, 等, 殷建平, 徐云, 王剛, 等. 譯. 算法導(dǎo)論[M]. 3版. 北京: 機(jī)械工業(yè)出版社, 2013: 374–376.CORMEN T H, LEISERSON C E, RIVEST R L, et al, YIN Jianping, XU Yun, WANG Gang, et al. translation. Introduction to Algorithms[M]. 3rd ed. Beijing: China Machine Press, 2013: 374–376. [5] ABDOKASEB. A C Program to implement A* Search Algorithm[EB/OL]. https://github.com/abdokaseb/AStar-C/, 2022. [6] Mentor, A Siemens Business. HLS Bluebook[M]. Software Version v10. 5b, 2020. [7] ALTOYAN W and ALONSO J J. Investigating performance losses in high-level synthesis for stencil computations[C]. 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines, Fayetteville, USA, 2020: 195–203. doi: 10.1109/FCCM48280.2020.00034. [8] 潘妍, 程岳, 高雅濛. 面向FPGA的高層次綜合技術(shù)綜述[J]. 信息技術(shù)與信息化, 2022(3): 96–99. DOI: 10.3969/j.issn.1672-9528.2022.03.024.PAN Yan, CHENG Yue, and GAO Yameng. An overview of high-level synthesis techniques for FPGAs[J]. Information Technology and Informatization, 2022(3): 96–99. DOI: 10.3969/j.issn.1672-9528.2022.03.024. [9] 石添介, 劉飛陽(yáng), 田徑, 等. 基于高層次綜合的FPGA循環(huán)神經(jīng)網(wǎng)絡(luò)加速器設(shè)計(jì)[J]. 信息技術(shù)與信息化, 2022(1): 151–153. DOI: 10.3969/j.issn.1672-9528.2022.01.042.SHI Tianjie, LIU Feiyang, TIAN Jing, et al. Design of FPGA recurrent neural network accelerator based on high-level synthesis[J]. Information Technology and Informatization, 2022(1): 151–153. DOI: 10.3969/j.issn.1672-9528.2022.01.042. [10] 葉海雄, 陶寧蓉, 匡興紅, 等. 基于Catapult C高層次綜合工具平臺(tái)優(yōu)化運(yùn)動(dòng)檢測(cè)算法的研究[J]. 電子設(shè)計(jì)工程, 2017, 25(14): 1–4. doi: 10.14022/j.cnki.dzsjgc.2017.14.001.YE Haixiong, TAO Ningrong, KUANG Xinghong, et al. Optimization motion detection algorithm based on Catapult C high-level synthesis tool platform[J]. Electronic Design Engineering, 2017, 25(14): 1–4. doi: 10.14022/j.cnki.dzsjgc.2017.14.001. [11] 徐瑞帆, 肖有為, 羅進(jìn), 等. 高層次綜合綜述[J]. 微納電子與智能制造, 2021, 3(2): 74–79. doi: 10.19816/j.cnki.10-1594/tn.2021.02.074.XU Ruifan, XIAO Youwei, LUO Jin, et al. The overview of high-level synthesis[J]. Micro/Nano Electronics and Intelligent Manufacturing, 2021, 3(2): 74–79. doi: 10.19816/j.cnki.10-1594/tn.2021.02.074. [12] PANDA P R, SHARMA N, KURRA S, et al. Exploration of loop unroll factors in high level synthesis[C]. 2018 31st International Conference on VLSI Design and 2018 17th International Conference on Embedded Systems. Pune, India, 2018: 465–466. doi: 10.1109/VLSID.2018.115. [13] 謝曉燕, 張玉婷, 劉鎮(zhèn)弢. 高層次綜合特征檢測(cè)算法的FPGA實(shí)現(xiàn)[J]. 實(shí)驗(yàn)室研究與探索, 2018, 37(1): 93–97,117. doi: 10.3969/j.issn.1006-7167.2018.01.023.XIE Xiaoyan, ZHANG Yuting, and LIU Zhentao. FPGA implementation of feature detection algorithm based on high level synthesis[J]. Research and Exploration in Laboratory, 2018, 37(1): 93–97,117. doi: 10.3969/j.issn.1006-7167.2018.01.023. [14] 申懿鑫, 韓躍平, 唐道光. 高層次綜合的SM4算法硬件實(shí)現(xiàn)與優(yōu)化[J]. 單片機(jī)與嵌入式系統(tǒng)應(yīng)用, 2023, 23(8): 11–14.SHEN Yixin, HAN Yueping, and TANG Daoguang. hardware implementation and optimization of SM4 algorithm based on high-level synthesis[J]. Microcontrollers & Embedded Systems, 2023, 23(8): 11–14. [15] 周成瑞, 楊玲玲. 基于A星算法的全向移動(dòng)機(jī)器人仿真研究[J]. 電腦與信息技術(shù), 2023, 31(3): 29–31. doi: 10.3969/j.issn.1005-1228.2023.03.008.ZHOU Chengrui and YANG Lingling. Simulation research on omnidirectional mobile robot based on A* algorithm[J]. Computer and Information Technology, 2023, 31(3): 29–31. doi: 10.3969/j.issn.1005-1228.2023.03.008. [16] 王小麗. 基于Vivado HLS霧天圖像預(yù)處理IP核設(shè)計(jì)[J]. 電腦編程技巧與維護(hù), 2023(4): 158–161. doi: 10.16184/j.cnki.comprg.2023.04.020.WANG Xiaoli. Vivado HLS foggy sky image preprocessing IP core based design[J]. Computer Programming Skills & Maintenance, 2023(4): 158–161. doi: 10.16184/j.cnki.comprg.2023.04.020. [17] 韋蘇倫, 陶青川. 基于HLS的MobileNet加速器實(shí)現(xiàn)[J]. 現(xiàn)代計(jì)算機(jī), 2023, 29(8): 91–97. doi: 10.3969/j.issn.1007-1423.2023.08.015.WEI Sulun and TAO Qingchuan. Realization of MobileNet accelerator based on HLS[J]. Modern Computer, 2023, 29(8): 91–97. doi: 10.3969/j.issn.1007-1423.2023.08.015. -