Coarse Grained Reconfigurable Architecture Loop Mapping Algorithm Based on Memory Partitioning and Path Reuse

ZHANG Xingming; YUAN Kaijian; GAO Yanzhao

doi:10.11999/JEIT170748

Publication history
- Received: 2017-07-21
- Revised: 2017-12-18
- Published: 2018-06-19

Funds:

The National Science and Technology Major Project (2016ZX01012101), the National Natural Science Foundation of China (61572520, 61521003)

Abstract: Current research on loop mapping for Coarse Grained Reconfigurable Architectures (CGRAs) focuses mainly on operation placement and temporary-data routing and seldom considers data mapping. To address this gap, this paper proposes a modulo scheduling mapping flow based on memory partitioning and path reuse. First, fine-grained memory partitioning finds a suitable data mapping that increases the parallelism of data access. Second, modulo scheduling searches for the operation placement and the temporary-data routing. Finally, a routing cost model balances the use of memory routing against processing unit routing, and a path reuse strategy is introduced to optimize routing resources. Experimental results show that the method performs well in terms of loop initiation interval, instructions per cycle, and execution latency.
- Coarse Grained Reconfigurable Architecture (CGRA)
- Loop mapping
- Memory partitioning
- Path reuse
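
The abstract names memory partitioning and modulo scheduling but does not give their details, so the following is only a minimal sketch of two generic, textbook ideas behind them, not the paper's algorithm. It assumes simple cyclic partitioning (bank = address mod N) and one access port per bank. The function names, the access offsets, and the operation counts are all hypothetical and chosen purely for illustration. The routing cost model and the path reuse strategy are not specified in the abstract and are not modelled here.

```python
from math import ceil


def min_conflict_free_banks(offsets, max_banks=64):
    """Smallest bank count N for which the accesses a[i + d], d in offsets,
    land in pairwise distinct banks under cyclic partitioning
    (bank = address % N). Shifting every address by i preserves
    distinctness modulo N, so checking the offsets alone is enough."""
    distinct = sorted(set(offsets))
    for n in range(max(1, len(distinct)), max_banks + 1):
        if len({d % n for d in distinct}) == len(distinct):
            return n
    return None  # no conflict-free cyclic partition within max_banks


def resource_mii(num_ops, num_pes, num_mem_ops, mem_ports):
    """Resource-constrained lower bound on the initiation interval (II):
    neither the processing units nor the memory ports may be
    oversubscribed within one II window."""
    return max(ceil(num_ops / num_pes), ceil(num_mem_ops / mem_ports))


# Hypothetical loop body that reads a[i-1], a[i], a[i+1] and a[i+4].
offsets = [-1, 0, 1, 4]
banks = min_conflict_free_banks(offsets)

# Assumed sizes: 12 operations, 16 processing units, 4 loads per iteration.
ii_partitioned = resource_mii(num_ops=12, num_pes=16,
                              num_mem_ops=len(offsets), mem_ports=banks)
ii_single_bank = resource_mii(num_ops=12, num_pes=16,
                              num_mem_ops=len(offsets), mem_ports=1)
print(banks, ii_partitioned, ii_single_bank)
```

In this toy setting a single-ported memory limits the loop to one load per cycle, so the four loads alone force a larger initiation interval. Once the data is spread over enough banks, the loads can issue in parallel and the bound is set by the processing units instead. This is the kind of interaction between data mapping and scheduling that the abstract points to.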
