Prefetch Distance Control Strategy Based on Cache Behavior in Threaded Prefetching
doi: 10.11999/JEIT141429
1. (Department of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China)
2. (College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China)
3. (School of Computer Science, Beijing Institute of Technology, Beijing 100081, China)
Supported by the National Natural Science Foundation of China (61370062), the Zhengzhou Science and Technology Research Program (20130725), and the Doctoral Fund Project (2013BSJJ050)
Abstract: Most threaded data prefetching methods for pointer-intensive applications control the prefetch distance poorly. To address this, a prefetch distance control strategy based on cache behavior characteristics is proposed. The strategy builds a prefetch distance control model from the runtime data cache behavior of pointer-intensive applications, so as to avoid shared cache pollution and reduce contention for system resources. By skipping the prefetching of some data that has no loop-carried dependence, it balances the work between the helper thread and the main thread and improves the timeliness of threaded prefetching. Experimental results show that controlling the prefetch distance with this strategy further improves the performance of threaded prefetching.
Key words: Chip Multi-Processors (CMP); Threaded prefetching; Helper thread; Prefetch ratio; Prefetch distance