LLVM-IR Instruction Latency Estimation Using Deep Neural Networks for a Software–Hardware Interface for Multi-Many-Cores
DOI:
https://doi.org/10.24297/ijct.v23i.9472
Keywords:
neural network, estimation, multicore, embedded system, SHIM
Abstract
This study presents a method for estimating the latency of each LLVM-IR instruction to enable effective parallelization in model-based development. Recent embedded systems, such as in-vehicle electronic control systems, use multi-many-core processors as hardware and model-based development for software. When designing such systems, both the degree of software parallelism and the accuracy of performance estimation in the early stages of model-based development can be improved by estimating the performance of the blocks in a model and using those estimates for parallelization. Research is therefore under way on a software performance estimation technique that uses the IEEE 2804-2019 hardware feature description, the Software-Hardware Interface for Multi-many-core (SHIM). In SHIM, each LLVM-IR instruction is associated with a number of execution cycles on the target processor. However, a given LLVM-IR instruction can be compiled into several different assembly instruction sequences for the target processor, so its number of execution cycles is not easy to estimate. In this study, we propose a method that uses deep neural networks to estimate the execution cycles of each LLVM-IR instruction. In experiments on the Raspberry Pi 3 Model B+, our method estimates LLVM-IR instruction latency more accurately than previous methods.
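As an illustration of how per-instruction latencies could feed block-level performance estimation, the following minimal sketch sums a latency per opcode over the LLVM-IR opcodes of one block. It is not the authors' implementation: the table `LATENCY_CYCLES`, its values, and the helper `estimate_block_cycles` are hypothetical and chosen only for the example. In the setting described above, such a table would be populated from measured or estimated cycle counts.

```python
from collections import Counter

# Hypothetical per-opcode latencies in cycles. Illustrative values only:
# they are not measurements from the paper or entries from a real SHIM file.
LATENCY_CYCLES = {"load": 3.0, "store": 2.0, "add": 1.0, "mul": 2.0, "br": 1.0}


def estimate_block_cycles(opcodes, latency_table):
    """Estimate a block's cost as the sum of latency * count per opcode."""
    counts = Counter(opcodes)
    # Fail loudly rather than silently treating unknown opcodes as free.
    missing = sorted(op for op in counts if op not in latency_table)
    if missing:
        raise KeyError(f"no latency entry for opcodes: {missing}")
    return sum(latency_table[op] * n for op, n in counts.items())


# Opcode sequence of one hypothetical block.
block = ["load", "load", "mul", "add", "store", "br"]
total = estimate_block_cycles(block, LATENCY_CYCLES)
print(total)
```

A simple additive model like this ignores pipelining and cache effects; it is meant only to show where a per-instruction latency estimate would be consumed.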
License
Copyright (c) 2023 Hiro Mikami, Seira Iwai, Masato Edahiro
This work is licensed under a Creative Commons Attribution 4.0 International License.