Detailed View
- Rizwan, Muhammad;
- Choi, Jaeyoung;
- Kim, Yoonhee
Abstract
KoHPCG is a performance-portable HPCG benchmark program developed using the Kokkos programming model. The Reference HPCG benchmark is dominated by memory-bound kernels, which limit its performance across architectures. This paper details an implementation of the core HPCG kernels, namely Dot Product (DDOT), WAXPBY, Sparse Matrix-Vector Multiplication (SpMV), Symmetric Gauss-Seidel (SymGS), and Multigrid (MG), using Kokkos::View, the Kokkos parallel constructs, and the execution and memory space abstractions. Evaluated on Intel Xeon Phi (KNL) and Xeon Skylake (SKL) processors on up to 16 nodes, KoHPCG achieves speedups of up to 11.7× in MG on SKL and 17.3× in MG on KNL. Overall HPCG performance improves by as much as 3.2× on SKL and 4.1× on KNL compared to the Reference HPCG implementation, at problem sizes of 192³ on SKL and 160³ on KNL. On a larger problem size of 320³, KoHPCG attains an overall HPCG performance improvement of 5.1× on SKL. In addition to the overall HPCG improvement, we report kernel-level performance, which shows little improvement in SpMV and none in WAXPBY, as well as increased memory consumption from the Kokkos data structures, thereby highlighting further optimisation opportunities for future work.
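To make the kernel names in the abstract concrete, the following is a minimal serial C++ sketch of what DDOT, WAXPBY, and SpMV compute. It is not the KoHPCG implementation and does not use Kokkos: the `CsrMatrix` struct and the function names `ddot`, `waxpby`, and `spmv` are hypothetical names chosen for illustration, and CSR storage is an assumption. Each kernel does only one or two floating-point operations per value loaded, which is why such kernels tend to be memory-bound.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) container, for illustration only;
// this is an assumed layout, not the data structure used in the paper.
struct CsrMatrix {
    std::vector<std::size_t> row_ptr;  // size = rows + 1; start of each row
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> values;        // value of each nonzero
};

// DDOT: returns sum_i x[i] * y[i].
// This is a reduction; in Kokkos a loop of this shape would typically be
// expressed with Kokkos::parallel_reduce.
double ddot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// WAXPBY: w = alpha * x + beta * y, element by element.
// Iterations are independent; in Kokkos this shape would typically be
// expressed with Kokkos::parallel_for.
void waxpby(double alpha, const std::vector<double>& x,
            double beta, const std::vector<double>& y,
            std::vector<double>& w) {
    assert(x.size() == y.size() && w.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        w[i] = alpha * x[i] + beta * y[i];
    }
}

// SpMV: y = A * x for a CSR matrix. Rows are independent of each other,
// but the reads of x are indirect through col_idx.
void spmv(const CsrMatrix& A, const std::vector<double>& x,
          std::vector<double>& y) {
    assert(A.row_ptr.size() == y.size() + 1);
    for (std::size_t row = 0; row < y.size(); ++row) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[row]; k < A.row_ptr[row + 1]; ++k) {
            sum += A.values[k] * x[A.col_idx[k]];
        }
        y[row] = sum;
    }
}
```

As a usage example, with `x = {1, 2, 3}` and `y = {4, 5, 6}`, `ddot(x, y)` returns 32, and `waxpby(2.0, x, -1.0, y, w)` fills `w` with `{-2, -1, 0}`. SymGS and MG are omitted here because their sweeps carry data dependencies between rows and do not reduce to a single independent loop.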
Keywords
- Title: KoHPCG - High-Performance Conjugate Gradient Benchmark Program on Kokkos Performance Portability Ecosystem
- Authors: Rizwan, Muhammad; Choi, Jaeyoung; Kim, Yoonhee
- Issue Date: 2025-10
- Type: Conference paper
- Journal: Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, AICCSA