
Analyzing Data Locality on GPU Caches using Static Profiling of Workloads

Full metadata record
DC Field Value Language
dc.contributor.author Kim, Jieun -
dc.contributor.author Eom, Hyeonsang -
dc.contributor.author Kim, Yoonhee -
dc.date.accessioned 2023-12-13T01:56:26Z -
dc.date.available 2023-12-13T01:56:26Z -
dc.date.issued 2023-08 -
dc.identifier.issn 2169-3536 -
dc.identifier.uri https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/159033 -
dc.description.abstract The diversity of workloads drives studies on using GPUs more effectively despite their limited memory. In particular, it is essential to understand and exploit the data locality of workloads in order to make efficient use of GPU memory and caches, which are relatively smaller than a CPU's. Efficient use in a multi-threaded environment also requires understanding the GPU memory hierarchy. Although previous approaches have analyzed data locality on GPUs, they focused on the global memory and L2 cache levels, with profiling at the thread-block level. Data locality at the warp level has received little study. In particular, the concept of coalescing has been defined, but a method for measuring the degree of coalescing has not been discussed. In this paper, we analyze data locality at the L1 cache level, the smallest but fastest cache level, to assess the impact of data locality. To do so, we profile data locality at the warp level, the smallest unit among GPU thread groups. We define the degree of coalescing alongside static profiling of data locality and provide an estimate of refined locality from profiling of L1 cache data access patterns. As a proof of concept, the estimates produced by the proposed method are evaluated by comparing them against the performance of diverse real-world GPU benchmarks such as Rodinia and PolyBench. In the experiments, the locality metrics with coalescing showed a meaningful correlation with cache utilization for performance enhancement. -
dc.format.extent 9 -
dc.language English -
dc.language.iso ENG -
dc.publisher Institute of Electrical and Electronics Engineers Inc. -
dc.title Analyzing Data Locality on GPU Caches using Static Profiling of Workloads -
dc.type Article -
dc.publisher.location United States -
dc.identifier.doi 10.1109/ACCESS.2023.3307315 -
dc.identifier.scopusid 2-s2.0-85168756471 -
dc.identifier.wosid 001081653200001 -
dc.identifier.bibliographicCitation IEEE Access, v.11, pp 95939 - 95947 -
dc.citation.title IEEE Access -
dc.citation.volume 11 -
dc.citation.startPage 95939 -
dc.citation.endPage 95947 -
dc.type.docType Article -
dc.description.isOpenAccess Y -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.relation.journalResearchArea Computer Science -
dc.relation.journalResearchArea Engineering -
dc.relation.journalResearchArea Telecommunications -
dc.relation.journalWebOfScienceCategory Computer Science, Information Systems -
dc.relation.journalWebOfScienceCategory Engineering, Electrical & Electronic -
dc.relation.journalWebOfScienceCategory Telecommunications -
dc.subject.keywordAuthor Codes -
dc.subject.keywordAuthor Correlation -
dc.subject.keywordAuthor Data locality -
dc.subject.keywordAuthor Estimation -
dc.subject.keywordAuthor GPGPU workload analysis -
dc.subject.keywordAuthor GPU cache -
dc.subject.keywordAuthor GPU profiling -
dc.subject.keywordAuthor Graphics processing units -
dc.subject.keywordAuthor Instruction sets -
dc.subject.keywordAuthor Memory management -
dc.subject.keywordAuthor Message systems -
dc.subject.keywordAuthor PTX code -
dc.identifier.url https://ieeexplore.ieee.org/document/10225495 -
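
Illustrative note on the abstract's "degree of coalescing": the record does not give the paper's definition, so the following is only a minimal sketch under stated assumptions, not the authors' metric. It assumes a 32-thread warp, a 128-byte L1 cache line, and naturally aligned accesses whose size divides the line size. The function names (`lines_touched`, `coalescing_degree`) and the ratio itself are hypothetical, chosen only to show how a warp-level access pattern can be reduced to one number.

```python
from typing import Iterable, List

WARP_SIZE = 32     # threads per warp (assumption for this sketch)
LINE_BYTES = 128   # assumed L1 cache line size; real hardware may differ


def lines_touched(addresses: Iterable[int], line_bytes: int = LINE_BYTES) -> int:
    """Count the distinct cache lines covered by one warp-wide access."""
    return len({addr // line_bytes for addr in addresses})


def coalescing_degree(addresses: Iterable[int],
                      access_bytes: int = 4,
                      line_bytes: int = LINE_BYTES) -> float:
    """Hypothetical ratio in (0, 1]: fewest lines possible / lines touched.

    Assumes every address is a multiple of access_bytes and that
    access_bytes divides line_bytes, so no element straddles two lines.
    """
    addrs: List[int] = list(addresses)
    if not addrs:
        raise ValueError("need at least one address")
    # Threads reading the same element share it, so count distinct elements.
    distinct_bytes = len(set(addrs)) * access_bytes
    ideal_lines = (distinct_bytes + line_bytes - 1) // line_bytes  # ceiling
    return ideal_lines / lines_touched(addrs, line_bytes)


# Byte addresses issued by the 32 threads of one warp for three patterns.
unit_stride = [4 * t for t in range(WARP_SIZE)]    # a[t], 4-byte elements
stride_two = [8 * t for t in range(WARP_SIZE)]     # a[2 * t]
line_stride = [128 * t for t in range(WARP_SIZE)]  # one line per thread

print(coalescing_degree(unit_stride))
print(coalescing_degree(stride_two))
print(coalescing_degree(line_stride))
```

Under these assumptions, a unit-stride access scores highest because the whole warp is served from a single line, while larger strides spread the same warp across more lines and score lower.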
Appears in Collections
College of Engineering > Division of Software > 1. Journal Articles