- Adufu, Theodora;
- Kim, Yoonhee
Web of Science: 0
Scopus: 0
Abstract
Efficient GPU resource sharing is critical in dynamic cloud-based environments, particularly for lightweight HPC applications and Small Language Models, which demand partial GPU resources for execution. However, traditional scheduling frameworks fail to address intra-GPU and inter-node resource fragmentation and dynamic placement challenges arising from the heterogeneity in each application's resource demand and job completion times. This leads to resource under-utilization and scheduling delays in GPU clusters. This paper introduces Dyna-P, a novel scheduling framework designed to dynamically adjust GPU partitions to minimize resource fragmentation while improving system throughput and Makespan. Dyna-P proposes a Reconfiguration Last Placement policy which recognizes that workloads consisting of lightweight applications can benefit more from uninterrupted execution. Experimental results demonstrate that Dyna-P improves average throughput by up to 14.7% and reduces Makespan by 39% compared to state-of-the-art methods. These findings underscore Dyna-P's potential to improve resource allocation rates in multi-tenant GPU environments.
- Title: Dyna-P: placement-aware dynamic partitioning for lightweight applications with modern GPUs
- Authors: Adufu, Theodora; Kim, Yoonhee
- Publication date: 2025-08
- Type: Article
- Volume: 28
- Issue: 9