Dyna-P: placement-aware dynamic partitioning for lightweight applications with modern GPUs

Abstract

Efficient GPU resource sharing is critical in dynamic cloud-based environments, particularly for lightweight HPC applications and Small Language Models, which demand partial GPU resources for execution. However, traditional scheduling frameworks fail to address intra-GPU and inter-node resource fragmentation and dynamic placement challenges arising from the heterogeneity in each application's resource demand and job completion times. This leads to resource under-utilization and scheduling delays in GPU clusters. This paper introduces Dyna-P, a novel scheduling framework designed to dynamically adjust GPU partitions to minimize resource fragmentation while improving system throughput and Makespan. Dyna-P proposes a Reconfiguration Last Placement policy which recognizes that workloads consisting of lightweight applications can benefit more from uninterrupted execution. Experimental results demonstrate that Dyna-P improves average throughput by up to 14.7% and reduces Makespan by 39% compared to state-of-the-art methods. These findings underscore Dyna-P's potential to improve resource allocation rates in multi-tenant GPU environments.
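The abstract does not spell out the Reconfiguration Last Placement policy, so the following is only a minimal sketch of one plausible reading: reuse an idle partition if one fits, otherwise carve a new partition from unpartitioned capacity, and reconfigure idle partitions only as a last resort, so that running jobs are never interrupted. The names `Partition`, `GPU`, and `place`, the abstract "slice" unit, and the best-fit tie-breaking are all assumptions for illustration and are not taken from the paper; real partitioning schemes also impose layout constraints that this sketch ignores.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Partition:
    size: int                  # capacity in abstract GPU slices (hypothetical unit)
    job: Optional[str] = None  # None means the partition is idle


@dataclass
class GPU:
    total: int                                   # total slices on this GPU
    partitions: List[Partition] = field(default_factory=list)

    def free_slices(self) -> int:
        # Slices not yet assigned to any partition.
        return self.total - sum(p.size for p in self.partitions)


def place(gpus: List[GPU], job: str, demand: int) -> Optional[Tuple[int, str]]:
    """Return (gpu_index, action), or None if the job has to wait."""
    # 1. Reuse: smallest idle partition that fits (best fit), no reconfiguration.
    best = None
    for gi, gpu in enumerate(gpus):
        for p in gpu.partitions:
            if p.job is None and p.size >= demand:
                if best is None or p.size < best[1].size:
                    best = (gi, p)
    if best is not None:
        best[1].job = job
        return best[0], "reuse"

    # 2. Create: carve a new partition out of unpartitioned slices.
    for gi, gpu in enumerate(gpus):
        if gpu.free_slices() >= demand:
            gpu.partitions.append(Partition(demand, job))
            return gi, "create"

    # 3. Reconfigure (last resort): merge idle partitions and free slices,
    #    leaving every partition that hosts a running job untouched.
    for gi, gpu in enumerate(gpus):
        idle = sum(p.size for p in gpu.partitions if p.job is None)
        if gpu.free_slices() + idle >= demand:
            gpu.partitions = [p for p in gpu.partitions if p.job is not None]
            gpu.partitions.append(Partition(demand, job))
            return gi, "reconfigure"

    return None
```

In this sketch a job triggers reconfiguration only when neither reuse nor creation succeeds, which is one way to favor uninterrupted execution for lightweight workloads.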

Keywords

Dynamic partitioning; Spatial sharing; GPU utilization; Placement; Fragmentation
Title
Dyna-P: placement-aware dynamic partitioning for lightweight applications with modern GPUs
Authors
Adufu, Theodora; Kim, Yoonhee
DOI
10.1007/s10586-025-05284-2
Publication date
2025-08
Type
Article
Journal
Cluster Computing, vol. 28, no. 9