Toward Interference-aware GPU Container Co-scheduling Learning from Application Profiles
- Authors
- Kim, Sejin; Kim, Yoonhee
- Issue Date
- Aug-2020
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- co-execution; Co-scheML scheduler; GPU applications; GPU utilization; interference; resource contention
- Citation
- Proceedings - 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020, pp. 19-23
- Pages
- 5
- Journal Title
- Proceedings - 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, ACSOS-C 2020
- Start Page
- 19
- End Page
- 23
- URI
- https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/1228
- DOI
- 10.1109/ACSOS-C51401.2020.00023
- Abstract
- Operating Graphics Processing Unit (GPU) applications efficiently and improving overall system throughput in a GPU cluster environment remain open problems. Because a conventional cluster orchestration platform runs only a single application per GPU, the platform may leave GPU resources underutilized depending on application characteristics. Co-executing GPU applications, however, introduces interference caused by resource contention among the applications. If the diverse resource usage of GPU applications is not taken into account, computing resources can be used unevenly and overall performance in a GPU cluster can degrade. This study proposes an interference-aware architecture, Co-scheML, and evaluates it through case studies on GPU workloads such as High Performance Computing (HPC), Deep Learning (DL) training, and DL inference. Diverse resource usage is profiled to identify the varying degrees of interference among applications. Because interference is difficult to predict directly from these characteristics, an interference model is built by applying a Machine Learning (ML) model to defined GPU metrics. The proposed architecture predicts interference and deploys an application to be co-executed with a running application. Experimental results of the case studies show that Co-scheML improves average job completion time by 22% and shortens makespan by 32% on average, compared to baseline schedulers. © 2020 IEEE.
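- The abstract describes training an ML model on profiled GPU metrics to predict co-execution interference and then choosing which waiting application to place next to a running one. The following is a minimal sketch of that idea, not the paper's Co-scheML implementation: the metric names (`sm_util`, `mem_util`, `pcie_tx`), the sample slowdown values, and the random-forest choice are illustrative assumptions.

```python
# Illustrative sketch of an interference-aware co-scheduling step.
# Assumption: each application has been profiled solo, and slowdowns of a few
# co-executed pairs have been measured to serve as training labels.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical per-application GPU profiles gathered during solo runs.
profiles = {
    "hpc_sim":      {"sm_util": 0.85, "mem_util": 0.30, "pcie_tx": 0.10},
    "dl_training":  {"sm_util": 0.70, "mem_util": 0.80, "pcie_tx": 0.40},
    "dl_inference": {"sm_util": 0.35, "mem_util": 0.25, "pcie_tx": 0.20},
}

def pair_features(a, b):
    """Concatenate the two applications' profiled metrics into one feature row."""
    keys = sorted(profiles[a])
    return [profiles[a][k] for k in keys] + [profiles[b][k] for k in keys]

# Hypothetical training data: measured completion-time ratio vs. solo execution.
pairs = [("hpc_sim", "dl_training"),
         ("hpc_sim", "dl_inference"),
         ("dl_training", "dl_inference")]
measured_slowdown = [1.45, 1.10, 1.25]

X = np.array([pair_features(a, b) for a, b in pairs])
y = np.array(measured_slowdown)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def best_coschedule(running_app, candidates):
    """Return the waiting application with the lowest predicted interference."""
    preds = {c: model.predict([pair_features(running_app, c)])[0] for c in candidates}
    return min(preds, key=preds.get)

# Example: pick the best partner for a GPU currently running DL training.
print(best_coschedule("dl_training", ["hpc_sim", "dl_inference"]))
```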
- Appears in Collections
- College of Engineering > Division of Software > 1. Journal Articles