Cited By
Um T, Oh B, Kang M, Lee W, Kim G, Kim D, Kim Y, Muzzammil M, Jeon M, Bagchi S, Zhang Y (2024). Metis. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference, pp. 563-578. doi:10.5555/3691992.3692027. Online publication date: 10-Jul-2024.
Organizations often build separate training and inference clusters for deep learning, and use separate schedulers to manage them. This leads to problems for both: inference clusters have low utilization when the traffic load is low; training jobs often ...
This paper addresses the problem of dynamically scheduling multi-user, independent jobs on clusters, both homogeneous and heterogeneous. The dynamic behavior means that the scheduler is able to adapt the schedule when new jobs are submitted and also ...
This paper addresses the problem of minimizing the schedule length (makespan) of a batch of jobs with different arrival times. A job is described by a directed acyclic graph (DAG) of parallel tasks. The paper proposes a dynamic scheduling method that ...
Association for Computing Machinery
New York, NY, United States