This paper addresses the data-locality-aware task assignment and scheduling problem for distributed job executions. Our goal is to minimize job completion times without prior knowledge of future job arrivals. We… Click to show full abstract
This paper addresses the data-locality-aware task assignment and scheduling problem for distributed job executions. Our goal is to minimize job completion times without prior knowledge of future job arrivals. We propose an Optimal Balanced Task Assignment algorithm (OBTA), which achieves minimal job completion times while significantly reducing computational overhead through efficient narrowing of the solution search space. To balance performance and efficiency, we extend the approximate Water-Filling (WF) algorithm, providing a rigorous proof that its approximation factor equals the number of task groups in a job. We also introduce a novel heuristic, Replica-Deletion (RD), which outperforms WF by leveraging global optimization techniques. To further enhance scheduling efficiency, we incorporate job ordering strategies based on a shortest-estimated-time-first policy, reducing average job completion times across workloads. Extensive trace-driven evaluations validate the effectiveness and scalability of the proposed algorithms.
               
Click one of the above tabs to view related content.