Network-accelerated Scheduling for Large Clusters
We explore a novel design approach for accelerating schedulers for large-scale clusters. Our approach follows a centralized design and leverages recent programmable switches to accelerate scheduling operations. We demonstrate the feasibility and benefits of this approach by building two schedulers: one for data analytics workloads and one for key-value stores. First, we present a scheduler designed for low-latency data analytics workloads. The proposed scheduler receives job descriptions, maintains a task queue in switch memory, and schedules tasks on the next available worker at line rate. The core of this design is a novel pipeline-based scheduling logic that can schedule tasks at line rate. Our prototype evaluation on a cluster with a Barefoot Tofino switch shows that the proposed approach can reduce scheduling overhead by an order of magnitude compared to state-of-the-art schedulers. Second, we present a network-accelerated scheduler for linearizable key-value stores. The proposed design exploits programmable switches to keep track of write requests and responses, and to identify where the latest version of each object is stored. Our prototype evaluation shows that the proposed design achieves up to 42% higher throughput and 35-97% lower latency than current state-of-the-art approaches.
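The in-switch scheduling logic described above pairs queued tasks with the next available worker. A minimal sketch of that matching logic, in Python rather than the switch's actual P4 pipeline, might look as follows; the class and method names are illustrative assumptions, not the thesis's implementation:

```python
from collections import deque

class SwitchScheduler:
    """Illustrative model of the in-switch matching logic: tasks queue
    up until a worker reports itself idle, and idle workers queue up
    until a task arrives, so each event is handled in O(1) (the
    software analogue of handling one packet at line rate)."""

    def __init__(self):
        self.task_queue = deque()    # tasks buffered in "switch memory"
        self.idle_workers = deque()  # workers waiting for a task

    def submit_task(self, task):
        # A task arrives: dispatch immediately if a worker is idle,
        # otherwise buffer the task.
        if self.idle_workers:
            return (task, self.idle_workers.popleft())
        self.task_queue.append(task)
        return None

    def worker_idle(self, worker):
        # Symmetric case: a worker becomes free.
        if self.task_queue:
            return (self.task_queue.popleft(), worker)
        self.idle_workers.append(worker)
        return None
```

In the real design this state lives in switch registers and the matching happens inside the packet-processing pipeline; the sketch only conveys the queue-pairing idea.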
Ibrahim Kettaneh (2020). Network-accelerated Scheduling for Large Clusters. UWSpace. http://hdl.handle.net/10012/15812