Lovelock: Towards Smart NIC-hosted Clusters

Seo Jin Park (University of Southern California); Ramesh Govindan (University of Southern California); Kai Shen (Google); David Culler (Google); Fatma Ozcan (Google); Geon-Woo Kim (The University of Texas at Austin); Hank Levy (Google and University of Washington)

Abstract

Cluster designs were originally server-centric and have recently evolved to support hardware acceleration and storage disaggregation. In applications that leverage acceleration, the server CPU orchestrates computation and data movement. Data-intensive applications that leverage disaggregation can be adversely affected by the increased PCIe and network bandwidth that disaggregation requires. In this paper, we advocate for a specialized cluster design for important data-intensive applications, such as analytics, query processing, and ML training. This design, Lovelock, replaces servers with one or more headless smart NICs. Because smart NICs can be significantly cheaper than servers, the resulting cluster can run these applications without adversely impacting performance, while obtaining cost and energy savings.