High-Performance Serverless Computing
Re-architecting serverless platforms with lightweight data planes, elastic control planes, and secure runtimes
Serverless computing promises an event-driven and resource-efficient way to build cloud applications, where developers only write application logic while the platform dynamically manages execution, scaling, and resource allocation. However, existing serverless platforms often inherit heavyweight components from container orchestration systems, including kernel-based networking, sidecar proxies, message brokers, and reactive autoscalers. These components introduce substantial overheads in function chains, increase tail latency, waste CPU resources, and make it difficult to support latency-sensitive and resource-constrained environments such as edge clouds and large-scale distributed ML workloads.
This project explores how to redesign serverless systems across both the data plane and the control plane. The central goal is to make serverless computing truly lightweight, high-performance, elastic, and secure, while preserving its programmability and operational benefits.
Research Overview
My work on serverless computing started from understanding the networking overheads in Kubernetes-based cloud platforms (Qi et al., 2020; Qi et al., 2021). We first characterized the performance of different Container Network Interface (CNI) plugins, showing how network models, overlay tunneling, iptables rules, eBPF processing, and host networking stack interactions affect throughput, latency, scalability, and pod startup latency. This measurement-driven study provided a foundation for understanding why existing cloud-native datapaths can become a major bottleneck for microservices and serverless workloads.
Building on this understanding, we designed a series of serverless platforms that optimize different aspects of the system stack:
- Mu (Mittal et al., 2021) focuses on the serverless control plane for resource-constrained edge clouds. It integrates SLO-aware autoscaling, load-aware request routing, and fairness-aware placement into Knative. Mu uses lightweight workload prediction and piggybacked runtime metrics to proactively scale functions, improve tail latency, reduce resource consumption, and ensure fairness under limited edge resources.
- SPRIGHT (Qi et al., 2022; Qi et al., 2024) focuses on the serverless data plane for function chains. It replaces heavyweight sidecar proxies and repeated kernel networking with an event-driven, eBPF-based shared-memory datapath. SPRIGHT enables direct function routing, zero-copy communication within a function chain, lightweight protocol adaptation, and load-proportional resource usage. This design significantly improves throughput and latency while reducing CPU consumption compared with Knative.
- LIFL (Qi et al., 2024) extends these ideas to federated learning aggregation, where model updates are large, clients are dynamic, and aggregation must be both elastic and efficient. LIFL uses shared memory, eBPF-based sidecars, in-place message queuing, locality-aware placement, hierarchy-aware autoscaling, and aggregator reuse to support scalable serverless FL aggregation with lower CPU cost and faster time-to-accuracy.
- SURE (Parola et al., 2024) revisits serverless runtime design through unikernels. It combines fast-startup unikernel-based function execution with a secure high-performance datapath. SURE uses distributed zero-copy communication, a library-based sidecar, a zero-copy TCP/IP stack, and MPK-based memory protection to provide both efficiency and isolation. This work explores how serverless platforms can achieve rapid startup, high throughput, low latency, and stronger isolation at the same time.
Together, these systems form a coherent research direction: re-architecting serverless platforms by removing unnecessary kernel, networking, sidecar, and orchestration overheads, while adding principled support for elasticity, locality, fairness, and isolation.
Key Ideas
Lightweight and Load-Proportional Data Planes
A recurring theme in this project is that serverless datapaths should consume resources only when useful work arrives. Existing platforms often rely on always-running sidecars, message brokers, and kernel networking paths. In contrast, our designs use shared memory, eBPF, event-driven processing, and library-based sidecars to make communication between functions more direct and efficient.
This enables serverless function chains to avoid repeated protocol processing, serialization/deserialization, context switches, interrupts, and data copies.
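The idea of passing a small descriptor instead of the payload can be illustrated with a minimal Python sketch. It is not taken from SPRIGHT, LIFL, or SURE: the `SharedPool` and `Descriptor` names, the fixed-slot layout, and the two example functions are hypothetical, and a plain function call stands in for the eBPF-based, event-driven descriptor delivery described above. The payload is written into shared memory once at ingress, each function in the chain works on it in place, and a copy is made only at egress.

```python
from collections import deque
from dataclasses import dataclass
from multiprocessing import shared_memory
from typing import Tuple


@dataclass(frozen=True)
class Descriptor:
    """Small handle passed between functions instead of the payload itself."""
    offset: int
    length: int


class SharedPool:
    """Shared-memory region split into equal slots (hypothetical layout)."""

    def __init__(self, slots: int = 4, slot_size: int = 4096):
        self.slot_size = slot_size
        self.shm = shared_memory.SharedMemory(create=True, size=slots * slot_size)
        self.free = deque(range(slots))

    def put(self, payload: bytes) -> Descriptor:
        # The payload enters shared memory once, at the ingress of the chain.
        if len(payload) > self.slot_size:
            raise ValueError("payload larger than one slot")
        offset = self.free.popleft() * self.slot_size
        self.shm.buf[offset:offset + len(payload)] = payload
        return Descriptor(offset, len(payload))

    def view(self, d: Descriptor) -> memoryview:
        # Slicing a memoryview references the same buffer; no bytes are copied.
        return self.shm.buf[d.offset:d.offset + d.length]

    def release(self, d: Descriptor) -> None:
        self.free.append(d.offset // self.slot_size)

    def close(self) -> None:
        self.shm.close()
        self.shm.unlink()


def to_upper(pool: SharedPool, d: Descriptor) -> Descriptor:
    """First function in the chain: upper-cases ASCII letters in place."""
    with pool.view(d) as v:
        for i in range(len(v)):
            if 97 <= v[i] <= 122:
                v[i] = v[i] - 32
    return d


def checksum(pool: SharedPool, d: Descriptor) -> int:
    """Second function: reads the bytes the first function left in the slot."""
    with pool.view(d) as v:
        return sum(v) % 256


def run_chain(payload: bytes) -> Tuple[bytes, int]:
    pool = SharedPool()
    try:
        d = to_upper(pool, pool.put(payload))
        total = checksum(pool, d)
        with pool.view(d) as v:
            result = bytes(v)  # copy out only at the egress of the chain
        pool.release(d)
        return result, total
    finally:
        pool.close()
```

Calling `run_chain(b"hello")` runs both functions over the same slot; only the 16-byte-scale descriptor moves between them, which is the property that removes per-hop copies and serialization in a real datapath.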
Control Planes for Elasticity, Locality, and Fairness
Serverless platforms must make fast and accurate control decisions: how many function instances to run, where to place them, and how to route traffic. This is especially important in edge clouds and distributed ML workloads, where resources are limited and demand changes over time.
Our work designs control-plane mechanisms that are aware of SLOs, workload dynamics, resource heterogeneity, function-chain structure, communication locality, and fairness among competing functions.
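As a rough illustration of the first control decision (how many instances to run) and of fairness under limited capacity, the sketch below forecasts load, sizes each function, and then divides a fixed replica budget among competing functions. It is an assumption-laden sketch, not Mu's or LIFL's algorithm: the exponentially weighted moving average, the `headroom` parameter, the per-replica capacity figure, and the round-robin water-filling allocator are all stand-ins chosen for brevity.

```python
import math
from typing import Dict, List


def ewma_forecast(samples: List[float], alpha: float = 0.5) -> float:
    """Exponentially weighted moving average of observed request rates."""
    estimate = samples[0]
    for s in samples[1:]:
        estimate = alpha * s + (1 - alpha) * estimate
    return estimate


def desired_replicas(forecast_rps: float, per_replica_rps: float,
                     headroom: float = 0.2) -> int:
    """Replicas needed to serve the forecast load plus a safety margin.

    per_replica_rps is the (assumed, measured offline) rate one replica
    can sustain while still meeting its latency SLO.
    """
    if forecast_rps <= 0:
        return 0
    return math.ceil(forecast_rps * (1 + headroom) / per_replica_rps)


def max_min_fair(demands: Dict[str, int], capacity: int) -> Dict[str, int]:
    """Grant replicas one at a time, round-robin, to unsatisfied functions.

    Functions with small demands are fully satisfied first; the rest split
    what remains as evenly as possible (integer max-min fairness).
    """
    grant = {name: 0 for name in demands}
    remaining = capacity
    while remaining > 0:
        pending = [n for n in sorted(demands) if grant[n] < demands[n]]
        if not pending:
            break
        for name in pending:
            if remaining == 0:
                break
            grant[name] += 1
            remaining -= 1
    return grant
```

For example, with demands of 5, 2, and 1 replicas and a budget of 6, `max_min_fair` fully serves the two smaller functions and gives the remaining 3 replicas to the largest one, rather than letting the largest demand starve the others.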
Secure High-Performance Serverless Runtime
High performance alone is insufficient for multi-tenant serverless clouds. SURE explores how to combine zero-copy shared memory and unikernel-based execution with fine-grained memory protection. By using MPK-based call gates and protecting trusted runtime components, SURE shows how serverless systems can provide both efficient communication and stronger isolation.
References
2024
2022
2021
- IEEE TNSM: Assessing Container Network Interface Plugins: Functionality, Performance, and Scalability
2020
- LANMAN 20: Understanding Container Network Interface Plugins: Design Considerations and Performance