Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting

Session: Datacenter/cloud power/performance--Managing the beast.

Authors: Dong Du (Shanghai Jiao Tong University); Tianyi Yu (Shanghai Jiao Tong University); Yubin Xia (Shanghai Jiao Tong University); Binyu Zang (Shanghai Jiao Tong University); Guanglu Yan (Ant Financial Services Group); Chenggang Qin (Ant Financial Services Group); Qixuan Wu (Ant Financial Services Group); Haibo Chen (Shanghai Jiao Tong University)

Serverless computing promises cost-efficiency and elasticity for highly productive software development. To achieve this, the serverless sandbox system must address two challenges: strong isolation between function instances, and low startup latency to ensure a good user experience. While strong isolation can be provided by virtualization-based sandboxes, initializing the sandbox and the application causes non-negligible startup overhead. Conventional sandbox systems fall short of low-latency startup because they are application-agnostic: they can only reduce the latency of sandbox initialization through hypervisor and guest kernel customization, which is inadequate and does not mitigate the majority of the startup overhead. This paper proposes Catalyzer, a serverless sandbox system design that provides both strong isolation and extremely fast function startup. Instead of booting from scratch, Catalyzer restores a virtualization-based function instance from a well-formed checkpoint image and thereby skips initialization on the critical path (init-less). Catalyzer boosts restore performance by recovering both user-level memory state and system state on demand. We also propose a new OS primitive, sfork (sandbox fork), to further reduce startup latency by directly reusing the state of a running sandbox instance. Fundamentally, Catalyzer removes the initialization cost by reusing state, which enables general optimizations for diverse serverless functions. The evaluation shows that Catalyzer reduces startup latency by orders of magnitude, achieves <1ms latency in the best case, and significantly reduces the end-to-end latency of real-world workloads. Catalyzer has been adopted by Ant Financial, and we also present lessons learned from industrial development.
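
Illustrative sketch (not from the paper): the idea of restoring state from a checkpoint image on demand can be loosely mimicked in user space with a copy-on-write memory mapping. The example below is only an analogy under stated assumptions. It is not Catalyzer's implementation and does not model sfork or system state. The "checkpoint image" is a toy file, and the names write_checkpoint, restore_eager, restore_on_demand, and demo are hypothetical. It contrasts an eager restore, which copies the whole image before the function can run, with a mapped restore, where the operating system loads pages only when they are touched and keeps writes private to the instance.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size reported by the platform


def write_checkpoint(path, num_pages):
    """Write a toy 'checkpoint image': page i is filled with the byte value i."""
    with open(path, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE)


def restore_eager(path):
    """Baseline: copy the entire image into memory before anything runs."""
    with open(path, "rb") as f:
        return bytearray(f.read())


def restore_on_demand(path):
    """Map the image copy-on-write instead of reading it up front.

    The OS brings pages in when they are first accessed, and writes
    go to private copies, so the image file itself is never modified.
    """
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_checkpoint(path, 8)
        state = restore_on_demand(path)
        from_page_3 = state[3 * PAGE]   # read one byte from page 3
        state[0] = 0xFF                 # private write, not flushed to the file
        with open(path, "rb") as f:
            on_disk = f.read(1)[0]      # the image still holds the original byte
        result = (from_page_3, state[0], on_disk)
        state.close()
        return result
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because the mapping is private, several instances could in principle map the same image and diverge independently. This is only a rough analogy to reusing state across sandbox instances.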