Mahgoub, A., Yi, E., Shankar, K., Elnikety, S., Chaterji, S., and Bagchi, S. (2022). Orion and the Three Rights: Sizing, Bundling, and Prewarming for Serverless DAGs. OSDI 2022.
Accepted to OSDI 2022

Students
Ashraf Mahgoub, Edgardo Barsallo Yi, Karthick Shankar
Abstract
Serverless applications represented as DAGs have been growing
in popularity. For many of these applications, it would
be useful to estimate the end-to-end (E2E) latency and to
allocate resources to individual functions so as to meet
probabilistic guarantees for the E2E latency. These goals have not
been met until now due to three fundamental challenges. The
first is the high variability and correlation in the execution
time of individual functions, the second is the skew in execution
times of the parallel invocations, and the third is the
incidence of cold starts. In this paper, we introduce ORION
to achieve these goals. We first analyze traces from a production
FaaS infrastructure to identify three characteristics
of serverless DAGs. We use these to motivate and design
three features. The first is a performance model that accounts
for runtime variabilities and dependencies among functions
in a DAG. The second is a method for co-locating multiple
parallel invocations within a single VM, thus mitigating
content-based skew among these invocations. The third is a
method for pre-warming VMs for subsequent functions in
a DAG with the right look-ahead time. We integrate these
three innovations and evaluate ORION on AWS Lambda with
three serverless DAG applications. Our evaluation shows that
compared to three competing approaches, ORION achieves
up to 90% lower P95 latency without increasing $ cost, or up
to 53% lower $ cost without increasing tail latency.
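
The effect of skew among parallel invocations on tail latency can be illustrated with a small simulation. This is a minimal sketch and not ORION's performance model: the three-stage DAG shape, the log-normal latency distribution and its parameters, and the names `percentile` and `simulate_e2e` are all assumptions made for illustration. It also assumes function latencies are independent, whereas the abstract notes that real execution times are correlated.

```python
import math
import random


def percentile(samples, q):
    """Return the q-th percentile (0 < q <= 100) by the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def simulate_e2e(fan_out, trials=5000, seed=0):
    """Sample E2E latencies of a hypothetical DAG: one function, then
    `fan_out` parallel invocations, then one final function.

    Latencies are drawn from an assumed log-normal distribution.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(trials):
        first = rng.lognormvariate(0.0, 0.5)
        # The parallel stage finishes only when its slowest worker does,
        # so its latency is the maximum over all invocations.
        parallel = max(rng.lognormvariate(0.0, 0.5) for _ in range(fan_out))
        last = rng.lognormvariate(0.0, 0.5)
        latencies.append(first + parallel + last)
    return latencies


if __name__ == "__main__":
    for fan_out in (1, 8, 64):
        p95 = percentile(simulate_e2e(fan_out), 95)
        print(f"fan-out {fan_out:>2}: P95 E2E latency = {p95:.2f}")
```

Because the parallel stage waits for its slowest invocation, the P95 of the E2E latency grows with the fan-out even though each individual function is unchanged, which is why skew has to be handled explicitly when sizing resources against a tail-latency target.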