Mahgoub, A., Yi, E., Shankar, K., Minocha, E., Elnikety, S., Bagchi, S., and Chaterji, S. (2022). WISEFUSE: Workload Characterization and DAG Transformation for Serverless Workflows. Proceedings of the ACM on Measurement and Analysis of Computing Systems, ACM.
Paper BibTeX Extended Abstract Best Paper Award
Joint work with Microsoft Research
Accepted to SIGMETRICS 2022
Students
Ashraf Mahgoub, Edgardo Barsallo Yi, Karthick Shankar, Eshaan Minocha
Abstract
We characterize production workloads of serverless DAGs at a major cloud provider.
Our analysis highlights two major factors that limit performance: (a) lack of efficient
communication methods between the serverless functions in the DAG, and (b) stragglers
when a DAG stage invokes a set of parallel functions that must complete before starting
the next DAG stage. To address these limitations, we propose WISEFUSE, an automated
approach to generate an optimized execution plan for serverless DAGs for a user-specified
latency objective or budget. We introduce three optimizations: (1) Fusion combines in-series
functions in a single VM to reduce the communication overhead between cascaded
functions. (2) Bundling executes a group of parallel invocations of a function in one VM
to improve resource sharing among the parallel workers and thereby reduce skew. (3) Resource
Allocation assigns the right VM size to each function or function bundle in the DAG to
reduce the end-to-end (E2E) latency and cost. We implement WISEFUSE and evaluate it experimentally
using three popular serverless applications with different DAG structures, memory
footprints, and intermediate data sizes. Compared to competing approaches and other
alternatives, WISEFUSE shows significant improvements in E2E latency and cost. Specifically,
for a machine learning pipeline, WISEFUSE achieves P95 latency that is 67% lower than Photons,
39% lower than Faastlane, and 90% lower than SONIC without increasing the cost.
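
The two bottlenecks named in the abstract, inter-function communication and stragglers at stage barriers, can be illustrated with a toy latency model. The sketch below is not WISEFUSE's performance model. The `Stage` class, the fixed per-edge `transfer_time`, the zero-cost transfer inside a fused VM, and the perfectly work-conserving sharing inside a bundle are all simplifying assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Stage:
    """One DAG stage (hypothetical model): run times in seconds, one per parallel invocation."""
    name: str
    worker_times: List[float]


def stage_latency(stage: Stage) -> float:
    # A stage ends only when its slowest invocation ends (the straggler effect).
    return max(stage.worker_times)


def ideal_bundled_latency(stage: Stage, cores: int) -> float:
    # Assumption: workers bundled in one VM share `cores` perfectly, so idle
    # capacity from fast workers is handed to slow ones. Real functions only
    # approximate this, so treat the result as an optimistic lower bound.
    return sum(stage.worker_times) / cores


def e2e_latency(stages: List[Stage], transfer_time: float,
                fused_edges: FrozenSet[int] = frozenset()) -> float:
    # Edge i connects stages[i] to stages[i + 1]. Assumption: a fused edge
    # passes data inside one VM at no cost; every other edge pays transfer_time.
    compute = sum(stage_latency(s) for s in stages)
    remote_edges = sum(1 for i in range(len(stages) - 1) if i not in fused_edges)
    return compute + remote_edges * transfer_time


# Made-up chain: two in-series functions, a four-way fan-out, then a final function.
pipeline = [
    Stage("extract", [1.0]),
    Stage("transform", [2.0]),
    Stage("train", [2.0, 2.0, 6.0, 2.0]),  # one straggler at 6.0 s
    Stage("merge", [1.0]),
]

baseline = e2e_latency(pipeline, transfer_time=0.5)
fused = e2e_latency(pipeline, transfer_time=0.5, fused_edges=frozenset({0}))
bundled_train = ideal_bundled_latency(pipeline[2], cores=4)

print(f"baseline E2E: {baseline} s")
print(f"E2E with extract and transform fused: {fused} s")
print(f"train stage: {stage_latency(pipeline[2])} s unbundled, "
      f"{bundled_train} s with ideal bundling")
```

In this model, fusion removes one transfer per fused edge, while bundling pulls the fan-out stage's latency from the maximum worker time toward the mean. The third optimization, resource allocation, would additionally make each run time depend on the chosen VM size, which this sketch leaves out.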