Mahgoub, A., Yi, E., Shankar, K., Minocha, E., Elnikety, S., Bagchi, S., and Chaterji, S. (2022). WISEFUSE: Workload Characterization and DAG Transformation for Serverless Workflows. Proceedings of the ACM on Measurement and Analysis of Computing Systems, ACM.
Joint work with Microsoft Research
Accepted to Sigmetrics 2022
Ashraf Mahgoub, Edgardo Barsallo Yi, Karthick Shankar, Eshaan Minocha
We characterize production workloads of serverless DAGs at a major cloud provider. Our analysis highlights two major factors that limit performance: (a) lack of efficient communication methods between the serverless functions in the DAG, and (b) stragglers when a DAG stage invokes a set of parallel functions that must complete before starting the next DAG stage. To address these limitations, we propose WISEFUSE, an automated approach to generate an optimized execution plan for serverless DAGs for a user-specified latency objective or budget. We introduce three optimizations: (1) Fusion combines in-series functions together in a single VM to reduce the communication overhead between cascaded functions. (2) Bundling executes a group of parallel invocations of a function in one VM to improve resource sharing among the parallel workers to reduce skew. (3) Resource Allocation assigns the right VM size to each function or function bundle in the DAG to reduce the E2E latency and cost. We implement WISEFUSE to evaluate it experimentally using three popular serverless applications with different DAG structures, memory footprints, and intermediate data sizes. Compared to competing approaches and other alternatives, WISEFUSE shows significant improvements in E2E latency and cost. Specifically, for a machine learning pipeline, WISEFUSE achieves P95 latency that is 67% lower than Photons, 39% lower than Faastlane, and 90% lower than SONIC without increasing the cost.