Machine Learning

FaaS can support machine learning in both training and inference applications. Projects that focus on training include MLLess [1] and LambdaML [2]. Cirrus [3] and Stratum [4] address end-to-end machine learning workflows, which include both training and inference. Inference-focused projects demonstrate serving deep learning models [5] and automatic model partitioning for cost optimality and SLO compliance [6].
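To make the inference case concrete, the sketch below shows the typical shape of a serverless inference function: the model is loaded once at cold start and cached across warm invocations, and each invocation parses a request, runs a prediction, and returns a JSON response. This is a minimal illustration, not the design of any project cited here; the `handler` entry point, the toy linear model, and the request format are all assumptions made for the example (real systems such as [5] load a trained deep learning model from object storage instead).

```python
import json

def load_model():
    # Stand-in for loading a trained model from object storage at cold start.
    # Here a toy linear model: y = 2x + 1.
    return {"weight": 2.0, "bias": 1.0}

# Loaded once per container; reused across warm invocations.
MODEL = load_model()

def handler(event, context=None):
    """FaaS-style entry point: parse the request, run inference, return JSON."""
    x = float(json.loads(event["body"])["x"])
    y = MODEL["weight"] * x + MODEL["bias"]
    return {"statusCode": 200, "body": json.dumps({"prediction": y})}
```

The module-level `MODEL` illustrates why cold starts matter for serverless inference: the expensive model load is paid only on the first invocation of a container, while warm invocations pay only the per-request cost.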

GPUs and other accelerators [7] are commonplace in machine learning but are not presently supported by commercial FaaS offerings. Research that addresses this shortcoming includes work on efficient GPU sharing for serverless workflows [8]. Another project, PyPlover [9], is a framework that allows GPU code to be deployed directly to a FaaS environment.


  • [1] Marc Sánchez-Artigas and Pablo Gimeno Sarroca. 2021. Experience Paper: Towards Enhancing Cost Efficiency in Serverless Machine Learning Training. In Proceedings of the 22nd International Middleware Conference, 210–222.
  • [2] Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, and Ce Zhang. 2021. Towards Demystifying Serverless Machine Learning Training. In Proceedings of the 2021 International Conference on Management of Data, 857–871.
  • [3] Joao Carreira, Pedro Fonseca, Alexey Tumanov, Andrew Zhang, and Randy Katz. 2019. Cirrus: A Serverless Framework for End-to-End ML Workflows. In Proceedings of the ACM Symposium on Cloud Computing, 13–24.
  • [4] Anirban Bhattacharjee, Yogesh Barve, Shweta Khare, Shunxing Bao, Aniruddha Gokhale, and Thomas Damiano. 2019. Stratum: A Serverless Framework for the Lifecycle Management of Machine Learning-Based Data Analytics Tasks. In 2019 USENIX Conference on Operational Machine Learning (OpML 19), 59–61.
  • [5] Vatche Ishakian, Vinod Muthusamy, and Aleksander Slominski. 2018. Serving Deep Learning Models in a Serverless Platform. In 2018 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 257–262.
  • [6] Minchen Yu, Zhifeng Jiang, Hok Chun Ng, Wei Wang, Ruichuan Chen, and Bo Li. 2021. Gillis: Serving Large Neural Networks in Serverless Functions With Automatic Model Partitioning. In 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), IEEE, 138–148.
  • [7] Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, et al. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture, 1–12.
  • [8] Klaus Satzke, Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Andre Beck, Paarijaat Aditya, Manohar Vanga, and Volker Hilt. 2020. Efficient GPU Sharing for Serverless Workflows. In Proceedings of the 1st Workshop on High Performance Serverless Computing, 17–24.
  • [9] Ryan Yang, Nathan Pemberton, Jichan Chung, Randy H. Katz, and Joseph Gonzalez. 2020. PyPlover: A System for GPU-enabled Serverless Instances. Technical report, University of California, Berkeley.