University of Surrey


Serving Machine Learning Workloads in Resource Constrained Environments - a Serverless Deployment Example

Christidis, Angelos, Davies, Roy and Moschoyiannis, Sotiris (2019) Serving Machine Learning Workloads in Resource Constrained Environments - a Serverless Deployment Example. In: IEEE SOCA 2019 (The 12th IEEE International Conference on Service Oriented Computing and Applications), 2019-11-18 to 2019-11-21, Kaohsiung, Taiwan.

IEEE SOCA 2019 Paper (Final Version).pdf - Accepted version Manuscript


Abstract

Deployed AI platforms typically ship with bulky system architectures, which present bottlenecks and a high risk of failure. A serverless deployment can mitigate these factors and provide a cost-effective, elastic, real-time, on-demand AI solution that scales up or down automatically. However, deploying high-complexity production workloads into serverless environments is far from trivial because of constraints such as a minimal allowance for physical codebase size, a low amount of runtime memory, lack of GPU support, and a maximum runtime before termination via timeout. In this paper we propose a set of optimization techniques and show how they transform a codebase that was previously incompatible with a serverless deployment into one that can be deployed successfully in a serverless environment, without compromising capability or performance. The techniques are illustrated via worked examples that have been deployed live on rail data and real-time predictions of train movements on the UK rail network. Because a serverless environment is similar to other resource-constrained environments (IoT, mobile), the techniques can be applied to a range of use cases.

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Computer Science
Authors :
Christidis, Angelos (a.christidis@surrey.ac.uk)
Davies, Roy
Moschoyiannis, Sotiris (S.Moschoyiannis@surrey.ac.uk)
Date : 8 September 2019
Funders : EPSRC - Engineering and Physical Sciences Research Council, EIT Digital IVZW
Copyright Disclaimer : © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords : Serverless; FaaS; AI; Machine Learning; Optimization; AWS Lambda
Additional Information : This research was partly funded by EIT Digital IVZW under the Real-Time Flow project, activity 18387--SGA201, and partly by the EPSRC IAA project AGELink (EP/R511791/1).
Depositing User : Diane Maxfield
Date Deposited : 11 Nov 2019 16:00
Last Modified : 19 Nov 2019 02:08
URI: http://epubs.surrey.ac.uk/id/eprint/853108

