2017 IEEE 37th International Conference on Distributed Computing Systems Workshops
Pipsqueak: Lean Lambdas with Large Libraries
Edward Oakes Leon Yang Kevin Houck
University of Wisconsin-Madison University of Wisconsin-Madison University of Wisconsin-Madison
Tyler Harter Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau
Microsoft Gray Systems Lab University of Wisconsin-Madison University of Wisconsin-Madison
Abstract--Microservices are usually fast to deploy because each
microservice is small, and thus each can be installed and
started quickly. Unfortunately, lean microservices that depend
on large libraries will start slowly and harm elasticity. In this
paper, we explore the challenges of lean microservices that rely
on large libraries in the context of Python packages and the
OpenLambda serverless computing platform. We analyze the
package types and compressibility of libraries distributed via
the Python Package Index and propose PipBench, a new tool
for evaluating package support. We also propose Pipsqueak, a
package-aware compute platform based on OpenLambda.
1. Introduction

Cloud computing has democratized scalability: individual developers can now create applications that leverage thousands of machines to serve millions of users. Beyond simple scalability, though, many modern web applications also require elasticity, the ability to scale quickly. Highly elastic applications can rapidly respond to load changes, both saving money under light traffic and taking advantage of flash crowds and other opportunities [6, 11].

Elasticity depends on fast deployment. An application will not be able to gracefully serve a sharp load burst if it must first provision new virtual machines and perform lengthy software installations.

Deployment stresses every type of resource: code packages are copied over the network, packages are decompressed by the CPU, the decompressed files are written to disk, and application code and state must be loaded into cold memory. All of these costs directly correlate with the size of the deployment bundle; a smaller bundle will require less network and disk I/O, will be faster to decompress, and will require less space when loaded into memory, resulting in a more elastic application.

Figure 1. Compute Models. A cloud service based on virtual machines (left) is compared to a serverless platform based on Lambdas (right).

Developers are partly responsible for creating small deployment bundles, but there is much that cloud providers can do to facilitate the development of lean applications. In particular, platforms can encourage sharing and the decomposition of applications into smaller components. For example, Figure 1 contrasts a traditional cloud platform based on virtual machines with a serverless platform, such as OpenLambda. Deployment of new instances of the application on virtual machines will be slow because the application logic is tied to the runtime and operating system. Starting a new instance will require copying the virtual-machine image to a new physical machine, booting the operating system, and loading the execution runtime (e.g., a language virtual machine, such as the JVM) into memory.

In contrast, Figure 1b shows how a Lambda-based platform encourages developers to shrink their deployment bundles along two dimensions. First, developers are encouraged to reduce the vertical size of their application by building on top of standard shared components (e.g., specific Linux kernels and runtime environments) which can then be pre-initialized, ready to be used by any application. Second, the Lambda model forces developers to write their application as a set of handlers that run in response to events. Thus, even though the whole application may be large, individual handler bundles will be smaller and faster to deploy.

Unfortunately, Figure 1b shows a simplistic use case
where the Lambda handlers do not depend on user-space
packages. In practice, developers rely on an assortment of
libraries to avoid implementing everything from scratch.
If a handler is distributed along with its dependencies, a
conceptually lean function will have to be deployed in a
large bundle. Executing the function on a new machine may
require copying said bundle, decompressing it, writing the
contents to local disk, and loading it into memory. It is
easy to see how such costs could dominate the latency of a
Lambda invocation. Of course, asking developers to eschew
the use of popular packages is not acceptable. Such require-
ments will surely deter adoption of serverless computing.
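To make these costs concrete, the linear relationship between bundle size and deployment latency can be sketched with a toy model. This is our own illustration, not from the paper; the throughput constants are assumptions, not measurements:

```python
# Hypothetical back-of-the-envelope model of cold Lambda deployment latency.
# All throughput constants below are illustrative assumptions.

def deploy_latency_ms(bundle_mb,
                      network_mbps=1000,    # assumed cluster network bandwidth
                      decompress_mbps=200,  # assumed CPU decompression rate
                      disk_mbps=500,        # assumed SSD write throughput
                      load_mbps=2000):      # assumed memory-load throughput
    """Sum the per-step costs: copy over network, decompress,
    write to disk, and load into cold memory."""
    seconds = (bundle_mb / network_mbps +
               bundle_mb / decompress_mbps +
               bundle_mb / disk_mbps +
               bundle_mb / load_mbps)
    return seconds * 1000

# A lean 1 MB handler vs. one bundled with ~60 MB of dependencies:
lean = deploy_latency_ms(1)      # roughly 8.5 ms under these assumptions
bloated = deploy_latency_ms(60)  # roughly 510 ms: 60x slower
```

Because every cost term scales linearly with bundle size, a handler that ships 60 MB of libraries deploys 60x slower than a 1 MB one in this model, regardless of how the individual throughputs are tuned.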
In response to the problem that large libraries pose for serverless microservices, we propose Pipsqueak, a package-aware serverless platform based on OpenLambda. Pipsqueak will cache packages from the Python Package Index (PyPI), the primary Python repository, by maintaining a pool of Python interpreter processes as cache entries, each of which has a set of packages installed and already interpreted. Each process will act as a template from which to clone new, pre-initialized interpreters to serve requests. We explore several new policy factors that must be considered in this new type of cache. For example, potentially unsafe packages impose new constraints on cache-entry selection, and the use of copy-on-write memory between processes means that an evictor will need to consider state shared between processes.

The rest of the paper is organized as follows. First, we further motivate the need for a package-aware Lambda platform (§2). Next, we describe the PyPI repository as well as its associated package-management tool, pip (§3). We then propose Pipsqueak, a package-aware platform based on OpenLambda (§4). Finally, we suggest PipBench, a new tool for evaluating performance of handlers with external dependencies (§5) and conclude (§6).

2. Motivation: The Library Challenge

Decoupling an application from its operating system in order to increase sharing can reduce deployment sizes by an order of magnitude. Unfortunately, modern applications rely heavily on large libraries and other user-space dependencies. Bundling these dependencies with each Lambda handler leads to bloated deployment packages and very slow response times.

Figure 2. Library Support. Without special library support, packages will need to be installed in each Lambda handler (left), creating large deployment bundles. With package support (right), a common repository of packages can be shared between different handlers. The shaded boxes represent packages.

Figure 2a illustrates the problem. Even though the developer split the application into small functions (F1 to FN), dependencies on large third-party libraries such as numpy and scipy dominate the handler sizes. During a load burst, these packages will need to be copied and installed on many worker machines.

One solution would be to rewrite old packages, splitting the functionality of these large dependencies into many smaller bundles. For example, perhaps the numerical numpy package could be refactored as a set of Lambda handlers (i.e., one for fast Fourier transforms, one for matrices, etc.) and used via inter-Lambda REST calls. Mass deployment of these monolithic libraries could be replaced with on-demand deployment of only the features that are actually used. Unfortunately, refactoring the packages of all the popular language repositories would be no small task, and relying on Lambda-specific implementations limits the flexibility afforded to developers.

Instead, we propose building package support as part of the serverless platform, as shown in Figure 2b. In this design, the platform would support a set of language-specific libraries (e.g., the PyPI repository) and track which are required by each handler. This type of dependency awareness would allow the serverless platform to share packages between handlers belonging to different customers. It would also allow a subset of requests to be handled by workers with required packages in a hot state (i.e., already installed, and possibly in memory) for increased performance. In a later section, we describe our plans to implement support for the PyPI repository in OpenLambda (§4).

3. A Study of Python Packages

Most modern scripting languages now have large repositories of popular packages and tools for easily installing packages and their dependencies. Ruby has RubyGems, NodeJS has npm, and Python has PyPI. In order to understand common package characteristics and usage patterns, we plan to analyze the Python packages distributed via the PyPI repository. Toward this end, we have set up a PyPI mirror, downloading a copy of the entire repository.

Figure 3 shows the total size of the PyPI packages (as of March 19, 2017); simply downloading the packages requires 466 GB. However, most packages are compressed as .tar.gz files or with a zip-based format (.whl, .egg, or .zip). In uncompressed form, the cumulative size of the packages is 1.3 TB. We observe that the simple .tar.gz packages are more popular than the Python-specific .egg and .whl files. Across the bars, the number of compressed subfiles is about 100×
greater than the number of files, indicating that a typical package will contain about 100 files.

Implications: The PyPI repository is too large to cache in memory on every worker machine in a serverless cluster. Uncompressed, the packages are over 1 TB, which is also too large to store on SSDs on every worker at low cost.

[Figure 3 bar chart: compressed (466 GB total) and uncompressed (1339 GB total) repository sizes, broken down by file extension (tar.gz, whl, egg, zip, other), with per-category file counts shown at the bar ends.]

Figure 3. PyPI Mirror. The logical size of a PyPI mirror (excluding directory entries) is shown, compressed and uncompressed. The sizes are broken down by file type. The number of files in a category is shown at the bar ends. Data was collected on March 19, 2017.

4. Pipsqueak: Package-Aware Lambdas

In this section, we propose Pipsqueak: a new package-caching mechanism in OpenLambda. We first describe the different levels of caching that are possible (§4.1) and discuss security requirements (§4.2). We then describe the basic mechanism for caching (§4.3) as well as new policy decisions to be made at the local and cluster levels (§4.4).

4.1. Startup Costs

If a package is being used by a handler for the first time (i.e., there is no cached state to rely on), the following steps will be necessary:

Download: Fetching the compressed packages from a repository mirror in the cluster may be necessary. This will consume both network bandwidth and SSD resources on the worker. It is conceivable that all the packages could be stored locally on every worker compressed (466 GB), but using that amount of SSD capacity would be costly.

Install: Packages are normally compressed, so they will need to be decompressed and written to disk before they can be used. Furthermore, some packages (e.g., numpy) have extensions written in C, so installation may require a compile phase. Installation is guided by a setup.py script that the developer provides, and this script may execute arbitrary code, so there may be other steps we do not describe here that are specific to individual packages.

Import: Importing a module within a package involves executing the __init__.py script in the root of the directory, often involving defining functions and importing other modules. Thus, there will be a CPU cost to generating Python bytecode. Furthermore, __init__.py may execute arbitrary code, including calls into other languages such as C.

In order to be able to quickly execute a new handler, we would like to have its necessary packages already downloaded, installed, and imported. Our measurements (§3) suggest it is not practical to have every package initialized in this way on every worker machine, so a caching policy will dictate which packages are pre-initialized.

4.2. Security Assumptions

We assume that handler code may be malicious. We further assume that PyPI packages may be malicious. In practice, Tschacher showed that it is easy to upload malicious packages to the most popular repositories for Python (PyPI), NodeJS (npm), and Ruby (RubyGems). While one could imagine vetting packages that are included in such a repository, doing so on such a large and rapidly growing body of code would be nontrivial.

With respect to the steps described earlier (§4.1), we assume downloading packages is safe since it only involves copying files. We assume installation and importing, however, may be malicious, as these steps may involve executing arbitrary code submitted by a malicious user. These assumptions lead to three design decisions:

1) Package installation and importation must always be performed in a sandboxed (e.g., containerized) environment in order to protect the host worker.
2) In order to protect users from malicious packages, a handler H must never be allowed to run in an environment where package P has been imported or installed, unless handler H depends on P.
3) We provide no protection guarantees to a handler that chooses to import a malicious package. For example, it is acceptable for information to leak between the handlers belonging to different customers if the handlers import the same malicious package. Note that this problem is not unique to a serverless computing environment.

4.3. Cache Mechanism: Interpreter Forking

Our goal is to be able to quickly provision a new Python interpreter for a handler, pre-initialized with the packages the handler needs downloaded, installed, and imported into memory. More generally, we need to acquire a mostly initialized process without paying the cost of starting a new interpreter and loading a variety of dependencies. Thus, we plan to build our interpreter cache as a collection of paused Python processes, each with a different set of packages already imported. Using a cache entry will simply involve calling fork from a cache entry to allocate a new, pre-initialized interpreter process to run the handler code.

This basic design is complicated by security concerns. First, user code may be malicious, so when the engine forks a new process from a cache entry, that new process will have to join a new container. Second, packages are also assumed to be malicious, so cache entries will also need to be isolated
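The forking mechanism described above can be sketched in miniature. The following is our own illustration of the idea under simplified assumptions (no containers, one cache entry, a pipe for the result), not the actual Pipsqueak or OpenLambda implementation:

```python
# Minimal sketch of an interpreter-cache entry: a parent process imports a
# package set once, then serves each request from a fresh fork that inherits
# the pre-imported state. POSIX-only (uses os.fork).
import os

def serve_with_fork(initialized_state, handler):
    """Fork a pre-initialized 'cache entry'; the child runs the handler
    against state set up once in the parent, then exits."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                      # child: inherits the imported packages
        os.close(r)
        result = handler(initialized_state)
        os.write(w, str(result).encode())
        os.close(w)
        os._exit(0)
    os.close(w)                       # parent: collect the child's answer
    data = os.read(r, 4096)
    os.close(r)
    os.waitpid(pid, 0)
    return data.decode()

# "Import" a package set once in the cache entry (parent) ...
import json                            # stands in for a heavy package
state = {"pkg": json}
# ... then every request forks a pre-initialized interpreter:
print(serve_with_fork(state, lambda s: s["pkg"].dumps({"ok": True})))
# prints {"ok": true}
```

Because fork uses copy-on-write, the imported package state is physically shared between the cache entry and its children until either side writes to it, which is what makes memory accounting and eviction in such a cache subtle.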