Compressed GraalVM Native Images: the best startup for Java apps comes in tiny packages

Published in graalvm, Dec 9, 2020

TL;DR: Executables generated by GraalVM 20.3.0 Native Image can be compressed by a factor of 3 to 4 using tools such as the Ultimate Packer for eXecutables (UPX), with very little impact on startup time. The DRAGON Stack manager, a Java CLI application, benefits directly: the developer experience improves when a lot of capabilities fit in less than 20 MB.

I first heard about data compression shortly after I got my first PC, whose hard disk was huge: 512 MB. I learned about pkzip, arj, ain, and rar, which ran as command-line tools at the time… Then I discovered a book on compression and read it multiple times, amazed at the theory behind it.

Some years after reading that book, an event changed my vision of data compression forever! I was simply shocked by something that rarely happens in large enterprises but seems to be common in the demoscene world: a moment of grace that I'm sharing here with you…

We are in Germany, in the city of Bingen am Rhein, where the Breakpoint 2009 demo party is going well in this month of April 2009… when it is the turn of RGBA and TBC to showcase their production: Elevated. The demo lasts for 3 minutes and 35 seconds of pure wonder. Animation, music, effects: it is really beautiful… and then I was blown away when I realized it was produced by a 4,096-byte executable. Yes, 4 KB! Needless to say, this demo won the top prize in its category. Looking at the source code, we find that the secret behind the small size is a tool named Crinkler: a Windows executable file compressor, or more specifically a compressing linker for Windows, which can only generate 32-bit executables because of its targets: 8 KB and 4 KB intros.

Fast forward to 2020, I started an open-source project named DRAGON Stack manager. This project aims to simplify the development of applications using an Autonomous backend by automating numerous steps like project scaffolding, provisioning of the cloud resources, and other usually manual steps. It is developed mainly in Java and the GraalVM Native Image capabilities allow the dragon tool to be very simple to use. If you want to test the DRAGON Stack CLI, there is a new Oracle LiveLabs Lab where you can try it.

Basically, DRAGON comes as a Command Line Interface (CLI) that integrates Oracle Cloud Infrastructure SDK for Java which can manage public Cloud resources. The CLI can also generate project source code (starting with React frontend and the Spring-Boot Petclinic sample application) which comes pre-configured to connect to the provisioned Autonomous Database and the REST Data Service. The developer experience here is really good, try it; there are Windows, Linux, and macOS binaries available.

One important point of the developer experience is how lightweight the CLI utility is — it’s a standalone executable which is currently around 20 MB. It’s very easy to download, just wget a small file, and is standalone so it requires no specific runtime. It starts and works fast!

The small size achieved here is possible using UPX: the Ultimate Packer for eXecutables. UPX is portable, free, open-source and it provides a high-performance executable packer for several executable formats with an innovative decompressor.

UPX uses a data compression algorithm from UCL: a portable lossless data compression library written in ANSI C. UCL implements a number of lossless compression algorithms for executables (including libraries: dll, so) bringing the following advantages:
· Decompression is simple and very fast.
· Requires no memory for decompression.
· The decompressors can be squeezed into less than 200 bytes of code.
· Excellent compression ratios.
· Several compression levels are available: higher levels make the compression step slower in exchange for a better compression ratio, while the speed of the decompressor is not reduced.
· Compressed executables can be used for commercial software distribution as well as non-commercial use.

GraalVM Native Image allows us to build native executables of programs which have instantaneous startup time, are immediately ready to do useful work, and require no warmup. The executables produced by Native Image can also be compressed with UPX.

Now we can imagine the benefits of combining GraalVM Native Image with UPX:
· Very small standalone distribution not requiring a JDK
· Still instant startup
· Lower memory footprint
· Lower storage footprint
· Lower network bandwidth resources for fast download of applications, Docker images, or Fn based functions (more on that later)…

Compressing your native executable

UPX is available on several platforms, but not on macOS. The good news is that you can compress macOS native executables on a Linux or Windows host.

The process involves first building a native executable, then, if needed, enhancing it for your particular needs, for example by adding your own icon to a Windows executable using the Rcedit tool. As the very last step, you run UPX to compress the native executable.
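The steps above can be sketched as a small shell function. This is only an illustration under assumptions: myapp.jar and myapp are placeholder names, native-image and upx are expected to be on the PATH, and the DRY_RUN switch is a helper invented for this sketch that prints the commands instead of running them.

```shell
# Sketch only: myapp.jar / myapp are placeholder names, not files from this article.

# Helper invented for this sketch: print the command when DRY_RUN=1, run it otherwise.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_and_pack() {
  jar="$1"
  exe="$2"
  # 1. Build the native executable from the application JAR.
  run native-image -jar "$jar" "$exe" || return 1
  # 2. Optional platform-specific tweaks (for example a Windows icon) would go here.
  # 3. Compress as the very last step; -k keeps a backup of the uncompressed file.
  run upx -9 -k "$exe"
}
```

Setting DRY_RUN=1 before calling build_and_pack myapp.jar myapp prints the two commands, which is a cheap way to check the plan before spending time on a real build.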

Using UPX is straightforward: you run a command specifying the file to compress and get a smaller binary, something like the following (use -k to keep the original file too):

upx -7 -k myapp

UPX can compress an executable in several ways. Its standard NRV (Not Really Vanished) algorithm offers levels from 1 (fastest) to 9 (most aggressive), with a 10th available through the --best CLI argument. It supports filtering for the executable sections (code, resources, icons, etc.) and can also use the LZMA algorithm, which usually compresses better at the cost of decompression (more time and more memory required). Finally, for those wanting the most from UPX, the special flags --brute and --ultra-brute push for even more compression, but they increase compression time dramatically.

Wondering what the limit of UPX compression is? Let's see how far we can push down the size of native image executables with it.
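To compare these settings on your own binary, a helper along the following lines may be useful. It is a sketch, not the script used for the charts in this article: compare_levels is a made-up name, and the output file naming is an arbitrary choice. It relies only on the UPX options -o (write to another file) and -q (quiet), plus the level flags discussed above.

```shell
# Sketch only: pack copies of one binary with several UPX settings and print the sizes.
compare_levels() {
  exe="$1"
  shift
  for flags in "$@"; do
    # Derive an output name such as myapp.upx--lzma--best from the flags.
    out="$exe.upx$(printf '%s' "$flags" | tr -d ' ')"
    # -o writes the packed file to a new path, leaving the input untouched.
    # $flags is intentionally unquoted so "--lzma --best" splits into two options.
    upx $flags -q -o "$out" "$exe" >/dev/null || return 1
    printf '%-14s %s bytes\n' "$flags" "$(wc -c < "$out" | tr -d ' ')"
  done
}
```

A call could look like: compare_levels myapp "-1" "-7" "-9" "--best" "--lzma -7" "--lzma --best". Prefixing the call with the shell's time keyword gives a rough idea of the compression time as well.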

I ran an experiment on how the compression level influences the size of the binary. The following chart illustrates my tests using the DRAGON Stack manager tool. The bars reflect the different sizes of the compressed executable while the orange line denotes the compression time required by UPX.

Effects of UPX compression factors on Windows native image

Starting with Windows, the blue bars show the compressed size going from 69.1 MB originally down to 18.8 MB with the -9 compression level, which I currently use for the DRAGON Stack manager releases: a size reduction of roughly x3.7!

The purple bars illustrate the effect of using the LZMA algorithm with the compression levels -7 and --best (which corresponds to a 10th level), which respectively give x4.47 (15.47 MB) and x4.53 (15.23 MB) size reduction factors.

Finally, you can see that using either --brute or --ultra-brute doesn't help to reduce the size compared to --lzma, while the compression time increases dramatically. My advice would be to not use these.
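The reduction factors are simply the original size divided by the compressed size. A small helper (reduction_factor is a name chosen for this sketch) makes it easy to recompute them from the sizes quoted above; because those sizes are rounded to MB, the last digit may differ slightly from the factors given in the text.

```shell
# Size reduction factor = original size / compressed size (same unit for both).
reduction_factor() {
  awk -v orig="$1" -v packed="$2" 'BEGIN { printf "x%.2f\n", orig / packed }'
}

reduction_factor 69.1 18.8    # Windows, -9
reduction_factor 69.1 15.47   # Windows, --lzma -7
reduction_factor 69.1 15.23   # Windows, --lzma --best
```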

For Linux, I get the following chart:

Effects of UPX compression factors on Oracle Linux native image

For macOS, where the compression has been done on the Linux box, I get the following chart of the binary sizes:

Effects of UPX compression factors on macOS native image

It confirms that on all 3 platforms, the --best CLI argument gives the best (indeed) compression factor while keeping the fast start benefits of the GraalVM native image.

Runtime impact of compression

I also tried running some of the compressed executables to see whether the runtime impact of the compression is visible. The test consists of running the following command: dragon --help which displays the possible arguments the DRAGON Stack CLI can accept, but while doing so it also performs an HTTPS request to look for any new version available on GitHub. Here are the results I got:

Impact of UPX compression on GraalVM Native Image execution time

The timings were collected with the Hyperfine tool, which takes care of warming up the native image execution (3 warm-up runs here). As you can see, the impact of compression remains very low: less than 265 ms to download a web page when compressed with the --best compression level. That's quite acceptable for a CLI application. LZMA compression, however, appears to be a tiny bit slower even than starting the java -jar command itself, so definitely avoid it for CLIs: the user experience is not ideal.
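A benchmark of this kind could be reproduced roughly as follows. This is a sketch: bench_help and the binary names in the usage line are hypothetical, and only Hyperfine's --warmup option is assumed.

```shell
# Sketch only: benchmark "<binary> --help" for each binary given as an argument.
bench_help() {
  # Rebuild the positional parameters as one "<exe> --help" command string per binary.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" "$1 --help"
    shift
    n=$((n - 1))
  done
  # --warmup 3 runs each command three times before the measured runs start.
  hyperfine --warmup 3 "$@"
}
```

For example: bench_help ./dragon-uncompressed ./dragon-best ./dragon-lzma. Hyperfine then reports the timings of each command so that the variants can be compared.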

A follow-up to the “CLI applications with GraalVM Native Image” post

In a recent post on the GraalVM blog, Oleg Šelajev described the benefits of GraalVM native image using Micronaut and Picocli. I thought it’d be interesting to follow up on it and try to compress the CLI app created there. Here are the results I get using the newly released Micronaut 2.2.0 version.

Here is the detailed resource usage for the native image with the default JVM memory setup:

Micronaut prime numbers computation example uncompressed

Here are my highlights:
· Max memory used by the app (max RSS): 39 MB
· Elapsed total time: 10 ms
· CPU usage: 80%
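One way to collect figures like these on Linux is GNU time in verbose mode, assuming it is installed at /usr/bin/time. The helper below (max_rss_mb is a name invented for this sketch, and ./myapp is a placeholder) extracts the maximum resident set size from that output and converts it from kilobytes to MB, taking 1 MB as 1024 KB.

```shell
# Sketch only: read GNU "time -v" output on stdin and print the max RSS in MB.
max_rss_mb() {
  awk -F': ' '/Maximum resident set size/ { printf "%.1f\n", $2 / 1024 }'
}

# Usage (placeholder binary name); GNU time writes its report to stderr:
#   /usr/bin/time -v ./myapp 2>&1 >/dev/null | max_rss_mb
```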

Now let’s compress it with UPX:
· Original size: 49.75 MB
· Compressed size (using --best): 13.3 MB

And see how the resource usage evolves:

Micronaut prime numbers computation example compressed using UPX

· Max memory used by the app (max RSS): 81.85 MB
· Elapsed total time: 170 ms
· CPU usage: 100%

Note that using the fastest compression level (-1) gives:

· Max memory used by the app (max RSS): 86 MB
· Elapsed total time: 220 ms
· CPU usage: 100%

It looks like compression does affect the startup: a compressed native image runs in a few hundred ms, probably depending on the size of the binary. Most probably, for a typical CLI application, this delay will not be noticeable, but if you're building native images for some serverless platform you should be mindful of this tradeoff.

Bonus point: the Micronaut 2.2.0 CLI is a native image executable too, and can be compressed. For example, on Linux it goes from 66.5 MB down to around 17 MB.

…and what about Docker images or Fn functions?

As I mentioned earlier, another area where compression might be useful is Docker images for Java applications. David Delabassée, a well-known developer advocate working at Oracle and evangelizing about Java and the Fn project (available on-premises and as a managed service on Oracle Cloud Infrastructure), gave a great presentation at JCON 2019 on reducing the size of Docker images containing Java-based applications.

His insights are really useful and thanks to him, I was able to quickly reproduce some of the results he illustrated in his famous slide:

David Delabassée’s “famous slide” (one of the numerous famous slides :) )

Going from 318 MB down to 32 MB…

This is before going to GraalVM native images, which in the end give a Docker image size of about 9 MB!

So what’s next? Let’s compress the native image of course…

The compressed native image is only 2.52 MB, but it's dynamically linked to OS libraries like glibc, so it needs the OS environment to run. For example, you could use a slim Linux Docker image or a distroless container with no OS but with the libraries. To get an even smaller image, we can build a statically linked native image, which has no dependencies at all. This involves following the path described in the GraalVM Static Native Images documentation. It relies on the musl and zlib libraries, which you'll need to compile. In the end, it runs smoothly and ldd shows no remaining dependencies.
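As a rough sketch of that check: the build command in the comment follows the Static Native Images guide as I understand it for this GraalVM version, so verify the flags against the documentation for your release. The helper is_statically_linked is a name invented here, and it assumes the messages that glibc's ldd prints for binaries without dynamic dependencies.

```shell
# Sketch only; myapp.jar and myapp are placeholder names.
# Static build (requires the musl toolchain and zlib mentioned above):
#   native-image --static --libc=musl -jar myapp.jar myapp

# Succeeds when ldd reports that the file has no dynamic dependencies.
is_statically_linked() {
  ldd "$1" 2>&1 | grep -Eq 'not a dynamic executable|statically linked'
}

# Usage: only compress once the binary is confirmed to be static.
#   is_statically_linked myapp && upx --best myapp
```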

Compressing this static native image with UPX again using the --best flag allows us to produce a running Docker image using this Dockerfile:
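As a hypothetical illustration (not necessarily the exact file behind the numbers in this article), a Dockerfile for a fully static binary can start from the empty scratch base image, since no OS libraries are needed. The sketch below writes such a file from the shell; write_dockerfile, myapp, and the image tag are placeholder names.

```shell
# Sketch only: write a minimal Dockerfile for a static, UPX-compressed binary.
write_dockerfile() {
  cat > "$1/Dockerfile" <<'EOF'
# Empty base image: a statically linked binary needs no OS libraries.
FROM scratch
COPY myapp /myapp
ENTRYPOINT ["/myapp"]
EOF
}

# Usage, from the directory containing the compressed myapp binary:
#   write_dockerfile . && docker build -t myapp:upx .
```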

It runs in less than 1 second:

· Max memory used by the app (max RSS): 63.6 MB
· Elapsed total time: 790 ms
· CPU usage: 4%

And the size of the Docker image is 2.64 MB:

… instead of 318 MB, that’s 120 times smaller!

Conclusion

GraalVM Native Image is a game-changer. Compressing native executables allows us to push the benefits of this technology even further, with reasonable compromises in terms of start-up time, because UPX decompression is really fast and does not involve too much memory (using the --best CLI argument). Furthermore, GraalVM 20.3.0, the latest version, now makes the process multi-platform (Windows, Linux, and macOS supported) and removes barriers regarding compression.

How does it feel to have Java command-line applications with a size smaller than 20 MB? If you want to know, you can test the DRAGON Stack manager CLI, a compressed native image, by downloading it for your platform (Windows, Linux, or macOS) or by using a new Oracle LiveLabs Lab where you can try it.