G1GC - Java 9 Garbage Collector explained in 5 minutes


Here at IDRsolutions we are very excited about Java 9 and have written a series of articles explaining some of the main features. In our previous Java 9 series article we looked at JShell in Java 9. This time we will be looking at garbage collection.

With Java 9, the default garbage collector (GC) is being changed from the ParallelGC to the G1GC. But what does this mean?

This article assumes a basic understanding of garbage collection – if the only GC you know is the one that picks up recycling on Tuesdays, check out this tutorial by Oracle on garbage collection in Java.

So, what is the G1GC?

The “Garbage-first” garbage collector, aka G1, is a concurrent multi-threaded GC. It mostly works alongside the application threads (much like the concurrent mark sweep GC) and is designed to offer shorter, more predictable pause times – while still achieving high throughput.

What makes G1 different is that instead of splitting the heap into three large regions, it partitions it into many equal-sized ones. Subsets of these regions are still assigned the same roles (eden, survivor, old) as in the other GCs. The amount of live data in each region is tracked, and when a collection is triggered the G1GC will clear the regions with the most ‘garbage’ first, hence the name. By doing this, it attempts to free as much space as possible with each collection. Furthermore, it compacts the heap during these collections by copying live objects into other regions, mostly eliminating potential fragmentation issues.
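To make the "garbage first" idea concrete, here is a minimal sketch in Java. It is a toy model, not G1's actual implementation: the `Region` class, its fields and the 1 MB region size are all made up for illustration. It simply orders regions so that the ones with the least live data (and therefore the most garbage) come first.

```java
import java.util.ArrayList;
import java.util.List;

class GarbageFirstSketch {
    // Hypothetical model of a heap region: a fixed size and a count of live bytes.
    static final class Region {
        final int id;
        final long sizeBytes;
        final long liveBytes;

        Region(int id, long sizeBytes, long liveBytes) {
            this.id = id;
            this.sizeBytes = sizeBytes;
            this.liveBytes = liveBytes;
        }

        // Everything in the region that is not live counts as garbage.
        long garbageBytes() {
            return sizeBytes - liveBytes;
        }
    }

    // Returns a copy of the regions, ordered so the most garbage comes first.
    static List<Region> garbageFirst(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Long.compare(b.garbageBytes(), a.garbageBytes()));
        return sorted;
    }

    public static void main(String[] args) {
        long regionSize = 1024 * 1024; // illustrative value only
        List<Region> heap = List.of(
                new Region(0, regionSize, 900_000),
                new Region(1, regionSize, 100_000),
                new Region(2, regionSize, 500_000));
        for (Region r : garbageFirst(heap)) {
            System.out.println("region " + r.id + " garbage=" + r.garbageBytes());
        }
    }
}
```

Region 1 holds the least live data here, so it would be the first candidate for collection in this model.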

Figure: G1 Heap Allocation
Credit: Oracle, http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html

G1 also tracks various metrics for each region, and calculates how long it will take to collect them – give it a target for pause times, and it will attempt to collect as much garbage as it can within the given constraints.
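In HotSpot the pause-time target is set with the `-XX:MaxGCPauseMillis` flag. The sketch below shows one way to picture what "collect as much as possible within the target" could mean. It is a simplified greedy model under my own assumptions, not the algorithm G1 really uses: the `Candidate` class, the predicted costs and the selection rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PauseBudgetSketch {
    // Hypothetical per-region estimate: bytes that would be freed, and the predicted cost.
    static final class Candidate {
        final int id;
        final long reclaimableBytes;
        final double predictedMillis;

        Candidate(int id, long reclaimableBytes, double predictedMillis) {
            this.id = id;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedMillis = predictedMillis;
        }

        // Bytes freed per millisecond of predicted pause time.
        double efficiency() {
            return reclaimableBytes / predictedMillis;
        }
    }

    // Greedily picks the most efficient regions that still fit in the pause budget.
    static List<Candidate> chooseCollectionSet(List<Candidate> candidates, double pauseTargetMillis) {
        List<Candidate> byEfficiency = new ArrayList<>(candidates);
        byEfficiency.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));

        List<Candidate> chosen = new ArrayList<>();
        double remaining = pauseTargetMillis;
        for (Candidate c : byEfficiency) {
            if (c.predictedMillis <= remaining) {
                chosen.add(c);
                remaining -= c.predictedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate(0, 800, 4.0),
                new Candidate(1, 900, 10.0),
                new Candidate(2, 300, 1.0),
                new Candidate(3, 100, 5.0));
        for (Candidate c : chooseCollectionSet(candidates, 6.0)) {
            System.out.println("collect region " + c.id);
        }
    }
}
```

With a 6 ms budget, this model picks the two regions that free the most bytes per millisecond and skips the ones that would overshoot the target.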

Comparison to Parallel GC

By default, the JVM (pre-Java-9) uses ParallelGC. This GC is designed for applications that need to do a lot of work (high throughput) and where long pause times are acceptable. Over long periods of time, applications using the ParallelGC tend to spend less time overall in garbage collection, but individual pauses can be notoriously long. These pauses add latency and make the application less responsive.

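Concurrency. With the ParallelGC, all collections stop the application threads. G1 still pauses the application for its young and mixed collections, but it does much of its marking work concurrently, aims to keep those pauses within the target, and only falls back to a long “stop-the-world” pause for a full garbage collection. In addition to this, it offers a variety of tuning options to help avoid full collections.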
Throughput. In exchange for a small hit to throughput (the percentage of total time NOT spent in garbage collection), G1 focuses on minimising pause times/latency. As mentioned above, the ParallelGC does the opposite, maximising throughput at the expense of potentially long pause times.
Compaction. As time goes on, and objects are removed from the heap, gaps will open up in-between the remaining objects. After a while, the ParallelGC will need to perform a full collection in order to reorganise and clear the heap. On the other hand, G1 compacts the heap while it does collections, avoiding this situation.
Process size. Lastly, another point worth noting is that G1 has a larger footprint. This is due to the extra data structures needed to manage each region, so the JVM process size is larger (though Oracle have stated that the impact should be less than 6%).
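If you want a rough feel for the throughput figure described above, the standard `java.lang.management` API exposes accumulated GC time. The sketch below is only an approximation: `getCollectionTime()` is itself documented as approximate, and the class and method names (`GcThroughput`, `throughputPercent`, `totalGcMillis`) are my own.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    // Throughput as defined above: the share of total time NOT spent in garbage collection.
    static double throughputPercent(long gcMillis, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 100.0 * (totalMillis - gcMillis) / totalMillis;
    }

    // Sums the approximate accumulated collection time reported by every collector bean.
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means the value is undefined
            if (millis > 0) {
                sum += millis;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Guard against a zero uptime reading on a very fast start.
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        long gcTime = Math.min(totalGcMillis(), uptime);
        System.out.printf("GC time: %d ms of %d ms, throughput: %.2f%%%n",
                gcTime, uptime, throughputPercent(gcTime, uptime));
    }
}
```

Calling this at the end of a long-running workload, once under each collector, gives a simple way to compare the two on your own application.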

You can test the G1GC yourself using the VM flag -XX:+UseG1GC.
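To check that the flag has taken effect, you can ask the running JVM which collectors it has registered. This is a small sketch using the standard `GarbageCollectorMXBean` API; the class name `WhichGc` is just a placeholder. When G1 is active, the reported names typically start with "G1".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class WhichGc {
    // Collects the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```

Compile it with `javac WhichGc.java`, then run `java -XX:+UseG1GC WhichGc` and `java -XX:+UseParallelGC WhichGc` to see the difference.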

For more detailed information on G1 and how it works, check out Oracle’s tutorial.

Have you tried out the G1GC? Let us know your thoughts in the comments.