Java applications have unique demands—some need to prioritize throughput, while others emphasize latency or scalability. These workloads can vary significantly in size, complexity, and resource requirements.
Achieving an optimal performance balance often means navigating trade-offs: memory footprint, throughput, latency, predictability, and scalability each pull in different directions.
To support this diverse ecosystem, the JVM needs a variety of garbage collectors, each suited to a different kind of application. ZGC is one of them, and it is the focus of this article.
Note: This article starts with the basics and then covers ZGC in detail.
The Java Virtual Machine (JVM) serves as a software-based simulation of a physical computer. Just as physical computers execute machine code through their processors, the JVM processes its own set of virtual instructions. This virtual machine comes complete with its own architecture and execution framework, allowing it to run programs written in its specialized instruction format.
Throughout JVM's history, developers and companies have created various implementations to run Java applications. While many implementations have come and gone, a few key players have emerged as industry standards.
One prominent implementation is HotSpot, which gained widespread adoption in the Java ecosystem. Originally created as a commercial venture by Sun Microsystems, HotSpot's development continued under Oracle's leadership following their acquisition of Sun. Oracle enhanced HotSpot by incorporating valuable features from their JRockit platform. Today, HotSpot exists as open-source software within the OpenJDK project, making it freely accessible to developers worldwide.
Another significant JVM implementation is OpenJ9, which originated at IBM. OpenJ9 provides an alternative to HotSpot and is currently maintained by the Eclipse Foundation as an open-source project.
For other JVM implementations, see Wikipedia: List of Java virtual machines
<span class="pink">OpenJDK is the project under which an open-source implementation of HotSpot (and many other pieces of the JDK, e.g. the compiler, APIs, and tools) is developed.</span>
As an application runs, it keeps creating objects and storing them in memory. Most of these objects eventually become unused, meaning they need to be removed to free up memory for other tasks. This cleanup process is known as garbage collection, and it is done automatically, with the help of garbage collectors.
Understanding Pause Time in Garbage Collection
Pause time is the period when your application stops running because the system is cleaning up unused memory. If these pauses are too long, your app might feel slow or unresponsive, which can be frustrating for users. Minimizing pause time is important to keep your app running smoothly.
Among Java's garbage collectors, ZGC (Z Garbage Collector) is known for having the shortest pause times. It's designed to keep pauses to just a few milliseconds at most, even with large amounts of memory.
This makes it ideal for applications where minimal interruptions and smooth performance are critical. We will see a practical comparison of G1 and ZGC pause times later in this article.
HotSpot JVM offers four major types of garbage collectors, each designed to handle different types of workloads and memory management needs. These garbage collectors manage how memory is allocated and freed, ensuring Java applications run efficiently.
The Serial Garbage Collector is designed specifically for small applications and embedded systems where memory and CPU resources are limited. As a single-threaded collector, it pauses all application threads during garbage collection, making its operation quite simple. While this approach might seem basic, its small memory footprint and straightforward implementation make it particularly effective in environments where managing multiple threads would create unnecessary overhead. This collector shines in scenarios like small Java applications, embedded devices, or testing environments where simplicity takes precedence over performance optimization.
The Parallel Garbage Collector, also known as the throughput collector, is engineered for applications where high throughput is the primary concern and longer pause times can be tolerated. It leverages multiple threads to conduct garbage collection, which significantly reduces the overall time spent in GC by maximizing parallel processing. Although this approach may result in longer application pauses, it optimizes the total amount of work the application can accomplish over time. This makes it particularly well-suited for batch processing systems and large-scale applications where overall throughput takes priority over immediate response times.
The G1 (Garbage First) Collector, which became the default garbage collector starting with JDK 9, offers a balanced approach between high throughput and low latency. It employs an innovative strategy of dividing the heap into regions and strategically collecting areas containing the most garbage, thereby minimizing performance impact. One of its most valuable features is the ability to set predictable pause times, making it highly suitable for applications requiring consistent response times. Given its adaptability and balanced performance characteristics, G1 has become the preferred choice for most modern Java applications, especially those running on large heaps where both throughput and responsiveness are crucial.
The Z Garbage Collector (ZGC) represents the cutting edge of garbage collection technology, specifically designed for applications demanding extremely low pause times while handling large heap sizes up to multiple terabytes. ZGC accomplishes this by performing most of its work concurrently with the running application, typically achieving pause times of just a few milliseconds. Its exceptional scalability and minimal pause times make it the ideal choice for demanding environments such as financial trading systems, high-frequency transaction processing, and large-scale cloud applications where maintaining consistent responsiveness is critical.
Although this triangle does not depict anything precise, it gives a broad picture of the areas in which the various garbage collectors are most effective.
💡 But here is the catch: choosing a GC for your application based on theory alone can lead to unexpected trouble.
GC design and selection involve a series of trade-offs, so the best advice is to try each candidate GC with your application and see which one suits it best.
<span class="pink">
In order to find the best GC for your application:
</span>
💡 Performance Tuning: If most of the characteristics are close, try tuning. You may have to tweak some settings of the garbage collector you have selected to get optimal performance.
A useful guide for performance tuning
Now that the fundamentals are covered, we will move on to the crux of this article:
The ZGC lifecycle consists of three main phases, and what makes it special is that during most of these phases, your application keeps running (shown by the continuous "<span class="pink">Application Threads</span>" arrow at the bottom).
First Phase - Mark Start: There's a very brief pause (shown by the first white vertical line) where ZGC takes a quick snapshot of your program's state. Think of it like quickly taking a photograph of all the active objects in your program. During this pause, ZGC identifies the starting points (roots) from where it will begin its search for live objects. After this tiny pause, your program continues running while ZGC does its marking work.
Concurrent Mark Phase: After the initial pause, ZGC starts from the roots and follows all connections to find which objects are still being used. This happens simultaneously with your program's execution, which is what "Concurrent" means.
Mark End Phase: There's another very brief pause (second white line) where ZGC finalizes its marking work, similar to double-checking that it hasn't missed anything important.
Prepare for Relocation Phase: This is a concurrent phase (shown by the next set of blue bars) where ZGC plans how it will reorganize memory. It's like planning how to rearrange furniture in a room while people are still using it. During this phase, ZGC decides which objects need to be moved and where they will go.
Final Phase - Relocate: After one more tiny pause (third white line), ZGC begins the relocation phase (last set of blue bars). This is where it actually moves objects to better positions in memory, making it more organized and efficient. Again, this happens while your program continues to run.
Traditional garbage collectors would need to stop your program completely while doing most of this work. But ZGC only needs these three extremely short pauses (the white vertical lines), and each pause typically takes less than one millisecond. The rest of the work happens concurrently, meaning your program keeps running while ZGC does its cleanup work.
<span class="pink"> Concurrency sits at the heart of ZGC's design, allowing it to perform most garbage collection tasks while your application continues to run. This approach dramatically reduces those dreaded stop-the-world pauses that can disrupt application performance.</span>
ZGC’s concurrency depends heavily on two key architectural features: Colored Pointers and Load Barriers.
Colored Pointers: These help ZGC keep track of which objects are being moved without stopping your program
Load Barriers: These ensure your program can still find objects even if they've been moved to a new location
Colored pointers are a fundamental part of ZGC’s architecture. Unlike traditional pointers that only store memory addresses, ZGC uses a 64-bit pointer with additional metadata embedded in it. Here's how it works:
The idea behind colored pointers is to allow the garbage collector to move and manage objects while the application is running without any interruptions. Whenever the GC needs to move an object, it updates the color in the pointer to signal that the object has changed state or location. The application doesn’t have to be paused because the pointers already carry all the necessary information.
Why Is This Useful?
By embedding state information directly into the pointer, ZGC can quickly check the status of an object and decide what to do next without needing to pause the application. This is a key part of how ZGC reduces pause times to mere milliseconds, making the garbage collection process much more efficient.
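To make the idea concrete, here is a minimal Java sketch of packing metadata bits into a 64-bit value. The layout (42 address bits followed by four flag bits named Marked0, Marked1, Remapped, and Finalizable) loosely follows descriptions of the original non-generational ZGC; the exact layout differs between JDK versions and platforms, and the class and method names are purely illustrative. ZGC does this inside the JVM, not in Java code.

```java
// Illustrative only: models a "colored" 64-bit pointer as a Java long.
final class ColoredPointer {
    // Assumed layout: the low 42 bits hold the object address.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    // One flag bit per "color", stored directly above the address bits.
    static final long MARKED0     = 1L << ADDRESS_BITS;
    static final long MARKED1     = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED    = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);
    static final long COLOR_MASK  = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    // Strips the metadata and returns only the address part.
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    // Returns only the metadata (color) bits.
    static long color(long pointer) {
        return pointer & COLOR_MASK;
    }

    // Returns the same address with its color bits replaced by the given color.
    static long withColor(long pointer, long color) {
        return address(pointer) | color;
    }

    // Checks whether a given color bit is set.
    static boolean hasColor(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        long marked = withColor(0x12345678L, MARKED0);
        long remapped = withColor(marked, REMAPPED);
        System.out.printf("address=0x%x marked0=%b remapped=%b%n",
                address(remapped), hasColor(remapped, MARKED0), hasColor(remapped, REMAPPED));
    }
}
```

Changing the color never touches the address bits, which is why the collector can flip an object's state by rewriting the pointer alone.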
While colored pointers provide the state information needed for concurrent processing, load barriers make sure this information is used correctly every time an object is accessed. Here's how they work:
By checking the state of the reference and "healing" it if necessary, load barriers ensure that every object access is safe and accurate.
These checks are so cheap that they are nearly invisible to the application, which helps ZGC maintain low pause times.
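The check-and-heal idea can be sketched as a toy model in Java. Everything here is an assumption made for illustration: the bit positions, the `forwarding` map, and the `loadReference` method are hypothetical names, and the real barrier is implemented inside the JVM rather than in application code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a load barrier with self-healing; not how HotSpot implements it.
final class LoadBarrierDemo {
    static final long ADDRESS_MASK = (1L << 42) - 1; // assumed: low 42 bits are the address
    static final long GOOD_COLOR = 1L << 44;         // assumed: bit meaning "pointer is up to date"

    // Old address -> new address for objects the collector has relocated.
    final Map<Long, Long> forwarding = new HashMap<>();
    int slowPathHits = 0;

    // Loads the reference stored in slots[index], repairing it first if it is stale.
    long loadReference(long[] slots, int index) {
        long pointer = slots[index];
        if ((pointer & GOOD_COLOR) != 0) {
            return pointer; // fast path: the pointer already has the good color
        }
        // Slow path: find where the object lives now and recolor the pointer.
        slowPathHits++;
        long oldAddress = pointer & ADDRESS_MASK;
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        long healed = newAddress | GOOD_COLOR;
        slots[index] = healed; // self-heal: later loads of this slot take the fast path
        return healed;
    }

    public static void main(String[] args) {
        LoadBarrierDemo gc = new LoadBarrierDemo();
        gc.forwarding.put(0x1000L, 0x8000L); // pretend the object moved from 0x1000 to 0x8000
        long[] slots = { 0x1000L };          // a stale reference without the good color

        long first = gc.loadReference(slots, 0);  // takes the slow path and heals the slot
        long second = gc.loadReference(slots, 0); // takes the fast path
        System.out.printf("address=0x%x sameResult=%b slowPathHits=%d%n",
                first & ADDRESS_MASK, first == second, gc.slowPathHits);
    }
}
```

The point of healing the slot is that the cost of the slow path is paid once per stale reference, not on every access.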
Before the introduction of Generational Mode in JDK 21, the Z Garbage Collector (ZGC) used what’s known as a Non-Generational Mode. In simple terms, ZGC treated all objects the same, regardless of how long they existed in memory. Whether an object was created just a moment ago or had been around for a while, ZGC didn't differentiate between them when deciding what to clean up. This method worked fine for keeping pause times very low, but it wasn’t the most efficient for memory management, especially when it came to dealing with short-lived objects.
Now, let’s talk about Generational Mode. It builds on the observation (known as the weak generational hypothesis) that most objects die young, while the few that survive tend to live for a long time:
In Generational Mode, the garbage collector splits the memory into two main parts:
💡 By default, Java uses the G1 Garbage Collector unless another option is specified. However, if you enable the Z Garbage Collector (ZGC) with the flag <span class="pink">-XX:+UseZGC</span>, then starting from JDK 23 (with JEP 474), ZGC automatically operates in Generational mode.
JEP 474 also deprecated the non-generational mode, and JDK 24 removed it (JEP 490). On JDK 23, you can still opt out of Generational mode explicitly with the flags <span class="pink">-XX:+UseZGC</span> <span class="pink">-XX:-ZGenerational</span>.
💡 From JDK 23 onward, you no longer need an extra flag to enable ZGC's generational mode; it's the default setting (on JDK 21 and 22 it had to be enabled with <span class="pink">-XX:+ZGenerational</span>).
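If you are unsure which collector a JVM actually ended up with, one option is to ask the standard management API. The sketch below only prints what the JVM reports; the class name `ActiveGcCheck` is made up for this example, and the bean names it prints differ between collectors and JDK versions, so treat the output as a hint, not a contract.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Prints the garbage collector beans registered in the running JVM.
final class ActiveGcCheck {
    // Collects the names of all garbage collector MXBeans.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Collector beans: " + collectorNames());
    }
}
```

Running it once with the default settings and once with `java -XX:+UseZGC ActiveGcCheck.java` should show different bean names for the two collectors.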
Project Objective: ZGC vs G1 Garbage Collector Performance Analysis
The purpose is to compare and analyze the performance of two Java garbage collectors - ZGC and G1 GC - using a Spring Boot application.
Through controlled memory tests and JDK monitoring tools, we measure:
This helps understand which garbage collector is more suitable for different application needs, particularly focusing on <span class="pink">ZGC's sub-millisecond pause time claims versus G1's general-purpose efficiency</span>.
pom.xml should look something like this:
MemoryController.java
Explanation:
This code creates a simple web API that lets you test garbage collector performance. For example, by sending a POST request to <span class="pink">/api/memory/load/500</span>, you can create a temporary 500MB memory load. This helps in analyzing how different garbage collectors (like ZGC or G1) handle various memory allocation patterns.
MemoryService.java
Explanation:
The MemoryService creates controlled memory pressure to test garbage collectors through two methods. First, the scheduled <span class="pink">simulateMemoryLoad()</span> runs every 100ms to create steady memory churn by continuously allocating and freeing 1MB chunks. Second, the <span class="pink">generateLoad(mbSize)</span> method allows you to create immediate memory pressure of any size, letting you test how garbage collectors handle both gradual and sudden memory pressures.
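The two allocation patterns described above can be approximated in plain Java. This is a hypothetical stand-in, not the article's Spring service: it uses a `ScheduledExecutorService` in place of Spring's scheduling, and the class name `MemoryLoadSketch` and the return values are assumptions added so the behaviour is easy to observe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical plain-Java stand-in for the Spring-based MemoryService.
final class MemoryLoadSketch {
    static final int ONE_MB = 1024 * 1024;

    // Steady churn: allocates one 1MB chunk that becomes garbage as soon as the method returns.
    static int simulateMemoryLoad() {
        byte[] chunk = new byte[ONE_MB];
        chunk[0] = 1; // touch the array so the allocation is actually used
        return chunk.length;
    }

    // Sudden pressure: holds mbSize chunks of 1MB at once, then releases them together.
    static long generateLoad(int mbSize) {
        List<byte[]> held = new ArrayList<>(mbSize);
        long bytes = 0;
        for (int i = 0; i < mbSize; i++) {
            byte[] chunk = new byte[ONE_MB];
            held.add(chunk);
            bytes += chunk.length;
        }
        held.clear(); // everything allocated above is now eligible for collection
        return bytes;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Run the churn task every 100ms, mirroring the schedule described in the text.
        scheduler.scheduleAtFixedRate(() -> simulateMemoryLoad(), 0, 100, TimeUnit.MILLISECONDS);

        System.out.println("Burst allocated " + generateLoad(50) + " bytes");
        Thread.sleep(1000); // let the background churn run for a moment
        scheduler.shutdownNow();
    }
}
```

The burst size in `main` is kept small so the sketch runs with a default heap; the article's tests use much larger loads against a 3GB heap.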
GcTestApplication.java
Explanation: This is the main entry point of our garbage collector testing application. The <span class="pink">@EnableScheduling</span> annotation is crucial as it activates the background memory allocation task in our MemoryService, while <span class="pink">@SpringBootApplication</span> sets up Spring Boot's auto-configuration.
Source code folder structure should look like this:
Note: We have three script files to automate our GC testing process: two <span class="pink">.bat</span> files to start our application with different garbage collectors (ZGC and G1), and a PowerShell script to generate memory loads. The batch scripts configure each GC with monitoring tools like JFR and GC logging, while the PowerShell script automates the process of hitting our endpoints with alternating memory loads.
In our project's root directory, we'll organize our automation scripts by creating a dedicated 'scripts' folder.
test-g1.bat
Explanation:
This Windows batch script starts our application with G1 garbage collector enabled and comprehensive monitoring setup. It first creates directories for logs and recordings, then launches the Java application with the following configurations:
<span class="pink">-XX:+UseG1GC</span>: Enables the G1 garbage collector
<span class="pink">-Xms3g -Xmx3g</span>: Sets both initial and maximum heap memory to 3GB for consistent testing
<span class="pink">-XX:+FlightRecorder</span>: Enables JDK Flight Recorder for performance monitoring
<span class="pink">-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/g1-recording.jfr</span>: Starts a 120-second flight recording named "GCTest" and saves it to a .jfr file
<span class="pink">-Xlog:gc*=debug:file=../logs/g1-gc.log:time,uptime,level,tags:filecount=5,filesize=10m</span>: Enables detailed GC logging where:
<span class="pink">gc*=debug</span>: Logs all GC-related events at debug level
<span class="pink">file=../logs/g1-gc.log</span>: Specifies the log file location
<span class="pink">time,uptime,level,tags</span>: Includes timestamps, uptime, log level, and tags in each log entry
<span class="pink">filecount=5,filesize=10m</span>: Rotates logs across 5 files of 10MB each
test-zgc.bat
Explanation:
This Windows batch script starts our application with ZGC (Z Garbage Collector) enabled and comprehensive monitoring setup. It first creates directories for logs and recordings, then launches the Java application with the following configurations:
<span class="pink">-XX:+UseZGC</span>: Enables the Z Garbage Collector, known for its low-latency garbage collection
<span class="pink">-Xms3g -Xmx3g</span>: Sets both initial and maximum heap memory to 3GB, ensuring consistent memory availability
<span class="pink">-XX:+FlightRecorder</span>: Enables JDK Flight Recorder for performance data collection
<span class="pink">-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/zgc-recording.jfr</span>: Configures JFR to record for 120 seconds under the name "GCTest" and save the result to a .jfr file
<span class="pink">-Xlog:gc*=debug:file=../logs/zgc-gc.log:time,uptime,level,tags:filecount=5,filesize=10m</span>: Sets up GC logging where:
<span class="pink">gc*=debug</span>: Captures all GC events at debug level
<span class="pink">file=../logs/zgc-gc.log</span>: Directs logs to the specified file
<span class="pink">time,uptime,level,tags</span>: Includes detailed timing and categorization in logs
<span class="pink">filecount=5,filesize=10m</span>: Maintains 5 rotating log files of 10MB each
load-test.ps1
Explanation:
This PowerShell script creates a controlled memory load pattern to test garbage collector performance. It runs 5 iterations ($iterations = 5) with 10-second intervals ($waitTime = 10) between loads, creating the following pattern:
Each load is sent to the <span class="pink">/api/memory/load/{size}</span> endpoint.
This script helps simulate real-world memory allocation patterns with alternating memory pressures (500MB and 800MB) and cooldown periods, allowing us to analyze how different garbage collectors handle varying memory loads and recovery times.
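For readers who are not on Windows, the same load pattern can be approximated with Java's built-in HTTP client. This is a rough sketch, not a translation of the PowerShell script: the base URL `http://localhost:8080` and the exact ordering (a 500MB request followed by an 800MB request in each of the 5 iterations, with a 10-second wait after every request) are assumptions, and the class name `LoadTestSketch` is made up.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

// Sends alternating memory-load requests to the test application.
final class LoadTestSketch {
    static final String BASE_URL = "http://localhost:8080"; // assumed host and port
    static final int ITERATIONS = 5;
    static final int WAIT_SECONDS = 10;
    static final int[] LOAD_SIZES_MB = {500, 800};

    // Builds the full list of request URIs up front so the pattern is easy to inspect.
    static List<URI> plannedRequests() {
        List<URI> requests = new ArrayList<>();
        for (int i = 0; i < ITERATIONS; i++) {
            for (int sizeMb : LOAD_SIZES_MB) {
                requests.add(URI.create(BASE_URL + "/api/memory/load/" + sizeMb));
            }
        }
        return requests;
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        for (URI uri : plannedRequests()) {
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(uri + " -> HTTP " + response.statusCode());
            Thread.sleep(WAIT_SECONDS * 1000L); // cooldown before the next load
        }
    }
}
```

The application must already be running with the collector under test before this is started; otherwise the first request fails with a connection error.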
application.properties
When running the scripts, JFR recordings and garbage collection logs are automatically generated and saved to their respective directories.
After starting the app with the script (.bat) files for each GC, the final folder structure should look similar to this, with the GC logs generated in the logs folder and the JFR recordings in the recordings folder:
IMAGE 15
For someone who does not know about JFR: JFR (Java Flight Recorder) is a built-in JVM tool that records runtime data about your Java program. It works by collecting events such as method profiling samples, memory usage, CPU load, garbage collection activity, and thread information into memory buffers, then writing them to .jfr files.
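A .jfr file can also be read programmatically through the `jdk.jfr.consumer` API. The sketch below looks for `jdk.GarbageCollection` events and reads their `longestPause` field; those event and field names are what I would expect on current JDKs, but they are worth verifying against your own recording. The class name `JfrPauseSummary` and the `recordSample` helper (which records a tiny in-process session so the example runs without an existing file) are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Reads a JFR recording and reports the longest GC pause found in it.
final class JfrPauseSummary {
    // Scans all events and keeps the largest "longestPause" of the GC events.
    static Duration longestPause(Path jfrFile) {
        try {
            Duration longest = Duration.ZERO;
            for (RecordedEvent event : RecordingFile.readAllEvents(jfrFile)) {
                if (!"jdk.GarbageCollection".equals(event.getEventType().getName())) {
                    continue; // skip everything that is not a GC event
                }
                Duration pause = event.getDuration("longestPause");
                if (pause.compareTo(longest) > 0) {
                    longest = pause;
                }
            }
            return longest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records a very short in-process session containing one explicit GC request.
    static Path recordSample() {
        try {
            Path file = Files.createTempFile("gc-sample", ".jfr");
            try (Recording recording = new Recording()) {
                recording.enable("jdk.GarbageCollection");
                recording.start();
                System.gc(); // request a collection so the recording has something to show
                recording.stop();
                recording.dump(file);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass a path such as a recording produced by the batch scripts, or run without arguments.
        Path file = args.length > 0 ? Path.of(args[0]) : recordSample();
        double millis = longestPause(file).toNanos() / 1_000_000.0;
        System.out.println("Longest GC pause: " + millis + " ms");
    }
}
```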
GC Summary in JFR Report for g1:
GC log interpretation for G1:
The maximum GC pause time is approximately 50 ms.
💡 Note: It's normal to see slightly different GC pause times between GC logs and JFR reports, even if you're looking at the same code. The reason is that GC logs show events as they happen, while JFR takes samples and averages things out, which can cause slight differences.
Also, the system's state (such as how much CPU or memory is in use) can vary, affecting how long each pause takes. In addition, there are 5 GC log files available, but I have used only the most recent one for monitoring purposes. For a more detailed report, combine all the logs and feed them into the monitoring tool.
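If you would like to pull the maximum pause straight out of a GC log instead of a monitoring tool, a small parser is enough. The sketch below assumes that pause lines contain the word "Pause" and end with a duration such as `12.345ms`, which matches the unified logging output I have seen but may need adjusting for your JDK. The sample lines are invented to resemble that format and are not taken from the logs discussed in this article; `GcLogPauses` is a made-up class name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the largest pause duration from unified GC log lines.
final class GcLogPauses {
    // Assumed format: the line mentions "Pause" and ends with "<number>ms".
    private static final Pattern PAUSE_LINE =
            Pattern.compile("Pause.*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    // Invented sample lines that only imitate the shape of real log output.
    static final List<String> SAMPLE = List.of(
            "[2024-01-01T10:00:00.000+0000][1.234s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 120M->30M(3072M) 12.345ms",
            "[2024-01-01T10:00:01.000+0000][2.234s][debug][gc,heap] GC(1) Heap after GC invocations=2",
            "[2024-01-01T10:00:02.000+0000][3.234s][info][gc] GC(2) Pause Remark 40M->40M(3072M) 1.500ms");

    // Returns the largest pause in milliseconds, or 0.0 if no pause line is found.
    static double maxPauseMillis(List<String> lines) {
        double max = 0.0;
        for (String line : lines) {
            Matcher matcher = PAUSE_LINE.matcher(line);
            if (matcher.find()) {
                max = Math.max(max, Double.parseDouble(matcher.group(1)));
            }
        }
        return max;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a GC log file, or run without arguments to use the sample lines.
        List<String> lines = args.length > 0 ? Files.readAllLines(Path.of(args[0])) : SAMPLE;
        System.out.println("Max pause: " + maxPauseMillis(lines) + " ms");
    }
}
```

To cover all rotated log files, the same method can be called on each file and the results combined with `Math.max`.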
Summary of ZGC in JFR Report:
GC logs interpretation for Z:
<span class="pink">Conclusion</span>:
The maximum GC pause time with ZGC is around 0.7 ms, which is below 1 ms, whereas for the same application and load the maximum pause time with G1 is around 50 ms, roughly 70 times higher. These numbers will vary heavily across applications and loads, but the general idea remains the same: if your application requires very low latency, then ZGC should be ideal.
Recently, Netflix undertook a significant infrastructure change by transitioning from G1 GC to Generational ZGC on JDK 21 for their streaming services. This migration, affecting more than half of their critical infrastructure, emerged as one of their most impactful operational improvements in a decade.
Prior to the migration, Netflix struggled with high tail latencies caused by GC pauses in their gRPC and DGS Framework services. These pauses led to request cancellations that triggered retry mechanisms, creating a cascade of performance issues. Their previous attempts with non-generational ZGC had shown a concerning 36% increase in CPU utilization, making them initially hesitant about the transition.
The results of the migration exceeded expectations across all metrics. GC pause times dropped to sub-millisecond levels while simultaneously improving CPU utilization. The new system demonstrated a 10% performance improvement over non-generational ZGC and provided more consistent memory availability. The operational benefits were equally impressive, with the system requiring minimal tuning and eliminating the need for array pooling mitigations. The fixed 3% heap size overhead proved manageable, and the system handled large data refreshes more efficiently than its predecessor.
However, the migration revealed that certain workload types still performed better with alternative collectors. Throughput-oriented applications, workloads with spiky allocation rates, and long-running tasks with unpredictable object retention patterns sometimes showed better results with G1 or Parallel GC.
The key insight from this migration was that the expected performance trade-offs didn't materialize. Instead of sacrificing CPU efficiency for better pause times, Netflix achieved improvements in both areas. The default configurations proved sufficient for most services, and properly configured transparent huge pages significantly enhanced performance.
Check out the full blog here
Halodoc, Indonesia's leading healthcare platform, recently undertook a significant performance optimization initiative by transitioning from G1GC to ZGC across their Java applications. As a platform serving millions of users with critical healthcare services, Halodoc faced growing challenges with their existing G1GC implementation, particularly during peak usage periods when their microservices experienced high CPU overhead and memory management issues.
The company's engineering team identified that G1GC's reactive approach to memory management was causing inefficient resource utilization, especially in services with fluctuating workloads. This led them to implement ZGC, a more advanced garbage collector, across their infrastructure of 60 microservices. They enhanced the implementation with custom optimizations, including ZGenerational garbage collection for better handling of short-lived objects and a Soft Limit Parameter to prevent excessive memory usage.
The migration process was methodically executed through canary deployments, allowing the team to monitor and fine-tune performance in real-time. Halodoc's engineering team was able to achieve remarkable improvements in their system's performance. The results were impressive: average response times decreased by 20%, memory usage reduced by 25%, and system throughput increased by 30%. Perhaps most significantly, garbage collection time dropped by 10%, leading to more consistent application performance.
Read the full blog here
Inside.java Introducing Generational ZGC
JEP 474: ZGC: Generational Mode by Default
Java Platform, Standard Edition Java Flight Recorder Runtime...
Java Z Garbage Collector: The Next Generation
Oracle Help Center HotSpot Virtual Machine Garbage Collection Tuning Guide
Dev.java: The Destination for Java Developers Overview of ZGC - Dev.java