Z Garbage Collector in Java
Gaurav Sharma
October 28, 2024

Building with Java: Every Scale and Scope:

Java applications have unique demands—some need to prioritize throughput, while others emphasize latency or scalability. These workloads can vary significantly in size, complexity, and resource requirements.

Achieving an optimal performance balance often means navigating trade-offs: memory footprint, throughput, latency, predictability, and scalability each pull in different directions.

To support this diverse ecosystem, the JVM needs a variety of collectors, each serving a different kind of Java application. ZGC is one of those collectors, and it is the one we are going to talk about in this article.

Note: This article starts from the very basics and then moves on to discuss ZGC in detail.

Hotspot:

The Java Virtual Machine (JVM) serves as a software-based simulation of a physical computer. Just as physical computers execute machine code through their processors, the JVM processes its own set of virtual instructions. This virtual machine comes complete with its own architecture and execution framework, allowing it to run programs written in its specialized instruction format.

Throughout JVM's history, developers and companies have created various implementations to run Java applications. While many implementations have come and gone, a few key players have emerged as industry standards.

One prominent implementation is HotSpot, which gained widespread adoption in the Java ecosystem. Originally created as a commercial venture by Sun Microsystems, HotSpot's development continued under Oracle's leadership following their acquisition of Sun. Oracle enhanced HotSpot by incorporating valuable features from their JRockit platform. Today, HotSpot exists as open-source software within the OpenJDK project, making it freely accessible to developers worldwide.

Another significant JVM implementation is OpenJ9, which originated at IBM. OpenJ9 provides an alternative to HotSpot and is currently under maintenance by the Eclipse Foundation as an open-source project.

List of other implementations of JVM: Wikipedia, "List of Java virtual machines"

<span class="pink">The OpenJDK is a project under which an open source implementation of HotSpot (and many other pieces of the JDK e.g. compiler, APIs, tools, etc) is developed.</span>

Image of popular JIT compiler technologies

Garbage Collection:

As an application runs, it keeps creating objects and storing them in memory. Most of these objects eventually become unused, meaning they need to be removed to free up memory for other tasks. This cleanup process is known as garbage collection, and it is done automatically with the help of garbage collectors.

Understanding Pause Time in Garbage Collection
Pause time is the period when your application stops running because the JVM is cleaning up unused memory. If these pauses are too long, your app might feel slow or unresponsive, which can be frustrating for users. Minimizing pause time is important to keep your app running smoothly.
Among HotSpot's garbage collectors, ZGC (Z Garbage Collector) is known for having the shortest pause times. It’s designed to keep pauses to just a few milliseconds, even with large amounts of memory.
This makes it ideal for applications where minimal interruptions and smooth performance are critical. We will see a practical demo comparing the pause times of G1 GC and ZGC in the latter part of this article.
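To get a first feel for what the collector is doing in a running JVM, the standard management API can be queried. The sketch below is only an illustration (the class name `GcOverview` is made up for this example): it lists each collector bean with its collection count and accumulated collection time. Note that the reported time is an approximation of time spent collecting and is not necessarily the same as pause time for concurrent collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper, not part of the article's project.
final class GcOverview {

    /** One line per collector bean: name, collection count, accumulated time in ms. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values can be -1 if the collector does not report them.
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

The bean names depend on which collector the JVM was started with, so the output also tells you which GC is actually in use.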

Different Garbage Collectors Present in Hotspot:

The different garbage collectors and what they are optimized for

HotSpot JVM offers four major types of garbage collectors, each designed to handle different types of workloads and memory management needs. These garbage collectors manage how memory is allocated and freed, ensuring Java applications run efficiently.

Serial GC:

The Serial Garbage Collector is designed specifically for small applications and embedded systems where memory and CPU resources are limited. As a single-threaded collector, it pauses all application threads during garbage collection, making its operation quite simple. While this approach might seem basic, its small memory footprint and straightforward implementation make it particularly effective in environments where managing multiple threads would create unnecessary overhead. This collector shines in scenarios like small Java applications, embedded devices, or testing environments where simplicity takes precedence over performance optimization.

Parallel GC (Throughput Collector):

The Parallel Garbage Collector, also known as the throughput collector, is engineered for applications where high throughput is the primary concern and longer pause times can be tolerated. It leverages multiple threads to conduct garbage collection, which significantly reduces the overall time spent in GC by maximizing parallel processing. Although this approach may result in longer application pauses, it optimizes the total amount of work the application can accomplish over time. This makes it particularly well-suited for batch processing systems and large-scale applications where overall throughput takes priority over immediate response times.

G1 GC:

The G1 (Garbage First) Collector, which became the default garbage collector starting with JDK 9, offers a balanced approach between high throughput and low latency. It employs an innovative strategy of dividing the heap into regions and strategically collecting areas containing the most garbage, thereby minimizing performance impact. One of its most valuable features is the ability to set predictable pause times, making it highly suitable for applications requiring consistent response times. Given its adaptability and balanced performance characteristics, G1 has become the preferred choice for most modern Java applications, especially those running on large heaps where both throughput and responsiveness are crucial.

ZGC:

The Z Garbage Collector (ZGC) represents the cutting edge of garbage collection technology, specifically designed for applications demanding extremely low pause times while handling large heap sizes up to multiple terabytes. ZGC accomplishes this by performing most of its work concurrently with the running application, typically achieving pause times of just a few milliseconds. Its exceptional scalability and minimal pause times make it the ideal choice for demanding environments such as financial trading systems, high-frequency transaction processing, and large-scale cloud applications where maintaining consistent responsiveness is critical.
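Each of these four collectors is selected with a single HotSpot flag. As a small sketch (the enum and the jar name are placeholders invented for this example), the mapping can be written down like this:

```java
import java.util.List;

// Illustrative mapping from the four collectors above to their HotSpot selection flags.
enum Collector {
    SERIAL("-XX:+UseSerialGC"),
    PARALLEL("-XX:+UseParallelGC"),
    G1("-XX:+UseG1GC"),
    Z("-XX:+UseZGC");

    private final String flag;

    Collector(String flag) {
        this.flag = flag;
    }

    String flag() {
        return flag;
    }

    /** Builds a launch command for the given jar (the jar name is a placeholder). */
    List<String> command(String jar) {
        return List.of("java", flag, "-jar", jar);
    }

    public static void main(String[] args) {
        for (Collector c : values()) {
            System.out.println(String.join(" ", c.command("app.jar")));
        }
    }
}
```

Only one of these flags should be passed at a time; if none is given, the JVM picks its default collector.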

A detailed view of the Garbage collector design triangle

Although this triangle is not drawn to any precise scale, it gives a broad picture of the area in which each garbage collector is most effective.

💡 But here is the catch: choosing a GC for your application based on theory alone can land you in unexpected trouble.

GC design and choice is a series of trade-offs, so the advised approach is to try each candidate GC with your application and pick the one that suits it best.

<span class="pink">In order to find the best GC for your application:</span>

3 simple steps to find the best garbage collector for your application

💡 Performance Tuning: If you find that most of the characteristics are pretty close, try the tuning. You may have to tweak some of the performance settings for the garbage collector you have selected for optimal performance.

A useful guide for performance tuning

Now that the fundamentals are done, we will move on to the crux of this article:
A timeline of ZGC from being in prototyping stage in 2014 to generational mode becoming the default in JDK 23

ZGC overview:

An overview list of ZGC features

  • Concurrent: ZGC performs most of its work while your application continues to run, minimizing the stop-the-world pauses that can disrupt performance. It’s a big step forward in keeping things smooth while garbage collection happens in the background.
  • Constant Pause Times: One of the standout features of ZGC is that it keeps pause times consistent, no matter how large the heap or the number of objects. Whether your application is dealing with a small dataset or managing gigabytes of data, you won’t see longer pauses as memory use grows. This is especially important for real-time or latency-sensitive applications.
  • Parallel: ZGC uses multiple threads to handle garbage collection in parallel, speeding up the process by distributing the work across multiple CPU cores. This approach makes it particularly well-suited for modern hardware with multi-core processors, ensuring efficient garbage collection without dragging down the rest of the system.
  • Compacting: Over time, memory can get fragmented, which can slow down performance. ZGC handles this by compacting memory as it goes, moving objects around to make the most efficient use of the space. This prevents long-term fragmentation and keeps memory management optimized as your application runs.
  • Region-Based: ZGC breaks memory into regions and focuses its efforts on the areas where there's a lot of garbage to collect. This targeted approach makes the process faster and more efficient by not wasting time on regions with little garbage.
  • NUMA-Aware: On systems with NUMA (Non-Uniform Memory Access) architecture, ZGC ensures that memory is allocated close to the CPU that needs it. This minimizes the time spent accessing memory, which can have a significant impact on performance in large-scale systems.
  • Auto Tuning: You don’t need to mess with complex settings to get ZGC working efficiently. It automatically adjusts itself based on the workload, adapting as necessary without needing manual intervention. This makes it easier to deploy and manage in a wide range of environments.
  • Load Barriers & Colored Pointers: These are some of the more technical tricks ZGC uses to handle garbage collection while your application continues to run. We will talk about them in detail in the latter part of the article.
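The region-based idea from the list above can be sketched in a few lines. This is a deliberately simplified model, not ZGC's actual selection logic: the `Region` record, its fields, and the threshold are assumptions made up for the illustration. It keeps only regions whose share of garbage is high enough and orders them so the emptiest come first.

```java
import java.util.Comparator;
import java.util.List;

// Simplified, hypothetical model of "collect the regions with the most garbage first".
final class RegionSelection {

    /** Hypothetical region summary: bytes still live out of the region's total size. */
    record Region(String name, long sizeBytes, long liveBytes) {
        double garbageRatio() {
            return 1.0 - (double) liveBytes / sizeBytes;
        }
    }

    /** Returns regions with at least minGarbageRatio garbage, most garbage first. */
    static List<Region> pickForRelocation(List<Region> regions, double minGarbageRatio) {
        Comparator<Region> byGarbage = Comparator.comparingDouble(Region::garbageRatio);
        return regions.stream()
                .filter(r -> r.garbageRatio() >= minGarbageRatio)
                .sorted(byGarbage.reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("A", 100, 90),   // mostly live, not worth moving
                new Region("B", 100, 10),   // mostly garbage
                new Region("C", 100, 40));
        pickForRelocation(regions, 0.5).forEach(System.out::println);
    }
}
```

Regions that are mostly live are skipped because copying their objects would cost more than the memory it frees.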

ZGC Cycle:

A diagrammatic representation of the ZGC cycle

The ZGC cycle consists of three short pauses with concurrent phases in between, and what makes it special is that during most of the cycle your application keeps running (shown by the continuous "<span class="pink">Application Threads</span>" arrow at the bottom).

First Phase - Mark Start: There's a very brief pause (shown by the first white vertical line) where ZGC takes a quick snapshot of your program's state. Think of it like quickly taking a photograph of all the active objects in your program. During this pause, ZGC identifies the starting points (roots) from where it will begin its search for live objects. After this tiny pause, your program continues running while ZGC does its marking work.

Concurrent Mark Phase: After the initial pause, ZGC starts from the roots and follows all connections to find which objects are still being used. This happens simultaneously with your program's execution, which is what "Concurrent" means.


Mark End Phase:
There's another very brief pause (second white line) where ZGC finalizes its marking work. It is similar to double-checking that it hasn't missed anything important.


Prepare for Relocation Phase:
This is a concurrent phase (shown by the next set of blue bars) where ZGC plans how it will reorganize memory. It's like planning how to rearrange furniture in a room while people are still using it. During this phase, ZGC decides which objects need to be moved and where they will go.


Final Phase - Relocate:
After one more tiny pause (third white line), ZGC begins the relocation phase (last set of blue bars). This is where it actually moves objects to better positions in memory, making it more organized and efficient. Again, this happens while your program continues to run.


Traditional garbage collectors would need to stop your program completely while doing most of this work. But ZGC only needs these three extremely short pauses (the white vertical lines), and each pause typically takes less than one millisecond. The rest of the work happens concurrently - meaning your program keeps running while ZGC does its cleanup work.
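As a compact recap, the cycle described above can be modeled as an ordered list of phases, each marked as either a pause or concurrent work. The enum below is just a mental model; the constant names follow this article's description and are not identifiers taken from the JVM.

```java
import java.util.Arrays;

// Mental model of the cycle described above; names are illustrative, not JVM identifiers.
enum ZgcPhase {
    PAUSE_MARK_START(true),
    CONCURRENT_MARK(false),
    PAUSE_MARK_END(true),
    CONCURRENT_PREPARE_FOR_RELOCATE(false),
    PAUSE_RELOCATE_START(true),
    CONCURRENT_RELOCATE(false);

    /** True if application threads are stopped during this phase. */
    final boolean stopsApplication;

    ZgcPhase(boolean stopsApplication) {
        this.stopsApplication = stopsApplication;
    }

    /** Number of phases in one cycle that stop the application. */
    static long pauseCount() {
        return Arrays.stream(values()).filter(p -> p.stopsApplication).count();
    }

    public static void main(String[] args) {
        for (ZgcPhase phase : values()) {
            System.out.println(phase + (phase.stopsApplication ? " (pause)" : " (concurrent)"));
        }
        System.out.println("Pauses per cycle: " + pauseCount());
    }
}
```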

ZGC Under the Hood: An Architectural Overview

Concurrent Processing in ZGC:

<span class="pink"> Concurrency sits at the heart of ZGC's design, allowing it to perform most garbage collection tasks while your application continues to run. This approach dramatically reduces those dreaded stop-the-world pauses that can disrupt application performance.</span>

ZGC’s concurrency depends heavily on two key architectural features: Colored Pointers and Load Barriers.
Colored Pointers: These help ZGC keep track of which objects are being moved without stopping your program.
Load Barriers: These ensure your program can still find objects even if they've been moved to a new location

Diagram showing the 2 factors that make concurrent processing possible in ZGC: 1. Colored pointers, 2. Load barriers

Colored Pointers

A size and color key chart of ZGC colored pointers

Colored pointers are a fundamental part of ZGC’s architecture. Unlike traditional pointers that only store memory addresses, ZGC uses a 64-bit pointer with additional metadata embedded in it. Here's how it works:

  • Pointer Structure: Out of the 64 bits, 22 are not used for the address. In the original (non-generational) layout, 4 of these bits carry the metadata, or "color," and the other 18 are unused, while the remaining 42 bits store the actual memory address. This extra information allows ZGC to track the state of objects directly within the pointer, eliminating the need for separate tracking structures.
  • What Does the Color Represent?: The color in these pointers indicates the state or status of the object.
    • Marked Bits (meta): These show whether an object has been marked as alive by the garbage collector. Enables concurrent marking without STW pauses.
    • Remapped Bit (Remap): This indicates whether the pointer has been updated to a new memory location if the object has been moved.
    • Finalizable Bit (Final): Marks objects that need finalization. Used for proper cleanup of resources.

The idea behind colored pointers is to allow the garbage collector to move and manage objects while the application is running without any interruptions. Whenever the GC needs to move an object, it updates the color in the pointer to signal that the object has changed state or location. The application doesn’t have to be paused because the pointers already carry all the necessary information.
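To make the idea concrete, here is a small bit-masking sketch. It assumes an illustrative layout in which the low 42 bits of a `long` hold the address and the four bits directly above them hold the color; the exact positions vary between ZGC versions, and the class and constant names are invented for this example.

```java
// Illustrative only: packs an address and color bits into one 64-bit value.
final class ColoredPointer {
    static final int ADDRESS_BITS = 42;                       // assumed address width
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    // One bit per color, placed directly above the address bits (assumed positions).
    static final long MARKED0     = 1L << ADDRESS_BITS;
    static final long MARKED1     = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED    = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);

    /** Returns the pointer with its color replaced by the given color bits. */
    static long withColor(long pointer, long color) {
        return (pointer & ADDRESS_MASK) | color;
    }

    /** Strips the color and returns only the address part. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** True if the given color bit is set in the pointer. */
    static boolean has(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        long p = withColor(0x1234L, REMAPPED);
        System.out.println("address = 0x" + Long.toHexString(address(p)));
        System.out.println("remapped = " + has(p, REMAPPED));
        System.out.println("marked0 = " + has(p, MARKED0));
    }
}
```

Because the color travels with the pointer, checking an object's state is a single mask operation rather than a lookup in a side table.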

Why Is This Useful?

By embedding state information directly into the pointer, ZGC can quickly check the status of an object and decide what to do next without needing to pause the application. This is a key part of how ZGC reduces pause times to mere milliseconds, making the garbage collection process much more efficient.

Load Barriers

A diagram illustrating how Java ensures safe concurrent object access during garbage collection

While colored pointers provide the state information needed for concurrent processing, load barriers make sure this information is used correctly every time an object is accessed. Here's how they work:

  • What is a Load Barrier?
  • A load barrier is essentially a small piece of code that the JVM inserts into your application’s code whenever it accesses an object on the heap. It acts like a checkpoint or filter. Think of it as a gatekeeper: every time the application tries to use an object, it first passes through this checkpoint to ensure everything is in order.
  • How Does It Work?
  • When the application accesses an object:
    • The load barrier checks the "color" of the pointer associated with that object.
    • If the color indicates that the object is in a stable state (e.g., it hasn’t moved recently), the load barrier lets the application proceed without delay.
    • If the color suggests that the object is in a transitional state (e.g., being relocated by the garbage collector), the load barrier takes action. It might update the pointer to the new location of the object or even move the object itself before allowing access.
  • Why are Load Barriers Essential?
  • Load barriers are vital because they keep the application and the garbage collector in sync without pausing the application. They enable ZGC to heal or adjust pointers dynamically, ensuring the application always accesses the correct version of an object, even if the object is moved during garbage collection.

By checking the status of the pointer and "healing" it if necessary, load barriers ensure that every object access is safe and accurate.

These checks happen so quickly that they are nearly invisible to the application, contributing to ZGC’s ability to maintain low pause times.
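The check-and-heal behaviour can be simulated in plain Java. The sketch below is a toy model under loud assumptions: the real barrier is machine code emitted by the JIT compiler, while here a `long` field stands in for a heap reference, a `HashMap` stands in for the collector's forwarding information, and the bit positions are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

// Toy simulation of a load barrier; every name and bit position here is hypothetical.
final class LoadBarrierSketch {
    static final long ADDRESS_MASK = (1L << 42) - 1;
    static final long REMAPPED = 1L << 44;   // "good" color in this toy model

    /** Stand-in forwarding table: old address -> new address of relocated objects. */
    private final Map<Long, Long> forwarding = new HashMap<>();

    /** A single field holding a colored pointer, standing in for a reference on the heap. */
    long field;

    /** Records that the object at 'from' now lives at 'to'. */
    void relocate(long from, long to) {
        forwarding.put(from, to);
    }

    /** Loads the field through the barrier, healing it if its color is stale. */
    long load() {
        if ((field & REMAPPED) != 0) {
            return field & ADDRESS_MASK;                 // fast path: nothing to do
        }
        long oldAddress = field & ADDRESS_MASK;
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        field = newAddress | REMAPPED;                   // heal: later loads take the fast path
        return newAddress;
    }

    public static void main(String[] args) {
        LoadBarrierSketch heap = new LoadBarrierSketch();
        heap.field = 100L;            // stale pointer to the old location
        heap.relocate(100L, 200L);    // the collector moved the object
        System.out.println("first load  -> " + heap.load());
        System.out.println("second load -> " + heap.load());
    }
}
```

The important detail is the healing step: the slow path runs at most once per stale pointer, and every later load of the same field takes the fast path.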


Previous Mode of ZGC: Non-Generational Mode

Before the introduction of Generational Mode in JDK 21, the Z Garbage Collector (ZGC) used what’s known as a Non-Generational Mode. In simple terms, ZGC treated all objects the same, regardless of how long they existed in memory. Whether an object was created just a moment ago or had been around for a while, ZGC didn't differentiate between them when deciding what to clean up. This method worked fine for keeping pause times very low, but it wasn’t the most efficient for memory management, especially when it came to dealing with short-lived objects.

Generational Mode in ZGC:

Now, let’s talk about Generational Mode. This concept comes from the observation that most objects in a program die young, while the few that survive tend to live for a long time:

  • Short-lived objects: Think of temporary data like strings or numbers created in a method and forgotten when that method finishes. These objects only live for a brief time.
  • Long-lived objects: Some objects, like database connections or cached data, tend to hang around for the entire duration of the program.

In Generational Mode, the garbage collector splits the memory into two main parts:

  1. Young Generation: This is where new, short-lived objects are stored. Since many objects are short-lived, they will be collected quickly and often.
  2. Old Generation: Objects that survive long enough in the Young Generation are moved here. These objects live longer and are collected less frequently, meaning fewer pauses for the garbage collector to check on them.
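The young/old split can be illustrated with a toy heap. Everything in this sketch is hypothetical (the `Obj` class, the `reachable` flag set by hand, and the promotion threshold of two collections); a real collector discovers reachability by tracing from the roots and uses its own promotion policy.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of generational collection; names and the threshold are made up.
final class GenerationalSketch {
    static final int TENURING_THRESHOLD = 2;   // hypothetical: promote after 2 survived collections

    static final class Obj {
        final String name;
        int age;
        boolean reachable = true;   // set by hand here; a real GC finds this by marking

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** New objects always start in the young generation. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /** Young collection: drop garbage, age survivors, promote the ones that are old enough. */
    void collectYoung() {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!o.reachable) {
                continue;                         // garbage: simply not copied
            }
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o);                       // promoted to the old generation
            } else {
                survivors.add(o);
            }
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        GenerationalSketch heap = new GenerationalSketch();
        Obj temp = heap.allocate("temp");
        heap.allocate("cache");
        temp.reachable = false;                   // the temporary object is dropped
        heap.collectYoung();
        heap.collectYoung();
        System.out.println("young=" + heap.young.size() + ", old=" + heap.old.size());
    }
}
```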

💡 By default, Java uses the G1 Garbage Collector unless another option is specified. However, if you choose to enable the Z Garbage Collector (ZGC) over the default G1 garbage collector using the flag <span class="pink">-XX:+UseZGC</span>, starting from JDK 23 (with JEP 474), ZGC will automatically operate in Generational mode.

JEP 474 also deprecates the non-generational mode, with the intent to remove it in a future release. If you prefer to keep using the non-generational ZGC on JDK 23, you can explicitly opt out of Generational mode with the flags <span class="pink">-XX:+UseZGC</span> <span class="pink">-XX:-ZGenerational</span>.

💡 From JDK 23 onward, you no longer need to use flags to enable ZGC's generational mode; it's the default setting.
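If you want to confirm from inside the application which of these flags the JVM was actually started with, the runtime MXBean exposes the JVM arguments. The helper below is a minimal sketch (the class and method names are made up); it only inspects the flags passed on the command line and does not detect defaults chosen by the JVM itself.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative helper that inspects the JVM's startup arguments.
final class GcFlags {

    /** True if ZGC was requested explicitly on the command line. */
    static boolean requestsZgc(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseZGC");
    }

    /** True if the non-generational mode was requested explicitly. */
    static boolean optsOutOfGenerational(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:-ZGenerational");
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("ZGC requested: " + requestsZgc(jvmArgs));
        System.out.println("Non-generational requested: " + optsOutOfGenerational(jvmArgs));
    }
}
```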

Enough of the ZGC talk. Let’s jump in and see some action with code.

Project Objective: ZGC vs G1 Garbage Collector Performance Analysis
The purpose is to compare and analyze the performance of two Java garbage collectors - ZGC and G1 GC - using a Spring Boot application.
Through controlled memory tests and JDK monitoring tools, we measure:
  • Garbage collection pause times
  • Memory usage patterns
  • Application responsiveness
This helps understand which garbage collector is more suitable for different application needs, particularly focusing on <span class="pink">ZGC's sub-millisecond pause time claims versus G1's general-purpose efficiency.</span>

Project Setup Steps:

  1. Visit Spring Initializr
  2. Configure the project with:
    • Project: Maven
    • Language: Java
    • Spring Boot: 3.3.5
    • Packaging: JAR
    • Java Version: 23
    • Group: com.unlogged
    • Artifact: gc-test
    • Dependencies:
      • Spring Web (for REST endpoints)

Snapshot of the spring initializr dashboard

pom.xml should look something like this:

	
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>3.3.5</version>
		<relativePath/> <!-- lookup parent from repository -->
	</parent>
	<groupId>com.unlogged</groupId>
	<artifactId>gc-test</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<name>gc-test</name>
	<description>project for measuring the performance of different GCs</description>
	<url/>
	<licenses>
		<license/>
	</licenses>
	<developers>
		<developer/>
	</developers>
	<scm>
		<connection/>
		<developerConnection/>
		<tag/>
		<url/>
	</scm>
	<properties>
		<java.version>23</java.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>

</project>

MemoryController.java


package com.unlogged.gctest.controller;

import com.unlogged.gctest.service.MemoryService;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/memory")
public class MemoryController {
    private final MemoryService memoryService;

    public MemoryController(MemoryService memoryService) {
        this.memoryService = memoryService;
    }

    @PostMapping("/load/{mb}")
    public String generateLoad(@PathVariable int mb) {
        memoryService.generateLoad(mb);
        return "Generated " + mb + "MB load";
    }
}

Explanation:

This code creates a simple web API that lets you test garbage collector performance. For example, by sending a POST request to <span class="pink">/api/memory/load/500</span>, you can create a temporary 500MB memory load. This helps in analyzing how different garbage collectors (like ZGC or G1) handle various memory allocation patterns.
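If you would rather trigger the endpoint from Java than from a command-line tool, the JDK's built-in HTTP client is enough. This is a sketch that assumes the application is running locally on port 8080; the `LoadClient` class is not part of the project and exists only for this illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative client for the /api/memory/load/{mb} endpoint.
final class LoadClient {

    /** Builds a body-less POST request for the given load size in MB. */
    static HttpRequest loadRequest(String baseUrl, int mb) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/memory/load/" + mb))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Assumes the Spring Boot app is already running on localhost:8080.
        HttpResponse<String> response = client.send(
                loadRequest("http://localhost:8080", 500),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```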

MemoryService.java

	
package com.unlogged.gctest.service;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.List;

@Service
public class MemoryService {
    private static final Logger logger = LoggerFactory.getLogger(MemoryService.class);
    private final List<byte[]> memoryList = new ArrayList<>();
    private static final int MB = 1024 * 1024;

    @Scheduled(fixedRate = 100)
    public void simulateMemoryLoad() {
        try {
            // Allocate 1MB
            memoryList.add(new byte[MB]);

            // Release memory when too large
            if (memoryList.size() > 100) {
                memoryList.subList(0, 50).clear();
            }

            logger.info("Current memory list size: {}", memoryList.size());
        } catch (OutOfMemoryError e) {
            logger.error("OutOfMemoryError occurred", e);
            memoryList.clear();
        }
    }

    public void generateLoad(int mbSize) {
        List<byte[]> tempList = new ArrayList<>();
        try {
            for (int i = 0; i < mbSize; i++) {
                tempList.add(new byte[MB]);
                if (i % 10 == 0) {
                    logger.info("Allocated {}MB", i);
                }
            }
            Thread.sleep(1000); // Hold memory for a second
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            tempList.clear();
        }
    }
}

Explanation:

The MemoryService creates controlled memory pressure to test garbage collectors through two methods. First, the scheduled <span class="pink">simulateMemoryLoad()</span> runs every 100ms to create steady memory churn by continuously allocating and freeing 1MB chunks. Second, the <span class="pink">generateLoad(mbSize)</span> method allows you to create immediate memory pressure of any size, letting you test how garbage collectors handle both gradual and sudden memory pressures.

GcTestApplication.java

	
package com.unlogged.gctest;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling
public class GcTestApplication {

    public static void main(String[] args) {
        SpringApplication.run(GcTestApplication.class, args);
    }

}

Explanation: This is the main entry point of our garbage collector testing application. The <span class="pink">@EnableScheduling</span> annotation is crucial as it activates the background memory allocation task in our MemoryService, while <span class="pink">@SpringBootApplication</span> sets up Spring Boot's auto-configuration.

Source code folder structure should look like this:

Snapshot of the IDE project structure showing what the Source code folder structure should look like

Note: We have three script files to automate our GC testing process: two <span class="pink">.bat</span> files to start our application with different garbage collectors (ZGC and G1), and a PowerShell script to generate memory loads. The batch scripts configure each GC with monitoring tools like JFR and GC logging, while the PowerShell script automates the process of hitting our endpoints with alternating memory loads.

In our project's root directory, we'll organize our automation scripts by creating a dedicated 'scripts' folder.

test-g1.bat

	
@echo off
echo Creating directories...
if not exist ..\logs mkdir ..\logs
if not exist ..\recordings mkdir ..\recordings

echo Starting application with G1 GC...
java ^
-XX:+UseG1GC ^
-Xms3g ^
-Xmx3g ^
-XX:+FlightRecorder ^
-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/g1-recording.jfr ^
-Xlog:gc*=debug:file=../logs/g1-gc.log:time,uptime,level,tags:filecount=5,filesize=10m ^
-jar ../target/gc-test-0.0.1-SNAPSHOT.jar

Explanation:

This Windows batch script starts our application with G1 garbage collector enabled and comprehensive monitoring setup. It first creates directories for logs and recordings, then launches the Java application with the following configurations:

  • <span class="pink">-XX:+UseG1GC</span>: Enables the G1 garbage collector
  • <span class="pink">-Xms3g -Xmx3g</span>: Sets both initial and maximum heap memory to 3GB for consistent testing
  • <span class="pink">-XX:+FlightRecorder</span>: Enables JDK Flight Recorder for performance monitoring (this flag has been deprecated since JDK 13, and <span class="pink">-XX:StartFlightRecording</span> alone is enough to start a recording)
  • <span class="pink">-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/g1-recording.jfr</span>: Starts a 120-second flight recording named "GCTest" and saves it to a .jfr file
  • <span class="pink">-Xlog:gc*=debug:file=../logs/g1-gc.log:time,uptime,level,tags:filecount=5,filesize=10m</span>: Enables detailed GC logging where:
    • <span class="pink">gc*=debug</span>: Logs all GC-related events at debug level
    • <span class="pink">file=../logs/g1-gc.log</span>: Specifies the log file location
    • <span class="pink">time,uptime,level,tags</span>: Includes timestamps, uptime, log level, and tags in each log entry
    • <span class="pink">filecount=5,filesize=10m</span>: Rotates logs across 5 files of 10MB each
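Once the log file exists, pause durations can be pulled out of it with a small parser. The sketch below assumes that pause lines contain the word "Pause" and end with a duration such as `5.123ms`; the sample line in `main` is a hand-written illustration of that shape, not output copied from a real run, so check it against your own log before relying on it.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for pause durations in unified GC log lines.
final class GcLogPause {
    // A line that mentions "Pause" and ends with a duration like "5.123ms".
    private static final Pattern PAUSE = Pattern.compile("Pause.*?(\\d+\\.\\d+)ms\\s*$");

    /** Returns the pause duration in milliseconds, or empty if the line is not a finished pause. */
    static OptionalDouble pauseMillis(String logLine) {
        Matcher m = PAUSE.matcher(logLine);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Hand-written sample line, for illustration only.
        String sample = "[1.234s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 100M->20M(3072M) 5.123ms";
        System.out.println(pauseMillis(sample));
    }
}
```

Feeding every line of the log through `pauseMillis` and keeping the non-empty results gives you the raw numbers behind the average and maximum pause figures.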

test-zgc.bat

	
@echo off
echo Creating directories...
if not exist ..\logs mkdir ..\logs
if not exist ..\recordings mkdir ..\recordings

echo Starting application with ZGC...
java ^
-XX:+UseZGC ^
-Xms3g ^
-Xmx3g ^
-XX:+FlightRecorder ^
-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/zgc-recording.jfr ^
-Xlog:gc*=debug:file=../logs/zgc-gc.log:time,uptime,level,tags:filecount=5,filesize=10m ^
-jar ../target/gc-test-0.0.1-SNAPSHOT.jar

Explanation:

This Windows batch script starts our application with ZGC (Z Garbage Collector) enabled and comprehensive monitoring setup. It first creates directories for logs and recordings, then launches the Java application with the following configurations:

  • <span class="pink">-XX:+UseZGC</span>: Enables the Z Garbage Collector, known for its low-latency garbage collection
  • <span class="pink">-Xms3g -Xmx3g</span>: Sets both initial and maximum heap memory to 3GB, ensuring consistent memory availability
  • <span class="pink">-XX:+FlightRecorder</span>: Enables JDK Flight Recorder for performance data collection
  • <span class="pink">-XX:StartFlightRecording=name=GCTest,duration=120s,filename=../recordings/zgc-recording.jfr</span>: Configures JFR to:
    • Record data for 120 seconds
    • Name the recording "GCTest"
    • Save the recording to zgc-recording.jfr
  • <span class="pink">-Xlog:gc*=debug:file=../logs/zgc-gc.log:time,uptime,level,tags:filecount=5,filesize=10m</span>: Sets up GC logging where:
    • <span class="pink">gc*=debug</span>: Captures all GC events at debug level
    • <span class="pink">file=../logs/zgc-gc.log</span>: Directs logs to the specified file
    • <span class="pink">time,uptime,level,tags</span>: Includes detailed timing and categorization in logs
    • <span class="pink">filecount=5,filesize=10m</span>: Maintains 5 rotating log files of 10MB each

load-test.ps1

	
# Create a PowerShell script to generate load pattern
$iterations = 5
$waitTime = 10  # seconds between loads

Write-Host "Starting memory load test..."
Write-Host "Will run $iterations times with ${waitTime}s delay between each"

for ($i = 1; $i -le $iterations; $i++) {
    Write-Host "`nIteration $i of $iterations"
    Write-Host "Generating 500MB load..."
    curl -Method POST "http://localhost:8080/api/memory/load/500"
    Write-Host "Waiting ${waitTime}s..."
    Start-Sleep -Seconds $waitTime

    Write-Host "Generating 800MB load..."
    curl -Method POST "http://localhost:8080/api/memory/load/800"
    Write-Host "Waiting ${waitTime}s..."
    Start-Sleep -Seconds $waitTime
}

Write-Host "`nTest complete!"

Explanation:

This PowerShell script creates a controlled memory load pattern to test garbage collector performance. It runs 5 iterations ($iterations = 5) with 10-second intervals ($waitTime = 10) between loads, creating the following pattern:

  • In each iteration:
    • First generates a 500MB memory load
    • Waits 10 seconds
    • Then generates an 800MB memory load
    • Waits another 10 seconds
    • Uses curl commands to hit the application's <span class="pink">/api/memory/load/{size}</span> endpoint
    • Provides progress feedback through console messages

This script helps simulate real-world memory allocation patterns with alternating memory pressures (500MB and 800MB) and cooldown periods, allowing us to analyze how different garbage collectors handle varying memory loads and recovery times.

application.properties

	
spring.application.name=gc-test

When running the scripts, JFR recordings and garbage collection logs are automatically generated and saved to their respective directories.

After starting the app with the script (.bat) files for the different GCs, the GC logs are auto-generated in the logs folder and the JFR reports in the recordings folder, so the final folder structure should look similar to this:

The final project folder structure in an IDE, showing folders such as .idea, .mvn, logs, recordings, and scripts. The logs folder will store auto-generated garbage collection (GC) logs, while the recordings folder contains JFR reports like g1-recording.jfr and zgc-recording.jfr. The scripts folder includes .bat files (test-g1.bat and test-zgc.bat) for different GCs and load-test.ps1. Additional configuration files, such as .gitignore and pom.xml, are present in the root directory.

IMAGE 15
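Before comparing results, it can help to confirm which collector a given run actually used. The sketch below is one possible way to do that from inside the JVM using the standard java.lang.management API; the class name ActiveGcCheck and the flag filter are assumptions for illustration and are not part of the test project.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: reports the collectors and GC-related flags of the current JVM.
class ActiveGcCheck {

    // Names of the collectors registered by the running JVM (they differ per GC).
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    // JVM arguments that look like GC, heap, or logging options.
    static List<String> gcRelatedArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-XX:") || arg.startsWith("-Xlog") || arg.startsWith("-Xm"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("GC-related JVM arguments: " + gcRelatedArguments());
    }
}
```

Logging these two lists once at startup makes it easier to match each log file and recording to the collector that produced it.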

JFR Events: Analyzing Garbage Collection Details and GC Log Interpretation

For anyone new to JFR: JFR (Java Flight Recorder) is a built-in JVM tool that records runtime data about your Java program. It collects events such as method calls, memory usage, CPU load, garbage collection activity, and thread information into memory buffers, then writes them to .jfr files.
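JFR can also be driven from code through the jdk.jfr API, which gives a feel for the record, dump, and read cycle described above. The following is a minimal sketch, not the setup used in this article (the scripts start JFR from the command line): the class name JfrGcSketch and the throwaway allocation loop are made up for illustration, and the event and field names are taken from the JDK's built-in jdk.GarbageCollection event.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: records GC events around a workload and reads them back.
class JfrGcSketch {

    static List<RecordedEvent> recordGcEvents(Runnable workload) {
        try (Recording recording = new Recording()) {
            recording.enable("jdk.GarbageCollection"); // only GC events are enabled
            recording.start();
            workload.run();
            recording.stop();

            // Dump the in-memory buffers to a .jfr file, then parse that file.
            Path file = Files.createTempFile("gc-sketch", ".jfr");
            recording.dump(file);
            return RecordingFile.readAllEvents(file).stream()
                    .filter(event -> event.getEventType().getName().equals("jdk.GarbageCollection"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<RecordedEvent> events = recordGcEvents(() -> {
            // Allocate short-lived arrays so the collector has something to do.
            for (int i = 0; i < 500; i++) {
                byte[] garbage = new byte[1024 * 1024];
                garbage[0] = 1;
            }
            System.gc(); // a hint only; the JVM may ignore it
        });
        for (RecordedEvent event : events) {
            String pause = event.hasField("longestPause")
                    ? event.getDuration("longestPause").toString()
                    : "n/a";
            System.out.println(event.getStartTime() + " longestPause=" + pause);
        }
    }
}
```

The number of events printed depends on the collector and heap settings, so treat the output as exploratory rather than as a benchmark.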

For G1GC:

GC Summary in JFR Report for g1:

GC Summary section of a JFR report for G1 garbage collector, showing statistics for Young Collection Total Time, Old Collection Total Time, All Collections Total Time, and All Collections Pause Time. Metrics include GC Count, Average GC Time, Maximum GC Time, Total GC Time, Average Pause, Longest Pause, and Sum of Pauses.

GC log interpretation for G1:

GC log interpretation for G1 displaying key performance indicators, including Throughput (99.82%), CPU Time (200 ms), and Latency metrics. The average pause GC time is 11.4 ms, with a maximum pause GC time of 50.0 ms. A bar chart shows the GC Duration Time Range, with most pauses in the 0-10 ms range (52.63%) and fewer in higher ranges. A table breaks down GC pause durations by number and percentage across time intervals.

The maximum GC pause time is approximately 50 ms.

JVM memory size report displaying allocated vs. peak memory usage. A table shows Heap with 3 GB allocated and 1.7 GB peak usage, Metaspace with 37.94 MB allocated and 37.24 MB peak usage, totaling 3.04 GB allocated and 1.74 GB peak. A bar chart visualizes the allocated and peak memory usage for both Heap and Metaspace.

💡 Note: It's normal to see slightly different GC pause times between GC logs and JFR reports, even if you're looking at the same code. The reason is that GC logs show events as they happen, while JFR takes samples and averages things out, which can cause slight differences.

Also, the system's state, such as how much CPU or memory is being used, can vary and affect how long each pause takes. In addition, there are 5 GC logs available, but I have used only the most recent one for monitoring purposes. For a more detailed report, try to combine all the logs and then feed them into the monitoring system.
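Pause figures like the average and maximum can also be computed directly from one or more GC log files. The sketch below assumes the unified logging format (-Xlog:gc*), where pause lines contain the word "Pause" and end with a duration in milliseconds. The class name GcLogPauseStats and the sample lines are hypothetical and are not taken from this article's logs.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: summarizes stop-the-world pauses found in GC log lines.
class GcLogPauseStats {

    // Captures the trailing duration of lines that mention a pause, e.g. "... 11.400ms".
    private static final Pattern PAUSE =
            Pattern.compile("\\bPause\\b.*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    // Lines from several log files can be concatenated and passed in together.
    static DoubleSummaryStatistics pauseStats(List<String> lines) {
        return lines.stream()
                .map(PAUSE::matcher)
                .filter(matcher -> matcher.find())
                .mapToDouble(matcher -> Double.parseDouble(matcher.group(1)))
                .summaryStatistics();
    }

    public static void main(String[] args) {
        // Illustrative lines only; real logs would be read with Files.readAllLines.
        List<String> sample = List.of(
                "[1.204s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 512M->64M(3072M) 11.400ms",
                "[2.871s][info][gc] GC(1) Concurrent Mark Cycle 120.500ms",
                "[9.310s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 900M->120M(3072M) 50.000ms");
        DoubleSummaryStatistics stats = pauseStats(sample);
        System.out.printf("pauses=%d avg=%.3f ms max=%.3f ms%n",
                stats.getCount(), stats.getAverage(), stats.getMax());
    }
}
```

Concurrent phases are skipped on purpose, since they do not stop application threads and should not count toward pause time.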

For ZGC:

Summary of ZGC in JFR Report:

GC Summary in JFR report for ZGC showing no data for Young and Old Collection Total Time (GC Count, Average GC Time, Maximum GC Time, and Total GC Time all marked N/A). All Collections Total Time includes a GC Count of 10, Average GC Time of 109.249 ms, Maximum GC Time of 149.097 ms, and Total GC Time of 1.092 s. All Collections Pause Time includes an Average Pause of 92.222 µs, Longest Pause of 863.200 µs, and Sum of Pauses of 4.703 ms

GC log interpretation for ZGC:

Key Performance Indicators for Z garbage collector, with metrics including Throughput at 99.998% and an average pause GC time of 0.105 ms, and a maximum pause GC time of 0.703 ms. The GC Duration Time Range bar chart shows that 88.89% of GC pauses fall within the 0-0.1 ms range, with smaller percentages in the 0.6-0.7 ms and 0.7-0.8 ms ranges. A table below provides a detailed breakdown of GC pause duration ranges.

Conclusion:

The maximum GC pause time for ZGC is around 0.7 ms, well under 1 ms. For the same application and load, the maximum pause time for G1 is around 50 ms, which is roughly 70 times higher. These numbers will vary heavily across different applications and loads, but the general idea remains the same: if your application requires very low latency, ZGC is a strong choice.

 JVM memory size report showing allocated vs. peak memory usage. The table lists Young Generation with a peak of 472 MB, Old Generation with a peak of 162 MB, and Metaspace with 38 MB allocated and 37 MB peak. The total allocated memory is 3.04 GB, with a peak usage of 567 MB. The bar chart visualizes the allocated and peak usage across Young, Old, and Metaspace regions.

Case Studies

Netflix adoption of ZGC

Recently, Netflix undertook a significant infrastructure change by transitioning from G1 GC to Generational ZGC on JDK 21 for their streaming services. This migration, affecting more than half of their critical infrastructure, emerged as one of their most impactful operational improvements in a decade.

Prior to the migration, Netflix struggled with high tail latencies caused by GC pauses in their gRPC and DGS Framework services. These pauses led to request cancellations that triggered retry mechanisms, creating a cascade of performance issues. Their previous attempts with non-generational ZGC had shown a concerning 36% increase in CPU utilization, making them initially hesitant about the transition.

The results of the migration exceeded expectations across all metrics. GC pause times dropped to sub-millisecond levels while simultaneously improving CPU utilization. The new system demonstrated a 10% performance improvement over non-generational ZGC and provided more consistent memory availability. The operational benefits were equally impressive, with the system requiring minimal tuning and eliminating the need for array pooling mitigations. The fixed 3% heap size overhead proved manageable, and the system handled large data refreshes more efficiently than its predecessor.

However, the migration revealed that certain workload types still performed better with alternative collectors. Throughput-oriented applications, workloads with spiky allocation rates, and long-running tasks with unpredictable object retention patterns sometimes showed better results with G1 or Parallel GC.

The key insight from this migration was that the expected performance trade-offs didn't materialize. Instead of sacrificing CPU efficiency for better pause times, Netflix achieved improvements in both areas. The default configurations proved sufficient for most services, and properly configured transparent huge pages significantly enhanced performance.

Check out the full blog here


Halodoc adoption of ZGC

Halodoc, Indonesia's leading healthcare platform, recently undertook a significant performance optimization initiative by transitioning from G1GC to ZGC across their Java applications. As a platform serving millions of users with critical healthcare services, Halodoc faced growing challenges with their existing G1GC implementation, particularly during peak usage periods when their microservices experienced high CPU overhead and memory management issues.

The company's engineering team identified that G1GC's reactive approach to memory management was causing inefficient resource utilization, especially in services with fluctuating workloads. This led them to roll out ZGC across their infrastructure of 60 microservices. They tuned the setup further by enabling generational mode (the ZGenerational option) for better handling of short-lived objects and by setting a soft limit parameter to prevent excessive memory usage.

The migration was executed methodically through canary deployments, allowing the team to monitor and fine-tune performance in real time. The results were impressive: average response times decreased by 20%, memory usage reduced by 25%, and system throughput increased by 30%. Garbage collection time also dropped by 10%, leading to more consistent application performance.

Read the full blog here

References

Inside.java Introducing Generational ZGC

JEP 474: ZGC: Generational Mode by Default

Java Platform, Standard Edition Java Flight Recorder Runtime...

JDK Mission Control

Java Z Garbage Collector: The Next Generation

Oracle Help Center HotSpot Virtual Machine Garbage Collection Tuning Guide

Dev.java: The Destination for Java Developers Overview of ZGC - Dev.java
