How does the lombok magic work underneath?
Parth
October 4, 2023

Exploring How Lombok Enhances Java with Annotation Processor

Much has already been written about what Lombok does and how it reduces boilerplate for Java developers. This post is not about how to use Lombok.

At Unlogged, we extended the Lombok implementation and were amazed to learn how flawlessly it works across different build systems, Java versions and IDEs.

Annotation processing, which dates back to J2SE 1.5 (the apt tool) and was standardized as the pluggable annotation processing API (JSR 269) in Java SE 6, does not let a processor change existing files. A processor can only create new source files or bytecode. This makes Lombok’s implementation intriguing, since it uses an annotation processor to modify existing Java classes during the compilation phase.

I want to explain how lombok achieves this feat, in detail.

Detailed Diagram of the internal working of Lombok
The diagram briefly outlines the steps that create the Lombok magic.

Using ShadowClassLoader

Lombok uses an Ant step called <span  class="white" >“mappedresources”</span> to rename all <span  class="white" >.class</span> files to <span  class="white" >.SCL.lombok</span>. In short, it hides the class files in its jar by renaming them.

Why is it important?

Namespace Isolation  

By renaming the .class files with the <span  class="white" >.SCL.lombok</span> ending, Lombok ensures that it doesn't interfere with or "contaminate" the namespace of any project that uses a JAR file containing Lombok. This means that when you use Lombok in your project, it won't clutter your code suggestions or auto-complete features in your IDE with anything other than Lombok's actual public API. 

Dependency Management

Similar to a tool called "jarjar," SCL allows Lombok to include its own dependencies (like ASM, a popular Java bytecode manipulation library) without imposing these dependencies on the projects that use the Lombok JAR. Lombok can use other libraries it needs to function correctly without forcing those libraries onto you when you use Lombok in your project.

Enhanced Debugging 

SCL also plays a role in debugging. It allows an agent that must be packaged in a JAR file to load everything except the SCL infrastructure from class files generated by your IDE. This is beneficial because it makes debugging easier. You can rely on your IDE's built-in auto-recompile features instead of having to run a full build every time you make changes to your code. Additionally, it can help with features like hot code replacement, which allows you to modify code during debugging without restarting your application.
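To make the renaming idea concrete, here is a minimal sketch of a class loader that looks classes up under a renamed suffix. It is not Lombok's ShadowClassLoader: the class name `ShadowLoaderSketch`, the in-memory map of resources, and the example class name `com.example.Hidden` are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical sketch, not Lombok's ShadowClassLoader: classes are looked up
// under a renamed suffix instead of the usual ".class".
class ShadowLoaderSketch extends ClassLoader {
    private final Map<String, byte[]> hiddenResources; // resource path -> class bytes
    private final String suffix;                       // for example ".SCL.lombok"

    ShadowLoaderSketch(ClassLoader parent, Map<String, byte[]> hiddenResources, String suffix) {
        super(parent);
        this.hiddenResources = hiddenResources;
        this.suffix = suffix;
    }

    // Maps a binary class name to the resource path it is hidden under.
    static String hiddenPath(String className, String suffix) {
        return className.replace('.', '/') + suffix;
    }

    // Called by loadClass() only after the parent loader failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = hiddenResources.get(hiddenPath(name, suffix));
        if (bytes == null) throw new ClassNotFoundException(name);
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Because ordinary class loaders and IDE indexers only look for `.class` entries, anything stored under the other suffix stays invisible to them and is reachable only through a loader like this one.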

Assembling the jar with SCL

<span  class="white" >“mappedresources”</span> is an Ant step, and I use Maven for unlogged-sdk. So I created a Maven plugin that renames files based on a regex just before the final jar is assembled.

Here is how it’s used.

 
 <plugin>
	<groupId>video.bug</groupId>
	<artifactId>rename-file-maven-plugin</artifactId>
	<version>1.0-SNAPSHOT</version>
	<executions>
		<execution>
			<phase>compile</phase>
			<goals>
				<goal>rename-file</goal>
			</goals>
		</execution>
	</executions>
	<configuration>
		<source>(.+)\.class</source>
		<target>$1.scl.unlogged</target>
		<workingDirectory>${build.directory}/classes/io/unlogged/processor</workingDirectory>
	</configuration>
</plugin> 
	

You can read more about how class loaders work here.

Discovery of Annotation processors

When you compile your Java code, the process happens in several steps, or "rounds." In each round, the compiler might find new files that haven't been processed yet. It asks each annotation processor whether it should do something with these files. The decision is based on another annotation called <span  class="pink" >@SupportedAnnotationTypes</span>, which tells the compiler which annotation types the processor is interested in. A processor can even use <span  class="pink" >*</span> to say it wants to handle all annotations; such a processor is also invoked for sources that contain no annotations at all.

<span  class="pink" >AnnotationProcessors</span> primarily implement two methods, <span  class="teal" >init()</span> and <span  class="teal" >process()</span>. The first method, <span  class="teal" >init()</span>, is called when the <span  class="pink" >AnnotationProcessor</span> is initialized, and it receives an instance of <span  class="pink" >ProcessingEnvironment</span> as an argument. The second method, <span  class="teal" >process()</span>, is called in every round in which the Java compiler finds a new set of files eligible to be processed (based on the supported annotation types described above).
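As a rough illustration of these two methods, here is a minimal processor sketch built on the standard `javax.annotation.processing` API. The class name `RoundLoggingProcessor` and its `rounds` counter are made up for this example and have nothing to do with Lombok's own processors.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;

// Hypothetical processor for illustration. A real one would be a public class
// listed in META-INF/services/javax.annotation.processing.Processor.
@SupportedAnnotationTypes("*") // "*" asks to be invoked for every annotation type
class RoundLoggingProcessor extends AbstractProcessor {
    int rounds; // how many times process() has been called

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env); // keeps env available in the protected processingEnv field
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        rounds++;
        return false; // false: do not claim the annotations, other processors still see them
    }
}
```

Returning `true` from `process()` instead would claim the annotations, which is the only thing a processor like Lombok's ClaimingProcessor needs to do.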

Lombok defines two annotation processors in its jar:

 
lombok.launch.AnnotationProcessorHider$AnnotationProcessor
lombok.launch.AnnotationProcessorHider$ClaimingProcessor 
	

I wasn’t very clear on the necessity of two processors, especially when <span  class="pink" >ClaimingProcessor</span> didn’t seem to do anything except return true, signaling to the compiler that the set of annotations has been processed. But the comment below, from inside the codebase, makes it clear.

 
// Normally we rely on the claiming processor to claim away all lombok annotations
// One of the many Java9 oversights is that this 'process' API has not been
// fixed to address the point that 'files I want to look at' and 'annotations
// I want to claim' must be one and the same and yet in Java 9 you can no longer have
// 2 providers for the same service. Thus if you go by module path,
// lombok no longer loads the ClaimingProcessor.
// This doesn't do as good a job, but it'll have to do.
// The only way to go from here, I think, is either 2 modules, or use reflection
// hackery to add ClaimingProcessor during our init.
	

Debugging the compilation phase

If you want a hook inside the compilation process, place a lombok jar on your class path and here is how you can run the build process with remote debugging enabled.

 
export MAVEN_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"
mvn clean package
	
 
export GRADLE_OPTS='-Xdebug -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005'
./gradlew clean build
	

With the above parameters, the build process will pause (suspend=y) and start the remote debug server (server=y) on port 5005. Now the debugger can connect to it.

In IntelliJ you can do this by creating a new Debug Profile of type “Remote JVM Debug”. Make sure the port number in your IntelliJ debug config matches the one you used to start the process.

Once you have created that, click the debug button and you will be inside the javac compilation process in debug mode. Remember to put a breakpoint at the entry point of the annotation processor, which is the <span  class="teal" >init()</span> method.

Patching Java Compilation

Lombok needs to work with different compilers (javac/eclipse), and different build systems (maven/gradle) - so we will come across a lot of patching.

What’s the Java 9 Silly warning?

How disableJava9SillyWarning suppresses the warning message that Java developers may otherwise see

Java 9 added warnings for code that uses reflection to reach into packages that a module does not export. The creators of Lombok think that these warnings are unnecessary, so disableJava9SillyWarning simply suppresses them. The comment inside the code explains what the developer was thinking.

 
// JVM9 complains about using reflection to access packages from a module that 
// aren't exported. This makes no sense; the whole point of reflection
// is to get past such issues. The only comment from the jigsaw team lead on this
// was some unspecified mumbling about security which makes no sense,
// as the SecurityManager is invoked to check such things. Therefore this warning
// is a bug, so we shall patch java to fix it.
	

You may have come across the <span  class="pink" >--add-opens</span> and <span  class="pink" >--add-exports</span> arguments when running with certain libraries or Java agents; they allow module A to access module B via reflection. This access was allowed to begin with, until Java 9, when warnings and limitations were added for such access. You can read more about it in this very succinct answer on stackoverflow by Nicolai Parlog.

What does the processor descriptor do?

Working of the Processor descriptor

The Processor Descriptor is like a set of rules. It follows a design pattern called the "delegator pattern". This pattern helps organize code by having different parts do different jobs.

Want() Method and Conditions

  • The Processor Descriptor has a method called <span  class="teal" >want()</span>. This method is like a decision maker and is called when the annotation processor starts.
  • If <span  class="teal" >want()</span> says "yes" (returns true), then the Processor Descriptor will be involved in the compilation process. If it says "no" (returns false), it won't be used.
  • There are two implementations of this method: <span  class="white" >EcjDescriptor</span> and <span class="white">JavacDescriptor</span>.
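The `want()` decision can be pictured with a small sketch. Everything here is hypothetical: the class names, the idea of passing the environment as a plain class name, and the package-prefix matching rules are invented to show the shape of the pattern, not how Lombok's descriptors actually decide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a processor descriptor; the environment is
// reduced to its class name to keep the sketch self-contained.
abstract class DescriptorSketch {
    abstract String name();
    abstract boolean want(String envClassName);
}

class JavacLikeDescriptor extends DescriptorSketch {
    String name() { return "javac"; }
    // Invented rule: accept environments that live in javac's packages.
    boolean want(String envClassName) { return envClassName.startsWith("com.sun.tools.javac."); }
}

class EcjLikeDescriptor extends DescriptorSketch {
    String name() { return "ecj"; }
    // Invented rule: accept environments that live in Eclipse JDT packages.
    boolean want(String envClassName) { return envClassName.startsWith("org.eclipse.jdt."); }
}

// The delegator asks every descriptor once at init time and keeps the willing ones.
class DelegatorSketch {
    private final List<DescriptorSketch> registered =
            Arrays.asList(new JavacLikeDescriptor(), new EcjLikeDescriptor());
    final List<DescriptorSketch> active = new ArrayList<>();

    void init(String envClassName) {
        active.clear();
        for (DescriptorSketch d : registered) {
            if (d.want(envClassName)) active.add(d);
        }
    }
}
```

Later calls to `process()` would then be forwarded only to the descriptors that ended up in `active`.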

How Lombok Tries to Find It

Lombok tries a few different ways to find the real <span  class="pink" >JavacProcessingEnvironment</span>:

  1. First, it looks for a field named "delegate" (for Gradle).
  2. If that doesn't work, it looks for a field named "processingEnv" (for Kotlin).
  3. If still no luck, it looks for a field named "val$delegateTo" (for IntelliJ IDEA version 2020.3 or later).
  4. If any of these fields contain a non-null value, Lombok uses the same logic on that value until it finds the real JavacProcessingEnvironment.

If none of these attempts succeed, Lombok disables its processor because it can't find the necessary environment to work.
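The lookup described above can be sketched as a recursive reflective unwrap. The field names come from the list above; the classes `UnwrapSketch`, `WrapperSketch` and `RealEnvSketch` are hypothetical stand-ins, and this version assumes the wrappers do not form a cycle.

```java
import java.lang.reflect.Field;

// Hypothetical stand-ins for the real environment and a build tool's wrapper.
class RealEnvSketch {}

class WrapperSketch {
    private final Object delegate; // same field name as in step 1 above
    WrapperSketch(Object delegate) { this.delegate = delegate; }
}

class UnwrapSketch {
    static final String[] CANDIDATE_FIELDS = {"delegate", "processingEnv", "val$delegateTo"};

    // Returns the first object of the target type reachable through the
    // candidate fields, or null when nothing suitable is found.
    static <T> T unwrap(Object wrapper, Class<T> target) {
        if (wrapper == null) return null;
        if (target.isInstance(wrapper)) return target.cast(wrapper);
        for (Class<?> c = wrapper.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (String fieldName : CANDIDATE_FIELDS) {
                T found = unwrap(readField(c, fieldName, wrapper), target);
                if (found != null) return found;
            }
        }
        return null;
    }

    // Reads a declared field reflectively; any failure is treated as "not present".
    private static Object readField(Class<?> owner, String name, Object instance) {
        try {
            Field f = owner.getDeclaredField(name);
            f.setAccessible(true);
            return f.get(instance);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }
}
```

For example, `UnwrapSketch.unwrap(new WrapperSketch(new WrapperSketch(env)), RealEnvSketch.class)` walks through both wrappers and hands back `env`; a `null` result corresponds to the case where Lombok gives up and disables its processor.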


Finding and Patching the ClassLoader

This process is about finding and preparing a special component in Java called a "ClassLoader." The main goal is to load a specific AnnotationProcessor called "lombok.javac.apt.LombokProcessor."

Three classes play a role in this process:

  1. AnnotationProcessorHider: This is loaded by the built-in ClassLoader and does some setup work.
  2. AnnotationProcessor: This is loaded by a special ShadowClassLoader (SCL). It handles certain tasks and delegates work to other parts.
  3. LombokProcessor: This is the actual AnnotationProcessor we want to use. It's loaded later after we get the ShadowClassLoader.

There's a special case when using the Plexus Apache Compiler, especially in the Eclipse IDE. In this case, Lombok uses a technique called "reflection" to add its own code to the ClassLoader. This allows Lombok to create an instance of LombokProcessor and set it up.

  • First, Lombok finds the ClassLoader it needs to use.
  • With this ClassLoader, it creates an instance of LombokProcessor.
  • Then, it calls the <span  class="teal" >init()</span> method of this processor to prepare it for its tasks.

LombokProcessor's Initialization

The real work of modifying Javac's fields happens inside the <span  class="teal" >LombokProcessor.init()</span> method.

Before diving in, we see a disable switch:

 
if (System.getProperty("lombok.disable") != null) {
	lombokDisabled = true;
	return;
}
	

The ConfigurationKeys Java source itself carries a scary notice about this switch:

"Disables lombok transformers. It does not flag any lombok mentions (so, <span  class="pink" >@Cleanup</span> silently does nothing), and does not disable patched operations in eclipse either. Don't use this unless you know what you're doing. (default: false)."

I did come across this thread where the developer moved the lombok dependency from 500 sub-modules to a global parent module, ending up with a

 
java.lang.ClassNotFoundException:
com.sun.tools.javac.processing.JavacProcessingEnvironment.
	

And they resolved it by disabling lombok during tests.

 
<!-- disable lombok during tests -->
<lombok.disable>true</lombok.disable>
	

I couldn't make much sense of this. Disabling Lombok should have resulted in non-compilable code (since <span  class="pink" >@Getter</span>/<span  class="pink" >@Setter</span> would no longer take effect, leaving methods missing). But we move on.

Getting a different JavacProcessingEnvironment

This part of the code is about obtaining another instance of the <span  class="white" >JavacProcessingEnvironment</span>, but with a slight difference in purpose.

Checking for Java Version

First, it checks whether the Java version being used is 9 or above. It does this by trying to load a class named <span  class="white" >"java.lang.Module"</span>. If that class doesn't exist, the runtime is older than Java 9, and it stops here.
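A minimal sketch of this kind of feature detection follows. The helper names `classExists` and `runningOnJava9OrLater` are made up for this example; only `Class.forName` and the class name `java.lang.Module` come from the description above.

```java
// Hypothetical helper names; the technique is plain Class.forName probing.
class ModuleCheckSketch {
    // True when a class with the given binary name can be loaded.
    static boolean classExists(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // java.lang.Module only exists on Java 9 and later.
    static boolean runningOnJava9OrLater() {
        return classExists("java.lang.Module");
    }
}
```

Probing for a class by name like this lets the same code run on older runtimes, because nothing in it references the newer type at compile time.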

Accessing the sun.misc.Unsafe Class

It then gets access to a class called <span  class="white" >"sun.misc.Unsafe"</span>. This class holds a special singleton instance named <span  class="white" >"theUnsafe"</span>. Note that there's a security check to ensure only trusted code can access this instance.

Purpose of Unsafe Operations

The code then does something seemingly strange: it gets the offset of a field named <span  class="white" >"first"</span> in a class called <span  class="white" >"Parent"</span> using the <span  class="white" >"theUnsafe"</span> instance. This offset is then used to set a value on a <span  class="white" >"Method"</span> instance. The purpose becomes clear when we look at the <span  class="teal" >Method.invoke()</span> method. The code sets the <span  class="white" >"override"</span> field of the <span  class="white" >"Method"</span> instance to <span  class="s-green" >"true"</span>. This field is used for access checks.

Why Setting "override" to True

In the <span  class="teal" >Method.invoke()</span> method, there's a check that fails if <span  class="white" >"override"</span> is not true. This is part of access control. So, by setting <span  class="white" >"override"</span> to true, the code ensures that it can access the necessary methods without access checks.
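The effect of that flag can be seen with the public reflection API. This sketch does not reproduce Lombok's Unsafe route; it uses `setAccessible(true)`, which in OpenJDK sets the same `override` field, and the class names `GuardedSketch` and `OverrideFlagSketch` are hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical class with a private method that outside callers may not invoke.
class GuardedSketch {
    private static String secret() { return "reached"; }
}

class OverrideFlagSketch {
    // Invokes GuardedSketch.secret() reflectively. With suppressChecks the
    // access check in Method.invoke() is skipped; otherwise the name of the
    // resulting exception is returned instead of the method's value.
    static String call(boolean suppressChecks) {
        try {
            Method m = GuardedSketch.class.getDeclaredMethod("secret");
            if (suppressChecks) m.setAccessible(true);
            return (String) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

With the two classes compiled as separate top-level classes, `call(true)` returns the method's value, while `call(false)` is rejected by the access check. For JDK-internal methods in packages that are not opened, `setAccessible(true)` itself is refused on recent Java versions, which is presumably why Lombok writes the flag directly through Unsafe.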

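Invoking a Method for Modules

Finally, the code invokes that method in a loop, once for each of 10 entries. In Lombok's source these entries appear to be packages of the javac compiler module that get opened to Lombok, rather than 10 separate modules. This part is relatively straightforward.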

This paper deep dives into the usage of unsafe class.

Getting Javac Filer

<span  class="white" >JavacFiler</span> is responsible for creating files where the Java compiler stores its results. The <span  class="white" >JavacProcessingEnvironment</span> can provide the <span  class="white" >JavacFiler</span>, but sometimes it's wrapped in another proxy class. To ensure Lombok gets access to it, there's a <span  class="teal" >getJavacFiler()</span> method that uses similar tactics to what we saw earlier in <span  class="teal" >getJavacProcessingEnvironment()</span>.

Post-Compile and No Force Round Dummies Hook: (unusually long name for a method!)

  1. It creates an anonymous <span  class="white" >ClassLoader</span> to prevent the original <span  class="white" >ClassLoader</span> from being closed when <span  class="white" >JavacProcessingEnvironment</span> finishes its work. The reason for this isn't entirely clear.
  2. Force Multiple Rounds In NetBeans Editor: This is specific to NetBeans and sets a value to indicate background compilation.
  3. Disable Partial Reparse In NetBeans Editor: Again, NetBeans-specific, it seems to disable partial reparse of Java code.
  4. Patching the <span  class="white" >JavaFileManager</span> Context: Lombok wraps the <span  class="white" >JavaFileManager</span> to intercept calls to <span  class="white" >getJavaFileForOutput</span>. This is crucial because Lombok needs to apply bytecode transformations before the class file is written to disk.


Initializing Trees and JavacTransformer

  1. Lombok proceeds to create an instance of <span  class="white" >com.sun.source.util.Trees</span> and a <span  class="white" >JavacTransformer</span>.
  2. The Trees class bridges the compiler API (JSR 199), the annotation processing API (JSR 269) and the Tree API, and <span  class="white" >JavacTransformer</span> follows the delegator pattern, passing information on to the actual handlers.

You can read more about the Tree API and the pluggable annotation processing API. The Java Compiler API allows a Java program to select and invoke a Java language compiler programmatically; its interfaces abstract the way a compiler interacts with its environment.

Completing Initialization

  1. The <span  class="teal" >init()</span> process for both <span  class="white" >LombokProcessor</span> and the original <span  class="white" >AnnotationProcessor</span> is now complete.
  2. The Java compiler can start sending the Abstract Syntax Tree (AST) of source code to Lombok's annotation processor through the <span  class="teal" >process()</span> method.

Modifying the Java AST in-place

The in-place modification of the Java code during compilation happens in three parts, as we saw in the call flow earlier. The first is the AST-based handlers, such as HandleFieldDefaults and, most notably, HandleVal. The second is annotation-based, and most handlers are of this type: <span  class="pink" >@Getter</span> / <span  class="pink" >@Setter</span> / <span  class="pink" >@Builder</span> are all annotation-based handler implementations. The third is the post-compile bytecode transformation covered under PostCompiler Transformers below.


Order of Transformations

  1. Each transformation handler can specify a priority level.
  2. The priority determines the order in which these handlers are executed during compilation.
  3. For example, <span  class="pink" >HandleVal</span> specifies a priority value slightly higher than that of <span  class="pink" >HandleDelegate</span>, so that it runs after it; it needs to work after certain code generation tasks have finished.
  4. <span  class="pink" >LombokProcessor</span> organizes these priorities and handlers, ensuring they are executed in the correct order during compilation rounds.

Here is how to set the priority when there is dependency between 2 handlers.

In <span  class="pink" >HandleVal</span>

 
@HandlerPriority(HANDLE_DELEGATE_PRIORITY + 100)
/* run slightly after HandleDelegate; resolution needs to work, so if the RHS expression is a call to a generated getter, we have to run after that getter has been generated */
public class HandleVal extends JavacASTAdapter {

And in HandleSynchronized we have

@HandlerPriority(value = 1024)
/* 2^10; @NonNull must have run first, so that we wrap around the statements */
public class HandleSynchronized extends JavacAnnotationHandler<Synchronized>
	

Execution of Handlers

  1. During each compilation round, <span  class="pink" >LombokProcessor</span> identifies the next priority value and the associated handlers.
  2. It then calls a <span  class="pink" >JavacTransformer</span>, which in turn invokes the handlers for that priority.
  3. Handlers at the same priority level are executed in the order they were originally loaded.
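The ordering rule can be sketched with a stable sort. This is an illustration only: `PrioritySketch` stands in for Lombok's `@HandlerPriority`, the handler classes are invented, and treating an unannotated handler as priority 0 is an assumption of this sketch. Lombok itself works through one priority level per round rather than sorting once, so only the resulting order is shown.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical annotation standing in for Lombok's @HandlerPriority.
@Retention(RetentionPolicy.RUNTIME)
@interface PrioritySketch { long value() default 0; }

interface HandlerSketch { String name(); }

@PrioritySketch(1024)
class LateHandler implements HandlerSketch { public String name() { return "late"; } }

class PlainHandlerA implements HandlerSketch { public String name() { return "a"; } }

class PlainHandlerB implements HandlerSketch { public String name() { return "b"; } }

class PriorityOrderSketch {
    // Assumption of this sketch: handlers without the annotation get priority 0.
    static long priorityOf(HandlerSketch h) {
        PrioritySketch p = h.getClass().getAnnotation(PrioritySketch.class);
        return p == null ? 0L : p.value();
    }

    // Lower values run first; List.sort is stable, so handlers that share a
    // priority keep the order in which they were loaded.
    static List<String> executionOrder(List<HandlerSketch> loaded) {
        List<HandlerSketch> sorted = new ArrayList<>(loaded);
        sorted.sort(Comparator.comparingLong(PriorityOrderSketch::priorityOf));
        List<String> names = new ArrayList<>();
        for (HandlerSketch h : sorted) names.add(h.name());
        return names;
    }
}
```

Loading the handlers in the order late, b, a yields the execution order b, a, late: the two unannotated handlers keep their load order and the prioritized one runs afterwards.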

AST Visitors and Annotation Handlers

  1. The visitor pattern is a way to visit different parts of your code.
  2. Lombok uses this pattern with the <span  class="pink" >JavacASTVisitor</span> interface.
  3. When you visit a part of your code, you get an instance of <span  class="pink" >JavacNode</span>, which is a wrapper around the actual code (represented by <span  class="pink" >JCTree</span>).
  4. <span  class="pink" >JCTree</span> is the core part of your code that Lombok wants to change or modify.
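A toy version of the pattern, under the assumption of a drastically simplified tree: `NodeSketch` plays the role of the wrapper node, `VisitorSketch` the role of the visitor interface, and the three node kinds are invented for this sketch. Lombok's real interfaces have many more callbacks.

```java
import java.util.ArrayList;
import java.util.List;

// Toy visitor interface with one callback per kind of node.
interface VisitorSketch {
    default void visitType(NodeSketch node) {}
    default void visitField(NodeSketch node) {}
    default void visitMethod(NodeSketch node) {}
}

// Toy wrapper node; in Lombok the wrapper is JavacNode around a JCTree.
class NodeSketch {
    final String kind; // "TYPE", "FIELD" or "METHOD" in this toy model
    final String name;
    final List<NodeSketch> children = new ArrayList<>();

    NodeSketch(String kind, String name) {
        this.kind = kind;
        this.name = name;
    }

    NodeSketch add(NodeSketch child) {
        children.add(child);
        return this;
    }

    // Dispatches to the callback matching this node's kind, then walks the children.
    void traverse(VisitorSketch visitor) {
        if (kind.equals("TYPE")) visitor.visitType(this);
        else if (kind.equals("FIELD")) visitor.visitField(this);
        else if (kind.equals("METHOD")) visitor.visitMethod(this);
        for (NodeSketch child : children) child.traverse(visitor);
    }
}

// Example visitor: collects the names of all fields it is shown.
class FieldCollector implements VisitorSketch {
    final List<String> fields = new ArrayList<>();

    @Override
    public void visitField(NodeSketch node) {
        fields.add(node.name);
    }
}
```

A handler written this way only overrides the callbacks it cares about, which is how a single traversal can serve many independent transformations.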

Modifying Code with Annotations

Let's say you have an annotation like <span  class="pink" >@Setter</span>. Lombok has a handler for it called <span  class="white" >HandleSetter</span>. It adds a new method to your code's structure (AST).

If the method being injected is a constructor, the injection code also scans for an existing compiler-generated default constructor and removes it.

It then registers the new method with the enclosing type, so that both the javac tree and Lombok's own node tree stay in sync.

Here is a code snippet from their code base.

 
public static void injectMethod(JavacNode typeNode, JCMethodDecl method) {
  JCClassDecl type = (JCClassDecl) typeNode.get();

  if (method.getName().contentEquals("<init>")) {
    //Scan for default constructor, and remove it.
    int idx = 0;
    for (JCTree def: type.defs) {
      if (def instanceof JCMethodDecl) {
        if ((((JCMethodDecl) def).mods.flags & Flags.GENERATEDCONSTR) != 0) {
          JavacNode tossMe = typeNode.getNodeFor(def);
          if (tossMe != null) tossMe.up().removeChild(tossMe);
          type.defs = addAllButOne(type.defs, idx);
          ClassSymbolMembersField.remove(type.sym, ((JCMethodDecl) def).sym);
          break;
        }
      }
      idx++;
    }
  }

  addSuppressWarningsAll(method.mods, typeNode, typeNode.getNodeFor(getGeneratedBy(method)), typeNode.getContext());
  addGenerated(method.mods, typeNode, typeNode.getNodeFor(getGeneratedBy(method)), typeNode.getContext());
  type.defs = type.defs.append(method);

  EnterReflect.memberEnter(method, typeNode);

  typeNode.add(method, Kind.METHOD);
}
	

PostCompiler Transformers

Bytecode-Level Transformations

  1. Lombok does not just modify your Java source code; it also performs a small number of transformations at the bytecode level. It uses the ASM library (org.objectweb.asm) for these bytecode transformations.
  2. One example of bytecode-level transformation is the <span  class="white" >"PreventNullAnalysisRemover"</span>. This transformer removes calls to the <span  class="teal" >Lombok.preventNullAnalysis</span> method from the generated bytecode.

Intercepting the Bytecode Writing Process

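As we saw earlier, Lombok wraps the <span  class="white" >JavaFileManager</span> that the <span  class="white" >JavacFiler</span> writes through with its own intercepting wrapper, referred to here as the <span  class="white" >"InterceptingFiler"</span>.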

When it's time for the Java compiler to save the final bytecode to class files on disk (like when you compile your code), the <span  class="white" >InterceptingFiler</span> steps in. It receives the bytecode, applies any necessary transformations (like removing calls to <span  class="teal" >Lombok.preventNullAnalysis</span>), and then writes the modified bytecode to the disk as expected.
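The interception idea can be sketched with the standard `javax.tools` forwarding classes. This is not Lombok's implementation: the class name `InterceptingFileManagerSketch`, the `transformOnClose` helper and the use of a `UnaryOperator<byte[]>` as the transformer are assumptions of this sketch, and a complete wrapper may need to cover other output methods as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.UnaryOperator;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.ForwardingJavaFileObject;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;

// Hypothetical wrapper: forwards everything to the original file manager,
// but rewrites class-file bytes just before they reach the disk.
class InterceptingFileManagerSketch extends ForwardingJavaFileManager<JavaFileManager> {
    private final UnaryOperator<byte[]> transformer;

    InterceptingFileManagerSketch(JavaFileManager original, UnaryOperator<byte[]> transformer) {
        super(original);
        this.transformer = transformer;
    }

    @Override
    public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String className,
            JavaFileObject.Kind kind, FileObject sibling) throws IOException {
        JavaFileObject target = super.getJavaFileForOutput(location, className, kind, sibling);
        if (kind != JavaFileObject.Kind.CLASS) return target; // only class files are rewritten
        return new ForwardingJavaFileObject<JavaFileObject>(target) {
            @Override
            public OutputStream openOutputStream() throws IOException {
                return transformOnClose(target.openOutputStream(), transformer);
            }
        };
    }

    // Buffers everything the compiler writes; on close, passes the bytes through
    // the transformer and writes the result to the real output stream.
    static OutputStream transformOnClose(OutputStream sink, UnaryOperator<byte[]> transformer) {
        return new ByteArrayOutputStream() {
            private boolean closed;

            @Override
            public void close() throws IOException {
                if (closed) return; // guard against double close
                closed = true;
                try (OutputStream out = sink) {
                    out.write(transformer.apply(toByteArray()));
                }
            }
        };
    }
}
```

Buffering until `close()` matters because a bytecode transformer needs the complete class file before it can rewrite anything.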

Usage of Modified Class Files

Once the modified class files are on your disk, they are ready for use by the Java Virtual Machine (JVM) when you run your application in the future. These modified class files can also be packaged into a JAR file if needed.
