Much has already been written about what Lombok does and how it reduces boilerplate for Java developers. This post is not about how to use Lombok.
At Unlogged, we extended the Lombok implementation and were amazed to learn how flawlessly it works across different build systems, Java versions and IDEs.
Annotation processors, introduced in J2SE 1.5, are not meant to change existing files; they can only create new files or bytecode. This makes Lombok’s implementation intriguing, since it uses an annotation processor to modify existing Java classes during the compilation phase.
I want to explain how lombok achieves this feat, in detail.
Lombok uses an Ant step called <span class="white" >“mappedresources”</span> to rename all <span class="white" >.class</span> files to <span class="white" >.SCL.lombok</span>. In short, it hides the class files in a given jar by renaming them.
Why is it important?
By renaming the .class files with the <span class="white" >.SCL.lombok</span> ending, Lombok ensures that it doesn't interfere with or "contaminate" the namespace of any project that uses a JAR file containing Lombok. This means that when you use Lombok in your project, it won't clutter your code suggestions or auto-complete features in your IDE with anything other than Lombok's actual public API.
Similar to a tool called "jarjar," SCL allows Lombok to include its own dependencies (like ASM, a popular Java bytecode manipulation library) without imposing these dependencies on the projects that use the Lombok JAR. Lombok can use other libraries it needs to function correctly without forcing those libraries onto you when you use Lombok in your project.
SCL also plays a role in debugging. It allows an agent that must be packaged in a JAR file to load everything except the SCL infrastructure from class files generated by your IDE. This is beneficial because it makes debugging easier. You can rely on your IDE's built-in auto-recompile features instead of having to run a full build every time you make changes to your code. Additionally, it can help with features like hot code replacement, which allows you to modify code during debugging without restarting your application.
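The idea of loading classes that are stored under a renamed extension can be sketched with a small custom class loader. This is only an illustration and not Lombok's actual loader: the class name `RenamedClassLoader` and the helper `resourceNameFor` are made up for this example, and only the `SCL.lombok` suffix comes from the text above.

```java
import java.io.IOException;
import java.io.InputStream;

/** Toy class loader that finds classes stored under a renamed extension. */
class RenamedClassLoader extends ClassLoader {
    private final String suffix; // for example "SCL.lombok"

    RenamedClassLoader(ClassLoader parent, String suffix) {
        super(parent);
        this.suffix = suffix;
    }

    /** Maps a binary class name such as "a.b.C" to the resource "a/b/C.<suffix>". */
    static String resourceNameFor(String className, String suffix) {
        return className.replace('.', '/') + "." + suffix;
    }

    /**
     * Called by loadClass() only after the parent loader failed to find the class.
     * A regular loader looks for "a/b/C.class" and therefore never sees the renamed files.
     */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String resource = resourceNameFor(name, suffix);
        try (InputStream in = getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = in.readAllBytes();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Because ordinary class loaders and IDE indexers only look for `.class` entries, the renamed files stay invisible to them, while a loader like this one can still define the classes on demand.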
<span class="white" >“mappedresources”</span> is an Ant step, and I use Maven for unlogged-sdk. So, I created a Maven plugin that renames files based on a regex just before the final jar is assembled.
Here is how it’s used.
You can read more about how class loaders work here.
When you compile your Java code, annotation processing happens in several steps, or "rounds." In each round, the compiler may find new files that haven't been processed yet, and it asks each annotation processor whether it wants to handle them. The decision is based on an annotation called <span class="pink" >@SupportedAnnotationTypes</span>, which tells the compiler which annotation types the processor is interested in. A processor can even declare <span class="pink" >*</span> to say it wants to handle all annotations, in which case it is also invoked for classes that carry no annotations at all.
<span class="pink" >AnnotationProcessors</span> primarily implement two methods, <span class="teal" >init()</span> and <span class="teal" >process()</span>. The first method, <span class="teal" >init()</span>, is called when the <span class="pink" >AnnotationProcessor</span> is initialized, and it receives an instance of <span class="pink" >ProcessingEnvironment</span> as an argument. The second method, <span class="teal" >process()</span>, is called in every round in which the Java compiler finds a new set of files eligible to be processed (based on the supported annotation types described earlier).
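To make the two methods concrete, here is a minimal sketch of a processor built on the standard `javax.annotation.processing` API. The class name `RoundLoggingProcessor` is invented for this example and it has nothing to do with Lombok's own processors; it only logs what it sees in each round.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// "*" asks the compiler to call this processor for every annotation type.
@SupportedAnnotationTypes("*")
class RoundLoggingProcessor extends AbstractProcessor {
    private int round;

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        // The superclass stores env in the protected field processingEnv.
        super.init(env);
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        round++;
        processingEnv.getMessager().printMessage(
                Diagnostic.Kind.NOTE,
                "round " + round + ": " + roundEnv.getRootElements().size() + " root elements");
        // Returning false leaves the annotations unclaimed, so later processors still see them.
        // Returning true would claim them.
        return false;
    }
}
```

To have javac discover a processor like this one, it would need to be a public class listed in `META-INF/services/javax.annotation.processing.Processor` or passed with the `-processor` option.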
Lombok defines two annotation processors in its jar
I wasn’t very clear on the necessity of two processors, especially since <span class="pink" >ClaimingProcessor</span> didn’t seem to do anything except return true, signaling to the compiler that the set of annotations has been processed. But the comment below, from inside the codebase, makes it clear.
If you want a hook inside the compilation process, place a lombok jar on your class path and here is how you can run the build process with remote debugging enabled.
With the above parameters, the build process will pause (suspend=y) and start the remote debug server (server=y) at the port 5005. Now the debugger can connect to it.
In IntelliJ you can do this by creating a new debug configuration of type “Remote JVM Debug”. Make sure the port number in your IntelliJ debug configuration matches the one you used to start the process.
Once you have created that, click the debug button and you will be inside the javac compilation process in debug mode. Remember to put a breakpoint at the entry point of the annotation processor, which is the <span class="teal" >init()</span> method.
Lombok needs to work with different compilers (javac/eclipse), and different build systems (maven/gradle) - so we will come across a lot of patching.
Java 9 added warnings about code that reflectively accesses the internals of other modules. The creators of Lombok consider these warnings unnecessary, so disableJava9SillyWarning simply suppresses the warning message that Java developers may otherwise see. The comment inside the code explains what the developer was thinking.
You may have come across the <span class="pink" >--add-opens</span> and <span class="pink" >--add-exports</span> arguments when running certain libraries or Java agents; they allow module A to access module B via reflection. This access was allowed by default until Java 9, when warnings and limitations were added for it. You can read more about it in this very succinct answer on Stack Overflow by Nicolai Parlog.
Processor Descriptor is like a set of rules. It follows a design pattern called the "delegator pattern". This pattern helps organize code by having different parts do different jobs.
Lombok tries a few different ways to find the real <span class="pink" >JavacProcessingEnvironment</span>:
If none of these attempts succeed, Lombok disables its processor because it can't find the necessary environment to work.
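One generic way to get from a wrapper back to the real object is to follow a delegate field by reflection. The sketch below shows that tactic on hypothetical stand-in types (`Env`, `RealEnv`, `Wrapper`); the field name `delegate` and the depth limit of 10 are assumptions made for the example, not a statement of what Lombok's code looks for.

```java
import java.lang.reflect.Field;

class DelegateUnwrapper {
    /** Follows fields named "delegate" until an instance of target is found; returns null if none. */
    static <T> T unwrap(Object candidate, Class<T> target) {
        Object current = candidate;
        for (int depth = 0; current != null && depth < 10; depth++) {
            if (target.isInstance(current)) {
                return target.cast(current);
            }
            current = readDelegateField(current);
        }
        return null;
    }

    /** Looks for a "delegate" field in the wrapper's class hierarchy and reads it. */
    private static Object readDelegateField(Object wrapper) {
        for (Class<?> c = wrapper.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField("delegate");
                f.setAccessible(true);
                return f.get(wrapper);
            } catch (NoSuchFieldException e) {
                // Not declared here; keep looking in the superclass.
            } catch (IllegalAccessException | RuntimeException e) {
                return null; // field exists but cannot be read
            }
        }
        return null;
    }

    // Hypothetical stand-ins used only for this demonstration.
    interface Env {}
    static final class RealEnv implements Env {}
    static final class Wrapper implements Env {
        private final Env delegate;
        Wrapper(Env delegate) { this.delegate = delegate; }
    }
}
```

Returning `null` instead of throwing mirrors the behaviour described above: when no attempt succeeds, the caller can simply switch itself off.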
This process is about finding and preparing a special component in Java called a "ClassLoader." The main goal is to load a specific AnnotationProcessor called "lombok.javac.apt.LombokProcessor."
Three classes play a role in this process:
There's a special case when using the Plexus Apache Compiler, especially in the Eclipse IDE. In this case, Lombok uses a technique called "reflection" to add its own code to the ClassLoader. This allows Lombok to create an instance of LombokProcessor and set it up.
The real work of modifying Javac's fields happens inside the <span class="teal" >LombokProcessor.init()</span> method.
Before diving in, we see a disable switch
The ConfigurationKeys Java source itself carries a scary notice about this switch:
"Disables lombok transformers. It does not flag any lombok mentions (so, <span class="pink" >@Cleanup</span> silently does nothing), and does not disable patched operations in eclipse either. Don't use this unless you know what you're doing. (default: false)."
I did come across this thread where the developer moved the lombok dependency from 500 sub-modules to a global parent module, ending up with a
And they resolved it by disabling lombok during tests.
I couldn't make much sense of this. Disabling Lombok should have resulted in code that doesn't compile (since <span class="pink" >@Getter</span>/<span class="pink" >@Setter</span> would no longer take effect, leaving the generated methods missing). But we move on.
This part of the code is about obtaining another instance of the <span class="white" >Javac ProcessingEnvironment</span>, but with a slight difference in purpose.
First, it checks whether the Java version in use is 9 or above. It does this by trying to load a class named <span class="white" >"java.lang.Module"</span>. If that class doesn't exist, the runtime is older than Java 9, and the method stops here.
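The version check described here can be sketched in a few lines. The class and method names (`ModuleSystemCheck`, `hasModuleSystem`) are made up for the example; only the probe for `java.lang.Module` comes from the text.

```java
class ModuleSystemCheck {
    /** Returns true when running on Java 9 or newer, detected by the presence of java.lang.Module. */
    static boolean hasModuleSystem() {
        try {
            Class.forName("java.lang.Module");
            return true;
        } catch (ClassNotFoundException e) {
            return false; // Java 8 or older: there is no module system to open up
        }
    }
}
```

Probing for a class rather than parsing a version string avoids dealing with the change from "1.8" style version numbers to "9" and later.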
It then gets access to a class called <span class="white" >"sun.misc.Unsafe"</span>. This class holds a special singleton instance named <span class="white" >"theUnsafe"</span>. Note that there's a security check to ensure only trusted code can access this instance.
The code then does something seemingly strange: it gets the offset of a field named <span class="white" >"first"</span> in a class called <span class="white" >"Parent"</span> using the <span class="white" >"theUnsafe"</span> instance. This offset is then used to set a value on a <span class="white" >"Method"</span> instance. The purpose becomes clear when we look at the <span class="teal" >Method.invoke()</span> method. The code sets the <span class="white" >"override"</span> field of the <span class="white" >"Method"</span> instance to <span class="s-green" >"true"</span>. This field is used for access checks.
In the <span class="teal" >Method.invoke()</span> method, there's a check that fails if <span class="white" >"override"</span> is not true. This is part of access control. So, by setting <span class="white" >"override"</span> to true, the code ensures that it can access the necessary methods without access checks.
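The effect of the override flag can be observed with the public reflection API alone, since `setAccessible(true)` is the regular way to set it. The sketch below uses two made-up classes (`Vault` and `OverrideFlagDemo`) and deliberately does not touch `Unsafe`; as described above, Lombok writes the flag directly, presumably because `setAccessible` is itself subject to module checks on Java 9 and later.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// A separate top-level class, so its private method is really out of reach for the demo class.
class Vault {
    private static String secret() {
        return "opened";
    }
}

class OverrideFlagDemo {
    /** Invokes Vault.secret() reflectively, with or without the override flag set. */
    static String callSecret(boolean suppressChecks) {
        try {
            Method m = Vault.class.getDeclaredMethod("secret");
            if (suppressChecks) {
                m.setAccessible(true); // sets the override flag, so invoke() skips the access check
            }
            return (String) m.invoke(null);
        } catch (IllegalAccessException e) {
            return "denied"; // the access check in invoke() rejected the call
        } catch (NoSuchMethodException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `callSecret(true)` goes through because the flag is set, while `callSecret(false)` runs into the access check inside `Method.invoke()`.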
Finally, the code invokes a method for 10 modules in a loop. This part is relatively straightforward.
This paper takes a deep dive into the usage of the Unsafe class.
<span class="white" >JavacFiler</span> is responsible for creating files where the Java compiler stores its results. The <span class="white" >JavacProcessingEnvironment</span> can provide the <span class="white" >JavacFiler</span>, but sometimes it's wrapped in another proxy class. To ensure Lombok gets access to it, there's a <span class="teal" >getJavacFiler()</span> method that uses similar tactics to what we saw earlier in <span class="teal" >getJavacProcessingEnvironment()</span>.
You can read more about the Tree API and the pluggable annotation processing API. The Java Compiler API allows a Java program to select and invoke a Java language compiler programmatically; its interfaces abstract the way a compiler interacts with its environment.
The in-place modification of the Java code during compilation happens in three parts, as we saw in the call flow earlier. The first is the AST-based handlers, such as HandleFieldDefaults and, most notably, HandleVal. The second is annotation-based, and most handlers are of this type: <span class="pink" >@Getter</span> / <span class="pink" >@Setter</span> / <span class="pink" >@Builder</span> are all annotation-based handler implementations.
Here is how to set the priority when there is dependency between 2 handlers.
In <span class="pink" >HandleVal</span>
Let's say you have an annotation like <span class="pink" >@Setter</span>. Lombok has a handler for it called <span class="white" >HandleSetter</span>. It adds a new method to your code's structure (AST).
If it's a constructor, it might also do some extra checks to remove an existing default constructor.
It takes care of updating the overall code structure.
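As a rough illustration of what such a handler does, here is a toy version that works on a simplified stand-in for the AST instead of javac's real tree classes. Everything in it (`ToySetterHandler`, `ClassNode`, `setterName`) is hypothetical; the only behaviour borrowed from the description above is that the handler adds new methods to the class structure, and the sketch additionally assumes that an already existing method should be left alone.

```java
import java.util.ArrayList;
import java.util.List;

class ToySetterHandler {
    /** Toy stand-in for a class node in the AST: it tracks only field and method names. */
    static final class ClassNode {
        final List<String> fields = new ArrayList<>();
        final List<String> methods = new ArrayList<>();
    }

    /** Derives the conventional setter name, for example "name" becomes "setName". */
    static String setterName(String field) {
        return "set" + Character.toUpperCase(field.charAt(0)) + field.substring(1);
    }

    /** Adds one setter per field, skipping names that already exist. Returns how many were added. */
    static int handle(ClassNode node) {
        int added = 0;
        for (String field : node.fields) {
            String name = setterName(field);
            if (!node.methods.contains(name)) {
                node.methods.add(name); // mutate the tree in place, as a handler would
                added++;
            }
        }
        return added;
    }
}
```

A real handler builds full method declarations (modifiers, parameters, body) and attaches them to the compiler's tree, but the shape is the same: inspect the annotated node, then mutate its parent in place.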
Here is a code snippet from their code base.
As we saw earlier, Lombok replaced the standard <span class="white" >JavacFiler</span> with its own <span class="white" >"InterceptingFiler"</span>.
When it's time for the Java compiler to save the final bytecode to class files on disk (like when you compile your code), the <span class="white" >InterceptingFiler</span> steps in. It receives the bytecode, applies any necessary transformations (like removing calls to <span class="teal" >Lombok.preventNullAnalysis</span>), and then writes the modified bytecode to the disk as expected.
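The interception idea can be sketched as an output stream that buffers everything written to it and applies a transformation before handing the bytes to the real destination. This is a simplified illustration under my own assumptions, not Lombok's implementation: `PostProcessingStream`, its `transform` parameter and the `demo` helper are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;

/** Buffers all written bytes and forwards a transformed copy to the target on close(). */
class PostProcessingStream extends ByteArrayOutputStream {
    private final OutputStream target;
    private final UnaryOperator<byte[]> transform;
    private boolean closed;

    PostProcessingStream(OutputStream target, UnaryOperator<byte[]> transform) {
        this.target = target;
        this.transform = transform;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // forward the bytes only once
        }
        closed = true;
        try (OutputStream out = target) {
            out.write(transform.apply(toByteArray()));
        }
    }

    /** Small helper that pushes input through the stream and returns what reached the target. */
    static byte[] demo(byte[] input, UnaryOperator<byte[]> transform) {
        ByteArrayOutputStream disk = new ByteArrayOutputStream(); // stands in for the class file
        try (PostProcessingStream s = new PostProcessingStream(disk, transform)) {
            s.write(input, 0, input.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return disk.toByteArray();
    }
}
```

The compiler only sees an ordinary stream to write into, while the wrapper decides what actually ends up on disk.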
Once the modified class files are on your disk, they are ready for use by the Java Virtual Machine (JVM) when you run your application in the future. These modified class files can also be packaged into a JAR file if needed.