Spring AI is a project in the Spring ecosystem that brings the power of artificial intelligence (AI) to Java developers.
It’s a Spring module that lets you create Java-based AI applications without unnecessary complexity.
💡 Spring AI is currently in a pre-1.0 release version, specifically at 0.8.0 snapshot. As a snapshot version, it is still under development, which means you might encounter unexpected behavior or code-breaking changes. 🌱🤖
Generative AI terms to know:
Let’s explore the generative AI that Spring AI uses under the hood:
Generative AI, also known as GenAI, is a fascinating field within artificial intelligence (AI) that focuses on creating new content in response to input prompts. Let’s break it down:
What Is Generative AI?
Generative AI allows users to input various prompts to generate fresh content across different media types, including:
Code: Generating code snippets or entire programs.
Text: Generating stories, poems, or other written content.
Sounds: Creating music, sound effects, or audio snippets.
Videos: Generating video clips or animations.
3D Designs: Creating 3D models or structures.
Images: Creating visual art, illustrations, or designs.
Prompt: In the AI world, a prompt is a text message given to the AI model. It includes context and a question.
What happens underneath: The prompt is tokenized, that is, the text string is broken into simple units called tokens. Depending on the model’s design, these units can be words, parts of words, or even single characters. The model then processes these tokens to understand the input and generate a response.
💡 In the case of Spring AI, the framework makes an API call to the model provider (OpenAI in this article) and hands the response back to your application.
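To make the idea of tokenization concrete, here is a minimal sketch in plain Java. It is only an illustration: production models use their own tokenizers (typically subword-based), and the class name `NaiveTokenizer` and both methods are hypothetical helpers, not part of Spring AI or OpenAI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: real model tokenizers are more sophisticated.
class NaiveTokenizer {

    // Word-level tokens: split the text on runs of whitespace.
    static List<String> wordTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Character-level tokens: every non-whitespace character becomes a token.
    static List<String> charTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(wordTokens("Spring AI rocks"));
        System.out.println(charTokens("AI"));
    }
}
```

The same prompt produces a different number of tokens depending on the scheme, which is why token limits are counted in tokens rather than in words or characters.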
What Spring AI Offers
1. Text-Based Generative AI:
At its core, Spring AI offers a text-based generative AI system. Users input text, and it responds with relevant text output. This basic yet powerful functionality allows for a wide array of text-based interactions and solutions tailored to meet diverse needs.
2. Integration with Industry-Leading Generative AI Models:
Spring AI integrates with several of the tech industry's most prominent generative AI models and platforms, including OpenAI, Azure OpenAI, Amazon Bedrock, Ollama, and Google Vertex AI. By building on these platforms, Spring AI gives users access to cutting-edge AI technology and a broad range of functionality.
3. Output Parser:
The output parser is a key feature of Spring AI that shapes how AI-generated content is processed and presented. It does two main things:
It instructs the prompt on the desired response format and structure, ensuring outputs are not generated at random but are instead presented in a specific format, such as JSON, and based on the context provided.
After issuing a prompt and receiving a response, the output parser organizes this response into a structured form, like a Java bean, a list, or possibly a map of values. This aspect of Spring AI is crucial, as it allows for the customization of responses to specific user requirements and the seamless integration of AI-generated content into various applications and workflows.
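The two responsibilities described above can be sketched in plain Java. This is not Spring AI's output parser API; `ListOutputSketch` and its methods are hypothetical names, and a hard-coded string stands in for a real model reply.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the idea behind an output parser.
class ListOutputSketch {

    // Responsibility 1: text appended to the prompt that states the expected format.
    static String formatInstructions() {
        return "Respond with a single line of comma-separated values and nothing else.";
    }

    static String buildPrompt(String question) {
        return question + "\n" + formatInstructions();
    }

    // Responsibility 2: turn the raw reply into a structured Java value.
    static List<String> parse(String rawResponse) {
        return Arrays.stream(rawResponse.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(buildPrompt("Name three tennis players."));
        // Assumed reply, used here in place of a real API call.
        System.out.println(parse("Player A, Player B , Player C"));
    }
}
```

Because the format is stated in the prompt and enforced again when parsing, the calling code can work with a `List<String>` instead of free-form text.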
4. Document Reader:
Spring AI also features document readers, which are among the standout offerings from the Spring AI development team. These document readers are particularly useful in the context of Retrieval Augmented Generation (RAG).
RAG provides your prompts with a specific context, ensuring that the AI doesn't rely on its entire extensive training knowledge. Instead, it can concentrate on particular documents you've supplied, containing the information you're interested in. This focus enhances the relevance and accuracy of the AI's responses.
Spring AI's document readers support various formats, including plain text, JSON, and PDF, plus many other document types through Apache Tika. This versatility ensures that developers can seamlessly integrate documents in these formats into their workflows, enabling the AI to extract and utilize the contained information effectively.
5. Vector Store Integration:
Spring AI is equipped with integration capabilities for several leading vector stores, including Chroma, Pinecone, Redis, Weaviate, Milvus, Azure Vector Search, and PostgreSQL (PG Vector). Later in this blog, we will delve into coding with PG Vector.
A vector store enables you to organize your document data into smaller chunks, or sub-documents. Before consulting a Large Language Model (LLM) with a question, you can query the vector store to see what it knows about your query. The vector store then provides these sub-documents, which serve as concise prompts for the LLM, instead of submitting the entire document for processing.
This approach is particularly useful given the limitations on the number of tokens that can be included in a prompt. By supplying sub-documents obtained from the vector store as prompts to the LLM, we can efficiently navigate these restrictions.
Essentially, a vector store functions much like a search engine, streamlining the process of querying large datasets and enhancing the efficiency of interactions with LLMs.
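As a rough illustration of splitting a document into sub-documents, the sketch below cuts text into chunks of a fixed number of words. It assumes a simple word-count limit; real splitters usually work on token counts, and `ChunkingSketch` is a hypothetical helper rather than a Spring AI class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: fixed-size word chunks as a stand-in for token-aware splitting.
class ChunkingSketch {

    // Split text into chunks containing at most maxWords words each.
    static List<String> chunk(String text, int maxWords) {
        if (maxWords <= 0) {
            throw new IllegalArgumentException("maxWords must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunk("one two three four five", 2));
    }
}
```

Each chunk can then be embedded and stored on its own, so a query only pulls back the few chunks that matter instead of the whole document.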
6. Prompt Templates:
Crafting an effective prompt involves more than merely posing a question and expecting an accurate response. This approach may suffice for straightforward inquiries, but for more complex scenarios, it's essential to employ refined templates. These templates are structured in a way that allows you to insert specific details about the question or instructions you're providing to the AI. Spring AI includes support for these prompt templates, enabling users to tailor their prompts more precisely and enhance the likelihood of obtaining relevant and accurate answers.
We will delve into some of Spring AI's key features in detail by demonstrating their implementation through code in the later part of this article.
What can we ask Spring AI?
Conversational question-and-answer sessions, mimicking a human-like interaction by remembering past context and responding accordingly.
Questions based on specific documents, provided they are supplied.
Questions related to coding and programming data.
Chapter 1: Building a Basic Generative AI Application with Spring Boot and OpenAI Integration.
Embark on developing a basic Generative AI application with Spring Boot, integrating Spring AI and OpenAI to enhance your app with advanced AI functionalities. Follow the steps below to set up the environment, add dependencies, and create an AI-powered REST endpoint.
Step 1: Initialize Your Spring Boot Application
Use the Spring Initializr to create a new Spring Boot project. For this application, you'll need at least the <span class="pink">Spring Web</span> dependency. Select your preferred project metadata (Group, Artifact, Project Name) and other settings like the Java version.
Step 2: Configure Maven Dependencies
To integrate advanced Generative AI capabilities into a Spring Boot application, particularly for projects aiming to harness the power of AI models like those offered by OpenAI, it's necessary to include specific dependencies and repositories in your Maven <span class="pink">pom.xml</span> file. This setup ensures your application has access to the latest AI-focused functionalities provided by the Spring framework. Below, we detail the steps and configurations required to transform a standard Spring Boot application into a Generative AI-enabled project.
Adding the Spring Snapshots Repository
The first step involves adding the Spring Snapshots repository to your project. This repository contains development versions of Spring libraries, which are frequently updated with the latest features and bug fixes. Since AI integrations in Spring might still be evolving rapidly, using the snapshot versions ensures you have access to the most current functionalities. The repository is defined as follows:
This configuration allows Maven to fetch dependencies from the specified Spring Snapshots repository.
Setting Java Version and Spring AI Version
Next, we define properties for the Java version and the version of the Spring AI library. Using properties simplifies the management of versions throughout the project. In this example, we're using Java 17 and specifying the version <span class="blue">0.8.0-SNAPSHOT</span> for the Spring AI library:
This approach promotes better maintainability, especially when upgrading to newer versions of Java or the Spring AI library.
Adding the Spring AI Dependency
Finally, to enable AI functionalities within your Spring Boot application, you must add a dependency for <span class="pink">spring-ai-openai-spring-boot-starter</span>. This starter brings in the auto-configuration and client classes needed to talk to OpenAI's APIs, including the chat client that the controller later in this chapter injects:
By incorporating this dependency, developers can access pre-configured AI-related beans and services, streamlining the development of AI-powered features within the Spring Boot framework.
Step 3: Create the REST Controller
The code snippet below establishes a REST endpoint <span class="pink">/topAthlete</span> in a Spring Boot application, which uses AI to generate information about top athletes in a given sport. When you call this endpoint and pass a sport through the <span class="pink">subject</span> query parameter, it instructs the AI (via the <span class="pink">ChatClient</span>) to compile a detailed list of prominent athletes in that sport. The list includes their earnings, family background, and the teams they are part of, formatted as JSON. The <span class="pink">AthleteController</span> manages this process, using the AI capabilities to fetch and present the requested data.
package com.unlogged.springai;

import org.springframework.ai.chat.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AthleteController {

    // Injected by Spring through the constructor
    private final ChatClient chatClient;

    public AthleteController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    // Example request: GET /topAthlete?subject=...
    @GetMapping("/topAthlete")
    public String topAthlete(@RequestParam("subject") String subject) {
        // Build the prompt from the query parameter and return the model's text reply
        return chatClient.call("give me the list of " + subject + " in the world in json format with their income, family, sports team");
    }
}
Step 4: Configure Your Environment Variables
Before running your Spring Boot application, it's important to securely set up the API key needed for the OpenAI integration.
4.1 Obtain an OpenAI API Key: To leverage OpenAI's services, you first need to secure an API key. Here’s how:
Once you have logged in, navigate to the API section and select “API Keys.” Here, you can generate a new API key.
Once generated, copy your new API key. You'll use this key to authenticate requests from your Spring Boot application to OpenAI's services.
To do this, you'll add an environment variable with your unique OpenAI API key. Use the following format, replacing YOUR_API_KEY with your actual OpenAI API key:
spring.ai.openai.api-key=YOUR_API_KEY
This sets the <span class="blue">spring.ai.openai.api-key</span> property, which is essential for authenticating your application's requests to OpenAI services. Supplying the key from the environment keeps it out of your application's source code; thanks to Spring Boot's relaxed binding, an operating-system environment variable named SPRING_AI_OPENAI_API_KEY maps to the same property. You can set the value in your IDE's run configuration or directly in your operating system's environment settings.
(Make sure to complete this step before launching your application to enable its AI features by successfully communicating with OpenAI's API.)
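If you want to fail fast when the key is missing, a small check like the sketch below can help. Spring Boot resolves the property on its own, so this is purely illustrative: the class `ApiKeyLookup` is hypothetical, and the variable name `SPRING_AI_OPENAI_API_KEY` assumes Spring Boot's relaxed binding of `spring.ai.openai.api-key`.

```java
import java.util.Map;

// Hypothetical helper: verifies that the API key is present in the environment.
class ApiKeyLookup {

    // Assumed environment-variable form of spring.ai.openai.api-key.
    static final String ENV_NAME = "SPRING_AI_OPENAI_API_KEY";

    // Takes the environment as a map so the logic can be exercised without real variables.
    static String resolveApiKey(Map<String, String> env) {
        String key = env.get(ENV_NAME);
        if (key == null || key.isBlank()) {
            throw new IllegalStateException(
                    "Set the " + ENV_NAME + " environment variable before starting the application");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        try {
            String key = resolveApiKey(System.getenv());
            // Print only the length so the secret never reaches the console.
            System.out.println("API key found (" + key.length() + " characters)");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```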
Step 5: Mock Calls to OpenAI
Frequent API calls to OpenAI can get expensive, and each response takes a while to arrive. You can mock the API calls with one click using Unlogged and continue with the rest of your workflow.
To set up the plugin in IntelliJ IDEA, go to File → Settings on Windows or IntelliJ IDEA → Settings on macOS.
We need to add a new dependency and a plugin item to our pom.xml file.
Run mvn clean to download the Unlogged dependencies and plugin:
Finally, reload the project using the Maven plugin. Click the Maven icon on the right side of the IntelliJ IDEA window, then select the refresh icon.
After configuring the environment variable with the OpenAI API key and launching your Spring Boot application, let’s test our REST endpoints and mock the API calls with Unlogged.io.
💡 In real-time, production-level applications, calling an API can be costly and time-consuming. To avoid delays and reduce expenses, especially if an API is down or incurs charges, we can simulate the API responses using the unlogged.io plugin. This approach allows us to bypass actual API calls in runtime, ensuring a smoother and more cost-effective workflow. Here’s how:
Chapter 2: Exploring the Broad Capabilities of Spring AI: The Prompt Template
The PromptTemplate class in Spring AI is a key component for managing and manipulating prompt templates. It uses the StringTemplate engine to handle templates and allows dynamic data insertion into these templates.
The Prompt Template lets developers shape the format and substance of the AI’s responses. This is useful when we want uniform responses or want to steer the AI’s behavior in a particular direction.
Think of Prompt Templates as a fill-in-the-blank game. Your application fills in the blanks with values from the current request or conversation before the prompt is sent to the AI. Some parts only show up if certain conditions are met.
Now, let's replicate the process of identifying the world's top athletes across various sports using a consistent prompt template. This template will serve as our base for different API requests, with the only modification being the specific sport's name. By altering just this keyword, we can seamlessly retrieve a list of the leading athletes in any given sport.
We only need to change the REST controller and the <span class="pink">application.properties</span> file of the project used above.
spring.ai.openai.api-key=YOUR_OPENAI_KEY
app.promptTemplate=give me the list of top {subject} players in the world in json format with their income, family, sports team
Here’s how it works:
The method <span class="pink">topAthlete</span> is declared, which is accessible to everyone (public), returns a text (String), and takes a <span class="pink">subject</span> as input from the user’s request.
A new <span class="pink">PromptTemplate </span>object is created using a predefined prompt template. This prompt template is stored in the <span class="pink">application.properties</span> file and is retrieved when the application starts.
The prompt template is <span class="blue">"give me the list of top {subject} players in the world in json format with their income, family, sports team".</span>
This prompt template is a text with a placeholder <span class="pink">{subject}</span> that will be replaced with actual values. The <span class="pink">PromptTemplate</span> object comes from the Spring AI module, which provides utilities for handling such templates.
The <span class="pink">render</span> method of the <span class="pink">PromptTemplate</span> object is called with a map that has <span class="pink">“subject”</span> as the key and the actual <span class="pink">subject</span> from the user’s request as the value. This method replaces the <span class="pink">{subject}</span> placeholder in the prompt template with the actual <span class="pink">subject</span>. The result is a new string where the placeholder has been replaced with the actual subject, and this string is stored in <span class="pink">renderedPrompt</span>.
Finally, a call is made to the chat service using the <span class="pink">chatClient</span> object and the <span class="pink">renderedPrompt</span> string. The response from the chat service is what the <span class="pink">topAthlete</span> method ultimately returns.
So, this method takes a subject from the user’s request, plugs it into a predefined prompt template stored in <span class="pink">application.properties</span> and handled by the Spring AI module, and uses that to make a call to a chat service. The response from the chat service is then returned to the user. This allows the application to dynamically generate requests for information about the top players in a given sport, complete with details about their income, family, and sports team. This design makes the application flexible and easy to configure.
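The placeholder substitution described above can be imitated in a few lines of plain Java. This sketch is not Spring AI's `PromptTemplate` implementation (which relies on the StringTemplate engine); `PlaceholderRenderSketch` is a hypothetical class that only handles simple `{name}` placeholders.

```java
import java.util.Map;

// Hypothetical stand-in for the rendering step: replaces {key} with the mapped value.
class PlaceholderRenderSketch {

    static String render(String template, Map<String, Object> model) {
        String result = template;
        for (Map.Entry<String, Object> entry : model.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", String.valueOf(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "give me the list of top {subject} players in the world "
                + "in json format with their income, family, sports team";
        // "tennis" plays the role of the subject query parameter.
        System.out.println(render(template, Map.of("subject", "tennis")));
    }
}
```

Keeping the template text in configuration and passing only the variable part at request time is what lets the same endpoint serve any sport without code changes.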
Chapter 3: Spring AI with Vector Databases.
3.1 Vector Database
Vector databases are designed specifically for artificial intelligence (AI) projects. They operate differently from regular SQL or NoSQL databases by focusing on finding similar items, rather than searching for exact matches. For instance, if you're searching for articles related to "innovative technology," a vector database would find articles discussing cutting-edge tech, even if they don't use the exact term "innovative technology." It locates texts with a similar essence or theme, rather than relying on exact word matches from the search query.
Here's a simple breakdown of how they're used:
Loading Data: First, you upload your information into the vector database.
Query Processing: When you have a question you want an AI to answer, you don't send the question directly to the AI. Instead, you first look for related information or documents in the vector database. These are pieces of data that are similar to your question in some way.
Retrieval Augmented Generation (RAG): The similar documents found in step 2 are then combined with your original question and sent to the AI. This process helps the AI understand the context of your question better, enabling it to provide a more accurate and relevant answer.
Think of vector databases as giving the AI a "background briefing" with related information so it can better understand and answer your question.
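The third step, combining retrieved documents with the original question, might look roughly like this. The prompt wording and the class `RagPromptSketch` are assumptions made for illustration; the retrieved documents are passed in as plain strings rather than fetched from a real vector database.

```java
import java.util.List;

// Hypothetical sketch of the "background briefing" step in RAG.
class RagPromptSketch {

    // Prepend the retrieved documents as context, then append the user's question.
    static String buildPrompt(List<String> retrievedDocs, String question) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the context below.\n\nContext:\n");
        for (String doc : retrievedDocs) {
            sb.append("- ").append(doc).append('\n');
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "Spring AI integrates with pgvector.",
                "pgvector is a PostgreSQL extension.");
        System.out.println(buildPrompt(docs, "Which database extension does the article use?"));
    }
}
```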
Spring AI provides a simplified API for seamless interaction with vector databases via the VectorStore interface.
public interface VectorStore {
    void add(List<Document> documents);
    Optional<Boolean> delete(List<String> idList);
    List<Document> similaritySearch(String query);
    List<Document> similaritySearch(SearchRequest request);
}
And the related <span class="pink">SearchRequest</span> builder:
public class SearchRequest {
    public final String query;
    private int topK = 4;
    private double similarityThreshold = SIMILARITY_THRESHOLD_ALL;
    private Filter.Expression filterExpression;

    public static SearchRequest query(String query) { return new SearchRequest(query); }
    private SearchRequest(String query) { this.query = query; }

    public SearchRequest withTopK(int topK) {...}
    public SearchRequest withSimilarityThreshold(double threshold) {...}
    public SearchRequest withSimilarityThresholdAll() {...}
    public SearchRequest withFilterExpression(Filter.Expression expression) {...}
    public SearchRequest withFilterExpression(String textExpression) {...}

    public String getQuery() {...}
    public int getTopK() {...}
    public double getSimilarityThreshold() {...}
    public Filter.Expression getFilterExpression() {...}
}
Brief Explanation:
VectorStore Interface:
The VectorStore interface abstracts the interaction with vector databases, making it easier for developers to work with vector data.
Here are the essentials regarding the VectorStore interface:
Data Insertion:
To insert data into the vector database, you create a <span class="pink">Document</span> object. This class encapsulates content from various data sources (such as PDFs or Word documents) and includes the following components:
Text Content: Represented as a string.
Metadata: Key-value pairs that provide additional details. For example, metadata might include the filename, author, or creation date.
<span class="teal">Upon insertion, the text content is transformed into a numerical array (vector embedding) using an embedding model (such as Word2Vec, GloVe, BERT, or OpenAI’s text-embedding-ada-002). Embedding models are used to convert words, sentences, or paragraphs into these vector embeddings.</span>
Vector Embeddings:
Vector embeddings are essential for similarity searches. They represent the data in a format suitable for comparison.
The vector database does not generate these embeddings itself; it stores and facilitates similarity searches based on existing embeddings.
Similarity Searches:
The <span class="pink">similaritySearch</span> methods in the VectorStore interface allow you to retrieve documents similar to a given query string.
Fine-tune your search using the following parameters:
k (Top K): Specify the maximum number of similar documents to return (often referred to as K nearest neighbors or KNN).
threshold: Set a similarity threshold (ranging from 0 to 1). Only documents with similarity scores above this threshold are returned.
Filter.Expression: Use this DSL expression to filter results based on metadata key-value pairs (similar to a SQL WHERE clause).
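To show how Top K and the similarity threshold interact, here is a small in-memory sketch. It is not how any `VectorStore` implementation works internally: `TopKSearchSketch` is hypothetical, it scores with cosine similarity, and it assumes documents scoring at or above the threshold are kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory search: filter by threshold, then keep the K best matches.
class TopKSearchSketch {

    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // store maps a document id to its embedding; returns ids ordered from most to least similar.
    static List<String> search(Map<String, double[]> store, double[] query, int topK, double threshold) {
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double score = cosineSimilarity(query, entry.getValue());
            // Assumption of this sketch: scores equal to the threshold are kept.
            if (score >= threshold) {
                scores.put(entry.getKey(), score);
            }
        }
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(topK, ids.size())));
    }

    public static void main(String[] args) {
        Map<String, double[]> store = Map.of(
                "a", new double[] {1, 0},
                "b", new double[] {0.9, 0.1},
                "c", new double[] {0, 1});
        System.out.println(search(store, new double[] {1, 0}, 2, 0.5));
    }
}
```

The threshold removes weak matches first, and Top K then caps how many of the remaining documents are returned, so a strict threshold can yield fewer than K results.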
Available Implementations
Some of the available implementations of the VectorStore interface include:
Azure Vector Search [AzureVectorStore]: The Azure vector store
Chroma [ChromaVectorStore]: The Chroma vector store
Milvus [MilvusVectorStore]: The Milvus vector store
Simple Vector Store [SimpleVectorStore]: A simple implementation of persistent vector storage, good for educational purposes
Weaviate [WeaviateVectorStore]: The Weaviate vector store
Embedding Client:
The EmbeddingClient interface simplifies the integration of embedding models within Spring AI. Its primary function is to convert text into numerical vectors (embeddings).
Key features of the EmbeddingClient interface:
Portability: It ensures easy adaptability across different embedding models. Developers can switch between various techniques or models with minimal code changes.
Simplicity: The EmbeddingClient handles the complexity of dealing with raw text data and embedding algorithms. Methods like embed(String text) and embed(Document document) provide straightforward ways to obtain embeddings.
Example methods in the EmbeddingClient interface:
public interface EmbeddingClient {
    List<Double> embed(String text);
    List<Double> embed(Document document);
    List<List<Double>> embed(List<String> texts);
    EmbeddingResponse embedForResponse(List<String> texts);

    default int dimensions() {
        return embed("Test String").size();
    }
}
The <span class="pink">embed</span> methods allow you to convert text into embeddings, accommodating single strings, structured <span class="pink">Document</span> objects, or batches of text. The returned values are lists of doubles representing the embeddings in numerical vector format.
The <span class="pink">embedForResponse</span> method provides a more comprehensive output, potentially including additional information about the embeddings.
The <span class="pink">dimensions</span> method quickly reveals the size of the embedding vectors, which is essential for understanding the embedding space and subsequent processing steps.
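To get a feel for the shape of these methods, the sketch below uses a deliberately crude embedding: one dimension per letter of the alphabet, holding that letter's count. It does not implement Spring AI's `EmbeddingClient` and captures no meaning the way a real model does; `LetterCountEmbedder` is a hypothetical class that only mirrors the method shapes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy embedder: 26 dimensions, one per letter a-z.
class LetterCountEmbedder {

    // Single text in, one vector out.
    static List<Double> embed(String text) {
        double[] counts = new double[26];
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        List<Double> vector = new ArrayList<>(counts.length);
        for (double count : counts) {
            vector.add(count);
        }
        return vector;
    }

    // Batch of texts in, one vector per text out.
    static List<List<Double>> embed(List<String> texts) {
        List<List<Double>> vectors = new ArrayList<>();
        for (String text : texts) {
            vectors.add(embed(text));
        }
        return vectors;
    }

    // Same trick as the default method above: embed a sample and measure the vector.
    static int dimensions() {
        return embed("Test String").size();
    }

    public static void main(String[] args) {
        System.out.println(dimensions());
        System.out.println(embed(List.of("Spring", "AI")).size());
    }
}
```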
OpenAI Embeddings:
Spring AI supports OpenAI’s text embedding models.
These embeddings measure the relatedness of text strings. An embedding is essentially a vector (list) of floating-point numbers.
The distance between two vectors reflects their relatedness: smaller distances indicate high relatedness, while larger distances suggest low relatedness.
💡 To summarise, the VectorStore interface simplifies vector database interactions, the EmbeddingClient streamlines embedding integration, and Spring AI embraces OpenAI’s powerful text embeddings for various natural language processing tasks.
In this article, we'll be focusing on PgVector, but feel free to select any option that best suits your needs.
3.2 Spring AI with PgVector and OpenAI in Action
PgVector: pgvector is an open-source PostgreSQL extension designed for vector similarity search. It allows you to store, query, and index vectors within your PostgreSQL database.
PgVector Features:
Facilitates both precise and approximate nearest-neighbor searches.
Calculates distances through L2 distance, inner product, and cosine distance metrics.
Works with any programming language that can connect to a PostgreSQL database.
Offers ACID compliance, point-in-time recovery, and additional PostgreSQL features.
By default, pgvector performs an exact nearest neighbor search, which guarantees perfect recall.
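The three metrics named in the feature list are easy to compute by hand, which helps when checking search results. The sketch below shows the underlying formulas in plain Java; it does not call pgvector, and `DistanceMetricsSketch` is a hypothetical class. Cosine distance is taken here as one minus cosine similarity.

```java
// Hypothetical sketch of the three metrics, computed on plain arrays.
class DistanceMetricsSketch {

    // L2 (Euclidean) distance: square root of the summed squared differences.
    static double l2Distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Inner product: sum of element-wise products.
    static double innerProduct(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Cosine distance: 1 minus the cosine of the angle between the vectors.
    static double cosineDistance(double[] a, double[] b) {
        double norms = Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b));
        return 1.0 - innerProduct(a, b) / norms;
    }

    public static void main(String[] args) {
        double[] a = {1, 0};
        double[] b = {0, 1};
        System.out.println(l2Distance(a, b));
        System.out.println(innerProduct(a, b));
        System.out.println(cosineDistance(a, b));
    }
}
```

L2 distance is sensitive to vector length, while cosine distance only looks at direction, so the choice of metric should match how the embedding model was designed to be compared.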
In this tutorial, we won't delve into the inner workings of pgVector. Instead, our focus will be on integrating it with Spring AI and leveraging its potential.
Spring AI + pgVector explained with an example
We'll be initializing a basic Spring Boot project through start.spring.io. This project will focus on storing documents as vector embeddings in a pgVector database, facilitated by Spring AI. Subsequently, we will execute a vector similarity search utilizing both Spring AI and pgVector to retrieve these documents based on their vector similarities.
To utilize pgVector and Spring AI in our project, certain dependencies must be included:
💡 Additionally, I've added the open-source Unlogged dependency so that I can use the Unlogged.io plugin for API testing in this project. The steps for adding the plugin to your project are outlined earlier in this article.
💡 Currently, the dependencies for Spring AI and pgvector cannot be added automatically through start.spring.io. Instead, you'll need to manually insert these dependencies into your pom.xml. Additionally, for Spring AI, you must manually specify the corresponding repository in your pom.xml.
Once these dependencies are added, our pom.xml file will appear as follows:
I am utilizing the pgVector image obtained from Docker Hub. To facilitate this, I composed a yml file to retrieve the image, enabling the execution of the Spring Boot application locally.
Additionally, I've included a dependency within the Docker Compose configuration to manage and orchestrate the container for the pgVector image alongside running the SpringBoot application on my local machine.
<span class="pink">docker-compose.yml</span> file should look like this:
💡 I adjusted the database's default host port to 5434 because the default port, 5432, was already in use. Should you encounter any database-related issues, consider altering the port number as a potential solution.
Next, we'll develop a REST controller that includes endpoints for storing documents in the pgVector database and retrieving documents based on specific keywords. This controller will utilize the VectorStore for adding documents and performing similarity searches to find relevant documents based on the input query.
The REST controller file should look like this:
package com.unlogged;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

@RestController
public class VectorController {

    private final VectorStore vectorStore;

    // Autowiring VectorStore through the constructor
    @Autowired
    public VectorController(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Endpoint to add documents to the vector store
    @GetMapping("/addDocuments")
    public void addDocuments() {
        List<Document> documents = List.of(
                new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
                new Document("The World is Big and Salvation Lurks Around the Corner"),
                new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))
        );
        vectorStore.add(documents);
    }

    // Endpoint to perform a similarity search and return a list of strings
    @GetMapping("/search")
    public List<String> searchDocuments(@RequestParam("query") String query) {
        // Using SearchRequest for a more detailed search configuration
        List<Document> results = vectorStore.similaritySearch(SearchRequest.query(query).withTopK(5));
        // Transform the document results into a List of Strings
        return results.stream()
                .map(doc -> "Id: " + doc.getId() + ", Content: " + doc.getContent())
                .collect(Collectors.toList());
    }
}
Explanation:
The REST controller provided manages interactions with a pgVector database for storing and retrieving vectorized representations of documents using two main endpoints:
<span class="pink">/addDocuments</span> Endpoint: When a GET request is made to this endpoint, it triggers the addition of a predefined set of documents into the pgVector database. Each document is represented by a <span class="pink">Document</span> object, which includes the document's content and optional metadata. These documents are converted into vector embeddings (if not already in that form) and stored in the database, ready for similarity searches. The documents in the example include both plain text and associated metadata, demonstrating the flexibility in what can be stored.
<span class="pink">/search</span> Endpoint: This GET endpoint accepts a query parameter named "query," which is expected to be a string. Upon receiving a request, it utilizes the <span class="pink">vectorStore.similaritySearch</span> method, passing a <span class="pink">SearchRequest</span> configured with the query string and specifying that the top 5 (TopK) most similar documents should be returned. The similarity search leverages the vector embeddings of the stored documents to find the ones most similar to the vector representation of the query string. The endpoint then formats the results into a list of strings, each summarizing a document's ID and content, showcasing how the vectorized search can identify and retrieve documents that are contextually related to the search query.
💡 Remember to convert the List of Documents into a List of Strings; otherwise, you'll end up receiving vector embeddings as a response.
The <span class="pink">VectorConfiguration</span> class configures a <span class="pink">VectorStore</span> bean using <span class="pink">PgVectorStore</span> to handle vector embeddings in a <span class="blue">PostgreSQL</span> database. It leverages Spring's <span class="pink">JdbcTemplate</span> for database interactions and an <span class="pink">EmbeddingClient</span> for embedding operations, enabling efficient storage and retrieval of vector data.
package com.unlogged;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.vectorstore.PgVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class VectorConfiguration {

    // Bean for VectorStore setup
    @Bean
    public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingClient embeddingClient) {
        return new PgVectorStore(jdbcTemplate, embeddingClient);
    }
}
The configuration requires an OpenAI API key and establishes a connection to a PostgreSQL database named vector_store, hosted on localhost at port 5434.