Get ahead
VMware offers training and certification to turbo-charge your progress.
Learn moreThis blog post is co-authored by our great contributor Thomas Vitale.
Ollama now supports all GGUF models from Hugging Face, allowing access to over 45,000 community-created models through Spring AI's Ollama integration, runnable locally.
We'll explore using this new feature with Spring AI. The Spring AI Ollama integration can automatically pull unavailable models for both chat completion and embedding models. This is useful when switching models or deploying to new environments.
Install Ollama on your system: https://ollama.com/download.
Tip: Spring AI also supports running Ollama via Testcontainers or integrating with an external Ollama service via Kubernetes Service Bindings.
Follow the dependency management guide to add the Spring AI BOM and the Spring AI Ollama boot starter to your project's Maven pom.xml
file or Gradle build.gradle
files.
Maven:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
Gradle:
implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'
Add the following properties to your application.properties
file:
spring.ai.ollama.chat.options.model=hf.co/bartowski/gemma-2-2b-it-GGUF
spring.ai.ollama.init.pull-model-strategy=always
hf.co/{username}/{repository}
ollama pull hf.co/bartowski/gemma-2-2b-it-GGUF
.Note: Auto-pulling is available in Spring AI 1.0.0-SNAPSHOT and the upcoming M4 release. For M3, pre-download models (ollama pull hf.co/{username}/{repository}
).
You can disable the embedding auto-configuration if not required: spring.ai.ollama.embedding.enabled=false
.
Otherwise, Spring AI will pull the mxbai-embed-large
embedding model if not available locally.
Using the configured Hugging Face model with Spring AI is straightforward and not different from using any other Spring AI model provider. Here's a simple example:
@Bean
public CommandLineRunner run(ChatClient.Builder builder) {
var chatClient = builder.build();
return args -> {
var response = chatClient
.prompt("Tell me a joke")
.call()
.content();
logger.info("Answer: " + response);
};
}
The integration of Ollama's support for Hugging Face GGUF models with Spring AI opens up a world of possibilities for developers.
We encourage you to explore the vast collection of models on Hugging Face and experiment with different models in your Spring AI projects. Whether you're building advanced natural language understanding systems, creative writing tools, or complex analytical applications, Spring AI and Ollama provide the flexibility to easily leverage these powerful models.
Remember to stay updated with the latest developments in Spring AI and Ollama, as this field is rapidly evolving. Happy coding!