Spring AI with Docker Model Runner

Releases | Mark Pollack | April 10, 2025

This blog post is authored by Eddú Meléndez.

Docker recently released Model Runner in Docker Desktop for Mac 4.40.0 on Apple silicon. Docker Model Runner provides a local inference API designed to be compatible with the OpenAI API, enabling easy integration with Spring AI as of the Spring AI 1.0.0-M7 release. Models are distributed as standard OCI artifacts on Docker Hub under the `ai` namespace.

Prerequisites

  • Download Docker Desktop for Mac 4.40.0.

  • Choose one of the following options to enable the Model Runner:

    Option 1:

  • Enable the Model Runner with TCP support: `docker desktop enable model-runner --tcp 12434`.

  • Set the base-url to `http://localhost:12434/engines`.

    Option 2:

  • Enable the Model Runner: `docker desktop enable model-runner`.

  • Use Testcontainers and set the base-url as follows:

// Socat forwards local traffic to the Model Runner's internal endpoint,
// which is only reachable from inside containers.
@Container
private static final SocatContainer socat = new SocatContainer()
		.withTarget(80, "model-runner.docker.internal");

@Bean
public OpenAiApi chatCompletionApi() {
	// Point the OpenAI client at the proxied Model Runner endpoint; an API key
	// is required by the client but ignored by Model Runner.
	var baseUrl = "http://%s:%d/engines".formatted(socat.getHost(), socat.getMappedPort(80));
	return OpenAiApi.builder().baseUrl(baseUrl).apiKey("test").build();
}

Next, pull the model with `docker model pull ai/gemma3` and confirm it is available locally with `docker model list`.
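If you enabled TCP access (Option 1), you can sanity-check the OpenAI-compatible endpoint directly before wiring up Spring AI. Here is a minimal request, assuming the default `/v1/chat/completions` path that the OpenAI client appends to the base-url:

http POST :12434/engines/v1/chat/completions model=ai/gemma3 messages:='[{"role":"user","content":"Say hello"}]'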

Dependencies

Go to start.spring.io, select the Spring Web, OpenAI, and Testcontainers dependencies, and generate the project.

The generated `pom.xml` should list the following dependencies:

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.ai</groupId>
	<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

<dependency>
	<groupId>org.springframework.ai</groupId>
	<artifactId>spring-ai-spring-boot-testcontainers</artifactId>
	<scope>test</scope>
</dependency>

Also, make sure the Spring AI BOM is present in the `dependencyManagement` section:

<dependencyManagement>
	<dependencies>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-bom</artifactId>
			<version>${spring-ai.version}</version>
			<type>pom</type>
			<scope>import</scope>
		</dependency>
	</dependencies>
</dependencyManagement>
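This post targets Spring AI 1.0.0-M7, so the `spring-ai.version` property should point at the milestone release. If you generated the project from start.spring.io it is already defined (along with the Spring Milestones repository that milestone versions require); it should look like this:

<properties>
	<spring-ai.version>1.0.0-M7</spring-ai.version>
</properties>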

Configuring Spring AI

To use Docker Model Runner, we need to configure the OpenAI client to point at the right endpoint and to use the model pulled earlier.

For Option 1, let’s configure `src/main/resources/application.properties`:

spring.ai.openai.api-key=ignored
spring.ai.openai.base-url=http://localhost:12434/engines
spring.ai.openai.chat.options.model=ai/gemma3

For Option 2 (using Testcontainers), let’s go to the `TestcontainersConfiguration` class, define the `SocatContainer` bean, and register the properties with a `DynamicPropertyRegistrar` bean:

import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.context.annotation.Bean;
import org.springframework.test.context.DynamicPropertyRegistrar;
import org.testcontainers.containers.SocatContainer;
import org.testcontainers.utility.DockerImageName;

@TestConfiguration(proxyBeanMethods = false)
class TestcontainersConfiguration {

    @Bean
    SocatContainer socat() {
        // Socat proxies local traffic to model-runner.docker.internal, which is
        // only resolvable from inside containers.
        return new SocatContainer(DockerImageName.parse("alpine/socat:1.8.0.1"))
                .withTarget(80, "model-runner.docker.internal");
    }

    @Bean
    DynamicPropertyRegistrar properties(SocatContainer socat) {
        // Register the Spring AI properties once the container's host and mapped port are known.
        return (registrar) -> {
            registrar.add("spring.ai.openai.base-url", () -> "http://%s:%d/engines".formatted(socat.getHost(), socat.getMappedPort(80)));
            registrar.add("spring.ai.openai.api-key", () -> "test-api-key");
            registrar.add("spring.ai.openai.chat.options.model", () -> "ai/gemma3");
        };
    }
}
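For reference, `./mvnw spring-boot:test-run` boots the application through the test launcher that start.spring.io generates next to this configuration. A sketch of that class, assuming a project named `demo` (the actual class names depend on your project):

import org.springframework.boot.SpringApplication;

public class TestDemoApplication {

    public static void main(String[] args) {
        // Start the real application with the Testcontainers configuration applied,
        // so the Socat container is running before the OpenAI client is built.
        SpringApplication.from(DemoApplication::main)
                .with(TestcontainersConfiguration.class)
                .run(args);
    }
}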

Chat example

Now, let’s create a simple controller:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    // Blocking call: returns the full response at once.
    @GetMapping("/chat")
    public String chat(@RequestParam String message) {
        return this.chatClient.prompt()
                .user(message)
                .call()
                .content();
    }

    // Streaming call: emits the response token by token as it is generated.
    @GetMapping("/chat-stream")
    public Flux<String> chatStream(@RequestParam String message) {
        return this.chatClient.prompt()
                .user(message)
                .stream()
                .content();
    }

}
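Note that the `ChatClient.Builder` is auto-configured by the OpenAI starter, so the controller contains nothing specific to Docker Model Runner; pointing the same code at the hosted OpenAI API is purely a configuration change.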

Run the application with `./mvnw spring-boot:test-run` (for Option 1, a plain `./mvnw spring-boot:run` works as well).

Using httpie, let’s call the `/chat` endpoint:

http :8080/chat message=="tell me a joke"

We can also call the `/chat-stream` endpoint:

http :8080/chat-stream message=="tell me a haiku about docker containers"

Tool example

Docker Model Runner also supports tool calling, as long as the model you pull supports it.

Create a `FunctionCallConfig` class and add a simple function:

import java.util.function.Function;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

@Configuration(proxyBeanMethods = false)
class FunctionCallConfig {

    @Bean
    @Description("Get the stock price")
    public Function<MockStockService.StockRequest, MockStockService.StockResponse> stockFunction() {
        return new MockStockService();
    }

    static class MockStockService implements Function<MockStockService.StockRequest, MockStockService.StockResponse> {

        public record StockRequest(String symbol) {}
        public record StockResponse(double price) {}

        @Override
        public StockResponse apply(StockRequest request) {
            // Hardcoded prices: 198 for AAPL, 114 for anything else (e.g. NVDA).
            double price = request.symbol().contains("AAPL") ? 198 : 114;
            return new StockResponse(price);
        }
    }

}
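Behind the scenes, Spring AI derives the tool’s JSON input schema from the `StockRequest` record and deserializes the model’s arguments back into it before invoking `apply`; the `@Description` text is what the model sees when deciding whether to call the tool.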

Now, let’s register the `stockFunction` bean by name in a new controller endpoint:

@GetMapping("/stocks")
public String stocks(@RequestParam String message) {
    return this.chatClient.prompt()
            .user(message)
            .tools("stockFunction")
            .call()
            .content();
}

Run the application with `./mvnw spring-boot:test-run` and call the `/stocks` endpoint:

http :8080/stocks message=="What's AAPL and NVDA stock price?"

The response should be something like `AAPL stock price is 198.0 and NVDA stock price is 114.0.` based on the hardcoded values we set.

Conclusion

Docker Model Runner allows you to iterate faster, stay local, and access an OpenAI-compatible API. It streamlines the development experience by enabling seamless integration with Spring AI’s OpenAI module, letting developers stay within their familiar inner-loop tooling. This empowers teams to build and test AI applications at their own pace: locally, securely, and efficiently. In the future, integration with Testcontainers will make it even easier to pull and run models on demand, further simplifying setup and testing workflows.
