🧠 AI with Java & Spring Boot – Part 2: Streaming ChatGPT Responses
Hey again, devs! 👋
In Part 1 of this series, we built a text summarizer using Java, Spring Boot, and the OpenAI GPT API. If you haven’t checked that out yet, I recommend starting there.
Now, in Part 2, let’s level up and make our app more dynamic by streaming responses from the ChatGPT API — just like the real thing. 🔥
💡 What’s Streaming?
When using OpenAI’s API, instead of waiting for the full response, you can stream the output as it's generated. This is:
- ⚡ Faster perceived response (the first tokens arrive immediately)
- 💬 More conversational
- 🧠 Great for chatbot-style apps
Let’s make that happen in Java!
⚙️ What We'll Build
We’ll extend our Spring Boot app to:
- Hit OpenAI’s API with stream=true
- Read data chunk-by-chunk using Server-Sent Events (SSE)
- Stream it to the client
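With stream=true, OpenAI responds with Server-Sent Events instead of one JSON body: each event is a data: line carrying a small JSON "delta". The payloads below are abbreviated for illustration (real chunks include fields like id and model too):

```
data: {"choices":[{"delta":{"content":"Why"}}]}

data: {"choices":[{"delta":{"content":" did"}}]}

data: [DONE]
```

The literal [DONE] sentinel marks the end of the stream, which is why our parsing code has to handle it specially.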
🛠️ Step-by-Step Guide
1. Enable Streaming in OpenAI Request
Update your OpenAIService.java:
public Flux<String> streamChatResponse(String userPrompt) {
    WebClient webClient = WebClient.builder()
            .baseUrl("https://api.openai.com/v1/chat/completions")
            .defaultHeader(HttpHeaders.AUTHORIZATION, "Bearer " + apiKey)
            .build();

    Map<String, Object> message = Map.of(
            "role", "user",
            "content", userPrompt
    );

    Map<String, Object> requestBody = Map.of(
            "model", model,
            "messages", List.of(message),
            "stream", true
    );

    ObjectMapper mapper = new ObjectMapper(); // reuse one mapper for every chunk

    return webClient.post()
            .contentType(MediaType.APPLICATION_JSON)
            .accept(MediaType.TEXT_EVENT_STREAM) // ask OpenAI for Server-Sent Events
            .bodyValue(requestBody)
            .retrieve()
            .bodyToFlux(String.class) // each element is one SSE data payload
            .flatMap(chunk -> {
                // WebClient normally strips the "data: " prefix when decoding SSE,
                // but guard against it just in case
                if (chunk.startsWith("data: ")) {
                    chunk = chunk.substring(6);
                }
                // OpenAI signals the end of the stream with a literal [DONE]
                if (chunk.trim().equals("[DONE]")) {
                    return Flux.empty();
                }
                try {
                    JsonNode content = mapper.readTree(chunk)
                            .at("/choices/0/delta/content");
                    // the first and last chunks carry only role/finish metadata, no content
                    return content.isMissingNode()
                            ? Flux.empty()
                            : Flux.just(content.asText());
                } catch (Exception e) {
                    return Flux.empty(); // skip malformed chunks rather than kill the stream
                }
            });
}
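To see what the flatMap above is doing in isolation, here is a standalone sketch of the per-chunk logic using only the JDK. A naive regex stands in for Jackson so the snippet runs with zero dependencies; keep the real JSON parser in your actual service:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DeltaExtractor {
    // Naive pattern for "content":"..." — illustration only; use a real JSON parser in production
    private static final Pattern CONTENT =
            Pattern.compile("\"content\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Mimics the per-chunk logic inside streamChatResponse. */
    static Optional<String> extractDelta(String chunk) {
        if (chunk.startsWith("data: ")) {
            chunk = chunk.substring(6);      // strip the SSE prefix if present
        }
        if (chunk.trim().equals("[DONE]")) {
            return Optional.empty();         // end-of-stream marker
        }
        Matcher m = CONTENT.matcher(chunk);
        // role-only or finish chunks have no "content" field → empty
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(
            extractDelta("data: {\"choices\":[{\"delta\":{\"content\":\"Hi\"}}]}"));
    }
}
```

Handy for unit-testing the parsing rules without spinning up WebClient at all.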
2. Create Controller Endpoint for SSE
@RestController
@RequestMapping("/api/ai")
public class AIStreamController {
private final OpenAIService openAIService;
public AIStreamController(OpenAIService openAIService) {
this.openAIService = openAIService;
}
@GetMapping(value = "/chat-stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> stream(@RequestParam String prompt) {
return openAIService.streamChatResponse(prompt);
}
}
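The service reads apiKey and model from fields, which we bound from configuration back in Part 1. Assuming property names like these (adjust to whatever your Part 1 setup uses), application.properties might look like:

```
openai.api.key=${OPENAI_API_KEY}
openai.model=gpt-3.5-turbo
```

Keeping the key in an environment variable means it never lands in version control.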
3. Test with curl or JS (Client-Side)
curl:
curl -N "http://localhost:8080/api/ai/chat-stream?prompt=Tell+me+a+joke"
JavaScript (SSE):
const evtSource = new EventSource("/api/ai/chat-stream?prompt=Tell+me+a+joke");
evtSource.onmessage = function(event) {
console.log("🧠", event.data);
// Update DOM here
};
✅ Output
Once everything's up, you’ll get a streaming ChatGPT-style response, token by token. Much more responsive and realistic!
🔚 What’s Next?
In Part 3, we’ll explore:
- Using LangChain4J to build AI agents in Java
- Creating a memory-aware chat session
- Maybe even file Q&A support with documents
💬 Thoughts?
💡 Suggestions for next topics?
Drop them in the comments below!