
Spring / Spring AI interview questions

What are the Spring AI Chat Model options for controlling response determinism?

Response determinism in LLMs is primarily controlled through two inference parameters: temperature and top-p (nucleus sampling). Both are set via ChatOptions in Spring AI and work together to shape how randomly the model selects the next token at each step of generation.

Temperature rescales the probability distribution over the vocabulary before sampling: the model's logits are divided by the temperature and then passed through softmax. A temperature of 0.0 makes the model almost always choose the single highest-probability token (near-deterministic, often repetitive). A temperature of 1.0 samples from the unmodified distribution. Values between 0 and 1 sharpen the distribution toward the most likely tokens, and values above 1.0 flatten it, increasing randomness and variety. For factual tasks (Q&A, code generation, data extraction) use 0.0–0.3. For creative tasks (writing, brainstorming) use 0.7–1.0.
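To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax in plain Java. It is not Spring AI code and not how any particular provider implements sampling. The class name `TemperatureSoftmax`, the method `softmax`, and the three logit values are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

final class TemperatureSoftmax {

    // Converts raw logits to probabilities after dividing each logit by the temperature.
    // Hypothetical helper for illustration; real providers do this inside the model server.
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0.0) {
            // Temperature 0 is usually treated as greedy argmax rather than a division by zero.
            throw new IllegalArgumentException("temperature must be > 0");
        }
        double max = Arrays.stream(logits).max().orElseThrow();
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the max avoids overflow in exp() and does not change the result.
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.1}; // assumed scores for three candidate tokens
        for (double t : new double[] {0.1, 1.0, 2.0}) {
            System.out.println("T=" + t + " -> " + Arrays.toString(softmax(logits, t)));
        }
    }
}
```

With these assumed logits, the top token's share is above 0.99 at temperature 0.1, roughly 0.66 at 1.0, and roughly 0.50 at 2.0, which is why low temperatures behave almost deterministically.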

Top-p restricts sampling to the smallest set of tokens whose cumulative probability reaches p. A top-p of 0.9 means the model only considers the most likely tokens that together account for at least 90% of the probability mass, discarding the long tail of unlikely tokens. Most practitioners either tune temperature alone and leave top-p at 1.0, or tune top-p alone and leave temperature at 1.0. Adjusting both simultaneously is rarely necessary and harder to reason about.
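The nucleus selection described above can be sketched in a few lines of plain Java. This is an illustrative sketch, not Spring AI code: the class `NucleusFilter` and the method `nucleus` are hypothetical names. A real sampler would also renormalize the kept probabilities and then draw one token from them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class NucleusFilter {

    // Returns the indices of the smallest set of tokens, taken in descending
    // probability order, whose cumulative probability reaches topP.
    static List<Integer> nucleus(double[] probs, double topP) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            order.add(i);
        }
        // Sort token indices from most to least likely.
        order.sort(Comparator.comparingDouble((Integer i) -> probs[i]).reversed());

        List<Integer> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (int index : order) {
            kept.add(index);
            cumulative += probs[index];
            if (cumulative >= topP) {
                break; // the nucleus is complete; the remaining tail is discarded
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] probs = {0.05, 0.5, 0.15, 0.3}; // assumed next-token probabilities
        System.out.println(nucleus(probs, 0.9));
    }
}
```

With the assumed probabilities, a top-p of 0.9 keeps the three most likely tokens and drops the 0.05 tail token, while a top-p of 1.0 keeps all four.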

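import org.springframework.ai.chat.prompt.ChatOptions;

// Near-deterministic (code analysis, data extraction)
// Builder API as of Spring AI 1.0; earlier milestones used
// ChatOptionsBuilder.builder().withTemperature(...).withTopP(...).
ChatOptions factual = ChatOptions.builder()
    .temperature(0.1).topP(1.0).build();

// Creative (story, marketing copy)
ChatOptions creative = ChatOptions.builder()
    .temperature(0.85).topP(0.95).build();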

Note that even temperature 0 is not fully deterministic across all providers. Floating-point arithmetic is not associative, and parallel GPU computations can sum values in a different order from run to run, so you may see occasional token variation on identical inputs.

