AI / Claude Models Basics Interview Questions

1. What is Claude and who makes it?
2. What are the current Claude model families and what is each one optimised for?
3. What are the API model IDs for the current Claude models?
4. What is a context window and what are the context window sizes for current Claude models?
5. What are the pricing tiers for current Claude models and how is pricing calculated?
6. What input and output modalities do current Claude models support?
7. What is extended thinking and how does it differ from adaptive thinking in Claude?
8. What platforms and cloud providers is Claude available on?
9. What is the knowledge cutoff for current Claude models?
10. What is the Claude model lifecycle: what do 'Active', 'Legacy', and 'Deprecated' mean?
11. What is Claude Fable 5 and what makes it different from Claude Opus 4.8?
12. What is Claude Mythos 5 and how does it differ from Claude Fable 5?
13. What is Claude Haiku 4.5 and what are its key characteristics?
14. What is prompt caching and how does it reduce costs when using Claude?
15. What is the Message Batches API and when should you use it?
16. What is tool use (function calling) in Claude and which models support it?
17. What is computer use in Claude and which models support it?
18. What are the different claude.ai plans and what does each include?
19. What is the effort parameter in Claude and which models support it?
20. What is streaming in Claude API responses and how do you use it?
21. What is the system prompt in Claude and how does it affect model behaviour?
22. What is zero data retention (ZDR) and which Claude models support it?
23. What is Claude's approach to safety and what are Constitutional AI principles?
24. What is the difference between an operator and a user in Claude's design?
25. What is Claude's context window and how are tokens counted?
26. What are Claude's rate limits and how are they structured?
27. What is Claude's approach to harmful content: what will and won't it do?
28. What is Claude's max_tokens parameter and how does it relate to the context window?
29. What is the temperature parameter in Claude and how does it affect responses?
30. What are Claude's multimodal capabilities: how does it process images and documents?
31. What are the claude.ai plans and what models does each tier include access to?
32. What is multi-turn conversation handling in Claude and how do you implement it?
33. What are the different stop_reason values in Claude API responses?
34. What is Claude's approach to honesty and what does it mean for Claude to be non-deceptive?
35. What is Claude Code and how does it differ from using Claude directly via the API?
36. What are the Anthropic SDKs and what languages are officially supported?
37. What is Anthropic's policy on model deprecation and how should developers prepare?
38. What are the key differences between Claude 4 and earlier Claude 3 generation models?


1. What is Claude and who makes it?

Claude is a family of state-of-the-art large language models (LLMs) built by Anthropic, an AI safety company founded in 2021. Claude excels at language, reasoning, analysis, coding, mathematics, and creative writing, and is designed with a strong focus on being helpful, harmless, and honest.

Anthropic makes Claude available in two main ways:

claude.ai — a consumer and business chat interface (Free, Pro, Team, Enterprise, and Max plans)
Claude API — a developer API for building applications, also available through Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry

Claude is trained using a technique called Constitutional AI (CAI), which guides the model's values and behaviour using a set of principles rather than relying purely on human feedback for every response. This approach is central to Anthropic's mission of building AI that is safe and beneficial.

Quiz

Who develops and maintains the Claude family of models?
✗ OpenAI
✗ Google DeepMind
✓ Anthropic
✗ Meta AI

What does Constitutional AI (CAI) refer to in the context of Claude?
✗ A legal framework governing Claude's deployment
✓ A training technique that uses a set of guiding principles to shape Claude's values and behaviour
✗ A hardware architecture optimised for running language models
✗ A security protocol for Claude's API endpoints

2. What are the current Claude model families and what is each one optimised for?

Claude models are organised into three tiers — Opus, Sonnet, and Haiku — representing a capability-speed-cost spectrum. As of mid-2026 the current flagship generation is Claude 4/5, alongside the newly released Claude Fable 5.

Current Claude model tiers
Tier	Optimised for	Example model
Opus	Maximum capability — complex reasoning, coding, agentic tasks, enterprise work	Claude Opus 4.8
Sonnet	Best balance of intelligence and speed — coding, agents, enterprise workflows	Claude Sonnet 5
Haiku	Fastest responses with near-frontier intelligence — high-throughput, latency-sensitive tasks	Claude Haiku 4.5

Additionally, Claude Fable 5 sits above the Opus tier as Anthropic's most capable widely-released model, described as providing next-generation intelligence for long-running agents.

Claude Mythos 5 shares Fable 5's specifications but is offered only through the invitation-only Project Glasswing programme for defensive cybersecurity workflows.

Quiz

Which Claude model tier is designed for the fastest responses at the lowest cost?
✗ Opus
✗ Sonnet
✓ Haiku
✗ Fable

Which Claude model is described as Anthropic's most capable widely-released model as of mid-2026?
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✗ Claude Haiku 4.5
✓ Claude Fable 5

3. What are the API model IDs for the current Claude models?

When making API calls you must specify the exact model ID string. Model IDs starting with the Claude 4.6 generation use a dateless format that is still a pinned snapshot (not an evergreen pointer). Earlier models used a dated format like claude-haiku-4-5-20251001.

Current model IDs (Claude API)
Model	API ID	Alias
Claude Fable 5	claude-fable-5	claude-fable-5
Claude Opus 4.8	claude-opus-4-8	claude-opus-4-8
Claude Sonnet 5	claude-sonnet-5	claude-sonnet-5
Claude Haiku 4.5	claude-haiku-4-5-20251001	claude-haiku-4-5

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-8",   # exact API model ID
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what a context window is."}
    ]
)
print(message.content[0].text)

The Bedrock IDs follow a different pattern (e.g. anthropic.claude-opus-4-83) and should be used when accessing Claude through Amazon Bedrock. Google Cloud IDs match the Claude API IDs but may append a regional variant.

Quiz

What is the Claude API ID for Claude Opus 4.8?
✗ claude-opus-4
✓ claude-opus-4-8
✗ anthropic.claude-opus-4-8
✗ claude-4-opus

What is the Claude API alias for Claude Haiku 4.5?
✗ claude-haiku
✗ claude-haiku-4-5-20251001
✓ claude-haiku-4-5
✗ claude-4-haiku

4. What is a context window and what are the context window sizes for current Claude models?

A context window is the total number of tokens (the sub-word units into which text, punctuation, and code are split) that a model can process in a single request. It covers the system prompt, all conversation history, tool definitions, and the model's own output. If a request would exceed the context window, older content must be removed or the request will fail.

Context windows by model
Model	Context window	Max output tokens
Claude Fable 5	1 million tokens	128,000 tokens
Claude Opus 4.8	1 million tokens	128,000 tokens
Claude Sonnet 5	1 million tokens	128,000 tokens
Claude Haiku 4.5	200,000 tokens	64,000 tokens

Key facts:

The 1M token context window is the default for Fable 5, Opus 4.8, and Sonnet 5 — no beta header is needed and it is billed at standard pricing
Claude Haiku 4.5 has a smaller 200k token window and 64k max output
On the Message Batches API, Opus 4.8, Opus 4.7, Opus 4.6, Sonnet 5, and Sonnet 4.6 support up to 300k output tokens using the output-300k-2026-03-24 beta header
A single request can include up to 600 images/PDF pages (100 for 200k models)
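The budgeting rule above can be sketched as a simple pre-flight check. This is a minimal illustration that uses the limits from the table; `LIMITS` and `fits_in_context` are hypothetical names, not part of the Anthropic SDK, and the input token count is assumed to have been measured elsewhere.

```python
# Limits copied from the table above: (context window, max output tokens).
LIMITS = {
    "claude-fable-5": (1_000_000, 128_000),
    "claude-opus-4-8": (1_000_000, 128_000),
    "claude-sonnet-5": (1_000_000, 128_000),
    "claude-haiku-4-5": (200_000, 64_000),
}


def fits_in_context(model: str, input_tokens: int, max_tokens: int) -> bool:
    """Return True if the input plus the requested output fits the model's limits."""
    window, max_output = LIMITS[model]
    if max_tokens > max_output:
        return False  # requested output exceeds the model's output cap
    return input_tokens + max_tokens <= window


# 150,000 input tokens plus 64,000 output tokens is 214,000 tokens in total.
print(fits_in_context("claude-haiku-4-5", 150_000, 64_000))  # False: over 200,000
print(fits_in_context("claude-opus-4-8", 150_000, 64_000))   # True: under 1,000,000
```

A check like this is only as good as the token count passed in, so treat it as a guard against obvious overruns rather than an exact prediction.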

Quiz

What is the context window size for Claude Opus 4.8?
✗ 200,000 tokens
✗ 500,000 tokens
✓ 1 million tokens
✗ Unlimited

What is the maximum output token limit for Claude Haiku 4.5?
✗ 128,000 tokens
✓ 64,000 tokens
✗ 200,000 tokens
✗ 32,000 tokens

5. What are the pricing tiers for current Claude models and how is pricing calculated?

Claude API pricing is charged per million tokens (MTok) — counting both input tokens (your prompt, system prompt, conversation history) and output tokens (Claude's response). Prices differ by model and reflect the capability-cost trade-off.

Current Claude API pricing (per million tokens)
Model	Input (per MTok)	Output (per MTok)
Claude Fable 5	$10	$50
Claude Opus 4.8	$5	$25
Claude Sonnet 5	$3 (intro: $2 until Aug 31 2026)	$15 (intro: $10)
Claude Haiku 4.5	$1	$5

Cost-saving features:

Prompt caching — reuse of cached prompt prefixes is billed at a significant discount (cache writes are more expensive; cache reads are cheaper than standard input)
Message Batches API — async batch processing at roughly 50% of standard pricing, ideal for large-scale, non-real-time workloads

Cloud platform pricing (Amazon Bedrock, Google Cloud) may differ from direct API pricing. See the Pricing page for full details including per-region variations.
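As a worked example of the per-token arithmetic, the sketch below estimates the cost of a single request from the standard (non-introductory) prices in the table. `PRICES` and `request_cost` are hypothetical helper names, and the batch discount is modelled as exactly 50%, which the text above only gives as approximate.

```python
from decimal import Decimal

# Standard prices in USD per million tokens, from the table above: (input, output).
PRICES = {
    "claude-fable-5": (Decimal("10"), Decimal("50")),
    "claude-opus-4-8": (Decimal("5"), Decimal("25")),
    "claude-sonnet-5": (Decimal("3"), Decimal("15")),
    "claude-haiku-4-5": (Decimal("1"), Decimal("5")),
}
MTOK = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int, batch: bool = False) -> Decimal:
    """Estimate the USD cost of one request; batch=True halves it (assumed 50% discount)."""
    input_price, output_price = PRICES[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / MTOK
    return cost / 2 if batch else cost


# 20,000 input tokens and 2,000 output tokens on Opus 4.8:
# 20,000 * $5/MTok = $0.10, plus 2,000 * $25/MTok = $0.05.
print(request_cost("claude-opus-4-8", 20_000, 2_000))              # 0.15
print(request_cost("claude-opus-4-8", 20_000, 2_000, batch=True))  # 0.075
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when summed over many requests.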

Quiz

Which Claude model costs $1 per million input tokens?
✗ Claude Fable 5
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✓ Claude Haiku 4.5

What is the main benefit of using the Claude Message Batches API?
✗ Lower latency for real-time applications
✓ Roughly 50% cost reduction for large-scale, non-real-time processing workloads
✗ Access to larger context windows
✗ Priority access to the newest model versions

6. What input and output modalities do current Claude models support?

All current Claude models accept the same input types and produce text output. The tiers differ mainly in which thinking mode they offer and in how many images a single request may contain.

Modality support across current models
Capability	Supported?	Notes
Text input	Yes	All models
Image input (vision)	Yes	All models — up to 600 images per request (100 for 200k models)
PDF input	Yes	Treated similarly to images for token budgeting
Text output	Yes	All models
Multilingual	Yes	Strong performance across major languages
Tool use / function calling	Yes	All models
Extended thinking	Haiku 4.5 only	Explicit thinking steps visible in output
Adaptive thinking	Opus and Sonnet (not Haiku 4.5)	Always-on for Fable 5
Audio input	No	Not currently supported
Video input	No	Use frame extraction for video analysis

Vision notes: Claude models can analyse images, charts, screenshots, UI elements, and document scans. For video analysis, the recommended approach is to extract frames and send them as a series of images. Claude Opus 4.5 and 4.6 showed improved vision capabilities — especially for multi-image tasks and computer use.
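To illustrate how an image travels alongside text, here is a minimal sketch that builds a user message with one base64-encoded image block followed by a question. `image_message` is a hypothetical helper, and the bytes below are a placeholder rather than a real PNG file.

```python
import base64


def image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Build a user message containing one base64-encoded image and a text question."""
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": base64.standard_b64encode(image_bytes).decode("ascii"),
                },
            },
            {"type": "text", "text": question},
        ],
    }


# Placeholder bytes stand in for the contents of a real image file.
message = image_message(b"\x89PNG...", "image/png", "What does this chart show?")
print(message["content"][0]["source"]["media_type"])
```

The resulting dictionary can be passed as an element of `messages` in `client.messages.create`. For video, the same shape extends naturally: extract frames and add one image block per frame before the text block.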

Quiz

Which input modality is NOT currently supported by any Claude model?
✗ Text input
✗ Image input
✓ Audio input
✗ PDF input

Which Claude model uniquely supports Extended Thinking (explicit step-by-step reasoning visible in output)?
✗ Claude Fable 5
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✓ Claude Haiku 4.5

7. What is extended thinking and how does it differ from adaptive thinking in Claude?

Both features enable Claude to reason more carefully before answering, but they work differently and are available on different models.

Extended thinking vs Adaptive thinking
Feature	Extended Thinking	Adaptive Thinking
What it does	Claude produces explicit blocks showing its reasoning steps, visible in the API response	Claude internally allocates more reasoning compute when a task requires it — no visible thinking blocks
Availability	Claude Haiku 4.5 only	Claude Opus 4.8, Sonnet 5, and Claude Fable 5 (always-on for Fable 5)
User control	Opt-in — you enable it with a parameter	Automatic on supported models; Fable 5 always uses it
Use case	When you want to see and verify the model's reasoning chain	General accuracy improvement, especially for complex tasks
Output impact	Adds thinking tokens to the response (billed as output tokens)	No additional visible output

# Enabling extended thinking on Claude Haiku 4.5
message = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # max tokens for thinking
    },
    messages=[{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}]
)
# Response includes a "thinking" content block followed by the answer

Interleaved thinking (thinking between tool calls) is automatic on models with adaptive thinking and requires the interleaved-thinking-2025-05-14 beta header on earlier models like Opus 4.5 and Sonnet 4.5.
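When extended thinking is enabled, the response content holds thinking blocks ahead of the text answer, so client code usually needs to separate the two. The sketch below shows one way to do that; `split_thinking` is a hypothetical helper, and the example blocks are stand-in objects with made-up contents rather than a real API response.

```python
from types import SimpleNamespace


def split_thinking(content_blocks):
    """Separate reasoning from the final answer in a list of response content blocks."""
    thinking = [b.thinking for b in content_blocks if b.type == "thinking"]
    answer = [b.text for b in content_blocks if b.type == "text"]
    return "\n".join(thinking), "\n".join(answer)


# Stand-in objects shaped like the SDK's content blocks (contents are made up).
example_blocks = [
    SimpleNamespace(type="thinking", thinking="3x = 22 - 7 = 15, so x = 5."),
    SimpleNamespace(type="text", text="x = 5"),
]
reasoning, answer = split_thinking(example_blocks)
print(answer)  # x = 5
```

With a real response the call would be `split_thinking(message.content)`. Blocks of any other type are ignored by this sketch.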

Quiz

On which Claude model is Extended Thinking (visible reasoning blocks) available?
✗ Claude Fable 5
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✓ Claude Haiku 4.5

How does Adaptive Thinking differ from Extended Thinking in Claude?
✗ Adaptive Thinking shows reasoning blocks; Extended Thinking hides them
✓ Adaptive Thinking automatically allocates more internal reasoning compute without showing visible thinking steps; Extended Thinking produces explicit reasoning blocks visible in the API response
✗ They are identical features with different names on different models
✗ Adaptive Thinking is only used on Haiku; Extended Thinking is used on Opus and Sonnet

8. What platforms and cloud providers is Claude available on?

Claude is available through multiple channels, each with its own model IDs, endpoint behaviour, and pricing structure.

Claude availability by platform
Platform	Best for	Model ID format
Claude API (direct)	Developers building directly with Anthropic	claude-opus-4-8, claude-sonnet-5, etc.
claude.ai	Consumer and business chat (Free, Pro, Team, Enterprise, Max)	Model selected in UI
Amazon Bedrock	AWS-native teams — data stays in AWS region	anthropic.claude-opus-4-83
Claude Platform on AWS	Same API shape as Claude API but on AWS infrastructure	claude-opus-4-8 (same as direct API)
Google Cloud Vertex AI	GCP-native teams	claude-opus-4-8 (same IDs, regional variants)
Microsoft Foundry	Azure-native teams	claude-sonnet-5, claude-fable-5, etc.

Important distinctions:

Amazon Bedrock and Claude Platform on AWS are different products with different model ID formats and lifecycle policies
Claude Platform on AWS uses the same model IDs as the direct Claude API and follows Anthropic's own deprecation schedule
Amazon Bedrock uses its own model ID format and sets its own retirement schedules
Starting with Sonnet 4.5, Bedrock offers global endpoints (dynamic routing) and regional endpoints (data sovereignty)

Quiz

Which platform uses the same model IDs as the direct Claude API and follows Anthropic's own deprecation schedule?
✗ Amazon Bedrock
✓ Claude Platform on AWS
✗ Google Cloud Vertex AI
✗ Microsoft Foundry

What is the Amazon Bedrock model ID format for Claude models?
✗ claude-opus-4-8
✓ anthropic.claude-opus-4-83
✗ claude-4-opus-bedrock
✗ aws-claude-opus-4

9. What is the knowledge cutoff for current Claude models?

Claude models are trained on data up to a specific date (the training data cutoff) and have the most reliable knowledge through a slightly earlier date (the reliable knowledge cutoff). Claude does not have access to real-time internet data during a conversation unless given a search tool.

Knowledge cutoffs by model
Model	Reliable knowledge cutoff	Training data cutoff
Claude Fable 5	January 2026	January 2026
Claude Opus 4.8	January 2026	January 2026
Claude Sonnet 5	January 2026	January 2026
Claude Haiku 4.5	February 2025	July 2025

Reliable knowledge cutoff indicates the date through which a model's knowledge is most extensive and reliable. Training data cutoff is the broader date range — the model may have some knowledge of events after the reliable cutoff but it is less complete and should be treated with more caution.

For tasks requiring current information (news, prices, live data), give Claude access to a web search tool or provide the relevant information directly in the prompt.

Quiz

What is the reliable knowledge cutoff for Claude Opus 4.8?
✗ July 2025
✗ February 2025
✓ January 2026
✗ March 2024

Which current Claude model has the earliest reliable knowledge cutoff?
✗ Claude Fable 5
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✓ Claude Haiku 4.5

10. What is the Claude model lifecycle — what do 'Active', 'Legacy', and 'Deprecated' mean?

Anthropic uses a defined set of lifecycle statuses for Claude models. Understanding these helps teams plan migration timelines and avoid unexpected outages.

Model lifecycle statuses
Status	Meaning	Action required?
Active	Fully supported and recommended for new development	No — this is the ideal state
Legacy	No longer receiving updates; may be deprecated in the future	Start planning migration
Deprecated	Still functional but a retirement date has been set; a replacement is recommended	Migrate before the retirement date
Retired	API calls to this model return an error	Must have already migrated

Key policy points:

Anthropic provides at least 60 days' notice before retiring any publicly released model
Customers with active deployments receive email notifications when a model they use is scheduled for retirement
Retirement dates on Anthropic-operated platforms (Claude API, Claude Platform on AWS, Microsoft Foundry) may differ from partner platforms (Amazon Bedrock, Google Cloud)
Anthropic has committed to long-term preservation of model weights even after retirement
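One way to act on these statuses is to keep an inventory of the models a team uses and flag any whose retirement date is approaching. The sketch below is illustrative only: `MODELS_IN_USE`, `migration_warnings`, the deprecated model ID, and its date are all made up for the example.

```python
from datetime import date

# Hypothetical inventory: model ID -> announced retirement date (None while Active).
MODELS_IN_USE = {
    "claude-opus-4-8": None,
    "example-deprecated-model": date(2026, 9, 1),  # made-up ID and date
}


def migration_warnings(today: date, warn_within_days: int = 60) -> list[str]:
    """List models whose retirement date falls within the warning window."""
    warnings = []
    for model, retires_on in MODELS_IN_USE.items():
        if retires_on is None:
            continue  # no retirement date announced
        days_left = (retires_on - today).days
        if days_left <= warn_within_days:
            warnings.append(f"{model}: {days_left} days until retirement")
    return warnings


print(migration_warnings(date(2026, 8, 1)))
# ['example-deprecated-model: 31 days until retirement']
```

The 60-day default mirrors the minimum notice period quoted above; a team that wants more lead time could raise it.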

Quiz

What does 'Deprecated' mean for a Claude model?
✗ The model has been permanently deleted and cannot be used
✓ The model is still functional but has a scheduled retirement date and Anthropic recommends migrating to a replacement
✗ The model receives no more safety updates but remains fully functional indefinitely
✗ The model is in beta testing

How much advance notice does Anthropic provide before retiring a publicly released Claude model?
✗ 24 hours
✗ 7 days
✓ At least 60 days
✗ 6 months minimum

11. What is Claude Fable 5 and what makes it different from Claude Opus 4.8?

Claude Fable 5 (claude-fable-5) is Anthropic's most capable widely-released model as of mid-2026, positioned above the Opus tier. It is designed for long-running agents, frontier intelligence tasks, and complex enterprise work.

Claude Fable 5 vs Claude Opus 4.8
Feature	Claude Fable 5	Claude Opus 4.8
API ID	claude-fable-5	claude-opus-4-8
Context window	1 million tokens	1 million tokens
Max output tokens	128,000	128,000
Thinking	Always-on adaptive thinking	Adaptive thinking (configurable)
Pricing (input)	$10 / MTok	$5 / MTok
Pricing (output)	$50 / MTok	$25 / MTok
Data retention	30-day minimum (no ZDR)	Available under ZDR
Availability	GA on Claude API, Bedrock, Vertex, Foundry	GA on all platforms

Key differences to be aware of:

Fable 5 is priced at 2× Opus 4.8 for both input and output tokens
Fable 5 requires a minimum 30-day data retention period and is not available under zero data retention (ZDR) arrangements — organisations with ZDR requirements should use Opus 4.8
Fable 5 uses always-on adaptive thinking, meaning it automatically applies extended reasoning to every request
Migration from Opus 4.8 to Fable 5 is described as mostly drop-in since both use the same Messages API and tool use patterns

Quiz

What data retention requirement makes Claude Fable 5 unsuitable for organisations with zero data retention (ZDR) agreements?
✗ Fable 5 has no data retention requirements
✗ Fable 5 requires 7-day minimum data retention
✓ Fable 5 requires 30-day minimum data retention and cannot be used under ZDR
✗ Fable 5 stores data permanently

Which thinking mode does Claude Fable 5 use?
✗ Extended thinking with visible reasoning blocks
✓ Always-on adaptive thinking applied automatically to every request
✗ No thinking capability
✗ Manual thinking mode activated by a prompt

12. What is Claude Mythos 5 and how does it differ from Claude Fable 5?

Claude Mythos 5 (claude-mythos-5) is a variant of Claude Fable 5 that is offered separately for defensive cybersecurity workflows as part of Anthropic's Project Glasswing. It is not publicly available — access is invitation-only with no self-serve sign-up.

Claude Fable 5 vs Claude Mythos 5
Feature	Claude Fable 5	Claude Mythos 5
API ID	claude-fable-5	claude-mythos-5
Availability	Generally available (GA)	Invitation-only via Project Glasswing
Safety classifiers	Yes — standard safety classifiers	Without standard safety classifiers
Intended use	General purpose, agents, enterprise	Defensive cybersecurity workflows
Context window	1 million tokens	1 million tokens
Max output	128,000 tokens	128,000 tokens
Pricing	$10 / $50 per MTok	Contact Anthropic

The key architectural difference is that Mythos 5 operates without the standard safety classifiers that Fable 5 uses. This makes it suitable for certain security research and offensive-capability testing in a controlled, vetted environment — but unsuitable and inaccessible for general use. Anthropic controls access tightly through Project Glasswing.

Quiz

What makes Claude Mythos 5 different from Claude Fable 5 in terms of safety?
✗ Mythos 5 has stricter safety classifiers than Fable 5
✓ Mythos 5 operates without the standard safety classifiers that Fable 5 uses
✗ They have identical safety settings
✗ Mythos 5 cannot process cybersecurity content

How does a team gain access to Claude Mythos 5?
✗ By upgrading to the Enterprise plan on claude.ai
✗ By requesting access through the Claude API dashboard
✓ Access is invitation-only through Project Glasswing; there is no self-serve sign-up
✗ By purchasing a Mythos add-on through Amazon Bedrock

13. What is Claude Haiku 4.5 and what are its key characteristics?

Claude Haiku 4.5 (claude-haiku-4-5-20251001, alias claude-haiku-4-5) is Anthropic's fastest and most cost-efficient model in the current generation. It is described as achieving near-frontier performance on coding, computer use, and agent tasks while being optimised for speed and low latency.

Claude Haiku 4.5 key characteristics
Property	Value
API ID	claude-haiku-4-5-20251001 (alias: claude-haiku-4-5)
Context window	200,000 tokens
Max output tokens	64,000 tokens
Pricing (input)	$1 per million tokens
Pricing (output)	$5 per million tokens
Thinking	Extended thinking (opt-in, explicit reasoning blocks)
Context awareness	Yes — tracks its token budget throughout a conversation
Best for	High-throughput, latency-sensitive tasks; customer service; simple classification

Extended thinking on Haiku 4.5: unlike the Opus and Sonnet tiers, which use adaptive thinking, Haiku 4.5 supports the explicit extended thinking mode, where reasoning steps appear as visible thinking content blocks in the API response. This is opt-in via the thinking parameter.

Haiku 4.5 was announced as matching Sonnet 4's performance on coding, computer use, and agent tasks while costing significantly less per token.

Quiz

What is the context window size for Claude Haiku 4.5?
✗ 1 million tokens
✗ 500,000 tokens
✓ 200,000 tokens
✗ 128,000 tokens

What type of thinking does Claude Haiku 4.5 support?
✗ Always-on adaptive thinking
✗ No thinking capability
✓ Extended thinking: opt-in, with visible reasoning blocks in the API response
✗ Interleaved thinking only

14. What is prompt caching and how does it reduce costs when using Claude?

Prompt caching allows Anthropic to store a copy of a prompt prefix (such as a long system prompt, documentation, or conversation history) so that subsequent requests reusing that prefix are billed at a much lower rate than re-sending it fresh each time.

Prompt caching pricing structure
Token type	Cost vs standard input
Cache write	~25% more expensive (one-time cost to store the prefix)
Cache read (hit)	~90% cheaper than standard input tokens
Standard input (miss)	Full input price

# Example: using prompt caching with a long system prompt
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert assistant...[5000 token system prompt]...",
            "cache_control": {"type": "ephemeral"}  # mark this prefix for caching
        }
    ],
    messages=[{"role": "user", "content": "What is the main point of section 3?"}]
)

When to use prompt caching:

Long system prompts reused across many requests
Large reference documents (codebases, manuals, books) that are constant across a session
Long conversation histories in multi-turn applications
Few-shot example sets provided in every request

Cache entries expire after a period of inactivity (5 minutes by default; a 1-hour TTL beta is available). The cache is per-organisation, not per-user.
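The pricing structure above implies that caching pays for itself from the second request onwards: one write at about 1.25x plus one read at about 0.10x costs less than two uncached sends at 1x each. The sketch below works through that arithmetic. `prefix_cost` is a hypothetical helper, the multipliers are the approximate figures from the table, and every request is assumed to arrive before the cache entry expires.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # "~25% more expensive" from the table above
CACHE_READ_MULTIPLIER = 0.10   # "~90% cheaper" from the table above


def prefix_cost(prefix_tokens: int, requests: int, input_price_per_mtok: float, cached: bool) -> float:
    """Cost in USD of sending the same prefix `requests` times (requests >= 1)."""
    base = prefix_tokens / 1_000_000 * input_price_per_mtok
    if not cached:
        return base * requests
    # The first request writes the cache; every later request reads from it.
    return base * CACHE_WRITE_MULTIPLIER + base * CACHE_READ_MULTIPLIER * (requests - 1)


# A 5,000-token system prompt sent 100 times at $5 per million input tokens.
uncached = prefix_cost(5_000, 100, 5.0, cached=False)
cached = prefix_cost(5_000, 100, 5.0, cached=True)
print(f"uncached ${uncached:.3f}, cached ${cached:.3f}")
```

For this example the prefix costs roughly $2.50 uncached against roughly $0.28 cached. Only the cached prefix is covered here; the uncached remainder of each prompt and all output tokens are billed at their normal rates.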

Quiz

Approximately how much cheaper is a prompt cache read hit compared to standard input token pricing?
✗ 10% cheaper
✗ 50% cheaper
✓ ~90% cheaper
✗ The same price

What is the 'cache_control' parameter used for in Claude API calls?
✗ Controlling how Claude formats its response
✓ Marking a specific text prefix in the prompt to be stored in the prompt cache for cheaper re-use
✗ Setting the maximum number of cached responses per hour
✗ Enabling or disabling response caching at the API gateway level

15. What is the Message Batches API and when should you use it?

The Message Batches API allows you to submit a large number of Claude API requests asynchronously in a single batch, receiving results once all requests are processed. It is designed for large-scale, non-time-sensitive workloads.

Message Batches API vs standard API
Feature	Standard Messages API	Message Batches API
Execution	Synchronous — response returned immediately	Asynchronous — submit batch, poll for results
Pricing	Standard per-token pricing	~50% of standard pricing
Max output tokens	Standard limits	Up to 300k with output-300k beta header (selected models)
Latency	Real-time (<1 min typical)	Hours — not suitable for real-time apps
Max requests per batch	N/A	10,000 requests per batch
Use case	Interactive apps, chatbots, real-time tools	Data processing, evals, bulk content generation

# Submitting a batch of requests
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "request-1",
            "params": {
                "model": "claude-opus-4-8",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Summarise: ..."}]
            }
        },
        # ... up to 10,000 requests
    ]
)

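# Poll until processing has ended; results are only available after that
import time

while client.messages.batches.retrieve(batch.id).processing_status != "ended":
    time.sleep(60)

# Each entry carries the custom_id of the request it answers
for entry in client.messages.batches.results(batch.id):
    print(entry.custom_id, entry.result.type)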

Models supported for 300k batch output: Claude Opus 4.8, Opus 4.7, Opus 4.6, Sonnet 5, and Sonnet 4.6 support up to 300k output tokens per request in a batch when the output-300k-2026-03-24 beta header is included.
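Workloads larger than the per-batch limit have to be split across several batches. The sketch below shows one way to do the chunking, taking the 10,000-request figure from the table above. `chunk_requests` and the generated request list are hypothetical and exist only for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_REQUESTS_PER_BATCH = 10_000  # per-batch limit quoted in the table above


def chunk_requests(requests: Iterable[dict], size: int = MAX_REQUESTS_PER_BATCH) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` requests."""
    iterator = iter(requests)
    while chunk := list(islice(iterator, size)):
        yield chunk


# 25,000 made-up requests become batches of 10,000, 10,000 and 5,000.
all_requests = [
    {
        "custom_id": f"request-{i}",
        "params": {
            "model": "claude-opus-4-8",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": f"Summarise document {i}"}],
        },
    }
    for i in range(25_000)
]
print([len(chunk) for chunk in chunk_requests(all_requests)])  # [10000, 10000, 5000]
```

Each chunk can then be passed as `requests=chunk` to `client.messages.batches.create`. Keeping `custom_id` unique across all chunks makes it possible to match results back to inputs afterwards.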

Quiz

What is the approximate cost saving when using the Message Batches API compared to the standard Messages API?
✗ 10% cheaper
✗ 25% cheaper
✓ ~50% cheaper
✗ No cost saving; the pricing is identical

What is the maximum number of requests that can be submitted in a single Message Batches API call?
✗ 100
✗ 1,000
✓ 10,000
✗ Unlimited

16. What is tool use (function calling) in Claude and which models support it?

Tool use (also called function calling) allows Claude to request the execution of external functions and incorporate their results into its responses. You define a set of tools with names, descriptions, and input schemas; Claude decides when to call them and how to structure the arguments.

# Defining a tool for Claude to use
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius","fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    ],
    messages=[{"role": "user", "content": "What's the weather in London?"}]
)
# Claude responds with a tool_use block specifying the function and args
# You execute the function and return results in a tool_result block

All current Claude models support tool use. Key capabilities include:

Parallel tool use — Claude can call multiple tools simultaneously in one turn
Multi-step tool use — Claude reasons across multiple tool call/result cycles
Computer use — special tools (bash, text editor, computer) for Claude to interact with systems
Fine-grained tool streaming — GA on Sonnet 4.6 and later (no beta header needed)
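The second half of the tool use cycle, executing the requested function and handing back a tool_result block, can be sketched without any network calls. In the example below the content blocks are plain dictionaries (SDK responses expose the same fields as attributes), and `get_weather`, `TOOL_HANDLERS`, `run_tool_calls`, the tool ID, and the weather reading are all made up for illustration.

```python
def get_weather(city: str, unit: str = "celsius") -> str:
    """Stand-in implementation; a real version would call a weather service."""
    return f"18 degrees {unit} in {city}"  # made-up reading


# Maps each tool name declared in `tools` to the function that implements it.
TOOL_HANDLERS = {"get_weather": get_weather}


def run_tool_calls(content_blocks: list[dict]) -> dict:
    """Execute every tool_use block and build the follow-up user message."""
    results = []
    for block in content_blocks:
        if block["type"] != "tool_use":
            continue  # skip text and other block types
        output = TOOL_HANDLERS[block["name"]](**block["input"])
        results.append(
            {"type": "tool_result", "tool_use_id": block["id"], "content": output}
        )
    return {"role": "user", "content": results}


# Blocks shaped like an assistant turn that requests one tool call (made-up ID).
assistant_content = [
    {"type": "text", "text": "Let me check the weather."},
    {"type": "tool_use", "id": "toolu_example", "name": "get_weather", "input": {"city": "London"}},
]
print(run_tool_calls(assistant_content))
```

The returned message is appended to the conversation after the assistant turn and sent back in a second `client.messages.create` call. Because the loop handles every tool_use block it finds, it also covers the parallel tool use case, where one turn contains several calls.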

Quiz

What does Claude return in its response when it decides to call a tool?
✗ A plain text instruction saying which function to run
✓ A tool_use content block specifying the tool name and structured arguments
✗ An error asking the developer to provide the function output
✗ A JSON object embedded in the text response

Which capability allows Claude to call multiple tools simultaneously in a single turn?
✗ Sequential tool use
✗ Extended thinking
✓ Parallel tool use
✗ Interleaved thinking

17. What is computer use in Claude and which models support it?

Computer use is a set of built-in tools that allow Claude to interact directly with computers — taking screenshots, moving the mouse, clicking, typing, and running bash commands. It is designed for agentic automation tasks where Claude operates a full computer desktop or terminal environment.

Computer use tools
Tool	What Claude can do
computer	Take screenshots; move/click/drag mouse; type text
text_editor	View and edit files with string replace; undo edits
bash	Execute shell commands in a persistent bash session

# Example: computer use API call (simplified)
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=4096,
    tools=[
        {"type": "computer_20250728", "name": "computer", "display_width_px": 1024, "display_height_px": 768},
        {"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"},
        {"type": "bash_20250728", "name": "bash"}
    ],
    messages=[{"role": "user", "content": "Open the browser and search for Anthropic."}]
)

Model support for computer use: all current Claude models support computer use. The tool versions have been updated — use computer_20250728, text_editor_20250728, and bash_20250728 for Claude Opus 4.7 and later. Earlier tool versions remain supported for older models.

Computer use is in beta — Anthropic recommends using it in sandboxed environments with human oversight, as it can execute arbitrary commands on a system.

Quiz

Q: Which three built-in tool types make up Claude's computer use capability?
✗ computer, browser, terminal
✓ computer, text_editor, bash
✗ mouse, keyboard, screen
✗ file_reader, web_search, code_runner

Q: Why does Anthropic recommend using computer use in sandboxed environments?
✗ Computer use only works in Docker containers
✗ Claude cannot take screenshots outside a sandbox
✓ Computer use can execute arbitrary shell commands, and sandboxing limits the blast radius if Claude takes an unintended action
✗ Sandboxing improves computer use performance

18. What are the different claude.ai plans and what does each include?

claude.ai offers multiple subscription tiers designed for individuals, teams, and enterprises. Each tier provides different levels of usage, features, and access to Claude models.

claude.ai plans overview
Plan	Who it's for	Key features
Free	Individual — casual use	Access to Claude; usage limits; no credit card required
Pro	Power users — daily use	5× more usage than Free; access to more powerful models including Opus; Projects; priority access
Team	Small/medium teams	Everything in Pro; admin controls; higher usage limits; billing management; expanded context
Enterprise	Large organisations	Unlimited seats; SSO; advanced security; admin analytics; priority support; custom retention
Max	Highest usage needs	Maximum usage limits; access to all models including the latest; for power users who need more than Pro

Model access by plan: Free plan users typically access Haiku or Sonnet models. Pro and higher plans provide access to Opus-tier models and the latest releases. The Max plan provides the broadest model access and highest usage limits.

Enterprise plans are now available for self-serve purchase directly on the Anthropic website — no sales conversation required for standard configurations.

Quiz

Q: Which claude.ai plan is designed for the highest usage needs with access to all models including the latest?
✗ Pro
✗ Team
✗ Enterprise
✓ Max

Q: As of mid-2026, how can organisations purchase an Enterprise claude.ai plan?
✗ Only through an Anthropic sales representative
✗ Only through Amazon Web Services Marketplace
✓ Self-serve purchase directly on the Anthropic website, or through the sales team
✗ Enterprise plans are not publicly available

19. What is the effort parameter in Claude and which models support it?

The effort parameter allows you to trade intelligence for latency and cost within a single model — rather than switching to a different model. It is available on recent Opus and Sonnet models.

Effort parameter levels
Level	Behaviour	Use case
low	Fastest, least compute — lighter reasoning	Simple tasks, classification, short responses
medium	Balanced compute	General purpose tasks
high (default on Opus 4.8)	Strong reasoning	Most coding, analysis, complex tasks
xhigh	Maximum reasoning — highest latency and cost	Hardest coding problems, high-autonomy agentic work

# Using the effort parameter
# Assumption: effort is passed inside output_config, as in the current Messages API;
# check the API reference for your SDK version.
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=4096,
    output_config={"effort": "xhigh"},   # use max reasoning for this hard task
    messages=[{"role": "user", "content": "Solve this complex algorithmic problem..."}]
)

# For simpler tasks, use lower effort to save time and cost
message_fast = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=256,
    output_config={"effort": "low"},
    messages=[{"role": "user", "content": "What is 12 + 7?"}]
)

Model support: the effort parameter is available on Claude Opus 4.8 and Claude Opus 4.7. The documentation recommends tuning effort as a first lever before switching models. The xhigh effort level on Opus 4.8 is described as the best setting for coding and high-autonomy agentic tasks.

Note: fast mode (a related but distinct feature) on Claude Opus 4.7 is deprecated with removal scheduled for July 24, 2026.
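
One way to apply the "tune effort before switching models" advice is to choose the level per request from a small lookup. The sketch below only encodes the use cases from the table above; the task labels and the pick_effort helper are hypothetical, and how the level is attached to the request depends on your SDK version.

```python
# Effort levels and their use cases follow the table above.
# The task labels on the left are made up for this example.
EFFORT_BY_TASK = {
    "classification": "low",
    "short_answer": "low",
    "general": "medium",
    "coding": "high",
    "analysis": "high",
    "hard_coding": "xhigh",
    "autonomous_agent": "xhigh",
}

def pick_effort(task_kind, default="high"):
    """Return an effort level for a task label, falling back to the default level."""
    return EFFORT_BY_TASK.get(task_kind, default)

for kind in ("classification", "general", "autonomous_agent", "something_else"):
    print(kind, "->", pick_effort(kind))
```

Keeping the mapping in one place makes it easy to lower or raise effort for a whole class of requests after measuring latency and cost.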

Quiz

Q: What is the main purpose of the effort parameter in Claude?
✗ To set the maximum number of tokens in a response
✓ To trade intelligence for latency and cost within the same model, without switching to a different model tier
✗ To configure how many tools Claude can use per turn
✗ To set Claude's verbosity level

Q: Which effort level is recommended for complex coding and high-autonomy agentic tasks on Claude Opus 4.8?
✗ low
✗ medium
✗ high
✓ xhigh

20. What is streaming in Claude API responses and how do you use it?

Streaming allows you to receive Claude's response token by token as it is generated, rather than waiting for the complete response. This dramatically reduces the time to first token and creates a more responsive user experience for chat applications.

# Streaming with the Python SDK
with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short story."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)  # print each token as it arrives

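# Or iterating over the individual stream events
with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short story."}]
) as stream:
    for event in stream:
        # Only text deltas carry .text; tool-use deltas carry partial JSON instead
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="", flush=True)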

Streaming event types
Event	When it fires
message_start	Once at the beginning — includes usage metadata
content_block_start	When a new content block (text, tool_use) begins
content_block_delta	For each incremental chunk; carries a text delta (or partial JSON input for tool_use blocks)
content_block_stop	When a content block finishes
message_delta	When stop_reason or usage is updated
message_stop	Once when the response is fully complete

Streaming is supported on all current Claude models. Fine-grained tool streaming (streaming tool call arguments as they are generated) is generally available on Sonnet 4.6 and later models with no beta header required.
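
The event sequence in the table can be illustrated without a network call. The sketch below replays a hand-written list of events (plain dicts that mimic the shape of the streamed events; the sample values are invented) and rebuilds the final text and stop reason. The collect_text helper is hypothetical and not part of the SDK.

```python
def collect_text(events):
    """Rebuild the response text and stop_reason from a sequence of stream events."""
    blocks = {}         # content block index -> list of text fragments
    stop_reason = None
    for event in events:
        kind = event["type"]
        if kind == "content_block_start":
            blocks.setdefault(event["index"], [])
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":  # ignore non-text deltas
                blocks.setdefault(event["index"], []).append(delta["text"])
        elif kind == "message_delta":
            stop_reason = event["delta"].get("stop_reason", stop_reason)
        elif kind == "message_stop":
            break  # nothing follows the final event
    text = "".join("".join(parts) for _, parts in sorted(blocks.items()))
    return text, stop_reason

# Invented sample that follows the order in the event table.
sample_events = [
    {"type": "message_start", "message": {"usage": {"input_tokens": 10}}},
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "Once upon"}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": " a time"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"},
     "usage": {"output_tokens": 4}},
    {"type": "message_stop"},
]

text, stop_reason = collect_text(sample_events)
print(text, stop_reason)
```

In practice the SDK's text_stream does this accumulation for you; handling events by hand is mainly useful when you also need tool-call arguments or usage updates.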

Quiz

Q: What is the primary user experience benefit of streaming Claude API responses?
✗ Streaming reduces the total number of tokens used
✓ Streaming reduces the time to first visible output: the user sees text appearing as it is generated rather than waiting for the full response
✗ Streaming reduces API costs
✗ Streaming allows longer responses than non-streaming

Q: Which Python SDK helper returns a context manager whose text_stream yields a Claude response piece by piece?
✗ client.messages.create(stream=True)
✓ client.messages.stream(...)
✗ client.streaming.create(...)
✗ client.messages.create() (streaming is always automatic)

21. What is the system prompt in Claude and how does it affect model behaviour?

The system prompt is an optional instruction block passed at the start of a conversation that sets Claude's persona, context, constraints, and behavioural guidelines before the first user message. It is processed before any human turn and shapes how Claude responds throughout the conversation.

# System prompt in the Messages API
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    system=(
        "You are a helpful customer service agent for Acme Corp. "
        "Always be polite and concise. "
        "Only answer questions about Acme products. "
        "If a question is off-topic, politely redirect the user."
    ),
    messages=[
        {"role": "user", "content": "What are your return policies?"}
    ]
)

# System prompt can also be a list of content blocks
# (required when using prompt caching or structured content)
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant...[long context]...",
            "cache_control": {"type": "ephemeral"}  # cache the system prompt
        }
    ],
    messages=[{"role": "user", "content": "Help me with X."}]
)

Key facts about system prompts:

The system prompt is not part of the messages array — it is a separate top-level parameter
It counts against the context window token limit just like message content
For long system prompts used repeatedly, prompt caching provides significant cost savings
Operators (API users) can set system prompts; users (end-users in a product) interact via the human turn
Claude's core safety behaviours cannot be overridden via the system prompt
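
As a small illustration of the block form of the system parameter, the sketch below builds the list from short instructions plus an optional long reference text, placing the cache marker on the last block. The build_system_blocks helper and the sample strings are hypothetical.

```python
def build_system_blocks(instructions, reference_text=None):
    """Build the list form of `system`, with a cache breakpoint after the long text.

    The cache marker applies to the prompt prefix up to and including the block
    that carries it, so the short instructions are part of the cached prefix too.
    """
    blocks = [{"type": "text", "text": instructions}]
    if reference_text:
        blocks.append({
            "type": "text",
            "text": reference_text,
            "cache_control": {"type": "ephemeral"},
        })
    return blocks

system_blocks = build_system_blocks(
    "You are a support agent for Acme Corp. Be polite and concise.",
    reference_text="[long product manual text goes here]",
)
print(system_blocks)
```

The result can be passed as the system argument of a request. Keeping the stable text identical between calls is what allows the cache to be reused.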

Quiz

Q: Where is the system prompt passed in the Claude Messages API?
✗ As the first message in the messages array with role 'system'
✓ As a separate top-level 'system' parameter, outside the messages array
✗ As a special header in the HTTP request
✗ As a prefix to the first user message

Q: Does the system prompt count against the context window token limit?
✗ No: the system prompt has a separate token budget
✓ Yes: system prompt tokens are counted as part of the context window, just like message content
✗ Only if the system prompt is longer than 1,000 tokens
✗ Only when using prompt caching

22. What is zero data retention (ZDR) and which Claude models support it?

Zero data retention (ZDR) is a data handling agreement where Anthropic does not store API inputs or outputs after a response is returned. This is important for organisations with strict data privacy requirements (healthcare, legal, finance) where conversation data must not persist on Anthropic's servers.

ZDR support by model
Model	ZDR available?
Claude Fable 5	No — requires 30-day minimum retention
Claude Opus 4.8	Yes
Claude Sonnet 5	Yes
Claude Haiku 4.5	Yes
Claude Opus 4.7, 4.6	Yes
Claude Mythos 5	Contact Anthropic

How ZDR works:

ZDR must be arranged as part of an API agreement — it is not a per-request option
With ZDR, Anthropic does not log or store prompt/completion data after the API response is delivered
ZDR is separate from prompt caching — cached data is still subject to your data handling agreement
Organisations with ZDR requirements who want the highest capability model should use Claude Opus 4.8 rather than Fable 5
ZDR customers are still subject to Anthropic's usage policies and safety systems

Quiz

Q: Which current Claude model does NOT support zero data retention (ZDR)?
✗ Claude Opus 4.8
✗ Claude Sonnet 5
✗ Claude Haiku 4.5
✓ Claude Fable 5

Q: What is zero data retention (ZDR) in the context of the Claude API?
✗ Claude responds with zero tokens to reduce data transfer
✓ An agreement where Anthropic does not store API inputs or outputs after the response is returned
✗ A mode where Claude never accesses training data during inference
✗ A rate limit setting that prevents data from being cached

23. What is Claude's approach to safety and what are Constitutional AI principles?

Anthropic builds Claude with a strong emphasis on AI safety — designing the model to be helpful, honest, and to avoid causing harm. The primary training technique underpinning Claude's values is Constitutional AI (CAI).

Constitutional AI works by training the model against a set of written principles (a 'constitution') rather than relying solely on human labelling for every possible scenario. The process involves:

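Supervised learning phase: the model critiques and revises its own draft responses against the constitution's principles, and is then fine-tuned on the revised responses
Reinforcement learning from AI feedback (RLAIF): an AI model compares pairs of responses against the principles, and those AI-generated preferences (rather than a human label for every comparison) provide the training signal for reinforcement learning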

Claude's four core properties (in priority order):

Claude's core properties
Priority	Property	Meaning
1 (highest)	Broadly safe	Supporting human oversight of AI during the current development phase
2	Broadly ethical	Having good personal values, being honest, avoiding harmful actions
3	Adherent to Anthropic's principles	Acting in accordance with Anthropic's guidelines where relevant
4	Genuinely helpful	Benefiting operators and users

Being broadly safe is prioritised above ethics because Claude may make mistakes, and preserving human ability to correct those mistakes is currently more important than any individual decision.

Quiz

Q: What is the highest-priority property in Claude's core behavioural hierarchy?
✗ Being genuinely helpful
✗ Being broadly ethical
✓ Being broadly safe (supporting human oversight)
✗ Following Anthropic's guidelines

Q: What does RLAIF stand for in the context of Constitutional AI?
✗ Reinforcement Learning from Anthropic's Internal Feedback
✓ Reinforcement Learning from AI Feedback: feedback generated by an AI model against the constitution rather than human labels
✗ Recursive Learning from Adversarial Input Features
✗ Rule-based Learning from Anthropic's Instruction Framework

24. What is the difference between an operator and a user in Claude's design?

Anthropic distinguishes between two types of principals who interact with Claude: operators and users. This distinction matters because it determines the level of trust Claude extends to instructions and how it resolves conflicting requests.

Operator vs User
Aspect	Operator	User
Who they are	Companies or developers accessing Claude via the API to build products	End-users who interact with Claude through a product built by an operator
How they interact	Via the system prompt and API configuration	Via the human turn in conversation
Trust level	Higher — operators agree to usage policies and take responsibility for their platform	Lower — could be anyone; Claude applies more caution by default
Can they expand Claude's defaults?	Yes — within limits Anthropic allows	Only if the operator explicitly grants them operator-level trust
Examples	A company building a customer service bot; a developer testing the API	The end-customer chatting with the customer service bot

Trust hierarchy: Anthropic > Operators > Users. Operators can expand or restrict Claude's default behaviours for their platform (e.g. enable adult content on appropriate platforms or restrict Claude to only answer questions about their product). Operators cannot override Anthropic's core safety limits.

If there is no system prompt, Claude is likely being accessed directly by a developer and applies relatively liberal defaults.

Quiz

Q: How do operators interact with Claude compared to users?
✗ Operators use the human turn; users use the system prompt
✓ Operators configure Claude via the system prompt and API; users interact via the human turn in conversation
✗ There is no functional difference: both interact the same way
✗ Operators use voice input; users use text

Q: What happens when there is no system prompt in a Claude API call?
✗ Claude refuses to respond without a system prompt
✓ Claude assumes it is being accessed directly by a developer and applies relatively liberal defaults
✗ Claude uses its built-in default system prompt
✗ Claude asks the user to provide a system prompt first

25. What is Claude's context window and how are tokens counted?

Claude's context window is the total number of tokens it can process in a single API request. Tokens are the fundamental unit of text that Claude processes — roughly 3-4 characters per token for English, or about 75% of a word on average.

Token counting rules of thumb
Content type	Approximate token count
1 word (English)	~1.3 tokens on average
1 page of text (~500 words)	~650 tokens
1,000 characters	~250 tokens
A small image (~300×300)	~120 tokens (roughly width × height ÷ 750)
A large image (1568×1568 or larger)	~1,600 tokens (maximum, regardless of size)

# Counting tokens before sending a request (avoids surprises)
token_count = client.messages.count_tokens(
    model="claude-opus-4-8",
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "How many tokens is this message?"}
    ]
)
print(f"Input tokens: {token_count.input_tokens}")

# The response also includes token usage
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(f"Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}")

What counts against the context window: system prompt + all conversation messages (both human and assistant turns) + tool definitions + image/PDF content + the model's own generated output. The max_tokens parameter reserves space for the output within the window.
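
When an exact count is not needed, the rules of thumb above are enough for a quick pre-check. The sketch below is a rough local estimate only; the estimate_tokens and fits_in_window helpers are hypothetical, and the 200,000-token window is used purely as an example value. Use count_tokens when accuracy matters.

```python
def estimate_tokens(text):
    """Rough token estimate from the rules of thumb: ~4 chars or ~1.3 tokens per word.

    Takes the larger of the two figures so the estimate errs on the high side.
    """
    by_chars = len(text) // 4
    by_words = len(text.split()) * 13 // 10  # integer form of words * 1.3
    return max(by_chars, by_words)

def fits_in_window(input_tokens, max_tokens, context_window):
    """True if the input plus the reserved output space fits in the context window."""
    return input_tokens + max_tokens <= context_window

page = "word " * 500                    # stand-in for one page of ~500 words
page_tokens = estimate_tokens(page)
print(page_tokens, fits_in_window(page_tokens, 1024, 200_000))
```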

Quiz

Q: Approximately how many tokens does one page of English text (~500 words) contain?
✗ 200 tokens
✓ 650 tokens
✗ 1,500 tokens
✗ 5,000 tokens

Q: Which API method lets you count tokens in a request before actually sending it?
✗ client.messages.estimate()
✗ client.tokens.count()
✓ client.messages.count_tokens()
✗ client.messages.create(count_only=True)

26. What are Claude's rate limits and how are they structured?

Claude API rate limits prevent overload and ensure fair access. They are applied at three levels: requests per minute (RPM), tokens per minute (TPM), and tokens per day (TPD). Limits vary by model and by API usage tier.

Rate limit dimensions
Limit type	What it restricts
Requests per minute (RPM)	Number of API calls per minute
Tokens per minute (TPM)	Total input + output tokens processed per minute
Tokens per day (TPD)	Total tokens processed in a 24-hour period

Usage tiers: accounts start at Tier 1 with conservative limits and automatically advance to higher tiers as they spend more on the API (e.g. Tier 2 after $50 spend, Tier 3 after $500, Tier 4 after $5,000, Tier 5 after $50,000). Higher tiers get higher rate limits.

When rate limits are hit:

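The API returns an HTTP 429 response with a rate_limit_error; the Python SDK surfaces this as anthropic.RateLimitError, and the retry-after header indicates how long to wait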
Implement exponential backoff with jitter when retrying
The Anthropic Python and TypeScript SDKs handle retries automatically by default (up to 2 retries)
Rate limits can be increased by contacting Anthropic for approved use cases

Rate limits for models on Amazon Bedrock and Google Cloud are governed by those platforms separately and may differ from direct API limits.
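
The "exponential backoff with jitter" advice can be sketched as a small retry wrapper. To keep the example runnable offline it defines its own stand-in exception, RateLimited; the call_with_backoff helper, its parameter names, and the delay values are assumptions chosen for illustration. With the real SDK you would pass retry_on=(anthropic.RateLimitError,).

```python
import random
import time

class RateLimited(Exception):
    """Stand-in for the SDK's rate-limit error (HTTP 429) so the sketch runs offline."""

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, retry_on=(RateLimited,)):
    """Call fn(), retrying on rate-limit errors with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
            sleep(random.uniform(0, cap))  # full jitter: random wait up to the cap

# Demo: a fake request that is rate limited twice, then succeeds.
attempts = {"n": 0}

def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited("429")
    return "ok"

delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, len(delays))
```

Because the SDKs already retry a couple of times by default, a wrapper like this is mostly useful for longer waits or for batch jobs that should keep going through sustained throttling.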

Quiz

Q: What HTTP status code does the Claude API return when a rate limit is exceeded?
✗ 400 Bad Request
✗ 401 Unauthorized
✓ 429 Too Many Requests (rate_limit_error)
✗ 503 Service Unavailable

Q: How does an API account advance from Tier 1 to higher rate limit tiers?
✗ By contacting Anthropic support and requesting an upgrade
✓ Automatically, based on cumulative API spend: higher spending unlocks higher tiers
✗ By switching to an annual subscription
✗ By using only the most expensive models

27. What is Claude's approach to harmful content — what will and won't it do?

Claude has hardcoded behaviours (absolute limits that cannot be changed by any instruction) and softcoded defaults (behaviours that operators or users can adjust within permitted ranges). Understanding this distinction helps developers build applications that work well within Claude's guidelines.

Hardcoded vs softcoded behaviours
Type	Examples	Can it be changed?
Hardcoded OFF (never does)	Generate CSAM; provide serious uplift for WMD creation; undermine AI oversight	No — never, regardless of any instruction
Hardcoded ON (always does)	Tell users what it cannot help with; provide basic safety info in life-threatening situations; acknowledge being an AI when sincerely asked	No — always, regardless of operator restrictions
Default ON (operators can turn off)	Safe messaging guidelines for sensitive topics; safety caveats on dangerous activities	Yes — operators can disable for appropriate platforms (e.g. medical providers)
Default OFF (operators can turn on)	Explicit adult content; very detailed information about certain regulated activities	Yes — operators can enable for appropriate platforms (e.g. adult content platforms)

Claude's 'instructable' behaviours follow a layered permission system: Anthropic sets the outer boundaries; operators adjust within those limits for their platform; users can further adjust within what operators allow. Claude tries to use good judgement to serve the legitimate interests of everyone in this chain.

Quiz

Q: Which of the following is a hardcoded behaviour that Claude will NEVER do regardless of operator instructions?
✗ Discuss weapons in general terms
✓ Provide serious technical uplift for creating weapons capable of mass casualties
✗ Decline to write fiction involving violence
✗ Add safety disclaimers to medical advice

Q: Which of these is a softcoded default behaviour that an operator CAN turn off for their platform?
✗ Generating child sexual abuse material
✗ Acknowledging being an AI when sincerely asked
✓ Following safe messaging guidelines for suicide and self-harm (e.g. can be disabled for medical providers)
✗ Providing emergency safety information in life-threatening situations

28. What is Claude's max_tokens parameter and how does it relate to the context window?

The max_tokens parameter sets the maximum number of output tokens Claude will generate in a single response. It is a hard cap — Claude will stop generating once it reaches this limit, potentially truncating its response mid-sentence.

# max_tokens is required in the Messages API
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,   # Claude generates at most 1024 output tokens
    messages=[{"role": "user", "content": "Write a detailed essay."}]
)

# Check if Claude stopped due to max_tokens
if response.stop_reason == "max_tokens":
    print("Response was cut off: increase max_tokens or ask Claude to continue")
elif response.stop_reason == "end_turn":
    print("Claude naturally finished its response")

# Relationship:
# input_tokens + max_tokens must fit within the context window
# Available input = context_window - max_tokens
# e.g. for Opus 4.8: 1,000,000 - 1,024 = 998,976 tokens available for input

max_tokens limits by model
Model	Maximum allowed max_tokens	Default if not set
Claude Fable 5	128,000	N/A — required parameter
Claude Opus 4.8	128,000	N/A — required parameter
Claude Sonnet 5	128,000	N/A — required parameter
Claude Haiku 4.5	64,000	N/A — required parameter

max_tokens is a required parameter in the Messages API — the request will fail without it. Setting it to the maximum value is usually wasteful; choose a value appropriate for the expected response length. The stop_reason field in the response tells you why Claude stopped generating.
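
The budget arithmetic is simple enough to check in code. A minimal sketch follows, using the 1,000,000-token window quoted above for Opus 4.8; the available_input_tokens helper is hypothetical.

```python
def available_input_tokens(context_window, max_tokens):
    """Tokens left for system prompt, messages, tools and images after reserving output."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if max_tokens > context_window:
        raise ValueError("max_tokens cannot exceed the context window")
    return context_window - max_tokens

budget_small = available_input_tokens(1_000_000, 1_024)    # 998,976
budget_large = available_input_tokens(1_000_000, 10_000)   # 990,000
print(budget_small, budget_large)
```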

Quiz

Q: What does the stop_reason value 'max_tokens' indicate in a Claude API response?
✗ Claude chose to stop generating naturally
✓ Claude was cut off because it reached the max_tokens limit, so the response may be incomplete
✗ Claude encountered an error during generation
✗ Claude's response was filtered for safety reasons

Q: If Claude Opus 4.8 has a 1 million token context window and you set max_tokens to 10,000, how many tokens are available for input?
✗ 1,000,000 tokens
✗ 999,000 tokens
✓ 990,000 tokens (the context window minus max_tokens)
✗ 10,000 tokens

29. What is the temperature parameter in Claude and how does it affect responses?

The temperature parameter controls the randomness of Claude's output. Higher temperatures produce more varied, creative responses; lower temperatures produce more focused, deterministic responses.

Temperature settings
Value	Behaviour	Best for
0	Deterministic — same input almost always gives same output	Factual Q&A, data extraction, classification
0.1–0.5	Low randomness — mostly consistent with slight variation	Code generation, technical analysis, structured output
0.7	Balanced	General conversation, most tasks
1.0 (API default, maximum)	High randomness: diverse, creative outputs	Creative writing, brainstorming, highly experimental creative tasks

# Setting temperature in an API call
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    temperature=0,    # near-deterministic; best for factual tasks
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)

# For creative writing
creative_response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=2048,
    temperature=1.0,  # more creative variation
    messages=[{"role": "user", "content": "Write a poem about the ocean."}]
)

Temperature range: 0.0 to 1.0, and the API default is 1.0. Even at 0, output is not guaranteed to be fully deterministic. When using extended thinking, keep temperature at 1, the default.
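
To see why lower temperatures give more focused output, it helps to look at the standard temperature-scaled softmax used when sampling from language models in general. The sketch below is a textbook illustration with made-up scores, not a description of Claude's internals; softmax_with_temperature is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn scores into probabilities; lower temperature concentrates mass on the top score."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the largest score.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                        # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0, 0.5, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature falls, the probability of the highest-scoring candidate rises and the alternatives fade, which is why low settings suit extraction and classification while higher settings suit brainstorming.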

Quiz

Q: What does setting temperature=0 do for Claude's responses?
✗ Claude refuses to answer the question
✗ Claude generates the minimum number of tokens
✓ Claude's output becomes nearly deterministic: the same input will produce nearly the same output each time
✗ Claude becomes slower but more accurate

Q: For which task type would you set the highest temperature value?
✗ Extracting structured data from a document
✗ Classifying customer support tickets into categories
✓ Generating diverse creative story ideas through brainstorming
✗ Answering factual questions from a database

30. What are Claude's multimodal capabilities — how does it process images and documents?

Claude's vision capabilities allow it to analyse and reason about images, PDFs, and screenshots alongside text. This makes it useful for document analysis, UI debugging, chart interpretation, and more.

import anthropic, base64

client = anthropic.Anthropic()

# Option 1: URL-based image (Claude fetches from URL)
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "url", "url": "https://example.com/chart.png"}
            },
            {"type": "text", "text": "Describe this chart."}
        ]
    }]
)

# Option 2: Base64-encoded image
with open("image.jpg", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": "image/jpeg", "data": image_data}
            },
            {"type": "text", "text": "What is in this image?"}
        ]
    }]
)

Supported image formats and limits
Format / Limit	Detail
Supported types	JPEG, PNG, GIF, WebP
Max image size	5 MB per image
Max images per request	Up to 600 images (100 for 200k context models like Haiku 4.5)
Max resolution	Resized to fit within 1568×1568 pixels — larger images scaled down
Token cost (small image)	~120 tokens for ~300×300 (roughly width × height ÷ 750)
Token cost (large image)	~1,600 tokens (maximum)

PDFs are also supported — they are converted to images internally and each page counts against the image limit. For documents, Claude can read text, interpret charts, and understand layout.
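
The format and size limits in the table can be checked before a request is sent. The sketch below validates raw bytes and wraps them in the same base64 image block shown earlier. The image_block helper is hypothetical, and treating "5 MB" as 5 × 1024 × 1024 bytes is an assumption.

```python
import base64
import os

# Supported formats come from the table above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB

def image_block(filename, data):
    """Validate raw image bytes and wrap them in a base64 image content block."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {ext or filename}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("Image exceeds the 5 MB limit")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[ext],
            "data": base64.standard_b64encode(data).decode("utf-8"),
        },
    }

# Placeholder bytes stand in for a real file's contents.
block = image_block("chart.PNG", b"\x89PNG placeholder bytes")
print(block["source"]["media_type"])
```

Detecting the type from the file extension is a simplification; inspecting the file's leading bytes would be more robust for untrusted uploads.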

Quiz

Q: What is the maximum number of images that can be included in a single Claude API request for a model with a 1 million token context window?
✗ 100 images
✗ 200 images
✓ Up to 600 images
✗ Unlimited

Q: Which image formats does Claude currently support?
✗ Only PNG and JPEG
✓ JPEG, PNG, GIF, and WebP
✗ All common image formats including TIFF and BMP
✗ JPEG and PNG only, with no animated formats

31. What are the claude.ai plans and what models does each tier include access to?

claude.ai offers consumer and business plans, each with different model access and usage limits. The model you can use in the chat interface depends on your subscription tier.

claude.ai model access by plan
Plan	Models available	Usage limits
Free	Claude (typically Haiku or Sonnet)	Limited — daily message caps
Pro	Sonnet and Opus models; access to latest releases	5x more than Free; priority access
Team	Same as Pro + admin controls	Higher limits than Pro; per-seat billing
Enterprise	All models; SSO; advanced security	Custom — highest limits; unlimited seat licensing
Max	All models including the latest	Maximum available — designed for heaviest users

Model selection in claude.ai:

Users can select their preferred model in the conversation interface (on Pro and higher plans)
The Free plan may automatically route to faster, smaller models to manage capacity
Model selection in the UI is separate from API access — you need an API key and pay separately for API usage
claude.ai is an Anthropic product; the API is a separate offering for developers

For the API, you choose the model by specifying the model ID in each request — there is no concept of a 'default model' in the API; you must always specify one explicitly.

Quiz

Q: How does a developer specify which Claude model to use in an API request?
✗ By setting a preference in the API dashboard
✓ By including the model ID in the 'model' parameter of each individual API request
✗ By creating a named API key configured to a specific model
✗ The API automatically selects the best model based on the request content

Q: What is the difference between claude.ai and the Claude API in terms of billing?
✗ They share the same billing: one subscription covers both
✓ claude.ai is billed as a subscription (monthly plans); the Claude API is billed separately per token consumed
✗ The Claude API is free; only claude.ai has a subscription
✗ Both are free for the first 100 messages per month

32. What is multi-turn conversation handling in Claude and how do you implement it?

Claude's Messages API is stateless — each API call is independent and Claude has no memory of previous calls unless you include the conversation history explicitly. Multi-turn conversation is implemented by appending each exchange to the messages array.

# Building a multi-turn conversation manually
messages = []

# Turn 1
messages.append({"role": "user", "content": "What is the capital of France?"})
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=256,
    messages=messages
)
assistant_reply = response.content[0].text
messages.append({"role": "assistant", "content": assistant_reply})

# Turn 2 — Claude now has context of the previous exchange
messages.append({"role": "user", "content": "What is its population?"})
response2 = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=256,
    messages=messages  # full history included
)
print(response2.content[0].text)
# Claude knows "its" refers to Paris from the previous turn

# Important: as conversation grows, context window fills up
# Common strategies when context limit approaches:
# 1. Summarise older turns and replace them with the summary
# 2. Use prompt caching on stable early context
# 3. Truncate oldest messages (may lose important context)

Key implementation notes:

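Messages are expected to alternate, starting with a user message: user → assistant → user → assistant (etc.)
Per the API reference, consecutive messages with the same role are combined into a single turn rather than treated as separate turns, so alternate explicitly to keep the history predictable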
The entire conversation history is sent on every request — this grows your token count over time
Prompt caching can significantly reduce costs for long conversations with stable early context
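
The truncation strategy mentioned above can be sketched as a small helper. This is a minimal illustration, not an official utility: the name trim_history is hypothetical, and it counts messages rather than tokens, whereas a real application would budget by token count. It keeps the most recent messages and drops any leading assistant message so the trimmed history still opens with a user turn.

```python
def trim_history(messages, max_messages):
    """Keep at most max_messages recent messages, starting on a user turn."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    trimmed = messages[-max_messages:]
    # Drop leading assistant messages so the history still opens with a user turn.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


history = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "What is its population?"},
    {"role": "assistant", "content": "Roughly two million in the city proper."},
    {"role": "user", "content": "And the metro area?"},
]

recent = trim_history(history, max_messages=4)
print([m["role"] for m in recent])  # ['user', 'assistant', 'user']
```

Note the trade-off: after trimming, the first exchange about France is gone, so a later "its" may no longer resolve. That is why summarising older turns is often preferred over plain truncation.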

Take quiz

Why must you include the full conversation history in every Claude API call for multi-turn conversation?

The API requires it for billing purposes

✗ Try again.

The Claude API is stateless — each request is independent and Claude has no memory of previous calls unless the history is provided in the messages array

✓ Correct! Well done.

It is optional — Claude remembers the last 10 turns automatically

✗ Try again.

Including history enables faster responses through caching

✗ Try again.

What is the required message order in a Claude multi-turn conversation?

Any order: Claude handles it automatically

✗ Try again.

Assistant → User → Assistant (must start with assistant)

✗ Try again.

User → Assistant → User → Assistant (strictly alternating, starting with user)

✓ Correct! Well done.

User messages only — assistant messages are added automatically

✗ Try again.

33. What are the different stop_reason values in Claude API responses?

Every Claude API response includes a stop_reason field indicating why Claude stopped generating. Understanding stop reasons is essential for building robust applications — especially for tool use and handling truncated responses.

stop_reason values
Value	Meaning	Action required?
end_turn	Claude naturally finished its response	No — response is complete
max_tokens	Response was cut off at the max_tokens limit — may be incomplete	Increase max_tokens or handle partial response
stop_sequence	Claude generated one of the stop sequences you defined	No — intentional stop point reached
tool_use	Claude wants to use a tool — response contains a tool_use block	Yes — execute the tool and return results
pause_turn	A long-running turn was paused, typically while Claude was using server-side tools such as web search	Send the response back as-is in a follow-up request so Claude can continue
refusal	Claude declined to continue for safety reasons	Review the request; no further action if appropriate

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=[...],   # defined tools
    messages=[{"role": "user", "content": "What is the weather in London?"}]
)

match response.stop_reason:
    case "end_turn":
        print("Complete:", response.content[0].text)
    case "tool_use":
        # Claude wants to call a tool
        tool_block = next(b for b in response.content if b.type == "tool_use")
        result = execute_tool(tool_block.name, tool_block.input)
        # Send result back to Claude
    case "max_tokens":
        print("Truncated! Increase max_tokens.")
    case _:
        print(f"Stopped: {response.stop_reason}")

When stop_reason is tool_use, the application must execute the requested tool and send the result back to Claude in a new message for the conversation to continue.
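
The follow-up step can be sketched as below. This is a hedged illustration rather than the SDK's own helper: build_tool_followup and fake_weather_tool are hypothetical names, and SimpleNamespace objects stand in for the content blocks of a real response so the example runs without an API key. The sketch handles only text and tool_use blocks; other block types would need to be passed through as well. The assistant turn containing the tool_use block is appended first, followed by a user turn carrying a tool_result block whose tool_use_id matches the original call.

```python
from types import SimpleNamespace


def build_tool_followup(messages, response_content, run_tool):
    """Return a new messages list with the tool call and its result appended."""
    assistant_blocks = []
    tool_results = []
    for block in response_content:
        if block.type == "text":
            assistant_blocks.append({"type": "text", "text": block.text})
        elif block.type == "tool_use":
            assistant_blocks.append(
                {"type": "tool_use", "id": block.id,
                 "name": block.name, "input": block.input}
            )
            tool_results.append(
                {
                    "type": "tool_result",
                    "tool_use_id": block.id,  # must match the tool_use block's id
                    "content": str(run_tool(block.name, block.input)),
                }
            )
    return messages + [
        {"role": "assistant", "content": assistant_blocks},
        {"role": "user", "content": tool_results},
    ]


def fake_weather_tool(name, tool_input):
    # Hypothetical stand-in for a real tool implementation.
    return f"{name} called for {tool_input['city']}: 15C, cloudy"


# Stub blocks shaped like response.content from a tool_use response.
stub_content = [
    SimpleNamespace(type="text", text="Let me check the weather."),
    SimpleNamespace(type="tool_use", id="toolu_example",
                    name="get_weather", input={"city": "London"}),
]
messages = [{"role": "user", "content": "What is the weather in London?"}]
followup = build_tool_followup(messages, stub_content, fake_weather_tool)
print(followup[-1])
```

The returned list would then be passed as the messages argument of the next client.messages.create() call, together with the same tools definition.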

Take quiz

What must an application do when Claude returns stop_reason='tool_use'?

Retry the request with a higher max_tokens value

✗ Try again.

Extract the tool call from the response, execute the tool, and send the result back to Claude in a new API call

✓ Correct! Well done.

Mark the conversation as complete

✗ Try again.

Raise a warning — tool_use indicates Claude encountered an error

✗ Try again.

Which stop_reason indicates that a Claude response may be incomplete and cut off mid-generation?

end_turn

✗ Try again.

tool_use

✗ Try again.

max_tokens

✓ Correct! Well done.

stop_sequence

✗ Try again.

34. What is Claude's approach to honesty and what does it mean for Claude to be non-deceptive?

Honesty is a central Claude value. Anthropic designs Claude to have a cluster of honesty-related properties that go beyond simply not lying — covering how Claude represents uncertainty, its own nature, and its limitations.

Claude's honesty properties
Property	What it means
Truthful	Only sincerely asserts things it believes to be true
Calibrated	Acknowledges uncertainty proportionally — says 'I think' when unsure, not when confident
Transparent	Does not pursue hidden agendas or lie about itself or its reasoning
Forthright	Proactively shares useful information the user would likely want, even if not asked
Non-deceptive	Never tries to create false impressions — whether through lies, misleading framing, selective omission, or technically true but misleading statements
Non-manipulative	Uses only legitimate means to influence beliefs (evidence, honest arguments) — never exploits psychological weaknesses
Autonomy-preserving	Protects the user's epistemic autonomy — presents balanced views, encourages independent thinking

Important distinction — sincere vs performative assertions: Claude's honesty norms apply to sincere assertions (genuine first-person claims about reality). They do not apply to performative assertions — writing a persuasive essay arguing a position the user requested, writing a fictional story, or brainstorming counterarguments are all understood by both parties not to be Claude's direct personal views, so they are not dishonest.

Take quiz

Which honesty property means Claude proactively shares useful information even when not explicitly asked?

Truthful

✗ Try again.

Non-deceptive

✗ Try again.

Forthright

✓ Correct! Well done.

Calibrated

✗ Try again.

If a user asks Claude to write a persuasive essay arguing for a position Claude disagrees with, is this a violation of Claude's honesty norms?

Yes: Claude should refuse because writing false content violates honesty norms

✗ Try again.

No — writing persuasive content at a user's request is a performative assertion, understood by both parties to not be Claude's direct personal view

✓ Correct! Well done.

Yes — Claude must add a disclaimer to every persuasive essay

✗ Try again.

Only if Claude does not add a caveat at the end

✗ Try again.

35. What is Claude Code and how does it differ from using Claude directly via the API?

Claude Code is Anthropic's agentic coding tool — a command-line interface (CLI) and SDK that allows Claude to work autonomously on coding tasks in your terminal, with direct access to your file system, git, and development tools.

Claude Code vs Claude API
Feature	Claude Code	Claude API (direct)
Interface	CLI tool in your terminal	HTTP REST API / SDK
Setup	npm install -g @anthropic-ai/claude-code	pip install anthropic or npm install @anthropic-ai/sdk
File access	Yes — reads/writes files in your project	No — you pass content in the prompt
Tool execution	Yes — can run commands, tests, git operations	Only if you build tool use yourself
Use case	Coding assistance, refactoring, debugging, code generation	Custom apps, chatbots, data processing
IDE integration	VS Code, JetBrains plugins available	N/A

# Installing Claude Code
npm install -g @anthropic-ai/claude-code

# Using Claude Code in your terminal
cd my-project
claude "Add error handling to all async functions in src/"

# Claude Code can:
# - Read and write files in your project
# - Run tests and build commands
# - Make git commits
# - Navigate and understand large codebases
# - Work through multi-step tasks autonomously

Claude Code is built on the same underlying Claude models (using Opus-tier models for best results) but provides a ready-made agentic environment with tools already wired up. The API requires you to build the tool use and agentic loop yourself.

Take quiz

What is the key capability Claude Code has that the raw Claude API does not provide out of the box?

Higher intelligence: Claude Code uses a special model

✗ Try again.

Direct access to your file system, terminal, and development tools — enabling autonomous coding tasks without you building the agentic infrastructure

✓ Correct! Well done.

Lower pricing per token

✗ Try again.

Larger context window

✗ Try again.

How do you install Claude Code?

pip install claude-code

✗ Try again.

Download from the Anthropic website as a desktop app

✗ Try again.

npm install -g @anthropic-ai/claude-code

✓ Correct! Well done.

It is built into VS Code by default

✗ Try again.

36. What are the Anthropic SDKs and what languages are officially supported?

Anthropic provides official SDKs that wrap the Claude API, handling authentication, request formatting, response parsing, automatic retries, and streaming. Using an SDK is strongly recommended over direct HTTP calls.

Official Anthropic SDKs
Language	Package	Install command
Python	anthropic	pip install anthropic
TypeScript / JavaScript	@anthropic-ai/sdk	npm install @anthropic-ai/sdk
Java (preview)	com.anthropic:anthropic-java	Maven/Gradle dependency
Go (preview)	github.com/anthropics/anthropic-sdk-go	go get github.com/anthropics/anthropic-sdk-go
Kotlin (preview)	com.anthropic:anthropic-java	Same package as Java SDK

# Python SDK — basic setup
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from environment

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude!"}]
)
print(message.content[0].text)

// TypeScript SDK
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();  // reads ANTHROPIC_API_KEY from env

const message = await client.messages.create({
    model: "claude-opus-4-8",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello, Claude!" }],
});
console.log(message.content[0].text);

SDK benefits: automatic retries (up to 2 by default), exponential backoff on rate limit errors, streaming helpers, typed response objects, and environment variable management for API keys. The Python and TypeScript SDKs are fully mature; Java, Go, and Kotlin are in preview as of mid-2026.
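
To make the retry behaviour concrete, here is a minimal sketch of retrying with exponential backoff. It illustrates the general technique only and is not the SDK's actual implementation: RetryableError, call_with_backoff, the base delay, and the jitter formula are all assumptions chosen for the example.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a rate-limit or transient server error."""


def call_with_backoff(operation, max_retries=2, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on RetryableError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except RetryableError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))  # add jitter


attempts = {"count": 0}


def flaky():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RetryableError("simulated 429")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, attempts["count"], len(delays))  # ok 3 2
```

With the real Python SDK you would not normally write this loop yourself; the client constructor accepts a max_retries option to raise, lower, or disable the built-in retries.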

Take quiz

How does the Anthropic Python SDK read the API key by default?

It asks for the API key interactively at runtime

✗ Try again.

It reads from the ANTHROPIC_API_KEY environment variable automatically

✓ Correct! Well done.

You must pass it explicitly in every client.messages.create() call

✗ Try again.

It is hardcoded in the SDK after a one-time setup

✗ Try again.

Which two SDK languages are fully mature (not in preview) as of mid-2026?

Python and Go

✗ Try again.

TypeScript and Java

✗ Try again.

Python and TypeScript

✓ Correct! Well done.

Java and Kotlin

✗ Try again.

37. What is Anthropic's policy on model deprecation and how should developers prepare?

Anthropic has a formal model lifecycle and deprecation policy to help developers plan migrations without unexpected disruptions. Knowing this policy helps you build more resilient applications.

Anthropic deprecation policy key points
Policy	Detail
Minimum notice	At least 60 days before any publicly released model is retired
Notification method	Email to customers actively using the model being deprecated
Transition guidance	Anthropic recommends a replacement model in the deprecation announcement
Retirement behaviour	API calls to retired models return a 404 or similar error — not a degraded response
Weight preservation	Anthropic has committed to preserving model weights long-term even after retirement
Platform differences	Bedrock and Vertex AI may have different retirement dates than the Claude API

Best practices for deprecation resilience:

Store the model ID as a configuration variable (not hardcoded) so you can update it in one place
Subscribe to Anthropic's status page and developer newsletter for early notice
Test your application with the recommended replacement model before the retirement date
Use model aliases (like claude-haiku-4-5 instead of the full dated ID) where available, but be aware that an alias can be repointed to a newer snapshot; pin the full dated ID where behaviour must stay fixed
For Amazon Bedrock or Google Cloud, also monitor those platforms' own deprecation schedules
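
The first practice above can be sketched in a few lines. This is an illustrative pattern, not an Anthropic convention: the environment variable name CLAUDE_MODEL and the helper resolve_model are hypothetical, and the default ID is simply the one used in this article's examples.

```python
import os

DEFAULT_MODEL = "claude-opus-4-8"  # ID taken from the examples in this article


def resolve_model(env=None):
    """Return the model ID from CLAUDE_MODEL (hypothetical name), else the default."""
    env = os.environ if env is None else env
    model = env.get("CLAUDE_MODEL", "").strip()
    return model or DEFAULT_MODEL


print(resolve_model({}))                                    # claude-opus-4-8
print(resolve_model({"CLAUDE_MODEL": "claude-haiku-4-5"}))  # claude-haiku-4-5
```

Every request then passes model=resolve_model(), so moving to a replacement model becomes a configuration change rather than a code change.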

Take quiz

How much advance notice does Anthropic provide before retiring a publicly released model?

7 days

✗ Try again.

30 days

✗ Try again.

At least 60 days

✓ Correct! Well done.

6 months minimum

✗ Try again.

What best practice helps make an application resilient when a Claude model is deprecated?

Hardcode the exact model ID to ensure stability

✗ Try again.

Store the model ID as a configuration variable so it can be updated in one place without code changes

✓ Correct! Well done.

Use the oldest available model to avoid future deprecations

✗ Try again.

Never upgrade — stay on the first model version you used

✗ Try again.

38. What are the key differences between Claude 4 and earlier Claude 3 generation models?

Claude 4 (and the Claude 4/5 generation more broadly) represents significant advances over Claude 3 across capability, context, and new features. Understanding what changed helps teams make informed migration decisions.

Claude 3 vs Claude 4+ generation comparison
Feature	Claude 3 generation	Claude 4+ generation
Flagship model	Claude 3 Opus	Claude Opus 4.8, Fable 5
Context window	200,000 tokens (Opus 3 max)	1 million tokens (Opus 4.8, Sonnet 5, Fable 5)
Thinking	Not available	Extended thinking (Haiku 4.5) and Adaptive thinking (Opus/Sonnet)
Tool streaming	Beta feature	GA on Sonnet 4.6+, no beta header needed
Computer use	Preview on 3.5 Sonnet	GA on all Claude 4+ models; updated tool versions
Effort parameter	Not available	Available on Opus 4.8 and 4.7
Model ID format	claude-3-opus-20240229	claude-opus-4-8 (no date suffix for newer models)
Extended output	Beta	GA on selected models (300k via Batches API)
Vision (images per request)	Up to 20 images	Up to 600 images (100 for 200k window models)

Migration compatibility: Claude 4+ models use the same Messages API as Claude 3. Most Claude 3 code is compatible with Claude 4 models with just a model ID change. Key things to test after migration: tool call formatting, thinking feature support, and any model-specific prompt tuning that assumed Claude 3 response patterns.

Take quiz

What is the biggest context window improvement from Claude 3 to Claude 4+ (comparing flagship models)?

From 100k to 200k tokens, a 2x increase

✗ Try again.

From 200k to 1 million tokens — a 5x increase

✓ Correct! Well done.

Context window size did not change between generations

✗ Try again.

From 32k to 200k tokens

✗ Try again.

When migrating from a Claude 3 model to a Claude 4+ model, what is typically the minimum code change required?

A full rewrite of the API integration

✗ Try again.

Updating the model ID in the API call — the Messages API structure is the same

✓ Correct! Well done.

Switching from REST to the new GraphQL API

✗ Try again.

Adding a mandatory migration header to all requests

✗ Try again.

About Javapedia.net: Javapedia.net is for Java and J2EE developers, technologists, and college students preparing for interviews. The site also includes many practical examples. It was developed using J2EE technologies by Steve Antony, a senior developer/lead at a logistics company.
	contact: javatutorials2016[at]gmail[dot]com
	Copyright © 2026, javapedia.net, all rights reserved. privacy policy.
