API / Microservices Design Patterns Interview Questions
The Decompose by Business Capability pattern assigns one microservice per business capability — a stable, high-level function the organisation performs to deliver value. A business capability answers "what does this part of the business do?" not "how does the software do it?", so capabilities make durable service boundaries that outlast technology changes.
For an online retailer, a first-pass capability map might look like: Order Management (placement, tracking, cancellations), Inventory Management (stock levels, reservations, replenishment), Customer Management (profiles, preferences, loyalty), Payment Processing (authorisation, settlement, refunds), and Fulfilment (pick, pack, dispatch). Each becomes a microservice candidate.
How to identify capabilities in practice:
- Capability mapping workshops — work top-down with business stakeholders. Ask "what does this team do?" rather than "what systems do they use?". Group answers into named, stable functions.
- Organisational alignment — each recognisable business unit typically owns one or more capabilities. Conway's Law predicts that service structure will mirror team communication patterns.
- Stability test — a valid capability has existed for years and will persist even if the technology stack is replaced entirely. "Accept and fulfil orders" has been a retail capability since long before e-commerce.
- Independence check — if a capability can be changed and deployed without touching a sibling capability, it is a good service boundary. Frequent cross-team coordination to deploy one service is a signal that the boundary is wrong.
Sub-capabilities (e.g., Refunds within Order Management) may become separate services only when they evolve at a different pace, have distinct scaling needs, or are owned by a dedicated team. The deliverable is a capability map — a living diagram that drives service topology and team ownership throughout the programme.
The Decompose by Subdomain pattern uses Eric Evans' Domain-Driven Design taxonomy to carve out service boundaries. A subdomain is a coherent slice of the problem domain. Instead of decomposing by technical layer or org chart, you model the real-world domain first, then map each subdomain to one or a small cluster of services.
DDD classifies every subdomain into one of three types, which directly influence investment and build-vs-buy decisions:
- Core subdomain — the competitive differentiator. This is where the business wins or loses. Custom-built with the best engineers. Example: personalised recommendation engine at a streaming service.
- Supporting subdomain — necessary but not differentiating. Still custom-built but simpler. Example: a notification service that sends emails and push alerts.
- Generic subdomain — commodity functionality available off-the-shelf. Buy or use open source; do not reinvent. Example: user authentication and identity management (Keycloak, Auth0).
A Bounded Context defines the explicit boundary within which a specific domain model applies. Two subdomains may share a term — "Customer" in Sales has a credit limit and purchase history; "Customer" in Support has open tickets and SLA status — but forcing one "Customer" object to serve both contexts creates a bloated model. A Bounded Context keeps these clean and separate.
The relationship between subdomains and Bounded Contexts is often 1:1, but a large core subdomain may be split into multiple Bounded Contexts for team autonomy. Each Bounded Context is a strong microservice candidate with its own data store and ubiquitous language. The Context Map documents how contexts integrate: via Shared Kernel, Customer/Supplier, Conformist, Anti-Corruption Layer, or Open Host Service relationships — each implying different levels of coupling.
The Strangler Fig pattern — coined by Martin Fowler after the strangler fig tree that gradually wraps and replaces its host — is an incremental migration strategy for moving functionality out of a monolith into microservices. Instead of a risky "big bang" rewrite, you build new services alongside the running monolith, route traffic to them one capability at a time, and eventually decommission the hollowed-out monolith.
The three-step migration cycle for each capability:
- Insert a facade — place a reverse proxy, API gateway, or routing layer in front of the monolith. All traffic flows through this facade, which initially passes everything unchanged to the monolith.
- Extract and build — implement the selected capability as a new microservice with its own data store. Migrate the relevant data. Run dark launches or a Parallel Run to validate correctness.
- Redirect — update the facade to route requests for that capability to the new service. The monolith no longer handles it. Repeat for the next capability.
Over many iterations, the monolith shrinks (it is "strangled") until it handles no capabilities and can be switched off.
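A minimal sketch of the facade's routing decision — the paths, service names, and ports are illustrative assumptions, and a real deployment would typically express the same rules in a reverse proxy or API gateway configuration:

// Hypothetical Strangler Fig routing facade (Java): extracted capabilities go to new
// services, everything else still falls through to the monolith
import java.util.LinkedHashMap;
import java.util.Map;

class StranglerFacadeRouter {
    // Grows one entry at a time as capabilities are extracted
    private static final Map<String, String> EXTRACTED = new LinkedHashMap<>();
    static {
        EXTRACTED.put("/api/payments/", "http://payment-service:8080");
        EXTRACTED.put("/api/inventory/", "http://inventory-service:8081");
    }
    private static final String MONOLITH = "http://legacy-monolith:8080";

    static String targetFor(String path) {
        return EXTRACTED.entrySet().stream()
                .filter(e -> path.startsWith(e.getKey()))
                .map(Map.Entry::getValue)
                .findFirst()
                .orElse(MONOLITH); // not yet extracted — still handled by the monolith
    }
}

Once the routing table covers every capability, the monolith entry serves no traffic and can be decommissioned.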
When to use it: Use the Strangler Fig whenever the monolith is too large, complex, or poorly understood to rewrite safely in one step; when the business cannot afford a freeze on new feature delivery during migration; or when the team needs to build confidence in microservice patterns before committing fully. It is the default recommended strategy for production monolith migrations.
When not to use it: If the monolith is small and the codebase is well understood, a targeted rewrite may be faster. Also avoid if the monolith's architecture makes clean extraction practically impossible without massive refactoring first — in that case, Branch by Abstraction (Q5) must precede the extraction.
The Anti-Corruption Layer (ACL) is a translation boundary placed at the edge of a service to prevent an external model — typically from a legacy system or a foreign bounded context — from contaminating the service's own domain model. Without it, the consuming service must adopt the vocabulary, data shapes, and assumptions of the external system, gradually corrupting its clean internal design.
The ACL consists of three collaborating components:
- Facade — presents a clean interface to the internal domain, hiding the existence of the external system entirely.
- Adapter — calls the external system's API, reads its events from a message broker, or queries its data store on behalf of the facade.
- Translator/Mapper — converts between the external model and the internal domain model in both directions. For example, a legacy ERP might call a product a "SKU Item" with a flat unitPrice integer field; the translator converts this into the internal Product aggregate with a proper Money value object (see the sketch below).
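A minimal sketch of that translation under the assumptions above — SkuItemDto, Product, and Money are illustrative names, not a prescribed API:

// Hypothetical ACL translator: legacy ERP "SKU Item" → internal Product aggregate
import java.math.BigDecimal;
import java.util.Currency;

record Money(BigDecimal amount, Currency currency) {}               // internal value object
record Product(String productId, String name, Money unitPrice) {}   // simplified internal aggregate

class SkuItemDto {                  // shape dictated by the legacy ERP
    String skuCode;
    String description;
    long unitPrice;                 // flat integer in minor units (e.g. cents)
    String currencyCode;
}

class LegacyProductTranslator {
    Product toDomain(SkuItemDto dto) {
        Money price = new Money(
                BigDecimal.valueOf(dto.unitPrice).movePointLeft(2),  // minor units → decimal amount
                Currency.getInstance(dto.currencyCode));
        return new Product(dto.skuCode, dto.description, price);
    }
}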
Key use cases:
- Integrating with a legacy monolith during a Strangler Fig migration — the new service has its own clean model while the ACL handles translation to/from the old system.
- Consuming a third-party SaaS API without exposing its schema to your core domain model.
- Bridging two bounded contexts that use overlapping but divergent concepts of the same entity.
The ACL ensures that internal domain objects evolve independently of whatever is outside the boundary. When the external system changes its model, only the ACL needs updating — not the core domain logic. This is especially valuable during long-running migrations where the legacy system remains in use for months or years.
Branch by Abstraction is an incremental migration technique that replaces an existing component without disrupting the codebase or requiring a long-lived code branch. The key mechanic is introducing an abstraction (an interface or abstract class) over the existing component so that all callers depend on the abstraction rather than the concrete implementation. The replacement is then developed behind that same abstraction and switched in when ready.
The four-step process:
- Create the abstraction — introduce an interface or abstract class that captures the component's contract. Update all callers to program against the abstraction, not the concrete class. The system still works exactly as before; only the coupling direction has changed.
- Implement the new version — build the replacement (e.g., a microservice client stub that calls the extracted service) behind the same abstraction. Both the old and new implementations exist simultaneously in the main branch.
- Route clients progressively — use a configuration flag, feature toggle, or simple factory to direct some or all callers to the new implementation while the old one remains available as a fallback.
- Remove the old implementation — once the new implementation is validated in production and all traffic is routed to it, delete the old code. The abstraction itself may also be removed if it no longer serves a purpose.
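A minimal Java sketch of the mechanics; PricingEngine, the two implementations, and the toggle flag are illustrative assumptions:

// Step 1: the abstraction every caller depends on
interface PricingEngine {
    double priceFor(String sku);
}

// Old implementation — the logic still living inside the monolith
class LegacyPricingEngine implements PricingEngine {
    public double priceFor(String sku) { /* existing in-process calculation */ return 9.99; }
}

// Step 2: new implementation — a client stub calling the extracted service
class RemotePricingEngine implements PricingEngine {
    public double priceFor(String sku) { /* HTTP/gRPC call to pricing-service */ return 9.99; }
}

// Step 3: a toggle/factory routes callers; the old path remains as a fallback
class PricingEngineFactory {
    static PricingEngine create(boolean useRemotePricing) {
        return useRemotePricing ? new RemotePricingEngine() : new LegacyPricingEngine();
    }
}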
The critical property of this pattern is that the main codebase remains in a releasable state throughout the migration. There is no feature branch that diverges from main for weeks; the entire process happens in small, shippable increments on the trunk. This makes it a natural complement to the Strangler Fig pattern: Branch by Abstraction prepares the seam at which the Strangler Fig can extract a capability.
The Parallel Run pattern runs an old and a new implementation simultaneously against the same live production input, comparing their outputs to verify correctness before committing to the new system. The legacy system's response is always returned to the caller — it remains the source of truth. The new system's response is captured asynchronously, compared in the background, and any discrepancies are surfaced to developers.
The flow for each production request:
- An intercepting component (a routing layer, the calling service, or a library) fans the request out to both the legacy system and the new service.
- The legacy system's response is returned to the caller immediately — no user impact if the new service is slow or fails.
- The new service's response is captured asynchronously and compared with the legacy response field-by-field.
- Mismatches are logged with enough context for developers to reproduce and diagnose the discrepancy.
- When the mismatch rate reaches zero over a sustained period, the new service takes over as authoritative and the legacy path is removed.
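A minimal sketch of the intercepting component, assuming both implementations can be invoked behind the same interface; the legacy answer is always returned immediately and the comparison happens off the hot path:

// Hypothetical parallel-run wrapper: legacy result is authoritative, new result is only compared
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ParallelRun<T> {
    T run(Supplier<T> legacy, Supplier<T> candidate, String requestContext) {
        T legacyResult = legacy.get();                      // always returned to the caller
        CompletableFuture
                .supplyAsync(candidate)                     // new implementation runs asynchronously
                .thenAccept(candidateResult -> {
                    if (!legacyResult.equals(candidateResult)) {
                        // log enough context to reproduce and diagnose the mismatch
                        System.err.printf("MISMATCH [%s]: legacy=%s candidate=%s%n",
                                requestContext, legacyResult, candidateResult);
                    }
                })
                .exceptionally(ex -> null);                 // candidate failures never reach the user
        return legacyResult;
    }
}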
GitHub's open-source Scientist library (Ruby) popularised this technique under the name "controlled experiments". The pattern is particularly valuable for stateful, business-critical calculations — pricing engines, financial reconciliations, eligibility rules — where unit tests cannot fully cover the diversity of real production inputs.
The key safety guarantee: the new service can produce wrong answers, time out, or crash during the parallel phase, and no user is ever affected. This makes it possible to run experiments on 100% of production traffic while accepting zero user-facing risk from the new implementation.
The Bulkhead pattern — named after the watertight compartments in a ship's hull that prevent a single breach from flooding the entire vessel — partitions a system into isolated failure domains so that a critical failure in one domain cannot cascade to others. In the context of service decomposition, it means deliberately grouping services, their infrastructure, and their resource pools into segments that share no mutable state or critical resources with adjacent segments.
A concrete decomposition example: an e-commerce platform partitions into a Browse & Search bulkhead (product catalog, search index, recommendations) and a Checkout & Payments bulkhead (cart, order placement, payment gateway). Even if the Elasticsearch cluster powering search becomes overloaded or crashes entirely, the checkout flow is completely unaffected — it uses a separate set of services, database clusters, thread pools, and message broker topics.
Isolation strategies applied at each level:
- Process isolation — separate containers or OS processes mean a crash or OOM in one service does not affect another.
- Thread/connection pool isolation — each downstream dependency gets its own bounded pool, preventing a slow dependency from exhausting shared resources (this is the resource-level Bulkhead, covered in Q28).
- Infrastructure isolation — separate database clusters, separate message broker partitions, and separate network segments per bulkhead limit the blast radius of an infrastructure failure.
- Deployment isolation — placing bulkheads in separate Kubernetes namespaces, availability zones, or cloud regions ensures that a zone-level outage affects only one bulkhead.
The trade-off is cost: infrastructure isolation requires duplicated resources. Bulkheads are most justified on revenue-critical paths where the cost of cascading failure — lost transactions, SLA breaches, reputational damage — outweighs the overhead of duplication.
The Database per Service pattern mandates that each microservice owns its own persistent data store exclusively. No other service may directly read or write to that store — access is only possible through the owning service's published API. The store may be a separate schema in the same RDBMS engine, a fully separate server instance, or an entirely different database technology chosen to match the service's data model and access patterns.
The core problem it solves is structural data coupling. When services share a database, a schema change in one table can silently break every other service that reads it. Two teams must coordinate every deployment that touches shared tables, making independent deployment — a foundational goal of microservices — impossible in practice.
Benefits enabled by this pattern:
- Independent deployment — schema migrations are scoped to a single service. No cross-team release coordination required.
- Polyglot persistence — each service chooses the database best suited to its workload: relational for orders, document store for product catalog, time-series for IoT metrics, graph for social connections.
- Fault isolation — a database outage in one service does not directly cascade to other services that have separate stores.
- Independent scaling — a high-read service can add read replicas or a caching layer without affecting other services' data infrastructure.
The trade-off is that cross-service queries cannot use SQL JOINs. Queries that used to be a single SQL statement across multiple tables must now be composed at the application level using the API Composition pattern (Q14) or a dedicated CQRS read model (Q12). Cross-service writes must use the Saga pattern (Q10) rather than a single ACID transaction.
The Shared Database anti-pattern occurs when two or more microservices bypass each other's APIs to directly read from and write to the same database schema. It is the most common mistake teams make when splitting a monolith, because it initially appears to be the easiest path — split the code but keep a single database.
Why it fundamentally undermines the microservices model:
- Schema coupling — any team wanting to rename a column, add a NOT NULL constraint, or change a table structure must coordinate with every team whose service touches that table. A simple schema change becomes a multi-team, multi-sprint event.
- Loss of independent deployability — if Service A changes the orders table, Service B and Service C must be updated and redeployed simultaneously. The services cannot be independently deployed.
- Hidden dependencies — the coupling is invisible at the API level. No OpenAPI spec or contract test captures it. It surfaces unexpectedly as a runtime breakage during incidents.
- Technology lock-in — all services are forced to use the same database engine, preventing polyglot persistence and specialised data modelling.
- Operational coupling — a runaway query or bulk migration in one service can saturate the shared database's connection pool and I/O capacity, degrading every other service that shares the same instance.
The correct alternative is the Database per Service pattern (Q8), with cross-service reads handled via API Composition (Q14) or CQRS (Q12), and cross-service writes handled via Sagas (Q10). The short-term pain of separating data stores pays back quickly in deployment independence and incident isolation.
The Saga pattern manages a long-running business transaction that spans multiple services without using a distributed two-phase commit (2PC). A Saga is a sequence of local transactions: each step performs a local commit and then publishes an event or sends a command to trigger the next step. If a step fails, the Saga executes compensating transactions — semantic undos — for each previously completed step.
Example — Place Order Saga:
FORWARD STEPS
1. Order Service → INSERT order (status=PENDING)
→ emit OrderCreated
2. Inventory Svc → UPDATE stock (reserve qty)
→ emit StockReserved OR StockReservationFailed
3. Payment Svc → charge customer card
→ emit PaymentProcessed OR PaymentFailed
4. Order Service → UPDATE order (status=CONFIRMED)
COMPENSATION (if PaymentFailed at step 3)
← Inventory Svc → release reservation (compensate step 2)
← Order Svc → UPDATE order (status=CANCELLED) (compensate step 1)
Key properties:
- ACD, not full ACID — Sagas provide Atomicity (all steps complete or are compensated), Consistency (at application level), and Durability, but not Isolation. Intermediate states are visible to concurrent operations, requiring careful handling of anomalies such as dirty reads and lost updates.
- Eventual consistency — the system reaches a globally consistent state eventually, not immediately after each step.
- Two coordination styles — Choreography (event-driven, no central coordinator) and Orchestration (central saga orchestrator directs participants). See Q11 for the comparison.
Sagas are the standard replacement for distributed transactions in microservice architectures because they work across heterogeneous data stores and do not require all participating services to hold locks simultaneously.
Both styles implement the Saga pattern (Q10) but differ fundamentally in how the steps are coordinated. In Choreography, there is no central authority: each service listens for domain events published by the preceding step and reacts autonomously, emitting its own event to trigger the next participant. In Orchestration, a dedicated Saga orchestrator sends explicit commands to each participant and receives success/failure responses, driving the workflow from a single place.
| Aspect | Choreography | Orchestration |
|---|---|---|
| Coordination | Implicit via domain events on a message broker | Explicit commands from a central orchestrator |
| Coupling | Services are coupled to event topics, not to each other | Orchestrator is coupled to each participant service |
| Visibility | Flow is distributed across services; hard to visualise end-to-end | Entire saga flow is explicit in the orchestrator's state machine |
| Debugging | Requires correlating events across multiple logs and services | Orchestrator state shows exact step and failure point |
| Scalability | Good — no central bottleneck | Orchestrator can become a bottleneck at very high throughput |
| Best for | Simple 2–3 step sagas; teams that already use event-driven patterns | Complex multi-step sagas with many compensations and error paths |
Choreography example — OrderService emits OrderCreated; InventoryService listens and emits StockReserved; PaymentService listens and emits PaymentProcessed; OrderService listens and marks the order confirmed. No single component knows the full flow.
Orchestration example (using AWS Step Functions or Temporal):
OrderSagaOrchestrator:
1. send ReserveStockCommand → InventoryService
2. on StockReserved: send ChargePaymentCommand → PaymentService
3. on PaymentProcessed: send ConfirmOrderCommand → OrderService
4. on PaymentFailed: send ReleaseStockCommand → InventoryService (compensate)
CQRS separates a service's data model into two distinct paths: a Command side that handles writes (state changes) and a Query side that handles reads. Each side can use a different data store, different data model, and even a different technology stack, optimised independently for its purpose.
On the Command side, commands express intent (PlaceOrder, UpdateShippingAddress) and mutate the authoritative write model. On the Query side, pre-built, denormalised read models serve specific views efficiently — for example, an order summary view that joins order, customer, and product data is materialised as a single flat document that can be served without any JOINs.
// Command side (write)
class PlaceOrderCommand { orderId, customerId, items[] }
class OrderCommandHandler {
handle(PlaceOrderCommand cmd) {
Order order = new Order(cmd.orderId, cmd.customerId, cmd.items);
orderRepository.save(order); // write to normalised DB
eventBus.publish(new OrderPlaced(order)); // update read models
}
}
// Query side (read)
class OrderSummaryQuery { orderId }
class OrderQueryHandler {
handle(OrderSummaryQuery q) {
return orderSummaryReadModel.findById(q.orderId); // pre-built view
}
}
When to use CQRS:
- Read and write traffic profiles differ significantly (e.g., 100:1 read-to-write ratio) and require separate scaling strategies.
- Complex domain logic on the write side conflicts with simple, fast reads that need denormalised projections.
- You are using Event Sourcing (Q13), which pairs naturally with CQRS because events update separate read projections.
- The service needs to serve multiple different view shapes to different consumers (mobile, web, analytics) without a one-size-fits-all query model.
CQRS adds complexity: two models to maintain, eventual consistency between them (the read side lags the write side), and more infrastructure. It is over-engineering for simple CRUD services where a single model suffices.
Event Sourcing stores the state of a domain entity not as its current snapshot in a row, but as an append-only log of every domain event that has ever happened to it. The current state is derived on demand by replaying all events for that entity from the beginning (or from the most recent snapshot). Nothing is ever deleted or updated in place — the event log is immutable.
// Event store: append-only
events for Order#42:
[1] OrderPlaced { customerId: C1, items: [...] } t=09:00
[2] AddressUpdated { newAddress: "123 Main St" } t=09:05
[3] PaymentReceived { amount: 59.99, txnId: T77 } t=09:07
[4] OrderShipped { carrier: "UPS", trackingId: U99 } t=09:30
// Replay to derive current state
Order order = new Order();
events.forEach(e -> order.apply(e));
// order.status == SHIPPED
Key properties:
- Full audit trail — every state transition is recorded with its timestamp and actor. No separate audit log needed.
- Temporal queries — you can reconstruct the state of any entity at any point in time by replaying up to a given event position.
- Event replay — if a bug introduced wrong state, replay events through the fixed logic to regenerate correct state.
- Snapshots — for entities with thousands of events, periodic snapshots cache the state at a point in time, allowing replay to start from the snapshot rather than event zero.
Complementing CQRS: Event Sourcing and CQRS are a natural pairing. Every event written to the write-side event store is also published to subscribers that update one or more denormalised read projections (the CQRS query side). Each projection can be an independent view optimised for a specific consumer: a mobile summary, a warehouse pick list, an analytics fact table. When a projection's logic changes, it can be rebuilt by replaying the complete event history.
The API Composition pattern implements a query that requires data from multiple microservices by having an API composer — typically the API gateway, a BFF, or a dedicated aggregation service — call each relevant service in parallel, then join and transform the results in memory before returning a single response to the caller. It replaces the cross-service SQL JOIN that becomes impossible when each service owns its own database.
For example, a "Get Order Details" screen needs data from three services: Order Service (order status, items, timestamps), Customer Service (name, shipping address), and Product Service (product names, images). The composer calls all three — ideally in parallel — and merges the results into one response document.
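A minimal composer sketch for that screen, assuming each service exposes an asynchronous client; the client interfaces and the OrderDetails shape are illustrative:

// Hypothetical API composer: three parallel calls merged in memory into one response
import java.util.List;
import java.util.concurrent.CompletableFuture;

record OrderDetails(Object order, Object customer, List<Object> products) {}

class OrderDetailsComposer {
    private final OrderClient orders;
    private final CustomerClient customers;
    private final ProductClient products;

    OrderDetailsComposer(OrderClient o, CustomerClient c, ProductClient p) {
        this.orders = o; this.customers = c; this.products = p;
    }

    CompletableFuture<OrderDetails> getOrderDetails(String orderId) {
        CompletableFuture<Object> orderF          = orders.getOrder(orderId);
        CompletableFuture<Object> customerF       = customers.getCustomerForOrder(orderId);
        CompletableFuture<List<Object>> productsF = products.getProductsForOrder(orderId);

        // Join in memory once all three responses have arrived
        return CompletableFuture.allOf(orderF, customerF, productsF)
                .thenApply(v -> new OrderDetails(orderF.join(), customerF.join(), productsF.join()));
    }
}

interface OrderClient    { CompletableFuture<Object> getOrder(String id); }
interface CustomerClient { CompletableFuture<Object> getCustomerForOrder(String id); }
interface ProductClient  { CompletableFuture<List<Object>> getProductsForOrder(String id); }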
The approach works well when:
- The amount of data being joined is manageable in memory (hundreds to low thousands of records).
- Queries do not require complex aggregations such as GROUP BY, SUM, or window functions across large datasets.
- Response-time SLAs are met even after adding the latency of parallel service calls.
Limitations of API Composition:
- No transactional consistency — data is fetched from multiple services at different instants, so results may reflect slightly different states.
- Scalability — joining large datasets (e.g., all orders in the last year with full customer profiles) in memory is expensive and may exhaust the composer's heap.
- Complexity — partial failures (one service is down) must be handled gracefully; the composer must decide whether to return partial results or fail the request.
For queries that are too complex or too large for in-memory joining, the CQRS pattern (Q12) with a pre-built, denormalised read model is the preferred alternative.
The dual-write problem arises when a service must atomically write to its own database and publish a message to a message broker in a single operation. If it writes to the DB but crashes before publishing, other services never learn about the change. If it publishes first but the DB write fails, it emits an event for something that never happened. Standard distributed transactions (2PC) across a database and a broker are too heavy and often unsupported.
The Outbox Pattern solves this by writing both the domain change and the message to-be-published in a single local database transaction, then using a relay process to forward outbox records to the broker asynchronously.
-- Same local DB transaction:
BEGIN;
INSERT INTO orders (id, status, ...) VALUES (42, 'PENDING', ...);
INSERT INTO outbox (id, aggregate_type, aggregate_id, event_type, payload)
VALUES (gen_uuid(), 'Order', 42, 'OrderPlaced', '{...json...}');
COMMIT;
-- Relay process (Message Relay / Transactional Outbox Relay):
LOOP:
rows = SELECT * FROM outbox WHERE published = false ORDER BY created_at LIMIT 100;
FOR EACH row:
broker.publish(topic=row.event_type, body=row.payload);
UPDATE outbox SET published = true WHERE id = row.id;
Two relay strategies exist:
- Polling publisher — a background thread or scheduled job polls the outbox table and publishes unpublished records. Simple but adds slight latency and DB load.
- Transaction log tailing — tools like Debezium use the database's CDC (Change Data Capture) log (e.g., PostgreSQL WAL, MySQL binlog) to detect outbox inserts and publish them. Near-real-time with minimal DB overhead.
The relay must publish at-least-once (idempotency key = outbox row ID), so consumers must implement the Idempotent Consumer pattern (Q22) to handle rare duplicates gracefully.
In a Saga (Q10), when a step fails, previously completed steps cannot be undone with a database ROLLBACK because each step has already committed its local transaction and those locks are released. Instead, the Saga executes compensating transactions — purpose-built operations that reverse the business effect of each completed step in reverse order.
A compensating transaction is a semantic undo, not a technical rollback. The key distinction:
- A technical rollback is performed by the database engine before a transaction commits — it undoes uncommitted SQL statements.
- A compensating transaction is a new, forward-moving operation that creates the business-level opposite of an already-committed action.
Example compensations:
- Forward step: reserve 5 units of stock → Compensation: release 5 units of stock reservation
- Forward step: charge customer 9.99 → Compensation: refund customer 9.99
- Forward step: create order in PENDING status → Compensation: update order status to CANCELLED
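A minimal sketch of how an orchestrator can pair each forward step with its compensation and unwind completed steps in reverse order; the SagaStep interface is an illustrative assumption:

// Hypothetical saga step with a paired semantic undo
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

interface SagaStep {
    void execute();        // forward local transaction (commits immediately)
    void compensate();     // business-level opposite of the committed action
}

class PlaceOrderSaga {
    void run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.execute();
                completed.push(step);
            } catch (RuntimeException failure) {
                // unwind already-committed steps in reverse order
                while (!completed.isEmpty()) {
                    completed.pop().compensate();   // each compensation must be idempotent
                }
                throw failure;
            }
        }
    }
}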
Important edge cases:
- Pivot transactions — not all Saga steps can be compensated. A step that sends a physical shipment or charges a non-refundable fee is a pivot transaction; if it succeeds, the Saga must complete rather than roll back.
- Retriable transactions — some steps after the pivot are guaranteed to succeed eventually (e.g., updating an order status). These steps are retried until success rather than being compensated.
- Idempotency — compensating transactions may be retried if the Saga coordination infrastructure fails, so each compensation must be idempotent.
The API Gateway is a single entry point that sits between external clients and the internal microservice topology. Rather than exposing each service's API directly to the internet, all traffic flows through the gateway. It handles cross-cutting concerns so that individual services do not have to implement them repeatedly.
Core gateway responsibilities:
- Request routing — maps an inbound URL/path to the appropriate downstream service endpoint.
- Authentication and authorisation — validates JWT tokens or API keys, optionally enriches the request with user identity claims before forwarding.
- SSL/TLS termination — terminates HTTPS at the gateway; downstream traffic on the internal network may use HTTP or mTLS separately.
- Rate limiting and throttling — enforces per-client request budgets to prevent abuse.
- Request/response transformation — rewrites headers, translates between REST and gRPC, aggregates partial responses.
# Example: Kong or Nginx API Gateway routing rule
/api/orders/* → http://order-service:8080
/api/products/* → http://product-service:8081
/api/customers/* → http://customer-service:8082
A general-purpose gateway serves all client types (mobile, web, third-party) through a single API surface. The Backend for Frontend (BFF) pattern (Q18) splits this into client-type-specific gateways when different clients have materially different needs — mobile needs smaller payloads and fewer fields; web needs richer aggregated responses. Moving client-specific transformation logic into a BFF prevents the general gateway from accumulating ever-growing, client-specific business logic that makes it fragile and slow to change.
The rule of thumb: use a general gateway for cross-cutting platform concerns (auth, SSL, rate limiting). Use a BFF for client-tailored data shaping and aggregation. The two can coexist — the BFF sits behind the general gateway.
The Backend for Frontend (BFF) pattern creates a dedicated API backend for each distinct client type — one BFF for the mobile app, one for the web SPA, one for third-party integrations. Each BFF is owned by the team building that frontend and is free to shape, aggregate, and optimise responses exactly as its client needs, without compromising the API shape that other clients rely on.
The driving insight is that different clients have genuinely different needs. A mobile app on a 4G connection needs lightweight payloads with only the fields it displays. A web dashboard needs richer, pre-aggregated data across multiple services. A third-party partner API needs a stable, versioned contract independent of UI feature work.
Client request: GET /mobile/orders/42
Mobile BFF:
parallel fetch:
order = orderService.get(42) // id, status, total only
status = shippingService.track(42) // latest event only
return { id, status, total, latestTracking } // 4 fields, ~200 bytes
Client request: GET /web/orders/42
Web BFF:
parallel fetch:
order = orderService.get(42) // full order model
customer = customerService.get(order.customerId)
items = productService.getBulk(order.itemIds)
return { order, customer, itemDetails } // rich object, ~4 KB
When to use BFF over a general gateway:
- Multiple client types exist with divergent payload, filtering, or aggregation requirements.
- Frontend teams are blocked by a shared gateway team whenever they need API changes.
- Mobile clients suffer from over-fetching because the API was designed for a richer web client.
BFFs and a general API gateway often coexist: the general gateway sits at the edge and handles cross-cutting concerns (auth, SSL, DDoS); each BFF sits behind it and handles client-specific orchestration. A BFF is not a replacement for the gateway — it is a specialisation layer on top.
A Service Mesh is an infrastructure layer that handles all service-to-service communication concerns — traffic management, mutual TLS, retries, circuit breaking, observability — without requiring application code to implement any of it. It consists of two planes:
- Data plane — a sidecar proxy (Envoy, Linkerd-proxy) injected into every pod. All inbound and outbound network traffic for the application container flows through the sidecar. The sidecar applies policies, collects telemetry, and enforces mTLS transparently.
- Control plane — manages and configures the sidecar fleet (Istio Pilot, Linkerd control plane). It distributes routing rules, certificates, and traffic policies to each proxy. The control plane is never in the hot path of production traffic.
# Istio VirtualService — traffic splitting for canary release
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: reviews
spec:
hosts: [reviews]
http:
- route:
- destination:
host: reviews
subset: v1
weight: 90
- destination:
host: reviews
subset: v2
weight: 10
What Envoy (data plane) handles per-request:
- mTLS — terminates inbound TLS and initiates outbound TLS with the peer's certificate, providing service identity without code changes.
- Retries and timeouts — configurable retry budgets and per-route timeouts enforced at the proxy level.
- Circuit breaking — ejects unhealthy upstream hosts from the load-balancing pool.
- Distributed tracing — propagates W3C traceparent headers and emits Zipkin-compatible spans.
- Traffic splitting — routes a percentage of traffic to canary versions of a service without touching application code.
The Service Mesh is appropriate when an organisation operates many services written in multiple languages and wants consistent, policy-driven networking without embedding SDK-level resilience logic in every service.
The Message Broker pattern introduces a durable intermediary — the broker (Apache Kafka, RabbitMQ, AWS SQS/SNS) — between a producer service and one or more consumer services. The producer publishes a message to the broker and returns immediately without waiting for consumers to process it. Consumers pull messages when ready. The broker stores messages until they are acknowledged, providing durability and decoupling.
// Producer (Java/Kafka)
ProducerRecord<String, String> record =
new ProducerRecord<>("order-events", orderId, orderEventJson);
kafkaProducer.send(record); // non-blocking; returns immediately
// No dependency on whether any consumer is alive
// Consumer (Java/Kafka)
ConsumerRecords<String, String> records =
    kafkaConsumer.poll(Duration.ofMillis(500));
for (ConsumerRecord<String, String> r : records) {
    orderEventHandler.handle(r.value());
}
kafkaConsumer.commitSync(); // acknowledge offsets only after every polled record is processed
Types of messaging models:
- Publish/Subscribe (topic) — one message is delivered to all subscribed consumer groups. Used for domain events (e.g., Kafka topics). Multiple independent consumers receive the same event.
- Point-to-point (queue) — each message is delivered to exactly one consumer. Used for task distribution (e.g., SQS, RabbitMQ queues). Competing consumers load-balance across queue messages.
The pattern solves temporal coupling: with direct HTTP calls, the caller must wait for the receiver to be available. With a broker, the producer can publish even if all consumers are down — messages accumulate and are processed when consumers recover. It also provides rate smoothing: if a producer bursts messages faster than consumers can process, the broker absorbs the burst and consumers drain at their own pace.
The Request-Reply pattern enables synchronous-like request/response semantics over an asynchronous message channel. The requestor sends a message to a request channel, attaches a unique Correlation ID and a reply-to address (a dedicated reply channel or a temporary queue), and waits for a response. The responder processes the request, copies the Correlation ID into its reply, and publishes the response to the reply channel. The requestor matches incoming responses to pending requests using the Correlation ID.
// Requestor
String correlationId = UUID.randomUUID().toString();
Message request = MessageBuilder
.withBody(payload)
.setHeader("correlationId", correlationId)
.setHeader("replyTo", "order-reply-queue")
.build();
requestChannel.send(request);
CompletableFuture<Response> pending = new CompletableFuture<>();
pendingRequests.put(correlationId, pending); // registry of in-flight requests keyed by Correlation ID
// ... asynchronously wait for response on "order-reply-queue"
// Responder
Message request = requestChannel.receive();
String corrId = request.getHeaders().get("correlationId");
Response resp = processRequest(request.getBody());
Message reply = MessageBuilder.withBody(resp)
.setHeader("correlationId", corrId).build();
replyChannel.send(reply);
// Requestor matches incoming reply
String corrId = reply.getHeaders().get("correlationId");
pendingRequests.remove(corrId).complete(reply.getBody());
The Correlation ID is essential when multiple in-flight requests use the same reply channel: without it, the requestor cannot determine which response corresponds to which request. If two concurrent requests share the same Correlation ID, each requestor will receive the wrong reply or the ID collision will cause a missed response. IDs must therefore be globally unique (UUID v4 is standard) and the pending-request registry must be thread-safe.
The Idempotent Consumer pattern ensures that processing the same message more than once produces the same outcome as processing it exactly once. It is essential because virtually all message brokers (Kafka, RabbitMQ, SQS) guarantee at-least-once delivery — a message may be redelivered after a consumer crashes before acknowledging, after a network partition, or during broker rebalancing. Without idempotency, redelivery causes duplicate side effects: double charges, duplicate shipments, over-reserved inventory.
// Idempotent consumer using deduplication table
public void handleOrderPlaced(OrderPlacedEvent event) {
String msgId = event.getMessageId(); // unique per message
if (processedMessages.exists(msgId)) {
log.info("Duplicate message {}, skipping", msgId);
return; // idempotency guard: already processed
}
// process inside a transaction that also inserts the msgId
transactionTemplate.execute(status -> {
orderRepository.createFrom(event);
processedMessages.insert(msgId, Instant.now());
return null;
});
}
Implementation strategies:
- Deduplication table — persist message IDs (or idempotency keys) in a table. Before processing, check if the ID exists. Insert the ID and process in the same transaction so a crash between processing and acknowledging still results in a consistent state on retry.
- Natural idempotency — design operations that are inherently idempotent. UPDATE orders SET status='CONFIRMED' WHERE id=42 is idempotent; running it twice has no extra effect. But INSERT INTO charges (amount, orderId) VALUES (59.99, 42) is not — it creates duplicate rows.
- Conditional update — use an optimistic locking version or a state-machine check (WHERE status='PENDING') to ensure the operation only applies in the correct state, making reprocessing a no-op if the state has already advanced.
The deduplication store must be co-located or transactionally integrated with the main data store, otherwise the check-then-insert itself has a race condition.
Event-Driven Architecture (EDA) structures communication around events — immutable records of something that has happened. A producer emits an event to a broker and moves on without knowing or caring who consumes it. Consumers subscribe to events and react asynchronously and independently. No party waits for another.
In synchronous request/response, the caller blocks until the called service returns a result. This creates three forms of coupling:
- Temporal coupling — caller and callee must both be alive at the same instant.
- Behavioural coupling — the caller depends on the callee's response structure and error codes.
- Performance coupling — the caller's response time is bounded below by the callee's processing time.
EDA eliminates all three. The producer has no knowledge of its consumers; a new consumer can subscribe to existing events without any producer code change. The system can scale consumer instances independently, and a slow consumer does not delay the producer.
Trade-offs of EDA:
- Eventual consistency — consumers process events asynchronously; the system is not instantly consistent after an event is published.
- Harder debugging — end-to-end request flows are reconstructed by correlating events across distributed logs rather than reading a single call stack.
- Event schema evolution — changing an event's schema is a breaking change for all consumers; requires careful versioning (additive changes only, or explicit versioned event types).
- No immediate response — EDA is unsuitable for interactions that require a synchronous return value (e.g., a login that must return a JWT).
EDA and request/response often coexist in the same system: synchronous for user-facing read operations that need immediate results; asynchronous events for state change propagation across service boundaries.
These three responsibilities are often all assigned to an API Gateway, but they serve distinct purposes and are worth understanding separately.
| Responsibility | What it does | Example |
|---|---|---|
| Gateway Routing | Forwards an inbound request to a single downstream service based on URL path, host, or header | GET /orders/* → Order Service; GET /products/* → Product Service |
| Gateway Aggregation | Fans a single inbound request out to multiple downstream services, waits for all responses, and merges them into one reply | A dashboard request calls Order Service, Customer Service, and Loyalty Service in parallel and returns a single combined payload |
| Gateway Offloading | Handles cross-cutting concerns on behalf of all services, so each service does not need to implement them individually | SSL/TLS termination, JWT validation, rate limiting, request logging, response compression, CORS headers |
In practice all three often live in the same gateway process, but separating them conceptually helps when deciding how to split responsibilities between a general API gateway (routing + offloading) and a BFF (aggregation + client-specific transformation).
Gateway Offloading deserves special emphasis: it prevents copy-paste of security and infrastructure code across dozens of services. A service that relies on the gateway for SSL termination, rate limiting, and JWT validation contains zero infrastructure boilerplate — only domain logic. If the JWT validation algorithm changes, one gateway configuration update covers every service instantly.
The Circuit Breaker pattern prevents cascading failures by detecting when a downstream service is unavailable and fast-failing subsequent calls instead of letting them queue up and exhaust threads. It is named after the electrical circuit breaker that trips when current exceeds a safe threshold.
The pattern has three states:
- Closed (normal) — calls pass through to the downstream service. The breaker counts failures. When the failure rate (or failure count) exceeds a configured threshold within a time window, the breaker trips to Open.
- Open — all calls are immediately rejected with an error (or the Fallback is invoked) without contacting the downstream service. A timer starts. This gives the failing service time to recover without being bombarded with traffic.
- Half-Open — after the timer expires, the breaker allows a limited number of probe requests through. If they succeed, the breaker transitions back to Closed. If they fail, it returns to Open.
// Resilience4j Circuit Breaker (Java)
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
.failureRateThreshold(50) // open if >50% fail
.waitDurationInOpenState(Duration.ofSeconds(30))
.permittedNumberOfCallsInHalfOpenState(5)
.slidingWindowSize(10)
.build();
CircuitBreaker cb = CircuitBreakerRegistry.of(config)
.circuitBreaker("inventoryService");
Supplier<Stock> decorated = CircuitBreaker
.decorateSupplier(cb, () -> inventoryClient.getStock(sku));
Try<Stock> result = Try.ofSupplier(decorated)
.recover(CallNotPermittedException.class, e -> defaultStock());
The circuit breaker is most effective when combined with a Fallback (Q31) — when the circuit is Open, the fallback returns a cached or degraded response so the caller can still serve the request in a degraded mode rather than propagating a hard error to the user.
The Retry pattern automatically re-attempts a failed operation a limited number of times before declaring it a final failure. On its own, retrying immediately (fixed delay or no delay) can overwhelm a struggling downstream service. Exponential backoff solves this by increasing the delay between retries exponentially:
delay(attempt) = base * 2^attempt
// with base = 1 s: attempt 0 → 1 s, attempt 1 → 2 s, attempt 2 → 4 s, attempt 3 → 8 s ...
Adding jitter (random noise) to the backoff prevents the thundering herd problem — when many clients retry simultaneously after a shared outage, they all back off to the same windows and hammer the recovering service in synchronized bursts. With jitter, each client's retry time is randomised within the backoff window:
// Full jitter (recommended by AWS)
delay(attempt) = random(0, base * 2^attempt)
// Decorrelated jitter
delay(n) = min(cap, random(base, delay(n-1) * 3))
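A minimal Java sketch of a retry loop with exponential backoff and full jitter; the attempt cap, base delay, and the simplification that every caught exception is retryable are assumptions for the example:

// Hypothetical retry helper: exponential backoff with full jitter
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

class RetryWithJitter {
    static <T> T call(Supplier<T> operation, int maxAttempts, long baseMillis, long capMillis)
            throws InterruptedException {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;                                                        // assumed retryable
                long ceiling = Math.min(capMillis, baseMillis * (1L << attempt));
                long sleepMs = ThreadLocalRandom.current().nextLong(ceiling + 1); // random(0, ceiling)
                Thread.sleep(sleepMs);
            }
        }
        throw last;   // final failure after the retry budget is exhausted
    }
}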
When NOT to retry:
- Non-idempotent operations — a POST /charges that creates a charge must not be retried without idempotency keys; retrying will create duplicate charges.
- 4xx client errors — HTTP 400 (Bad Request), 401 (Unauthorised), 403 (Forbidden), 404 (Not Found). These indicate problems with the request itself; retrying will produce the same failure.
- 429 Too Many Requests — only retry after respecting the Retry-After header; retrying aggressively makes the rate-limit situation worse.
- Circuit is Open — if the circuit breaker has already tripped, adding retries amplifies the load on an already-failing service.
- Deadlines exceeded — if the caller's overall timeout budget is already exhausted, retrying only prolongs the client's wait without any chance of success within budget.
The Timeout pattern sets an upper bound on how long a caller will wait for a response from a downstream service. Without timeouts, a slow or unresponsive service causes the calling service's request-handling threads to block indefinitely. When enough threads are blocked, the caller's thread pool is exhausted, and it can no longer serve any incoming requests — the failure cascades upstream.
There are two distinct timeout types to configure on every HTTP/gRPC client:
- Connection timeout — the maximum time allowed to establish the TCP connection (and TLS handshake) to the server. Without one, a connect attempt to an unreachable or overloaded host can hang for the OS default, often well over a minute. A connection timeout of 1–3 seconds is typical for internal services.
- Read (socket/response) timeout — the maximum time to wait for the server to send its response after the connection is established. This covers the time the server spends processing the request. Set this to slightly above the service's P99 latency under normal load.
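A minimal sketch using the JDK's built-in HTTP client to show where the two knobs live; the 2-second connect and 3-second response values are illustrative, not recommendations:

// Connection timeout vs. response timeout with java.net.http (Java 11+)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class TimeoutExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))     // TCP connect (and TLS) must finish in 2 s
                .build();

        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://inventory-service:8080/stock/sku-42"))
                .timeout(Duration.ofSeconds(3))            // overall wait for the response on this request
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}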
For asynchronous operations, a deadline (a fixed wall-clock time by which the entire operation must complete) is preferable to a per-hop timeout, because per-hop timeouts can accumulate across a call chain: no single hop exceeds its own budget, yet the total still exceeds the end-user SLA.
Timeout values must be tuned carefully. A timeout that is too short causes unnecessary failures during legitimate traffic spikes; too long defeats the purpose by allowing thread exhaustion before the timeout fires. Use the service's P99 latency measurements as the baseline and add a safety margin (e.g., P99 + 50%).
The Timeout pattern works best in combination with the Circuit Breaker (Q25): once timeouts accumulate and the failure rate crosses the circuit breaker threshold, the circuit opens and stops further timeouts from occurring, protecting the thread pool proactively.
The Bulkhead pattern at the resource level isolates the thread pools and connection pools used to call different downstream dependencies, so that a slow or failed dependency cannot monopolise the shared pool and block calls to unrelated services.
Without Bulkhead: all outbound calls from Service A (to Inventory, Payment, and Notification) share one thread pool. If Inventory becomes slow and holds threads for 30 seconds each, it quickly fills the entire pool. Calls to Payment and Notification then queue up even though those services are healthy.
With Bulkhead: each downstream dependency gets its own bounded pool. Inventory gets 10 threads; Payment gets 10 threads; Notification gets 5 threads. A stalled Inventory pool only blocks Inventory calls.
Two isolation strategies:
- Thread pool isolation — each dependency's calls execute on a dedicated thread pool. The calling thread is released immediately; the dedicated pool thread handles the blocking call. Supports async timeouts because the pool thread can be interrupted. Higher overhead (context switching between pools).
- Semaphore isolation — each dependency is limited to N concurrent calls using a semaphore. No separate thread pool; the calling thread itself makes the blocking call, limited by the semaphore count. Lower overhead but no support for independent timeout interruption — a hung call holds the semaphore and the calling thread.
// Resilience4j Bulkhead (semaphore)
BulkheadConfig config = BulkheadConfig.custom()
.maxConcurrentCalls(10)
.maxWaitDuration(Duration.ofMillis(100))
.build();
Bulkhead bh = BulkheadRegistry.of(config).bulkhead("inventoryService");
Supplier<Stock> decorated = Bulkhead.decorateSupplier(bh,
() -> inventoryClient.getStock(sku));
The Health Check API pattern exposes an HTTP endpoint (typically /health, /actuator/health, or /healthz) that returns the current operational status of a service instance. Load balancers, orchestrators (Kubernetes), and service registries poll this endpoint to determine whether traffic should be routed to an instance and whether it should be restarted.
Two semantically distinct probe types are important (especially in Kubernetes):
- Liveness probe — answers "is this process still alive and not deadlocked?" If it fails, Kubernetes restarts the container. Should only check internal process health — not external dependencies.
- Readiness probe — answers "is this instance ready to serve traffic?" If it fails, the instance is temporarily removed from the load balancer pool (but not restarted). Should check whether all required dependencies (database, downstream services) are reachable.
// Spring Boot Actuator /actuator/health response
{
"status": "UP",
"components": {
"db": {
"status": "UP",
"details": { "database": "PostgreSQL", "result": 1 }
},
"redis": { "status": "UP" },
"diskSpace": {
"status": "UP",
"details": { "free": 10737418240, "threshold": 10485760 }
}
}
}
// Return HTTP 200 when UP; HTTP 503 when DOWN or DEGRADED
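A minimal sketch of a custom dependency check, assuming Spring Boot Actuator; the PaymentGatewayClient and its ping() method are illustrative:

// Hypothetical readiness-style dependency check contributed to /actuator/health
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
class PaymentGatewayHealthIndicator implements HealthIndicator {
    private final PaymentGatewayClient client;   // assumed downstream client

    PaymentGatewayHealthIndicator(PaymentGatewayClient client) { this.client = client; }

    @Override
    public Health health() {
        try {
            long latencyMs = client.ping();      // must be cheap and fast
            return Health.up().withDetail("latencyMs", latencyMs).build();
        } catch (Exception e) {
            return Health.down(e).build();       // appears as "status": "DOWN" in the JSON above
        }
    }
}

interface PaymentGatewayClient { long ping() throws Exception; }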
Best practices for health endpoints:
- Return a structured JSON body, not just an HTTP status code, so operators can diagnose which dependency is unhealthy.
- Never put liveness and readiness logic in the same endpoint if they have different semantics.
- Keep health checks fast (under 1 second); slow health checks look like outages to the load balancer.
- Include a startup probe for slow-starting services to prevent Kubernetes from restarting them prematurely.
The Rate Limiting pattern caps the number of requests a client (identified by IP, API key, or user ID) can make within a time window. When the limit is exceeded, the server rejects excess requests with an HTTP 429 (Too Many Requests) and optionally includes a Retry-After header. It protects services from accidental or malicious overload, enforces fair-use quotas, and prevents a single client from exhausting shared resources.
Five commonly used algorithms:
- Fixed Window Counter — count requests in fixed time windows (e.g., 0–60 s, 60–120 s). Simple and cheap. Weakness: a burst can occur at the boundary — up to 2× the limit in a single window transition.
- Sliding Window Log — store the exact timestamp of each request. Count requests in the rolling window ending at "now". Precise but memory-intensive (O(N) per client).
- Sliding Window Counter — approximate the sliding window by blending the current and previous fixed-window counts using elapsed time fraction. Good accuracy at low memory cost.
- Token Bucket — a bucket fills with tokens at a fixed rate (e.g., 10 tokens/second, bucket size 100). Each request consumes one token. If the bucket is empty, reject. Allows controlled bursting up to the bucket size.
- Leaky Bucket — requests fill a queue (the "bucket"). The bucket drains at a fixed constant rate. Smooths bursty input to a steady output. Excess requests that overflow the bucket are rejected.
# Redis-based Token Bucket (pseudocode)
state  = redis.hgetall("rate:" + clientId)           # { tokens, lastRefillMs } or empty
now    = currentTimeMillis()
tokens = bucketCapacity if state is empty
         else min(bucketCapacity, state.tokens + (now - state.lastRefillMs) / 1000 * refillRatePerSecond)
if tokens < 1:
    return HTTP 429                                   # bucket empty — reject
redis.hset("rate:" + clientId, tokens = tokens - 1, lastRefillMs = now)
redis.expire("rate:" + clientId, windowSeconds)
# proceed with request
Rate limits are commonly enforced at the API Gateway using Redis (for distributed state across multiple gateway replicas) with the Token Bucket or Sliding Window Counter algorithm.
The Fallback pattern provides an alternative response path when a downstream call fails — whether due to a timeout, an exception, or a Circuit Breaker (Q25) in the Open state. Instead of propagating a hard error to the caller (and potentially all the way to the user), the fallback returns a degraded but functional result that lets the system continue operating at reduced capability.
Common fallback strategies:
- Cached response — return the last successfully retrieved value from a local or distributed cache. Works well for product catalogs, user preferences, and feature flags where slightly stale data is acceptable.
- Default/stub value — return a sensible default. An unavailable recommendation engine falls back to a static "top-10 bestsellers" list.
- Degraded feature — disable the feature entirely and return a response that omits the failed component. A loyalty-points display is hidden rather than blocking the checkout page.
- Alternate service — route to a secondary service (e.g., a read replica, a lower-SLA provider, or a local fallback implementation).
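A minimal plain-Java sketch of the cached-response strategy (Product, ProductClient, and the placeholder default are illustrative; libraries such as Resilience4j can attach the same behaviour declaratively):

// Hypothetical cached-response fallback around a downstream call
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

record Product(String sku, String name) {
    static Product placeholder(String sku) { return new Product(sku, "temporarily unavailable"); }
}

interface ProductClient { Product fetch(String sku); }

class ProductCatalogWithFallback {
    private final ProductClient client;
    private final Map<String, Product> lastKnownGood = new ConcurrentHashMap<>();

    ProductCatalogWithFallback(ProductClient client) { this.client = client; }

    Product getProduct(String sku) {
        try {
            Product fresh = client.fetch(sku);       // normal path
            lastKnownGood.put(sku, fresh);           // refresh the fallback cache
            return fresh;
        } catch (Exception downstreamFailure) {
            Product cached = lastKnownGood.get(sku); // degraded path: slightly stale data
            return cached != null ? cached : Product.placeholder(sku);  // last resort: a stub value
        }
    }
}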
Relationship to Circuit Breaker: The Circuit Breaker detects failure and short-circuits calls. The Fallback defines what to do when the circuit is open (or any call fails). They are complementary: the circuit breaker decides that the call should not be attempted; the fallback provides the alternative response. Resilience4j, Hystrix, and similar libraries allow both to be configured on the same decorated method.
A fallback should never do expensive work — it must return quickly. If the fallback itself can fail, it needs its own timeout and should itself degrade gracefully. A fallback that calls yet another slow service is an anti-pattern.
Both Throttling and Rate Limiting control the flow of requests to protect a service from overload, but they differ in what they do to excess traffic.
| Aspect | Rate Limiting | Throttling |
|---|---|---|
| What happens to excess requests | Rejected immediately — HTTP 429 returned | Slowed down, queued, or delayed — response takes longer |
| Client experience | Hard error; client must back off and retry | Slower response; client waits longer but eventually receives a response |
| Use case | Enforcing hard quotas per client (API monetisation, abuse prevention) | Graceful degradation under peak load; ensuring critical traffic is served first |
| Implementation | Counter/token check before processing | Priority queue, token bucket drain with delay, or thread pool queue with bounded size |
In practice, throttling often applies to internal flows — for example, a batch processing service that reads from a database throttles its own read rate to avoid saturating the DB connection pool. Rate limiting is more commonly applied at the external boundary (API gateway) to control client behaviour.
A service can apply both simultaneously: rate limit external clients to prevent abuse (hard cap), while internally throttling its own outbound calls to downstream services (graceful slowdown) to stay within those services' capacity. Throttling at the outbound call level also prevents the Retry storm anti-pattern, where many retries overload a recovering downstream service.
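A minimal sketch of that internal, outbound-side throttling using Guava's RateLimiter (the 50-reads-per-second figure and the repository interface are illustrative); excess work is delayed rather than rejected, which is exactly the distinction in the table above:

// Hypothetical self-throttled batch reader: excess calls are slowed, not rejected
import com.google.common.util.concurrent.RateLimiter;
import java.util.List;

class ThrottledBatchReader {
    // Cap this service's own read pressure on the shared database at ~50 queries/second
    private final RateLimiter limiter = RateLimiter.create(50.0);

    void process(List<String> orderIds, OrderRepository repository) {
        for (String id : orderIds) {
            limiter.acquire();                 // blocks (delays) until a permit is available
            repository.loadAndProcess(id);
        }
    }
}

interface OrderRepository { void loadAndProcess(String id); }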
The Log Aggregation pattern collects log output from every service instance and ships it to a centralised store where it can be searched, correlated, and analysed in one place. Without aggregation, diagnosing an incident across 50 service instances means SSHing into individual machines — impractical at scale.
A typical pipeline (EFK/ELK stack):
- Emit structured logs — each service writes JSON-formatted log events to stdout (preferred in containers) or a log file. Structured logs include fields like timestamp, level, service, traceId, message.
- Log shipper — Fluentd or Filebeat runs as a DaemonSet (one per node in Kubernetes) and tails container log files, applying parsing, filtering, and enrichment rules before forwarding.
- Aggregator/processor — Logstash or Fluentd aggregator buffers, transforms (grok patterns, field extraction), and routes events.
- Storage and search — Elasticsearch (or OpenSearch) indexes log events for full-text and structured queries.
- Visualisation — Kibana (or OpenSearch Dashboards) provides dashboards, search, and alerting.
# Structured JSON log line emitted by a service
{
"timestamp": "2026-04-22T09:01:23.456Z",
"level": "ERROR",
"service": "order-service",
"traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
"spanId": "00f067aa0ba902b7",
"message": "Payment gateway timeout for order 42",
"orderId": 42,
"durationMs": 5001
}
The Correlation ID / Trace ID field is essential: it ties together all log lines from a single end-to-end request across every service that handled it, even when each service writes its logs to a different local file. A single Kibana query on traceId=4bf92f3... shows the entire request journey in chronological order.
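A minimal sketch of how the traceId typically reaches every log line, using SLF4J's MDC (assumes a JSON log encoder such as logstash-logback-encoder that copies MDC fields into the output; class and method names are illustrative):
// Attach the trace ID to the logging context so every line from this request carries it
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

class PaymentLogging {
    private static final Logger log = LoggerFactory.getLogger("order-service");

    static void logTimeout(String traceId, long orderId) {
        MDC.put("traceId", traceId); // normally set once per request by a tracing/servlet filter
        try {
            log.error("Payment gateway timeout for order {}", orderId); // traceId joins the JSON output
        } finally {
            MDC.remove("traceId");   // avoid leaking the ID into unrelated requests on this thread
        }
    }
}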
The Application Metrics pattern instruments each service to emit numeric measurements — counters, gauges, histograms, and summaries — that describe its runtime behaviour. These metrics feed dashboards, alerting rules, and capacity-planning models that plain logs cannot efficiently support (logs are for discrete events; metrics are for continuous numerical trends).
Common metric types:
- Counter — monotonically increasing (e.g., total HTTP requests served, total errors). Never decremented except on process restart.
- Gauge — a value that goes up and down (e.g., current active connections, JVM heap used, queue depth).
- Histogram — distributes observations into configurable buckets (e.g., request latency distribution, enabling P50/P95/P99 calculations).
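As an illustration of the histogram type, a minimal sketch using a Micrometer Timer, which records a latency distribution from which percentiles can be published (metric name and percentile choices are illustrative):
// Histogram-backed timer exposing P50/P95/P99 latency
import io.micrometer.core.instrument.Metrics;
import io.micrometer.core.instrument.Timer;

class RequestTiming {
    private static final Timer requestTimer = Timer.builder("http_server_requests")
            .publishPercentiles(0.5, 0.95, 0.99)   // publish P50/P95/P99 from the distribution
            .register(Metrics.globalRegistry);

    static void handle(Runnable handler) {
        requestTimer.record(handler);              // times the wrapped work and records one observation
    }
}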
Pull model (Prometheus): The Prometheus server periodically scrapes a /metrics HTTP endpoint on each service instance. The service maintains in-memory metric state; Prometheus pulls it on its own schedule.
// Micrometer / Prometheus metric registration (Java)
Counter httpRequests = Counter.builder("http_requests_total")
.tag("method", "GET").tag("status", "200")
.register(Metrics.globalRegistry);
httpRequests.increment();
// Prometheus scrapes GET /actuator/prometheus every 15s
Push model (StatsD, Prometheus Pushgateway): The service actively sends metric updates to a collection agent or gateway. Used for short-lived workloads (batch jobs, serverless functions) that do not run long enough to be scraped.
| Aspect | Pull (Prometheus) | Push (StatsD / Pushgateway) |
|---|---|---|
| Discovery | Prometheus discovers targets via service discovery | Service knows the collector address |
| Short-lived jobs | Poor fit — job may finish before being scraped | Good fit — pushes before exit |
| Load on service | Scrape adds a momentary HTTP request | Service bears cost of every metric push |
Audit Logging records a tamper-evident, chronological trail of who performed what action on which resource and when. It is distinct from application or debug logging: application logs record technical events (exceptions, slow queries, service calls) for operational troubleshooting; audit logs record business-level events for compliance, forensics, and accountability — and must be retained even when the original data is deleted.
Events that should always be captured:
- Authentication events — successful logins, failed login attempts, logouts, and token refresh operations. Essential for detecting credential-stuffing attacks and session anomalies.
- Authorisation decisions — both grants and denials. A denied access attempt to a sensitive endpoint may indicate a privilege-escalation attempt.
- Privileged record reads — when a user or service reads personally identifiable information (PII), financial records, or health data. Commonly required under regulations such as GDPR, HIPAA, and PCI-DSS.
- Create, Update, Delete on critical entities — changes to user accounts, payment methods, configuration, permissions, and order state.
- Administrative actions — role assignments, system configuration changes, secret rotation, and feature flag toggles.
Key properties of a well-designed audit log:
- Immutability — audit records must not be deletable or modifiable after writing. Use append-only stores (AWS CloudTrail, Kafka with infinite retention, a WORM-locked S3 bucket).
- Attribution — every entry must record the identity of the actor (user ID, service principal, IP address) and the target resource.
- Tamper detection — hash chaining or cryptographic signing of records allows detection of modifications.
- Separation from application logs — audit logs should flow through a separate pipeline with stricter retention policies and access controls.
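A minimal sketch of a hash-chained audit record illustrating attribution and tamper detection (field names and the chaining scheme are illustrative, not any specific product's format):
// Each record's hash covers its own fields plus the previous record's hash, so
// modifying any historical entry breaks every hash that follows it.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.time.Instant;
import java.util.HexFormat;

record AuditRecord(String actor, String action, String resource,
                   Instant timestamp, String previousHash, String recordHash) {

    static AuditRecord create(String actor, String action, String resource,
                              String previousHash) throws Exception {
        Instant now = Instant.now();
        String payload = actor + "|" + action + "|" + resource + "|" + now + "|" + previousHash;
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(payload.getBytes(StandardCharsets.UTF_8));
        return new AuditRecord(actor, action, resource, now,
                previousHash, HexFormat.of().formatHex(digest));
    }
}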
Distributed Tracing reconstructs the end-to-end path of a single request as it flows across multiple microservices, providing a flamegraph of timings that reveals where latency accumulates and where failures occur. Without it, correlating logs from 10 services for a single slow request requires manual, error-prone cross-referencing.
Key concepts:
- Trace — the complete journey of a request, identified by a globally unique traceId. All spans belonging to the same originating request share this ID.
- Span — a named, timed unit of work within a single service (e.g., "HTTP GET /orders/42", "DB SELECT orders", "Kafka publish OrderPlaced"). A span records start time, duration, service name, tags (metadata), and the parentSpanId linking it to its caller.
- Context propagation — the tracing library injects the trace context into outbound call headers; the receiving service extracts it and creates a child span with the correct traceId and parentSpanId.
# W3C Trace Context header (standard across OpenTelemetry, Jaeger, Zipkin)
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
# format: version - traceId (128-bit hex) - spanId (64-bit) - flags
# Outgoing HTTP request from Service A to Service B includes this header.
# Service B extracts traceId + parentSpanId, creates its own child span.
# All spans are sent asynchronously to Jaeger / Zipkin / AWS X-Ray.
Two propagation standards are in common use:
- W3C Trace Context (traceparent + tracestate headers) — the W3C standard, supported natively by OpenTelemetry.
- Zipkin B3 headers (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled) — older but still widely used by Istio, Zipkin, and some Jaeger deployments.
For async messaging, the trace context is injected into message headers (Kafka record headers, AMQP headers) so traces span across broker boundaries.
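A minimal sketch of creating a span with the OpenTelemetry Java API (assumes an already-configured SDK exporting to a backend; span and attribute names are illustrative):
// Create a child span for a unit of work; context is propagated to anything called inside the scope
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

class OrderQuery {
    private static final Tracer tracer = GlobalOpenTelemetry.getTracer("order-service");

    static void loadOrder(long orderId) {
        Span span = tracer.spanBuilder("DB SELECT orders").startSpan();
        try (Scope ignored = span.makeCurrent()) {   // spans created in here become children of this one
            span.setAttribute("orderId", orderId);
            // ... run the query ...
        } finally {
            span.end();                              // records duration; the SDK exports it asynchronously
        }
    }
}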
The Access Token pattern uses short-lived cryptographically signed tokens — most commonly JWTs issued via OAuth 2.0 / OpenID Connect — to authenticate client requests to microservices. The client authenticates once with an Authorization Server (Keycloak, Okta, Cognito) and receives a JWT. Subsequent requests carry this token; any service that holds the corresponding public key can validate it locally without calling the Auth Server on every request.
JWT structure — three Base64URL-encoded segments separated by dots (header.payload.signature):
// Header (algorithm and token type)
{ "alg": "RS256", "typ": "JWT" }
// Payload (claims)
{
"sub": "user-42",
"iss": "https://auth.example.com",
"aud": "order-service",
"exp": 1714003200, // expiry Unix timestamp
"scope": "orders:read orders:write"
}
// Signature: RS256(base64(header) + "." + base64(payload), privateKey)
// HTTP request
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...
Validation at the receiving service (API Gateway or service itself):
- Decode the header to get the signing algorithm and key ID (kid).
- Fetch (or cache) the public key from the Auth Server's JWKS endpoint.
- Verify the signature using the public key.
- Check exp (not expired), iss (trusted issuer), and aud (this service is the intended audience).
- Extract sub and scope claims to enforce authorisation.
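A minimal sketch of these checks using the jjwt library (0.11.x API; the issuer, audience, and key handling are illustrative — in practice the public key is fetched and cached from the JWKS endpoint by kid):
// Signature, expiry, issuer, and audience are all verified in one parse call
import io.jsonwebtoken.Claims;
import io.jsonwebtoken.Jws;
import io.jsonwebtoken.Jwts;
import java.security.PublicKey;

class TokenValidator {
    // Throws a JwtException if the signature is invalid, the token is expired,
    // or the issuer/audience claims do not match the expected values.
    static Claims validate(String token, PublicKey authServerKey) {
        Jws<Claims> jws = Jwts.parserBuilder()
                .setSigningKey(authServerKey)             // public key obtained from the JWKS endpoint
                .requireIssuer("https://auth.example.com")
                .requireAudience("order-service")
                .build()
                .parseClaimsJws(token);
        return jws.getBody();                             // contains sub, scope, exp, ...
    }
}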
Short token lifetimes (5–15 minutes) limit the blast radius if a token is stolen. Refresh tokens (longer-lived, stored securely by the client) are exchanged for new access tokens when the old one expires, without requiring re-authentication.
Mutual TLS (mTLS) extends standard one-way TLS by requiring both sides of a connection to present and verify X.509 certificates. In a microservices context it provides two things simultaneously: an encrypted channel (confidentiality and integrity) and verified service identity (authentication) — without any application-level token or API key. The services prove who they are via their certificates, issued by a trusted internal Certificate Authority.
Standard TLS vs mTLS:
- Standard (one-way) TLS — only the server presents a certificate. The client verifies the server's identity but the server does not verify the client. Used for browser-to-server HTTPS.
- mTLS — both client and server present certificates. Each side verifies the other's certificate against a shared CA. This proves the client is a legitimate service instance, not just any caller that can reach the network.
# Istio PeerAuthentication — enforce STRICT mTLS in a namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: production
spec:
mtls:
mode: STRICT # reject any plaintext or one-way TLS traffic
---
# Istio automatically rotates certificates via its Citadel CA.
# Envoy sidecars handle the TLS handshake transparently.
# Application code sees plain HTTP internally — zero code changes.
In a service mesh (Istio, Linkerd), mTLS is fully transparent to application code: the sidecar proxy handles the TLS handshake using certificates provisioned by the control-plane CA. Certificates are short-lived (e.g., 24 hours) and rotated automatically, eliminating the risk of a compromised long-lived credential.
mTLS is the recommended pattern for east-west (service-to-service) authentication within a cluster. It replaces shared API keys and static secrets with cryptographic identity tied to a specific service workload.
The Secrets Management pattern centralises the storage, access control, rotation, and auditing of sensitive credentials — database passwords, API keys, TLS certificates, encryption keys — in a dedicated secrets store rather than hardcoding them in environment variables, config files, or source code. The goal is to ensure that a compromised container image, log file, or configuration repository cannot expose production credentials.
HashiCorp Vault provides several key capabilities:
- Dynamic secrets — instead of storing a long-lived database password, Vault generates a unique, short-TTL (e.g., 1-hour) database credential on demand for each service instance. When the lease expires, Vault revokes it automatically. If credentials leak, they expire quickly — limiting the blast radius.
- Encryption as a Service — services can ask Vault to encrypt/decrypt data without ever holding the encryption key themselves.
- Leasing and renewal — every secret is issued with a lease. Services renew leases before expiry; Vault revokes them if renewal stops (e.g., after a service crash).
- Audit log — every secret access is logged with the requesting entity, timestamp, and secret path.
AWS Secrets Manager provides:
- Automatic rotation of RDS database credentials on a configurable schedule (Lambda-powered).
- IAM-based access control — only services with the correct IAM role can retrieve a secret.
- Cross-account and cross-region replication for disaster recovery.
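A minimal sketch of retrieving a secret at runtime with the AWS SDK for Java v2 (the secret name is illustrative):
// Fetch a credential from AWS Secrets Manager instead of baking it into config or images
import software.amazon.awssdk.services.secretsmanager.SecretsManagerClient;
import software.amazon.awssdk.services.secretsmanager.model.GetSecretValueRequest;

class DbCredentials {
    static String fetchPassword() {
        try (SecretsManagerClient client = SecretsManagerClient.create()) {   // uses the pod's IAM role
            return client.getSecretValue(GetSecretValueRequest.builder()
                            .secretId("prod/order-service/db-password")       // hypothetical secret name
                            .build())
                    .secretString();                                          // never log or persist this
        }
    }
}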
In Kubernetes, secrets are typically injected at pod creation via a Vault Agent sidecar or the Vault Secrets Operator (CSI provider), making the secret available as an in-memory file or environment variable at runtime — never baked into the container image or stored in etcd in plaintext.
The Sidecar pattern deploys a helper container alongside the main application container in the same pod (Kubernetes) or VM instance. The sidecar shares the same network namespace, localhost address space, and optionally a shared volume with the main container. It handles cross-cutting concerns so the main application stays free of infrastructure boilerplate.
# Kubernetes Pod with a Fluentd log-shipper sidecar
apiVersion: v1
kind: Pod
metadata:
name: order-service
spec:
containers:
- name: order-service # main application
image: myregistry/order-service:2.1
volumeMounts:
- name: logs
mountPath: /var/log/app
- name: log-shipper # sidecar
image: fluent/fluentd:v1.16
volumeMounts:
- name: logs
mountPath: /var/log/app # reads same log directory
env:
- name: FLUENTD_CONF
value: fluent.conf
volumes:
- name: logs
emptyDir: {}
Common sidecar responsibilities:
- Log shipping — tail application log files and forward to Elasticsearch or a log aggregation pipeline (as in the example above).
- Metrics collection — scrape or poll the application's metrics and expose them in Prometheus format, or push to StatsD.
- Service proxy — Envoy/Linkerd-proxy sidecars intercept all inbound and outbound traffic, handling mTLS, retries, circuit breaking, and tracing without code changes in the main app. (This is the Service Mesh data plane.)
- Configuration reload — watch a ConfigMap or Vault path and write updated configuration to a shared volume that the main app reads without restarting.
- Secret rotation — fetch short-lived secrets from Vault and refresh them in a shared in-memory file before they expire.
The key architectural property: the main application is unaware of its sidecar. It reads log files or environment variables as normal; it makes outbound HTTP calls normally. The sidecar intercepts or supplements transparently. This allows infrastructure capabilities to be upgraded or replaced independently of the application.
The Ambassador pattern is a specialisation of the Sidecar pattern focused on outbound (egress) connections. The ambassador container acts as a local proxy for all traffic the main container sends to external services. Instead of the main application connecting directly to downstream services (with all the attendant concerns of retry logic, circuit breaking, timeouts, and connection pooling), it connects to localhost:<port> on the ambassador, which handles all of that transparently.
# Pod with an Envoy Ambassador for outbound calls
apiVersion: v1
kind: Pod
spec:
containers:
- name: order-service
image: myregistry/order-service:2.1
env:
- name: INVENTORY_URL
value: http://localhost:9901/inventory # ambassador port
- name: envoy-ambassador
image: envoyproxy/envoy:v1.29
args: ["-c", "/etc/envoy/envoy.yaml"]
volumeMounts:
- name: envoy-config
mountPath: /etc/envoy
# envoy.yaml configures upstream cluster, retries, circuit breaker, TLS
Ambassador responsibilities:
- Retry and circuit breaking — the ambassador retries transient failures with exponential backoff; opens the circuit when the upstream is unhealthy.
- Connection pooling — maintains a warm pool of HTTP/2 or gRPC connections to upstream services, avoiding per-request TCP handshake overhead.
- Protocol translation — a legacy service that only speaks HTTP/1.1 can be transparently proxied to an upstream that expects HTTP/2 or gRPC.
- mTLS to upstream — the ambassador terminates plaintext from the main app and re-establishes mTLS to the upstream, so the main app does not need TLS libraries.
- Telemetry — emits distributed trace spans and latency metrics for every outbound call.
The main application's code is simplified to a plain HTTP call to localhost. All network resilience logic lives in the ambassador configuration and can be changed without redeploying the application.
The Adapter pattern (container / structural variant) places a sidecar container alongside the main container to normalise the main container's output into a standard interface that the surrounding infrastructure expects — without modifying the main application. It is essentially a structural translator that makes a non-conforming service look conforming to monitoring, logging, or management infrastructure.
Concrete examples:
- Legacy log format to structured JSON — a legacy Java service writes logs in a custom text format. An Adapter sidecar reads the log file, parses the custom format, and re-emits structured JSON to stdout so the standard Fluentd pipeline can process it identically to every other service.
- Non-standard metrics to Prometheus format — a third-party binary exposes metrics on a proprietary UDP endpoint. An Adapter sidecar reads those metrics and exposes them at /metrics in Prometheus exposition format, making the service scrapeable by Prometheus without any changes to the binary.
- Legacy health endpoint normalisation — a vendor application returns health status in a non-standard format. The Adapter translates it to the standard { "status": "UP" } JSON that Kubernetes probes expect (a minimal sketch follows below).
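A minimal sketch of that health-endpoint adapter, using only the JDK's built-in HTTP server and client (ports, paths, and the legacy response format are illustrative):
// Adapter sidecar: probes hit :8081/healthz; the legacy app's own status endpoint is left untouched
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class HealthAdapter {
    public static void main(String[] args) throws Exception {
        HttpClient legacyClient = HttpClient.newHttpClient();
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        server.createContext("/healthz", exchange -> {
            // Ask the legacy app (same pod, so localhost) and translate its answer.
            String legacy = legacyClient.sendAsync(
                            HttpRequest.newBuilder(URI.create("http://localhost:8080/legacy-status")).build(),
                            HttpResponse.BodyHandlers.ofString())
                    .join().body();
            byte[] body = (legacy.contains("OK") ? "{\"status\":\"UP\"}" : "{\"status\":\"DOWN\"}")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
    }
}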
Adapter vs Ambassador: Both are sidecar specialisations. The Ambassador manages outbound connectivity (egress traffic, retries, circuit breaking). The Adapter manages output format normalisation (translating the service's emitted data — logs, metrics, health — into a standard interface). An Ambassador speaks on behalf of the app to the outside world; an Adapter speaks on behalf of the app to the infrastructure tooling.
The Canary Deployment pattern releases a new version of a service to a small percentage of production traffic first, monitors it closely for errors, latency regressions, and business metric anomalies, then gradually increases its traffic share until it serves 100% — at which point the old version is decommissioned. The name comes from the "canary in a coal mine" practice of using a small probe to detect danger before full exposure.
Blue-Green Deployment maintains two complete, identical production environments — Blue (current live) and Green (new version). Traffic is switched from Blue to Green all at once (or very rapidly). If Green has a problem, rollback is instant: switch traffic back to Blue.
| Aspect | Canary | Blue-Green |
|---|---|---|
| Traffic shift | Gradual (1% → 10% → 50% → 100%) | All-at-once switch |
| User exposure to new version | Small initial subset of users | All users at switch time |
| Infrastructure cost | Only canary instances needed alongside full production | Two full production environments run simultaneously |
| Rollback speed | Seconds (redirect traffic back) | Seconds (flip the switch) |
| Best for | Validating new features on real traffic before full exposure | Pre-validated releases where instant cutover and instant rollback are required |
# Istio VirtualService — 5% canary traffic split
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata: { name: order-service }
spec:
hosts: [order-service]
http:
- route:
- destination: { host: order-service, subset: v1 }
weight: 95
- destination: { host: order-service, subset: v2-canary }
weight: 5
Canary deployments require automated monitoring with clear rollback triggers: if canary error rate or P99 latency exceeds a threshold within the observation window, traffic is automatically redirected back to the stable version.
The Service Registry is a database of network locations (host + port) for all running service instances, kept up-to-date by registration (at startup) and deregistration (at shutdown or failure). Service Discovery is the mechanism by which a service caller looks up the current location of a dependency at runtime, replacing hardcoded hostnames with dynamic lookups.
Two styles of service discovery:
Client-side discovery — the calling service queries the registry directly, receives a list of healthy instances, and performs its own load balancing (round-robin, random, least-connections).
// Spring Cloud / Netflix Eureka client-side discovery
@Bean
@LoadBalanced // Ribbon / Spring Cloud LoadBalancer resolves logical names via Eureka
RestTemplate restTemplate() { return new RestTemplate(); }

// Call by logical service name — resolved to a healthy instance's real IP:port
String result = restTemplate.getForObject(
    "http://inventory-service/api/stock/SKU-99", String.class);
Server-side discovery — the caller sends the request to a load balancer (AWS ALB, HAProxy, Nginx, Kubernetes Service). The load balancer queries the registry and forwards to a healthy instance. The caller needs no registry SDK.
| Aspect | Client-side | Server-side |
|---|---|---|
| Who queries the registry | The calling service (via SDK) | The load balancer |
| Client SDK dependency | Required (Eureka client, Ribbon) | Not required — any HTTP client works |
| Language support | Needs SDK for each language | Works for any language / protocol |
| Examples | Netflix Eureka + Ribbon, Consul + Fabio | Kubernetes Service + kube-proxy, AWS ALB + ECS |
Kubernetes uses server-side discovery natively: a Service resource provides a stable DNS name and VIP; kube-proxy (in iptables or IPVS mode) routes traffic to healthy pods via Endpoints, which are kept current by the Endpoints controller watching pod readiness probes.
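From the caller's point of view, server-side discovery in Kubernetes is just a call to the Service's stable DNS name — a minimal sketch (service name, namespace, and path are illustrative):
// No registry SDK needed: kube-proxy picks a healthy pod behind the Service VIP
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class InventoryLookup {
    private static final HttpClient http = HttpClient.newHttpClient();

    static String stockFor(String sku) {
        return http.sendAsync(
                        HttpRequest.newBuilder(URI.create(
                                "http://inventory-service.production.svc.cluster.local/api/stock/" + sku)).build(),
                        HttpResponse.BodyHandlers.ofString())
                .join()
                .body();
    }
}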
These two patterns describe how service instances get their network location recorded in (and removed from) the Service Registry — not how callers use it.
Self Registration — the service instance itself registers with the registry on startup and deregisters on orderly shutdown. It is also responsible for sending heartbeats so the registry can detect failed instances and remove stale entries.
- Example: A Spring Boot service with the Netflix Eureka client calls eurekaClient.register() at startup and eurekaClient.deregister() in a shutdown hook.
- Drawback: every service must import and configure the registry client library. The service is now coupled to the specific registry technology. If the registry changes, all services need updating.
Third-Party Registration — an external Registrar component monitors service instances (via the platform's event stream) and registers/deregisters them on their behalf. The service itself has zero registry awareness.
- Example (Kubernetes): The Endpoints controller watches pod events. When a pod passes its readiness probe, the controller adds its IP to the Endpoints resource for the corresponding Service. When the pod fails its probe or is deleted, the controller removes it. The pod never calls the registry directly.
- Example (Consul + Docker): A Registrator daemon on each host listens for Docker start/stop events and updates Consul accordingly.
| Aspect | Self Registration | Third-Party Registration |
|---|---|---|
| Coupling | Service coupled to registry client library | Service has no registry dependency |
| Complexity | Simpler — no extra component | Requires a Registrar / controller process |
| Language support | Needs SDK per language | Language-agnostic |
| Used by | Netflix Eureka, Consul client mode | Kubernetes, Consul + Registrator |
