Golang / GoLang System Architecture and Testing Interview Questions

1. Compare REST/JSON with gRPC/Protocol Buffers. When would you choose gRPC for a Go microservice?
2. How do you implement a gRPC server in Go, including error handling and interceptors?
3. How do you build a production-ready gRPC client in Go with connection reuse and resilience?
4. What microservice design patterns are most important to understand for Go interviews?
5. How do you implement distributed tracing and observability in a Go microservice system?
6. How do you implement event-driven communication between Go microservices using message queues?
7. How do you manage database connections and sharding in a high-scale Go service?
8. What caching strategies do you use in Go microservices and how do you prevent cache stampede?
9. What are table-driven tests in Go and why are they the standard testing pattern?
10. How do you write Go benchmarks and what does -benchmem tell you?
11. How do you find and fix memory allocation hotspots in a Go service using profiling?
12. How do you structure integration tests in Go that require real databases or external services?
13. Explain the difference between mocks, stubs, and fakes in Go testing. When do you use each?
14. How does Go's built-in fuzzing work and when should you use property-based testing?
15. How do you test concurrent Go code correctly, including data races and timing issues?
16. How do you decide where to draw service boundaries when decomposing a Go monolith into microservices?
17. How do you version gRPC APIs in Go without breaking existing clients?
18. How do you write unit and integration tests for gRPC services in Go?
19. How do you load test a Go microservice and interpret the results?
20. How does service discovery and client-side load balancing work in a Go microservice system?
21. How do you design a consistent error model across multiple Go microservices?
22. How do you implement the Saga pattern for distributed transactions in Go?
23. What testing.T methods do experienced Go engineers use to write cleaner tests?
24. How do you benchmark concurrent code with testing.B and what insights does it provide?
25. How do you manage dependency injection at scale in a large Go service: wire, dig, or manual?
26. How do you achieve zero-downtime deployments for a Go microservice in Kubernetes?
27. How do generics in Go 1.18+ enable better system design and what are the trade-offs?
28. How do you use test coverage meaningfully in Go, beyond just a percentage?
29. What are the best practices for designing Protocol Buffer schemas in Go microservices?
30. How do you implement safe retries in Go microservices?
31. What are golden file tests in Go and when should you use them?
32. How do you ensure data consistency across Go microservices without distributed transactions?
33. What is the API Gateway pattern and how does it complement Go microservices?
34. What memory leak patterns in Go are not goroutine leaks and how do you detect them?
35. How do CQRS and event sourcing apply to Go microservice architecture?
36. What is chaos engineering and how do Go teams apply it to test microservice resilience?
37. What is contract testing and how does it apply to Go microservices?
38. What makes a Go microservice horizontally scalable and what patterns break scaling?
39. How do you implement configuration hot-reloading in a Go service without restart?
40. How do you architect Go services for maximum testability at the package level?
41. How do you implement feature flags and canary deployments in a Go microservice?
42. How do you design a multi-tenant Go microservice?
43. What is mutation testing and how does it evaluate test suite quality beyond coverage?
44. How do you manage the full lifecycle of a Go microservice from startup to shutdown?
45. How do you test Go code that processes streaming data or works with channels?
46. Summarise the key principles for designing scalable Go microservices that senior engineers demonstrate.


1. Compare REST/JSON with gRPC/Protocol Buffers. When would you choose gRPC for a Go microservice?

REST and gRPC both let one service call another over the network, but they make very different trade-offs. Understanding these trade-offs is central to microservice architecture decisions.

REST/JSON vs gRPC/Protobuf
Aspect	REST / JSON	gRPC / Protobuf
Protocol	HTTP/1.1 or HTTP/2	HTTP/2 (mandatory)
Serialisation	JSON: text, field names repeated in every message, human-readable	Protobuf: binary, 1-2 byte field tags and varint integers; payloads are often several times smaller
Schema	Optional (OpenAPI), not enforced at compile time	Required (.proto file), compile-time checked
Code generation	Optional	Required (protoc + language plugins)
Browser support	Native	Needs grpc-web proxy layer
Streaming	Workarounds: SSE, WebSockets	Built-in: unary, server, client, bidirectional
Latency	Higher (text parsing overhead)	Lower (binary + HTTP/2 multiplexing)
Best for	Public APIs, browser clients, simple integrations	Internal service-to-service, high-throughput, typed contracts

// user.proto
syntax = "proto3";
package user;

service UserService {
    rpc GetUser(GetUserRequest) returns (User);           // unary
    rpc StreamUsers(Empty) returns (stream User);         // server streaming
    rpc BatchCreate(stream CreateUserRequest)             // client streaming
        returns (BatchCreateResponse);
    rpc Chat(stream Message) returns (stream Message);    // bidirectional
}

message GetUserRequest { int64 id = 1; }
message User {
    int64  id    = 1;
    string name  = 2;
    string email = 3;
}

// Generate Go code:
// protoc --go_out=. --go-grpc_out=. user.proto

Why Go excels at gRPC: Go's goroutine model maps naturally to gRPC's concurrent streaming model. grpc-go runs each RPC handler in its own goroutine, and the Go runtime schedules thousands of concurrent streams efficiently. The generated Protobuf code is plain Go structs and interfaces that integrate with Go's type system. Every handler receives a context.Context, so deadlines and cancellation propagate the same way they do in net/http handlers.

Decision guide: choose gRPC for internal service-to-service calls where typed contracts, performance, and streaming matter. Use REST/JSON for public APIs consumed by browsers or third parties where human readability and broad tooling matter.

Quiz

Q: What serialisation format does gRPC use and what is its primary advantage over JSON?
a) XML: it is more structured than JSON
b) Protocol Buffers (binary): smaller payloads (often several times smaller) and faster serialisation/deserialisation
c) MessagePack: it is a compressed form of JSON
d) CBOR: it offers better unicode support
Answer: b

Q: Which gRPC streaming mode allows both the client and server to send multiple messages on a single connection?
a) Server streaming
b) Client streaming
c) Bidirectional streaming
d) Unary RPC
Answer: c

2. How do you implement a gRPC server in Go, including error handling and interceptors?

Implementing a gRPC server follows a code-generation-first workflow: define the proto, generate Go stubs, implement the interface, and start the server. Interceptors (gRPC's equivalent of HTTP middleware) add cross-cutting concerns like logging and auth.

// Step 1: implement the generated server interface
type userServiceServer struct {
    pb.UnimplementedUserServiceServer // embed for forward compatibility
    repo UserRepository
    log  *slog.Logger
}

func (s *userServiceServer) GetUser(
    ctx context.Context,
    req *pb.GetUserRequest,
) (*pb.User, error) {
    if req.Id <= 0 {
        return nil, status.Errorf(codes.InvalidArgument,
            "id must be positive, got %d", req.Id)
    }
    user, err := s.repo.FindByID(ctx, int(req.Id))
    if err != nil {
        if errors.Is(err, ErrNotFound) {
            return nil, status.Errorf(codes.NotFound,
                "user %d not found", req.Id)
        }
        s.log.Error("repo error", "err", err)
        return nil, status.Errorf(codes.Internal, "internal error")
    }
    return &pb.User{Id: int64(user.ID), Name: user.Name, Email: user.Email}, nil
}

// Interceptor (middleware equivalent)
func loggingInterceptor(log *slog.Logger) grpc.UnaryServerInterceptor {
    return func(
        ctx context.Context,
        req any,
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (any, error) {
        start := time.Now()
        resp, err := handler(ctx, req)
        log.Info("RPC",
            "method", info.FullMethod,
            "duration", time.Since(start),
            "error", err,
        )
        return resp, err
    }
}

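// Step 2: start the server
// recoveryInterceptor, authInterceptor, secret and repo are defined elsewhere (not shown).
func main() {
    log := slog.New(slog.NewJSONHandler(os.Stdout, nil))

    lis, err := net.Listen("tcp", ":9090")
    if err != nil {
        log.Error("listen failed", "err", err)
        os.Exit(1)
    }
    srv := grpc.NewServer(
        grpc.ChainUnaryInterceptor(
            loggingInterceptor(log),
            recoveryInterceptor(),
            authInterceptor(secret),
        ),
    )
    // Set log as well as repo: GetUser calls s.log.Error on repository failures.
    pb.RegisterUserServiceServer(srv, &userServiceServer{repo: repo, log: log})
    reflection.Register(srv) // enables grpcurl inspection
    if err := srv.Serve(lis); err != nil {
        log.Error("serve failed", "err", err)
        os.Exit(1)
    }
}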

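gRPC status codes: always return structured status errors using status.Errorf(codes.X, ...). Clients receive the code and message and can handle them programmatically. On the wire, gRPC carries the code in the grpc-status trailer rather than in the HTTP status line. When a gateway translates the call to REST, the mapping to HTTP status codes is standardised (NotFound→404, InvalidArgument→400, Internal→500).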
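
The code-to-HTTP mapping mentioned above can be sketched as a small lookup. This is a dependency-free illustration: httpStatusFromCode is a hypothetical helper that takes the canonical code names as strings instead of the codes package, and the values follow the mapping documented for google.rpc.Code, which REST gateways typically apply.

```go
package main

import (
	"fmt"
	"net/http"
)

// httpStatusFromCode maps a canonical gRPC status code name to the HTTP
// status a REST gateway would typically return. Unrecognised names map to 500.
// Hypothetical helper for illustration; not part of grpc-go.
func httpStatusFromCode(code string) int {
	switch code {
	case "OK":
		return http.StatusOK
	case "INVALID_ARGUMENT", "FAILED_PRECONDITION", "OUT_OF_RANGE":
		return http.StatusBadRequest
	case "UNAUTHENTICATED":
		return http.StatusUnauthorized
	case "PERMISSION_DENIED":
		return http.StatusForbidden
	case "NOT_FOUND":
		return http.StatusNotFound
	case "ALREADY_EXISTS", "ABORTED":
		return http.StatusConflict
	case "RESOURCE_EXHAUSTED":
		return http.StatusTooManyRequests
	case "CANCELLED":
		return 499 // "client closed request"; non-standard, so net/http has no constant
	case "UNIMPLEMENTED":
		return http.StatusNotImplemented
	case "UNAVAILABLE":
		return http.StatusServiceUnavailable
	case "DEADLINE_EXCEEDED":
		return http.StatusGatewayTimeout
	default: // UNKNOWN, INTERNAL, DATA_LOSS and anything unrecognised
		return http.StatusInternalServerError
	}
}

func main() {
	for _, c := range []string{"NOT_FOUND", "INVALID_ARGUMENT", "INTERNAL", "UNAVAILABLE"} {
		fmt.Printf("%s -> %d\n", c, httpStatusFromCode(c))
	}
}
```

Keeping the mapping in one place means every service behind the gateway reports the same HTTP status for the same failure class.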

Quiz

Q: What does embedding 'pb.UnimplementedUserServiceServer' in your gRPC server struct provide?
a) It provides production-ready default implementations of all RPC methods
b) It provides stub implementations returning codes.Unimplemented; adding new RPCs to the proto later doesn't break existing server code
c) It automatically generates the .proto file from Go structs
d) It enables reflection so grpcurl can discover the service
Answer: b

Q: Which gRPC status code should you return for a missing resource?
a) codes.NotFound
b) codes.InvalidArgument
c) codes.Unavailable
d) codes.PermissionDenied
Answer: a

3. How do you build a production-ready gRPC client in Go with connection reuse and resilience?

A gRPC client wraps a ClientConn which manages a pool of HTTP/2 connections. Unlike HTTP/1.1, a single gRPC connection multiplexes many concurrent RPCs — connection reuse is critical.

// Production gRPC client setup
func newUserClient(addr string) (pb.UserServiceClient, func(), error) {
    conn, err := grpc.NewClient(addr,
        // TLS in production
        grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{})),

        // Keep-alive: detect dead connections
        grpc.WithKeepaliveParams(keepalive.ClientParameters{
            Time:                10 * time.Second,
            Timeout:             5 * time.Second,
            PermitWithoutStream: true,
        }),

        // Retry policy (built-in retry)
        grpc.WithDefaultServiceConfig(`{
            "methodConfig": [{
                "name": [{"service": "user.UserService"}],
                "retryPolicy": {
                    "maxAttempts": 3,
                    "initialBackoff": "0.1s",
                    "maxBackoff": "1s",
                    "backoffMultiplier": 2,
                    "retryableStatusCodes": ["UNAVAILABLE"]
                }
            }]
        }`),
    )
    if err != nil { return nil, nil, err }

    cleanup := func() { conn.Close() }
    return pb.NewUserServiceClient(conn), cleanup, nil
}

// Using the client: always pass a context with a deadline
func fetchUser(ctx context.Context, client pb.UserServiceClient, id int64) (*pb.User, error) {
    ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
    defer cancel()

    return client.GetUser(ctx, &pb.GetUserRequest{Id: id})
}

// Client interceptors (middleware for outgoing calls)
conn, _ := grpc.NewClient(addr,
    grpc.WithTransportCredentials(insecure.NewCredentials()),
    grpc.WithChainUnaryInterceptor(
        metadataInterceptor, // attach trace ID to outgoing metadata
        retryInterceptor,
    ),
)

Connection sharing: share one ClientConn per target service across the entire application. HTTP/2 multiplexing means thousands of concurrent RPCs share a single TCP connection — creating a new conn per RPC defeats the purpose and wastes resources.

Quiz

Q: Why should a gRPC ClientConn be shared across the application rather than creating one per RPC call?
a) Creating connections is not thread-safe in Go
b) HTTP/2 multiplexes many concurrent RPCs on one TCP connection; creating per-RPC connections wastes resources and loses the multiplexing benefit
c) gRPC clients have a fixed limit of one connection per address
d) The gRPC library caches connections automatically regardless
Answer: b

Q: What does the gRPC keepalive parameter 'PermitWithoutStream: true' enable?
a) Allows RPCs without authentication
b) Sends keepalive pings even when there are no active RPCs, which detects dead connections during idle periods
c) Permits streaming RPCs on connections without TLS
d) Allows the server to close connections without notifying the client
Answer: b

4. What microservice design patterns are most important to understand for Go interviews?

Senior Go interviews probe whether you can design systems that are resilient, observable, and maintainable at scale. These patterns recur across every production Go microservice.

Core Microservice Patterns
Pattern	Problem Solved	Go Implementation
Circuit Breaker	Prevent cascade failures when a downstream is slow/dead	sony/gobreaker or custom state machine
Bulkhead	Isolate failures — one slow dependency shouldn't affect others	Separate goroutine pools / semaphores per dependency
Retry + Backoff	Transient failures in distributed systems	grpc RetryPolicy or manual exponential backoff
Saga / Outbox	Distributed transactions without 2PC	Event sourcing + idempotent consumers
Sidecar	Cross-cutting concerns without modifying service	Envoy/Linkerd proxies, Dapr
Health Check Aggregator	K8s readiness = all dependencies healthy	Custom /readyz checking DB, cache, downstream
Strangler Fig	Gradual migration from monolith	Route by feature flag; run old+new in parallel

// Circuit breaker with sony/gobreaker
import "github.com/sony/gobreaker"

type UserClient struct {
    grpc pb.UserServiceClient
    cb   *gobreaker.CircuitBreaker
}

func NewUserClient(grpc pb.UserServiceClient) *UserClient {
    cb := gobreaker.NewCircuitBreaker(gobreaker.Settings{
        Name:        "user-service",
        MaxRequests: 3,                 // requests in half-open state
        Interval:    10 * time.Second,  // reset window for counting
        Timeout:     30 * time.Second,  // how long to stay open
        ReadyToTrip: func(c gobreaker.Counts) bool {
            return c.ConsecutiveFailures >= 5
        },
    })
    return &UserClient{grpc: grpc, cb: cb}
}

func (c *UserClient) GetUser(ctx context.Context, id int64) (*pb.User, error) {
    result, err := c.cb.Execute(func() (any, error) {
        return c.grpc.GetUser(ctx, &pb.GetUserRequest{Id: id})
    })
    if err != nil {
        if errors.Is(err, gobreaker.ErrOpenState) {
            return nil, fmt.Errorf("user service unavailable (circuit open): %w", err)
        }
        return nil, err
    }
    return result.(*pb.User), nil
}

Quiz

Q: What is the purpose of a circuit breaker in a microservice architecture?
a) To encrypt traffic between services
b) To detect repeated failures and stop sending requests to a failing service, allowing it to recover without being overwhelmed
c) To limit the rate of requests to prevent DDoS attacks
d) To route requests to the closest available service instance
Answer: b

Q: What problem does the Bulkhead pattern solve?
a) It prevents unauthorised access between services
b) It isolates resource pools so a slow or failing dependency cannot exhaust resources needed by other dependencies
c) It ensures exactly-once delivery of messages
d) It provides a single entry point for external traffic
Answer: b

5. How do you implement distributed tracing and observability in a Go microservice system?

Observability in distributed systems requires three pillars: metrics (what is happening?), logs (what happened?), and traces (why is a specific request slow?). OpenTelemetry is the standard SDK for Go.

// OpenTelemetry setup
import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    otelcodes "go.opentelemetry.io/otel/codes" // aliased: the name clashes with grpc/codes
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.26.0" // use the version that matches your SDK
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

func initTracer(ctx context.Context) (func(), error) {
    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithEndpoint("otel-collector:4317"),
        otlptracegrpc.WithInsecure(),
    )
    if err != nil { return nil, err }

    tp := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceNameKey.String("user-service"),
            semconv.ServiceVersionKey.String("1.0.0"),
        )),
        trace.WithSampler(trace.TraceIDRatioBased(0.1)), // sample 10%
    )
    otel.SetTracerProvider(tp)
    // Shutdown flushes buffered spans; call the returned func before exit.
    return func() { _ = tp.Shutdown(context.Background()) }, nil
}

// Instrument a function
var tracer = otel.Tracer("user-service")

func (s *userServiceServer) GetUser(
    ctx context.Context, req *pb.GetUserRequest,
) (*pb.User, error) {
    ctx, span := tracer.Start(ctx, "UserService.GetUser")
    defer span.End()

    span.SetAttributes(
        attribute.Int64("user.id", req.Id),
    )

    user, err := s.repo.FindByID(ctx, int(req.Id))
    if err != nil {
        span.RecordError(err)
        span.SetStatus(otelcodes.Error, err.Error()) // span status (OpenTelemetry)
        return nil, status.Errorf(codes.Internal, "internal error") // RPC status (gRPC)
    }
    return toProto(user), nil
}

// Propagate trace context in gRPC metadata
// Use otelgrpc interceptors to do this automatically:
grpc.NewServer(
    grpc.StatsHandler(otelgrpc.NewServerHandler()),
)

Correlation IDs: every request entering the system gets a trace ID propagated through gRPC metadata, HTTP headers, and message queue headers. This enables you to see the full call tree of a single request across 10 services in a tool like Jaeger or Tempo.

Quiz

Q: What are the three pillars of observability in a distributed system?
a) Metrics, traces, and alerts
b) Metrics (what is happening), logs (what happened), and traces (how did a specific request flow through the system)
c) CPU usage, memory usage, and network throughput
d) Latency, throughput, and error rate
Answer: b

Q: What does span.RecordError(err) do in OpenTelemetry?
a) It stops the current span and creates a new error span
b) It records the error on the current span so it appears in trace visualisation tools and can trigger alerting
c) It automatically retries the failed operation
d) It converts the Go error to an OTel status code
Answer: b

6. How do you implement event-driven communication between Go microservices using message queues?

Synchronous RPC (REST/gRPC) creates tight coupling — if service B is down, service A fails. Message queues (Kafka, NATS, RabbitMQ) decouple producers from consumers: A publishes an event and continues; B processes it when ready. This improves resilience and enables fan-out.

// NATS JetStream producer
import "github.com/nats-io/nats.go"

type EventPublisher struct {
    js nats.JetStreamContext
}

type UserCreatedEvent struct {
    UserID    int       `json:"user_id"`
    Email     string    `json:"email"`
    CreatedAt time.Time `json:"created_at"`
}

func (p *EventPublisher) PublishUserCreated(ctx context.Context, e UserCreatedEvent) error {
    data, err := json.Marshal(e)
    if err != nil {
        return fmt.Errorf("marshal event: %w", err)
    }
    msg := &nats.Msg{
        Subject: "users.created",
        Data:    data,
        Header:  make(nats.Header),
    }
    // Propagate trace context in headers
    otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(msg.Header))

    _, err = p.js.PublishMsg(msg)
    return err
}

// Consumer with idempotency
type EmailConsumer struct {
    email  EmailSender
    dedup  *DeduplicationCache // prevent double-sending
}

func (c *EmailConsumer) HandleUserCreated(msg *nats.Msg) {
    var event UserCreatedEvent
    if err := json.Unmarshal(msg.Data, &event); err != nil {
        msg.Term() // malformed payload will never parse: stop redelivery
        return
    }

    // Idempotency check: skip events that were already handled
    key := fmt.Sprintf("email:welcome:%d", event.UserID)
    if c.dedup.Has(key) {
        msg.Ack() // already processed: ack without re-sending
        return
    }

    if err := c.email.SendWelcome(event.Email); err != nil {
        msg.Nak() // transient failure: ask the broker to redeliver
        return
    }
    c.dedup.Set(key)
    msg.Ack()
}

Effectively-once processing: message queues provide at-least-once delivery (messages may be redelivered on failure), so exactly-once delivery cannot be assumed. Consumers must be idempotent: processing the same message twice produces the same result. Use a deduplication store (Redis SET NX with a TTL) keyed on the event ID.

Quiz

Q: What is the key advantage of event-driven (message queue) communication over synchronous gRPC for microservices?
a) Message queues are always faster than gRPC
b) It decouples producer and consumer: the producer continues if the consumer is down, and multiple consumers can independently process the same event
c) Message queues provide stronger type safety than gRPC
d) Events are automatically encrypted by the message broker
Answer: b

Q: Why must event consumers be idempotent in an at-least-once message queue?
a) The broker guarantees message ordering only for idempotent consumers
b) Messages may be redelivered on consumer failure; processing the same event twice must produce the same result to prevent double-sending emails, double-charging, etc.
c) Idempotency is required by the AMQP protocol
d) Non-idempotent consumers cause message queue deadlock
Answer: b

7. How do you manage database connections and sharding in a high-scale Go service?

At scale, a single database becomes a bottleneck. Go services address this through connection pool tuning, read replicas, and horizontal sharding. The database/sql pool must be sized carefully — too few connections cause queuing, too many overwhelm the DB.

// Connection pool configuration
func openDB(dsn string) (*sql.DB, error) {
    db, err := sql.Open("postgres", dsn)
    if err != nil { return nil, err }

    // Tune the pool
    db.SetMaxOpenConns(25)                  // max concurrent connections
    db.SetMaxIdleConns(25)                  // keep idle connections warm
    db.SetConnMaxLifetime(5 * time.Minute)  // close and reopen periodically
    db.SetConnMaxIdleTime(1 * time.Minute)  // close long-idle connections

    return db, db.PingContext(context.Background())
}

// Read replica routing
type DBPool struct {
    primary  *sql.DB
    replicas []*sql.DB
    rr       uint64 // round-robin counter
}

func (p *DBPool) ReadDB() *sql.DB {
    if len(p.replicas) == 0 { return p.primary }
    idx := atomic.AddUint64(&p.rr, 1) % uint64(len(p.replicas))
    return p.replicas[idx]
}

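// Hash-based sharding (user ID → shard)
type ShardedDB struct {
    shards []*sql.DB
}

// fnv32 hashes the decimal form of the ID with FNV-1a (package hash/fnv).
func fnv32(id int64) uint32 {
    h := fnv.New32a()
    h.Write([]byte(strconv.FormatInt(id, 10)))
    return h.Sum32()
}

func (s *ShardedDB) shardFor(userID int64) *sql.DB {
    // Modulo hashing: hash(userID) mod N shards. This is simple but NOT
    // consistent hashing: changing N remaps most keys.
    h := fnv32(userID) % uint32(len(s.shards))
    return s.shards[h]
}

func (s *ShardedDB) GetUser(ctx context.Context, id int64) (*User, error) {
    db := s.shardFor(id)
    row := db.QueryRowContext(ctx,
        "SELECT id, name FROM users WHERE id = $1", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name); err != nil {
        return nil, err // includes sql.ErrNoRows when the user does not exist
    }
    return &u, nil
}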
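
Taking hash(userID) mod N remaps most keys whenever N changes. A consistent-hash ring avoids that, and a minimal version can be sketched as follows. The Ring type, its methods, and the shard names are hypothetical; each shard is placed on the ring at several virtual-node positions, and a key belongs to the first position clockwise from its hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// hash32 returns the FNV-1a hash of s.
func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// Ring is a consistent-hash ring with virtual nodes.
type Ring struct {
	points []uint32          // sorted virtual-node positions
	owner  map[uint32]string // position -> shard name
}

// NewRing builds a ring with vnodes positions per shard.
func NewRing(vnodes int, shards ...string) *Ring {
	r := &Ring{owner: make(map[uint32]string)}
	for _, s := range shards {
		r.Add(s, vnodes)
	}
	return r
}

// Add places a shard on the ring at vnodes pseudo-random positions.
func (r *Ring) Add(shard string, vnodes int) {
	for i := 0; i < vnodes; i++ {
		p := hash32(shard + "#" + strconv.Itoa(i))
		if _, taken := r.owner[p]; taken {
			continue // rare hash collision: keep the existing owner
		}
		r.owner[p] = shard
		r.points = append(r.points, p)
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
}

// ShardFor returns the shard owning the first position at or after the
// key's hash, wrapping around to the start of the ring.
func (r *Ring) ShardFor(userID int64) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hash32(strconv.FormatInt(userID, 10))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

func main() {
	ring := NewRing(100, "shard-0", "shard-1", "shard-2")
	before := make(map[int64]string)
	for id := int64(1); id <= 10000; id++ {
		before[id] = ring.ShardFor(id)
	}
	ring.Add("shard-3", 100)
	moved := 0
	for id, old := range before {
		if ring.ShardFor(id) != old {
			moved++
		}
	}
	fmt.Printf("%d of %d keys moved after adding a shard\n", moved, len(before))
}
```

The property that matters is that a key only ever moves to the newly added shard; keys never shuffle between the existing shards. With modulo hashing, going from 3 to 4 shards would relocate roughly three quarters of the keys.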

Pool sizing rule of thumb: set MaxOpenConns to the number of CPU cores on the DB server (for CPU-bound queries) or the connection limit minus connections used by other services. Postgres's default max_connections is 100, so a service running 4 instances should use at most 20 connections each (80 in total), leaving headroom for migrations, monitoring, and admin sessions.
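
The arithmetic behind that rule of thumb is small enough to write down. The helper below is a hypothetical sketch: it divides the server's connection limit, minus a reserve for other clients, across the service instances.

```go
package main

import "fmt"

// maxOpenConnsPerInstance splits a database's connection budget across
// service instances: (serverLimit - reserved) / instances, with a floor of 1.
// Hypothetical helper for illustration.
func maxOpenConnsPerInstance(serverLimit, reserved, instances int) int {
	if instances < 1 {
		instances = 1
	}
	n := (serverLimit - reserved) / instances
	if n < 1 {
		return 1
	}
	return n
}

func main() {
	// Postgres default limit of 100, 20 kept back for admin and tooling, 4 instances.
	fmt.Println(maxOpenConnsPerInstance(100, 20, 4))
}
```

The result would be passed to db.SetMaxOpenConns at startup. If the number of instances changes with autoscaling, the budget should be computed from the maximum instance count rather than the current one.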

Quiz

Q: What problem does SetMaxIdleConns solve in database/sql connection pooling?
a) It limits memory used by idle connections
b) It keeps a pool of ready connections so new queries don't pay TCP+TLS handshake latency; idle connections are reused immediately
c) It prevents connections from being used by more than one goroutine
d) It automatically retries failed queries on idle connections
Answer: b

Q: In hash-based database sharding, what property must the shard assignment function have to avoid data migration when adding shards?
a) It must use a cryptographic hash for security
b) It should use consistent hashing, which minimises the number of keys that must be remapped when shards are added or removed
c) The function must be deterministic but nothing else matters
d) It must be injective: each user ID maps to a unique shard
Answer: b

8. What caching strategies do you use in Go microservices and how do you prevent cache stampede?

Caching reduces database load and improves latency. Common strategies in Go: in-memory (sync.Map, ristretto), distributed (Redis), and multi-level (L1 in-memory + L2 Redis). Cache stampede (thundering herd) is a classic distributed systems problem where many requests simultaneously miss a cold cache.

// Singleflight: collapse concurrent identical requests into one
import "golang.org/x/sync/singleflight"

type UserCache struct {
    mu     sync.RWMutex
    local  map[int64]*cachedUser
    redis  *redis.Client
    repo   UserRepository
    sf     singleflight.Group
}

func (c *UserCache) Get(ctx context.Context, id int64) (*User, error) {
    // L1: in-memory cache (no network)
    c.mu.RLock()
    if cu, ok := c.local[id]; ok && time.Now().Before(cu.expires) {
        c.mu.RUnlock()
        return cu.user, nil
    }
    c.mu.RUnlock()

    // Singleflight: if 100 goroutines miss at the same time,
    // only ONE goes to Redis/DB; the other 99 wait for the result
    key := fmt.Sprintf("user:%d", id)
    result, err, _ := c.sf.Do(key, func() (any, error) {
        // L2: Redis cache (a miss or a corrupt entry falls through to the database)
        data, err := c.redis.Get(ctx, key).Bytes()
        if err == nil {
            var u User
            if err := json.Unmarshal(data, &u); err == nil {
                c.storeLocal(id, &u)
                return &u, nil
            }
        }

        // L3: database
        user, err := c.repo.FindByID(ctx, int(id))
        if err != nil { return nil, err }

        // Populate caches (jitter TTL to avoid simultaneous expiry)
        jitter := time.Duration(rand.Intn(30)) * time.Second
        ttl := 5*time.Minute + jitter
        data, _ = json.Marshal(user)
        c.redis.Set(ctx, key, data, ttl)
        c.storeLocal(id, user)
        return user, nil
    })
    if err != nil { return nil, err }
    return result.(*User), nil
}

Cache stampede prevention: TTL jitter prevents all keys from expiring simultaneously (avoiding a mass DB hit). Singleflight collapses concurrent requests for the same key within one process. Probabilistic early expiration (the XFetch algorithm) refreshes an entry before it expires, with a probability that grows as expiry approaches and as the recomputation time increases.

Quiz

Q: How does singleflight.Group prevent cache stampede?
a) It replicates the cache value across multiple Redis nodes
b) When many goroutines request the same missing key simultaneously, singleflight ensures only one actually fetches from the source; the rest wait and share that result
c) It pre-warms the cache before TTL expiry
d) It spreads requests across multiple cache replicas
Answer: b

Q: Why should cache TTLs include a random jitter value?
a) Jitter improves cache hit rate
b) Without jitter, all entries populated at the same time expire at the same time, causing a thundering herd to hit the database simultaneously
c) Redis requires jitter to prevent connection pooling issues
d) Jitter prevents cache key collisions between services
Answer: b

9. What are table-driven tests in Go and why are they the standard testing pattern?

Table-driven tests define all test cases as a slice of structs, then iterate over them with a single test loop. This is Go's idiomatic testing pattern — adopted throughout the standard library. It eliminates duplication, makes adding new cases trivial, and produces clear failure output identifying exactly which case failed.

// Function under test
func validateEmail(email string) error {
    if email == "" {
        return errors.New("email is required")
    }
    if !strings.Contains(email, "@") {
        return fmt.Errorf("email %q has no @ symbol", email)
    }
    return nil
}

// Table-driven test
func TestValidateEmail(t *testing.T) {
    tests := []struct {
        name    string
        email   string
        wantErr bool
        errMsg  string // optional: check error message substring
    }{
        {
            name:    "valid email",
            email:   "alice@example.com",
            wantErr: false,
        },
        {
            name:    "empty email",
            email:   "",
            wantErr: true,
            errMsg:  "required",
        },
        {
            name:    "missing @ symbol",
            email:   "notanemail",
            wantErr: true,
            errMsg:  "no @ symbol",
        },
        {
            name:    "@ symbol only",
            email:   "@",
            wantErr: false, // technically valid by our rule
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            err := validateEmail(tt.email)

            if (err != nil) != tt.wantErr {
                t.Errorf("validateEmail(%q): got err = %v, wantErr = %v",
                    tt.email, err, tt.wantErr)
                return
            }
            if tt.wantErr && tt.errMsg != "" {
                if !strings.Contains(err.Error(), tt.errMsg) {
                    t.Errorf("error %q does not contain %q",
                        err.Error(), tt.errMsg)
                }
            }
        })
    }
}

// Run only a specific subtest:
// go test -run TestValidateEmail/empty_email ./...

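t.Run benefits: each subtest gets its own scope in test output. Failed subtests show the test name, making debugging immediate. You can run a specific subtest with -run TestFoo/case_name (spaces in subtest names become underscores). Parallel subtests are supported with t.Parallel() inside the subtest function; before Go 1.22, copy the loop variable first (tt := tt) so each parallel subtest sees its own case.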

Quiz

Q: What is the main benefit of using t.Run() for each table-driven test case?
a) t.Run() executes test cases in parallel automatically
b) Each subtest gets an isolated scope: failures show the case name, and individual cases can be run with the -run flag
c) t.Run() provides better memory isolation between test cases
d) It prevents test cases from sharing the testing.T instance
Answer: b

Q: How do you run a specific table-driven subtest named 'empty email' in TestValidateEmail?
a) go test -case TestValidateEmail.empty_email
b) go test -run TestValidateEmail/empty_email ./...
c) go test -subtest 'empty email'
d) go test -filter TestValidateEmail:empty_email
Answer: b

10. How do you write Go benchmarks and what does -benchmem tell you?

Go's testing package has built-in benchmark support. Benchmarks identify performance regressions and allocation hotspots before they reach production. The -benchmem flag reveals hidden allocations that cause GC pressure.

// Benchmark function: func BenchmarkXxx(b *testing.B)
func BenchmarkJSONMarshal(b *testing.B) {
    user := User{ID: 1, Name: "Alice", Email: "alice@example.com", Age: 30}

    b.ResetTimer() // start timing AFTER setup (exclude allocation of user)
    for i := 0; i < b.N; i++ { // b.N is calibrated by the framework
        _, err := json.Marshal(user)
        if err != nil { b.Fatal(err) }
    }
}

// Memory allocation benchmark
func BenchmarkStringConcat(b *testing.B) {
    words := []string{"hello", "world", "foo", "bar", "baz"}

    b.Run("plus operator", func(b *testing.B) {
        b.ReportAllocs() // same as -benchmem, but only for this sub-benchmark
        for i := 0; i < b.N; i++ {
            s := ""
            for _, w := range words { s += w } // each += may allocate a new string
            _ = s
        }
    })

    b.Run("strings.Builder", func(b *testing.B) {
        b.ReportAllocs() // not inherited from the parent, so set it here too
        for i := 0; i < b.N; i++ {
            var sb strings.Builder
            sb.Grow(50) // one up-front allocation; the writes below never reallocate
            for _, w := range words { sb.WriteString(w) }
            _ = sb.String()
        }
    })
}

// Run commands:
// go test -bench=. -benchmem ./...
// go test -bench=BenchmarkStringConcat -benchtime=5s -count=3

// Sample output (numbers are illustrative):
// BenchmarkStringConcat/plus_operator-8   2345678   512 ns/op   256 B/op  5 allocs/op
// BenchmarkStringConcat/strings.Builder-8 9876543   123 ns/op    64 B/op  1 allocs/op
// Columns: name, iterations, ns per op, bytes per op, allocs per op

What -benchmem reports: 'B/op' is the average number of heap bytes allocated per operation, and 'allocs/op' is the average number of separate heap allocations per operation. Every allocation costs CPU time in the allocator and adds work for the garbage collector, so zero allocations in a hot path is the ideal target.

Comparing benchmarks: use benchstat (golang.org/x/perf/cmd/benchstat) to compare before/after with statistical significance — it reports percent change and p-values.
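
The allocs/op figure can also be checked programmatically with testing.AllocsPerRun, which is handy for guarding a hot path in an ordinary test. The sketch below is an illustration only: joinPlus, joinBuilder, allocsPerCall, and sink are names invented here, and the exact counts depend on the Go version and compiler optimisations.

```go
package main

import (
	"strings"
	"testing"
)

// sink is package-level so the compiler cannot discard the results.
var sink string

// joinPlus concatenates with +=, which may allocate a new string per step.
func joinPlus(words []string) string {
	s := ""
	for _, w := range words {
		s += w
	}
	return s
}

// joinBuilder sizes a strings.Builder once, then appends into that buffer.
func joinBuilder(words []string) string {
	n := 0
	for _, w := range words {
		n += len(w)
	}
	var sb strings.Builder
	sb.Grow(n)
	for _, w := range words {
		sb.WriteString(w)
	}
	return sb.String()
}

// allocsPerCall reports the average number of heap allocations made by
// one call of join(words), measured over 100 runs.
func allocsPerCall(join func([]string) string, words []string) float64 {
	return testing.AllocsPerRun(100, func() { sink = join(words) })
}
```

A test can then assert, for example, that allocsPerCall(joinBuilder, words) does not exceed a budget, so an accidental extra allocation fails CI rather than showing up later as GC pressure.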

11. How do you find and fix memory allocation hotspots in a Go service using profiling?

Memory allocation hotspots cause GC pressure, latency spikes, and higher CPU usage. The workflow: benchmark to detect allocations, profile to find the source, fix (pre-allocate, use sync.Pool, reduce interface boxing), benchmark again to verify improvement.

// Step 1: identify hotspots with -benchmem
// go test -bench=BenchmarkProcessRequests -benchmem -memprofile=mem.out
// go tool pprof mem.out
// > top10 -cum
// > list processRequest

// Step 2: fix common allocation patterns

// PATTERN 1: pre-allocate slices to known capacity
// Reallocates and copies repeatedly as the slice outgrows its capacity:
func collectIDs(users []User) []int {
    var ids []int
    for _, u := range users { ids = append(ids, u.ID) }
    return ids
}
// Exactly one allocation (the make call); append never has to grow:
func collectIDsFast(users []User) []int {
    ids := make([]int, 0, len(users)) // pre-allocate exact capacity
    for _, u := range users { ids = append(ids, u.ID) }
    return ids
}

// PATTERN 2: sync.Pool for frequently allocated/freed objects
var bufPool = sync.Pool{
    New: func() any { return &bytes.Buffer{} },
}

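func encodeResponse(v any) ([]byte, error) {
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer bufPool.Put(buf)
    if err := json.NewEncoder(buf).Encode(v); err != nil {
        return nil, err
    }
    // Copy before returning: buf goes back to the pool, and its backing
    // array will be overwritten by the next caller that reuses it.
    return append([]byte(nil), buf.Bytes()...), nil
}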

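// PATTERN 3: avoid interface boxing of small values
// Converting a value to interface{} can force a heap allocation
// (the runtime avoids it only in a few cases, such as very small integers):
func logValue(v interface{}) { fmt.Println(v) }
// logValue(n) boxes n at the call site.

// A type-specific (or generic) function avoids boxing only if it also stays
// away from interface-typed APIs; fmt.Println(v) would box v again.
func appendInt(dst []byte, v int) []byte {
    return strconv.AppendInt(dst, int64(v), 10) // no boxing
}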

// PATTERN 4: strings.Builder instead of string concatenation

// Step 3: verify with benchmark comparison
// (before.txt comes from running the same command before applying the fix)
// go test -bench=BenchmarkCollectIDs -benchmem -count=5 > after.txt
// benchstat before.txt after.txt
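
To make the pre-allocation pattern above concrete, the following sketch counts how often append has to move a slice to a larger backing array. The helper name reallocations is invented for this illustration, and the growth schedule it observes is an implementation detail of the Go runtime, so no specific count is assumed.

```go
package main

// reallocations counts how often append moves s to a larger backing array
// while n elements are appended one at a time. A change in capacity is
// used as the signal that a new array was allocated.
func reallocations(s []int, n int) int {
	count, lastCap := 0, cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != lastCap {
			count++
			lastCap = cap(s)
		}
	}
	return count
}
```

Calling reallocations(nil, 1000) reports several growth steps, while reallocations(make([]int, 0, 1000), 1000) reports none, because the capacity was reserved up front.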

12. How do you structure integration tests in Go that require real databases or external services?

Integration tests verify that your code works with real infrastructure. Go's testing tools make this clean: build tags separate unit from integration tests, TestMain handles setup/teardown, and testcontainers-go spins up real dependencies in Docker.

// integration_test.go
//go:build integration

package repository_test

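import (
    "context"
    "database/sql"
    "log"
    "os"
    "testing"

    _ "github.com/lib/pq" // assumed driver; registers "postgres" with database/sql
    "github.com/testcontainers/testcontainers-go"
    tcpostgres "github.com/testcontainers/testcontainers-go/modules/postgres"
)

var testDB *sql.DB

// TestMain: shared setup/teardown for the whole package
func TestMain(m *testing.M) {
    // os.Exit skips deferred calls, so setup and teardown live in runTests.
    os.Exit(runTests(m))
}

func runTests(m *testing.M) int {
    ctx := context.Background()

    // Start a real Postgres container
    pgContainer, err := tcpostgres.RunContainer(ctx,
        testcontainers.WithImage("postgres:15"),
        tcpostgres.WithDatabase("testdb"),
        tcpostgres.WithUsername("test"),
        tcpostgres.WithPassword("test"),
    )
    if err != nil {
        log.Printf("container start: %v", err)
        return 1
    }
    defer pgContainer.Terminate(ctx)

    dsn, err := pgContainer.ConnectionString(ctx, "sslmode=disable")
    if err != nil {
        log.Printf("connection string: %v", err)
        return 1
    }
    testDB, err = sql.Open("postgres", dsn)
    if err != nil {
        log.Printf("open db: %v", err)
        return 1
    }
    defer testDB.Close()

    // Run migrations
    if err := runMigrations(testDB); err != nil {
        log.Printf("migrate: %v", err)
        return 1
    }

    return m.Run() // run all tests in the package
}

// Individual integration test using shared testDB.
// Here postgres is the application's own repository package (import omitted),
// which is why the testcontainers module is aliased to tcpostgres above.
func TestUserRepository_Save(t *testing.T) {
    repo := postgres.NewUserRepository(testDB)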
    ctx := context.Background()

    t.Cleanup(func() {
        testDB.ExecContext(ctx, "DELETE FROM users WHERE email = $1",
            "test@example.com")
    })

    user := &User{Name: "Test", Email: "test@example.com"}
    if err := repo.Save(ctx, user); err != nil {
        t.Fatalf("Save: %v", err)
    }
    if user.ID == 0 { t.Error("expected ID to be set after save") }

    got, err := repo.FindByID(ctx, user.ID)
    if err != nil { t.Fatalf("FindByID: %v", err) }
    if got.Email != user.Email {
        t.Errorf("got email %q, want %q", got.Email, user.Email)
    }
}

// Run integration tests:
// go test -tags integration ./...

13. Explain the difference between mocks, stubs, and fakes in Go testing. When do you use each?

These three terms are often used interchangeably, but they describe different test double patterns with different purposes. Go's implicit interfaces make all three easy to implement without a framework.

Test Double Types
Type	Purpose	Returns	Verifies calls?
Stub	Returns pre-programmed responses to specific calls	Fixed values	No
Fake	Working implementation with simplified logic (e.g., in-memory DB)	Realistic values	No
Mock	Records calls and verifies expected interactions	Pre-programmed OR real	Yes — asserts call count, order, args

// Interface to test against
type UserRepository interface {
    FindByID(ctx context.Context, id int) (*User, error)
    Save(ctx context.Context, user *User) error
}

// STUB: returns fixed values, no logic
type stubUserRepo struct {
    user *User
    err  error
}
func (s *stubUserRepo) FindByID(_ context.Context, _ int) (*User, error) {
    return s.user, s.err
}
func (s *stubUserRepo) Save(_ context.Context, _ *User) error { return nil }

// FAKE: in-memory map; works like a real repo but without a DB
type fakeUserRepo struct {
    mu     sync.Mutex
    users  map[int]*User
    nextID int
}
func (f *fakeUserRepo) FindByID(_ context.Context, id int) (*User, error) {
    f.mu.Lock(); defer f.mu.Unlock()
    u, ok := f.users[id]
    if !ok { return nil, ErrNotFound }
    return u, nil
}
func (f *fakeUserRepo) Save(_ context.Context, u *User) error {
    f.mu.Lock(); defer f.mu.Unlock()
    if f.users == nil { f.users = make(map[int]*User) } // keep the zero value usable
    f.nextID++; u.ID = f.nextID
    f.users[u.ID] = u
    return nil
}

// MOCK: records calls for assertion
type mockUserRepo struct {
    FindByIDCalls []int
    SaveCalls     []*User
    stub          stubUserRepo
}
func (m *mockUserRepo) FindByID(ctx context.Context, id int) (*User, error) {
    m.FindByIDCalls = append(m.FindByIDCalls, id)
    return m.stub.FindByID(ctx, id)
}
func (m *mockUserRepo) Save(ctx context.Context, u *User) error {
    m.SaveCalls = append(m.SaveCalls, u)
    return m.stub.Save(ctx, u)
}

// Test using mock
func TestService_Register(t *testing.T) {
    mock := &mockUserRepo{stub: stubUserRepo{}}
    svc := NewUserService(mock)
    svc.Register(context.Background(), "alice@example.com")
    if len(mock.SaveCalls) != 1 {
        t.Errorf("expected 1 Save call, got %d", len(mock.SaveCalls))
    }
}

14. How does Go's built-in fuzzing work and when should you use property-based testing?

Go 1.18 added native fuzz testing via go test -fuzz. Fuzzing automatically generates inputs that exercise edge cases your hand-written tests miss — particularly effective for parsers, serialisers, and cryptographic code.

// Fuzz test: finds inputs that cause a panic or incorrect result
func FuzzParseURL(f *testing.F) {
    // Seed corpus: known interesting inputs
    f.Add("https://example.com/path?q=1")
    f.Add("http://user:pass@host:8080/")
    f.Add("")

    f.Fuzz(func(t *testing.T, s string) {
        // Property: parsing should never panic
        u, err := url.Parse(s)
        if err != nil { return } // error is ok, panic is not

        // Property: round-trip should be stable
        // parsing the string representation should give the same URL
        u2, err := url.Parse(u.String())
        if err != nil {
            t.Fatalf("round-trip parse failed: %v", err) // u2 is nil here, so stop
        }
        if u.String() != u2.String() {
            t.Errorf("round-trip changed URL: %q -> %q",
                u.String(), u2.String())
        }
    })
}

// Another fuzz example: JSON encode/decode round-trip
func FuzzJSONRoundTrip(f *testing.F) {
    f.Add([]byte(`{"name":"Alice","age":30}`)) // seed type must match the fuzz argument type

    f.Fuzz(func(t *testing.T, data []byte) {
        var v map[string]any
        if err := json.Unmarshal(data, &v); err != nil { return }

        encoded, err := json.Marshal(v)
        if err != nil {
            t.Errorf("marshal failed after successful unmarshal: %v", err)
        }
        var v2 map[string]any
        if err := json.Unmarshal(encoded, &v2); err != nil {
            t.Errorf("second unmarshal failed: %v", err)
        }
    })
}

// Run fuzzing:
// go test -fuzz=FuzzParseURL                    # fuzz until stopped
// go test -fuzz=FuzzParseURL -fuzztime=60s      # fuzz for 60 seconds
// go test                                        # replays corpus only (CI)

When to use fuzzing: parsers (JSON, YAML, protobuf), network protocol handlers, cryptographic code, regular expression engines, any function that accepts arbitrary byte/string input. Fuzzing found critical bugs in Go's own standard library.
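
For the property-based half of the question, the standard library's testing/quick package (frozen, but still available) checks a property against randomly generated inputs without a coverage-guided engine. The sketch below is an assumption-laden illustration: the hex round trip and the names roundTrips and checkRoundTrip are chosen here for brevity and do not come from the original text.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"testing/quick"
)

// roundTrips states the property: decoding the hex encoding of data
// must succeed and return the original bytes.
func roundTrips(data []byte) bool {
	decoded, err := hex.DecodeString(hex.EncodeToString(data))
	return err == nil && bytes.Equal(decoded, data)
}

// checkRoundTrip runs the property against random inputs using the
// default quick.Config; it returns a non-nil error describing the
// first counterexample, if any is found.
func checkRoundTrip() error {
	return quick.Check(roundTrips, nil)
}
```

Inside a test this is typically written as: if err := quick.Check(roundTrips, nil); err != nil { t.Error(err) }. Native fuzzing is usually the better fit for byte- or string-oriented inputs, while a quick-style check suits structured values with a clear invariant.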

15. How do you test concurrent Go code correctly — including data races and timing issues?

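Concurrent code is notoriously difficult to test because bugs may only appear under specific goroutine interleavings. Three practices are essential: running tests under the race detector (-race), checking for goroutine leaks (shown here with the third-party go.uber.org/goleak package), and designing components so that tests can synchronise deterministically.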

import (
    "context"
    "sync"
    "testing"
    "time"

    "go.uber.org/goleak"
)

// Always run concurrent tests with -race
// go test -race ./...

// Test goroutine leak detection
func TestNoGoroutineLeak(t *testing.T) {
    defer goleak.VerifyNone(t) // fails if goroutines remain after test

    ctx, cancel := context.WithCancel(context.Background())
    worker := NewBackgroundWorker(ctx)
    worker.Start()

    // do some work...

    cancel() // signal worker to stop
    worker.Wait()
    // goleak checks that the worker goroutine actually exited
}

// Test shared state with concurrent access
func TestCounter_ConcurrentIncrement(t *testing.T) {
    c := NewAtomicCounter()
    const goroutines = 100
    const increments = 1000

    var wg sync.WaitGroup
    wg.Add(goroutines)
    for i := 0; i < goroutines; i++ {
        go func() {
            defer wg.Done()
            for j := 0; j < increments; j++ {
                c.Increment()
            }
        }()
    }
    wg.Wait()

    expected := goroutines * increments
    if got := c.Value(); got != expected {
        t.Errorf("got %d, want %d", got, expected)
    }
}

// Test channel-based pipelines with timeout
func TestPipeline_ProcessesAllItems(t *testing.T) {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    in := generateJobs([]string{"a", "b", "c"})
    out := processJobs(ctx, in, 3)

    results := collectResults(ctx, out)
    if len(results) != 3 {
        t.Errorf("expected 3 results, got %d", len(results))
    }
}

Deterministic test design: avoid time.Sleep in tests to wait for goroutines — use WaitGroup, channels, or context. Sleep-based synchronisation makes tests flaky on slow CI machines. Design concurrent components so their completion is signalled through channels or sync primitives.
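
A minimal sketch of the sleep-free approach described above, using invented helper names (startWorker, waitDone): the goroutine signals completion by closing a channel, and the test waits on that channel with a timeout that only acts as a safety net.

```go
package main

import (
	"errors"
	"time"
)

// startWorker runs work in a goroutine and returns a channel that is
// closed when work has finished.
func startWorker(work func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		work()
	}()
	return done
}

// waitDone blocks until done is closed or the timeout elapses. The
// timeout bounds a hung test; it is not used for synchronisation.
func waitDone(done <-chan struct{}, timeout time.Duration) error {
	select {
	case <-done:
		return nil
	case <-time.After(timeout):
		return errors.New("timed out waiting for goroutine")
	}
}
```

Because the receive from the closed channel happens after the worker's writes, the test can read the worker's results immediately after waitDone returns nil, with no sleep and no race.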

16. How do you decide where to draw service boundaries when decomposing a Go monolith into microservices?

Service decomposition is one of the hardest architectural decisions. Decomposing too aggressively creates a 'distributed monolith' — all the complexity of microservices with none of the benefits. Decomposing too conservatively keeps the monolith's disadvantages.

Decomposition Principles
Principle	Description	Go Implication
Domain-Driven Design	Split by bounded context — User, Order, Inventory are separate domains	Each service owns its schema; no cross-schema JOINs
Single Responsibility	Each service does one thing well; change one thing without touching others	One binary per domain; small team ownership
Data Ownership	Service owns its data; others access via API — no shared DB tables	Separate databases or schemas; eventual consistency via events
Deployability	Can each service be deployed independently?	Separate CI/CD pipelines; semantic versioning of gRPC APIs
Failure Isolation	Does one service's failure propagate?	Circuit breakers; timeout on every cross-service call
Strangler Fig	Migrate gradually — route traffic by feature flag	Proxy in Go; run old+new in parallel

// Anti-pattern: too-fine decomposition (nanoservices)
// UserNameService, UserEmailService, UserAgeService
// -> Every user operation requires 3 network calls
// -> Synchronous coupling worse than a monolith

// Better: domain-aligned decomposition
// UserService:   manage user profiles and authentication
// OrderService:  manage order lifecycle and payments
// NotifyService: send emails/SMS/push; consumes events from others

// Strangler Fig pattern in Go
func routeToNewService(cfg *Config) http.Handler {
    legacyHandler := newLegacyHandler()
    newHandler    := newModernHandler()

    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Gradually shift traffic using feature flags
        if cfg.Features.IsEnabled("new-user-service") &&
            strings.HasPrefix(r.URL.Path, "/users/") {
            newHandler.ServeHTTP(w, r)
            return
        }
        legacyHandler.ServeHTTP(w, r)
    })
}

17. How do you version gRPC APIs in Go without breaking existing clients?

Breaking changes in gRPC are harder to recover from than REST — generated client code must be recompiled. The Protobuf wire format and Go's embedded Unimplemented* pattern provide the tools to evolve APIs safely.

// Protobuf field number rules (never change):
// - Field numbers 1-15: used for frequently-sent fields (1-byte encoded)
// - Never reuse a field number; remove fields by reserving them
// - Never rename fields if using JSON encoding

// Safe changes (backward compatible):
// 1. Add new optional fields with new field numbers
// 2. Add new RPC methods to the service
// 3. Add values to enums (with care)

// Breaking changes (require new major version):
// 1. Remove or rename fields
// 2. Change field types
// 3. Remove RPC methods

// In user.proto (safe evolution):
message User {
    int64  id    = 1;
    string name  = 2;
    string email = 3;
    // Added in v1.1; old clients ignore this field
    string avatar_url = 4;
    // Removed field: reserved to prevent accidental reuse
    reserved 5;
    reserved "phone_number";
}

// Major version bump when breaking changes required
// package user.v2;
// -> separate proto package, separate Go package
// -> clients opt-in by importing v2

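// Server supporting both v1 and v2 simultaneously
func main() {
    lis, err := net.Listen("tcp", ":9090") // port is illustrative
    if err != nil { log.Fatalf("listen: %v", err) }

    grpcServer := grpc.NewServer()
    v1pb.RegisterUserServiceServer(grpcServer, &v1Handler{})
    v2pb.RegisterUserServiceServer(grpcServer, &v2Handler{})
    // gRPC uses fully qualified service name to route:
    // user.v1.UserService vs user.v2.UserService
    if err := grpcServer.Serve(lis); err != nil { log.Fatalf("serve: %v", err) }
}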

Field reservation: when you remove a field, add its number and name to reserved. This prevents future schema changes from accidentally reusing the old field number and silently corrupting old clients that still send the deprecated field.

18. How do you write unit and integration tests for gRPC services in Go?

gRPC services are tested at multiple levels: unit tests using the generated client/server with an in-process buffer connection, and integration tests using a real server. The bufconn package provides a lightweight in-memory network for fast unit tests.

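import (
    "context"
    "net"
    "testing"

    "google.golang.org/grpc"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/status"
    "google.golang.org/grpc/test/bufconn"
    // plus pb, the generated package for the service under test
)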

const bufSize = 1024 * 1024

// Setup: start server in-process with bufconn
func setupTestServer(t *testing.T, repo UserRepository) pb.UserServiceClient {
    t.Helper()

    lis := bufconn.Listen(bufSize)
    grpcServer := grpc.NewServer()
    pb.RegisterUserServiceServer(grpcServer,
        &userServiceServer{repo: repo})

    go func() {
        if err := grpcServer.Serve(lis); err != nil {
            t.Logf("server error: %v", err)
        }
    }()
    t.Cleanup(func() { grpcServer.GracefulStop() })

    // Dial using the in-memory buffer
    conn, err := grpc.NewClient(
        "passthrough:///bufnet", // passthrough skips DNS; the dialer below ignores the address
        grpc.WithContextDialer(func(ctx context.Context, _ string) (net.Conn, error) {
            return lis.DialContext(ctx)
        }),
        grpc.WithTransportCredentials(insecure.NewCredentials()),
    )
    if err != nil { t.Fatalf("dial: %v", err) }
    t.Cleanup(func() { conn.Close() })

    return pb.NewUserServiceClient(conn)
}

// Table-driven gRPC test
func TestGetUser(t *testing.T) {
    fakeRepo := &fakeUserRepo{
        users: map[int]*User{1: {ID: 1, Name: "Alice"}},
    }
    client := setupTestServer(t, fakeRepo)

    tests := []struct {
        name     string
        id       int64
        wantName string
        wantCode codes.Code
    }{
        {"existing user", 1, "Alice", codes.OK},
        {"missing user", 99, "", codes.NotFound},
        {"invalid id",   0, "", codes.InvalidArgument},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            resp, err := client.GetUser(context.Background(),
                &pb.GetUserRequest{Id: tt.id})

            if code := status.Code(err); code != tt.wantCode {
                t.Errorf("got code %v, want %v", code, tt.wantCode)
            }
            // GetName is nil-safe, so a nil resp reports a mismatch instead of panicking
            if tt.wantName != "" && resp.GetName() != tt.wantName {
                t.Errorf("got name %q, want %q", resp.GetName(), tt.wantName)
            }
        })
    }
}

19. How do you load test a Go microservice and interpret the results?

Load testing validates that a service meets performance requirements under expected and peak traffic. Go services are typically tested with k6, vegeta, or the Go-native go-wrk. The key metrics: throughput (RPS), latency percentiles (p50, p95, p99), and error rate.

// Vegeta: Go-native load testing library
import vegeta "github.com/tsenart/vegeta/v12/lib"

func TestLoadGetUser(t *testing.T) { // name must start with "Test" for go test to run it
    if testing.Short() { t.Skip("skipping load test in short mode") }

    rate    := vegeta.Rate{Freq: 100, Per: time.Second} // 100 RPS
    duration := 30 * time.Second
    targeter := vegeta.NewStaticTargeter(vegeta.Target{
        Method: "GET",
        URL:    "http://localhost:8080/users/1",
    })

    attacker := vegeta.NewAttacker()
    var metrics vegeta.Metrics
    for res := range attacker.Attack(targeter, rate, duration, "load test") {
        metrics.Add(res)
    }
    metrics.Close()

    t.Logf("Requests:  %d", metrics.Requests)
    t.Logf("Success:   %.2f%%", metrics.Success*100)
    t.Logf("Throughput: %.2f rps", metrics.Throughput)
    t.Logf("Latency p50: %v", metrics.Latencies.P50)
    t.Logf("Latency p95: %v", metrics.Latencies.P95)
    t.Logf("Latency p99: %v", metrics.Latencies.P99)

    // Assertions
    if metrics.Success < 0.999 {
        t.Errorf("success rate %.2f%% below 99.9%%", metrics.Success*100)
    }
    if metrics.Latencies.P99 > 50*time.Millisecond {
        t.Errorf("p99 latency %v exceeds 50ms SLO", metrics.Latencies.P99)
    }
}

// Benchmark as a proxy for load test (lower overhead)
func BenchmarkHandlerThroughput(b *testing.B) {
    srv := httptest.NewServer(buildRouter())
    defer srv.Close()
    client := srv.Client()

    b.SetParallelism(10) // run 10 x GOMAXPROCS goroutines concurrently
    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            resp, err := client.Get(srv.URL + "/users/1")
            if err != nil {
                b.Error(err)
                continue
            }
            io.Copy(io.Discard, resp.Body) // drain the body so the connection is reused
            resp.Body.Close()
        }
    })
    b.ReportMetric(float64(b.N)/b.Elapsed().Seconds(), "rps")
}
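
To show what the latency percentiles above mean numerically, here is a small nearest-rank percentile helper. It is an illustration only: the function name and the use of plain float64 milliseconds are choices made here, and load-testing tools may use different estimation methods (for example, streaming histograms).

```go
package main

import (
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100) of
// latencies given in milliseconds. The input slice is not modified.
// It returns 0 for an empty input.
func percentile(latenciesMs []float64, p float64) float64 {
	if len(latenciesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), latenciesMs...)
	sort.Float64s(sorted)

	// Nearest rank: the smallest value such that at least p percent of
	// the samples are less than or equal to it.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}
```

With 100 samples of 1 ms to 100 ms, percentile(samples, 50) is 50 and percentile(samples, 99) is 99: a handful of slow requests barely moves the median but dominates p99, which is why SLOs are usually stated on tail percentiles.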

20. How does service discovery and client-side load balancing work in a Go microservice system?

When service B needs to call service A, it must discover A's current addresses (since pods restart and scale). Go gRPC has built-in pluggable load balancing and name resolution for integrating with Consul, etcd, or Kubernetes DNS.

// Option 1: Kubernetes DNS + round-robin (simplest)
// k8s headless service: userservice.default.svc.cluster.local
// -> resolves to ALL pod IPs, not just one VIP
conn, err := grpc.NewClient(
    "dns:///userservice.default.svc.cluster.local:9090",
    grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"round_robin"}`),
    grpc.WithTransportCredentials(insecure.NewCredentials()),
)

// Option 2: consul service discovery
// (the registration calls below depend on the resolver library and its version)
import resolverv2 "github.com/mbobrovskyi/grpc-consul-resolver"

resolverv2.RegisterDefault(
    resolverv2.NewResolver(
        "consul://localhost:8500/user-service",
        resolverv2.WithHealthCheck(true),
    ),
)

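// Option 3: manual custom resolver for testing / dev
type staticResolver struct{}

// Scheme is required by resolver.Builder; targets look like "static:///anything".
func (staticResolver) Scheme() string { return "static" }

func (staticResolver) Build(target resolver.Target,
    cc resolver.ClientConn, opts resolver.BuildOptions) (resolver.Resolver, error) {
    addrs := []resolver.Address{
        {Addr: "localhost:9090"},
        {Addr: "localhost:9091"},
    }
    cc.UpdateState(resolver.State{Addresses: addrs})
    return &staticRes{cc: cc, addrs: addrs}, nil
}

// staticRes never re-resolves: the address list is fixed.
type staticRes struct {
    cc    resolver.ClientConn
    addrs []resolver.Address
}

func (*staticRes) ResolveNow(resolver.ResolveNowOptions) {}
func (*staticRes) Close()                                {}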

// Load balancing policies in gRPC:
// round_robin:  cycle through all addresses
// pick_first:   always use first healthy address (default)
// grpclb:       server-side load balancing (deprecated)
// rls:          routing lookup service
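
As a conceptual illustration of the round_robin policy listed above, the sketch below cycles through a fixed address list with an atomic counter. It is not gRPC's implementation: the roundRobin type and its methods are invented here, and the real balancer additionally tracks connection state and skips addresses that are not ready.

```go
package main

import "sync/atomic"

// roundRobin hands out addresses in rotation and is safe for concurrent use.
type roundRobin struct {
	addrs []string
	next  atomic.Uint64
}

// newRoundRobin returns a picker over a copy of addrs.
func newRoundRobin(addrs []string) *roundRobin {
	return &roundRobin{addrs: append([]string(nil), addrs...)}
}

// pick returns the next address in rotation, or "" if there are none.
func (r *roundRobin) pick() string {
	if len(r.addrs) == 0 {
		return ""
	}
	n := r.next.Add(1) - 1 // index of this call, starting at 0
	return r.addrs[n%uint64(len(r.addrs))]
}
```

With three addresses, successive calls return the first, second, third, and then the first again, which spreads requests evenly as long as every backend has similar capacity.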

Service mesh alternative: tools like Istio, Linkerd, and Cilium implement load balancing, circuit breaking, retries, and mTLS in a proxy layer outside the application (a per-pod sidecar or a per-node proxy, depending on the mesh), removing these concerns from application code. The Go service makes plain gRPC calls; the mesh handles distribution transparently.

21. How do you design a consistent error model across multiple Go microservices?

In a system with 10+ services, inconsistent error formats force every client to implement different error parsing. A shared error contract, carried in gRPC status details or HTTP Problem Details (RFC 9457), enables uniform client-side handling.

// Rich error details use the standard google.rpc error model
// (google.rpc.Status plus the messages defined in error_details.proto).
// The Go types ship in the genproto module; no extra .proto files are needed.

import (
    "google.golang.org/genproto/googleapis/rpc/errdetails"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

// Return rich error from gRPC handler
func (s *userServiceServer) CreateUser(
    ctx context.Context, req *pb.CreateUserRequest,
) (*pb.User, error) {
    if req.Email == "" {
        // Rich validation error with field-level detail
        st := status.New(codes.InvalidArgument, "validation failed")
        detail := &errdetails.BadRequest{
            FieldViolations: []*errdetails.BadRequest_FieldViolation{
                {Field: "email", Description: "email is required"},
            },
        }
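        // On failure WithDetails returns a nil status, so keep the plain one
        if withDetail, err := st.WithDetails(detail); err == nil {
            st = withDetail
        }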
        return nil, st.Err()
    }

    user, err := s.repo.Save(ctx, &User{Email: req.Email})
    if err != nil {
        if errors.Is(err, ErrDuplicate) {
            st := status.New(codes.AlreadyExists, "email already registered")
            info := &errdetails.ErrorInfo{
                Reason: "EMAIL_ALREADY_EXISTS",
                Domain: "user.service",
            }
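            // On failure WithDetails returns a nil status, so keep the plain one
            if withInfo, err := st.WithDetails(info); err == nil {
                st = withInfo
            }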
            return nil, st.Err()
        }
        // Log err server-side; the client only sees a generic message
        return nil, status.Error(codes.Internal, "internal error")
    }
    return toProto(user), nil
}

// Client: extract rich error details
_, err := client.CreateUser(ctx, req)
if err != nil {
    st := status.Convert(err)
    for _, detail := range st.Details() {
        switch d := detail.(type) {
        case *errdetails.BadRequest:
            for _, v := range d.FieldViolations {
                log.Printf("field %s: %s", v.Field, v.Description)
            }
        }
    }
}

Quiz

What does status.WithDetails() enable in gRPC error responses?
✗ It adds HTTP headers to gRPC responses
✓ It attaches structured proto messages to a gRPC error: clients can extract field-level validation errors, error codes, and metadata beyond a simple string message
✗ It enables error retransmission if the client disconnects
✗ It encrypts the error message in transit

Why should you never expose internal error details (stack traces, DB queries) in gRPC error messages?
✗ gRPC error messages have a 256-byte limit
✓ Internal details are sent to clients verbatim: they reveal implementation details, aid attackers, and may contain sensitive data; return generic codes.Internal with a request ID instead
✗ Stack traces cannot be serialised to protobuf
✗ It violates the Protobuf wire format specification

22. How do you implement the Saga pattern for distributed transactions in Go?

Distributed transactions that span multiple services cannot use traditional 2-phase commit without creating tight coupling and availability issues. The Saga pattern decomposes a transaction into a sequence of local transactions, each publishing an event. Failures trigger compensating transactions.

// Choreography-based saga: services react to events

// OrderService publishes OrderCreated
type OrderCreatedEvent struct {
    OrderID   string  `json:"order_id"`
    UserID    string  `json:"user_id"`
    Amount    float64 `json:"amount"` // float64 kept for brevity; prefer integer minor units (cents) for money
    ProductID string  `json:"product_id"`
}

// PaymentService consumes OrderCreated, publishes PaymentProcessed or PaymentFailed
func (s *PaymentService) HandleOrderCreated(ctx context.Context,
    event OrderCreatedEvent) error {

    charged, err := s.stripe.Charge(ctx, event.UserID, event.Amount)
    if err != nil {
        // Publish compensating event for OrderService to cancel the order
        return s.publisher.Publish(ctx, PaymentFailedEvent{
            OrderID: event.OrderID,
            Reason:  err.Error(),
        })
    }
    return s.publisher.Publish(ctx, PaymentProcessedEvent{
        OrderID:   event.OrderID,
        ChargeID:  charged.ID,
    })
}

// InventoryService consumes PaymentProcessed, publishes StockReserved or StockUnavailable
// OrderService listens to StockReserved -> fulfillment
// OrderService listens to StockUnavailable -> refund (another compensating event)

// Outbox pattern: ensure event is published atomically with DB write
func (s *OrderService) CreateOrder(ctx context.Context, req OrderRequest) error {
    tx, err := s.db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    defer tx.Rollback() // no-op once Commit has succeeded

    // Write order to DB
    orderID := uuid.New().String()
    if _, err := tx.ExecContext(ctx, "INSERT INTO orders ...", orderID, req.UserID); err != nil {
        return fmt.Errorf("insert order: %w", err)
    }

    // Write event to outbox table in SAME transaction
    eventData, err := json.Marshal(OrderCreatedEvent{OrderID: orderID, UserID: req.UserID})
    if err != nil {
        return fmt.Errorf("marshal event: %w", err)
    }
    if _, err := tx.ExecContext(ctx,
        "INSERT INTO outbox (event_type, payload) VALUES ($1, $2)",
        "order.created", eventData); err != nil {
        return fmt.Errorf("insert outbox row: %w", err)
    }

    // A separate relay process reads the outbox and publishes to the message queue
    return tx.Commit()
}

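The Outbox pattern solves the dual-write problem: writing to the DB and publishing to a message queue are two separate operations, and either can fail. Writing the business row and the event (in an outbox table) in the same DB transaction makes them atomic; a relay process then publishes each event to the queue and deletes it from the outbox. Because the relay can crash between publishing and deleting, delivery is at-least-once, so consumers must be idempotent.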

Quiz

What is a compensating transaction in the Saga pattern?
✗ A transaction that runs faster to compensate for a slow previous transaction
✓ A local transaction that undoes the effect of a previous step in the saga when a later step fails, maintaining eventual consistency without 2PC
✗ A backup transaction that retries failed operations automatically
✗ A read-only transaction that verifies the saga completed correctly

What problem does the Outbox pattern solve in event-driven architecture?
✗ It prevents duplicate events from being published
✓ It solves the dual-write problem: without it, the DB write and message publish can fail independently; the outbox makes both atomic by writing both in one DB transaction
✗ It provides ordered delivery of events
✗ It encrypts events before publishing to the queue

23. What testing.T methods do experienced Go engineers use to write cleaner tests?

Beyond t.Error and t.Fatal, Go's testing package offers several methods that eliminate boilerplate and make test intent clearer. Knowing these marks a candidate as familiar with Go testing idioms.

// t.Helper(): marks current function as a test helper
// Error messages show the caller's line, not the helper's line
func requireNoError(t *testing.T, err error, msg string) {
    t.Helper() // CRITICAL: without this, failure shows helper's line
    if err != nil {
        t.Fatalf("%s: %v", msg, err)
    }
}

// t.Cleanup(): deferred cleanup, runs even if test panics
func TestDBOperation(t *testing.T) {
    db := openTestDB(t)
    t.Cleanup(func() { db.Close() }) // cleaner than defer for table tests

    user := createTestUser(t, db)
    t.Cleanup(func() {
        db.Exec("DELETE FROM users WHERE id = $1", user.ID)
    })
    // test code...
}

// t.Setenv(): sets env var for the test, auto-restores on cleanup
// (it panics if the test or an ancestor has called t.Parallel)
func TestLoadConfig(t *testing.T) {
    t.Setenv("DATABASE_URL", "postgres://test:test@localhost/testdb")
    t.Setenv("JWT_SECRET", "test-secret-at-least-32-chars-long")
    cfg, err := loadConfig()
    requireNoError(t, err, "loadConfig")
    if cfg.Database.URL == "" { t.Error("DATABASE_URL not loaded") }
}

// t.TempDir(): creates a temp directory that's auto-removed
func TestWriteFile(t *testing.T) {
    dir := t.TempDir() // automatically cleaned up after test
    path := filepath.Join(dir, "output.json")
    err := writeJSON(path, map[string]string{"ok": "true"})
    requireNoError(t, err, "writeJSON")
}

// t.Parallel(): run subtests concurrently for faster test suites
func TestConcurrentOperations(t *testing.T) {
    for _, tc := range testCases {
        tc := tc // capture pre-Go 1.22
        t.Run(tc.name, func(t *testing.T) {
            t.Parallel() // this subtest runs concurrently with others
            // ... test body
        })
    }
}

Quiz

Why is t.Helper() important when writing test helper functions?
✗ It makes the helper function run faster
✓ Without t.Helper(), test failure messages point to the helper's line instead of the test's line, making it hard to find which test case failed
✗ It marks the function as safe for concurrent test execution
✗ It prevents the helper from calling t.Fatal()

What does t.Setenv() do that manually calling os.Setenv does not?
✗ t.Setenv validates that the value is a valid environment variable
✓ t.Setenv automatically restores the original value after the test: no manual cleanup needed, and tests don't leak modified environment to other tests
✗ t.Setenv only works with the test process, not child processes
✗ They are identical; t.Setenv is just syntactic sugar

24. How do you benchmark concurrent code with testing.B and what insights does it provide?

Serial benchmarks (for i := 0; i < b.N; i++) measure single-goroutine throughput. Parallel benchmarks reveal lock contention, cache coherence issues, and true concurrent throughput — critical for shared data structures and handlers.

// Serial benchmark: single goroutine throughput
func BenchmarkMapGet(b *testing.B) {
    m := map[string]int{"key": 1}
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _ = m["key"]
    }
}

// Parallel benchmark: concurrent throughput + contention
func BenchmarkSyncMapGet(b *testing.B) {
    var m sync.Map
    m.Store("key", 1)

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() { // pb.Next() is goroutine-safe, replaces i < b.N
            m.Load("key")
        }
    })
}

// Compare mutex-protected map vs sync.Map under contention
type MutexMap struct {
    mu sync.RWMutex
    m  map[string]int
}

func BenchmarkMutexMapVsSyncMap(b *testing.B) {
    b.Run("mutex-map", func(b *testing.B) {
        mm := &MutexMap{m: map[string]int{"k": 1}}
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                mm.mu.RLock()
                _ = mm.m["k"]
                mm.mu.RUnlock()
            }
        })
    })

    b.Run("sync-map", func(b *testing.B) {
        var sm sync.Map
        sm.Store("k", 1)
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() { sm.Load("k") }
        })
    })
}

// Run with multiple parallelism levels:
// go test -bench=BenchmarkMutexMapVsSyncMap -cpu=1,4,8,16
// -cpu controls GOMAXPROCS; shows how performance scales with CPUs

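// Custom metric reporting (inside a benchmark function)
b.SetParallelism(10) // call before b.RunParallel: runs GOMAXPROCS * 10 goroutines
b.ReportMetric(float64(b.N)/b.Elapsed().Seconds(), "rps") // b.Elapsed requires Go 1.20+
b.ReportMetric(float64(contention)/float64(b.N), "contentions/op") // contention: a counter your code maintains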
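
To show where those calls sit in a complete benchmark, here is a hedged sketch. The `counter` type and its contention tracking via `TryLock` are assumptions for illustration; `testing.Benchmark` is used so the example runs as an ordinary program instead of through `go test`.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
    "testing"
)

// counter is a mutex-protected counter that also records how often a caller
// found the lock already held (a rough contention signal).
type counter struct {
    mu        sync.Mutex
    n         int
    contended atomic.Int64
}

func (c *counter) Inc() {
    if !c.mu.TryLock() { // lock was busy: count it, then wait normally
        c.contended.Add(1)
        c.mu.Lock()
    }
    c.n++
    c.mu.Unlock()
}

// benchmarkCounter measures Inc under parallel load and reports a custom metric.
func benchmarkCounter(b *testing.B) {
    var c counter
    b.SetParallelism(4) // must precede RunParallel: 4 * GOMAXPROCS goroutines
    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            c.Inc()
        }
    })
    b.ReportMetric(float64(c.contended.Load())/float64(b.N), "contentions/op")
}

func main() {
    res := testing.Benchmark(benchmarkCounter)
    fmt.Println(res.String())
    fmt.Printf("contentions/op: %.4f\n", res.Extra["contentions/op"])
}
```

Inside a `_test.go` file the same function would be named `BenchmarkCounter` and run with `go test -bench=Counter -cpu=1,4,8`; the contention ratio will usually rise with the CPU count.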

Quiz

What does the '-cpu=1,4,8,16' flag do when running Go benchmarks?
✗ It limits the benchmark to run on specific CPU cores
✓ It runs the benchmark four times with GOMAXPROCS set to 1, 4, 8, and 16 respectively, showing how performance scales with CPU count and revealing scaling limits
✗ It sets the number of goroutines in b.RunParallel to each value
✗ It measures CPU frequency at each power level

In b.RunParallel, what does pb.Next() do that 'i < b.N' does not?
✗ pb.Next() is faster because it avoids integer comparison
✓ pb.Next() is goroutine-safe: it distributes the b.N iterations across all parallel goroutines; 'i < b.N' would give each goroutine b.N iterations instead of sharing them
✗ pb.Next() automatically detects when the benchmark is complete
✗ pb.Next() enables memory profiling inside the parallel loop

25. How do you manage dependency injection at scale in a large Go service — wire, dig, or manual?

As a Go service grows beyond a few dependencies, main() becomes a complex wiring function. Three approaches: manual wiring (always readable), Google Wire (code generation), or Uber Dig (reflection-based runtime injection).

// APPROACH 1: Manual wiring in main(): clear, no magic, preferred for < 20 deps
func main() {
    cfg, err := config.Load()
    if err != nil { log.Fatal(err) }

    db, err := database.Open(cfg.Database)
    if err != nil { log.Fatal(err) }

    userRepo  := postgres.NewUserRepository(db)
    emailSvc  := smtp.NewEmailService(cfg.SMTP)
    cacheSvc  := redis.NewCache(cfg.Redis)
    userSvc   := service.NewUserService(userRepo, emailSvc, cacheSvc)
    grpcSrv   := grpc.NewUserServer(userSvc)
    httpSrv   := http.NewServer(cfg.HTTP, userSvc)

    // Start servers...
}

// APPROACH 2: Wire (google/wire): compile-time code generation
// wire.go: the build tag below must be the first line of the file, before the package clause
//go:build wireinject
func InitializeApp(cfgPath string) (*App, func(), error) {
    wire.Build(
        config.Provider,      // func LoadConfig() (*Config, error)
        database.Provider,    // func Open(*Config) (*sql.DB, error)
        postgres.NewUserRepository,
        smtp.NewEmailService,
        service.NewUserService,
        NewApp,
    )
    return nil, nil, nil // wire generates the real implementation
}
// Run: wire gen ./...  -> generates wire_gen.go

// APPROACH 3: Dig (uber-go/dig): runtime reflection-based DI
container := dig.New()
for _, ctor := range []any{
    config.Load,
    database.Open,
    postgres.NewUserRepository,
    service.NewUserService,
} {
    if err := container.Provide(ctor); err != nil {
        log.Fatal(err)
    }
}
// Missing or failing providers surface here, at runtime
if err := container.Invoke(func(svc *service.UserService) {
    // svc is fully constructed with all deps injected
    startServer(svc)
}); err != nil {
    log.Fatal(err)
}

Trade-offs: Manual wiring is the most debuggable (stack traces show actual constructor calls). Wire generates readable code at compile time (fails fast, zero runtime overhead). Dig is most concise but uses reflection (runtime errors, harder to trace). The Go community generally prefers manual or Wire over Dig for production services.

Quiz

What is the main advantage of Wire (google/wire) over runtime DI frameworks like Dig?
✗ Wire requires less boilerplate code
✓ Wire generates Go code at compile time: dependency mismatches are compile errors, not runtime panics; the generated code is readable and has zero reflection overhead
✗ Wire works with any interface automatically
✗ Wire supports circular dependencies that Dig cannot

When does manual dependency injection in main() become problematic?
✗ It always causes issues; frameworks are always preferred
✓ When the service grows to 20+ dependencies with complex lifecycle management: main() becomes hard to read, and mistakes in construction order or cleanup are easy to introduce
✗ When the service uses gRPC instead of REST
✗ Manual DI does not support interface-based injection

26. How do you achieve zero-downtime deployments for a Go microservice in Kubernetes?

Zero-downtime deployment means in-flight requests complete before old pods terminate, and new pods are ready before traffic is routed to them. This requires coordination between the Go service and Kubernetes lifecycle hooks.

// Key components:

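// 1. Graceful shutdown in the Go service
func main() {
    srv := &http.Server{Addr: ":8080", Handler: router}

    go func() {
        // ErrServerClosed is the normal result of Shutdown, not a failure
        if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
            log.Fatalf("listen: %v", err)
        }
    }()

    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGTERM, os.Interrupt) // k8s sends SIGTERM
    <-quit

    // Give in-flight requests time to complete
    ctx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
    defer cancel()
    if err := srv.Shutdown(ctx); err != nil { // stop accepting; drain active connections
        log.Printf("forced shutdown: %v", err)
    }
    db.Close() // close DB connections cleanly (db is opened elsewhere in the service)
    log.Println("shutdown complete")
}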

// 2. Kubernetes deployment configuration
// deployment.yaml (relevant sections):
// spec.template.spec:
//   terminationGracePeriodSeconds: 35  # pod-level; must exceed preStop sleep + shutdown timeout (5s + 25s)
//   containers:
//   - lifecycle:
//       preStop:
//         exec:
//           command: ["sleep", "5"]
//     # SIGTERM is sent AFTER preStop completes

// 3. Rolling update strategy
// strategy:
//   type: RollingUpdate
//   rollingUpdate:
//     maxSurge: 1        # start 1 new pod before killing old
//     maxUnavailable: 0  # never kill old until new is Ready

// 4. Readiness probe: k8s only sends traffic when this returns 200
// readinessProbe:
//   httpGet:
//     path: /readyz
//     port: 8080
//   initialDelaySeconds: 5   # wait for startup
//   periodSeconds: 5
//   failureThreshold: 3

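preStop hook timing: when Kubernetes terminates a pod, it begins the termination sequence and removes the pod from the Service endpoints at the same time. Endpoint removal takes a moment (typically a few seconds) to propagate to kube-proxy and ingress controllers, so new requests can still arrive. Because SIGTERM is sent only after the preStop hook returns, the preStop sleep 5 delays the graceful shutdown until traffic has stopped arriving. The grace period clock starts before the hook runs, so terminationGracePeriodSeconds must cover the sleep plus the shutdown timeout.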

Quiz

Why is a preStop sleep hook needed in Kubernetes for zero-downtime deployments?
✗ To give the database time to close connections
✓ Kubernetes removes the pod from Service endpoints and sends SIGTERM simultaneously, and there is a propagation delay before traffic stops; the sleep ensures the shutdown starts after traffic has drained
✗ The Go runtime needs time to stop the GC before shutting down
✗ preStop hooks prevent pods from being killed before their TTL

What does 'maxUnavailable: 0' in a Kubernetes RollingUpdate strategy ensure?
✗ No pods are ever unavailable during the deployment
✓ Old pods are never terminated until new pods pass their readiness probe, ensuring at least the original replica count is always serving traffic
✗ All replicas are updated simultaneously
✗ The deployment rolls back if any pod becomes unavailable

27. How do generics in Go 1.18+ enable better system design and what are the trade-offs?

Go generics allow writing type-safe, reusable data structures and algorithms without code duplication or losing type information through interfaces. The key use cases: generic data structures, result/option types, and typed collections.

// Generic Result type: eliminates panic-or-nil patterns
type Result[T any] struct {
    value T
    err   error
}

func Ok[T any](v T) Result[T]    { return Result[T]{value: v} }
func Err[T any](e error) Result[T] { return Result[T]{err: e} }

func (r Result[T]) Unwrap() (T, error) { return r.value, r.err }
func (r Result[T]) IsOk() bool         { return r.err == nil }

// Generic set (unordered: backed by a map, so iteration order is not defined)
type Set[T comparable] struct {
    items map[T]struct{}
}

func NewSet[T comparable]() *Set[T] {
    return &Set[T]{items: make(map[T]struct{})}
}
func (s *Set[T]) Add(v T) { s.items[v] = struct{}{} }
func (s *Set[T]) Has(v T) bool { _, ok := s.items[v]; return ok }
func (s *Set[T]) Len() int { return len(s.items) }

// Generic Map/Filter/Reduce: functional pipeline without reflect
func Map[T, U any](slice []T, fn func(T) U) []U {
    result := make([]U, len(slice))
    for i, v := range slice { result[i] = fn(v) }
    return result
}

func Filter[T any](slice []T, pred func(T) bool) []T {
    var result []T
    for _, v := range slice {
        if pred(v) { result = append(result, v) }
    }
    return result
}

// Usage
users := []User{{ID: 1, Active: true}, {ID: 2, Active: false}}
activeIDs := Map(
    Filter(users, func(u User) bool { return u.Active }),
    func(u User) int { return u.ID },
)
// [1]: type-safe, no reflection; each call still allocates its result slice

Trade-offs: generics remove duplicated code and type assertions, but they slightly increase compile time and can make signatures harder to read. The Go compiler uses GC shape stenciling with dictionaries: types with the same underlying memory layout (all pointer types, for example) share one instantiation, which keeps binaries smaller than full C++-style monomorphisation at the cost of some indirection at run time. Avoid generics for simple cases: if a plain interface satisfies the requirement, prefer it.
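
The Map and Filter helpers above have a natural third companion. The sketch below adds a `Reduce` in the same style; the `Order` type and its fields are assumptions made for the usage example.

```go
package main

import "fmt"

// Reduce folds slice into a single value, starting from init and applying
// fn to the accumulator and each element in order.
func Reduce[T, A any](slice []T, init A, fn func(A, T) A) A {
    acc := init
    for _, v := range slice {
        acc = fn(acc, v)
    }
    return acc
}

// Order is a hypothetical type used only for this example.
type Order struct {
    ID         int
    TotalCents int64
    Paid       bool
}

func main() {
    orders := []Order{
        {ID: 1, TotalCents: 1500, Paid: true},
        {ID: 2, TotalCents: 900, Paid: false},
        {ID: 3, TotalCents: 2500, Paid: true},
    }
    // Sum the totals of paid orders; the accumulator type (int64) differs
    // from the element type (Order), which is why Reduce has two type parameters.
    revenue := Reduce(orders, int64(0), func(sum int64, o Order) int64 {
        if o.Paid {
            return sum + o.TotalCents
        }
        return sum
    })
    fmt.Println(revenue)
}
```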

Quiz

What is the primary benefit of the generic Map/Filter pattern over using reflect or interface{}?
✗ Generic functions are always 10x faster than interface-based ones
✓ The return type is known at compile time: callers get type-safe results without type assertions, and the compiler catches type mismatches
✗ Generics eliminate all memory allocations
✗ Generic functions are inlined by the compiler in all cases

What Go type constraint must T satisfy to be used as a map key in a generic data structure?
✗ any: all types can be map keys
✓ comparable: types that support == and != operators
✗ ordered: types that support < and >
✗ hashable: types that implement a Hash() method

28. How do you use test coverage meaningfully in Go — beyond just a percentage?

Test coverage in Go (go test -cover) reports which statements were executed during tests. But 80% coverage can still miss critical paths. Experienced engineers use coverage to find untested branches, not to chase a number.

// Run coverage
// go test -cover ./...
// go test -coverprofile=cover.out ./...
// go tool cover -html=cover.out   # visual HTML report
// go tool cover -func=cover.out   # per-function percentages

// Useful: find UNCOVERED branches visually
// Red lines in cover -html are untested paths

// Go's cover tool has no built-in pragma for excluding code.
// Document paths that are intentionally left untested instead:
func panicOnInvariantViolation(cond bool, msg string) {
    if !cond {
        // This code path requires a programming error to hit
        // It's fine to leave untested
        panic(msg)
    }
}

// Testing edge cases found in coverage report
func parsePort(s string) (int, error) {
    n, err := strconv.Atoi(s)
    if err != nil {
        return 0, fmt.Errorf("port %q is not a number: %w", s, err) // branch 1
    }
    if n < 1 || n > 65535 {
        return 0, fmt.Errorf("port %d out of range [1, 65535]", n)   // branch 2
    }
    return n, nil // branch 3
}

// Table-driven test to hit all branches
func TestParsePort(t *testing.T) {
    tests := []struct {
        input   string
        want    int
        wantErr bool
    }{
        {"8080", 8080, false},    // happy path (branch 3)
        {"abc",  0,    true},     // not a number (branch 1)
        {"0",    0,    true},     // too low (branch 2)
        {"65536", 0,   true},     // too high (branch 2)
        {"1",    1,    false},    // boundary minimum
        {"65535", 65535, false},  // boundary maximum
    }
    // ... test loop
}

// CI enforcement (fail the build when total coverage is below 70%):
// go test -coverprofile=cover.out ./...
// go tool cover -func=cover.out | awk '/^total:/ { sub("%", "", $3); if ($3 + 0 < 70) exit 1 }'
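
Where shell one-liners are hard to maintain, the same threshold check can be written in Go. This is a sketch under assumptions: `totalCoverage` is a hypothetical helper, the sample report is made up, and a real CI tool would read the output of `go tool cover -func` from stdin and exit non-zero on failure.

```go
package main

import (
    "bufio"
    "fmt"
    "strconv"
    "strings"
)

// totalCoverage extracts the percentage from the "total:" line printed by
// "go tool cover -func", e.g. "total:  (statements)  85.7%".
func totalCoverage(report string) (float64, error) {
    sc := bufio.NewScanner(strings.NewReader(report))
    for sc.Scan() {
        fields := strings.Fields(sc.Text())
        if len(fields) == 3 && fields[0] == "total:" {
            return strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
        }
    }
    return 0, fmt.Errorf("no total line found")
}

func main() {
    // Made-up sample of "go tool cover -func" output.
    report := "example.com/app/port.go:10:\tparsePort\t100.0%\n" +
        "total:\t\t\t\t(statements)\t85.7%\n"

    const threshold = 70.0
    pct, err := totalCoverage(report)
    if err != nil {
        fmt.Println("coverage check failed:", err)
        return
    }
    if pct < threshold {
        fmt.Printf("coverage %.1f%% is below the %.0f%% threshold\n", pct, threshold)
        return
    }
    fmt.Printf("coverage %.1f%% meets the %.0f%% threshold\n", pct, threshold)
}
```

Parsing the number avoids the pitfalls of pattern-matching on digits, where values such as 7.5% or 100.0% are easy to misclassify.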

Quiz

What does 'go tool cover -html=cover.out' show that the percentage alone does not?
✗ It shows which goroutines ran during tests
✓ It shows exactly which statements were NOT executed, making it easy to identify missing test cases for specific code paths
✗ It shows the performance overhead of each tested function
✗ It identifies test files that have no assertions

Why is achieving 100% test coverage not necessarily a good goal?
✗ It takes too long to achieve
✓ Coverage measures statements executed, not scenarios tested: 100% coverage can be achieved with tests that make no assertions; meaningful coverage focuses on testing all decision branches and edge cases
✗ The Go compiler cannot handle test coverage over 95%
✗ 100% coverage always indicates over-testing

29. What are the best practices for designing Protocol Buffer schemas in Go microservices?

Protobuf schema design has long-term consequences — once published, breaking changes require coordinated version bumps across all consumers. Good schema design minimises future pain.

// Best practices:

// 1. Use well-known types for common data
import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/wrappers.proto"; // for nullable primitives (proto3 "optional" is a lighter alternative)

message Order {
    string order_id = 1;  // UUIDs as strings, not int64
    google.protobuf.Timestamp created_at = 2; // not int64 unix
    google.protobuf.Duration  processing_time = 3;
    google.protobuf.StringValue discount_code = 4; // nullable string
    repeated OrderItem items = 5;
    OrderStatus status = 6;
}

// 2. Use enums with an UNSPECIFIED zero value
enum OrderStatus {
    ORDER_STATUS_UNSPECIFIED = 0; // default/unknown: must be 0
    ORDER_STATUS_PENDING     = 1;
    ORDER_STATUS_PAID        = 2;
    ORDER_STATUS_SHIPPED     = 3;
    ORDER_STATUS_CANCELLED   = 4;
}

// 3. OneOf for discriminated unions
message PaymentMethod {
    oneof method {
        CreditCard credit_card = 1;
        BankTransfer bank_transfer = 2;
        CryptoCurrency crypto = 3;
    }
}

// 4. Avoid nested types; prefer separate top-level messages
// Bad: message User { message Address { ... } Address address = 5; }
// Good: message Address { ... }  message User { Address address = 5; }

// 5. Name convention: snake_case fields, PascalCase messages
// 6. Namespace your protos: package company.service.v1;
// 7. One service per .proto file; one message per concern

Quiz

Why should Protobuf enums always have a zero-value entry named *_UNSPECIFIED?
✗ Proto3 requires a default field for all enum types
✓ Proto3 initialises missing enum fields to 0: if 0 means a real value like PENDING, old clients sending an unset enum field would be misinterpreted as PENDING; UNSPECIFIED explicitly communicates 'not set'
✗ The Go protobuf compiler treats 0 as an error code
✗ UNSPECIFIED is required for forward compatibility with proto4

What is the purpose of 'google.protobuf.StringValue' instead of a plain string field?
✗ StringValue supports unicode characters that string cannot
✓ A plain string field cannot distinguish between an empty string and a field that was not set; StringValue is a wrapper that can be nil, representing 'not provided'
✗ StringValue fields are encrypted in transit
✗ StringValue is required for fields that appear in oneof

30. How do you implement safe retries in Go microservices?

Retries improve resilience but must be done correctly. Retrying non-idempotent operations (POST create) without idempotency keys causes duplicate records. Retrying without backoff causes a thundering herd. Retrying indefinitely causes cascading failure.

// Safe retry with exponential backoff and jitter
type RetryConfig struct {
    MaxAttempts int
    InitialWait time.Duration
    MaxWait     time.Duration
    Multiplier  float64
}

func Retry(ctx context.Context, cfg RetryConfig, fn func() error) error {
    wait := cfg.InitialWait
    for attempt := 1; ; attempt++ {
        err := fn()
        if err == nil { return nil }

        // Don't retry permanent errors
        if !isRetryable(err) { return err }

        if attempt >= cfg.MaxAttempts {
            return fmt.Errorf("after %d attempts: %w", attempt, err)
        }

        // Exponential backoff with jitter
        sleep := wait
        if half := int64(wait / 2); half > 0 { // rand.Int63n panics when n <= 0
            sleep += time.Duration(rand.Int63n(half))
        }
        if sleep > cfg.MaxWait { sleep = cfg.MaxWait }

        select {
        case <-time.After(sleep):
        case <-ctx.Done():
            return fmt.Errorf("retry cancelled: %w", ctx.Err())
        }
        wait = time.Duration(float64(wait) * cfg.Multiplier)
        if wait > cfg.MaxWait { wait = cfg.MaxWait } // keep growth bounded
    }
}

func isRetryable(err error) bool {
    // Only retry transient errors
    code := status.Code(err)
    return code == codes.Unavailable ||
        code == codes.DeadlineExceeded ||
        code == codes.ResourceExhausted
}

// Idempotency key for non-idempotent operations
func (s *userServiceServer) CreateUser(
    ctx context.Context, req *pb.CreateUserRequest,
) (*pb.User, error) {
    // Extract idempotency key from metadata
    md, _ := metadata.FromIncomingContext(ctx)
    idempKey := md.Get("idempotency-key")

    // Check if we've seen this key before
    if len(idempKey) > 0 {
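        if cached, err := s.cache.Get(ctx, "idemp:"+idempKey[0]); err == nil {
            var existing pb.User
            if err := proto.Unmarshal(cached, &existing); err != nil {
                // Do not fall through: re-running the create could duplicate the user
                return nil, status.Error(codes.Internal, "internal error")
            }
            return &existing, nil // return cached response
        }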
    }
    // ... create user, cache result with idempotency key
}

Quiz

Why should retries use exponential backoff with jitter instead of fixed intervals?
✗ Fixed intervals are faster
✓ Fixed intervals cause all retrying clients to retry simultaneously; jitter spreads retries over time, preventing a thundering herd that would overwhelm an already-struggling service
✗ Exponential backoff is required by the HTTP specification
✗ Jitter prevents the retry from running on the same goroutine

What is an idempotency key and why is it necessary for retries on mutating operations?
✗ A hash of the request body used to detect duplicate messages
✓ A unique client-generated identifier attached to mutating requests: the server uses it to deduplicate retried requests, ensuring creating a user twice with the same key returns the first result rather than creating a duplicate
✗ A cryptographic signature proving the request is authentic
✗ A server-generated token that must be returned in subsequent requests

31. What are golden file tests in Go and when should you use them?

Golden file tests compare output against a saved reference file. They are ideal for testing complex output (JSON API responses, generated SQL, HTML) where manually writing the expected value in code is tedious and error-prone.

import "github.com/sebdah/goldie/v2"

// Golden file test: output compared against testdata/*.golden
func TestUserToJSON(t *testing.T) {
    // goldie registers its own -update flag (go test -update); no option is needed
    g := goldie.New(t,
        goldie.WithFixtureDir("testdata"),
    )

    user := User{
        ID:        1,
        Name:      "Alice",
        CreatedAt: time.Date(2024, 1, 15, 0, 0, 0, 0, time.UTC),
    }
    data, err := json.MarshalIndent(user, "", "  ")
    if err != nil { t.Fatal(err) }

    // Compares with testdata/TestUserToJSON.golden
    g.Assert(t, "TestUserToJSON", data)
}

// Manual golden file implementation
func assertGolden(t *testing.T, name string, got []byte) {
    t.Helper()
    path := filepath.Join("testdata", name+".golden")

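    if *update {
        // go test -run TestUserToJSON -args -update
        if err := os.MkdirAll("testdata", 0o755); err != nil {
            t.Fatalf("create testdata dir: %v", err)
        }
        if err := os.WriteFile(path, got, 0o644); err != nil {
            t.Fatalf("write golden file: %v", err)
        }
        return
    }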

    want, err := os.ReadFile(path)
    if err != nil {
        t.Fatalf("golden file missing: %s (run with -args -update to create)", path)
    }
    if !bytes.Equal(got, want) {
        diff := diffStrings(string(want), string(got))
        t.Errorf("golden file mismatch:\n%s", diff)
    }
}

var update = flag.Bool("update", false, "update golden files")

// Commit testdata/*.golden files to version control
// Update when intentional output changes:
// go test -run TestUserToJSON -args -update
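
The manual implementation calls a diffStrings helper that is not shown. One possible minimal version is sketched below; it is an assumption, not the original helper, and it compares lines by position rather than computing a true longest-common-subsequence diff, so an inserted line makes every following line show as changed.

```go
package main

import (
    "fmt"
    "strings"
)

// diffStrings reports the lines that differ at the same position in want and
// got. Lines only in want are prefixed with "-", lines only in got with "+".
func diffStrings(want, got string) string {
    w := strings.Split(want, "\n")
    g := strings.Split(got, "\n")
    var b strings.Builder
    for i := 0; i < len(w) || i < len(g); i++ {
        hasW, hasG := i < len(w), i < len(g)
        if hasW && hasG && w[i] == g[i] {
            continue // identical line: nothing to report
        }
        if hasW {
            fmt.Fprintf(&b, "line %d: -%s\n", i+1, w[i])
        }
        if hasG {
            fmt.Fprintf(&b, "line %d: +%s\n", i+1, g[i])
        }
    }
    return b.String()
}

func main() {
    want := "{\n  \"id\": 1,\n  \"name\": \"Alice\"\n}"
    got := "{\n  \"id\": 1,\n  \"name\": \"Bob\"\n}"
    fmt.Print(diffStrings(want, got))
}
```

For large golden files a real diff library gives far more readable output; this version is only meant to make the failure message point at the first changed lines.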

When are golden file tests most useful?

When testing functions that return errors

✗ Try again.

When testing complex structured output (JSON, SQL, generated code) where hard-coding the expected value in Go code would be tedious and hard to read

✓ Correct! Well done.

When testing concurrent code for race conditions

✗ Try again.

When unit testing individual calculation functions

✗ Try again.

How do you update golden files when the output intentionally changes?

Delete the .golden files and rerun the tests

✗ Try again.

Run tests with an -update flag (e.g., go test -args -update) which overwrites .golden files with the current output

✓ Correct! Well done.

Golden files must be manually edited in a text editor

✗ Try again.

The test framework automatically updates them when assertions change

✗ Try again.

32. How do you ensure data consistency across Go microservices without distributed transactions?

Distributed transactions (2PC) are generally avoided in microservices — they couple services and create availability issues. The alternative is eventual consistency through careful design: idempotent consumers, compensating transactions, and the Outbox pattern.

// Pattern: optimistic locking with a version field
// (a stale write is rejected instead of silently overwriting newer data)
type User struct {
    ID      int
    Name    string
    Email   string
    Version int // incremented on each update
}

func (r *UserRepo) UpdateOptimistic(
    ctx context.Context, user *User,
) error {
    result, err := r.db.ExecContext(ctx,
        `UPDATE users
         SET name=$1, email=$2, version=version+1
         WHERE id=$3 AND version=$4`,
        user.Name, user.Email, user.ID, user.Version,
    )
    if err != nil { return err }
    n, err := result.RowsAffected()
    if err != nil { return err }
    if n == 0 {
        return ErrConflict // row missing, or another update changed the version
    }
    user.Version++ // reflect new version
    return nil
}

// Pattern: idempotent event handler with deduplication
type EventHandler struct {
    db    *sql.DB
    dedup *DedupeCache
}

func (h *EventHandler) Handle(ctx context.Context, event Event) error {
    // Use event ID as idempotency key
    if already, _ := h.dedup.Check(ctx, event.ID); already {
        return nil // already processed: safe to ack
    }

    tx, err := h.db.BeginTx(ctx, nil)
    if err != nil { return err }
    defer tx.Rollback()

    // Process the event
    if err := h.applyEvent(ctx, tx, event); err != nil {
        return fmt.Errorf("apply event: %w", err)
    }

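    // Mark event as processed within the same transaction.
    // With a unique key on id, a duplicate insert fails and rolls back the work.
    if _, err := tx.ExecContext(ctx,
        "INSERT INTO processed_events (id) VALUES ($1)", event.ID); err != nil {
        return fmt.Errorf("mark processed: %w", err)
    }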

    if err := tx.Commit(); err != nil {
        return fmt.Errorf("commit: %w", err)
    }
    h.dedup.Set(ctx, event.ID) // populate cache
    return nil
}
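When UpdateOptimistic returns ErrConflict, the caller usually re-reads the row and tries again. The sketch below shows that retry loop against an in-memory store so it runs without a database; memStore and updateWithRetry are hypothetical names introduced here, and the Update method only imitates the "WHERE id=... AND version=..." check from the SQL above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrConflict = errors.New("version conflict")

type User struct {
	ID      int
	Name    string
	Version int
}

// memStore is an in-memory stand-in for the users table.
type memStore struct {
	mu    sync.Mutex
	users map[int]User
}

func (s *memStore) Get(id int) (User, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	u, ok := s.users[id]
	return u, ok
}

// Update succeeds only if the caller saw the current version,
// mirroring "UPDATE ... WHERE id=$1 AND version=$2".
func (s *memStore) Update(u User) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.users[u.ID]
	if !ok || cur.Version != u.Version {
		return ErrConflict
	}
	u.Version++
	s.users[u.ID] = u
	return nil
}

// updateWithRetry re-reads the row and re-applies mutate after each conflict,
// giving up after maxAttempts.
func updateWithRetry(s *memStore, id, maxAttempts int, mutate func(*User)) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		u, ok := s.Get(id)
		if !ok {
			return fmt.Errorf("user %d not found", id)
		}
		mutate(&u)
		err := s.Update(u)
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrConflict) {
			return err
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, ErrConflict)
}

func main() {
	s := &memStore{users: map[int]User{1: {ID: 1, Name: "Alice", Version: 1}}}
	err := updateWithRetry(s, 1, 3, func(u *User) { u.Name = "Alicia" })
	u, _ := s.Get(1)
	fmt.Println(err, u.Name, u.Version)
}
```

The mutate callback must be safe to run more than once, because every retry applies it to a freshly read copy.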

What is optimistic locking and when does it fail?

It assumes writes never conflict and skips all locking; it fails when data is corrupted

✗ Try again.

It assumes conflicts are rare — reads proceed without locks, but an UPDATE checks a version number; if the version changed since the read, the update returns 0 rows (conflict detected, retry required)

✓ Correct! Well done.

It locks the database row for reading but not writing

✗ Try again.

It automatically resolves conflicts using the last-write-wins rule

✗ Try again.

How does storing processed event IDs in the same transaction as the event handler work ensure idempotency?

It prevents duplicate events from being published

✗ Try again.

The event processing and the 'mark as processed' write commit atomically: either both succeed or both fail. A crash before commit rolls back both, so the redelivered event is processed from scratch; a crash after commit but before the ack means the redelivered event is recognised by its stored ID and skipped

✓ Correct! Well done.

Events are deduplicated at the message broker before delivery

✗ Try again.

The version field prevents the same event from being stored twice

✗ Try again.

33. What is the API Gateway pattern and how does it complement Go microservices?

An API Gateway is a single entry point for all client requests. It handles cross-cutting concerns — auth, rate limiting, routing, SSL termination, response aggregation — so individual services don't need to implement them.

// Lightweight Go API gateway (simplified)
type Gateway struct {
    routes map[string]RouteConfig
    auth   *JWTValidator
    limit  *RateLimiter
    log    *slog.Logger
}

type RouteConfig struct {
    Target      string   // upstream URL
    RequireAuth bool
    RateLimit   int      // requests per second
    Methods     []string // allowed HTTP methods
    StripPrefix string
}

func (g *Gateway) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    route, ok := g.matchRoute(r.URL.Path)
    if !ok {
        http.Error(w, "not found", 404); return
    }

    // Auth runs before rate limiting so rejected requests do not consume rate-limit quota
    if route.RequireAuth {
        if _, err := g.auth.Validate(r); err != nil {
            http.Error(w, "unauthorized", 401); return
        }
    }

    // Rate limiting
    if !g.limit.Allow(clientIP(r)) {
        http.Error(w, "rate limited", 429); return
    }

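    // Proxy to upstream (simplified: production code would build one
    // ReverseProxy per route at startup instead of one per request)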
    proxy := httputil.NewSingleHostReverseProxy(mustParseURL(route.Target))
    if route.StripPrefix != "" {
        r.URL.Path = strings.TrimPrefix(r.URL.Path, route.StripPrefix)
    }
    proxy.ServeHTTP(w, r)
}

// In practice: use Kong, Envoy, or AWS API Gateway
// Go excels at writing custom gateway logic as plugins:
// - Kong plugins in Go (go-pdk)
// - Envoy WASM filters in Go (tetratelabs/proxy-wasm-go-sdk)
// - Custom gateway in Go using httputil.ReverseProxy

BFF (Backend for Frontend) pattern: a specialised API gateway for each type of client — mobile BFF, web BFF. Each BFF aggregates calls to multiple services and returns exactly the data its client needs (graph-like queries without GraphQL's complexity). Often written in Go for performance.

What is the primary benefit of an API Gateway in a microservices architecture?

It increases throughput by parallelising requests

✗ Try again.

It provides a single entry point that centralises cross-cutting concerns — auth, rate limiting, SSL, routing — so individual services focus on business logic

✓ Correct! Well done.

It automatically scales services based on traffic

✗ Try again.

It converts REST requests to gRPC automatically

✗ Try again.

What is the Backend for Frontend (BFF) pattern?

A fallback service that handles requests when the primary backend is down

✗ Try again.

A specialised API gateway tailored for a specific type of client — a mobile BFF returns lightweight responses; a web BFF returns richer data — each optimised for its client's needs

✓ Correct! Well done.

A frontend framework written in Go

✗ Try again.

A pattern for caching backend responses in the browser

✗ Try again.

34. What memory leak patterns in Go are not goroutine leaks and how do you detect them?

Go has several memory leak patterns beyond leaked goroutines: long-lived caches without eviction, global maps that grow without bounds, slice backing arrays held by small sub-slices, and finalizers that delay GC.

// LEAK 1: growing global map without eviction
var requestMetrics = map[string]int64{} // grows forever with new paths
// Fix: use a bounded structure (LRU cache, TTL cache)
// cache.NewLRU is a placeholder for whichever LRU implementation you use
var metricsCache = cache.NewLRU[string, int64](1000) // bounded

// LEAK 2: slice sub-slice retaining large backing array
func getHeader(data []byte) []byte {
    return data[:8]  // BAD: keeps ALL of 'data' alive
}
func getHeaderSafe(data []byte) []byte {
    header := make([]byte, 8)
    copy(header, data[:8]) // GOOD: independent allocation
    return header
}

// LEAK 3: time.Ticker without Stop()
// Already discussed but worth repeating: before Go 1.23 an unstopped Ticker is
// never garbage collected, and a goroutine ranging over ticker.C never exits

// LEAK 4: cgo-allocated memory (outside GC's purview)
// Must manually call C.free() for C.CString(), C.malloc(), etc.

// DETECTION:
// A. runtime.ReadMemStats: watch HeapAlloc over time
func monitorHeap(ctx context.Context) {
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done(): return
        case <-ticker.C:
            var stats runtime.MemStats
            runtime.ReadMemStats(&stats)
            slog.Info("heap",
                "alloc_mb",    stats.HeapAlloc / (1 << 20),
                "sys_mb",      stats.HeapSys / (1 << 20),
                "num_gc",      stats.NumGC,
                "goroutines",  runtime.NumGoroutine(),
            )
        }
    }
}

// B. pprof heap profile comparison
// curl http://localhost:6060/debug/pprof/heap > heap1.out
// # wait a while
// curl http://localhost:6060/debug/pprof/heap > heap2.out
// go tool pprof -base heap1.out heap2.out
// > top10  # shows objects that grew between snapshots
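The sub-slice leak (LEAK 2) can be observed directly through cap. The sketch below contrasts three variants; headerClipped is an extra case added for illustration, showing that a full slice expression limits capacity but still references the parent array, so it does not fix the retention problem.

```go
package main

import "fmt"

// header returns a view into data: the whole backing array stays reachable.
func header(data []byte) []byte { return data[:8] }

// headerClipped limits capacity with a full slice expression. Appends can no
// longer spill into the parent array, but the slice still points at it, so
// the large array is retained just the same.
func headerClipped(data []byte) []byte { return data[:8:8] }

// headerCopy allocates 8 fresh bytes, so data can be collected once the
// caller drops it.
func headerCopy(data []byte) []byte {
	h := make([]byte, 8)
	copy(h, data[:8])
	return h
}

func main() {
	data := make([]byte, 1<<20)
	fmt.Println(cap(header(data)), cap(headerClipped(data)), cap(headerCopy(data)))
	// prints: 1048576 8 8
}
```

A capacity much larger than the length is a quick hint in a debugger or log line that a small slice may be pinning a large allocation.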

How do you detect that a sub-slice is holding an unexpectedly large backing array in memory?

The race detector reports it

✗ Try again.

A heap profile taken at two points in time shows the backing array in the allocation tree — the sub-slice appears small but the parent allocation is large

✓ Correct! Well done.

The GC automatically reports retained backing arrays

✗ Try again.

runtime.ReadMemStats shows the backing array separately from the sub-slice

✗ Try again.

What is the correct way to prevent a small sub-slice from holding a large backing array alive?

Use append() instead of the slice operator

✗ Try again.

Use copy() to copy the needed elements into a new, independently allocated slice — breaking the reference to the original backing array

✓ Correct! Well done.

Call runtime.GC() after taking the sub-slice

✗ Try again.

Use sync.Pool for the backing array

✗ Try again.

35. How do CQRS and event sourcing apply to Go microservice architecture?

CQRS (Command Query Responsibility Segregation) separates reads and writes into separate models. Event Sourcing stores every state change as an event — the current state is derived by replaying events. Both patterns appear in high-scale Go systems.

// CQRS: separate command and query handlers
// Command: mutates state, returns error
type CreateOrderCommand struct {
    UserID    string
    ProductID string
    Quantity  int
}

type OrderCommandHandler struct{ eventStore EventStore }

func (h *OrderCommandHandler) Handle(
    ctx context.Context, cmd CreateOrderCommand,
) error {
    event := OrderCreatedEvent{
        OrderID:   uuid.New().String(),
        UserID:    cmd.UserID,
        ProductID: cmd.ProductID,
        Quantity:  cmd.Quantity,
        CreatedAt: time.Now(),
    }
    return h.eventStore.Append(ctx, event.OrderID, event)
}

// Query: reads from a projection (read model)
type OrderQueryHandler struct{ readDB *sql.DB }

func (h *OrderQueryHandler) GetOrder(
    ctx context.Context, orderID string,
) (*OrderView, error) {
    // Read from denormalised view optimised for queries
    row := h.readDB.QueryRowContext(ctx,
        "SELECT id, status, total, user_name FROM order_views WHERE id=$1",
        orderID)
    var view OrderView
    if err := row.Scan(&view.ID, &view.Status, &view.Total, &view.UserName); err != nil {
        return nil, err // sql.ErrNoRows when the order does not exist
    }
    return &view, nil
}

// Event Sourcing: replay events to reconstruct state
type Order struct {
    ID      string
    Status  string
    Items   []OrderItem
    version int
}

func ReplayOrder(ctx context.Context, store EventStore, id string) (*Order, error) {
    events, err := store.Load(ctx, id)
    if err != nil { return nil, err }

    order := &Order{ID: id}
    for _, event := range events {
        order.apply(event)   // deterministically mutate state
        order.version++
    }
    return order, nil
}
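ReplayOrder relies on an apply method that is not shown. One common way to write it is a type switch over event types, sketched below with in-memory events. The event types, field names, and the map-based Items field are assumptions made for this example and differ from the Order struct above.

```go
package main

import "fmt"

// Event is any domain event; the concrete types below are hypothetical.
type Event any

type OrderCreated struct{ OrderID string }
type ItemAdded struct {
	SKU string
	Qty int
}
type OrderCancelled struct{}

type Order struct {
	ID      string
	Status  string
	Items   map[string]int // SKU -> quantity
	version int
}

// apply mutates state for one event. It must be deterministic and free of
// side effects, because it runs again on every replay.
func (o *Order) apply(e Event) {
	switch ev := e.(type) {
	case OrderCreated:
		o.ID = ev.OrderID
		o.Status = "created"
	case ItemAdded:
		if o.Items == nil {
			o.Items = map[string]int{}
		}
		o.Items[ev.SKU] += ev.Qty
	case OrderCancelled:
		o.Status = "cancelled"
	}
}

// replay rebuilds an order from its full event history.
func replay(events []Event) *Order {
	o := &Order{}
	for _, e := range events {
		o.apply(e)
		o.version++
	}
	return o
}

func main() {
	o := replay([]Event{
		OrderCreated{OrderID: "o-1"},
		ItemAdded{SKU: "sku-1", Qty: 1},
		ItemAdded{SKU: "sku-1", Qty: 2},
		OrderCancelled{},
	})
	fmt.Println(o.ID, o.Status, o.Items["sku-1"], o.version)
}
```

Unknown event types fall through the switch silently here; a stricter variant could return an error so that a missing case is noticed during replay.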

What is the main benefit of event sourcing over traditional state storage?

Event sourcing is always faster than SQL databases

✗ Try again.

The full history of every state change is preserved — you can audit what happened, replay events to rebuild state, project events into different read models, and debug by replaying to any point in time

✓ Correct! Well done.

Event stores have lower storage requirements than relational databases

✗ Try again.

Events cannot be corrupted or lost unlike database rows

✗ Try again.

In CQRS, why do the read and write models use different data stores or schemas?

Regulatory compliance requires separate read/write databases

✗ Try again.

Write models are optimised for consistency and transactional correctness; read models are optimised for query patterns (denormalised, pre-joined) — optimising one forces trade-offs in the other

✓ Correct! Well done.

CQRS requires different database technologies for reads and writes

✗ Try again.

Separate stores prevent read traffic from blocking write transactions

✗ Try again.

36. What is chaos engineering and how do Go teams apply it to test microservice resilience?

Chaos engineering deliberately injects failures into a running system to discover weaknesses before they cause production outages. Go services are validated against: network failures, slow dependencies, pod restarts, and resource exhaustion.

// Chaos testing in Go: inject failures in tests

// Fault injection via interface
type FaultInjector struct {
    next     UserRepository
    failRate float64 // 0.0 to 1.0
    latency  time.Duration
}

func (f *FaultInjector) FindByID(ctx context.Context, id int) (*User, error) {
    // Inject artificial latency
    if f.latency > 0 {
        select {
        case <-time.After(f.latency):
        case <-ctx.Done(): return nil, ctx.Err()
        }
    }
    // Inject random failures
    if rand.Float64() < f.failRate {
        return nil, errors.New("injected fault: database unavailable")
    }
    return f.next.FindByID(ctx, id)
}

// Test service behaviour under 50% failure rate
func TestServiceUnderFaults(t *testing.T) {
    repo := &fakeUserRepo{users: testUsers}
    faulty := &FaultInjector{
        next:     repo,
        failRate: 0.5,
        latency:  100 * time.Millisecond,
    }
    svc := NewUserService(faulty)

    // Test that service handles partial failures gracefully
    successCount := 0
    for i := 0; i < 100; i++ {
        user, err := svc.GetUserWithFallback(context.Background(), 1)
        if err == nil && user != nil { successCount++ }
    }
    // With 50% fault rate and fallback, expect at least 90% success
    if float64(successCount) < 90 {
        t.Errorf("success rate %d%% too low with fallback", successCount)
    }
}

// Tools for chaos in production:
// - Chaos Monkey (Netflix): terminates random instances
// - Litmus (CNCF): k8s-native chaos experiments
// - Gremlin: cloud chaos-as-a-service
// - k6 + fault injection scenarios

What is the primary goal of chaos engineering?

To deliberately crash production systems to train engineers

✗ Try again.

To discover system weaknesses by injecting controlled failures in a running system — before those weaknesses cause unexpected production outages

✓ Correct! Well done.

To test that monitoring alerts fire correctly

✗ Try again.

To measure the maximum throughput of a service under load

✗ Try again.

Why is testing with a FaultInjector in unit tests valuable before doing production chaos experiments?

Unit tests with fault injection are faster than chaos experiments

✗ Try again.

It verifies that the service's resilience mechanisms (circuit breakers, fallbacks, retries) work correctly before exposing real users to failures

✓ Correct! Well done.

FaultInjector tests count as chaos engineering for compliance purposes

✗ Try again.

Unit tests can inject faults that production chaos tools cannot

✗ Try again.

37. What is contract testing and how does it apply to Go microservices?

Contract testing verifies that services honour their agreed API contracts without requiring a full integration test environment. In a microservices system, consumer-driven contract testing (Pact) lets service consumers define the API shape they expect, and providers verify they match.

// Pact consumer test (user-service-client)
// Defines what the consumer expects from the user service
func TestUserServiceContract_GetUser(t *testing.T) {
    // Define the expected interaction
    pact := dsl.Pact{
        Consumer: "order-service",
        Provider: "user-service",
    }
    defer pact.Teardown()

    pact.
        AddInteraction().
        Given("user 1 exists").
        UponReceiving("a request for user 1").
        WithRequest(dsl.Request{
            Method:  "GET",
            Path:    "/users/1",
            Headers: dsl.MapMatcher{"Accept": dsl.String("application/json")},
        }).
        WillRespondWith(dsl.Response{
            Status: 200,
            Body: dsl.Match(User{}), // matches structure, not values
        })

    if err := pact.Verify(func() error {
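        // pact-go v1 exposes the mock server's port; build the base URL from it
        client := NewUserClient(fmt.Sprintf("http://localhost:%d", pact.Server.Port))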
        user, err := client.GetUser(context.Background(), 1)
        if err != nil { return err }
        if user.ID != 1 { return fmt.Errorf("expected ID 1, got %d", user.ID) }
        return nil
    }); err != nil {
        t.Fatal(err)
    }
}

// Protobuf contracts are already self-documenting
// For gRPC: test that the proto file matches what clients use
// Use protoc-gen-validate for field-level validation in the schema
message CreateUserRequest {
    string email = 1 [(validate.rules).string.email = true];
    string name  = 2 [(validate.rules).string = {min_len: 1, max_len: 100}];
}

gRPC contract testing: Protobuf provides the contract itself. The pattern is different — ensure the generated Go code from the consumer's proto copy matches the provider's. Tools like buf breaking detect breaking changes in .proto files, acting as automated contract enforcement.

What is the key difference between contract testing and integration testing?

Contract testing is faster because it doesn't use a real network

✗ Try again.

Contract testing verifies only the API contract (structure and behaviour) between two services, not the full system — it can run in isolation without deploying all services to a shared environment

✓ Correct! Well done.

Integration testing is more thorough and makes contract testing unnecessary

✗ Try again.

Contract tests run in production; integration tests run in staging

✗ Try again.

How does Protobuf help with contract testing in gRPC services?

Protobuf automatically runs contract tests on every compilation

✗ Try again.

The .proto file IS the contract — tools like 'buf breaking' detect breaking changes, and the generated code enforces the contract; consumers and providers use the same .proto definitions

✓ Correct! Well done.

Protobuf generates Pact-compatible consumer tests automatically

✗ Try again.

Protobuf version numbers serve as contract version identifiers

✗ Try again.

38. What makes a Go microservice horizontally scalable and what patterns break scaling?

A horizontally scalable service can handle more load by adding replicas — each replica is identical and stateless. Go services are well-suited to horizontal scaling due to small memory footprint, but certain patterns break scalability.

Scalable vs Non-Scalable Patterns
Pattern	Scalable?	Problem / Fix
Global in-memory cache (map/slice)	NO	Each replica has its own cache — inconsistent reads. Fix: use Redis or Memcached
In-memory session store	NO	Sessions are lost on pod restart. Fix: Redis-backed sessions
Background goroutine per replica	Caution	N replicas run N workers — duplicated work. Fix: use a job queue with exactly-one semantics
Stateless request handler	YES	Each request fully handled without shared state
Externally stored state (DB, cache)	YES	State survives restarts; any replica can handle any request
Distributed locks (Redis SETNX)	YES	Enables exactly-one execution across replicas for cron jobs
Idempotent API	YES	Clients safely retry on any replica

// ANTI-PATTERN: in-memory state that breaks horizontal scaling
var sessionStore = map[string]Session{} // lost on pod restart!

// CORRECT: external session storage
type SessionStore struct{ redis *redis.Client }

func (s *SessionStore) Get(ctx context.Context, id string) (*Session, error) {
    data, err := s.redis.Get(ctx, "sess:"+id).Bytes()
    if err != nil { return nil, err }
    var sess Session
    return &sess, json.Unmarshal(data, &sess)
}

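// Distributed cron: only one replica should run a job
func runIfLeader(ctx context.Context, rdb *redis.Client, jobFn func()) {
    key := "job:lock:daily-report"
    token := uuid.New().String() // identifies this holder
    acquired, err := rdb.SetNX(ctx, key, token, 5*time.Minute).Result()
    if err != nil || !acquired {
        return // another replica has the lock
    }
    // Release only our own lock: if the job outlives the TTL, another replica
    // may hold the key by now and a blind DEL would remove its lock.
    defer releaseLock.Run(ctx, rdb, []string{key}, token)
    jobFn()
}

var releaseLock = redis.NewScript(`
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
end
return 0`)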

Why does storing session data in a Go service's in-memory map break horizontal scaling?

In-memory maps are not thread-safe

✗ Try again.

Each replica has its own independent map — a request routed to a different replica won't find the session, forcing sticky sessions or causing authentication failures

✓ Correct! Well done.

Maps have a maximum size that prevents storing many sessions

✗ Try again.

Go's garbage collector clears map entries under memory pressure

✗ Try again.

What pattern enables exactly-one execution of a cron job across multiple replicas?

Assigning the cron job to the replica with the lowest CPU usage

✗ Try again.

Distributed locking (e.g., Redis SETNX with TTL) — only the replica that acquires the lock executes the job; others detect the lock is held and skip

✓ Correct! Well done.

Running cron jobs only on the primary replica using pod labels

✗ Try again.

Using a Kubernetes CronJob resource which guarantees single execution

✗ Try again.

39. How do you implement configuration hot-reloading in a Go service without restart?

Some configuration changes — feature flags, rate limits, log levels — should not require a pod restart. Hot-reloading reads new config from a file or config service and atomically swaps the configuration pointer.

// Atomic configuration pointer: reads and writes are goroutine-safe
type DynamicConfig struct {
    current atomic.Pointer[Config]
}

func NewDynamicConfig(initial *Config) *DynamicConfig {
    dc := &DynamicConfig{}
    dc.current.Store(initial)
    return dc
}

// Get returns the current config: zero allocation, lock-free
func (dc *DynamicConfig) Get() *Config {
    return dc.current.Load()
}

// Reload atomically swaps to a new config
func (dc *DynamicConfig) Reload(path string) error {
    data, err := os.ReadFile(path)
    if err != nil { return err }
    var cfg Config
    if err := yaml.Unmarshal(data, &cfg); err != nil { return err }
    if err := cfg.Validate(); err != nil { return err }
    dc.current.Store(&cfg) // atomic swap
    slog.Info("config reloaded", "path", path)
    return nil
}

// SIGHUP-triggered reload
func watchConfigSignal(dc *DynamicConfig, path string) {
    sighup := make(chan os.Signal, 1)
    signal.Notify(sighup, syscall.SIGHUP)
    for range sighup {
        if err := dc.Reload(path); err != nil {
            slog.Error("config reload failed", "err", err)
            // Keep running with old config
        }
    }
}

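// File watcher approach
func watchConfigFile(ctx context.Context, dc *DynamicConfig, path string) error {
    watcher, err := fsnotify.NewWatcher()
    if err != nil { return err }
    defer watcher.Close()
    if err := watcher.Add(path); err != nil { return err }
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case event, ok := <-watcher.Events:
            if !ok { return nil } // watcher closed
            if event.Has(fsnotify.Write) {
                time.Sleep(100 * time.Millisecond) // crude debounce: let the writer finish
                if err := dc.Reload(path); err != nil {
                    slog.Error("config reload failed", "err", err) // keep old config
                }
            }
        case err, ok := <-watcher.Errors:
            if !ok { return nil }
            slog.Error("config watcher error", "err", err)
        }
    }
}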
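The validate-then-swap behaviour can be checked without files or signals. The sketch below is a stdlib-only variant that parses JSON instead of YAML; the Config fields and the ReloadFrom method are assumptions for illustration. It shows that a rejected reload leaves the active config untouched and that a previously obtained snapshot never changes.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync/atomic"
)

// Config fields are hypothetical examples of hot-reloadable settings.
type Config struct {
	RateLimit int    `json:"rate_limit"`
	LogLevel  string `json:"log_level"`
}

func (c *Config) Validate() error {
	if c.RateLimit <= 0 {
		return errors.New("rate_limit must be positive")
	}
	return nil
}

type DynamicConfig struct {
	current atomic.Pointer[Config]
}

func NewDynamicConfig(initial *Config) *DynamicConfig {
	dc := &DynamicConfig{}
	dc.current.Store(initial)
	return dc
}

func (dc *DynamicConfig) Get() *Config { return dc.current.Load() }

// ReloadFrom parses and validates data, then swaps the pointer.
// On any error the previous config stays active.
func (dc *DynamicConfig) ReloadFrom(data []byte) error {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return fmt.Errorf("parse: %w", err)
	}
	if err := cfg.Validate(); err != nil {
		return fmt.Errorf("validate: %w", err)
	}
	dc.current.Store(&cfg)
	return nil
}

func main() {
	dc := NewDynamicConfig(&Config{RateLimit: 10, LogLevel: "info"})
	snapshot := dc.Get()

	fmt.Println(dc.ReloadFrom([]byte(`{"rate_limit": 0}`)) != nil) // rejected
	fmt.Println(dc.ReloadFrom([]byte(`{"rate_limit": 50, "log_level": "debug"}`)))
	fmt.Println(snapshot.RateLimit, dc.Get().RateLimit)
}
```

Handlers should call Get once per request and use that snapshot throughout, so a reload in the middle of a request cannot mix old and new values. Treat the returned Config as read-only.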

Why is atomic.Pointer used for hot-reloading configuration instead of a sync.RWMutex?

atomic.Pointer works with any type; sync.RWMutex only works with primitives

✗ Try again.

atomic.Pointer provides lock-free reads — any goroutine reading the config gets a consistent snapshot without blocking writers or paying mutex overhead on the read path

✓ Correct! Well done.

sync.RWMutex would prevent configuration changes from taking effect

✗ Try again.

atomic.Pointer automatically validates the new configuration

✗ Try again.

Why should new configuration be validated before calling dc.current.Store()?

Store() panics on invalid configuration

✗ Try again.

An invalid config swapped atomically would immediately affect all requests — validation before the swap ensures only valid configuration ever becomes active

✓ Correct! Well done.

Validation is required by the atomic.Pointer API contract

✗ Try again.

The old configuration cannot be restored once Store() is called

✗ Try again.

40. How do you architect Go services for maximum testability at the package level?

Testable architecture is not an afterthought — it follows from correct dependency direction. The key principle: business logic in inner packages depends on abstractions (interfaces), not on concrete infrastructure. This enables unit testing without a database or network.

// Layered architecture with interfaces at boundaries

// domain/user.go: inner layer, no infrastructure imports
package domain

type UserRepository interface {
    FindByID(ctx context.Context, id int) (*User, error)
    Save(ctx context.Context, u *User) error
}

type EmailSender interface {
    Send(ctx context.Context, to, subject, body string) error
}

type UserService struct {
    repo  UserRepository
    email EmailSender
}

// All business logic here, easily unit testable
func (s *UserService) Register(ctx context.Context, req RegisterRequest) (*User, error) {
    // validation and business rules; no direct DB/network calls
    if err := validateEmail(req.Email); err != nil {
        return nil, fmt.Errorf("invalid email: %w", err)
    }
    u := &User{Email: req.Email, Name: req.Name}
    if err := s.repo.Save(ctx, u); err != nil {
        return nil, fmt.Errorf("saving user: %w", err)
    }
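    // A failed welcome email should not fail registration: log and continue
    if err := s.email.Send(ctx, u.Email, "Welcome!", "Thanks for joining."); err != nil {
        slog.Warn("welcome email failed", "err", err)
    }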
    return u, nil
}

// domain/user_test.go: fast unit test, no DB or network
func TestUserService_Register(t *testing.T) {
    tests := []struct {
        name    string
        email   string
        repoErr error
        wantErr bool
    }{
        {"valid", "alice@example.com", nil, false},
        {"invalid email", "notanemail", nil, true},
        {"db error", "bob@example.com", errors.New("db down"), true},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            svc := &UserService{
                repo:  &stubRepo{saveErr: tt.repoErr},
                email: &noopEmailer{},
            }
            _, err := svc.Register(context.Background(),
                RegisterRequest{Email: tt.email})
            if (err != nil) != tt.wantErr {
                t.Errorf("got err=%v, wantErr=%v", err, tt.wantErr)
            }
        })
    }
}

What is the dependency rule in clean/layered architecture?

Dependencies always flow from inner packages to outer packages

✗ Try again.

Inner layers (domain/business logic) must not depend on outer layers (infrastructure) — dependencies flow inward; outer layers depend on inner layer interfaces

✓ Correct! Well done.

All packages depend on a central shared types package

✗ Try again.

Infrastructure packages define interfaces that business logic implements

✗ Try again.

Why does placing interfaces in the domain package (consumer) rather than the infrastructure package (producer) improve testability?

Domain packages compile faster than infrastructure packages

✗ Try again.

The domain package can be tested without importing any infrastructure packages — tests in the domain package use stubs that implement the domain-defined interfaces with no external dependencies

✓ Correct! Well done.

It prevents circular imports between packages

✗ Try again.

The Go compiler requires interfaces to be in the consumer package

✗ Try again.

41. How do you implement feature flags and canary deployments in a Go microservice?

Feature flags decouple deployment from release — code is deployed to all servers but activated for a subset of users or traffic. Canary deployments route a small percentage of traffic to a new version, monitoring for errors before full rollout.

// Feature flag implementation
type FeatureFlags struct {
    mu    sync.RWMutex
    flags map[string]FlagConfig
}

type FlagConfig struct {
    Enabled    bool
    Percentage int    // 0-100: % of users who see the feature
    Allowlist  []string // specific user IDs
}

func (f *FeatureFlags) IsEnabled(flagName, userID string) bool {
    f.mu.RLock()
    cfg, ok := f.flags[flagName]
    f.mu.RUnlock()
    if !ok || !cfg.Enabled { return false }

    // Always-on for allowlisted users (internal testing)
    for _, id := range cfg.Allowlist {
        if id == userID { return true }
    }

    // Percentage rollout: deterministic based on user ID
    // Same user always gets same experience
    h := fnv32(userID) % 100
    return int(h) < cfg.Percentage
}

// In handler
func userHandler(w http.ResponseWriter, r *http.Request) {
    claims, _ := claimsFromCtx(r.Context())
    if flags.IsEnabled("new-profile-ui", claims.UserID) {
        renderNewProfile(w, r)
        return
    }
    renderLegacyProfile(w, r)
}

// Kubernetes canary via Argo Rollouts
// spec.strategy.canary:
//   steps:
//   - setWeight: 10    # 10% traffic to new version
//   - pause: {duration: 5m}
//   - analysis:        # check error rate
//       templates: [{templateName: error-rate-analysis}]
//   - setWeight: 50    # 50% if analysis passed
//   - pause: {duration: 10m}
//   - setWeight: 100   # full rollout
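IsEnabled depends on an fnv32 helper that is not shown. The sketch below is one way to compute a stable rollout bucket with hash/fnv. Mixing the flag name into the hash is a design choice added here, not something the original code does: it keeps the same users from landing in the early cohort of every flag. The names bucket and inRollout are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a (flag, user) pair to a stable value in [0, 100).
func bucket(flagName, userID string) int {
	h := fnv.New32a()
	h.Write([]byte(flagName))
	h.Write([]byte{0}) // separator: ("ab","c") and ("a","bc") hash differently
	h.Write([]byte(userID))
	return int(h.Sum32() % 100)
}

// inRollout reports whether the user falls inside the rollout percentage.
// Raising the percentage only ever adds users; nobody is switched back.
func inRollout(flagName, userID string, percentage int) bool {
	return bucket(flagName, userID) < percentage
}

func main() {
	for _, id := range []string{"alice", "bob", "carol"} {
		fmt.Println(id, bucket("new-profile-ui", id), inRollout("new-profile-ui", id, 25))
	}
}
```

Since the bucket depends only on its inputs, every replica makes the same decision for the same user without shared state.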

Why is percentage rollout based on a hash of the user ID instead of a random number?

Hashing is faster than random number generation

✗ Try again.

A hash of the user ID is deterministic — the same user always sees the same experience (consistent UX), whereas a random number changes on every request

✓ Correct! Well done.

Random numbers are not available in Go without importing math/rand

✗ Try again.

Hashing prevents A/B test results from being skewed by repeat users

✗ Try again.

What is the difference between a feature flag and a canary deployment?

Feature flags are for frontend; canary is for backend services

✗ Try again.

Feature flags control functionality within a single service version; canary deployments route traffic to different versions of the service — they are complementary strategies often used together

✓ Correct! Well done.

Canary deployments are always faster than feature flags

✗ Try again.

Feature flags require a separate service; canary deployments are built into Kubernetes

✗ Try again.

42. How do you design a multi-tenant Go microservice?

Multi-tenancy means one service instance serves multiple customers (tenants) with data isolation. Three common isolation models: shared database, schema per tenant, database per tenant. The choice depends on isolation requirements, scaling needs, and cost.

// Tenant context extracted from JWT or header
type TenantID string
type tenantKey struct{}

func withTenant(ctx context.Context, id TenantID) context.Context {
    return context.WithValue(ctx, tenantKey{}, id)
}

func tenantFromCtx(ctx context.Context) (TenantID, bool) {
    id, ok := ctx.Value(tenantKey{}).(TenantID)
    return id, ok
}

// Middleware: extract and validate tenant from JWT
func tenantMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        claims, err := validateJWT(r.Header.Get("Authorization"))
        if err != nil { http.Error(w, "unauthorized", 401); return }

        ctx := withTenant(r.Context(), TenantID(claims.TenantID))
        next.ServeHTTP(w, r.WithContext(ctx))
    })
}

// Shared database with tenant_id column (row-level isolation)
type TenantAwareRepo struct{ db *sql.DB }

func (r *TenantAwareRepo) FindUsers(ctx context.Context) ([]User, error) {
    tenantID, ok := tenantFromCtx(ctx)
    if !ok { return nil, errors.New("no tenant in context") }

    rows, err := r.db.QueryContext(ctx,
        // tenant_id filter on every query: NEVER omit this
        "SELECT id, name FROM users WHERE tenant_id = $1",
        string(tenantID))
    if err != nil { return nil, err }
    defer rows.Close()

    var users []User
    for rows.Next() {
        var u User // assumes User has ID and Name fields
        if err := rows.Scan(&u.ID, &u.Name); err != nil { return nil, err }
        users = append(users, u)
    }
    return users, rows.Err() // rows.Err reports errors hit during iteration
}

// Postgres Row Level Security (RLS): the DB enforces tenant isolation
// ALTER TABLE users ENABLE ROW LEVEL SECURITY;
// CREATE POLICY tenant_isolation ON users
//     USING (tenant_id = current_setting('app.tenant_id')::uuid);
// SET LOCAL app.tenant_id = '123e4567-...' -- inside each transaction; a plain SET
// would persist on a pooled connection and leak into the next tenant's request

Take quiz

What is the main risk of the shared-database multi-tenant model if not implemented carefully?

Queries become slower with multiple tenants

✗ Try again.

A missing tenant_id filter in a query exposes one tenant's data to another — data isolation must be enforced at every query, making it error-prone without row-level security

✓ Correct! Well done.

The database cannot handle more than 1000 tenants

✗ Try again.

Shared databases require all tenants to have the same schema

✗ Try again.

What does PostgreSQL Row Level Security (RLS) provide in a multi-tenant system?

It encrypts rows belonging to each tenant

✗ Try again.

It enforces tenant isolation at the database level — even if application code forgets a tenant_id filter, the DB automatically restricts results to the current tenant's rows

✓ Correct! Well done.

It distributes tenant data across multiple database servers

✗ Try again.

It prevents tenants from having more than a configured number of rows

✗ Try again.

43. What is mutation testing and how does it evaluate test suite quality beyond coverage?

Mutation testing evaluates whether your tests actually catch bugs. A tool automatically introduces small changes (mutations) to the source code, such as changing > to >=, and reruns the test suite against each mutant. If at least one test fails, the mutant is killed; if every test still passes, the mutant survives and marks behaviour the suite does not verify.

// The function under test
func isEligible(age int, hasLicense bool) bool {
    return age >= 18 && hasLicense
}

// Weak test: 100% coverage but misses mutations
func TestIsEligible_Weak(t *testing.T) {
    if !isEligible(20, true) {
        t.Error("expected eligible")
    }
    if isEligible(15, true) {
        t.Error("expected ineligible")
    }
}
// Mutation: change age >= 18 to age > 18
// isEligible(18, true) should return true, but weak test doesn't check 18
// -> mutation SURVIVES: test is weak at the boundary

// Strong test: catches boundary mutations
func TestIsEligible_Strong(t *testing.T) {
    tests := []struct {
        age     int
        license bool
        want    bool
    }{
        {20, true,  true},   // clearly eligible
        {17, true,  false},  // underage
        {18, true,  true},   // boundary: exactly eligible
        {18, false, false},  // boundary: no license
        {19, false, false},  // old enough but no license
        {0,  false, false},  // both missing
    }
    for _, tt := range tests {
        t.Run(fmt.Sprintf("age=%d,license=%v", tt.age, tt.license), func(t *testing.T) {
            if got := isEligible(tt.age, tt.license); got != tt.want {
                t.Errorf("isEligible(%d, %v) = %v, want %v",
                    tt.age, tt.license, got, tt.want)
            }
        })
    }
}

// Go mutation testing tools:
// - go-mutesting (zimmski/go-mutesting)
// - gremlins (go-gremlins/gremlins)
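
To make the killed/survived distinction concrete, here is a small self-contained sketch that does by hand what a mutation tool automates: it runs the weak and the strong test tables against three hand-written mutants of isEligible and counts how many each table kills. The mutants and the helper names (killed, score) are illustrative assumptions, not output from any real tool.

```go
package main

import "fmt"

type eligibility func(age int, hasLicense bool) bool

type testCase struct {
	age     int
	license bool
	want    bool
}

func original(age int, l bool) bool { return age >= 18 && l }

// Hand-written mutants, each one small change away from original.
var mutants = []struct {
	name string
	fn   eligibility
}{
	{">= to >", func(age int, l bool) bool { return age > 18 && l }},
	{"&& to ||", func(age int, l bool) bool { return age >= 18 || l }},
	{"drop license check", func(age int, _ bool) bool { return age >= 18 }},
}

var weak = []testCase{{20, true, true}, {15, true, false}}

var strong = []testCase{
	{20, true, true}, {17, true, false}, {18, true, true},
	{18, false, false}, {19, false, false}, {0, false, false},
}

// killed reports whether at least one case fails against fn.
func killed(suite []testCase, fn eligibility) bool {
	for _, tc := range suite {
		if fn(tc.age, tc.license) != tc.want {
			return true
		}
	}
	return false
}

// score counts how many mutants the suite kills.
func score(suite []testCase) int {
	n := 0
	for _, m := range mutants {
		if killed(suite, m.fn) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Printf("weak suite kills %d of %d mutants\n", score(weak), len(mutants))
	fmt.Printf("strong suite kills %d of %d mutants\n", score(strong), len(mutants))
}
```

Both tables pass against the original function and both execute every line of it, so line coverage cannot tell them apart. Only the kill count shows that the weak table never exercises the age boundary or the licence check.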

Take quiz

What does it mean when a mutation 'survives' in mutation testing?

The mutated code compiles successfully

✗ Try again.

No test in the suite detected the mutation (caught a failure) — the mutant code behaves differently but all tests still pass, indicating a gap in test coverage

✓ Correct! Well done.

The mutation improved the code's performance

✗ Try again.

The mutation was identical to the original code

✗ Try again.

Why should boundary values always be included in table-driven tests?

Go test coverage tools require boundary cases

✗ Try again.

Boundary values are where off-by-one errors, > vs >= mistakes, and edge case bugs typically hide — testing only typical values misses the mutations most likely to be real bugs

✓ Correct! Well done.

Boundary values run faster in benchmarks

✗ Try again.

The -race detector only activates for boundary value inputs

✗ Try again.

44. How do you manage the full lifecycle of a Go microservice from startup to shutdown?

A production Go service follows a structured lifecycle: configuration validation, dependency initialisation, readiness signalling, traffic serving, graceful shutdown on signal, and cleanup. Each phase must handle failures correctly.

func main() {
    // Phase 1: load and validate config, fail fast
    cfg, err := config.Load()
    if err != nil { log.Fatalf("invalid config: %v", err) }

    // Phase 2: initialise dependencies
    logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        Level: cfg.LogLevel,
    }))
    slog.SetDefault(logger)

    db, err := database.Open(cfg.Database)
    if err != nil { log.Fatalf("database: %v", err) }
    defer db.Close()

    // Phase 3: build services
    userRepo := postgres.NewUserRepository(db)
    userSvc  := service.NewUserService(userRepo)
    router   := api.NewRouter(userSvc)

    // Phase 4: start server
    srv := &http.Server{
        Addr:         fmt.Sprintf(":%d", cfg.Port),
        Handler:      router,
        ReadTimeout:  5 * time.Second,
        WriteTimeout: 10 * time.Second,
    }

    errCh := make(chan error, 1)
    go func() {
        slog.Info("server starting", "port", cfg.Port)
        if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
            errCh <- err
        }
    }()

    // Phase 5: wait for shutdown or error
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)

    select {
    case err := <-errCh:
        slog.Error("server error", "err", err)
    case sig := <-sigCh:
        slog.Info("shutdown signal", "signal", sig)
    }

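    // Phase 6: graceful shutdown. Stop accepting connections and drain
    // in-flight requests; 25s stays under Kubernetes' default 30s grace period.
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
    defer cancel()
    if err := srv.Shutdown(shutdownCtx); err != nil {
        slog.Error("shutdown error", "err", err)
        return // do not report a clean stop after a failed shutdown
    }
    slog.Info("service stopped cleanly")
}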
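
The lifecycle description lists readiness signalling, which the main function above does not show. The following is a minimal sketch of one common approach, assuming a /readyz endpoint backed by an atomic flag. The readiness type and the probe helper are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness tracks whether this instance should receive traffic.
type readiness struct{ ready atomic.Bool }

// ServeHTTP answers 200 while ready and 503 otherwise, so a load balancer
// or Kubernetes readiness probe stops routing new requests here.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if !r.ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probe issues an in-memory GET /readyz and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	rd := &readiness{}
	mux := http.NewServeMux()
	mux.Handle("/readyz", rd)

	fmt.Println("during startup:", probe(mux))
	rd.ready.Store(true) // dependencies initialised, start taking traffic
	fmt.Println("serving:", probe(mux))
	rd.ready.Store(false) // shutdown signal received, before srv.Shutdown
	fmt.Println("draining:", probe(mux))
}
```

In the main function above, the flag would be set to true after Phase 3 succeeds and back to false as the first step of Phase 6, giving the orchestrator time to remove the instance from rotation before connections are closed.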

Take quiz

Why must configuration loading and validation happen before any dependencies are initialised?

The Go compiler requires it

✗ Try again.

Config failures should cause immediate startup abort with a clear error — initialising DB connections or starting HTTP servers before validating config wastes time and produces confusing errors

✓ Correct! Well done.

Dependencies require the config to be parsed first

✗ Try again.

Configuration loading is faster before dependencies are open

✗ Try again.

What should the service do if the HTTP server returns an error that is not http.ErrServerClosed?

Log it and continue running

✗ Try again.

Treat it as a fatal error — log and exit, or signal for graceful shutdown; the server cannot serve requests if ListenAndServe returned unexpectedly

✓ Correct! Well done.

Restart the server automatically with backoff

✗ Try again.

Ignore it — http.Server always returns errors

✗ Try again.

45. How do you test Go code that processes streaming data or works with channels?

Testing channel-based pipelines requires careful synchronisation. Common patterns: bounded channels with timeout assertions, channel-based test doubles that feed input and capture output, and the fan-out test harness.

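// Pipeline under test
func processEvents(ctx context.Context, in <-chan Event) <-chan ProcessedEvent {
    out := make(chan ProcessedEvent)
    go func() {
        defer close(out)
        for {
            // Select on the receive as well: a plain "for range in" blocks
            // forever on an idle input and never observes cancellation.
            select {
            case <-ctx.Done():
                return
            case event, ok := <-in:
                if !ok { return } // input closed: end of stream
                select {
                case out <- transform(event):
                case <-ctx.Done():
                    return
                }
            }
        }
    }()
    return out
}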

// Test helper: send N items and collect results with timeout
func drainChannel[T any](t *testing.T, ch <-chan T, timeout time.Duration) []T {
    t.Helper()
    var results []T
    timer := time.NewTimer(timeout)
    defer timer.Stop()
    for {
        select {
        case item, ok := <-ch:
            if !ok { return results } // channel closed
            results = append(results, item)
        case <-timer.C:
            t.Fatalf("timeout: channel did not close after %v", timeout)
            return results
        }
    }
}

// Test the pipeline
func TestProcessEvents(t *testing.T) {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    input := make(chan Event, 3)
    input <- Event{ID: 1, Type: "click"}
    input <- Event{ID: 2, Type: "view"}
    input <- Event{ID: 3, Type: "click"}
    close(input) // signal end of stream

    output := processEvents(ctx, input)
    results := drainChannel(t, output, 3*time.Second)

    if len(results) != 3 {
        t.Errorf("expected 3 results, got %d", len(results))
    }
    // Assert specific transformations
    for _, r := range results {
        if r.ProcessedAt.IsZero() {
            t.Errorf("ProcessedAt not set for event %d", r.EventID)
        }
    }
}

// Test cancellation: pipeline exits cleanly when ctx is cancelled
func TestProcessEvents_Cancellation(t *testing.T) {
    ctx, cancel := context.WithCancel(context.Background())
    input := make(chan Event) // never closes
    output := processEvents(ctx, input)

    cancel() // cancel immediately

    // Output should close promptly after cancellation
    select {
    case _, ok := <-output:
        if ok { t.Error("expected channel to be closed after cancellation") }
    case <-time.After(time.Second):
        t.Error("channel did not close after cancellation")
    }
}

Take quiz

Why is it important to test that a pipeline stage closes its output channel after the input channel closes?

Unclosed channels cause memory leaks in Go

✗ Try again.

Downstream stages use 'for range ch' to consume — they only exit when the channel closes; if a stage doesn't close its output, downstream goroutines leak forever

✓ Correct! Well done.

The Go runtime panics if a channel is garbage collected while still open

✗ Try again.

Channel close is required by the io.Closer interface

✗ Try again.

Why should pipeline tests always use a context timeout rather than blocking indefinitely?

Context timeouts are required by the testing.T interface

✗ Try again.

A bug that prevents the pipeline from completing would cause the test to hang forever in CI — a timeout converts an infinite hang into a clear failure message

✓ Correct! Well done.

Pipeline goroutines cannot detect context cancellation without a timeout

✗ Try again.

The Go race detector requires timeouts in channel tests

✗ Try again.

46. Summarise the key principles for designing scalable Go microservices that senior engineers demonstrate.

This summary condenses the architectural and testing knowledge expected at senior/staff Go engineer level into a reference for interviews.

Architecture Principles Cheat Sheet
Area	Key Principle
Service boundaries	Split by bounded context (DDD); each service owns its data
Communication	gRPC for internal (typed, efficient); REST for external (browser, partners)
Error model	Use gRPC status codes + errdetails; never expose internals to clients
Data consistency	Eventual consistency via events + idempotent consumers + Outbox pattern
Resilience	Circuit breaker, retry with backoff, timeout on every remote call
Observability	Metrics (Prometheus) + traces (OpenTelemetry) + structured logs (slog)
Scalability	Stateless services; external state (Redis, DB); distributed locks for singletons
Deployment	Graceful shutdown; readiness probe; rolling update; preStop sleep
API evolution	Never reuse or renumber protobuf field numbers; mark removed ones reserved; major version for breaking changes

Testing Principles Cheat Sheet
Level	Tool/Pattern	Key Insight
Unit	Table-driven + t.Run	Test all branches including boundaries
Unit	Interface mocks (stubs/fakes)	No frameworks — plain structs satisfying interfaces
Benchmark	testing.B + -benchmem	Measure allocs/op — zero is the goal for hot paths
Integration	TestMain + testcontainers	Real DB in Docker; t.Cleanup for teardown
gRPC	bufconn + table-driven	In-memory server; assert status codes
Concurrency	-race flag always	Never use time.Sleep to sync; use WaitGroup/channels
Contract	Protobuf + buf breaking	buf breaking blocks incompatible schema changes; consumer tests capture client-side expectations
Load	vegeta / b.RunParallel	Assert p99 SLO; -cpu flag for scaling behaviour

// Interview answer template for 'design a scalable Go service':

// 1. API layer: REST (public) or gRPC (internal)
// 2. Auth: JWT middleware, validate alg, store claims in context
// 3. Business logic: pure domain package, no infrastructure imports
// 4. Data: database/sql pool, context on every query, optimistic locking
// 5. Caching: L1 (in-memory, singleflight) + L2 (Redis, TTL jitter)
// 6. Events: message queue for async, Outbox pattern for atomicity
// 7. Resilience: circuit breaker, retry with backoff, timeout on all I/O
// 8. Observability: Prometheus metrics, OTEL traces, slog JSON logs
// 9. Deployment: graceful shutdown, /healthz + /readyz, rolling update
// 10. Testing: table-driven, -race, -benchmem, goleak, testcontainers

Take quiz

At a senior Go engineer level, what is the correct answer to 'how do you handle distributed transactions across services'?

Use 2-phase commit (2PC) coordinated by a transaction manager

✗ Try again.

Avoid distributed transactions entirely — use the Saga pattern with compensating events and the Outbox pattern for atomic event publishing, accepting eventual consistency

✓ Correct! Well done.

Use a distributed database that supports global transactions across services

✗ Try again.

Lock all involved service databases in sequence using Redis distributed locks

✗ Try again.

What is the most important flag to always include when running Go tests in CI for concurrent code?

-cover — coverage is the primary quality metric

✗ Try again.

-race — detects data races that may only appear in concurrent execution

✓ Correct! Well done.

-bench — benchmarks catch performance regressions

✗ Try again.

-count=10 — multiple runs catch flaky tests

✗ Try again.

