Python / PyTorch Fundamentals Interview Questions
What is Batch Normalization in PyTorch and how does it differ from Layer Normalization?
Normalization layers stabilise training by re-centring and re-scaling activations. PyTorch provides several variants; Batch Normalization (BatchNorm) and Layer Normalization (LayerNorm) are the two most widely used, but they normalise over different dimensions and suit different architectures.
| Feature | BatchNorm (nn.BatchNorm1d/2d) | LayerNorm (nn.LayerNorm) |
|---|---|---|
| Normalises over | Batch dimension, plus spatial dimensions for BatchNorm2d (per-feature/channel statistics) | Trailing feature dimension(s) given by normalized_shape (per-sample statistics) |
| Statistics at train | Computed from current mini-batch | Computed from current sample's features |
| Statistics at eval | Uses running mean/var accumulated during training | Always computed fresh from current input |
| Batch size dependency | Batch statistics become noisy with very small batches (roughly < 8, as a rule of thumb) | Independent of batch size; works with batch=1 |
| Best for | CNNs (image models) | Transformers, RNNs, NLP models |
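| Parameters | gamma (scale) and beta (shift), one pair per feature/channel, exposed as weight and bias; plus non-learnable running_mean / running_var buffers | gamma and beta, one pair per element of normalized_shape, also exposed as weight and bias; no running statistics |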
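The "normalises over" row can be checked by hand. The sketch below is illustrative rather than part of the original answer: it assumes a toy input of shape (8, 4) and freshly constructed layers (so gamma is 1 and beta is 0), then recomputes each layer's output with plain tensor operations and compares it to the module's result.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)  # assumed toy input: (batch=8, features=4)

# BatchNorm1d in train mode: one mean/variance per feature,
# computed across the batch (dim 0), using the biased variance.
bn = nn.BatchNorm1d(4)
bn.train()
bn_manual = (x - x.mean(dim=0)) / torch.sqrt(x.var(dim=0, unbiased=False) + bn.eps)

# LayerNorm: one mean/variance per sample,
# computed across the features (last dim), also with the biased variance.
ln = nn.LayerNorm(4)
ln_manual = (x - x.mean(dim=-1, keepdim=True)) / torch.sqrt(
    x.var(dim=-1, unbiased=False, keepdim=True) + ln.eps
)

# With freshly initialised layers (weight=1, bias=0) the manual results
# should agree with the modules up to floating-point tolerance.
bn_matches = torch.allclose(bn(x), bn_manual, atol=1e-5)
ln_matches = torch.allclose(ln(x), ln_manual, atol=1e-5)
print(bn_matches, ln_matches)
```

The only difference between the two manual formulas is the axis of the mean and variance: dim=0 (across samples) for BatchNorm, dim=-1 (across features) for LayerNorm.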
import torch
import torch.nn as nn

# ── BatchNorm: for feedforward / CNN models
class BNModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(20, 64)
        self.bn1 = nn.BatchNorm1d(64)  # 64 features
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.bn1(self.fc1(x)))
        return self.fc2(x)

# BatchNorm behaves differently in train vs eval mode!
# train: normalise using batch mean/var, update running stats
# eval:  use accumulated running_mean / running_var
model = BNModel()
model.train()  # must be in train mode during training!

# ── LayerNorm: for transformers and sequence models
class LNModel(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.fc1 = nn.Linear(20, d_model)
        self.ln1 = nn.LayerNorm(d_model)  # normalise over last dim
        self.fc2 = nn.Linear(d_model, 10)

    def forward(self, x):
        x = torch.relu(self.ln1(self.fc1(x)))
        return self.fc2(x)

# LayerNorm produces the SAME result at train and eval
ln_model = LNModel()
ln_model.train()
x = torch.randn(8, 20)
out_train = ln_model(x)
ln_model.eval()
out_eval = ln_model(x)
print(torch.allclose(out_train, out_eval))  # True: LayerNorm is mode-independent!

Common bug: forgetting to call model.train() before training and model.eval() before validation when using BatchNorm. In eval mode BatchNorm normalises with the accumulated running statistics; if the model was always left in eval mode, those statistics never move from their initial values (mean 0, variance 1), so the layer does not normalise with the data's real statistics and predictions will be incorrect. Conversely, validating in train mode makes each prediction depend on the other samples in the batch and keeps updating the running statistics with validation data.
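The mode dependence behind this bug can be observed directly on a single layer. The following is a minimal sketch, assuming a standalone nn.BatchNorm1d(4) and synthetic data with a non-zero mean (both chosen only for illustration): it checks that a forward pass in train mode moves running_mean, that a forward pass in eval mode leaves it untouched, and that eval-mode output for a sample does not depend on what else is in the batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)
x = torch.randn(16, 4) * 3.0 + 5.0  # synthetic data, mean around 5, std around 3

# Freshly constructed: running_mean is all zeros, running_var is all ones.
initial_mean = bn.running_mean.clone()

# Train mode: forward normalises with batch statistics AND updates running estimates.
bn.train()
bn(x)
mean_after_train = bn.running_mean.clone()
stats_updated_in_train = not torch.equal(initial_mean, mean_after_train)

# Eval mode: forward reads the running estimates and leaves them unchanged.
bn.eval()
with torch.no_grad():
    out_alone = bn(x[:1])     # first sample on its own (batch size 1 works in eval mode)
    out_in_batch = bn(x)[:1]  # the same sample as part of the full batch
stats_frozen_in_eval = torch.equal(mean_after_train, bn.running_mean)
eval_is_batch_independent = torch.allclose(out_alone, out_in_batch)

print(stats_updated_in_train, stats_frozen_in_eval, eval_is_batch_independent)
```

In a real training script the same switch is typically made once per phase: model.train() at the start of each training epoch, then model.eval() (usually together with torch.no_grad()) before running validation.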
