What are the most effective regularization strategies for deep learning and how do they differ from classical ML regularization?
Deep neural networks have millions of parameters and can trivially memorise training data. Classical regularisation (L1/L2 on weights) still applies, but modern deep learning has developed additional techniques that often work better or are used in combination.
| Technique | How it works | Best applied to |
|---|---|---|
| L2 (weight decay) | Penalises large weights by adding λ‖w‖² to the loss | Most DL models; with Adam-style optimisers use AdamW, which decouples the decay from the gradient update |
| Dropout | Randomly zeroes activations during training; disabled at inference | Fully-connected layers; less common in conv layers, still used in transformer attention and feed-forward sublayers |
| Data augmentation | Artificially increases the diversity of the training set | Vision (flips, crops, colour jitter, mixup, cutmix) |
| Early stopping | Stop training when val loss stops improving | Any model; simple and effective baseline |
| Label smoothing | Softens one-hot labels: the true class gets 1-ε+ε/k, every other class gets ε/k | Classification; improves calibration |
| Stochastic depth | Randomly drop entire residual blocks during training | Very deep networks (ResNets, ViTs) |
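The weight decay and dropout rows of the table can be sketched together as follows. This is a minimal illustration, not a recommended configuration: the toy layer sizes, learning rate, decay value and random inputs are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier: dropout sits between the fully-connected layers.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # zeroes each activation with probability 0.5 in train mode
    nn.Linear(64, 3),
)

# Decoupled weight decay: AdamW shrinks the weights directly at each step
# instead of folding an L2 term into the adaptive gradient update.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

x = torch.randn(8, 20)             # illustrative random inputs
y = torch.randint(0, 3, (8,))      # illustrative random labels

model.train()                      # dropout active: repeated passes differ
train_out_1 = model(x)
train_out_2 = model(x)

# One optimisation step with weight decay applied by the optimizer.
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

model.eval()                       # dropout disabled: passes are deterministic
with torch.no_grad():
    eval_out_1 = model(x)
    eval_out_2 = model(x)

outputs_differ_in_train = not torch.equal(train_out_1, train_out_2)
outputs_match_in_eval = torch.equal(eval_out_1, eval_out_2)
```

Forgetting to call `model.eval()` before validation leaves dropout switched on, which makes the validation loss noisy and pessimistic.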
import torch
import torch.nn as nn
import torchvision.transforms as T
# Data augmentation for images
train_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomCrop(32, padding=4),
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.ToTensor(),
    # ImageNet channel statistics; substitute your own dataset's mean and std
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])
# Label smoothing: penalises overconfident predictions
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
# A 10-class example: true label 3 becomes
# [0.01, 0.01, 0.01, 0.91, 0.01, ...] instead of [0,0,0,1,0,...]
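# Mixup augmentation (manual implementation)
def mixup_batch(x, y, alpha=0.4):
    # Sample the mixing coefficient lam from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Pair each sample with a randomly chosen partner from the same batch
    idx = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1 - lam) * x[idx]
    y_a, y_b = y, y[idx]
    return x_mix, y_a, y_b, lam
# Train on the mixed batch by weighting the two losses with the same lam:
# loss = lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)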
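# Early stopping: track best val loss, restore best weights
# (model, val_loader and validate are assumed to be defined elsewhere;
# the per-epoch training step is omitted)
max_epochs = 100  # illustrative value
patience = 5      # illustrative value: epochs to wait without improvement
best_val_loss = float('inf')
patience_count = 0
for epoch in range(max_epochs):
    val_loss = validate(model, val_loader)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), 'best_model.pt')
        patience_count = 0
    else:
        patience_count += 1
        if patience_count >= patience:
            break
# Restore the best checkpoint once training ends
model.load_state_dict(torch.load('best_model.pt'))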
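Stochastic depth appears in the table but not in the snippets above, so here is a minimal sketch of it. It assumes the per-sample variant with inverted scaling (the residual branch is rescaled during training so that nothing changes at inference), which is only one of several formulations. The class name `StochasticDepthBlock`, the linear branch and the drop probability are hypothetical choices for illustration.

```python
import torch
import torch.nn as nn


class StochasticDepthBlock(nn.Module):
    """Residual block whose residual branch is randomly dropped per sample in training."""

    def __init__(self, dim: int, drop_prob: float = 0.2):
        super().__init__()
        # This sketch assumes 0 <= drop_prob < 1 (drop_prob == 1 would divide by zero).
        self.branch = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.branch(x)
        if self.training and self.drop_prob > 0.0:
            keep_prob = 1.0 - self.drop_prob
            # One Bernoulli draw per sample, broadcast over the remaining dimensions.
            mask_shape = (x.size(0),) + (1,) * (x.dim() - 1)
            mask = torch.bernoulli(
                torch.full(mask_shape, keep_prob, device=x.device, dtype=x.dtype)
            )
            # Rescale so the expected branch output matches eval mode.
            out = out * mask / keep_prob
        # The skip connection always passes through, so a dropped block is the identity.
        return x + out


torch.manual_seed(0)
block = StochasticDepthBlock(dim=4, drop_prob=0.5)
x = torch.randn(16, 4)

block.train()
with torch.no_grad():
    y_train = block(x)   # some rows equal x exactly (branch dropped)

block.eval()
with torch.no_grad():
    y_eval = block(x)    # every row is x + branch(x)
```

In deep networks the drop probability is often increased linearly with depth, so early blocks are almost always kept and later blocks are dropped more often.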
