Python / PyTorch Fundamentals Interview Questions
How do you implement and use a custom loss function in PyTorch?
When the built-in loss functions do not fit your task, you can write a custom loss as either a plain function or an nn.Module subclass. As long as the loss is computed with PyTorch tensor operations on values that depend on parameters with requires_grad=True, autograd handles differentiation automatically.
import torch
import torch.nn as nn
import torch.nn.functional as F
# ── Option 1: Plain function (simple, no learnable parameters)
def smooth_l1_custom(pred: torch.Tensor, target: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    """Smooth L1 loss (Huber loss divided by beta): L2 where |pred - target| < beta, L1 elsewhere."""
    diff = torch.abs(pred - target)
    loss = torch.where(
        diff < beta,
        0.5 * diff ** 2 / beta,  # quadratic region
        diff - 0.5 * beta,       # linear region
    )
    return loss.mean()
# ── Option 2: nn.Module subclass (recommended when the loss has hyper-parameters,
# or tensors such as class weights that should follow .to(device) and be saved in state_dict)
class FocalLoss(nn.Module):
    """Focal loss for class-imbalanced multi-class problems."""

    def __init__(self, gamma: float = 2.0, weight: torch.Tensor | None = None):
        super().__init__()
        self.gamma = gamma
        # A buffer moves with .to(device) and is saved in state_dict; a plain attribute is not.
        self.register_buffer("weight", weight)  # class weights (optional)

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # logits: (N, C)   targets: (N,) int64
        # Unweighted cross-entropy, so that exp(-ce_loss) is a true probability.
        ce_loss = F.cross_entropy(logits, targets, reduction="none")
        pt = torch.exp(-ce_loss)  # probability of the correct class
        focal = (1 - pt) ** self.gamma * ce_loss
        if self.weight is not None:
            focal = focal * self.weight[targets]  # apply class weights after computing pt
        return focal.mean()
# Usage — identical to built-in loss functions
model = nn.Linear(10, 5)
focal_fn = FocalLoss(gamma=2.0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
X = torch.randn(16, 10)
target = torch.randint(0, 5, (16,))
optimizer.zero_grad()
logits = model(X)
loss = focal_fn(logits, target) # custom loss used exactly like nn.CrossEntropyLoss
loss.backward() # autograd differentiates through our custom ops
optimizer.step()
print(f"Focal loss: {loss.item():.4f}")
# ── Combining multiple losses (sketch: output, target_img, mu, and log_var
# are assumed to be tensors produced by the model's forward pass)
rec_loss = F.mse_loss(output, target_img)  # reconstruction
kl_loss = -0.5 * (1 + log_var - mu**2 - log_var.exp()).mean()  # KL divergence
total_loss = rec_loss + 0.001 * kl_loss  # weighted combination

Key insight: any computation graph built from differentiable PyTorch operations is differentiated automatically by autograd, so you do not need to derive or implement gradients for a custom loss by hand. As long as you stick to standard PyTorch operations (torch.*, F.*), autograd takes care of the rest.
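
One way to gain confidence in a custom loss is to compare it with a built-in reference and with a hand-derived gradient. The sketch below does this for the smooth L1 function from this answer. It assumes a PyTorch version whose F.smooth_l1_loss accepts a beta argument; the variable names (values_match, grads_match, numeric_ok) are illustrative and not part of any API.

```python
import torch
import torch.nn.functional as F


def smooth_l1_custom(pred, target, beta=1.0):
    """Smooth L1: quadratic where |pred - target| < beta, linear elsewhere."""
    diff = torch.abs(pred - target)
    loss = torch.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return loss.mean()


torch.manual_seed(0)

# Check 1: the custom loss should agree with the built-in reference.
pred = torch.randn(8, 3, requires_grad=True)
target = torch.randn(8, 3)
custom = smooth_l1_custom(pred, target, beta=1.0)
builtin = F.smooth_l1_loss(pred, target, beta=1.0)
values_match = torch.allclose(custom, builtin)

# Check 2: autograd's gradient should agree with the hand-derived one.
# With beta = 1 the per-element gradient is (pred - target) in the quadratic
# region and sign(pred - target) in the linear region; .mean() divides by numel.
custom.backward()
diff = (pred - target).detach()
manual_grad = torch.where(diff.abs() < 1.0, diff, torch.sign(diff)) / diff.numel()
grads_match = torch.allclose(pred.grad, manual_grad)

# Check 3: gradcheck compares autograd with finite differences.
# It expects double-precision inputs with requires_grad=True.
pred64 = torch.randn(4, 3, dtype=torch.float64, requires_grad=True)
target64 = torch.randn(4, 3, dtype=torch.float64)
numeric_ok = torch.autograd.gradcheck(
    lambda p: smooth_l1_custom(p, target64), (pred64,)
)

print(values_match, grads_match, numeric_ok)
```

The manual gradient is written only to show what autograd computes for you. In practice the comparison against a built-in loss or gradcheck is usually enough, and gradcheck can report spurious failures if an input happens to sit on a kink of the loss (here, a difference of exactly 0 or beta).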
