What is the 'naive' independence assumption in Naive Bayes, and why does it still work well in practice despite being unrealistic?
Naive Bayes applies Bayes' theorem to classify: P(y|x₁,...,xₙ) ∝ P(y)·P(x₁,...,xₙ|y). Computing the joint likelihood P(x₁,...,xₙ|y) exactly would require modelling all interactions between features, which is infeasible with limited data. The 'naive' simplification assumes all features are conditionally independent given the class: P(x₁,...,xₙ|y) = ∏ᵢ P(xᵢ|y). For each class, this reduces the problem to estimating n univariate distributions instead of one n-dimensional joint distribution.
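To make the saving concrete, the sketch below counts free parameters for the special case of binary features. The binary-feature setting and both function names are assumptions chosen for illustration, not part of any library.

```python
def joint_param_count(n_features: int, n_classes: int) -> int:
    # A full joint table over n binary features has 2**n cells per class;
    # they sum to 1, so 2**n - 1 of them are free parameters.
    return n_classes * (2 ** n_features - 1)


def naive_param_count(n_features: int, n_classes: int) -> int:
    # Under conditional independence, each feature needs a single
    # Bernoulli parameter P(x_i = 1 | y) per class.
    return n_classes * n_features


full = joint_param_count(n_features=20, n_classes=2)
naive = naive_param_count(n_features=20, n_classes=2)
print(f"full joint: {full} parameters, naive factorization: {naive} parameters")
```

With 20 binary features and 2 classes, the full joint needs about two million parameters while the factorized model needs 40, which is why the factorized model can be estimated from far less data.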
This independence assumption is almost always technically false (features usually correlate), yet Naive Bayes frequently performs well because classification only requires ranking the class probabilities correctly, not estimating their exact values. Correlated features are effectively counted more than once, which typically pushes the estimated probabilities toward 0 or 1. As long as the true class still receives the highest score, the argmax decision (which class has the highest probability) remains correct. Domingos and Pazzani (1997) showed that Naive Bayes can be optimal under zero-one loss even when the independence assumption is violated, so its classification accuracy can be good even when its probability estimates are poorly calibrated.
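A minimal numeric sketch of this effect, under assumed values: two classes with equal priors and a single binary feature with P(x=1|A) = 0.8 and P(x=1|B) = 0.4. Giving Naive Bayes three perfectly correlated copies of that feature makes it multiply the same likelihood three times. The helper name nb_posterior is hypothetical.

```python
import numpy as np


def nb_posterior(likelihoods, priors, copies):
    # Posterior Naive Bayes reports when the same feature value is counted
    # `copies` times, i.e. when perfectly correlated duplicates are treated
    # as independent evidence.
    log_scores = np.log(priors) + copies * np.log(likelihoods)
    log_scores -= log_scores.max()  # stabilise before exponentiating
    probs = np.exp(log_scores)
    return probs / probs.sum()


priors = np.array([0.5, 0.5])        # assumed equal class priors
likelihoods = np.array([0.8, 0.4])   # assumed P(x=1 | A), P(x=1 | B)

true_posterior = nb_posterior(likelihoods, priors, copies=1)   # correct answer
naive_posterior = nb_posterior(likelihoods, priors, copies=3)  # duplicates counted

print("true :", true_posterior)
print("naive:", naive_posterior)
print("same decision:", true_posterior.argmax() == naive_posterior.argmax())
```

Here the correct posterior for class A is 2/3, while the duplicated-feature model reports 8/9: overconfident, but the predicted class is unchanged. This is only an illustration; with unequal priors or several features of differing redundancy, the distortion can flip the decision.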
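import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Toy data so the snippet runs on its own (illustrative values, not a real dataset)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_train_counts = rng.poisson(lam=2.0, size=(100, 6))  # non-negative "word counts"

# GaussianNB: assumes each feature is normally distributed within each class
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Manual demonstration of the independence factorization.
# class_priors: dict mapping class -> P(class)
# feature_likelihoods: dict mapping class -> list of callables, one per feature,
#   where feature_likelihoods[c][i](xi) returns P(x_i = xi | c)
def naive_bayes_predict(x, class_priors, feature_likelihoods):
    scores = {}
    for c in class_priors:
        # log space to avoid numerical underflow from many small probabilities
        log_prob = np.log(class_priors[c])
        for i, xi in enumerate(x):
            # the 1e-10 offset guards against log(0)
            log_prob += np.log(feature_likelihoods[c][i](xi) + 1e-10)
        scores[c] = log_prob
    # return the class with the highest log posterior score
    return max(scores, key=scores.get)

# MultinomialNB: common for text classification (word counts)
mnb = MultinomialNB()
mnb.fit(X_train_counts, y_train)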
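The snippet above defines naive_bayes_predict but does not show how its class_priors and feature_likelihoods arguments might be built. One possible way, assuming Gaussian features, is sketched below with NumPy only. The helpers gaussian_pdf and fit_gaussian_likelihoods are hypothetical names, the data is synthetic, and the small variance offset is an arbitrary choice to avoid division by zero.

```python
import numpy as np


def gaussian_pdf(mean, var):
    # Returns a callable computing the univariate normal density N(x; mean, var)
    return lambda x: np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def fit_gaussian_likelihoods(X, y):
    # Hypothetical helper: estimates class priors and one Gaussian
    # per (class, feature) pair from the training data.
    class_priors, feature_likelihoods = {}, {}
    for c in np.unique(y):
        Xc = X[y == c]
        class_priors[c] = len(Xc) / len(X)
        feature_likelihoods[c] = [
            gaussian_pdf(Xc[:, i].mean(), Xc[:, i].var() + 1e-9)
            for i in range(X.shape[1])
        ]
    return class_priors, feature_likelihoods


def naive_bayes_predict(x, class_priors, feature_likelihoods):
    # Same log-space scoring as the function in the snippet above
    scores = {}
    for c in class_priors:
        log_prob = np.log(class_priors[c])
        for i, xi in enumerate(x):
            log_prob += np.log(feature_likelihoods[c][i](xi) + 1e-10)
        scores[c] = log_prob
    return max(scores, key=scores.get)


# Toy data (illustrative): two well-separated classes in two dimensions
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(50, 2)),  # class 0
    rng.normal(2.0, 1.0, size=(50, 2)),   # class 1
])
y = np.array([0] * 50 + [1] * 50)

priors, likelihoods = fit_gaussian_likelihoods(X, y)
pred_neg = naive_bayes_predict(np.array([-2.0, -2.0]), priors, likelihoods)
pred_pos = naive_bayes_predict(np.array([2.0, 2.0]), priors, likelihoods)
print(pred_neg, pred_pos)
```

Points near each class centre are assigned to that class. GaussianNB performs the same kind of per-class, per-feature estimation internally, with its own variance smoothing, so its predictions on such data should generally agree with this sketch.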
