Naive Bayes: How It Works and When to Use It
What is Naive Bayes?
Naive Bayes is a probabilistic machine learning algorithm based on Bayes' Theorem, with the “naive” assumption that all features are independent of each other given the class label.
It's simple, fast, and surprisingly powerful—especially for text classification tasks like spam detection or sentiment analysis.
Bayes’ Theorem Refresher
P(C ∣ X) = P(X ∣ C) ⋅ P(C) / P(X)
Where:
P(C ∣ X) = Posterior: Probability of class C given features X
P(X ∣ C) = Likelihood: Probability of features X given class C
P(C) = Prior: Probability of class C
P(X) = Evidence: Probability of features X (can be ignored during classification)
Naive Bayes chooses the class that maximizes the posterior probability.
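As a quick sanity check on the formula, here is a tiny worked example with made-up numbers, assuming 20% of emails are spam and that the word "free" appears in 60% of spam and 5% of non-spam emails:

# Toy Bayes' theorem calculation (all numbers are illustrative)
p_spam = 0.20                 # prior P(spam)
p_ham = 0.80                  # prior P(ham)
p_free_given_spam = 0.60      # likelihood P("free" | spam)
p_free_given_ham = 0.05       # likelihood P("free" | ham)

# Evidence P("free") via the law of total probability
p_free = p_free_given_spam * p_spam + p_free_given_ham * p_ham

# Posterior P(spam | "free") by Bayes' theorem
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(f"P(spam | 'free') = {p_spam_given_free:.2f}")  # 0.75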
How Naive Bayes Works
Train Phase:
Calculate prior probabilities P(C) for each class.
For each feature, calculate likelihood probabilities P(x_i ∣ C) for every class.
Prediction Phase:
For a new example X = (x_1, x_2, ..., x_n), compute:
P(C ∣ X) ∝ P(C) ⋅ ∏_{i=1}^{n} P(x_i ∣ C)
Choose the class with the highest posterior probability.
The independence assumption lets us simplify the likelihood into a product of individual feature probabilities.
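To make both phases concrete, here is a minimal from-scratch sketch on a tiny made-up dataset with two binary word-presence features (the data is invented, and Laplace smoothing is added so unseen feature values do not zero out the product):

# Each row: (contains_"free", contains_"meeting"), label: 1 = spam, 0 = ham
data = [((1, 0), 1), ((1, 1), 1), ((1, 0), 1),
        ((0, 1), 0), ((0, 1), 0), ((1, 1), 0)]

classes = [0, 1]
n = len(data)

# Training phase: estimate priors P(C) and per-feature likelihoods P(x_i = 1 | C)
priors = {c: sum(1 for _, y in data if y == c) / n for c in classes}
likelihoods = {}
for c in classes:
    rows = [x for x, y in data if y == c]
    # Laplace smoothing (+1 / +2) keeps unseen feature values from zeroing the product
    likelihoods[c] = [(sum(x[i] for x in rows) + 1) / (len(rows) + 2) for i in range(2)]

# Prediction phase: P(C | X) ∝ P(C) * prod_i P(x_i | C) for a new example
x_new = (1, 0)
scores = {}
for c in classes:
    score = priors[c]
    for i, xi in enumerate(x_new):
        p1 = likelihoods[c][i]
        score *= p1 if xi == 1 else (1 - p1)
    scores[c] = score

print("Unnormalized posteriors:", scores)
print("Predicted class:", max(scores, key=scores.get))  # expected: 1 (spam)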
Types of Naive Bayes
Type | Use Case | Assumptions
Gaussian Naive Bayes | Continuous data | Features follow a normal distribution
Multinomial Naive Bayes | Text classification, document classification | Features are word counts or frequencies
Bernoulli Naive Bayes | Binary/Boolean features | Features are 0 or 1 (e.g., word present or not)
Example: Naive Bayes in Python (Scikit-learn)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)  # fixed seed for reproducibility
# Train model
model = GaussianNB()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
✅ When to Use Naive Bayes
✔️ Ideal For:
Text classification
Spam detection
Sentiment analysis
Document categorization
When features are mostly independent
Fast baseline models
Large datasets where training time matters
Multiclass problems
Real-World Applications:
Email spam filters (Gmail, Outlook)
News categorization
Real-time document classification
Sentiment analysis on social media
Medical diagnosis (with categorical inputs)
⚠️ When Not to Use Naive Bayes
❌ Avoid If:
Features are highly correlated (the “naive” assumption fails)
You need probabilistic outputs with calibrated confidence (it’s not very well-calibrated)
Your features are not independent and performance is critical (e.g., image data)
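That said, if Naive Bayes otherwise fits your problem but you need better-calibrated probabilities, one common workaround is to wrap the model in scikit-learn's CalibratedClassifierCV; a minimal sketch, reusing the iris data from the earlier example:

from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Re-fit the class probabilities of GaussianNB via cross-validated calibration
calibrated = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)

# predict_proba now returns the calibrated class probabilities
print(calibrated.predict_proba(X_test[:3]))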
Pros and Cons
✅ Pros | ❌ Cons
Simple and fast | Assumes feature independence
Works well with high-dimensional data | Poor at handling correlated features
Great for text and categorical data | Less accurate than more complex models in some cases
Requires less training data | Limited in capturing complex relationships
Summary Table
Feature | Naive Bayes
Type | Supervised
Task | Classification
Core Idea | Apply Bayes' Theorem with an independence assumption
Input | Labeled data
Output | Predicted class (and probability)
Best for | Text, categorical data
Variants | Gaussian, Multinomial, Bernoulli
Pro Tip
Naive Bayes is often used as a baseline model. It’s fast to train and can perform surprisingly well, especially for text classification tasks.