Building Autoencoders for Dimensionality Reduction
What is an Autoencoder?
An autoencoder is a type of neural network that learns to compress and then reconstruct data. It’s made up of two parts:
Encoder – compresses the input into a smaller representation (called a latent vector or bottleneck).
Decoder – tries to reconstruct the original input from that compressed representation.
So, it learns how to reduce data to fewer dimensions while keeping the important information.
Why Use Autoencoders for Dimensionality Reduction?
Traditional methods like PCA (Principal Component Analysis) reduce dimensionality by linear transformations. But autoencoders can learn non-linear patterns, making them more powerful for complex data like:
Images
Text embeddings
Sensor signals
✅ Capture non-linear relationships
✅ Preserve more structure in the data
✅ Work well for high-dimensional data (like 784-pixel MNIST images)
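To see what the linear baseline looks like in practice, here is a minimal PCA sketch using scikit-learn, assuming the flattened and normalized MNIST arrays (x_train, x_test) prepared in the Keras example later in this post:
from sklearn.decomposition import PCA

# Linear baseline: project 784-dimensional MNIST vectors down to 64 components
pca = PCA(n_components=64)
x_train_pca = pca.fit_transform(x_train)                      # 784 -> 64
x_test_recon = pca.inverse_transform(pca.transform(x_test))   # back to 784
print("Variance explained by 64 components:", pca.explained_variance_ratio_.sum())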
Architecture of an Autoencoder
Input → Encoder → Bottleneck (Compressed representation) → Decoder → Output (Reconstructed input)
Example with MNIST:
Input: 28x28 grayscale image (784 values)
Encoder: Compresses to 64 dimensions
Decoder: Reconstructs back to 784 dimensions
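As a quick back-of-the-envelope check of the compression involved:
# Each image goes from 784 numbers to 64, roughly a 12x reduction
input_dim = 28 * 28       # 784 pixel values
bottleneck_dim = 64       # size of the compressed representation
print(input_dim / bottleneck_dim)   # 12.25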
⚙️ Components of an Autoencoder
Encoder – learns to compress the data
Bottleneck – the lowest-dimensional representation
Decoder – learns to reconstruct the original input
Loss – measures how close the output is to the input
How Does It Learn?
During training, the network minimizes the reconstruction loss, typically the Mean Squared Error (MSE):
Loss = mean((Original - Reconstructed)^2)
The model learns to keep essential information in the bottleneck and discard noise.
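In NumPy terms, that loss is simply the following (a minimal sketch, assuming both arrays are scaled to [0, 1]):
import numpy as np

def reconstruction_mse(original, reconstructed):
    # Mean squared difference between the input and its reconstruction
    return np.mean((original - reconstructed) ** 2)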
Building a Simple Autoencoder in Python (using Keras)
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
import numpy as np
# Load and preprocess data
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))
# Encoder
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
# Decoder
decoded = Dense(128, activation='relu')(encoded)
decoded = Dense(784, activation='sigmoid')(decoded)
# Autoencoder model
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# Train
autoencoder.fit(x_train, x_train,
                epochs=20,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
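After training, a quick sanity check is to measure the reconstruction loss on the held-out test set (a usage sketch with the model defined above):
# Lower test MSE means the bottleneck is retaining more of the signal
test_loss = autoencoder.evaluate(x_test, x_test, verbose=0)
print("Test reconstruction MSE:", test_loss)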
Visualizing Dimensionality Reduction
Once trained, you can extract the encoder part and use the 64-dimensional bottleneck representation for:
Visualization (e.g., t-SNE or PCA on the bottleneck)
Clustering
Anomaly detection
Feeding into another machine learning model
# Create encoder model
encoder = Model(input_img, encoded)
compressed_data = encoder.predict(x_test)
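For example, one rough way to visualize the 64-dimensional codes is to project them to 2D with t-SNE (a sketch, assuming scikit-learn and matplotlib are available; using a subset keeps t-SNE fast):
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Project a subset of the compressed codes down to 2D for plotting
codes_2d = TSNE(n_components=2).fit_transform(compressed_data[:2000])
plt.scatter(codes_2d[:, 0], codes_2d[:, 1], s=2)
plt.title("t-SNE of the autoencoder bottleneck")
plt.show()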
Applications of Autoencoders
Dimensionality Reduction – compress high-dimensional data
Denoising – remove noise from images or signals
Anomaly Detection – identify unusual data patterns (see the sketch after this list)
Image Compression – reduce image size while preserving quality
Feature Learning – learn useful representations for downstream tasks
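As a concrete example of the anomaly-detection use case, a common recipe is to flag inputs the model reconstructs poorly (a sketch reusing the trained autoencoder and x_test from above; the 99th-percentile threshold is an assumed cutoff):
import numpy as np

reconstructions = autoencoder.predict(x_test)
errors = np.mean((x_test - reconstructions) ** 2, axis=1)   # per-sample reconstruction error
threshold = np.percentile(errors, 99)                       # flag the worst 1% (assumed cutoff)
anomalies = np.where(errors > threshold)[0]
print(f"{len(anomalies)} samples flagged as potential anomalies")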
Summary
Autoencoder – a neural network that compresses and reconstructs data
Encoder – reduces data to a lower dimension
Decoder – reconstructs the original data
Bottleneck – the compressed representation
Loss function (MSE) – measures reconstruction error
Tips for Better Autoencoders
Use ReLU in the hidden layers and a sigmoid output (for inputs scaled to [0, 1]).
Regularize with dropout or L1/L2 weight penalties to prevent overfitting.
Use denoising autoencoders for more robust feature learning (see the sketch after this list).
Try variational autoencoders (VAEs) for generative modeling.
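Here is a minimal denoising variant of the model above: corrupt the inputs with Gaussian noise but keep the clean images as the training target (the 0.3 noise level is an assumed value):
import numpy as np

noise = 0.3
x_train_noisy = np.clip(x_train + noise * np.random.normal(size=x_train.shape), 0.0, 1.0)
x_test_noisy = np.clip(x_test + noise * np.random.normal(size=x_test.shape), 0.0, 1.0)

# Noisy inputs in, clean targets out
autoencoder.fit(x_train_noisy, x_train,
                epochs=20,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test_noisy, x_test))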