Building Autoencoders for Dimensionality Reduction
What is an Autoencoder?
An autoencoder is a type of neural network that learns to compress and then reconstruct data. It’s made up of two parts:
Encoder – compresses the input into a smaller representation (called a latent vector or bottleneck).
Decoder – tries to reconstruct the original input from that compressed representation.
So, it learns how to reduce data to fewer dimensions while keeping the important information.
Why Use Autoencoders for Dimensionality Reduction?
Traditional methods like PCA (Principal Component Analysis) reduce dimensionality by linear transformations. But autoencoders can learn non-linear patterns, making them more powerful for complex data like:
Images
Text embeddings
Sensor signals
✅ Capture non-linear relationships
✅ Preserve more structure in the data
✅ Work well for high-dimensional data (like 784-pixel MNIST images)
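To see what the linear baseline looks like in practice, here is a minimal PCA sketch using scikit-learn, assuming the flattened and normalized MNIST arrays (x_train, x_test) prepared in the Keras example later in this post:
from sklearn.decomposition import PCA

# Linear baseline: project 784-dimensional MNIST vectors down to 64 components
pca = PCA(n_components=64)
x_train_pca = pca.fit_transform(x_train)                      # 784 -> 64
x_test_recon = pca.inverse_transform(pca.transform(x_test))   # back to 784
print("Variance explained by 64 components:", pca.explained_variance_ratio_.sum())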
Architecture of an Autoencoder
Input → Encoder → Bottleneck (Compressed representation) → Decoder → Output (Reconstructed input)
Example with MNIST:
Input: 28x28 grayscale image (784 values)
Encoder: Compresses to 64 dimensions
Decoder: Reconstructs back to 784 dimensions
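As a quick back-of-the-envelope check of the compression involved:
# Each image goes from 784 numbers to 64, roughly a 12x reduction
input_dim = 28 * 28       # 784 pixel values
bottleneck_dim = 64       # size of the compressed representation
print(input_dim / bottleneck_dim)   # 12.25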
⚙️ Components of an Autoencoder
Encoder – learns to compress the data
Bottleneck – the lowest-dimensional representation
Decoder – learns to reconstruct the original input
Loss – measures how close the output is to the input
How Does It Learn?
During training, the network minimizes the reconstruction loss, typically the Mean Squared Error (MSE):
Loss = mean((Original - Reconstructed)^2)
The model learns to keep essential information in the bottleneck and discard noise.
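In NumPy terms, that loss is simply the following (a minimal sketch, assuming both arrays are scaled to [0, 1]):
import numpy as np

def reconstruction_mse(original, reconstructed):
    # Mean squared difference between the input and its reconstruction
    return np.mean((original - reconstructed) ** 2)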
Building a Simple Autoencoder in Python (using Keras)
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
import numpy as np
# Load and preprocess data
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))
# Encoder
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
# Decoder
decoded = Dense(128, activation='relu')(encoded)
decoded = Dense(784, activation='sigmoid')(decoded)
# Autoencoder model
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# Train
autoencoder.fit(x_train, x_train,
                epochs=20,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
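After training, a quick sanity check is to measure the reconstruction loss on the held-out test set (a usage sketch with the model defined above):
# Lower test MSE means the bottleneck is retaining more of the signal
test_loss = autoencoder.evaluate(x_test, x_test, verbose=0)
print("Test reconstruction MSE:", test_loss)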
Visualizing Dimensionality Reduction
Once trained, you can extract the encoder part and use the 64-dimensional bottleneck representation for:
Visualization (e.g., t-SNE or PCA on the bottleneck)
Clustering
Anomaly detection
Feeding into another machine learning model
# Create encoder model
encoder = Model(input_img, encoded)
compressed_data = encoder.predict(x_test)
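For example, one rough way to visualize the 64-dimensional codes is to project them to 2D with t-SNE (a sketch, assuming scikit-learn and matplotlib are available; using a subset keeps t-SNE fast):
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Project a subset of the compressed codes down to 2D for plotting
codes_2d = TSNE(n_components=2).fit_transform(compressed_data[:2000])
plt.scatter(codes_2d[:, 0], codes_2d[:, 1], s=2)
plt.title("t-SNE of the autoencoder bottleneck")
plt.show()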
Applications of Autoencoders
Dimensionality Reduction – compress high-dimensional data
Denoising – remove noise from images or signals
Anomaly Detection – identify unusual data patterns (see the sketch after this list)
Image Compression – reduce image size while preserving quality
Feature Learning – learn useful representations for downstream tasks
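As a concrete example of the anomaly-detection use case, a common recipe is to flag inputs the model reconstructs poorly (a sketch reusing the trained autoencoder and x_test from above; the 99th-percentile threshold is an assumed cutoff):
import numpy as np

reconstructions = autoencoder.predict(x_test)
errors = np.mean((x_test - reconstructions) ** 2, axis=1)   # per-sample reconstruction error
threshold = np.percentile(errors, 99)                       # flag the worst 1% (assumed cutoff)
anomalies = np.where(errors > threshold)[0]
print(f"{len(anomalies)} samples flagged as potential anomalies")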
Summary
Autoencoder – a neural network that compresses and reconstructs data
Encoder – reduces data to a lower dimension
Decoder – reconstructs the original data
Bottleneck – the compressed representation
Loss function (MSE) – measures reconstruction error
Tips for Better Autoencoders
Use ReLU in the hidden layers and a sigmoid output (for inputs scaled to [0, 1]).
Regularize with dropout or L1/L2 weight penalties to prevent overfitting.
Use denoising autoencoders for more robust feature learning (see the sketch after this list).
Try variational autoencoders (VAEs) for generative modeling.
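Here is a minimal denoising variant of the model above: corrupt the inputs with Gaussian noise but keep the clean images as the training target (the 0.3 noise level is an assumed value):
import numpy as np

noise = 0.3
x_train_noisy = np.clip(x_train + noise * np.random.normal(size=x_train.shape), 0.0, 1.0)
x_test_noisy = np.clip(x_test + noise * np.random.normal(size=x_test.shape), 0.0, 1.0)

# Noisy inputs in, clean targets out
autoencoder.fit(x_train_noisy, x_train,
                epochs=20,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test_noisy, x_test))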