How to Build a Deep Neural Network (DNN) from Scratch
Building a Deep Neural Network (DNN) from scratch is an excellent way to understand how deep learning works under the hood. While libraries like TensorFlow and PyTorch make it easy to build models, implementing a DNN from scratch—without any deep learning libraries—teaches you the foundational concepts like forward propagation, backpropagation, activation functions, loss computation, and weight updates.
In this guide, we’ll walk through how to build a simple fully connected DNN using Python and NumPy.
✅ Overview of Steps
Initialize network parameters
Forward propagation
Compute loss
Backward propagation
Update weights
Train over multiple epochs
1. Define the Neural Network Architecture
Let’s build a DNN with the following structure:
Input layer (e.g., 2 neurons)
1 hidden layer (e.g., 3 neurons)
Output layer (1 neuron, for binary classification)
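One convenient way to write this architecture down in code (a minimal sketch; the name layer_sizes is just for illustration and is not used in the implementation below) is a plain list. Each weight matrix for layer l then has shape (layer_sizes[l], layer_sizes[l-1]):
layer_sizes = [2, 3, 1]  # input features, hidden neurons, output neuron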
2. Initialize Parameters
import numpy as np
# Set seed for reproducibility
np.random.seed(42)
# Define layer sizes
input_size = 2
hidden_size = 3
output_size = 1
# Initialize weights and biases
W1 = np.random.randn(hidden_size, input_size)
b1 = np.zeros((hidden_size, 1))
W2 = np.random.randn(output_size, hidden_size)
b2 = np.zeros((output_size, 1))
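As a quick sanity check (optional, purely for verification), you can print the parameter shapes and confirm they match the architecture above:
print(W1.shape, b1.shape)  # (3, 2) (3, 1)
print(W2.shape, b2.shape)  # (1, 3) (1, 1)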
3. Activation Functions
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(a):
    # Expects the activation a = sigmoid(z), not the raw pre-activation z
    return a * (1 - a)

def relu(z):
    return np.maximum(0, z)

def relu_derivative(z):
    # Boolean mask; behaves as 0/1 when multiplied with gradients
    return z > 0
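To get a feel for these functions, here is a quick check on a small array (the values are illustrative only):
z_demo = np.array([[-2.0, 0.0, 3.0]])
print(sigmoid(z_demo))          # approx [[0.119 0.5 0.953]]
print(relu(z_demo))             # [[0. 0. 3.]]
print(relu_derivative(z_demo))  # [[False False True]], used as a 0/1 mask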
4. Forward Propagation
def forward_propagation(X):
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)  # For binary classification
    cache = (Z1, A1, Z2, A2)
    return A2, cache
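For example, running a single column vector through the network shows the expected shapes; x_demo here is made up purely for illustration:
x_demo = np.array([[0.5], [1.0]])  # one sample with 2 features, shape (2, 1)
a2_demo, _ = forward_propagation(x_demo)
print(a2_demo.shape)               # (1, 1): one predicted probability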
5. Compute Loss
Use binary cross-entropy loss:
def compute_loss(Y, A2):
    m = Y.shape[1]
    loss = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return np.squeeze(loss)
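To see what this returns, here is the loss on a tiny hand-made batch of two predictions (values chosen only to illustrate the formula):
Y_demo = np.array([[1, 0]])
A2_demo = np.array([[0.9, 0.2]])
print(compute_loss(Y_demo, A2_demo))  # approx 0.1643 = -(log(0.9) + log(0.8)) / 2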
6. Backward Propagation
def backward_propagation(X, Y, cache):
    Z1, A1, Z2, A2 = cache
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dA1 = np.dot(W2.T, dZ2)
    dZ1 = dA1 * relu_derivative(Z1)
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    gradients = (dW1, db1, dW2, db2)
    return gradients
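If you want to convince yourself the analytical gradients are right, a numerical gradient check on a single weight is a common sanity test. This is an optional sketch, not part of the original recipe; X_chk and Y_chk are made-up data, and any tiny mismatch usually comes from ReLU's kink at zero:
eps = 1e-5
X_chk = np.array([[0.0, 1.0], [1.0, 0.0]])
Y_chk = np.array([[1, 0]])

# Analytical gradient for W1[0, 0] from backpropagation
_, cache_chk = forward_propagation(X_chk)
dW1_chk, _, _, _ = backward_propagation(X_chk, Y_chk, cache_chk)

# Numerical gradient: nudge W1[0, 0] up and down and measure the loss slope
W1[0, 0] += eps
loss_plus = compute_loss(Y_chk, forward_propagation(X_chk)[0])
W1[0, 0] -= 2 * eps
loss_minus = compute_loss(Y_chk, forward_propagation(X_chk)[0])
W1[0, 0] += eps  # restore the original weight

numerical = (loss_plus - loss_minus) / (2 * eps)
print(numerical, dW1_chk[0, 0])  # the two numbers should be very close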
7. Update Parameters
def update_parameters(W1, b1, W2, b2, grads, learning_rate):
    dW1, db1, dW2, db2 = grads
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    return W1, b1, W2, b2
8. Training Loop
# Example input (X: 2 features, 4 samples) and labels
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]])  # XOR input
Y = np.array([[0, 1, 1, 0]])  # XOR output
# Hyperparameters
learning_rate = 0.1
epochs = 10000
for i in range(epochs):
    # Forward pass
    A2, cache = forward_propagation(X)
    # Loss computation
    loss = compute_loss(Y, A2)
    # Backpropagation
    grads = backward_propagation(X, Y, cache)
    # Update weights
    W1, b1, W2, b2 = update_parameters(W1, b1, W2, b2, grads, learning_rate)
    if i % 1000 == 0:
        print(f"Epoch {i}, Loss: {loss:.4f}")
✅ 9. Prediction Function
def predict(X):
    A2, _ = forward_propagation(X)
    return A2 > 0.5  # Binary classification threshold
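Once the training loop above has run, you can compare the model's predictions with the XOR labels (whether they all match depends on how well training converged):
predictions = predict(X)
print(predictions.astype(int))  # ideally [[0 1 1 0]]
print(Y)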
10. Summary of What You've Built
A 2-layer DNN using only NumPy
Implemented forward and backward propagation manually
Trained it on a toy dataset (XOR problem)
Gained a deep understanding of how DNNs work under the hood
Next Steps and Improvements
Add more hidden layers (deeper networks)
Use different activation functions (e.g., tanh, leaky ReLU)
Implement mini-batch gradient descent
Add regularization (L2, dropout)
Generalize the code for variable layer sizes (see the sketch below)
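As a starting point for that last item, here is a rough sketch of how parameter initialization could be generalized to an arbitrary list of layer sizes. The function name init_parameters and the dictionary layout are assumptions for illustration, not part of the code above; the forward and backward passes would need to loop over layers in the same way:
def init_parameters(layer_sizes):
    # layer_sizes, e.g. [2, 3, 3, 1], lists the width of every layer
    params = {}
    for l in range(1, len(layer_sizes)):
        params[f"W{l}"] = np.random.randn(layer_sizes[l], layer_sizes[l - 1])
        params[f"b{l}"] = np.zeros((layer_sizes[l], 1))
    return params

params = init_parameters([2, 3, 3, 1])
print(params["W1"].shape, params["W2"].shape, params["W3"].shape)  # (3, 2) (3, 3) (1, 3)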