How to Build a Deep Neural Network (DNN) from Scratch
Building a Deep Neural Network (DNN) from scratch is an excellent way to understand how deep learning works under the hood. While libraries like TensorFlow and PyTorch make it easy to build models, implementing a DNN from scratch—without any deep learning libraries—teaches you the foundational concepts like forward propagation, backpropagation, activation functions, loss computation, and weight updates.
In this guide, we’ll walk through how to build a simple fully connected DNN using Python and NumPy.
✅ Overview of Steps
Initialize network parameters
Forward propagation
Compute loss
Backward propagation
Update weights
Train over multiple epochs
1. Define the Neural Network Architecture
Let’s build a DNN with the following structure:
Input layer (e.g., 2 neurons)
1 hidden layer (e.g., 3 neurons)
Output layer (1 neuron, for binary classification)
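One convenient way to write this architecture down in code (a minimal sketch; the name layer_sizes is just for illustration and is not used in the implementation below) is a plain list. Each weight matrix for layer l then has shape (layer_sizes[l], layer_sizes[l-1]):
layer_sizes = [2, 3, 1]  # input features, hidden neurons, output neuron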
2. Initialize Parameters
import numpy as np
# Set seed for reproducibility
np.random.seed(42)
# Define layer sizes
input_size = 2
hidden_size = 3
output_size = 1
# Initialize weights and biases
W1 = np.random.randn(hidden_size, input_size)
b1 = np.zeros((hidden_size, 1))
W2 = np.random.randn(output_size, hidden_size)
b2 = np.zeros((output_size, 1))
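As a quick sanity check (optional, purely for verification), you can print the parameter shapes and confirm they match the architecture above:
print(W1.shape, b1.shape)  # (3, 2) (3, 1)
print(W2.shape, b2.shape)  # (1, 3) (1, 1)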
3. Activation Functions
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(a):
    # Expects the activation a = sigmoid(z), not the raw pre-activation z
    return a * (1 - a)

def relu(z):
    return np.maximum(0, z)

def relu_derivative(z):
    # Boolean mask; behaves as 0/1 when multiplied with gradients
    return z > 0
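To get a feel for these functions, here is a quick check on a small array (the values are illustrative only):
z_demo = np.array([[-2.0, 0.0, 3.0]])
print(sigmoid(z_demo))          # approx [[0.119 0.5 0.953]]
print(relu(z_demo))             # [[0. 0. 3.]]
print(relu_derivative(z_demo))  # [[False False True]], used as a 0/1 mask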
4. Forward Propagation
def forward_propagation(X):
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)  # For binary classification
    cache = (Z1, A1, Z2, A2)
    return A2, cache
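For example, running a single column vector through the network shows the expected shapes; x_demo here is made up purely for illustration:
x_demo = np.array([[0.5], [1.0]])  # one sample with 2 features, shape (2, 1)
a2_demo, _ = forward_propagation(x_demo)
print(a2_demo.shape)               # (1, 1): one predicted probability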
5. Compute Loss
Use binary cross-entropy loss:
def compute_loss(Y, A2):
    m = Y.shape[1]
    loss = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return np.squeeze(loss)
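To see what this returns, here is the loss on a tiny hand-made batch of two predictions (values chosen only to illustrate the formula):
Y_demo = np.array([[1, 0]])
A2_demo = np.array([[0.9, 0.2]])
print(compute_loss(Y_demo, A2_demo))  # approx 0.1643 = -(log(0.9) + log(0.8)) / 2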
6. Backward Propagation
def backward_propagation(X, Y, cache):
    Z1, A1, Z2, A2 = cache
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dA1 = np.dot(W2.T, dZ2)
    dZ1 = dA1 * relu_derivative(Z1)
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    gradients = (dW1, db1, dW2, db2)
    return gradients
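If you want to convince yourself the analytical gradients are right, a numerical gradient check on a single weight is a common sanity test. This is an optional sketch, not part of the original recipe; X_chk and Y_chk are made-up data, and any tiny mismatch usually comes from ReLU's kink at zero:
eps = 1e-5
X_chk = np.array([[0.0, 1.0], [1.0, 0.0]])
Y_chk = np.array([[1, 0]])

# Analytical gradient for W1[0, 0] from backpropagation
_, cache_chk = forward_propagation(X_chk)
dW1_chk, _, _, _ = backward_propagation(X_chk, Y_chk, cache_chk)

# Numerical gradient: nudge W1[0, 0] up and down and measure the loss slope
W1[0, 0] += eps
loss_plus = compute_loss(Y_chk, forward_propagation(X_chk)[0])
W1[0, 0] -= 2 * eps
loss_minus = compute_loss(Y_chk, forward_propagation(X_chk)[0])
W1[0, 0] += eps  # restore the original weight

numerical = (loss_plus - loss_minus) / (2 * eps)
print(numerical, dW1_chk[0, 0])  # the two numbers should be very close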
7. Update Parameters
def update_parameters(W1, b1, W2, b2, grads, learning_rate):
    dW1, db1, dW2, db2 = grads
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    return W1, b1, W2, b2
8. Training Loop
# Example input (X: 2 features, 4 samples) and labels
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]])  # XOR input
Y = np.array([[0, 1, 1, 0]])  # XOR output
# Hyperparameters
learning_rate = 0.1
epochs = 10000
for i in range(epochs):
    # Forward pass
    A2, cache = forward_propagation(X)
    # Loss computation
    loss = compute_loss(Y, A2)
    # Backpropagation
    grads = backward_propagation(X, Y, cache)
    # Update weights
    W1, b1, W2, b2 = update_parameters(W1, b1, W2, b2, grads, learning_rate)
    if i % 1000 == 0:
        print(f"Epoch {i}, Loss: {loss:.4f}")
✅ 9. Prediction Function
def predict(X):
    A2, _ = forward_propagation(X)
    return A2 > 0.5  # Binary classification threshold
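Once the training loop above has run, you can compare the model's predictions with the XOR labels (whether they all match depends on how well training converged):
predictions = predict(X)
print(predictions.astype(int))  # ideally [[0 1 1 0]]
print(Y)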
10. Summary of What You've Built
A 2-layer DNN using only NumPy
Implemented forward and backward propagation manually
Trained it on a toy dataset (XOR problem)
Gained a deep understanding of how DNNs work under the hood
Next Steps and Improvements
Add more hidden layers (deeper networks)
Use different activation functions (e.g., tanh, leaky ReLU)
Implement mini-batch gradient descent
Add regularization (L2, dropout)
Generalize the code for variable layer sizes (see the sketch below)
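As a starting point for that last item, here is a rough sketch of how parameter initialization could be generalized to an arbitrary list of layer sizes. The function name init_parameters and the dictionary layout are assumptions for illustration, not part of the code above; the forward and backward passes would need to loop over layers in the same way:
def init_parameters(layer_sizes):
    # layer_sizes, e.g. [2, 3, 3, 1], lists the width of every layer
    params = {}
    for l in range(1, len(layer_sizes)):
        params[f"W{l}"] = np.random.randn(layer_sizes[l], layer_sizes[l - 1])
        params[f"b{l}"] = np.zeros((layer_sizes[l], 1))
    return params

params = init_parameters([2, 3, 3, 1])
print(params["W1"].shape, params["W2"].shape, params["W3"].shape)  # (3, 2) (3, 3) (1, 3)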