Tuesday, December 9, 2025


Building Your First GAN: A Step-by-Step Tutorial

Building your first Generative Adversarial Network (GAN) is an exciting milestone and a great way to get hands-on with deep learning. In this tutorial, I’ll walk you through the basic steps to build a simple GAN in Python using TensorFlow; the same ideas carry over directly to PyTorch.

What is a GAN?

A GAN consists of two neural networks:

Generator: Tries to generate fake data (e.g., images) that looks real.

Discriminator: Tries to distinguish between real data (from the dataset) and fake data (produced by the generator).

The generator and discriminator are trained together in a zero-sum game, where the generator gets better at generating realistic data, and the discriminator gets better at identifying fake data.
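This adversarial setup is usually written as a minimax objective (the formulation from the original GAN paper), where D(x) is the discriminator's probability that x is real and G(z) maps noise z to a fake sample:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] +
  \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```

The discriminator pushes V up by scoring real data near 1 and fakes near 0; the generator pushes V down by making D(G(z)) approach 1.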

Steps to Build Your First GAN

Step 1: Install Dependencies

Make sure you have Python 3.7+ installed. Then, install the necessary libraries:

pip install tensorflow matplotlib numpy

Step 2: Import Libraries

import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

Step 3: Load and Prepare the Dataset

For simplicity, let’s use the MNIST dataset, which contains 28x28 grayscale images of handwritten digits.

# Load MNIST dataset
(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()

# Preprocess: cast to float32, then normalize to the range [-1, 1]
x_train = x_train.astype('float32') / 127.5 - 1.0
x_train = np.expand_dims(x_train, axis=-1)  # Add channel dimension

# Set batch size and shuffle buffer
BATCH_SIZE = 64
BUFFER_SIZE = 60000

# Create a dataset pipeline (drop_remainder keeps every batch at BATCH_SIZE)
train_dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
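As a quick sanity check, the normalization above maps the full uint8 pixel range [0, 255] onto [-1, 1], matching the tanh output of the generator we build next (the sample values here are made up for illustration):

```python
import numpy as np

# Hypothetical pixel values spanning the full uint8 range
pixels = np.array([0.0, 127.5, 255.0])

# Same transform as in the pipeline: maps [0, 255] -> [-1, 1]
scaled = pixels / 127.5 - 1.0
print(scaled)  # [-1.  0.  1.]
```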

Step 4: Build the Generator Model

The generator takes random noise as input and produces fake images. We’ll use a fully connected layer followed by transpose convolutions to upsample the noise into a 28x28 image.

def build_generator():
    model = tf.keras.Sequential([
        layers.Dense(7 * 7 * 256, input_shape=(100,), use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 256)),
        # 7x7 -> 7x7
        layers.Conv2DTranspose(128, 5, strides=1, padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # 7x7 -> 14x14
        layers.Conv2DTranspose(64, 5, strides=2, padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # 14x14 -> 28x28; tanh matches the [-1, 1] pixel range
        layers.Conv2DTranspose(1, 5, strides=2, padding='same', activation='tanh')
    ])
    return model

generator = build_generator()
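With padding='same', a Conv2DTranspose layer simply multiplies the spatial size by its stride, which is how the 7x7 seed grows to 28x28. A tiny helper (my own, not part of Keras) makes the arithmetic explicit:

```python
def deconv_same_out(size, stride):
    """Output size of Conv2DTranspose with padding='same': size * stride."""
    return size * stride

# The generator's spatial path: 7x7 -> 7x7 -> 14x14 -> 28x28
sizes = [7]
for stride in (1, 2, 2):
    sizes.append(deconv_same_out(sizes[-1], stride))
print(sizes)  # [7, 7, 14, 28]
```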

Step 5: Build the Discriminator Model

The discriminator will classify images as either real (from the dataset) or fake (generated by the generator). It’s a simple CNN.

def build_discriminator():
    model = tf.keras.Sequential([
        # 28x28 -> 14x14
        layers.Conv2D(64, 5, strides=2, padding='same', input_shape=[28, 28, 1]),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        # 14x14 -> 7x7
        layers.Conv2D(128, 5, strides=2, padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(1)  # Raw logit: positive = real, negative = fake
    ])
    return model

discriminator = build_discriminator()

Step 6: Define the Loss Functions

We’ll use binary cross-entropy, the standard loss for the original GAN formulation. Because the discriminator’s final Dense(1) layer outputs a raw logit rather than a probability, we set from_logits=True.

# Binary cross-entropy loss
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
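With from_logits=True, the loss applies a sigmoid internally before taking the negative log-likelihood. As a sanity check, here is the same computation hand-rolled in plain Python (an illustration, not the Keras implementation):

```python
import math

def bce_from_logits(label, logit):
    """Binary cross-entropy on a raw logit: sigmoid, then -log likelihood."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# A logit of 0 means the discriminator is maximally unsure (p = 0.5),
# so the loss is ln 2 regardless of the label.
print(round(bce_from_logits(1.0, 0.0), 4))  # 0.6931
```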

Step 7: Define the Optimizers

We’ll use Adam optimizers for both networks.

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

Step 8: Define the Training Loop

Here, we define the training loop, where the discriminator is trained to distinguish between real and fake images, and the generator is trained to fool the discriminator.

@tf.function
def train_step(real_images):
    noise = tf.random.normal([BATCH_SIZE, 100])

    # Train discriminator: real images -> 1, fake images -> 0
    with tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output) +
                     cross_entropy(tf.zeros_like(fake_output), fake_output)) / 2
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    # Train generator: try to fool the discriminator (real labels for fake images)
    with tf.GradientTape() as gen_tape:
        # The generator call must happen inside this tape, otherwise
        # no gradients flow back to the generator's weights
        fake_images = generator(noise, training=True)
        fake_output = discriminator(fake_images, training=True)
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))

    return disc_loss, gen_loss

Step 9: Train the GAN

We train the model for a specified number of epochs, saving a grid of generated images periodically to monitor progress.

EPOCHS = 50

def train():
    for epoch in range(EPOCHS):
        for real_images in train_dataset:
            disc_loss, gen_loss = train_step(real_images)

        print(f"Epoch {epoch+1}, Disc Loss: {disc_loss.numpy()}, Gen Loss: {gen_loss.numpy()}")

        # Generate and save images every 10 epochs
        if (epoch + 1) % 10 == 0:
            generate_and_save_images(epoch + 1)

def generate_and_save_images(epoch):
    noise = tf.random.normal([16, 100])
    generated_images = generator(noise, training=False)
    generated_images = (generated_images + 1) / 2  # Rescale to [0, 1]
    plt.figure(figsize=(4, 4))
    for i in range(16):
        plt.subplot(4, 4, i + 1)
        plt.imshow(generated_images[i, :, :, 0], cmap='gray')
        plt.axis('off')
    plt.savefig(f'epoch_{epoch}.png')
    plt.close()

train()

Step 10: Results

After training, you should see the generator progressively improve at producing realistic handwritten digits. Every 10 epochs a grid of generated images is saved, so you can watch the progression from blurry noise to structured, recognizable digits.

Next Steps:

Improve the Architecture: You can try using more advanced techniques like deeper networks, conditional GANs (cGANs), or adding additional features like attention mechanisms.

Train on Larger Datasets: Try training the GAN on more complex datasets like CIFAR-10, CelebA, or LSUN.

Experiment with Different Loss Functions: The Wasserstein GAN (WGAN) is a popular alternative that uses a different loss function.
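For a taste of the WGAN direction, its critic (the WGAN analogue of the discriminator) drops cross-entropy entirely: it is trained to widen the gap between its mean scores on real and fake samples. A schematic version of the two losses, with made-up score lists for illustration:

```python
def wgan_critic_loss(real_scores, fake_scores):
    """WGAN critic loss (to minimize): mean fake score minus mean real score."""
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

def wgan_generator_loss(fake_scores):
    """Generator wants the critic to score its fakes highly."""
    return -sum(fake_scores) / len(fake_scores)

# Made-up critic scores: reals rated higher than fakes, so the critic loss is low
print(wgan_critic_loss([2.0, 3.0], [-1.0, 0.0]))  # -3.0
print(wgan_generator_loss([-1.0, 0.0]))           # 0.5
```

Note that a full WGAN also constrains the critic to be Lipschitz (via weight clipping or a gradient penalty); this sketch shows only the loss shape.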

That’s it! You’ve built your first GAN. The key takeaway is that GANs are a powerful technique for generating realistic data, and with practice you can move on to more complex models.

