Reinforcement Learning: Q-Learning, Deep Q Networks
Reinforcement Learning (RL) is a trial-and-error learning approach where an agent interacts with an environment to maximize cumulative rewards over time.
1. What is Q-Learning?
Q-Learning is a model-free, off-policy RL algorithm used to learn the best actions to take in a given state.
It uses a Q-table (a lookup table of state-action values) to store rewards for actions in each state.
Q-Value (Bellman Equation)
Q(s, a) ← Q(s, a) + α [ R + γ max_a′ Q(s′, a′) − Q(s, a) ]
Where:
- Q(s , a) → Q-value of taking action a in state s
- α → Learning rate
- R → Reward received
- γ → Discount factor (importance of future rewards)
- s′ → Next state
- a′ → Action in the next state
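For intuition, here is the update computed once with made-up numbers (a minimal sketch assuming α = 0.1, γ = 0.9, R = 1, current Q(s, a) = 0.5, and best next-state value 0.8):
q_sa, alpha, gamma, reward, max_next_q = 0.5, 0.1, 0.9, 1.0, 0.8
q_sa = q_sa + alpha * (reward + gamma * max_next_q - q_sa)
print(q_sa)  # 0.5 + 0.1 * (1.0 + 0.72 - 0.5) = 0.622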
Key Features of Q-Learning:
- Works well in discrete environments (e.g., Gridworld, Maze)
- No need for a model of the environment (model-free)
- Uses a Q-table to store optimal values
2. Implement Q-Learning in Python (Using Gym & NumPy)
We’ll train an agent to navigate FrozenLake, a simple gridworld environment.
Install Dependencies:
pip install numpy gym
Q-Learning Code:
import numpy as np
import gym

# Create FrozenLake environment
env = gym.make("FrozenLake-v1", is_slippery=False)  # Set `is_slippery=True` for harder mode

# Initialize Q-table (state-action values)
q_table = np.zeros((env.observation_space.n, env.action_space.n))

# Hyperparameters
alpha = 0.1            # Learning rate
gamma = 0.9            # Discount factor
epsilon = 1.0          # Exploration rate (initial)
epsilon_decay = 0.995  # Decay rate for exploration
num_episodes = 5000

# Q-Learning Algorithm
for episode in range(num_episodes):
    state = env.reset()[0]
    done = False
    while not done:
        # Choose action using ε-greedy policy
        if np.random.uniform(0, 1) < epsilon:
            action = env.action_space.sample()  # Explore
        else:
            action = np.argmax(q_table[state, :])  # Exploit

        # Take action, observe reward and next state
        next_state, reward, done, _, _ = env.step(action)

        # Update Q-value using Bellman equation
        q_table[state, action] = q_table[state, action] + alpha * (
            reward + gamma * np.max(q_table[next_state, :]) - q_table[state, action]
        )
        state = next_state  # Move to next state

    # Reduce exploration rate
    epsilon = max(0.01, epsilon * epsilon_decay)

# Print final Q-table
print("Final Q-Table:")
print(q_table)
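As a quick sanity check after training, you can roll out the greedy policy from the learned Q-table. This evaluation snippet is a minimal sketch that reuses the `env` and `q_table` variables from the code above:
# Evaluate the greedy policy (no exploration)
state = env.reset()[0]
done = False
total_reward = 0
while not done:
    action = np.argmax(q_table[state, :])  # Always exploit
    state, reward, done, _, _ = env.step(action)
    total_reward += reward
print(f"Greedy-policy reward: {total_reward}")  # 1.0 means the agent reached the goal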
Key Takeaways from Q-Learning:
- Simple and effective for discrete environments
- Stores Q-values in a table (scales poorly for large spaces)
- Does not handle complex environments (e.g., images, continuous spaces)
3. Deep Q Networks (DQN) – Using Neural Networks
Why Use Deep Q Networks (DQN)?
Tabular Q-learning struggles in large state spaces, where the Q-table becomes impractically large to store and update.
DQN replaces Q-tables with a Deep Neural Network (DNN) to approximate Q-values.
How Does DQN Work?
- Neural Network → Predicts Q-values for each action in a state.
- Experience Replay → Stores past experiences and reuses them to stabilize training.
- Target Network → Helps reduce instability in Q-value updates.
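To make the experience-replay idea concrete before the full implementation, here is a minimal sketch of a replay buffer (the class and method names are illustrative, not from a specific library):
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer that stores transitions and samples random mini-batches."""
    def __init__(self, capacity=2000):
        self.buffer = deque(maxlen=capacity)  # Oldest transitions are dropped automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive transitions
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)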
4. Implement Deep Q-Learning with DQN (Using PyTorch & Gym)
Install Dependencies:
pip install torch torchvision gym numpy
DQN Code for CartPole Environment:
import gym
import torch
import torch.nn as nn
import torch.optim as optim
import random
import numpy as np
from collections import deque

# Create CartPole environment
env = gym.make("CartPole-v1")

# Define Q-Network (Neural Network)
class DQN(nn.Module):
    def __init__(self, state_size, action_size):
        super(DQN, self).__init__()
        self.fc1 = nn.Linear(state_size, 24)
        self.fc2 = nn.Linear(24, 24)
        self.fc3 = nn.Linear(24, action_size)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

# Hyperparameters
state_size = env.observation_space.shape[0]
action_size = env.action_space.n
gamma = 0.99            # Discount factor
learning_rate = 0.001
epsilon = 1.0           # Exploration rate
epsilon_min = 0.01
epsilon_decay = 0.995
batch_size = 32
memory = deque(maxlen=2000)

# Initialize networks
policy_net = DQN(state_size, action_size)
target_net = DQN(state_size, action_size)
target_net.load_state_dict(policy_net.state_dict())  # Initialize target network
optimizer = optim.Adam(policy_net.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

# Function to select action
def select_action(state):
    global epsilon
    if np.random.rand() <= epsilon:
        return random.randrange(action_size)  # Explore
    state_tensor = torch.FloatTensor(state)  # `state` already has shape (1, state_size)
    with torch.no_grad():
        q_values = policy_net(state_tensor)
    return int(torch.argmax(q_values))  # Exploit

# Train the DQN
num_episodes = 500
for episode in range(num_episodes):
    state = env.reset()[0]
    state = np.reshape(state, [1, state_size])
    for time_step in range(200):  # Limit steps per episode
        action = select_action(state)
        next_state, reward, done, _, _ = env.step(action)
        next_state = np.reshape(next_state, [1, state_size])

        # Store experience
        memory.append((state, action, reward, next_state, done))
        state = next_state  # Move to next state
        if done:
            break

    # Reduce exploration rate
    epsilon = max(epsilon_min, epsilon * epsilon_decay)

    # Experience Replay
    if len(memory) > batch_size:
        batch = random.sample(memory, batch_size)
        for b_state, b_action, b_reward, b_next_state, b_done in batch:
            target = b_reward
            if not b_done:
                with torch.no_grad():
                    target += gamma * torch.max(target_net(torch.FloatTensor(b_next_state))).item()
            state_tensor = torch.FloatTensor(b_state)
            # Target vector: current predictions with the taken action replaced by the TD target
            target_q = policy_net(state_tensor).detach().clone()
            target_q[0][b_action] = target
            optimizer.zero_grad()
            loss = criterion(policy_net(state_tensor), target_q)
            loss.backward()
            optimizer.step()

    # Update target network every few episodes
    if episode % 10 == 0:
        target_net.load_state_dict(policy_net.state_dict())

    print(f"Episode {episode}: Reward = {time_step}")

print("Training complete!")
Key Takeaways from DQN:
- Uses a neural network to estimate Q-values (no need for a Q-table).
- Can handle high-dimensional states (like images, game states).
- Uses experience replay & target network for stable learning.
Next Steps:
- Want to train a DQN for a custom game (Atari, robotics, etc.)?
- Need policy-based methods like PPO, A3C for continuous actions?
- Interested in deploying an RL agent to the cloud?
AI in Robotics & Autonomous Systems
Artificial Intelligence (AI) is transforming robotics and autonomous systems, enabling machines to sense, learn, and make decisions without human intervention. AI-powered robots can perceive their environment, plan actions, and execute tasks autonomously in fields like healthcare, industry, transportation, and more.
1. Key AI Techniques in Robotics
AI enables robots to sense, think, and act using the following technologies:
AI Technique | Role in Robotics |
---|---|
Computer Vision | Object detection, scene understanding |
Reinforcement Learning (RL) | Training robots through trial & error |
Path Planning (A* Search, Dijkstra) | Navigating dynamic environments |
Simultaneous Localization and Mapping (SLAM) | Real-time map-building and localization |
Natural Language Processing (NLP) | Voice-controlled robots (e.g., Alexa, Siri) |
Deep Learning (CNNs, RNNs, Transformers) | Advanced perception and decision-making |
2. AI Applications in Robotics & Autonomous Systems
1. Autonomous Vehicles (Self-Driving Cars):
AI enables cars to perceive surroundings, predict motion, and navigate safely.
- Computer Vision → Detects lanes, signs, pedestrians (using CNNs like YOLO, Faster R-CNN).
- Sensor Fusion → Combines LIDAR, cameras, and radar for accurate perception.
- Reinforcement Learning → Helps in autonomous decision-making.
- Path Planning → Uses A* Search, Dijkstra, RRT for motion planning (see the sketch after this list).
Example: Tesla Autopilot, Waymo, Mobileye.
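To make the path-planning entry concrete, here is a minimal A* sketch on a toy occupancy grid; the grid, start, and goal values are made up for illustration:
import heapq

def a_star(grid, start, goal):
    """A* search on a 2D grid; 0 = free cell, 1 = obstacle."""
    def h(cell):
        # Manhattan-distance heuristic (admissible on a 4-connected grid)
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_set = [(h(start), 0, start, [start])]  # (f-score, g-score, cell, path)
    visited = set()
    while open_set:
        f, g, cell, path = heapq.heappop(open_set)
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        x, y = cell
        for nx, ny in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            if 0 <= nx < len(grid) and 0 <= ny < len(grid[0]) and grid[nx][ny] == 0:
                heapq.heappush(open_set, (g + 1 + h((nx, ny)), g + 1, (nx, ny), path + [(nx, ny)]))
    return None  # No path found

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))  # Route around the obstacles in row 1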
2. Industrial Robotics & Automation:
AI-powered robots are used for assembly, quality inspection, and warehouse automation.
- Computer Vision → Detects defects and ensures precision.
- Reinforcement Learning (RL) → Trains robots to adapt to dynamic environments.
- Collaborative Robots (Cobots) → Work alongside humans (e.g., Amazon warehouse robots).
Example: Boston Dynamics’ Spot, Amazon Robotics
3. Humanoid Robots & Social Robots:
Robots with AI-powered NLP and vision can interact with humans.
- Speech Recognition (NLP) → Understands and responds to commands.
- Facial Recognition → Identifies people and emotions.
- Deep Reinforcement Learning → Helps in adaptive behavior learning.
Example: Sophia (Hanson Robotics), Tesla Optimus
4. Drones & UAVs (Unmanned Aerial Vehicles):
AI-controlled drones can perform surveillance, delivery, and disaster response.
- SLAM Algorithms → Help drones navigate unknown areas.
- Reinforcement Learning → Used for flight optimization.
- Computer Vision → Detects obstacles and objects.
Example: DJI Drones, Military Reconnaissance Drones
5. AI in Medical Robotics:
Robots assist in surgery, rehabilitation, and diagnostics.
- AI-powered Surgery Robots → Assist doctors in precision tasks.
- Rehabilitation Robots → Help patients recover from injuries.
- AI-based Diagnostics → Detects diseases using machine learning.
Example: Da Vinci Surgical System, Exoskeletons
3. Hands-on AI Robotics: Build an Autonomous Robot
Install Dependencies:
pip install opencv-python numpy gym torch
Simple Python AI Robot Simulation (Obstacle Avoidance):
import numpy as np
import random

class AI_Robot:
    def __init__(self):
        self.position = [0, 0]  # (x, y)
        self.direction = random.choice(["N", "S", "E", "W"])  # Random start direction

    def perceive_environment(self):
        """Simulate a sensor detecting obstacles"""
        return random.choice([True, False])  # True = obstacle detected, False = no obstacle

    def decide_action(self, obstacle_detected):
        """AI decides to move forward or turn"""
        if obstacle_detected:
            self.direction = random.choice(["N", "S", "E", "W"])  # Random new direction
        else:
            self.move_forward()

    def move_forward(self):
        """Move one step in the current direction"""
        if self.direction == "N":
            self.position[1] += 1
        elif self.direction == "S":
            self.position[1] -= 1
        elif self.direction == "E":
            self.position[0] += 1
        elif self.direction == "W":
            self.position[0] -= 1

    def run(self, steps=10):
        """Run the AI robot for a few steps"""
        for _ in range(steps):
            obstacle_detected = self.perceive_environment()
            self.decide_action(obstacle_detected)
            print(f"Position: {self.position}, Direction: {self.direction}")

# Run AI Robot Simulation
robot = AI_Robot()
robot.run()
Key Features:
- Perception → Uses a simulated sensor.
- Decision-making → Decides to move or turn.
- Action Execution → Moves forward or changes direction.
4. Future of AI in Robotics
- Swarm Robotics → Robots working together in groups (e.g., warehouse automation).
- Edge AI in Robotics → Running AI models directly on robots without cloud dependency.
- AI-powered Exoskeletons → Helping disabled people walk again.
- Fully Autonomous AI Robots → Learning from experiences without human intervention.
Next Steps:
- Want to train a robot using Reinforcement Learning (Deep Q-Networks, PPO, A3C)?
- Need simulation environments (ROS, Gazebo, Unity ML-Agents)?
- Want to build a real-world AI robot using Raspberry Pi or Jetson Nano?
Generative AI: GANs (Generative Adversarial Networks)
Generative Adversarial Networks (GANs) are a type of deep learning model that can generate new, realistic data (e.g., images, music, text) by learning from an existing dataset. They consist of two neural networks—a Generator and a Discriminator—that compete against each other in a process called adversarial training.
1. How GANs Work:
GAN Architecture
A GAN consists of two main components.
Component | Role |
---|---|
Generator (G) | Learns to create fake data similar to the real data |
Discriminator (D) | Tries to distinguish real data from fake data |
Training Process (Adversarial Learning):
- Generator creates fake data from random noise.
- Discriminator evaluates the data and classifies it as real or fake.
- Generator improves by learning to trick the Discriminator.
- Discriminator improves by getting better at detecting fakes.
- This process continues until the Generator produces data indistinguishable from real data.
2. GAN Loss Function (Minimax Game):
GANs train using a minimax loss function, where the Generator tries to minimize the Discriminator’s ability to distinguish real from fake data, while the Discriminator tries to maximize its classification accuracy.
min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]
Where:
- G(z) is the fake data generated from random noise z.
- D(x) is the probability that real data x is classified as real.
3. Hands-on: Implement a GAN to Generate Fake Images
We’ll implement a simple GAN using PyTorch to generate handwritten digits (MNIST dataset).
Install Dependencies:
pip install torch torchvision matplotlib numpy
GAN Code in PyTorch:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt

# Load MNIST dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
dataloader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True),
    batch_size=64, shuffle=True
)

# Define Generator Network
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(100, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 784),
            nn.Tanh()  # Output normalized between -1 and 1
        )

    def forward(self, x):
        return self.model(x).view(-1, 1, 28, 28)  # Reshape to image format

# Define Discriminator Network
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x.view(-1, 784))  # Flatten image

# Initialize models
generator = Generator()
discriminator = Discriminator()

# Loss function and optimizers
criterion = nn.BCELoss()  # Binary Cross Entropy Loss
optimizer_G = optim.Adam(generator.parameters(), lr=0.0002)
optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0002)

# Training the GAN
epochs = 50
for epoch in range(epochs):
    for real_images, _ in dataloader:
        batch_size = real_images.shape[0]

        # Train Discriminator
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)
        optimizer_D.zero_grad()
        outputs = discriminator(real_images)
        loss_real = criterion(outputs, real_labels)

        noise = torch.randn(batch_size, 100)
        fake_images = generator(noise)
        outputs = discriminator(fake_images.detach())  # Detach to avoid training G
        loss_fake = criterion(outputs, fake_labels)

        loss_D = loss_real + loss_fake
        loss_D.backward()
        optimizer_D.step()

        # Train Generator
        optimizer_G.zero_grad()
        outputs = discriminator(fake_images)
        loss_G = criterion(outputs, real_labels)  # Fool the Discriminator
        loss_G.backward()
        optimizer_G.step()

    print(f"Epoch [{epoch+1}/{epochs}] Loss D: {loss_D.item():.4f}, Loss G: {loss_G.item():.4f}")

# Generate and plot fake images
noise = torch.randn(16, 100)
fake_images = generator(noise).detach()
plt.figure(figsize=(4, 4))
for i in range(16):
    plt.subplot(4, 4, i + 1)
    plt.imshow(fake_images[i].squeeze(), cmap='gray')
    plt.axis('off')
plt.show()
Key Features:
- Trains a GAN on MNIST dataset
- Generator creates realistic digits
- Discriminator learns to detect fakes
- Uses Binary Cross Entropy (BCE) loss
4. Variants of GANs
Advanced GAN Architectures:
Type | Purpose |
---|---|
DCGAN (Deep Convolutional GAN) | Uses CNNs for image generation (better quality) |
WGAN (Wasserstein GAN) | Improves stability using Wasserstein distance |
StyleGAN | Generates high-resolution face images |
CycleGAN | Translates images between different domains (e.g., horses → zebras) |
BigGAN | Generates large, high-quality images |
5. Real-World Applications of GANs
- Image Generation → AI-generated faces, fashion designs
- Data Augmentation → Creating synthetic medical images
- Super-Resolution → Enhancing low-quality images (e.g., SRGAN)
- AI Art & Creativity → Generating unique paintings, music, and video
- Deepfake Technology → Creating realistic videos of people
Next Steps:
- Want to train a GAN for face generation (StyleGAN, ProGAN)?
- Need to apply GANs for image-to-image translation (CycleGAN)?
- Interested in using GANs for super-resolution or text-to-image generation?
Ethical AI & Bias in Machine Learning
As AI becomes more integrated into society, ethical concerns like bias, fairness, transparency, and accountability have gained significant attention. Biased AI systems can lead to unfair treatment in hiring, lending, healthcare, and law enforcement. Understanding and mitigating bias is crucial for responsible AI development.
1. Understanding Bias in AI
Bias in machine learning refers to systematic errors that lead to unfair or inaccurate predictions. It can arise from several sources:
Types of Bias:
Type | Description | Example |
---|---|---|
Historical Bias | Data reflects existing inequalities. | AI trained on biased hiring data prefers men over women. |
Sampling Bias | Training data does not represent the entire population. | Facial recognition models work poorly for darker skin tones. |
Label Bias | Labels are influenced by subjective human judgments. | Sentiment analysis trained on biased reviews misinterprets certain phrases. |
Algorithmic Bias | Model amplifies existing biases in data. | Predictive policing models unfairly target certain communities. |
2. Case Studies of AI Bias
- Amazon’s Hiring Algorithm → AI favored male candidates because training data was based on past hiring, which favored men.
- COMPAS Algorithm (Criminal Justice) → Predicted higher recidivism rates for Black individuals than White individuals.
- Facial Recognition Bias → Some AI systems misidentified people of color at much higher rates, leading to wrongful arrests.
3. Strategies to Mitigate Bias
Ethical AI Practices:
- Diverse & Representative Data → Ensure training data reflects all demographics.
- Bias Audits & Fairness Metrics → Regularly test AI for biased outputs.
- Explainability & Transparency → Use interpretable models to understand decisions.
- Human-in-the-Loop → Include human oversight in sensitive AI applications.
- Fair AI Algorithms → Use techniques like re-weighting training samples to balance biases (see the sketch after this list).
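As a small illustration of the re-weighting idea, here is a minimal sketch using scikit-learn's `sample_weight`; the feature values, labels, and group assignments are made up for illustration:
from sklearn.linear_model import LogisticRegression
import numpy as np

# Toy data: one feature; `group` marks a protected attribute (0 = one group, 1 = the other)
X = np.array([[0.2], [0.4], [0.6], [0.8], [0.1], [0.9]])
y = np.array([0, 0, 1, 1, 0, 1])
group = np.array([0, 0, 0, 0, 1, 1])  # Group 1 is under-represented here

# Weight each sample inversely to its group's frequency so both groups
# contribute equally to the training loss
group_counts = np.bincount(group)
weights = 1.0 / group_counts[group]

model = LogisticRegression()
model.fit(X, y, sample_weight=weights)  # scikit-learn accepts per-sample weights
print(model.predict(X))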
4. Hands-on Practice: Detecting Bias in AI Models
Let’s check for bias in a dataset using Python.
Install Dependencies:
pip install pandas numpy seaborn scikit-learn aif360
Bias Detection in a Dataset:
import pandas as pd
import seaborn as sns
from aif360.datasets import StandardDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Load dataset
data = pd.read_csv("adult.csv")  # Example: US Census Income dataset

# Convert to AI Fairness 360 format
dataset = StandardDataset(
    df=data,
    label_name="income",  # Target variable
    favorable_classes=[">50K"],
    protected_attribute_names=["gender"],  # Check for gender bias
    privileged_classes=[["Male"]]
)

# Compute Bias Metrics
metric = BinaryLabelDatasetMetric(dataset, privileged_groups=[{"gender": 1}], unprivileged_groups=[{"gender": 0}])
print(f"Disparate Impact: {metric.disparate_impact()}")
Disparate Impact measures whether one group receives favorable outcomes at a different rate than another; a value far from 1.0 indicates bias.
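Concretely, disparate impact is the ratio of favorable-outcome rates: DI = P(favorable | unprivileged) / P(favorable | privileged). You can also compute it by hand with pandas; this sketch assumes the `data` frame from above has `gender` and `income` columns with the value strings shown:
# Favorable-outcome rate per group (column values are assumptions about the CSV)
rate_female = (data[data["gender"] == "Female"]["income"] == ">50K").mean()
rate_male = (data[data["gender"] == "Male"]["income"] == ">50K").mean()
print(f"Disparate Impact (manual): {rate_female / rate_male:.3f}")  # Far from 1.0 suggests bias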
5. Ethical AI Principles & Regulations
Global AI Ethics Guidelines:
- EU AI Act → Regulates high-risk AI applications.
- IEEE Ethically Aligned Design → AI must be human-centered.
- Fairness in AI (FAIR Principles) → Promotes transparency and accountability.
Responsible AI Development Means:
- Avoiding harmful biases
- Ensuring fairness & equity
- Making AI models explainable & accountable
Next Steps:
Would you like to explore:
- Bias mitigation techniques (e.g., re-weighting, adversarial debiasing)?
- Fair AI algorithms & responsible AI development practices?
Hands-on Practice:
Implement AI for stock price prediction
Stock price prediction is a challenging problem in finance, often tackled using Machine Learning (ML) and Deep Learning (DL) techniques. In this project, we’ll use LSTMs (Long Short-Term Memory Networks)—a type of Recurrent Neural Network (RNN)—to forecast stock prices based on historical data.
1. Steps for Stock Price Prediction:
- Data Collection → Get stock price data from Yahoo Finance or other sources.
- Data Preprocessing → Normalize data and create training sequences.
- Model Building → Use LSTM to capture trends in time-series data.
- Model Training → Train the LSTM on historical stock prices.
- Prediction & Evaluation → Forecast future stock prices and visualize results.
2. Install Dependencies:
pip install numpy pandas matplotlib yfinance scikit-learn tensorflow
3. Implement Stock Price Prediction with LSTM:
import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Step 1: Download Stock Data (e.g., Apple - AAPL)
stock_symbol = "AAPL"
stock_data = yf.download(stock_symbol, start="2015-01-01", end="2024-01-01")
stock_data = stock_data[['Close']]  # Use only closing prices

# Step 2: Normalize Data
scaler = MinMaxScaler(feature_range=(0, 1))
stock_data_scaled = scaler.fit_transform(stock_data)

# Step 3: Create Training Data
def create_sequences(data, seq_length):
    X, y = [], []
    for i in range(len(data) - seq_length - 1):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

seq_length = 60  # Use last 60 days to predict the next day
X, y = create_sequences(stock_data_scaled, seq_length)

# Split data into training and testing sets
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Step 4: Build LSTM Model
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(seq_length, 1)),
    LSTM(50, return_sequences=False),
    Dense(25, activation='relu'),
    Dense(1)
])

# Step 5: Compile and Train Model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=20, batch_size=32)

# Step 6: Make Predictions
predictions = model.predict(X_test)
predictions = scaler.inverse_transform(predictions)  # Convert back to original scale

# Step 7: Visualize Results
plt.figure(figsize=(12, 6))
plt.plot(stock_data.index[split+seq_length+1:], stock_data.iloc[split+seq_length+1:], label="Actual Price")
plt.plot(stock_data.index[split+seq_length+1:], predictions, label="Predicted Price", linestyle="dashed")
plt.legend()
plt.title(f"{stock_symbol} Stock Price Prediction (LSTM)")
plt.show()
4. Explanation of the Code:
- Data Collection: Retrieves historical stock data from Yahoo Finance.
- Data Preprocessing: Normalizes the data and creates sequences.
- LSTM Model: Uses deep learning to learn patterns in stock prices.
- Training: Model learns trends from past prices.
- Prediction & Visualization: Compares predicted vs. actual stock prices.
5. Next Steps:
- Try different stocks (e.g., TSLA, MSFT, BTC-USD for Bitcoin).
- Improve accuracy with hyperparameter tuning.
- Use Technical Indicators (RSI, MACD, Moving Averages) as features.
- Experiment with Transformer models (e.g., GPT, Time-Series BERT) for stock forecasting.
Train a GAN to Generate Synthetic Images
Generative Adversarial Networks (GANs) are powerful deep learning models used for generating synthetic images. In this hands-on guide, we’ll train a simple GAN to generate handwritten digits (MNIST dataset) using PyTorch.
1. How GANs Work
GANs consist of two neural networks:
Component | Role |
---|---|
Generator (G) | Generates fake images from random noise. |
Discriminator (D) | Distinguishes real images from fake images. |
Adversarial Learning:
- Generator creates fake images from random noise.
- Discriminator classifies images as real or fake.
- Generator improves to fool the Discriminator.
- Discriminator improves to become better at distinguishing fake from real.
- This process continues until the Generator produces realistic images.
2. Install Dependencies
pip install torch torchvision matplotlib numpy
3. Implement a Simple GAN with PyTorch
We will train a GAN on the MNIST dataset (handwritten digits).
Step 1: Import Libraries:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
Step 2: Load the MNIST Dataset:
# Load dataset and normalize images
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # Normalize between -1 and 1
])

batch_size = 64
dataloader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True),
    batch_size=batch_size, shuffle=True
)
Step 3: Define the Generator and Discriminator Networks
# Define the Generator
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(100, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 784),
            nn.Tanh()  # Output values between -1 and 1
        )

    def forward(self, x):
        return self.model(x).view(-1, 1, 28, 28)  # Reshape to image format

# Define the Discriminator
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
            nn.Sigmoid()  # Output probability of real vs. fake
        )

    def forward(self, x):
        return self.model(x.view(-1, 784))  # Flatten image
Step 4: Initialize Models and Optimizers
generator = Generator()
discriminator = Discriminator()

criterion = nn.BCELoss()  # Binary Cross Entropy Loss
optimizer_G = optim.Adam(generator.parameters(), lr=0.0002)
optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0002)
Step 5: Train the GAN
# Training Loop
epochs = 50
for epoch in range(epochs):
    for real_images, _ in dataloader:
        batch_size = real_images.shape[0]

        # Real and Fake labels
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # Train Discriminator
        optimizer_D.zero_grad()
        outputs = discriminator(real_images)
        loss_real = criterion(outputs, real_labels)

        noise = torch.randn(batch_size, 100)
        fake_images = generator(noise)
        outputs = discriminator(fake_images.detach())  # Detach to avoid training G
        loss_fake = criterion(outputs, fake_labels)

        loss_D = loss_real + loss_fake
        loss_D.backward()
        optimizer_D.step()

        # Train Generator
        optimizer_G.zero_grad()
        outputs = discriminator(fake_images)
        loss_G = criterion(outputs, real_labels)  # Fool the Discriminator
        loss_G.backward()
        optimizer_G.step()

    print(f"Epoch [{epoch+1}/{epochs}] Loss D: {loss_D.item():.4f}, Loss G: {loss_G.item():.4f}")
Step 6: Generate and Visualize Fake Images
# Generate Fake Images
noise = torch.randn(16, 100)
fake_images = generator(noise).detach()

# Plot Fake Images
plt.figure(figsize=(4, 4))
for i in range(16):
    plt.subplot(4, 4, i + 1)
    plt.imshow(fake_images[i].squeeze(), cmap='gray')
    plt.axis('off')
plt.show()
4. Explanation of the Code
- Data Collection: Loads the MNIST dataset.
- Generator: Creates synthetic images from random noise.
- Discriminator: Distinguishes real vs. fake images.
- Loss Function: Uses Binary Cross-Entropy (BCE) for training.
- Training: The GAN learns to generate realistic handwritten digits.
- Visualization: Displays generated images.
5. Improve the GAN Model
- Train on more complex datasets (e.g., CIFAR-10, CelebA for faces).
- Use Convolutional Layers (DCGAN) instead of simple feedforward networks (a minimal sketch follows this list).
- Improve Training Stability with WGAN (Wasserstein GAN).
- Generate Higher-Resolution Images with StyleGAN or BigGAN.
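As a starting point for the DCGAN direction, here is a minimal sketch of a convolutional generator for 28×28 MNIST-style images. The layer sizes and channel counts are illustrative choices, not taken from a specific paper:
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """Convolutional generator: upsamples a 100-d noise vector to a 1x28x28 image."""
    def __init__(self):
        super(DCGANGenerator, self).__init__()
        self.model = nn.Sequential(
            # Project noise to a 128-channel 7x7 feature map
            nn.ConvTranspose2d(100, 128, kernel_size=7, stride=1, padding=0),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            # Upsample 7x7 -> 14x14
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            # Upsample 14x14 -> 28x28
            nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),
            nn.Tanh()  # Match the [-1, 1] normalization used above
        )

    def forward(self, z):
        return self.model(z.view(-1, 100, 1, 1))  # Treat noise as a 100-channel 1x1 "image"

# Quick shape check: this generator drops into the training loop above
g = DCGANGenerator()
print(g(torch.randn(2, 100)).shape)  # torch.Size([2, 1, 28, 28])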
Next Steps:
Would you like to:
- Train a DCGAN with convolutional layers?
- Generate realistic human faces using StyleGAN?
- Implement image-to-image translation (CycleGAN)?