
Pytorch Tutorial 1 Rev 1

This document provides an overview of PyTorch, an open source machine learning framework. It discusses prerequisites for understanding PyTorch, what PyTorch is, and how to perform common tasks like defining neural networks, loading data, calculating loss, and training models. Key PyTorch concepts covered include tensors, neural network modules, loss functions, optimization algorithms, and using GPUs. The document compares PyTorch and NumPy for tensor operations and outlines the basic process for training and testing neural networks in PyTorch.

Uploaded by

zhangchelsea9
Copyright: © All Rights Reserved

Machine Learning

Pytorch Tutorial
TA : 蕭淇元
2023.02.20
mlta-2023-spring@googlegroups.com
Outline
● Background: Prerequisites & What is Pytorch?
● Training & Testing Neural Networks in Pytorch
● Dataset & Dataloader
● Tensors
● torch.nn: Models, Loss Functions
● torch.optim: Optimization
● Save/load models
Prerequisites
● We assume you are already familiar with…
1. Python3
■ if-else, loop, function, file IO, class, ...
■ refs: link1, link2, link3
2. Deep Learning Basics
■ Prof. Lee’s 1st & 2nd lecture videos from last year
■ ref: link1, link2

Some knowledge of NumPy will also be useful!


What is PyTorch?
● A machine learning framework in Python.
● Two main features:
○ N-dimensional Tensor computation (like NumPy) on GPUs
○ Automatic differentiation for training deep neural networks
Training Neural Networks

Define Neural Network → Loss Function → Optimization Algorithm → Training

More info about the training process in last year's lecture video.
Training & Testing Neural Networks

Training → Validation → Testing

Guide for training/validation/testing can be found here.


Training & Testing Neural Networks – in Pytorch
Step 1. Load Data: torch.utils.data.Dataset & torch.utils.data.DataLoader


Dataset & Dataloader
● Dataset: stores data samples and expected values
● Dataloader: groups data in batches, enables multiprocessing

● dataset = MyDataset(file)
● dataloader = DataLoader(dataset, batch_size, shuffle=True)

shuffle=True for training, shuffle=False for testing

More info about batches and shuffling here.


Dataset & Dataloader
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, file):
        # Read data & preprocess
        self.data = ...

    def __getitem__(self, index):
        # Returns one sample at a time
        return self.data[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.data)
Dataset & Dataloader
dataset = MyDataset(file)

dataloader = DataLoader(dataset, batch_size=5, shuffle=False)

[Diagram: with batch_size=5, the DataLoader calls __getitem__(0) through __getitem__(4) on the Dataset and groups the five returned samples (0 to 4) into one mini-batch.]
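
The Dataset and DataLoader slides above can be tried end to end with a small in-memory dataset. This is a minimal sketch, not the course's dataset: `ToyDataset`, its 12 samples, and its 3 features are assumptions made up for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical dataset: 12 samples, 3 features and 1 target each."""

    def __init__(self, n_samples=12):
        # Stand-in for "read data & preprocess": build the data in memory.
        self.x = torch.arange(n_samples * 3, dtype=torch.float).reshape(n_samples, 3)
        self.y = self.x.sum(dim=1, keepdim=True)

    def __getitem__(self, index):
        # Return one (sample, expected value) pair at a time.
        return self.x[index], self.y[index]

    def __len__(self):
        # Size of the dataset.
        return len(self.x)


dataset = ToyDataset()
dataloader = DataLoader(dataset, batch_size=5, shuffle=False)

# Each iteration yields one mini-batch; the last one holds the 2 leftover samples.
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in dataloader]
print(batch_shapes)
```

With 12 samples and batch_size=5 the loader yields three mini-batches of 5, 5 and 2 samples.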
Tensors
● High-dimensional matrices (arrays)

○ 1-D tensor: e.g. audio
○ 2-D tensor: e.g. black & white images
○ 3-D tensor: e.g. RGB images
Tensors – Shape of Tensors
● Check with .shape

○ 1-D tensor with shape (5, ): dim 0
○ 2-D tensor with shape (3, 5): dim 0, dim 1
○ 3-D tensor with shape (4, 5, 3): dim 0, dim 1, dim 2

Note: dim in PyTorch == axis in NumPy


Tensors – Creating Tensors
● Directly from data (list or numpy.ndarray)

x = torch.tensor([[1, -1], [-1, 1]])
x = torch.from_numpy(np.array([[1, -1], [-1, 1]]))

tensor([[ 1, -1],
        [-1,  1]])

● Tensor of constant zeros & ones (the argument is the shape)

x = torch.zeros([2, 2])

tensor([[0., 0.],
        [0., 0.]])

x = torch.ones([1, 2, 5])

tensor([[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]])
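
As a quick check of the creation functions above, the sketch below builds each tensor and inspects `.dtype` and `.shape`. The variable names are illustrative only. Note that integer input data produces an integer tensor, while float input produces `torch.float32`.

```python
import numpy as np
import torch

a = torch.tensor([[1, -1], [-1, 1]])                 # Python ints -> torch.int64
b = torch.tensor([[1.0, -1.0], [-1.0, 1.0]])         # Python floats -> torch.float32
c = torch.from_numpy(np.array([[1, -1], [-1, 1]]))   # shares memory with the NumPy array
z = torch.zeros([2, 2])                              # shape (2, 2), filled with 0.
o = torch.ones([1, 2, 5])                            # shape (1, 2, 5), filled with 1.

print(a.dtype, b.dtype)
print(c.shape, z.shape, o.shape)
```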
Tensors – Common Operations
Common arithmetic functions are supported, such as:

● Addition: z = x + y
● Subtraction: z = x - y
● Power: y = x.pow(2)
● Summation: y = x.sum()
● Mean: y = x.mean()
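
The arithmetic functions listed above can be checked on a small example. This is a minimal sketch; the input values are chosen only for illustration.

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.ones([2, 2])

added = x + y          # element-wise addition
subtracted = x - y     # element-wise subtraction
squared = x.pow(2)     # element-wise power
total = x.sum()        # sum over all elements (0-dim tensor)
average = x.mean()     # mean over all elements (0-dim tensor)

print(added, subtracted, squared, total, average, sep="\n")
```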
Tensors – Common Operations
● Transpose: transpose two specified dimensions

>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.transpose(0, 1)
>>> x.shape
torch.Size([3, 2])
Tensors – Common Operations
● Squeeze: remove the specified dimension with length = 1

>>> x = torch.zeros([1, 2, 3])
>>> x.shape
torch.Size([1, 2, 3])
>>> x = x.squeeze(0)    # dim = 0
>>> x.shape
torch.Size([2, 3])
Tensors – Common Operations
● Unsqueeze: insert a new dimension of length 1 at the specified position

>>> x = torch.zeros([2, 3])
>>> x.shape
torch.Size([2, 3])
>>> x = x.unsqueeze(1)    # dim = 1
>>> x.shape
torch.Size([2, 1, 3])
Tensors – Common Operations
● Cat: concatenate multiple tensors

>>> x = torch.zeros([2, 1, 3])
>>> y = torch.zeros([2, 3, 3])
>>> z = torch.zeros([2, 2, 3])
>>> w = torch.cat([x, y, z], dim=1)
>>> w.shape
torch.Size([2, 6, 3])

(The sizes along dim=1 add up: 1 + 3 + 2 = 6.)
more operators: https://pytorch.org/docs/stable/tensors.html
Tensors – Data Type
● Using different data types for model and data will cause errors.

Data type                  dtype          tensor
32-bit floating point      torch.float    torch.FloatTensor
64-bit integer (signed)    torch.long     torch.LongTensor

See official documentation for more information on data types.
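
The data type warning above can be reproduced in a few lines. This is a sketch under the assumption that the model is a single `nn.Linear` layer (whose parameters default to `torch.float`) and the data arrives as integers; the names are illustrative.

```python
import torch

layer = torch.nn.Linear(3, 1)        # parameters are torch.float (float32) by default
x_long = torch.tensor([[1, 2, 3]])   # integer data -> torch.long (int64)

try:
    layer(x_long)                    # float32 weights applied to int64 data
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("dtype mismatch:", err)

x_float = x_long.float()             # convert the data to the model's dtype
out = layer(x_float)
print(out.dtype)
```

Converting the data with `.float()` (or `.to(torch.float)`) before the forward pass is usually the simplest fix.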


Tensors – PyTorch vs. NumPy
● Similar attributes

PyTorch      NumPy
x.shape      x.shape
x.dtype      x.dtype

See official documentation for more information on data types.

ref: https://github.com/wkentaro/pytorch-for-numpy-users
Tensors – PyTorch vs. NumPy
● Many functions have the same names as well

PyTorch               NumPy
x.reshape / x.view    x.reshape
x.squeeze()           x.squeeze()
x.unsqueeze(1)        np.expand_dims(x, 1)

ref: https://github.com/wkentaro/pytorch-for-numpy-users
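
The correspondence in the two tables above can be verified side by side. A minimal sketch, assuming a zero-filled array and tensor of shape (2, 3):

```python
import numpy as np
import torch

a = np.zeros((2, 3))      # NumPy array (float64 by default)
t = torch.zeros(2, 3)     # PyTorch tensor (float32 by default)

# Same attribute names on both sides.
print(a.shape, tuple(t.shape))
print(a.dtype, t.dtype)

# Same or closely matching function names.
np_expanded = np.expand_dims(a, 1).shape
pt_expanded = tuple(t.unsqueeze(1).shape)
np_reshaped = a.reshape(3, 2).shape
pt_reshaped = tuple(t.reshape(3, 2).shape)
pt_viewed = tuple(t.view(3, 2).shape)
print(np_expanded, pt_expanded, np_reshaped, pt_reshaped, pt_viewed)
```

One difference worth noticing is the default floating-point type: float64 in NumPy, float32 in PyTorch.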
Tensors – Device
● Tensors & modules are computed on the CPU by default

Use .to() to move tensors to appropriate devices.


● CPU
x = x.to('cpu')
● GPU
x = x.to('cuda')
Tensors – Device (GPU)
● Check if your computer has an NVIDIA GPU

torch.cuda.is_available()

● Multiple GPUs: specify 'cuda:0', 'cuda:1', 'cuda:2', ...

● Why use GPUs?


○ Parallel computing with more cores for arithmetic calculations
○ See What is a GPU and do you need one in deep learning?
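
A common way to combine `torch.cuda.is_available()` with `.to()` is to pick the device once and reuse it. This is a minimal sketch; the `device` variable name is a convention, not something PyTorch requires.

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.zeros([2, 3]).to(device)         # move a tensor
layer = torch.nn.Linear(3, 1).to(device)   # move a module (its parameters)
y = layer(x)                               # inputs and module must be on the same device

print(device, y.device)
```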
Tensors – Gradient Calculation
>>> x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
>>> z = x.pow(2).sum()
>>> z.backward()
>>> x.grad
tensor([[ 2., 0.],
        [-2., 2.]])

See here to learn about gradient calculation.
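
The value of `x.grad` in the example above can be checked by hand. Since `z` is the sum of the squared entries of `x`, each partial derivative is twice the corresponding entry:

```latex
\[
z = \sum_{i}\sum_{j} x_{i,j}^{2},
\qquad
\frac{\partial z}{\partial x_{i,j}} = 2\,x_{i,j}
\]
\[
x = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}
\;\Rightarrow\;
\frac{\partial z}{\partial x} = \begin{bmatrix} 2 & 0 \\ -2 & 2 \end{bmatrix}
\]
```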


Training & Testing Neural Networks – in Pytorch
Step 2. Define Neural Network: torch.nn.Module
torch.nn – Network Layers
● Linear Layer (Fully-connected Layer)

nn.Linear(in_features, out_features)

nn.Linear(32, 64): input tensor of shape (*, 32) → output tensor of shape (*, 64)

* can be any shape (but the last dimension of the input must be 32)

e.g. (10, 32), (10, 5, 32), (1, 1, 3, 32), ...
torch.nn – Network Layers
● Linear Layer (Fully-connected Layer)

ref: last year's lecture video


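torch.nn – Neural Network Layers
● Linear Layer (Fully-connected Layer)

W x + b = y

○ x: input vector (x1, x2, x3, ...) of length 32
○ W: weight matrix of shape (64x32)
○ b: bias vector of length 64
○ y: output vector (y1, y2, y3, ...) of length 64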
torch.nn – Network Parameters
● Linear Layer (Fully-connected Layer)

>>> layer = torch.nn.Linear(32, 64)
>>> layer.weight.shape
torch.Size([64, 32])
>>> layer.bias.shape
torch.Size([64])

W x + b = y, with W of shape (64x32)
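
To connect the parameter shapes above with W x + b = y, the sketch below applies a linear layer to a batch and recomputes the same output by hand. The input shape (10, 5, 32) is just one of the allowed shapes, picked for illustration.

```python
import torch

layer = torch.nn.Linear(32, 64)
x = torch.randn(10, 5, 32)     # any leading shape; the last dimension must be 32

y = layer(x)                   # shape (10, 5, 64)

# Same computation written out: multiply by W transposed, then add b.
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(y, manual, atol=1e-5)

print(y.shape, matches)
```

The weight is stored as (out_features, in_features), which is why the manual version multiplies by its transpose.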
torch.nn – Non-Linear Activation Functions
● Sigmoid Activation

nn.Sigmoid()

● ReLU Activation

nn.ReLU()

See here to learn about why we need activation functions.
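
Both activation modules are applied element-wise. A minimal sketch with hand-picked inputs:

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, 0.0, 2.0])

sigmoid_out = nn.Sigmoid()(x)   # 1 / (1 + exp(-x)), values in (0, 1)
relu_out = nn.ReLU()(x)         # max(0, x)

print(sigmoid_out)
print(relu_out)
```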


torch.nn – Build your own neural network
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        # Initialize your model & define layers
        super(MyModel, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        # Compute output of your NN
        return self.net(x)
torch.nn – Build your own neural network
The two definitions below are equivalent.

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.Sigmoid(),
            nn.Linear(32, 1)
        )

    def forward(self, x):
        return self.net(x)

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
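
One way to convince yourself that the two styles compute the same function is to copy the weights from one into the other and compare outputs. This is a sketch; the class names `SequentialModel` and `LayerwiseModel` are made up here so both versions can coexist in one script.

```python
import torch
import torch.nn as nn


class SequentialModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(10, 32), nn.Sigmoid(), nn.Linear(32, 1))

    def forward(self, x):
        return self.net(x)


class LayerwiseModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 32)
        self.layer2 = nn.Sigmoid()
        self.layer3 = nn.Linear(32, 1)

    def forward(self, x):
        return self.layer3(self.layer2(self.layer1(x)))


a, b = SequentialModel(), LayerwiseModel()

# The two models start with different random weights, so copy a's into b.
b.layer1.load_state_dict(a.net[0].state_dict())
b.layer3.load_state_dict(a.net[2].state_dict())

x = torch.randn(4, 10)
same = torch.allclose(a(x), b(x))
print(same)
```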
Training & Testing Neural Networks – in Pytorch
Step 3. Loss Function: torch.nn.MSELoss, torch.nn.CrossEntropyLoss, etc.
torch.nn – Loss Functions
● Mean Squared Error (for regression tasks)

criterion = nn.MSELoss()

● Cross Entropy (for classification tasks)

criterion = nn.CrossEntropyLoss()

● loss = criterion(model_output, expected_value)
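
The two loss functions expect differently shaped inputs, which the sketch below illustrates with hand-picked numbers. Note that `nn.CrossEntropyLoss` takes raw scores (logits) and integer class indices; it applies log-softmax internally.

```python
import torch
import torch.nn as nn

# Regression: prediction and target have the same shape and float dtype.
mse = nn.MSELoss()
pred = torch.tensor([[1.0], [2.0]])
target = torch.tensor([[0.0], [4.0]])
mse_loss = mse(pred, target)            # mean of (1-0)^2 and (2-4)^2

# Classification: logits of shape (batch, n_classes), labels of shape (batch,).
ce = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
labels = torch.tensor([0, 1])           # class indices, dtype torch.long
ce_loss = ce(logits, labels)

# The same cross entropy written out by hand for comparison.
manual_ce = -torch.log_softmax(logits, dim=1)[torch.arange(2), labels].mean()

print(mse_loss.item(), ce_loss.item(), manual_ce.item())
```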


Training & Testing Neural Networks – in Pytorch
Step 4. Optimization Algorithm: torch.optim
torch.optim
● Gradient-based optimization algorithms that adjust network parameters
to reduce error. (See Adaptive Learning Rate lecture video)

● E.g. Stochastic Gradient Descent (SGD)

torch.optim.SGD(model.parameters(), lr, momentum = 0)


torch.optim
optimizer = torch.optim.SGD(model.parameters(), lr, momentum = 0)

● For every batch of data:


1. Call optimizer.zero_grad() to reset gradients of model parameters.
2. Call loss.backward() to backpropagate gradients of prediction loss.
3. Call optimizer.step() to adjust model parameters.

See official documentation for more optimization algorithms.
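
The three calls above can be traced on a single parameter tensor. This is a minimal sketch: instead of a model it optimizes a hand-made tensor `w`, and the loss (sum of squares) and learning rate 0.1 are assumptions chosen so the arithmetic is easy to follow.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1, momentum=0)

optimizer.zero_grad()      # 1. reset gradients
loss = w.pow(2).sum()      # loss = 1 + 4 = 5
loss.backward()            # 2. backpropagate: w.grad = 2 * w = [2, -4]
grad = w.grad.clone()
optimizer.step()           # 3. update: w <- w - lr * grad = [0.8, -1.6]

print(grad, w)
```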


Training & Testing Neural Networks – in Pytorch

Step 5. Entire Procedure: Load Data → Define Neural Network → Loss Function → Optimization Algorithm → Training → Validation → Testing
Neural Network Training Setup

dataset = MyDataset(file)                               # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)          # put dataset into Dataloader
model = MyModel().to(device)                            # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss()                                # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)    # set optimizer


Neural Network Training Loop

for epoch in range(n_epochs):                # iterate n_epochs
    model.train()                            # set model to train mode
    for x, y in tr_set:                      # iterate through the dataloader
        optimizer.zero_grad()                # set gradient to zero
        x, y = x.to(device), y.to(device)    # move data to device (cpu/cuda)
        pred = model(x)                      # forward pass (compute output)
        loss = criterion(pred, y)            # compute loss
        loss.backward()                      # compute gradient (backpropagation)
        optimizer.step()                     # update model with optimizer


Neural Network Validation Loop

model.eval()                                     # set model to evaluation mode
total_loss = 0
for x, y in dv_set:                              # iterate through the dataloader
    x, y = x.to(device), y.to(device)            # move data to device (cpu/cuda)
    with torch.no_grad():                        # disable gradient calculation
        pred = model(x)                          # forward pass (compute output)
        loss = criterion(pred, y)                # compute loss
    total_loss += loss.cpu().item() * len(x)     # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)      # compute averaged loss


Neural Network Testing Loop

model.eval()                        # set model to evaluation mode
preds = []
for x in tt_set:                    # iterate through the dataloader
    x = x.to(device)                # move data to device (cpu/cuda)
    with torch.no_grad():           # disable gradient calculation
        pred = model(x)             # forward pass (compute output)
        preds.append(pred.cpu())    # collect prediction
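
The setup, training, and validation loops above can be combined into one runnable script. This is a sketch under several assumptions that are not from the slides: the data is synthetic (the target is the sum of 10 random features), `TensorDataset` stands in for `MyDataset`, an `nn.Sequential` stands in for `MyModel`, and the learning rate and epoch count are arbitrary.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Synthetic regression data (assumption): y is the sum of the 10 input features.
x_all = torch.randn(200, 10)
y_all = x_all.sum(dim=1, keepdim=True)
tr_set = DataLoader(TensorDataset(x_all[:160], y_all[:160]), batch_size=16, shuffle=True)
dv_set = DataLoader(TensorDataset(x_all[160:], y_all[160:]), batch_size=16, shuffle=False)

model = nn.Sequential(nn.Linear(10, 32), nn.Sigmoid(), nn.Linear(32, 1)).to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.02)


def validate():
    """Average validation loss, computed as in the validation loop."""
    model.eval()
    total_loss = 0.0
    for x, y in dv_set:
        x, y = x.to(device), y.to(device)
        with torch.no_grad():
            loss = criterion(model(x), y)
        total_loss += loss.item() * len(x)
    return total_loss / len(dv_set.dataset)


loss_before = validate()
for epoch in range(30):
    model.train()
    for x, y in tr_set:
        optimizer.zero_grad()
        x, y = x.to(device), y.to(device)
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
loss_after = validate()

print(f"validation loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Calling `validate()` before and after training gives a quick sanity check that the loop is actually reducing the loss.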


Notice - model.eval(), torch.no_grad()
● model.eval()
Changes behaviour of some model layers, such as dropout and batch
normalization.

● with torch.no_grad()
Prevents calculations from being added into gradient computation graph.
Usually used to prevent accidental training on validation/testing data.
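
Both effects described above can be observed directly. This sketch uses a standalone `nn.Dropout` layer and a hand-made tensor `w`, both chosen only for illustration.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)    # entries are either zeroed or scaled by 1 / (1 - p) = 2

drop.eval()
eval_out = drop(x)     # dropout does nothing in evaluation mode

w = torch.ones(3, requires_grad=True)
with torch.no_grad():
    y = (w * 2).sum()  # not recorded in the gradient computation graph

print(torch.equal(eval_out, x), y.requires_grad)
```

Note that the two are independent: `model.eval()` does not disable gradient tracking, and `torch.no_grad()` does not change layer behaviour.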
Save/Load Trained Models
● Save

torch.save(model.state_dict(), path)

● Load
ckpt = torch.load(path)

model.load_state_dict(ckpt)
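
A round trip through save and load might look like the sketch below. The temporary directory and the file name `model.ckpt` are assumptions for illustration; the model being restored must have the same architecture as the one that was saved.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(10, 1)
path = os.path.join(tempfile.mkdtemp(), "model.ckpt")   # hypothetical file name

# Save only the parameters (the state dict), not the whole model object.
torch.save(model.state_dict(), path)

# Load: build a model with the same architecture, then restore its parameters.
restored = nn.Linear(10, 1)
ckpt = torch.load(path, map_location="cpu")
restored.load_state_dict(ckpt)

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```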
More About PyTorch
● torchaudio
○ speech/audio processing
● torchtext
○ natural language processing
● torchvision
○ computer vision
● skorch
○ scikit-learn + PyTorch
More About PyTorch
● Useful github repositories using PyTorch
○ Huggingface Transformers (transformer models: BERT, GPT, ...)
○ Fairseq (sequence modeling for NLP & speech)
○ ESPnet (speech recognition, translation, synthesis, ...)
○ Most implementations of recent deep learning papers
○ ...
References
● Machine Learning 2022 Spring Pytorch Tutorial
● Official Pytorch Tutorials
● https://numpy.org/
Any questions?
