====== Lab 05: PyTorch ======

In this lab you’ll practice a deep learning workflow in **PyTorch**. We will go through the following topics:

  * **Part 01 - PyTorch Basics:** tensors, shapes/dtypes, device (CPU/GPU), random seeds, broadcasting
  * **Part 02 - Pre-bundled Models:** loading torchvision models, transforms/preprocessing, running inference
  * **Part 03 - Training Loops:** Dataset/DataLoader, model definition, metrics, evaluation, checkpoints
  * **Part 04 - ONNX Export:** ''torch.onnx.export'', validating with **onnxruntime**

===== Environment & Installation =====

Requirements:

  * **Python 3.10+**, **pip**
  * **JupyterLab**
  * **torch**, **torchvision**, **torchaudio**
  * **matplotlib**, **onnx**, **onnxruntime**, **tqdm**

<code bash>
# Create and activate a virtual environment
python -m venv .venv
# macOS/Linux
source .venv/bin/activate
# Windows (PowerShell)
.\.venv\Scripts\Activate.ps1

# Upgrade pip
pip install --upgrade pip

# --- Option A: CPU-only install (works everywhere) ---
pip install jupyterlab matplotlib onnx onnxruntime tqdm

# --- Option B: NVIDIA GPU (choose the wheel that matches your CUDA) ---
pip install jupyterlab matplotlib onnx onnxruntime-gpu tqdm

# --- Install PyTorch based on your HW and OS: ---
# Please follow this guide: https://pytorch.org/get-started/locally/

# Launch JupyterLab
jupyter lab
</code>

If you are unsure about CUDA, prefer the **CPU-only** install. It’s the most trouble-free for the lab machines and personal laptops.

<code python>
# Post-install HW check - prints version and CUDA availability
import torch

print("torch:", torch.__version__, "| cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("gpu name:", torch.cuda.get_device_name(0))
</code>

==== Getting the notebooks ====

Download the notebooks into your working folder and open them in JupyterLab.

  * {{ :courses:becm33mle:tutorials:lab_pytorch_jupyters.zip | Jupyter Notebooks}}

==== Running tips ====

  * After (re)installing packages, **Restart Jupyter Kernel** to pick up new libraries.
  * If you see CUDA or driver errors, switch to the **CPU-only** wheel and set ''device = "cpu"''.

===== Vocabulary =====

==== PyTorch ====

PyTorch is an open-source deep learning framework that lets you build, train, and deploy neural networks in pure Python. It uses a dynamic computation graph (define-by-run) and supports GPU acceleration.

==== Tensors ====

Tensors are n-dimensional arrays (like NumPy arrays) that can live on CPU or GPU and participate in automatic differentiation. A tensor holds its data and, after a backward pass, its gradient (''.grad'') in one object.

==== Autograd (automatic differentiation) ====

Autograd tracks tensor operations to build a dynamic graph and computes gradients with a single ''.backward()'' call. This removes the need to derive gradients by hand and makes experimentation fast: change the model or loss, and the gradients follow automatically. Use ''torch.no_grad()'' or ''.detach()'' to turn tracking off for inference or to save memory.

==== Devices (CPU/GPU) ====

PyTorch tensors and models can live on ''"cpu"'' or ''"cuda"''. An operation needs all of its tensors on the same device, so move both data and model with ''.to(device)''; keeping them there also avoids costly transfers.

==== Random ====

Setting seeds and controlling sources of randomness makes experiments repeatable. Reproducibility is essential for debugging, comparing models fairly, and reporting results.

==== Broadcasting ====

Broadcasting automatically expands tensors with compatible shapes to enable vectorized math without explicit repeats.

==== Pre-bundled / pretrained models ====

''torchvision.models'' provides high-quality pretrained networks (e.g., ResNet) trained on large datasets.

==== Transforms ====

Consistent transforms (resize, crop, normalize) align your data with what the model expects. We use ''torchvision.transforms'' for images: deterministic transforms at test time, and the same transforms combined with augmentations during training to improve generalization.

==== Dataset + DataLoader ====

A custom Dataset defines how to read and transform one sample.
DataLoader batches, shuffles, and loads samples (optionally in parallel via ''num_workers''). This abstraction keeps I/O and preprocessing organized and scalable.

==== Model definition ====

Subclass ''nn.Module'' to define layers in ''__init__'' and computation in ''forward()''.

==== Loss functions ====

Choose a loss aligned with your task. Monitor both training and validation losses to spot overfitting and underfitting.

==== Optimizers ====

Optimizers update parameters using the gradients computed by ''.backward()''. The standard workflow is: zero the gradients with ''.zero_grad()'', compute the loss, call ''.backward()'', then call ''.step()''.

==== Learning-rate schedulers ====

Schedulers adjust the learning rate during training. Good schedules can speed convergence and improve final accuracy.

==== Training and evaluation loops ====

A clean loop separates train and eval phases. Track metrics, save the best checkpoints, and stop early based on validation signals. Keep loops minimal.

==== Metrics ====

Pick metrics that reflect your objective (accuracy/F1/AUROC for classification, MAE/RMSE for regression). Compute them on the validation set. For imbalanced data, prefer metrics that account for class imbalance (e.g., F1 or AUROC) over plain accuracy.

==== Checkpoints ====

Save ''model.state_dict()'', ''optimizer.state_dict()'', epoch, and best metrics so you can resume or reproduce results.

==== Feature extraction / Fine-tuning ====

Feature extraction freezes the backbone and trains only a small head, which is fast and data-efficient.\\
Fine-tuning unfreezes some or all layers for higher accuracy when you have more data and compute.

==== Export to ONNX ====

''torch.onnx.export'' serializes a model to an open format for portable inference in many runtimes.
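
==== Putting it together ====

Several of the terms above (Dataset + DataLoader, model definition, loss, optimizer, scheduler, train/eval loop, checkpoint) can be seen working together in one short script. The following is only a minimal sketch, not the code from the lab notebooks: ''ToyDataset'', ''TinyNet'', ''accuracy'', ''train'', and ''load_checkpoint'' are hypothetical names, the data is synthetic, and the checkpoint is kept in memory instead of being written to a file.

```python
import io

import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset: 2-D points labelled by the sign of x0 + x1."""

    def __init__(self, n=256, seed=0):
        g = torch.Generator().manual_seed(seed)  # local generator: repeatable data
        self.x = torch.randn(n, 2, generator=g)
        self.y = (self.x.sum(dim=1) > 0).long()

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        return self.x[i], self.y[i]


class TinyNet(nn.Module):
    """Layers are declared in __init__, the computation lives in forward()."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, x):
        return self.net(x)


def accuracy(model, loader, device):
    """Eval phase: eval mode, no gradient tracking."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total


def train(epochs=5, seed=0):
    torch.manual_seed(seed)  # seeds weight init and DataLoader shuffling
    device = "cuda" if torch.cuda.is_available() else "cpu"
    train_loader = DataLoader(ToyDataset(256, seed=1), batch_size=32, shuffle=True)
    val_loader = DataLoader(ToyDataset(128, seed=2), batch_size=64)

    model = TinyNet().to(device)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

    best_acc, best_ckpt = -1.0, None
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)  # data on the same device as the model
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()  # once per epoch, after the optimizer steps

        acc = accuracy(model, val_loader, device)
        if acc > best_acc:  # keep only the best checkpoint
            best_acc = acc
            buffer = io.BytesIO()
            torch.save(
                {
                    "model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch,
                    "best_acc": best_acc,
                },
                buffer,
            )
            best_ckpt = buffer.getvalue()
    return model, best_acc, best_ckpt


def load_checkpoint(ckpt_bytes):
    """Read a checkpoint produced by train() back onto the CPU."""
    return torch.load(io.BytesIO(ckpt_bytes), map_location="cpu")


if __name__ == "__main__":
    _, acc, ckpt = train()
    print("best validation accuracy:", acc)
    print("checkpoint keys:", sorted(load_checkpoint(ckpt).keys()))
```

To resume from the checkpoint, one would create a fresh ''TinyNet'' and optimizer and call ''load_state_dict'' on each with the stored entries. In a real project the buffer would normally be replaced by a file path passed to ''torch.save''.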