Labs How To

Basic info

If you are new to CTU, see the checklist for visiting students.

Forum

There is a discussion forum for this course that can be used to ask for help with the assignments. It is monitored by the lab assistants and it is the preferred form of communication for getting assistance with the assignments, since all students can see the question and answer threads. If you find an error in an assignment, or if an assignment is unclear or ambiguous, please write on the forum.

We also ask you to write remarks about lectures (typos, questions).

Python, IDEs

You will implement the labs in Python. The first lab will need only Python with standard packages (NumPy, Matplotlib).

In case you are not too sure about your Python/NumPy skills, have a look here: http://cs231n.github.io/python-numpy-tutorial/.

You are free to use your favorite IDE for Python development. A professional PyCharm license is available to the university at https://download.cvut.cz. However, PyCharm can be somewhat slow (it runs on Java) and unnecessarily complex (it has many features). We recommend Visual Studio Code (VSCode), which is very easy to set up and convenient to work with. Follow these basic instructions for setting up the IDE: VSCode and Python

Below we provide VSCode instructions that are specifically useful for this course. You should be able to find similar functionality in the IDE of your preference.

Jupyter / IPython (VSCode)

There are in principle three ways of using Python in VSCode:

  • Writing the code in a Python .py file, running it on the command line and saving the outputs to files (e.g. images). The course templates are written like this by default as it is the most general form.
  • Using Jupyter notebooks (.ipynb files).
  • Using Jupyter code cells together with an interactive window (.py file with special comments to separate the code cells).

The second and third options can also be combined by connecting a Jupyter notebook and an interactive window to the same externally run Jupyter server.

The advantage of a Jupyter notebook is that both the code and the outputs are stored in one file, which may be useful for submitting the report. It is, however, more difficult to handle with git, and without the interactive console all outputs (even the debug ones) are stored in the same file, so one is constantly creating and deleting cells, and variable inspection is cumbersome. The Jupyter code cell file is a nice compromise, which we tend to use most of the time in our research. It allows separating the code into functional cells, while providing an independent console for data manipulation and code testing.
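As a minimal sketch of the third option, the following .py file is split into code cells by # %% comments, which VSCode recognizes and offers to run one by one in the interactive window. The variable names and the loss values are made up for illustration and are not part of the course templates; only the standard library is used so the file also runs as a plain script.

```python
# %% Cell 1: imports and data
import statistics

# Hypothetical training losses, for illustration only.
losses = [0.9, 0.7, 0.55, 0.5]


# %% Cell 2: a small helper that can be edited and re-run independently
def summarize(values):
    """Return the mean and the last value of a sequence of numbers."""
    return statistics.mean(values), values[-1]


# %% Cell 3: inspect the result; variables stay alive in the interactive window
mean_loss, last_loss = summarize(losses)
print(f"mean={mean_loss:.3f}, last={last_loss}")
```

Because the file is ordinary Python, it can still be run from the command line and tracked by git like any other script.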

In order to get the imports automatically reloaded every time you run a cell, add this to your settings.json VSCode file (F1 → Preferences: Open User Settings (JSON)):

"jupyter.runStartupCommands": [
    "%load_ext autoreload", "%autoreload 2"
]
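To illustrate what the autoreload extension saves you from doing, here is a rough sketch of the manual equivalent using importlib.reload from the standard library. The module name dle_demo_module and its contents are hypothetical and created in a temporary directory just for this demonstration; the extension itself does more than this single call.

```python
import importlib
import pathlib
import sys
import tempfile

# Do not write .pyc files, so a quick rewrite of the source cannot be
# masked by stale bytecode with the same timestamp (affects this process only).
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module, standing in for your own lab code.
    module_path = pathlib.Path(tmp) / "dle_demo_module.py"
    module_path.write_text("def answer():\n    return 1\n")
    sys.path.insert(0, tmp)
    try:
        import dle_demo_module

        before = dle_demo_module.answer()

        # Edit the module on disk, as you would in the editor.
        module_path.write_text("def answer():\n    return 'edited'\n")

        # Without this call the old definition would stay in memory.
        importlib.reload(dle_demo_module)
        after = dle_demo_module.answer()
    finally:
        # Clean up so the demo leaves no trace in the interpreter state.
        sys.path.remove(tmp)
        sys.modules.pop("dle_demo_module", None)

print(before, after)
```

With %autoreload 2 active, changed modules are reloaded before each cell is executed, so no explicit reload call is needed.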

Jupyter Notebooks

Here are some lines of code that are useful in Jupyter notebooks.

""" some basic packages and settings to show images inline """
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
 
""" controls for figure sizes to change (can be also specified for particular figures independently)"""
plt.rcParams['figure.dpi'] = 200
plt.rcParams['figure.figsize'] = [16, 8]
plt.rcParams.update({'errorbar.capsize': 1})
 
""" Select a free GPU to be used by the notebook computations"""
import nvsmi
import os
 
def get_free_gpus(ignore_my_pid = True):
    used_gpu_ids = {p.gpu_id for p in nvsmi.get_gpu_processes() if not ignore_my_pid or p.pid != os.getpid()}
 
    free_gpus = [g.id for g in nvsmi.get_gpus() if g.id not in used_gpu_ids]
 
    return free_gpus
 
# select a free GPU
free_gpus = get_free_gpus()
print('Free GPUs: {}'.format(free_gpus))
if len(free_gpus) == 0:
    print('NO FREE GPU!!!')
else:
    os.environ['CUDA_VISIBLE_DEVICES'] = str(free_gpus[-1])  # os.environ accepts only strings
    print('GPU {} selected'.format(free_gpus[-1]))
 

Google Colab

Another easy way to experiment with deep learning is Google Colab. Take a look at Intro Colab space. It has PyTorch installed and you also get GPU acceleration. However, it is a bit harder to work on a bigger project with classes, to debug, etc.

Remote Servers (GPU)

There are dedicated student GPU servers at the department. Feel free to use them for training your networks.

Do follow the rules and instructions specified on that page. In particular:

  1. Use nvidia-smi or the gpu-status script to check which GPU is available and that none of your processes still occupies GPU memory.
  2. Always specify which GPU a script will use, e.g.:
     export CUDA_VISIBLE_DEVICES=3; python train.py --lr 0.001
  3. Each student is allowed to use only a single GPU.
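The first two rules can also be followed from inside a Python script. The sketch below is only an illustration under stated assumptions: it assumes nvidia-smi is on the PATH, the helper names (parse_gpu_memory, pick_least_used_gpu, query_gpus) are hypothetical, and "least memory in use" is a simple heuristic that does not guarantee the GPU is actually free, so still check with nvidia-smi or gpu-status.

```python
import os
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, memory.used' CSV lines into a {gpu_index: used_MiB} dict."""
    usage = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used = (field.strip() for field in line.split(","))
        try:
            usage[int(index)] = int(used)
        except ValueError:
            # Skip GPUs that do not report a numeric memory value.
            continue
    return usage


def pick_least_used_gpu(usage):
    """Return the index of the GPU with the least memory in use, or None."""
    if not usage:
        return None
    return min(usage, key=usage.get)


def query_gpus():
    """Ask nvidia-smi for per-GPU memory usage; return {} if it is unavailable."""
    command = ["nvidia-smi", "--query-gpu=index,memory.used",
               "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return {}
    return parse_gpu_memory(result.stdout)


gpu = pick_least_used_gpu(query_gpus())
if gpu is None:
    print("No GPU visible on this machine")
else:
    # Must be set before PyTorch initialises CUDA in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    print(f"GPU {gpu} selected")
```

Setting the variable inside the script has the same effect as the export in rule 2, as long as it happens before CUDA is initialised.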

SSH

To avoid typing your password each time you log in, you can configure authentication on the server using pre-shared public keys. You can do so using ssh-keygen and ssh-add.
This instruction for Linux seems ok.

SSHFS

If you want to copy data to and from the server, it is convenient to mount your working directory on the server to your local filesystem using sshfs. This tool works at first, but may fail in certain cases, e.g. when you change the network access point or sleep / wake the computer. I use the following settings:

sshfs -o defer_permissions,reconnect,ServerAliveInterval=120,ServerAliveCountMax=3,follow_symlinks,compression=yes -o kernel_cache,entry_timeout=5,sync_read shekhovt@cantor.felk.cvut.cz:/~ ~/cantor 

Alternatively you may use scp or rsync commands to sync your files.

Modules / Environment

All GPU servers have the configurable environment module system Lmod. You can search for e.g. all available versions of PyTorch with ml spider torch and load a particular module with e.g. ml torchvision/0.11.1-foss-2021a-CUDA-11.3.1. In principle, this should handle all necessary dependencies.

In our experience, a much more reliable and stable way of working is to create a virtual environment and use only very few modules:

  1. You will need Python 3; load it by calling
    ml Python/3.9.6-GCCcore-11.2.0
  2. Create a virtual environment using this version of Python:
    virtualenv -p /mnt/appl/software/Python/3.9.6-GCCcore-11.2.0/bin/python3 ~/dle/venv
    where the path to the Python executable is the output of the which python3 command.
  3. Activate the environment:
    source ~/dle/venv/bin/activate

Now you are ready to install the necessary Python packages (add your own favorite packages to the list):

python3 -m pip install matplotlib numpy ipython ipykernel

Next time you log in to the server, you need to activate this environment again. If DLE is your only subject using the GPU servers, you may add the above ml and source commands to your .bashrc. Otherwise, create a file ~/dle/run.txt with this content

ml Python/3.9.6-GCCcore-11.2.0
source ~/dle/venv/bin/activate
and call source ~/dle/run.txt next time you log in.
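After activating the environment, a quick sanity check from Python can confirm that the right interpreter is in use and that the packages installed above are importable. This is a small sketch using only the standard library; the function names are hypothetical, and the package list simply mirrors the pip command above (note that the ipython package is imported as IPython).

```python
import importlib.util
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv set sys.base_prefix to the parent installation;
    # old virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("inside virtual environment:", in_virtual_environment())
print("missing packages:",
      missing_packages(["matplotlib", "numpy", "IPython", "ipykernel"]))
```

If the interpreter path does not point into ~/dle/venv, the environment was most likely not activated in the current shell.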

Remote development with VSCode

VSCode allows you to edit and debug code and Jupyter notebooks transparently, directly on a remote server. This feature is not necessary for the course, but may become useful in some situations. When experimenting with this option, keep in mind that VSCode tends to keep the connection active even after you turn off your computer. As GPU memory is expensive, log in to the server regularly and check whether your processes still occupy some GPUs. You may call pkill -f ipykernel to kill these processes.

Assuming the virtual environment setup from above, you will need one more step to configure VSCode so that it both uses the environment and loads the necessary Python module. We need to wrap the call to Python in a script (save it as e.g. ~/dle/venv/bin/python_ml and make it executable):

#!/bin/bash -l
ml Python/3.9.6-GCCcore-11.2.0
/home.nfs/[your_user_name]/dle/venv/bin/python "$@"

In VSCode, hit F1 and find 'Python:Select Interpreter' by starting to type. Use '+Enter Interpreter path → Browse →' and pick this script from your project folder. Whenever asked later (e.g. to run a Jupyter notebook), select the same interpreter.

When we tested this setup, VSCode at first refused to recognize the python_ml script as an interpreter, and we had to remove the VSCode server files on the remote server first (rm -rf ~/.vscode-server/).

Let us know if you encounter any problems; perhaps some additional settings are required.

courses/bev033dle/labs/0_howto/start.txt · Last modified: 2024/04/22 16:32 by shekhole