Labs How To

Basic info

If you are new to CTU, see the checklist for visiting students.

Forum

This course has a discussion forum where you can ask for help with the assignments. It is monitored by the lab assistants and is the preferred channel for assignment questions, since all students can see the question and answer threads. If you find an error in an assignment, or an assignment is unclear or ambiguous, please write on the forum.

Please also post your remarks about the lectures (typos, questions) there.

Python, IDEs

You will implement the labs in Python. The first lab will need only Python with standard packages (NumPy, Matplotlib).

If you are not sure about your Python/NumPy skills, have a look here: http://cs231n.github.io/python-numpy-tutorial/.

You are free to use your favorite IDE for Python development. A professional PyCharm license is available through the university at https://download.cvut.cz. However, PyCharm can be somewhat slow (it runs on Java) and unnecessarily complex (it has many features). We recommend Visual Studio Code (VSCode). It is very easy to set up and convenient to work with. Follow these basic instructions for setting up the IDE: VSCode and Python

Below we provide instructions for VSCode that are specifically useful for this course. You should be able to find similar functionality in any other IDE you prefer.

Jupyter / IPython (VSCode)

There are in principle three ways of using Python in VSCode:

  • Writing the code in a Python .py file, running it on the command line and saving the outputs to files (e.g. images). The course templates are written like this by default as it is the most general form.
  • Using Jupyter notebooks (.ipynb files).
  • Using Jupyter code cells together with an interactive window (.py file with special comments to separate the code cells).

The second and third options can also be combined by connecting a Jupyter notebook and an interactive window to the same externally run Jupyter server.

The advantage of a Jupyter notebook is that both the code and the outputs are stored in one file, which may be useful for submitting the report. It is, however, more difficult to handle with git. Without the interactive console, all outputs (even the debug ones) are stored in the same file, so one is constantly creating and deleting cells, and variable inspection is cumbersome. The Jupyter code cell file is a nice compromise, which we tend to use most of the time in our research. It allows separating the code into functional cells, but provides an independent console for data manipulation and code testing.
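As a minimal sketch of the code cell approach, the file below is an ordinary .py script in which each `# %%` comment marks the start of a cell for the VSCode interactive window. Run from the command line, it behaves like a plain script and saves its output to a file. The function names and the output file name `damped_cosine.csv` are hypothetical, and only the standard library is used so that the sketch runs without extra packages.

```python
# %% Imports (each "# %%" comment starts a new code cell in VSCode)
import csv
import math
from pathlib import Path


# %% Define the computation
def damped_cosine(n_points=50, decay=0.1):
    """Return (xs, ys) samples of y = exp(-decay * x) * cos(x)."""
    xs = [0.2 * i for i in range(n_points)]
    ys = [math.exp(-decay * x) * math.cos(x) for x in xs]
    return xs, ys


# %% Save the result to a file instead of only displaying it
def save_curve(path, xs, ys):
    """Write the samples as a two-column CSV file and return its path."""
    path = Path(path)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        writer.writerows(zip(xs, ys))
    return path


# %% Run everything (also works as a plain script from the command line)
if __name__ == "__main__":
    xs, ys = damped_cosine()
    out = save_curve("damped_cosine.csv", xs, ys)
    print(f"wrote {len(xs)} samples to {out}")
```

In the interactive window you can run the cells one by one and then inspect `xs` and `ys` in the console without touching the file.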

Remote Interactive Development at Student GPU Cluster

What to expect: By following this guide, you will be able to interact (through .ipynb files or the Python interactive window/code cells) with a Jupyter server running on the student GPU cluster from inside your locally installed VSCode. This may be handy if you prefer working in your own coding environment rather than being forced into a web interface like Colab or JupyterLab.

What not to expect: You won't be able to debug your code remotely outside of the Jupyter notebooks. If you find a way to set that up, please let us know.

For building your remote development environment, start by logging into the student GPU server:

>> ssh username@gpu.fel.cvut.cz

Install direnv utility

The server allows you to load several versions of pre-built modules (like Python, PyTorch, cURL, …) dynamically. The direnv utility automates this process for a project. Every time you enter the project directory, it will load the pre-specified modules and set up the virtual environment.

Install direnv:

>> ml cURL/8.7.1-GCCcore-13.3.0
>> curl -sfL https://direnv.net/install.sh | bash

and add the following line to your .bashrc:

eval "$(direnv hook bash)"
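If you prefer to script this step, a minimal sketch follows. It appends the hook line only when it is not already present, so it is safe to run repeatedly. The helper name `ensure_line` is hypothetical; editing .bashrc by hand works just as well.

```python
from pathlib import Path

# The hook line from the instructions above.
DIRENV_HOOK = 'eval "$(direnv hook bash)"'


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already there.

    Returns True if the file was modified, False otherwise.
    """
    path = Path(path).expanduser()
    text = path.read_text() if path.exists() else ""
    if line in (existing.strip() for existing in text.splitlines()):
        return False
    # Start on a fresh line in case the file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as f:
        f.write(prefix + line + "\n")
    return True


# Usage on the server (left commented out so that importing has no effect):
# ensure_line("~/.bashrc", DIRENV_HOOK)
```

Open a new shell (or source .bashrc) afterwards so that the hook takes effect.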

Automatic module loading

Allocate an interactive GPU node, check the CUDA version and start your project:

>> srun -p fast --gres=gpu:1 --pty bash -i
>> nvidia-smi   
>> mkdir your_project_dir
>> cd your_project_dir
In the reference setup we got CUDA version 12.7.
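To read the CUDA version programmatically rather than by eye, a small sketch follows. It assumes that the nvidia-smi banner contains a field of the form `CUDA Version: X.Y`; the function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_cuda_version(smi_output):
    """Extract (major, minor) from text containing 'CUDA Version: X.Y'."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_version():
    """Run nvidia-smi if it is on PATH and parse its banner."""
    if shutil.which("nvidia-smi") is None:
        return None  # not on a GPU node, or the driver tools are missing
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    print(detect_cuda_version())
```

The returned tuple can be compared directly with the version of a CUDA module, for example `(12, 7) >= (12, 6)`.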

Create a file .envrc in the project directory and add the following lines to it (we are using CUDA 12.6.0 because version 12.7 is not available as a module):

ml CUDA/12.6.0
ml PyTorch/2.6.0-foss-2024a-CUDA-12.6.0
ml torchvision/0.21.0-foss-2024a-CUDA-12.6.0
If you then run direnv allow, it will automatically load the respective modules every time you enter this directory and unload them when you leave it.

The above config ensures that we load PyTorch, Torchvision and a ton of dependencies (check the loaded modules with module list). In particular, it also loads a compatible version of Python (3.12.3 in this case, check the version with python --version).

Note: Sometimes we want to keep the freedom of also updating PyTorch at will. In that case, it may be better to load just CUDA and Python here and install a particular PyTorch version in the next step together with the other packages.
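The same check can be done from inside Python. The sketch below reports the interpreter version and the versions of the loaded packages. The function name `version_report` is hypothetical. Packages that cannot be imported are reported as None rather than raising an error, so the script also runs outside the project directory.

```python
import importlib
import platform


def version_report(packages=("torch", "torchvision")):
    """Return a dict mapping 'python' and each package to a version string.

    Packages that cannot be imported map to None instead of raising.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name:12s} {version}")
    try:
        import torch
        # torch.version.cuda is the CUDA version PyTorch was built against;
        # torch.cuda.is_available() reports whether a GPU is usable right now.
        print("built for CUDA:", torch.version.cuda)
        print("GPU available: ", torch.cuda.is_available())
    except ImportError:
        print("PyTorch is not importable in this environment")
```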

Virtual environment

Every project needs a different mixture of Python packages and their versions. It is a good habit to build a virtual environment for each project, so that updates in one project do not break dependencies in another.

With Python loaded by direnv, and still in the project directory, run

>> python -m venv ./venv

You may again want to activate this environment automatically every time you enter your project directory. For that, add this line to the end of the .envrc file:

source venv/bin/activate

and you can check that the environment is activated by

>> python -c "import sys; print(sys.prefix)"
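The printed prefix should point into your venv directory. A slightly more explicit check, sketched below, compares `sys.prefix` with `sys.base_prefix`: the two differ only inside an environment created by venv. The function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-created environment."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("virtual environment active:", in_virtualenv())
```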

Now you can install your favorite packages. They will be local to your virtual environment:

>> pip install matplotlib scipy numpy einops lovely-tensors jupyter

You may now run your Python scripts locally, and the environment stays stable and controllable.
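To confirm what actually landed in the environment, the following sketch queries the installed version of each requested distribution through `importlib.metadata`. The function name `installed_versions` is hypothetical. A value of None means that pip has not installed that distribution into the active environment.

```python
from importlib import metadata

# Distribution names exactly as passed to pip above.
REQUESTED = ("matplotlib", "scipy", "numpy", "einops", "lovely-tensors", "jupyter")


def installed_versions(distributions=REQUESTED):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:16s} {version if version else 'NOT INSTALLED'}")
```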

Remote development

Let us now connect the obtained GPU node with your local VSCode instance for comfortable interactive development.

Run the Jupyter server

While still on the allocated GPU node (typically called gpu1) and in your project directory, start a Jupyter server:

>> PORT=XXXX; jupyter-notebook --no-browser --port=${PORT} --ip=0.0.0.0
where PORT is your favorite number between 8000 and 9999.
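Since the node is shared, your favorite number may already be taken. The sketch below, meant to be run on the GPU node itself, looks for a port in the suggested range that can currently be bound. The function name `find_free_port` is hypothetical, and the result is only a snapshot: another process could still grab the port before Jupyter starts.

```python
import socket


def find_free_port(low=8000, high=9999):
    """Return the first port in [low, high] that can currently be bound."""
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("0.0.0.0", port))
            except OSError:
                continue  # port is taken (or not permitted); try the next one
            return port
    raise RuntimeError(f"no free port between {low} and {high}")


if __name__ == "__main__":
    print(find_free_port())
```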

Tip: Even better is to first run screen or tmux (the latter is not installed; you will need to install it on your own). You will most likely need console access to your project later on, but your terminal is blocked by the Jupyter server (you may also run it in the background and handle its outputs).

Open an SSH tunnel

To be able to connect to the running Jupyter server from your local machine, you will need to create an SSH tunnel. Open a second terminal (or use tmux magic), connect to gpu.fel.cvut.cz again and run this command:

>> ssh -N -L PORT:gpu1:PORT username@gpu.fel.cvut.cz
where PORT is the same as when running the Jupyter server.
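To make the structure of the command explicit, here is a small sketch that assembles it from its parts. The function name `tunnel_command` is hypothetical and the defaults simply mirror the placeholders used above. Both occurrences of the port must be the same number, and the node must be the one on which Jupyter is running.

```python
import shlex


def tunnel_command(port, node="gpu1", user="username", host="gpu.fel.cvut.cz"):
    """Build the ssh port-forwarding command as an argument list."""
    # -N: do not run a remote command; -L: forward local PORT to node:PORT.
    return ["ssh", "-N", "-L", f"{port}:{node}:{port}", f"{user}@{host}"]


if __name__ == "__main__":
    # Prints: ssh -N -L 8765:gpu1:8765 username@gpu.fel.cvut.cz
    print(shlex.join(tunnel_command(8765)))
```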

Note: You may do the same from your local computer and then connect to the Jupyter server in your browser at http://127.0.0.1:PORT.
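Before pointing a browser or VSCode at the tunnel, you can check that something is listening on its local end. The sketch below attempts a plain TCP connection; the function name `is_port_open` and the port 8765 in the usage line are hypothetical. A successful connection only shows that the port accepts connections, not that the service behind it is Jupyter.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace 8765 with the PORT you chose for the Jupyter server.
    print(is_port_open("127.0.0.1", 8765))
```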

Connect VSCode to the GPU Node

  1. Open your local VSCode and connect to a remote server (gpu.fel.cvut.cz) by pressing F1 and selecting “Remote-SSH: Connect to Host…”.
  2. Install the direnv extension (the one by Martin Kühl).
  3. Open your project directory.
  4. You should be able to select the correct interpreter in the venv sub-directory, with the Python version corresponding to the one installed earlier.
  5. Open a new .ipynb file or an interactive Python file and, when choosing the kernel, choose the one at http://127.0.0.1:PORT (still the same port).
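A quick way to confirm that the kernel really runs remotely is to execute a first cell like the sketch below. If the hostname is that of the GPU node rather than your own computer, and the executable lies inside your venv directory, the pieces are connected. The function name `kernel_location` is hypothetical.

```python
# %% Where is this kernel running?
import os
import socket
import sys


def kernel_location():
    """Collect a few facts that identify the machine and interpreter."""
    return {
        "hostname": socket.gethostname(),
        "executable": sys.executable,
        "cwd": os.getcwd(),
    }


for key, value in kernel_location().items():
    print(f"{key:11s} {value}")
```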

Well done! :)

Note: When I tried this for the first time, it took several attempts before all the pieces fit together. Maybe there are some delays when VSCode checks for new servers? Anyway, be patient, it should work ;)

courses/bev033dle/labs/0_howto/start.txt · Last modified: 2025/04/07 11:26 by sochmjan