Using SecretLab GPU Servers

GPU Servers

SGPUW1: 172.30.200.1 (3x RTX 6000 Ada 48GB, 768GB RAM, 8TB SSD, Xubuntu 22.04 LTS)

SGPUW2: 172.30.200.2 (1x RTX 4090 24GB, 1x RTX 4080 16GB, 1x Quadro RTX 8000 48GB, 192GB RAM, 8TB SSD, Xubuntu 22.04 LTS)

Connect to SecretLab VPN

SGPUW1 and SGPUW2 are only accessible via the SecretLab VPN. Download the SecretLab VPN profile from Nextcloud, install an OpenVPN client on your device, import the profile into the client, and connect to the SecretLab VPN before attempting to access the GPU servers.

Note: SecretLab VPN does not tunnel your Internet traffic; only traffic to the SecretLab VPN subnet is routed through it.

If RDP or SSH fails to connect, the first thing to check is whether SecretLab VPN is connected!
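
A quick reachability check can tell a VPN problem apart from an SSH or RDP problem. The following is a minimal sketch, assuming `ping` is available on your client device and that the GPU servers answer ICMP echo requests (not confirmed here); the function name `check_gpu_server` is hypothetical.

```shell
# Sketch: check whether a GPU server answers before debugging SSH or RDP.
# Assumption: the servers reply to ping; if they do not, this check reports
# "unreachable" even when the VPN is connected.
# Usage: check_gpu_server 172.30.200.1
check_gpu_server() {
    local host="$1"
    if ping -c 1 "$host" >/dev/null 2>&1; then
        echo "$host is reachable"
    else
        echo "$host is unreachable: check that SecretLab VPN is connected"
        return 1
    fi
}
```

If the server is unreachable, reconnect to SecretLab VPN and try again before changing anything in your SSH or RDP setup.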

Connect To SecretLab GPU Servers (SGPUW1, SGPUW2)

To connect to a GPU server, take the following steps:

Open a terminal and SSH into the server, for example: $ ssh <username>@172.30.200.1

SSH will prompt for your assigned password. You can type it or paste it in.

(Optional) Run $ nvidia-smi to see the number and types of GPUs on the server. It also shows current GPU utilization and, on SGPUW2, the index of each GPU model.

Run $ genv devices to see the GPUs attached to existing genv sessions. You cannot use GPUs already in use!

Assuming there are free GPUs, activate a new genv session and attach a GPU to it using these commands: $ genv activate followed by $ genv attach --count 1.

Now you are ready to run code on the GPU!
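
The steps above can be sketched as one shell function to run on the server after logging in with SSH. The function name `start_gpu_session` is hypothetical; the commands are the ones listed in the steps, and the sketch assumes genv is already set up in your shell on the server.

```shell
# Sketch: the connection steps as a single function, run on the server.
# Assumption: you are already logged in, e.g. via ssh <username>@172.30.200.1
start_gpu_session() {
    nvidia-smi      # optional: list the GPUs and their current utilization
    genv devices    # see which GPUs are attached to existing genv sessions
    # Start a new genv session, then attach one free GPU to it.
    genv activate && genv attach --count 1
}
```

Check the output of genv devices before relying on the attach step; if every GPU is already attached to another session, there is nothing free to use.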

Resource Management

On SGPUW1, each user may have a maximum of 1 GPU attached to their genv session(s). Any additional GPUs are automatically detached by genv. SGPUW2 has no such restriction.

On SGPUW1, all GPUs are detached at 9:01 AM Central Time. Users have to re-attach a GPU to their genv session at or after 9:02 AM. This also means that processes running on a GPU are killed at 9:01 AM and have to be restarted. Start long-running tasks early enough to finish before 9:01 AM, or wait until after 9:02 AM to start them. SGPUW2 has no such restriction.
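
To judge whether a job can finish before the next detach on SGPUW1, a rough sketch like the following reports the time remaining. It assumes GNU date (as shipped with Ubuntu), that "central time" corresponds to the America/Chicago time zone, and that the detach happens every day; the function name `seconds_until_detach` is hypothetical.

```shell
# Sketch: seconds left until the next 9:01 AM Central detach on SGPUW1.
# Assumptions: GNU date, America/Chicago time zone, detach runs every day.
seconds_until_detach() {
    local now next
    now=$(date +%s)
    next=$(TZ=America/Chicago date -d 'today 09:01' +%s)
    # If today's 9:01 AM has already passed, use tomorrow's.
    if [ "$now" -ge "$next" ]; then
        next=$(TZ=America/Chicago date -d 'tomorrow 09:01' +%s)
    fi
    echo $(( next - now ))
}

secs=$(seconds_until_detach)
printf 'Next detach in about %d h %d min\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
```

If the estimated run time of a job is longer than the reported window, it is safer to start it after 9:02 AM.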

SGPUW2 has different GPU models with varying capabilities. To attach a specific GPU to your genv session, run $ genv activate followed by $ genv attach --index <number>. For example, $ genv attach --index 1 will attach the RTX 4080.
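
As a sketch of how this might look in practice, the helpers below list the GPU indices and then attach one by index. The function names are hypothetical, and the sketch assumes that the indices reported by nvidia-smi match the ones genv expects, as the example above suggests.

```shell
# Sketch: pick a specific GPU on SGPUW2 by index.
# Assumption: nvidia-smi indices correspond to genv indices.

# Print index, model name, and total memory for each GPU as CSV.
list_gpu_indices() {
    nvidia-smi --query-gpu=index,name,memory.total --format=csv
}

# Activate a genv session and attach the GPU with the given index.
attach_gpu_by_index() {
    local index="$1"
    # Reject anything that is not a non-negative integer.
    case "$index" in
        ''|*[!0-9]*)
            echo "usage: attach_gpu_by_index <number>" >&2
            return 2
            ;;
    esac
    genv activate && genv attach --index "$index"
}
```

For example, after checking the list, attach_gpu_by_index 1 would run the same commands as the RTX 4080 example above.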