Update 'gpus.md'

main
Anindya Maiti 6 months ago
parent
commit
260be68199
22
gpus.md

@@ -15,14 +15,28 @@ If RDP or SSH fails to connect, remember the first thing to check is if SecretLa
## Connect To SecretLab GPU Servers (SGPUW1, SGPUW2)
To connect to the GPU, take the following steps:
Open up your terminal, then ssh into the GPU. Example: `$ ssh <username>@172.30.200.1`
Open up your terminal, then ssh into the GPU. For example: `$ ssh <username>@172.30.200.1`.
This will prompt for your assigned password. You can either type the password or paste it in.
Then you can do `$ nvidia-smi` to see the number of GPUs and available GPUs.
to activate a GPU.
(Optional) Run `$ nvidia-smi` to see the number and types of GPUs on the server.
Run `$ genv devices` to see the GPUs attached to existing genv sessions. *You cannot use GPUs already in use!*
Assuming there are free GPUs, activate a new genv session and attach a GPU to it using these commands: `$ genv activate` followed by `$ genv attach --count 1`.
Now you are ready to run code on the GPU!
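The steps above can be sketched as a small shell function to paste into your interactive shell on the GPU server after logging in. The name `claim_one_gpu` is hypothetical and not part of genv. The sketch uses only the `genv` commands named above, assumes `genv activate` behaves the same when called from a function in the current shell, and does not itself check that a GPU is free, so read the `genv devices` output before relying on the result.

```shell
# Hypothetical helper (not part of genv): run in your interactive shell on the server.
claim_one_gpu() {
    # Show which GPUs are already attached to existing genv sessions.
    genv devices || return 1
    # Activate a genv session in this shell, then attach one GPU to it.
    genv activate || return 1
    genv attach --count 1
}
```

After defining the function, running `$ claim_one_gpu` performs the three steps in order and stops at the first command that fails.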
## Resource Management
On SGPUW1, each user may have a maximum of 1 GPU attached to their genv session(s). Additional GPU(s) will automatically be detached by genv. SGPUW2 does not have any such restrictions.
On SGPUW1, all GPUs are detached at 9:01 AM Central time. Users must re-attach a GPU to their genv session at or after 9:02 AM. Processes running on the GPU at that time are killed and must be restarted. Start long-running tasks well in advance, or wait for the next 9:02 AM. SGPUW2 does not have any such restrictions.
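To judge whether a job can finish before the daily detach on SGPUW1, the time remaining can be worked out with a little shell arithmetic. This is only a sketch: `minutes_until_reset` is a hypothetical helper, and `America/Chicago` is assumed to be the correct time zone name for Central time.

```shell
# Hypothetical helper: minutes from a given Central time until the 9:01 AM detach.
# $1 = hour (00-23), $2 = minute (00-59)
minutes_until_reset() {
    # Strip one leading zero so values like "08" are not parsed as octal.
    h=${1#0}; h=${h:-0}
    m=${2#0}; m=${m:-0}
    now=$(( $h * 60 + $m ))
    reset=$(( 9 * 60 + 1 ))
    # Wrap around midnight so the result is always in the range 0..1439.
    echo $(( (reset - now + 1440) % 1440 ))
}

# Usage: feed in the current Central time (assumed zone name: America/Chicago).
minutes_until_reset $(TZ=America/Chicago date '+%H %M')
```

A result smaller than the expected run time of the job suggests waiting until after 9:02 AM before starting it.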
SGPUW2 has different GPU models with varying capabilities. To attach a specific GPU to your genv session, run `$ genv activate` followed by `$ genv attach --index <number>`. For example, `$ genv attach --index 1` will attach the RTX 4090.
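Choosing a GPU by model on SGPUW2 could look like the sketch below. Both function names are hypothetical. It is an assumption here that the index printed by `nvidia-smi` matches the index that `genv attach --index` expects, so confirm the mapping with `$ genv devices` before attaching.

```shell
# Hypothetical helpers (not part of genv or nvidia-smi).

# Print one "index, name" line per GPU on the server.
list_gpu_models() {
    nvidia-smi --query-gpu=index,name --format=csv,noheader
}

# Activate a genv session in this shell and attach the GPU at index $1.
attach_gpu_at_index() {
    genv activate || return 1
    genv attach --index "$1"
}
```

For example, run `$ list_gpu_models` to find the index of the model you want, then `$ attach_gpu_at_index 1` to attach the GPU at index 1.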
The command is `$ genv activate --gpus <number of GPU>`
If you run `ls` after logging in through the terminal, you may be surprised to find that you have no directories.
To resolve this, log into the lab VM (Microsoft Remote Desktop). Then open Remote Desktop Connection.
