DUG

DUG is a Perth-based HPC-as-a-service company located in West Perth.

It provides computing resources for our research, including access to A100 GPUs.

To get access, first contact them to set up your account and projects. At this step you will need to give them your SSH public key; you can generate a new key pair if you prefer.
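If you do not already have a key pair, one can be generated with ssh-keygen. The following is a minimal sketch: the helper name make_dug_key, the key comment, and the path ~/.ssh/dug_rsa_key are illustrative assumptions, not names DUG requires.

```shell
# Hypothetical helper: create a new RSA key pair without overwriting an existing one.
make_dug_key() {
  key_path="$1"
  if [ -e "$key_path" ]; then
    echo "Refusing to overwrite existing key: $key_path" >&2
    return 1
  fi
  mkdir -p "$(dirname "$key_path")"
  # ssh-keygen prompts for a passphrase, then writes $key_path and $key_path.pub
  ssh-keygen -t rsa -b 4096 -f "$key_path" -C "dug-access"
}

# Example usage (the path is an assumption):
#   make_dug_key ~/.ssh/dug_rsa_key
#   cat ~/.ssh/dug_rsa_key.pub   # send the contents of the .pub file to DUG
```

Only the .pub file is shared; the private key stays on your machine.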

Login

To log in, first load the private key that matches the public key you provided, using the command

ssh-add ~/.ssh/your_rsa_key # not the .pub one

Then update your ~/.ssh/config by adding the following block:

Host dug
HostName mcc_uwa
User your_username
ProxyJump [email protected]
IdentityFile /path/to/your/dug_rsa_key

After this, you should be able to log in to the login node via ssh dug

You will be prompted for your password; type it in and you are in.

Run on the HPC

Keeping the architecture diagram above in mind, note that the login node has very little storage (10 GB on DUG), so you will need to work from your data directory on DUG. This is similar to Kaya, where you work from the scratch or group directory.

Create conda env

This example is provided by Kai

This will create the conda environment under your data directory
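As a minimal sketch of what such a setup might look like, assuming conda is already installed: the helper name create_data_env, the conda_envs subfolder, and the Python version are hypothetical choices, not taken from Kai's example.

```shell
# Hypothetical helper: create a conda environment inside the data directory
# instead of the small home directory on the login node.
create_data_env() {
  data_dir="$1"   # your DUG data directory (assumed path)
  env_name="$2"   # any environment name you like
  env_prefix="$data_dir/conda_envs/$env_name"
  # --prefix places the environment at an explicit path rather than the default envs folder
  conda create --yes --prefix "$env_prefix" python=3.10
}

# Example usage (path is a placeholder):
#   create_data_env /path/to/your/data/directory my_env
#   conda activate /path/to/your/data/directory/conda_envs/my_env
```

Environments created with --prefix are activated by their full path rather than by name.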

You will also want conda to be initialised automatically every time you log in. To do so, add the section below to your .bashrc
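One possible form of that section is sketched here, assuming conda was installed under your data directory. The CONDA_ROOT path is a placeholder; running conda init bash would instead write its own, longer block to .bashrc.

```shell
# Assumed install location; adjust to wherever conda lives in your data directory.
CONDA_ROOT="${CONDA_ROOT:-/path/to/your/data/directory/miniconda3}"

# conda.sh ships with conda and defines the `conda` shell function.
# The guard keeps login working even if the path is wrong or not yet mounted.
if [ -f "$CONDA_ROOT/etc/profile.d/conda.sh" ]; then
  . "$CONDA_ROOT/etc/profile.d/conda.sh"
fi
```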

Make sure the cache directories of your various tools are placed in your data folder

The same applies to Hugging Face, etc.
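A sketch of how the caches could be redirected through environment variables, for example in .bashrc or at the top of a job script. The DATA_DIR path and the cache subfolder layout are assumptions; the variable names themselves are the standard ones read by pip, conda, Hugging Face, and PyTorch.

```shell
# Assumed data directory; replace with your real DUG data path.
DATA_DIR="${DATA_DIR:-/path/to/your/data/directory}"
CACHE_ROOT="$DATA_DIR/cache"

export XDG_CACHE_HOME="$CACHE_ROOT"               # generic default used by many tools
export PIP_CACHE_DIR="$CACHE_ROOT/pip"            # pip downloads and built wheels
export CONDA_PKGS_DIRS="$CACHE_ROOT/conda_pkgs"   # conda package cache
export HF_HOME="$CACHE_ROOT/huggingface"          # Hugging Face models and datasets
export TORCH_HOME="$CACHE_ROOT/torch"             # torch.hub checkpoints
```

Setting these before the first download matters, because files already cached in the home directory are not moved automatically.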

Job submit

Unlike Kaya, which uses #SBATCH, DUG uses #rj as the prefix for declaring HPC parameters.

Example job script

Access internet from compute nodes

The compute nodes cannot access the internet directly. You will need to configure the proxy settings in your job script:
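The actual proxy address is not stated here, so the following is only a sketch of the usual pattern. PROXY_URL is a placeholder that must be replaced with the host and port given by DUG or in Kai's note.

```shell
# Placeholder proxy address (assumption): the real value must come from DUG.
PROXY_URL="${PROXY_URL:-http://proxy.example.invalid:3128}"

# Most command-line tools read the lowercase names; some only read the uppercase ones.
export http_proxy="$PROXY_URL"
export https_proxy="$PROXY_URL"
export HTTP_PROXY="$PROXY_URL"
export HTTPS_PROXY="$PROXY_URL"

# Hosts that should bypass the proxy.
export no_proxy="localhost,127.0.0.1"
```

Placing these lines near the top of the job script, before any pip, conda, or Hugging Face download, lets those tools pick the proxy up from the environment.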

Kai's original note is here: https://github.com/Kai0226/PHD/blob/main/DUG%20connection.md

We will keep it updated
