UNSW Beginners Guide
If you are familiar with the Katana, Leonardi or Orange machines, the table below summarises the similarities and differences between these machines to help you transition onto Raijin.
- Katana is the computational cluster supported by the UNSW Faculty of Science; it is not managed by NCI.
- Leonardi is the computational cluster supported by the UNSW Faculty of Engineering; it is not managed by NCI.
- Orange is the computational cluster supported by Intersect; it is not managed by NCI.
|Machine||Katana||Leonardi||Orange||Raijin|
|CPU||Intel Xeon, all sorts||AMD Opteron||Intel Sandy Bridge||Intel Sandy Bridge|
|Interconnect||10Gb/s Ethernet||10Gb/s Ethernet||40Gb/s QDR||56Gb/s FDR|
|Core counts||2160||2944||1600||61088 + GPU + 2048 KNL|
|Memory||16 x 24GB, 34 x 96GB, 87 x 128GB, 12 x 144GB, 1 x 256GB||8 x 96GB, 40 x 128GB, 6 x 256GB, 2 x 512GB||90 x 64GB, 13 x 256GB||2395 x 32GB, 1125 x 64GB, 72 x 128GB, 12 x 256GB (GPU), 3 x 1024GB|
|Disk||Global scratch 340TB||100TB||101TB shared + 200TB local scratch||30PB Lustre (150GB/s on /short, 60–120GB/s on /g/data) + 1.6PB local scratch|
|Filesystem||/home (10GB per user)||/home (60GB per user)|| ||/home (2GB per user)|
|MPI||SGI-mpt (based on mpich)|
|Single node||#PBS -l nodes=1:ppn=16||#SBATCH -N=1||#PBS -l select=1:ncpus=16||#PBS -l ncpus=16|
|Multi node||#PBS -l nodes=2:ppn=16||#SBATCH -N=2||#PBS -l select=2:ncpus=16:||#PBS -l ncpus=32, #PBS -l mem=120GB|
Please go to my.nci.org.au to get a user account and propose a project (under the UNSW scheme) to obtain a compute and storage grant.
You can also select software groups to join, e.g.:
Once your account is created, use the username to access our peak system, Raijin.
A simple example job script looks like this:
Single Node Job
#!/bin/bash
#PBS -P a99
#PBS -q normal
#PBS -l walltime=20:00:00
#PBS -l mem=300MB
#PBS -l jobfs=1GB
#PBS -l ncpus=16
## For licensed software, you have to specify it to get the job running.
## For unlicensed software, you should also specify it to help us analyse
## the software usage on our system.
#PBS -l software=my_program
## The job will be executed from the current working directory instead of home.
#PBS -l wd

./my_program.exe > my_output.out
Multi Node MPI Job
#!/bin/bash
#PBS -P a99
#PBS -q normal
#PBS -l walltime=06:00:00
#PBS -l mem=128GB
#PBS -l jobfs=1GB
#PBS -l ncpus=64
## For licensed software, you have to specify it to get the job running.
## For unlicensed software, you should also specify it to help us analyse
## the software usage on our system.
#PBS -l software=my_program
## The job will be executed from the current working directory instead of home.
#PBS -l wd

## Please make sure your program is MPI-enabled.
module load openmpi/1.10.2
mpirun ./my_program.exe > my_output.out
To submit the job, use the qsub command.
More detailed PBSPro usage can be found in How to use PBS.
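As a sketch of the submission workflow (assuming the script above is saved as job.sh, a placeholder filename):

```shell
# Submit the script to PBS; qsub prints the new job's ID on success.
qsub job.sh
# Check the job's status in the queue (qstat, or nqstat_anu for an annotated view).
qstat -u $USER
# If the job sits in the queue, inspect the scheduler's comment,
# substituting your actual job ID.
qstat -s 1234567
```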
To see all available software, view the list of over 300 applications installed on Raijin here; click on a software package to see a customised job script and its specific licence conditions and job limitations.
- ANSYS/Fluent, please join unsw_ansys group on my.nci.org.au. After you are approved, you will need to add the flag -l software=unsw_ansys in your PBS jobscript to access the license.
- Matlab, please join matlab_unsw group on my.nci.org.au. After you are approved, you will need to add the flag -l software=matlab_unsw in your PBS jobscript to access the license.
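A job header using one of these licence flags might look like the following sketch (the project code a99 and the resource requests are placeholders):

```shell
#!/bin/bash
#PBS -P a99
#PBS -q normal
#PBS -l ncpus=16
#PBS -l walltime=02:00:00
#PBS -l mem=32GB
## Request the UNSW ANSYS licence group (joined via my.nci.org.au);
## the scheduler holds the job until the licence is available.
#PBS -l software=unsw_ansys
#PBS -l wd
```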
Raijin Quick Guide
|• /home||Backed up, important files. 2GB default per user.|
|• /short||Not backed up, temporary files.|
|• /g/data||Not backed up, long-term large data files.|
|• /projects||Backed up, important files shared amongst groups.|
|• $PBS_JOBFS||Not backed up, local to the node, I/O intensive data.|
|• MDSS||Backed up, archiving large data files.|
|○ mdss ls||List files on tape|
|○ mdss dmls -l||List file status: online (disk cache) or on tape|
|○ mdss put/get||Put or retrieve files from mdss|
|○ netcp||Submit a copyq job to copy files onto mdss|
|○ netmv||Submit a copyq job to move files onto mdss|
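Typical archiving steps, as a sketch (the paths and filenames are illustrative, and the exact argument forms should be checked with `man mdss` on Raijin):

```shell
# Copy a tar file onto the tape store via a copyq job
# (assuming the two-argument form: source, then mdss destination).
netcp results.tar my_archive/results.tar
# List the archived files and check whether each is online (disk cache) or on tape.
mdss ls my_archive
mdss dmls -l my_archive
# Retrieve the file again later.
mdss get my_archive/results.tar
```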
|• nci_account||Display compute and disk quota usage|
|○ nci_account -v||Display detailed accounting information per user|
|• lquota||Display /home, /short and /g/data usage|
|• nf_limits||Display walltime/memory limits for project|
|• short_files_report -G group||Reports location and usage of files in /short owned by the group|
|• module avail||List available packages|
|• module load/unload package||Load specific package|
|• module show package||Show environments set by the module|
|• module list||List which modules are loaded|
|• module use directory||Add the modules in a directory to the module search path|
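For example, a typical session with the module commands above (the openmpi version shown is the one used in the job script earlier in this guide):

```shell
# List the installed versions of a package.
module avail openmpi
# Inspect the environment variables a module would set.
module show openmpi/1.10.2
# Load it, then confirm what is loaded.
module load openmpi/1.10.2
module list
```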
|• qsub [options] jobname||Submit a job to the queue|
|• qdel jobid||Delete a job from the queue|
|• qalter [options] jobid||Modify resources of jobs that are already in the queue|
|• qmove destination jobid||Move jobs between queues (e.g. normal to express)|
|• qselect [options]||Select PBS batch jobs|
|• qstat/nqstat_anu||Display status of PBS batch jobs|
|○ qstat -s jobid||Show the scheduler's comment on the job (why is my job not running?)|
|PBSPro job script|
|#PBS -P project||Specifies the project for the job|
|#PBS -q normal/express/copyq||Specifies the destination queue upon submission|
|#PBS -l ncpus=xx||Specifies the number of CPUs|
|#PBS -l walltime=xx:xx:xx||Specifies the walltime requirement|
|#PBS -l mem=xxxMB||Specifies the memory requirement|
|#PBS -l jobfs=xxxMB||Specifies the local disk requirement|
|#PBS -l software=xxx||Specifies all the licensed software used by the job|
|#PBS -l wd||Starts the job from the directory it was submitted from|
|#PBS -W depend=after:xxx||Sets dependencies between this and other jobs|
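As an illustration of -W depend, the following sketch chains two jobs (the script names are placeholders; afterok is the variant that waits for successful completion, while after waits only for the first job to start executing):

```shell
# qsub prints the ID of the submitted job; capture it in a variable.
first=$(qsub preprocess.sh)
# The second job is held until the first finishes successfully.
qsub -W depend=afterok:"$first" main.sh
```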
Please see more details in our Raijin User Guide
|Filesystem||Purpose||Availability||Quota||Time limit||Backup|
|/home||Irreproducible data, e.g. source code||raijin only||2GB (user)||none||Yes|
|/short||Large data IO, data maintained beyond one job||raijin only||72GB (project)||none||No|
|/g/data||Processing of large data files||global|| ||none||No|
|massdata||Archiving large data files||external – access using the mdss command|| ||none||2 copies in two different locations|
|$PBS_JOBFS||IO intensive data, job lifetime||local to each individual raijin node||unlimited(3)||duration of job||No|
- Each user belongs to at least two Unix groups:
unigrp – determined by their host institution, and
projectid(s) – one for each project they are attached to.
- Increases to these quotas will be considered on a case-by-case basis.
- Users request an allocation of jobfs as part of their job submission – the actual disk quota for a particular job is given by the jobfs request. Requests larger than 420GB will be automatically redirected to /short (but will still be deleted at the end of the job).
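Using jobfs inside a script might look like this sketch (my_program.exe and its --tmpdir option are placeholders for your own application):

```shell
#!/bin/bash
#PBS -P a99
#PBS -q normal
#PBS -l ncpus=16
#PBS -l walltime=01:00:00
#PBS -l jobfs=100GB
#PBS -l wd

# $PBS_JOBFS is the node-local scratch directory allocated for this job.
# It is removed when the job ends, so copy anything you need back out.
./my_program.exe --tmpdir "$PBS_JOBFS" > my_output.out
cp "$PBS_JOBFS"/restart.dat "$PBS_O_WORKDIR"/
```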
- Please make sure you specify #PBS -lother=gdata1 when submitting jobs that access files in /g/data1. If the /g/data1 filesystem is not available, your job will not start. The following command monitors the status of /g/data1 on raijin and can be incorporated inside your jobscript:
/opt/rash/bin/modstatus -n gdata1_status
- Please make sure you specify #PBS -lother=mdss when submitting jobs that access files in mdss. If the mdss filesystem is not available, your job will not start. The following command monitors the status of mdss on raijin and can be incorporated inside your jobscript:
/opt/rash/bin/modstatus -n mdss_status
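A jobscript that depends on mdss could incorporate the check like this (a sketch; the surrounding commands and filenames are placeholders):

```shell
#!/bin/bash
#PBS -P a99
#PBS -q normal
#PBS -l other=mdss
#PBS -l wd

# Record the current mdss status in the job log before touching archived files.
/opt/rash/bin/modstatus -n mdss_status

# Retrieve an archived input, run the application, and write the output.
mdss get my_archive/input.dat
./my_program.exe input.dat > my_output.out
```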
More detail can be found in Filesystem User Guide.