There are three options for priority, paid-for access to RACC resources:

1. The partition for the legacy met-cluster projects, based on hardware purchased by these projects.
2. A similar, new 'project' partition consisting of new nodes purchased by projects (no nodes have been purchased yet).
3. Custom project partitions for some large projects with custom hardware requirements, e.g. with GPUs.
Idle time on the contributed hardware may be shared with other users. This will be arranged so that inconvenience to the paying users is minimal.
The level of support for the paying projects on the RACC is the same as for users of the free partitions, i.e. it is on a best-effort basis. Even where hardware is in warranty and supported by the supplier, the RACC is based on free software. Also, on the RACC we focus on capacity and efficiency, not on reliability: the cluster configuration may be adjusted to changing research requirements, and some tests may be carried out on the production system. A higher level of support and reliability is provided by the Research Cloud Service.
The active meteorology paid-for projects should now use the dedicated resources for interactive and batch computing on the RACC. met-cluster now has only one compute node (so it is not really a cluster anymore!) and is not suitable for running compute-intensive jobs. The sizes of the CPU allocations on the RACC are the same as they were on met-cluster (however, in some cases we now count hardware threads rather than CPU cores, so the numbers are doubled). The existing 'project' compute nodes are out of warranty, and this project partition will be gradually reduced in size and then decommissioned as the legacy subscriptions expire.
When possible, some of the login nodes will be set aside for exclusive use by priority users, i.e. members of the paying projects. Priority users have access to both 'project' and 'free' login nodes, and login sessions are scheduled across the pool consisting of both groups of nodes. Both the priority and free nodes have a similar specification. The benefit of the priority nodes is that, because they are used by a smaller number of users, the CPU and memory load is expected to be lower. As a new session is scheduled on the node with the lowest load, priority users will typically be using the priority nodes, but when the load on the priority nodes is high they might be logged in on any available node.
To submit a job to the privileged 'project' partition, one needs to specify the account name, i.e. their project name, and select the partition 'project'. This can be done from the command line:
sbatch -A <project name> -p project batch_script.sh
or, by specifying the account and the partition in the batch script with the directives:
#SBATCH --account=<project name>
#SBATCH --partition=project
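Putting this together, a minimal batch script might look like the following sketch (the job name and the final command are placeholders for illustration; replace <project name> with your project's account name):

```shell
#!/bin/bash
#SBATCH --account=<project name>
#SBATCH --partition=project
#SBATCH --job-name=example_job
#SBATCH --ntasks=1

# The actual work of the job goes here.
echo "Running on $(hostname)"
```

With the directives in the script, the job can be submitted with a plain `sbatch batch_script.sh`, with no further command-line options.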
Note that specifying the account with -A <project name> without adding -p project will result in the job running in the free partition 'cluster' (in some cases this might be useful). Such a job will enjoy the higher priority given by the 'project' quality of service (QOS) associated with the account, and it will count towards the account's CPU limit.
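For example, with <project name> as a placeholder, such a submission looks like this:

```shell
# Runs in the free 'cluster' partition, but with the 'project' QOS
# and counted against the project's CPU limit.
sbatch -A <project name> batch_script.sh
```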
The accounts (projects) available to the user can be listed with the command:
sacctmgr show assoc user=<username> format=User,Account,QOS
The project allocation and the list of allowed users can be checked with the command:
sacctmgr show assoc Account=<project> format=Account,User,GrpTRES
It might happen that a job does not start in the paid-for allocation because some resources are already in use, or because the job is larger than the account's CPU allocation. In those cases the pending reason shown for the job will be 'AssocGrpCpuLimit'. For pending (queued) jobs, the project account can be changed. If you have access to another account you can try:
scontrol update job <job number> Account=<another project>
Or, you can reset the job to the default free account ‘local’ and the partition ‘cluster’:
scontrol update job <job number> Account=local Partition=cluster QOS=normal
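Before moving a job, the pending reason can be checked with squeue; a sketch, with <username> as a placeholder (the %r field prints the reason, e.g. 'AssocGrpCpuLimit'):

```shell
# List your pending jobs with job ID, name, account and pending reason.
squeue -u <username> --states=PENDING -o "%i %j %a %r"
```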
Where research grants require the purchase of hardware, or where access to a large number of CPU cores or to large-memory nodes is required, servers matching the current specification can be purchased from our suppliers with the agreement of IT. The current node specification is 2 x Intel Xeon Gold 6126 (2 x 12 cores) with 384 GB of RAM, and the cost is £7055.17 + VAT (we need to update the quote for the Xeon Gold 6226). The warranty on these servers is 5 years, so purchasing a node gives 5 years of access to a proportional amount of resources in the project partition. After that time the nodes will become part of the free partition and will be useful to all cluster users for another couple of years. Buying these nodes is also the way to ensure the cluster keeps running in the future: the current nodes are ageing and will eventually need to be replaced.
To start creating this partition we need a sufficient number of projects willing to purchase nodes. We need at least a couple of identical nodes to start with; otherwise sharing nodes, scheduling jobs, etc. will not work well.
Access will be organized in a similar way to that for the legacy met projects described above: one shared pool of identical nodes, with project allocations proportional to the number of contributed nodes. We can adjust the proportional allocations to maximize resource utilization and throughput, so that projects can have access to more resources at once when they need them.
A shared partition, rather than access restricted to the purchased servers, is proposed because this model has a number of advantages. Sharing resources allows access to a larger number of CPUs than were purchased, while idle time on your hardware can be used by others. When not relying on a single server, a server failure does not mean that the project loses all access to computing. Sharing nodes also offers better power efficiency, so it is a more environmentally friendly approach to computing. The nodes are automatically powered on and off as they are needed. At less busy times, in a shared partition, a number of users can share the nodes that happen to be already powered on, instead of frequently powering up their own nodes, or worse, keeping them always running idle or underutilized.
For large computational projects, with enough hardware to form a separate cluster, or with specialized hardware, e.g. GPUs, separate project partitions can be created. In such cases confirmation from IT must be obtained before any hardware is purchased.