We are changing the way we allocate compute resources to partitions on the HPC cluster. The new allocation scheme gives owners more flexibility and may reduce job wait times.
From now on, all research groups that have purchased dedicated resources on the cluster will have access to the newest compute nodes. This change will happen automatically; RCC staff have already reconfigured several partitions.
Before this change, your purchase was tied to a specific hardware resource. For example, if you purchased 48 normalized compute units (NCUs) in 2018, we would allocate those NCUs on 2018 hardware. If you had also purchased NCUs in 2017, those would be assigned to 2017 hardware, and you could not run a parallel application on both resources simultaneously.
Now, we will assign all owner-based queues to a node pool that contains the latest hardware. This change will significantly reduce job wait times when one of your assigned nodes has crashed, because your jobs can run on other nodes in the pool rather than waiting for that specific node to return to service. Most importantly, it will also allow you to run applications on all of the cores you have purchased, regardless of the year in which they were purchased.
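As a rough illustration, here is a minimal Slurm batch script sketch showing a single parallel job that spans cores from multiple purchase years. The partition name "mygroup_q", the core counts, and the application name are hypothetical placeholders, not the actual RCC configuration; consult our documentation for the real queue names and limits that apply to your group.

    #!/bin/bash
    # Sketch only: partition name, core count, and application are hypothetical.
    #SBATCH --job-name=parallel_test
    #SBATCH --partition=mygroup_q     # your owner-based queue, now backed by the shared node pool
    #SBATCH --ntasks=96               # e.g., cores purchased in 2017 and 2018 combined
    #SBATCH --time=04:00:00

    # Launch one MPI application across all requested cores,
    # regardless of the year in which the cores were purchased.
    srun ./my_parallel_app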
You will still have access only to the number of cores that you have leased; our job scheduler, Slurm, will enforce these limits.
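If you want to see the limits associated with your account, the standard Slurm commands below may help. This is a sketch under assumptions: the exact limit fields we use (for example, GrpTRES) and the partition name "mygroup_q" are illustrative and may differ from the actual RCC setup.

    # Inspect the limits attached to your Slurm account associations
    # (the specific fields RCC uses to enforce core limits may differ).
    sacctmgr show associations where user=$USER format=Account,Partition,GrpTRES

    # View the configuration of your owner-based partition
    # ("mygroup_q" is a hypothetical name).
    scontrol show partition mygroup_q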
An important exception is the GPU nodes, which we will still allocate to individual research groups.
We are also updating our documentation and command-line tools (e.g., rcctool) to reflect this change.
If you would like any clarification about this policy change, please feel free to reach out to us at support@rcc.fsu.edu.