UPDATE Apr 21 - We are still working on our Lustre issues. Spear and Lustre are still unavailable.
Yesterday (Sunday, April 19), the data center in Dirac suffered a brief power outage. Most systems are back online now, but there are lingering issues with the Lustre filesystem. This means that Spear systems are unavailable as well.
We are currently working on a solution, and will post an update as soon as we have resolved the issues.
This summer, we are going to replace the scheduling and job management software on the High Performance Computing Cluster with a new package called Slurm. Slurm will replace the current MOAB/Torque workload manager that we have been using for the past eight years. This will affect all HPC users.
A paper by Bin Chen, a member of the applications group, has been accepted for publication in the Astrophysical Journal Supplement Series. The article, titled "Algorithms And Programs For Strong Gravitational Lensing In Kerr Space-time Including Polarization", used the HPC, MATLAB, and Cython for performance analysis.
HPC jobs are traditionally written in low-level compiled languages, such as C and Fortran. The reason for this is speed. Parallel programming tools such as OpenMP and MPI, along with GPGPU accelerators, let code run faster by spreading work across more hardware. By contrast, higher-level interpreted languages like Python and MATLAB are much easier for scientists to work with, but are often orders of magnitude slower than compiled languages. However, these higher-level languages are improving significantly in speed and flexibility.
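As a rough illustration of the speed gap described above, the following Python sketch times the same summation two ways: once with an explicit loop that runs step by step in the interpreter, and once with the built-in sum(), which does its looping in compiled code. The function names loop_sum and builtin_sum are made up for this example, and it assumes the standard CPython interpreter, where sum() is implemented in C. Timings depend on the machine, so none are quoted here.

```python
import timeit


def loop_sum(n):
    """Add the integers 0..n-1 with an explicit loop run by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Compute the same total, letting the built-in sum() do the looping."""
    return sum(range(n))


if __name__ == "__main__":
    n = 200_000  # illustrative problem size, chosen only to keep the run short

    # Both versions must agree before their timings are worth comparing.
    assert loop_sum(n) == builtin_sum(n)

    t_loop = timeit.timeit(lambda: loop_sum(n), number=5)
    t_builtin = timeit.timeit(lambda: builtin_sum(n), number=5)
    print(f"explicit loop:  {t_loop:.4f} s")
    print(f"built-in sum(): {t_builtin:.4f} s")
```

This only compares interpreted and compiled loops within Python itself; a C or Fortran program, or a parallel run with OpenMP or MPI, would need its own benchmark.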
UPDATE (Mar 4): These issues have been resolved; thanks for your patience.
We are experiencing some problems with our job scheduler; it is running
but accepts new jobs only sporadically, if at all, and will time out when
you try to access job information (for example, with checkjob). Running
jobs will not be impacted.
We are pleased to welcome a new member to our team, Terry Ward.
Terry has 43 years of experience working in systems administration, mostly in large data center environments. His experience encompasses a broad range of IT activities, including system architecture, network management, and software development.
Terry will join our support team, and will take over a number of critical systems management tasks, including HPC scheduler management and storage operations.