We have been notified that the UCT Upper Campus data centre will be
shut down for emergency electrical work; see the notice below. Please
checkpoint or shut down your jobs by 20:00 on Saturday 12th July.
We apologise for this…
The HPC rack was neatened up. This involved moving and consolidating servers, making space for PDUs and removing redundant cables that were impeding airflow. New HPC servers were installed. This task took two entire days as other
Our users may have noticed that the HPC cluster dashboard is reflecting some infrastructure changes. Please note that this post refers to the older HPC cluster, not hex. The 200 series are going to be decommissioned soon, and this is…
Our older cluster, hpc.uct.ac.za, is undergoing several changes. The 200 series will soon be decommissioned. The nodes are old, have insufficient core density and are inefficient power-wise compared to more modern servers. The space in the racks is required for…
Quite simply, our cluster is running at almost maximum capacity, and there's not much we can do about that. It's always nice to be wanted though :-)
Our strategy to deal with this is three-fold:
1) Shuffle user priorities to…
So after waiting 24 hours and then looking at the AWS billing reports, a couple of things stand out. At first we were puzzled as to why we were billed for Run Instances as well as Spot Instances. Turns out
Node 200 had become a bit unstable over time, so we reinstalled it. Same hardware, just upgraded to SL5.8. We also replaced nodes 216 and 217 with blades with 8GB of RAM to make the 200 series architecture homogeneous. After…
We've added a few more servers to the 200 series and will hopefully add a few more next week. In all we should increase our cluster by about 18 cores. To cope with this we've modified our dashboard to more…