We have installed two high-memory Dell R820 servers into the 800 series. Each server has four sockets with six cores of 2.4GHz each. There are also thirty-two 32GB DIMMs making up 1TB of RAM, although these memory chips …
Our current cluster, hex, runs Torque with MAUI as the scheduler. While MAUI is GPU-aware, it does not allow GPUs to be scheduled. In other words, you can list the nodes with GPUs but you cannot submit a job …
The HPC team is busy testing a prototype SLURM cluster. This will replace both the hex and hpc clusters, although it is likely that the hex infrastructure will be incorporated into the new cluster. The time frame for
We've installed four new 600 series worker nodes on hex. This increases the core count by 256, and we hope to add a few more to both the hex and hpc clusters shortly. If you notice anything odd please let…
We have been notified that the UCT Upper Campus data centre will be
shut down for emergency electrical work; see the notice below. Please
checkpoint or shut down your jobs by 20:00 on Saturday 12th July.
We apologise for this…
The HPC rack was neatened up. This involved moving and consolidating servers, making space for PDUs, and removing redundant cables that were impeding airflow. New HPC servers were installed. This task took two entire days as other
Our users may have noticed that the hpc cluster dashboard is reflecting some infrastructure changes. Please note that this post refers to the older hpc cluster, not hex. The 200 series are going to be decommissioned soon, and this is…
Our older cluster, hpc.uct.ac.za, is undergoing several changes. The 200 series will soon be decommissioned. The nodes are old, have insufficient core density and are inefficient power-wise compared to more modern servers. The space in the racks is required for…
Quite simply, our cluster is running at almost maximum capacity, and there's not much we can do about that. It's always nice to be wanted though :-)
Our strategy to deal with this is three-fold:
1) Shuffle user priorities to…