We have completed the upgrade to the GPU portion of the hex cluster:
– Installed a new GPU004 server with two NVIDIA K40 cards.
– Added two additional NVIDIA K40 cards to GPU003.
This brings the number of GPU cards in…
The new GPU server, srvslsgpu004, is up and running. Still to be configured are the InfiniBand card and the BGFS volume. The server is being tested and will remain offline until next week. In the server are 2 x 10…
ICTS will be conducting power maintenance in their data centers on Sunday the 26th of June between 09:00 and 17:00. The Bremner data center will be shut down completely, and hence the Hal SLURM cluster will be offline. We will…
The ICTS hex cluster will be down for scheduled maintenance from Monday, January 11th at 09:00 to Tuesday, January 12th at 17:00. The head node, data node and all worker nodes will be patched and rebooted, so all jobs should be canceled…
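If you have jobs on hex, you can clear them ahead of the shutdown with the usual Torque commands; a minimal sketch (the job ID shown is illustrative):

    # list your own jobs on the cluster
    qstat -u $USER

    # cancel a single job by its ID
    qdel 12345

    # or cancel all of your jobs in one go
    qselect -u $USER | xargs qdel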
Co-design is a new approach to HPC infrastructure design, essentially the next level of performance and scale. In a nutshell, co-design is the ability to offload as many CPU-cycle operations as possible to a design whereby synergies are created between…
We have installed two high-memory Dell R820 servers into the 800 series. Each server has four sockets with six 2.4GHz cores each, for 24 cores per server. There are also thirty-two 32GB DIMMs making up 1TB of RAM, although these memory chips …
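Requesting one of these nodes would look roughly like the standard Torque request below. This is a sketch only: the usable memory ceiling (some RAM is reserved for the OS) and whether a dedicated queue applies should be confirmed with the HPC team first.

    # hypothetical request for a full high-memory node: all 24 cores and 900GB of RAM
    qsub -l nodes=1:ppn=24 -l mem=900gb bigmem_job.sh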
Our current cluster, hex, runs Torque with MAUI as the scheduler. While MAUI is GPU-aware, it does not allow GPUs to be scheduled. In other words, you can list the nodes with GPUs, but you cannot submit a job …
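To make the distinction concrete, here is roughly how this plays out on a Torque/MAUI setup (the output shown is illustrative):

    # Torque itself knows about the GPUs and will report them per node:
    pbsnodes srvslsgpu004 | grep gpus
    #   gpus = 2

    # ...but a GPU resource request like the one below, while valid Torque
    # syntax, is not honoured by MAUI, so the job is never placed on a GPU:
    qsub -l nodes=1:gpus=2 gpu_job.sh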
The HPC team is busy testing a prototype SLURM cluster. This will replace both the hex and hpc clusters, although it is likely that the hex infrastructure will be incorporated into the new cluster. The time frame for
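For those who have only used Torque, SLURM job submission will look roughly like this once the new cluster goes live; the partition name below is a placeholder, not a confirmed name:

    #!/bin/bash
    #SBATCH --job-name=hello
    #SBATCH --ntasks=1
    #SBATCH --time=00:10:00
    #SBATCH --partition=main   # placeholder; real partition names will be announced

    # trivial test payload: report which worker node we landed on
    hostname

You would submit this with sbatch hello.sh and monitor it with squeue -u $USER.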
We've installed four new 600 series worker nodes on hex. This increases the core count by 256 (64 cores per node), and we hope to add a few more to both the hex and hpc clusters shortly. If you notice anything odd, please let…