We're currently investigating a memory issue with some of the worker nodes. Memory is not being freed up after jobs complete.
UPDATE - 4 July:
Turns out it's not a memory error. The problem is the way that net-snmp monitors…
We recently installed Crux on our cluster for researchers in the IIDMM department at UCT. Unfortunately it's not MPI-aware and only runs on single cores. Additionally, the binary we downloaded did not run on the cluster due to a…
Subsequent to the installation of MrBayes we decided to deploy BLAST on the HPC cluster to complement the package on our Grid cluster. BLAST (Basic Local Alignment Search Tool) is a free bioinformatics package…
Gromacs is used to perform molecular dynamics simulations. It's a popular tool and we decided to install it on our cluster based on the needs of two researchers, one at UCT and the other in Kenya. Gromacs is designed to…
Over the last week ICTS engineers have been adding 5 new BL460 blades to the cluster. The OS install took place on Tuesday, applications on Wednesday and our first user is already running jobs on the CPUs.
The 400 series…
ICTS's grid cluster recently surpassed 1 computational year with the equivalent of 1 CPU running jobs constantly for 558 days. The sudden jump was the completion of a long-term job that had been running for 36 days on 8…
Patched kernels on HPC servers to 2.6.18-238.1.1.el5. All went fine except for the head node, which has an issue with the latest kernel (it dies at boot with a kernel panic), so we are booting it into the older version 2.6.18-194.1.1.el5 until we can sort…
Two new servers have been added to the cluster bringing the core count to 20.
Below is a set of simple tests carried out with a C program compiled with mpicc which runs a large number of floating point computations…
OpenMPI is now configured on the new cluster. There was an issue with the installation, in that the package was pre-configured to expect InfiniBand, which we do not have (yet). However, after several hours spent battling with it we found…
Installed and configured a new cluster, srvslnhpc001. The head node is an HP BL20P blade with 2 dual core 3.6GHz CPUs and 8GB RAM. The 3 worker nodes are BL20P blades with 2 dual core 3.6GHz CPUs and 4GB RAM…