<?xml version="1.0"?>
<oembed><version>1.0</version><provider_name>UCT HPC</provider_name><provider_url>https://ucthpc.uct.ac.za</provider_url><author_name>Andrew Lewis</author_name><author_url>https://ucthpc.uct.ac.za/index.php/author/andrew-lewis/</author_url><title>Why are my jobs queuing?!? - UCT HPC</title><type>rich</type><width>600</width><height>338</height><html>&lt;blockquote class="wp-embedded-content" data-secret="D2YirQgsWT"&gt;&lt;a href="https://ucthpc.uct.ac.za/index.php/2014/04/16/why-are-my-jobs-queuing/"&gt;Why are my jobs queuing?!?&lt;/a&gt;&lt;/blockquote&gt;&lt;iframe sandbox="allow-scripts" security="restricted" src="https://ucthpc.uct.ac.za/index.php/2014/04/16/why-are-my-jobs-queuing/embed/#?secret=D2YirQgsWT" width="600" height="338" title="&#x201C;Why are my jobs queuing?!?&#x201D; &#x2014; UCT HPC" data-secret="D2YirQgsWT" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" class="wp-embedded-content"&gt;&lt;/iframe&gt;&lt;script type="text/javascript"&gt;
/*! This file is auto-generated */
!function(c,d){"use strict";var e=!1,o=!1;if(d.querySelector)if(c.addEventListener)e=!0;if(c.wp=c.wp||{},c.wp.receiveEmbedMessage);else if(c.wp.receiveEmbedMessage=function(e){var t=e.data;if(!t);else if(!(t.secret||t.message||t.value));else if(/[^a-zA-Z0-9]/.test(t.secret));else{for(var r,s,a,i=d.querySelectorAll('iframe[data-secret="'+t.secret+'"]'),n=d.querySelectorAll('blockquote[data-secret="'+t.secret+'"]'),o=new RegExp("^https?:$","i"),l=0;l&lt;n.length;l++)n[l].style.display="none";for(l=0;l&lt;i.length;l++)if(r=i[l],e.source!==r.contentWindow);else{if(r.removeAttribute("style"),"height"===t.message){if(1e3&lt;(s=parseInt(t.value,10)))s=1e3;else if(~~s&lt;200)s=200;r.height=s}if("link"===t.message)if(s=d.createElement("a"),a=d.createElement("a"),s.href=r.getAttribute("src"),a.href=t.value,!o.test(a.protocol));else if(a.host===s.host)if(d.activeElement===r)c.top.location.href=t.value}}},e)c.addEventListener("message",c.wp.receiveEmbedMessage,!1),d.addEventListener("DOMContentLoaded",t,!1),c.addEventListener("load",t,!1);function t(){if(o);else{o=!0;for(var e,t,r,s=-1!==navigator.appVersion.indexOf("MSIE 10"),a=!!navigator.userAgent.match(/Trident.*rv:11\./),i=d.querySelectorAll("iframe.wp-embedded-content"),n=0;n&lt;i.length;n++){if(!(r=(t=i[n]).getAttribute("data-secret")))r=Math.random().toString(36).substr(2,10),t.src+="#?secret="+r,t.setAttribute("data-secret",r);if(s||a)(e=t.cloneNode(!0)).removeAttribute("security"),t.parentNode.replaceChild(e,t);t.contentWindow.postMessage({message:"ready",secret:r},"*")}}}}(window,document);
&lt;/script&gt;
</html><description>Quite simply, our cluster is running at almost maximum capacity, and there's not much we can do about that. It's always nice to be wanted though :-)
Our strategy to deal with this is three-fold:
1) Shuffle user priorities to allow low-core, short-term users to jump the queue. This may seem unfair, but consider the example of a user requiring 32 cores holding everyone back for 2 days, even though some users only need 1 core for an hour and there are currently 31 cores free. Unfortunately the built-in fair-share policy can't deal with this situation fast enough, so this is being done manually on a case-by-case basis just to keep as many jobs flowing as possible. We have set the Galaxy user to have a very high priority, as these are pipeline jobs that need to be dealt with sequentially. They are also fairly short-term and they run in batches.
2) Move users to other queues. We may ask some of our users to change their scripts to run on CLOUDQ or CLOUDHMQ. Most likely these will be either users who run short-term jobs (under 1 hour) or users who run high-memory jobs.
3) Buy more kit. This is being done, but it's expensive and needs to be well motivated. We hope to have another 512 cores put in over the next few months.</description></oembed>
