{"id":755,"date":"2014-04-16T21:47:41","date_gmt":"2014-04-16T19:47:41","guid":{"rendered":"http:\/\/oldblogs.uct.ac.za\/blog\/big-bytes\/2014\/04\/16\/why-are-my-jobs-queuing"},"modified":"2015-08-14T11:24:07","modified_gmt":"2015-08-14T09:24:07","slug":"why-are-my-jobs-queuing","status":"publish","type":"post","link":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/","title":{"rendered":"Why are my jobs queuing?!?"},"content":{"rendered":"Quite simply, our cluster is running at almost maximum capacity, and there's not much we can do about that. It's always nice to be wanted though :-)\r\n\r\nOur strategy to deal with this is three-fold:\r\n\r\n1) Shuffle user priorities to allow low-core, short-term users to jump the queue. This may seem unfair, but consider the example of a user requiring 32 cores holding everyone back for 2 days even though some users only need 1 core for an hour and there are currently 31 cores free. Unfortunately the built-in fair-share policy can't deal with this situation fast enough, so this is being done manually on a case-by-case basis just to keep as many jobs flowing as possible. We have set the Galaxy user to have a very high priority as these are pipeline jobs that need to be dealt with sequentially. They are also fairly short-term and they run in batches.\r\n\r\n2) Move users to other queues. We may ask some of our users to change their scripts to run on CLOUDQ or CLOUDHMQ. Most likely these will be users who run either short-term jobs (under 1 hour) or high-memory jobs.\r\n\r\n3) Buy more kit. This is being done, but it's expensive and needs to be well motivated. We hope to have another 512 cores put in over the next few months.","protected":false},"excerpt":{"rendered":"<p>Quite simply, our cluster is running at almost maximum capacity, and there&#8217;s not much we can do about that. 
It&#8217;s always nice to be wanted though \ud83d\ude42<\/p>\n<p>Our strategy to deal with this is three-fold:<\/p>\n<p>1) Shuffle user priorities to allow low-core, short-term users to jump the queue. This may seem unfair, but consider the example of a user requiring 32 cores holding everyone back for 2 days even though some users only need 1 core for an hour and there are currently 31 cores free. Unfortunately the built-in fair-share policy can&#8217;t deal with this situation fast enough, so this is being done manually on a case-by-case basis just to keep as many jobs flowing as possible. We have set the Galaxy user to have a very high priority as these are pipeline jobs that need to be dealt with sequentially. They are also fairly short-term and they run in batches.<\/p>\n<p>2) Move users to other queues. We may ask some of our users to change their scripts to run on CLOUDQ or CLOUDHMQ. Most likely these will be users who run either short-term jobs (under 1 hour) or high-memory jobs.<\/p>\n<p>3) Buy more kit. This is being done, but it&#8217;s expensive and needs to be well motivated. We hope to have another 512 cores put in over the next few months.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[6,4,13,14],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Why are my jobs queuing?!? - UCT HPC<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Why are my jobs queuing?!? 
- UCT HPC\" \/>\n<meta property=\"og:description\" content=\"Quite simply, our cluster is running at almost maximum capacity, and there&#039;s not much we can do about that. It&#039;s always nice to be wanted though :-) Our strategy to deal with this is three-fold: 1) Shuffle user priorities to allow low-core, short-term users to jump the queue. This may seem unfair, but consider the example of a user requiring 32 cores holding everyone back for 2 days even though some users only need 1 core for an hour and there are currently 31 cores free. Unfortunately the built-in fair-share policy can&#039;t deal with this situation fast enough, so this is being done manually on a case-by-case basis just to keep as many jobs flowing as possible. We have set the Galaxy user to have a very high priority as these are pipeline jobs that need to be dealt with sequentially. They are also fairly short-term and they run in batches. 2) Move users to other queues. We may ask some of our users to change their scripts to run on CLOUDQ or CLOUDHMQ. Most likely these will be users who run either short-term jobs (under 1 hour) or high-memory jobs. 3) Buy more kit. This is being done, but it&#039;s expensive and needs to be well motivated. We hope to have another 512 cores put in over the next few months.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\" \/>\n<meta property=\"og:site_name\" content=\"UCT HPC\" \/>\n<meta property=\"article:published_time\" content=\"2014-04-16T19:47:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2015-08-14T09:24:07+00:00\" \/>\n<meta name=\"author\" content=\"Andrew Lewis\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Lewis\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\"},\"author\":{\"name\":\"Andrew Lewis\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e\"},\"headline\":\"Why are my jobs queuing?!?\",\"datePublished\":\"2014-04-16T19:47:41+00:00\",\"dateModified\":\"2015-08-14T09:24:07+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\"},\"wordCount\":227,\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"articleSection\":[\"hardware\",\"hpc\",\"maui\",\"torque\"],\"inLanguage\":\"en-ZA\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\",\"name\":\"Why are my jobs queuing?!? 
- UCT HPC\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\"},\"datePublished\":\"2014-04-16T19:47:41+00:00\",\"dateModified\":\"2015-08-14T09:24:07+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#breadcrumb\"},\"inLanguage\":\"en-ZA\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucthpc.uct.ac.za\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Why are my jobs queuing?!?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"name\":\"UCT HPC\",\"description\":\"University of Cape Town High Performance Computing\",\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-ZA\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\",\"name\":\"University of Cape Town High Performance Computing\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"contentUrl\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"width\":450,\"height\":423,\"caption\":\"University of Cape Town High Performance 
Computing\"},\"image\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e\",\"name\":\"Andrew Lewis\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g\",\"caption\":\"Andrew Lewis\"},\"sameAs\":[\"http:\/\/blogs.uct.ac.za\/blog\/big-bytes\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Why are my jobs queuing?!? - UCT HPC","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/","og_locale":"en_US","og_type":"article","og_title":"Why are my jobs queuing?!? - UCT HPC","og_description":"Quite simply, our cluster is running at almost maximum capacity, and there's not much we can do about that. It's always nice to be wanted though :-) Our strategy to deal with this is three-fold: 1) Shuffle user priorities to allow low-core, short-term users to jump the queue. This may seem unfair, but consider the example of a user requiring 32 cores holding everyone back for 2 days even though some users only need 1 core for an hour and there are currently 31 cores free. Unfortunately the built-in fair-share policy can't deal with this situation fast enough, so this is being done manually on a case-by-case basis just to keep as many jobs flowing as possible. We have set the Galaxy user to have a very high priority as these are pipeline jobs that need to be dealt with sequentially. 
They are also fairly short-term and they run in batches. 2) Move users to other queues. We may ask some of our users to change their scripts to run on CLOUDQ or CLOUDHMQ. Most likely these will be users who run either short-term jobs (under 1 hour) or high-memory jobs. 3) Buy more kit. This is being done, but it's expensive and needs to be well motivated. We hope to have another 512 cores put in over the next few months.","og_url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/","og_site_name":"UCT HPC","article_published_time":"2014-04-16T19:47:41+00:00","article_modified_time":"2015-08-14T09:24:07+00:00","author":"Andrew Lewis","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Andrew Lewis","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#article","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/"},"author":{"name":"Andrew Lewis","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e"},"headline":"Why are my jobs queuing?!?","datePublished":"2014-04-16T19:47:41+00:00","dateModified":"2015-08-14T09:24:07+00:00","mainEntityOfPage":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/"},"wordCount":227,"publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"articleSection":["hardware","hpc","maui","torque"],"inLanguage":"en-ZA"},{"@type":"WebPage","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/","url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/","name":"Why are my jobs queuing?!? 
- UCT HPC","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/#website"},"datePublished":"2014-04-16T19:47:41+00:00","dateModified":"2015-08-14T09:24:07+00:00","breadcrumb":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#breadcrumb"},"inLanguage":"en-ZA","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2014\/04\/16\/why-are-my-jobs-queuing\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucthpc.uct.ac.za\/"},{"@type":"ListItem","position":2,"name":"Why are my jobs queuing?!?"}]},{"@type":"WebSite","@id":"https:\/\/ucthpc.uct.ac.za\/#website","url":"https:\/\/ucthpc.uct.ac.za\/","name":"UCT HPC","description":"University of Cape Town High Performance Computing","publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-ZA"},{"@type":"Organization","@id":"https:\/\/ucthpc.uct.ac.za\/#organization","name":"University of Cape Town High Performance Computing","url":"https:\/\/ucthpc.uct.ac.za\/","logo":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/","url":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","contentUrl":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","width":450,"height":423,"caption":"University of Cape Town High Performance Computing"},"image":{"@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e","name":"Andrew 
Lewis","image":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g","caption":"Andrew Lewis"},"sameAs":["http:\/\/blogs.uct.ac.za\/blog\/big-bytes"]}]}},"_links":{"self":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/755"}],"collection":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/comments?post=755"}],"version-history":[{"count":2,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/755\/revisions"}],"predecessor-version":[{"id":2049,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/755\/revisions\/2049"}],"wp:attachment":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/media?parent=755"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/categories?post=755"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/tags?post=755"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}