{"id":815,"date":"2013-09-30T14:06:18","date_gmt":"2013-09-30T12:06:18","guid":{"rendered":"http:\/\/oldblogs.uct.ac.za\/blog\/big-bytes\/2013\/09\/30\/own-goal"},"modified":"2022-09-26T20:30:30","modified_gmt":"2022-09-26T18:30:30","slug":"own-goal","status":"publish","type":"post","link":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/","title":{"rendered":"Own goal"},"content":{"rendered":"<div>Last month we decided to adjust our HPC queues so that jobs would land on the correct series by default. Under the usual amount of stress and pressure, these changes were duly made (by the author) to the queue manager:<\/div>\r\n<div><span style=\"font-size: xx-small;\">set queue [QueueName] resources_default.neednodes = [NodeTag]<\/span><\/div>\r\n<div>About 24 hours later it was noticed that jobs were queuing even though resources were available. Initially it was suspected that, because the PBS nodes=X directive was being used, jobs designated for the series600 servers were being offered GPU series servers.<\/div>\r\n<div>Googling for the usual \"PBS queues jobs and they don't run\" didn't produce much in the way of solutions. 48 hours later it was clear that there was a deeper problem: several large jobs had just finished, freeing up many cores, yet jobs were still queuing, even single-core jobs. Eventually Heine managed to spot the error: instead of entering the NodeTag in the .neednodes directive, I had entered the queue name again, so the default request could never be satisfied. 
Correcting the line and updating the queue manager fixed the problem immediately.<\/div>\r\n<div>Fortunately this did not result in much interruption to research, but it did highlight the need to use the existing change control system.<\/div>\r\n<div><img src=\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/07\/owngoal.jpg\" alt=\"Own goal\" border=\"0\" \/><\/div>","protected":false},"excerpt":{"rendered":"<div>Last month we decided to adjust our HPC queues so that jobs would land on the correct series by default. Under the usual amount of stress and pressure, these changes were duly made (by the author) to the queue manager:<\/div>\n<div><span>set queue [QueueName] resources_default.neednodes = [NodeTag]<\/span><\/div>\n<div>About 24 hours later it was noticed that jobs were queuing even though resources were available. Initially it was suspected that, because the PBS nodes=X directive was being used, jobs designated for the series600 servers were being offered GPU series servers.<\/div>\n<div>Googling for the usual &#8220;PBS queues jobs and they don&#8217;t run&#8221; didn&#8217;t produce much in the way of solutions. 48 hours later it was clear that there was a deeper problem: several large jobs had just finished, freeing up many cores, yet jobs were still queuing, even single-core jobs. Eventually Heine managed to spot the error: instead of entering the NodeTag in the .neednodes directive, I had entered the queue name again, so the default request could never be satisfied. 
Correcting the line and updating the queue manager fixed the problem immediately.<\/div>\n<div>Fortunately this did not result in much interruption to research, but it did highlight the need to use the existing change control system.<\/div>\n<div><img decoding=\"async\" src=\"http:\/\/blogs.uct.ac.za\/gallery\/1253\/owngoal.jpg\" border=\"0\" alt=\"Own goal\"><\/div>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Own goal - UCT HPC<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Own goal - UCT HPC\" \/>\n<meta property=\"og:description\" content=\"Last month we decided to adjust our HPC queues so that jobs would land on the correct series by default. Under the usual amount of stress and pressure, these changes were duly made (by the author) to the queue manager: set queue [QueueName] resources_default.neednodes = [NodeTag] About 24 hours later it was noticed that jobs were queuing even though resources were available. Initially it was suspected that, because the PBS nodes=X directive was being used, jobs designated for the series600 servers were being offered GPU series servers. Googling for the usual &quot;PBS queues jobs and they don&#039;t run&quot; didn&#039;t produce much in the way of solutions. 
48 hours later it was clear that there was a deeper problem: several large jobs had just finished, freeing up many cores, yet jobs were still queuing, even single-core jobs. Eventually Heine managed to spot the error: instead of entering the NodeTag in the .neednodes directive, I had entered the queue name again, so the default request could never be satisfied. Correcting the line and updating the queue manager fixed the problem immediately. Fortunately this did not result in much interruption to research, but it did highlight the need to use the existing change control system.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\" \/>\n<meta property=\"og:site_name\" content=\"UCT HPC\" \/>\n<meta property=\"article:published_time\" content=\"2013-09-30T12:06:18+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-09-26T18:30:30+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/07\/owngoal.jpg\" \/>\n<meta name=\"author\" content=\"Andrew Lewis\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrew Lewis\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\"},\"author\":{\"name\":\"Andrew Lewis\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e\"},\"headline\":\"Own goal\",\"datePublished\":\"2013-09-30T12:06:18+00:00\",\"dateModified\":\"2022-09-26T18:30:30+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\"},\"wordCount\":209,\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"articleSection\":[\"hpc\"],\"inLanguage\":\"en-ZA\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\",\"name\":\"Own goal - UCT HPC\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\"},\"datePublished\":\"2013-09-30T12:06:18+00:00\",\"dateModified\":\"2022-09-26T18:30:30+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#breadcrumb\"},\"inLanguage\":\"en-ZA\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucthpc.uct.ac.za\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Own goal\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"name\":\"UCT HPC\",\"description\":\"University of Cape Town High 
Performance Computing\",\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-ZA\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\",\"name\":\"University of Cape Town High Performance Computing\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"contentUrl\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"width\":450,\"height\":423,\"caption\":\"University of Cape Town High Performance Computing\"},\"image\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e\",\"name\":\"Andrew Lewis\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g\",\"caption\":\"Andrew Lewis\"},\"sameAs\":[\"http:\/\/blogs.uct.ac.za\/blog\/big-bytes\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Own goal - UCT HPC","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/","og_locale":"en_US","og_type":"article","og_title":"Own goal - UCT HPC","og_description":"Last month we decided to adjust our HPC queues so that jobs would land on the correct series by default. Under the usual amount of stress and pressure, these changes were duly made (by the author) to the queue manager: set queue [QueueName] resources_default.neednodes = [NodeTag] About 24 hours later it was noticed that jobs were queuing even though resources were available. Initially it was suspected that, because the PBS nodes=X directive was being used, jobs designated for the series600 servers were being offered GPU series servers. Googling for the usual \"PBS queues jobs and they don't run\" didn't produce much in the way of solutions. 48 hours later it was clear that there was a deeper problem: several large jobs had just finished, freeing up many cores, yet jobs were still queuing, even single-core jobs. Eventually Heine managed to spot the error: instead of entering the NodeTag in the .neednodes directive, I had entered the queue name again, so the default request could never be satisfied. 
Correcting the line and updating the queue manager fixed the problem immediately. Fortunately this did not result in much interruption to research, but it did highlight the need to use the existing change control system.","og_url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/","og_site_name":"UCT HPC","article_published_time":"2013-09-30T12:06:18+00:00","article_modified_time":"2022-09-26T18:30:30+00:00","og_image":[{"url":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/07\/owngoal.jpg"}],"author":"Andrew Lewis","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Andrew Lewis","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#article","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/"},"author":{"name":"Andrew Lewis","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e"},"headline":"Own goal","datePublished":"2013-09-30T12:06:18+00:00","dateModified":"2022-09-26T18:30:30+00:00","mainEntityOfPage":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/"},"wordCount":209,"publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"articleSection":["hpc"],"inLanguage":"en-ZA"},{"@type":"WebPage","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/","url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/","name":"Own goal - UCT 
HPC","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/#website"},"datePublished":"2013-09-30T12:06:18+00:00","dateModified":"2022-09-26T18:30:30+00:00","breadcrumb":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#breadcrumb"},"inLanguage":"en-ZA","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2013\/09\/30\/own-goal\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucthpc.uct.ac.za\/"},{"@type":"ListItem","position":2,"name":"Own goal"}]},{"@type":"WebSite","@id":"https:\/\/ucthpc.uct.ac.za\/#website","url":"https:\/\/ucthpc.uct.ac.za\/","name":"UCT HPC","description":"University of Cape Town High Performance Computing","publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-ZA"},{"@type":"Organization","@id":"https:\/\/ucthpc.uct.ac.za\/#organization","name":"University of Cape Town High Performance Computing","url":"https:\/\/ucthpc.uct.ac.za\/","logo":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/","url":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","contentUrl":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","width":450,"height":423,"caption":"University of Cape Town High Performance Computing"},"image":{"@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/c183ad1c0a1063124a72d63963ae9c7e","name":"Andrew 
Lewis","image":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9652c9c73beeab594b8dc2383a880048?s=96&d=mm&r=g","caption":"Andrew Lewis"},"sameAs":["http:\/\/blogs.uct.ac.za\/blog\/big-bytes"]}]}},"_links":{"self":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/815"}],"collection":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/comments?post=815"}],"version-history":[{"count":4,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/815\/revisions"}],"predecessor-version":[{"id":4344,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/815\/revisions\/4344"}],"wp:attachment":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/media?parent=815"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/categories?post=815"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/tags?post=815"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}