Array Jobs
By Timothy Carr, 12 April 2013

Recently a student challenged us with a problem: they needed to analyze thousands of input files with an application, but found it cumbersome to submit each input file individually.

We started by looking at a "for loop" to run through the input files, together with the PBS directive "#PBS -v TASKS":

```shell
for ((i = 0; i < TASKS; i++)); do
    command < input.$i
done
```

...exporting the TASKS variable at the CLI with a value equal to the number of inputs to be analyzed. The problem with this "for loop" is that it works through the inputs one at a time instead of executing them all in parallel. To get around this, you add an ampersand (&) to the end of the command you are executing and a "wait" after the loop, as seen below:

```shell
for ((i = 0; i < TASKS; i++)); do
    command < input.$i &
done
wait
```

The above executes nicely, but it doesn't scale, and you will run into another stumbling block: submitting 20 input files effectively uses 20 concurrent CPUs, but submit 1000 input files and you are limited by the number of cores available on the cluster. If the cluster only has 500 cores available, the job will be rejected. Then we discovered that PBS has directives for exactly these special cases.
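As an aside (this is my addition, not something we used on the cluster), the single-node scaling problem can also be softened in plain bash by throttling how many background jobs run at once. This is a minimal sketch; it assumes bash 4.3+ for `wait -n`, and `cat` stands in for the real `command`:

```shell
#!/bin/bash
# Sketch (assumption, not from the original post): run at most MAXPROCS
# background tasks at a time, instead of launching all TASKS at once.
# Requires bash >= 4.3 for "wait -n". "cat" is a stand-in for the real command.
TASKS=8
MAXPROCS=4

# Create some dummy input files so the sketch is self-contained.
mkdir -p demo
for ((i = 0; i < TASKS; i++)); do echo "payload $i" > demo/input.$i; done

for ((i = 0; i < TASKS; i++)); do
    # If MAXPROCS jobs are already running, block until one of them finishes.
    while (( $(jobs -rp | wc -l) >= MAXPROCS )); do
        wait -n
    done
    cat < demo/input.$i > demo/output.$i &
done
wait   # let the last few tasks drain before exiting
```

This keeps at most MAXPROCS processes alive at any moment, so the loop never asks for more cores than you requested, but it still only uses the cores of the one node the script runs on.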
Enter the PBS directive "#PBS -t" and the "${PBS_ARRAYID}" environment variable. The job submission script would look something like this:

```shell
#PBS -N inputs
#PBS -l nodes=1:series600:ppn=1
#PBS -q UCTlong
cd /home/username/application/
./command < /home/username/application/inputs/input.${PBS_ARRAYID}
```

To submit to the cluster you would run "qsub -t 0-100 job.sh", where 0-100 is the range of input files, named with their respective extensions input.0, input.1, input.2, ... input.100, in a directory. The "-t 0-100" range is parsed into ${PBS_ARRAYID}, and 101 jobs are spawned on the cluster, with some running and some being queued. To check up on the status of an array job, execute "qstat -t".
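Before submitting a large range, it can help to dry-run the index-to-file mapping locally. This sketch (my addition, not part of the original post) simulates what each array task sees by setting PBS_ARRAYID by hand; on the cluster, "qsub -t 0-2 job.sh" would set it once per spawned task:

```shell
#!/bin/bash
# Sketch: simulate three PBS array tasks locally by setting PBS_ARRAYID
# ourselves. The "inputs" directory and file contents are made up for the demo.
mkdir -p inputs
for i in 0 1 2; do echo "data for task $i" > inputs/input.$i; done

for PBS_ARRAYID in 0 1 2; do
    # This line mirrors the last line of the job script above.
    cat < inputs/input.${PBS_ARRAYID} >> run.log
done
```

If run.log ends up with one line per index and no "No such file" errors, the range you pass to "qsub -t" matches the files on disk.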