{"id":3523,"date":"2018-10-19T16:20:42","date_gmt":"2018-10-19T14:20:42","guid":{"rendered":"http:\/\/ucthpc.uct.ac.za\/?p=3523"},"modified":"2018-10-22T13:39:42","modified_gmt":"2018-10-22T11:39:42","slug":"update-on-beegfs-storage","status":"publish","type":"post","link":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/","title":{"rendered":"Update on BeegFS storage"},"content":{"rendered":"<p>The disk cluster is configured and running. Four <a href=\"https:\/\/www.dell.com\/en-us\/work\/shop\/povw\/poweredge-r740xd\" target=\"_blank\" rel=\"noopener\">Dell R740XD servers<\/a>\u00a0each with a combination of twelve 10TB NLSAS and two 240GB SSD drives are providing a 345TB scratch volume, an 8TB \/home volume and a 4TB software volume. Each server also has two 120GB SSD drives for the OS. The storage and compute nodes (<a href=\"https:\/\/www.dell.com\/za\/enterprise\/p\/poweredge-c6420\/pd\" target=\"_blank\" rel=\"noopener\">Dell C6420s<\/a>) are all connected at 100Gb\/s via an <a href=\"https:\/\/store.mellanox.com\/products\/mellanox-msb7800-es2f-switch-ib-2-based-edr-infiniband-1u-switch-36-qsfp28-ports-2-power-supplies-ac-x86-dual-core-standard-depth-p2c-airflow-rail-kit-rohs6.html\" target=\"_blank\" rel=\"noopener\">EDR MSB7800 Mellanox switch<\/a>.<\/p>\n<p>Each storage target is natively configured with RAID6, which provides redundancy: we can lose two drives in a RAID set simultaneously, at the cost of write speed and rebuild performance. We chose RAID6 over RAID10 because we will only be backing up data on the \/home and software volumes, not scratch. A nice feature of Dell&#8217;s R740XD is the ability to hot-swap drives from any of the three RAID sets, front or back.<\/p>\n<p>The storage servers also run metadata services, which manage the striping and placement of files on the storage disks. Having multiple metadata services decreases the read\/write time for file access. 
The metadata is stored on mirrored SSD drives for additional speed. As RAID1 is less reliable than RAID6, the metadata services are buddy mirrored, which means we can lose an entire metadata RAID set and the service will automatically fail over to the second server.<\/p>\n<p>BeeGFS is tricky to set up, especially in <a href=\"https:\/\/www.beegfs.io\/wiki\/MultiMode\" target=\"_blank\" rel=\"noopener\">multi-mode<\/a>, and requires a great deal of planning. Multi-mode allows us to run several BeeGFS clusters in one environment, which means we can provision separate disk sets and quotas depending on the storage requirements. Below is the logical planning diagram of our multi-mode services.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2018\/10\/logical.jpg\" \/><\/p>\n<p>We are running the latest version of BeeGFS, which uses systemd to manage the multi-mode services (with the exception of the clients, which still use init scripts). Version 7 also uses its own libbeegfs-ib package to build and manage the RDMA client services.<\/p>\n<p>It&#8217;s critical to adjust and <a href=\"https:\/\/www.beegfs.io\/wiki\/TuningAdvancedConfiguration\" target=\"_blank\" rel=\"noopener\">fine-tune BeeGFS<\/a>, especially if one is using a non-standard interconnect. A few small file system configuration adjustments took our native write speed from 15Gb\/s to 18Gb\/s.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2018\/10\/RDMA.png\" \/><\/p>\n<p>An example of disk write and striping is given below. A user is copying a 52GB file from one BeeGFS disk to another. The overall write speed from the compute node is 5.6GB\/s, load balanced across all four storage servers at approximately 1.4GB\/s each (about 11.2Gb\/s), which is approaching the 12Gb\/s limit of the RAID controller in each server. Keep in mind that the file is being read from the storage targets at the same time, so the combined read\/write speed is actually 11+GB\/s. 
Yes, that&#8217;s a capital B.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2018\/10\/MaxIB.jpg\" \/><\/p>\n<p>Astute readers will have noted that, with only four targets, the switch is very unlikely to be saturated, given that it is capable of 100Gb\/s; however, the interconnect will also be used for MPI inter-process communication of parallel jobs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The disk cluster is configured and running. Four Dell R740XD servers\u00a0each with a combination of twelve 10TB NLSAS and two 240GB SSD drives are providing a 345TB scratch volume, an 8TB \/home volume and a 4TB software volume. Each server also has two 120GB SSD drives for the OS. The storage and compute nodes (Dell&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[25,4,26],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Update on BeegFS storage - UCT HPC<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Update on BeegFS storage - UCT HPC\" \/>\n<meta property=\"og:description\" content=\"The disk cluster is configured and running. Four Dell R740XD servers\u00a0each with a combination of twelve 10TB NLSAS and two 240GB SSD drives are providing a 345TB scratch volume, an 8TB \/home volume and a 4TB software volume. Each server also has two 120GB SSD drives for the OS. 
The storage and compute nodes (Dell...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\" \/>\n<meta property=\"og:site_name\" content=\"UCT HPC\" \/>\n<meta property=\"article:published_time\" content=\"2018-10-19T14:20:42+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-10-22T11:39:42+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2018\/10\/logical.jpg\" \/>\n<meta name=\"author\" content=\"admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\"},\"author\":{\"name\":\"admin\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/bd981a3acf3b5495041e7884626c1157\"},\"headline\":\"Update on BeegFS 
storage\",\"datePublished\":\"2018-10-19T14:20:42+00:00\",\"dateModified\":\"2018-10-22T11:39:42+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\"},\"wordCount\":496,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"articleSection\":[\"BeeGFS\",\"hpc\",\"storage\"],\"inLanguage\":\"en-ZA\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\",\"name\":\"Update on BeegFS storage - UCT HPC\",\"isPartOf\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\"},\"datePublished\":\"2018-10-19T14:20:42+00:00\",\"dateModified\":\"2018-10-22T11:39:42+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#breadcrumb\"},\"inLanguage\":\"en-ZA\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ucthpc.uct.ac.za\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Update on BeegFS storage\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#website\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"name\":\"UCT HPC\",\"description\":\"University of Cape Town High Performance 
Computing\",\"publisher\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-ZA\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#organization\",\"name\":\"University of Cape Town High Performance Computing\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"contentUrl\":\"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png\",\"width\":450,\"height\":423,\"caption\":\"University of Cape Town High Performance Computing\"},\"image\":{\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/bd981a3acf3b5495041e7884626c1157\",\"name\":\"admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-ZA\",\"@id\":\"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/925fcc8bb19120dc6cb9841c0ace3e14?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/925fcc8bb19120dc6cb9841c0ace3e14?s=96&d=mm&r=g\",\"caption\":\"admin\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Update on BeegFS storage - UCT HPC","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/","og_locale":"en_US","og_type":"article","og_title":"Update on BeegFS storage - UCT HPC","og_description":"The disk cluster is configured and running. 
Four Dell R740XD servers\u00a0each with a combination of twelve 10TB NLSAS and two 240GB SSD drives are providing a 345TB scratch volume, an 8TB \/home volume and a 4TB software volume. Each server also has two 120GB SSD drives for the OS. The storage and compute nodes (Dell...","og_url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/","og_site_name":"UCT HPC","article_published_time":"2018-10-19T14:20:42+00:00","article_modified_time":"2018-10-22T11:39:42+00:00","og_image":[{"url":"http:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2018\/10\/logical.jpg"}],"author":"admin","twitter_card":"summary_large_image","twitter_misc":{"Written by":"admin","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#article","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/"},"author":{"name":"admin","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/bd981a3acf3b5495041e7884626c1157"},"headline":"Update on BeegFS storage","datePublished":"2018-10-19T14:20:42+00:00","dateModified":"2018-10-22T11:39:42+00:00","mainEntityOfPage":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/"},"wordCount":496,"commentCount":0,"publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"articleSection":["BeeGFS","hpc","storage"],"inLanguage":"en-ZA","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/","url":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/","name":"Update on BeegFS storage - UCT 
HPC","isPartOf":{"@id":"https:\/\/ucthpc.uct.ac.za\/#website"},"datePublished":"2018-10-19T14:20:42+00:00","dateModified":"2018-10-22T11:39:42+00:00","breadcrumb":{"@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#breadcrumb"},"inLanguage":"en-ZA","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ucthpc.uct.ac.za\/index.php\/2018\/10\/19\/update-on-beegfs-storage\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ucthpc.uct.ac.za\/"},{"@type":"ListItem","position":2,"name":"Update on BeegFS storage"}]},{"@type":"WebSite","@id":"https:\/\/ucthpc.uct.ac.za\/#website","url":"https:\/\/ucthpc.uct.ac.za\/","name":"UCT HPC","description":"University of Cape Town High Performance Computing","publisher":{"@id":"https:\/\/ucthpc.uct.ac.za\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ucthpc.uct.ac.za\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-ZA"},{"@type":"Organization","@id":"https:\/\/ucthpc.uct.ac.za\/#organization","name":"University of Cape Town High Performance Computing","url":"https:\/\/ucthpc.uct.ac.za\/","logo":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/","url":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","contentUrl":"https:\/\/ucthpc.uct.ac.za\/wp-content\/uploads\/2015\/09\/logocircless.png","width":450,"height":423,"caption":"University of Cape Town High Performance 
Computing"},"image":{"@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/bd981a3acf3b5495041e7884626c1157","name":"admin","image":{"@type":"ImageObject","inLanguage":"en-ZA","@id":"https:\/\/ucthpc.uct.ac.za\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/925fcc8bb19120dc6cb9841c0ace3e14?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/925fcc8bb19120dc6cb9841c0ace3e14?s=96&d=mm&r=g","caption":"admin"}}]}},"_links":{"self":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/3523"}],"collection":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/comments?post=3523"}],"version-history":[{"count":14,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/3523\/revisions"}],"predecessor-version":[{"id":3541,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/posts\/3523\/revisions\/3541"}],"wp:attachment":[{"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/media?parent=3523"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/categories?post=3523"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ucthpc.uct.ac.za\/index.php\/wp-json\/wp\/v2\/tags?post=3523"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}