Snark disk array
- 1 Overview
- 2 Specifications
- 3 Plans for new disk configuration
- 4 What we'll use it for
Overview
We bought a big ole 15-slot disk array on the cheap from adverts.ie in early 2012. Information about it should be maintained here.
Specifications
- Dell Powervault MD1000 disk array
- 15 SAS slots
- SAS in and out for daisy-chaining
- Currently configured with 4 × 300GB + 1 × 73GB SAS disks
Plans for new disk configuration
Building anything on the array using SAS disks would be expensive compared to SATA. Instead, we could dump the existing disks (and possibly sell them) and seek sponsorship for a new set of higher-capacity SATA disks.
If we stick with SAS we can obtain 7200-rpm SAS storage-orientated disks. These are somewhat more expensive than SATA disks, but shouldn't have a problem running 24/7/365. We've been lucky with Spoon's disk failure rate so far.
Following the College sysadmin meeting in early February, ISS mentioned the possibility of sourcing disks for us on the cheap.
Options for RAID configuration
RAID 1+0 is a stripe of mirrors: each pair of disks is combined into a RAID1 (one disk mirroring the other), and all of these RAID1s are striped together. In this configuration, we would use up to 14 slots in Seth, with the last one lying empty, or set up as a hot spare (if any of the active disks fails, Seth will automagically rebuild onto this spare disk).
With RAID1+0, we can lose half of the disks in Seth without losing any data, as long as no more than one disk from each RAID1 pair is lost. If both disks from any one RAID1 pair fail at once, we lose the entire array. In terms of storage efficiency, RAID1+0 isn't great, as 50% of the capacity of the array is used to mirror the other 50%. So, if we got 14 1TB disks, we would be left with 7TB of usable storage.
RAID 5+0 is similar to RAID1+0, except that instead of each stripe being a RAID1 mirror, it is a RAID5 stripe. For Seth, every set of three disks would be combined into a RAID5 device, and all these RAID5s would be striped together. In this configuration, we would use all 15 slots in Seth.
With a 15-disk RAID5+0, we can lose one third of the disks in Seth without losing data, as long as no more than one disk from each RAID5 stripe is lost. If more than one disk from any one component RAID5 stripe fails at once, we lose the entire array. Also, RAID5+0 takes longer to recover from a failed disk, since RAID5s take longer to rebuild than RAID1s (recalculating parity information takes longer than just duplicating a disk). This is a disadvantage of RAID5+0, as it means that data is vulnerable for longer in the event of a disk failure. However, RAID5+0 provides much better storage efficiency than 1+0: each 3-disk RAID5 loses only one disk's worth of capacity to parity, so 15 1TB disks assembled as a RAID5+0 would give us 10TB of usable space.
RAID 6+0 is basically RAID5+0, except that each component stripe is a RAID6 (a RAID5, but with two disks' worth of parity instead of one). The additional parity means each RAID6 can survive two simultaneous disk failures rather than one. This comes at the expense of write time (two sets of parity have to be calculated and updated on every write) and of capacity: a minimal 4-disk RAID6 gives only two disks' worth of usable storage. Also, since RAID6 needs at least four disks and 15 isn't a multiple of 4, we would be able to use at most 12 slots at once, leaving us with just six disks' worth of usable space. Basically, RAID6+0 is probably a bad idea.
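As a sanity check on the figures above, the usable capacity of each striped layout follows from the group size and the per-group redundancy. A quick sketch (the 2/3/4-disk group sizes are the configurations discussed above):

```python
def usable_disks(total_slots, group_size, parity_per_group):
    """Usable data disks for a stripe of equal-sized RAID groups.

    Only complete groups can be used; each group gives up
    `parity_per_group` disks' worth of capacity to redundancy.
    """
    groups = total_slots // group_size
    return groups * (group_size - parity_per_group)

# RAID1+0: 2-disk mirrors, one disk of redundancy per pair (14 slots used)
print(usable_disks(14, 2, 1))   # 7  -> 7TB with 1TB disks

# RAID5+0: 3-disk RAID5 groups, one disk of parity each (all 15 slots)
print(usable_disks(15, 3, 1))   # 10 -> 10TB with 1TB disks

# RAID6+0: 4-disk RAID6 groups, two disks of parity each (only 12 slots fit)
print(usable_disks(15, 4, 2))   # 6  -> 6TB with 1TB disks
```

Swapping in 2TB disks simply doubles each figure.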
Options for disk sizes
The cheapest option for kitting out the array would (obviously) be to buy 15 low-capacity (say, 500GB) disks. However, we need to bear in mind that once we've set up storage on it, expanding it in the future will be a significant pain in the ass. So, we should maximize the storage density early on.
To build a RAID5+0, we need at least six disks, each with the same capacity. We could start off with 6 1TB disks in RAID5+0, and get 4TB of usable storage, expandable as needed with additional sets of 3 1TB disks. However, for some reason the difference in price between 1TB and 2TB disks is quite low compared to that between 500GB and 1TB disks. So, we should probably splash out for the 2TB disks instead. 6 2TB disks will give us 8TB of usable storage, more than Spoon and Cube put together. Starting out with 2TB disks will let us eventually expand to 20TB of storage once all 15 slots are filled, if necessary.
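The growth path works out as follows (a sketch of the arithmetic, assuming 2TB disks and the 3-disk RAID5 groups described above):

```python
DISK_TB = 2  # assumed disk size

def raid50_usable_tb(groups, disk_tb=DISK_TB):
    """Usable TB for `groups` 3-disk RAID5 groups striped together.

    Each 3-disk RAID5 loses one disk's worth of capacity to parity,
    leaving two data disks per group.
    """
    return groups * 2 * disk_tb

# From the initial 2 groups (6 disks) up to 5 groups (all 15 slots):
for groups in range(2, 6):
    print(f"{groups * 3} disks -> {raid50_usable_tb(groups)}TB usable")
# 6 disks -> 8TB, 9 -> 12TB, 12 -> 16TB, 15 disks -> 20TB
```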
We'd all love to start out with 3TB disks, but they're currently far too expensive (I've seen them on sale for nearly 400 euro each). Also, the maximum capacity of any one disk in the array is 2.2TB.
Options for filesystems
Colin Fowler from CS administration advised using XFS as a filesystem for backups, based on benchmarks they carried out when setting up their new seventy-something-terabyte backup unit. His observations:
- Current ext implementations/tools don't scale to large quantities of storage (tools such as fsck, and possibly even the partitioning tools, don't handle volumes beyond a low number of terabytes). Maybe not a problem right now, but it would fuck over future admins when 16TB hard drives become affordable and computers with only 100,000 vacuum tubes become plausible.
- JFS and XFS were the only options left for the CS dept. Both of these filesystems are well supported and mature (JFS is originally from IBM AIX, XFS originally from SGI IRIX).
- JFS was slower in the benchmarks; XFS performed better. XFS also has a reputation for dealing well with large files, which will probably be an important consideration for us.
- Scott, using BTRFS is a terrible idea.
- We could use an OS that supports ZFS. This would limit us to either OpenIndiana (may disappear off the face of the earth or get sued by Oracle) or FreeBSD (ZFS support immature, lags behind Solaris). It would probably also be a bit of a pain to admin a different OS.
What we'll use it for
There are currently a few proposals floating around as to what to do with Seth once we have it kitted out. These are listed below.
Plan 1 - centralized /home
Take /home on Spoon and Cube (and /srv/webspace on Cube), and move them to Seth, connected to Cube. Then, serve /home to all Netsoc machines: over NFS between Spoon and Cube using a gigabit crossover, and over some other, more secure means to the netsoc room.
Pros:
- Users keep their files in one place
- With large disks, we can offer a much bigger quota than we do now
- Makes backups simpler, as we don't have to keep track of files on multiple machines
- Frees up an awful lot of space on the disks already inside Cube and Spoon, which could then be used for other things (Netsoc debian/ubuntu/ports/whatever mirror, for example)
Cons:
- File access might be slow on machines not physically connected to the array
- More disk space for users means more files that we still don't have backups of
- If we want to keep multiple backups of all this data, as well as / on all our machines, we'll still need an even bigger, second disk array to hold it all
- If the machine Seth is connected to goes down, users lose access to their files on all other machines until it is brought back up.
Plan 2 - split array
The array can be split into two, with 8 disks connected to one machine's RAID controller, and 7 disks connected to another. Attach the 7-disk side to Cube, as /home for Cube and possibly other machines. Attach the other side to some mythical backup server, and use it to back up data from all machines.
Pros:
- Backups can be taken of the data on the 7-disk side without the need for a whole new array
Cons:
- We need to buy an additional SAS card for the second RAID controller.
- The backup server has to be in the vicinity of Spoon and Cube, meaning we lose everything if maths burns down
Plan 3 - backups only
Acquire a reasonably-priced server with a SAS card, connect the array to it, and host it somewhere in College other than the maths department. Keep /home on all machines the way it is now, and back up everything on all machines to the array.
Cons:
- No big quota bump