Tuesday 17 February 2009

RAID explained (in several ways)

http://www.ahinc.com/raid.htm

RAID 0

What is it?

RAID 0 uses a method of writing to the disks called striping. Let's assume you have a server with three drives of 500 MB, 1 GB and 2 GB. Normally a server would treat each of these drives individually. By incorporating striping, the system would see all of the drives as only one drive for a total of 3.5 GB. Big deal, you say. Wait, there's more.

When the system writes data to the disk, the RAID 0 striping kicks in and automatically distributes the data across all three drives. Part of a file (chunks of data) will be written to the first drive, the next part to the second drive, the next part to the third drive and then it starts all over again until the entire contents of the file have been written.
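The round-robin distribution described above can be sketched in a few lines of Python. This is only an illustration and not how a real controller is implemented: the function names `stripe` and `unstripe` and the tiny 4-byte chunk size are made up for the example.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int) -> list[list[bytes]]:
    """Deal fixed-size chunks of data out to the drives in round-robin order."""
    drives = [[] for _ in range(num_drives)]
    for i in range(0, len(data), chunk_size):
        chunk_index = i // chunk_size
        drives[chunk_index % num_drives].append(data[i:i + chunk_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Read the chunks back in the same round-robin order."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


# Hypothetical 26-byte "file" spread over three drives, 4 bytes at a time.
drives = stripe(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ", num_drives=3, chunk_size=4)
for n, chunks in enumerate(drives, start=1):
    print(f"drive {n}: {chunks}")
```

Reading the file back means visiting every drive, which is also why losing any one drive makes the whole file unreadable.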

Speed

What this does is increase the speed of the reading/writing process. If you have two drives on your server, it increases the speed by about 25%. If you have three drives, it increases the speed about 33%. When you consider that the main task a server is performing is reading and writing data, any increase in speed is highly welcome.

Disk Usage

Besides increasing speed, the other benefit is that the drives can be of different sizes.

Because RAID 0 only writes the data once, it does not achieve data redundancy. If one of the drives fails, the entire system has to be restored because all files are split or striped across all drives. Because there is no data redundancy, there is no loss of disk space.

Requirements

Adding a RAID controller and more drives.

RAID 1

What is it?

RAID 1 uses a technology called mirroring or disk shadowing. RAID 1 requires a minimum of two drives that are exactly the same size. Every time a write is executed the same data is written to both drives, i.e. a mirror image. Well, almost a mirror image. The data is not reversed in the same way as when you look in the mirror.

So what you achieve with RAID 1 is data redundancy. If one of the drives fails, the system can continue to run by just writing to one drive. If you have hot swappable drives, you could pull out the bad drive, plug in a new one and the system is back to its normal state. How efficient and easy it is to execute all of this depends on the RAID controller and/or software that is being used.
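As a rough sketch of that behavior, here is a toy model of a two-drive mirror in Python. The class `MirroredPair` and its methods are hypothetical names for this illustration; a real controller works on disk blocks and handles many more failure cases.

```python
class MirroredPair:
    """Toy model of a two-drive mirror; each drive is a dict of block -> data."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each drive that is still working.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any working drive can serve the read.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block]
        raise IOError("both drives have failed")

    def fail(self, i: int) -> None:
        # Simulate a dead drive: its contents are gone.
        self.failed[i] = True
        self.drives[i] = {}

    def replace(self, i: int) -> None:
        # A swapped-in drive is rebuilt by copying from the survivor.
        if self.failed[1 - i]:
            raise IOError("no working drive to rebuild from")
        self.drives[i] = dict(self.drives[1 - i])
        self.failed[i] = False


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)                      # first drive dies
pair.write(1, b"invoices")        # system keeps running on one drive
print(pair.read(0))
pair.replace(0)                   # swap in a new drive and rebuild
print(pair.drives[0] == pair.drives[1])
```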

Speed

There is basically no increase or decrease in the time it takes to write or read data.

Disk Usage

The disadvantage of RAID 1 is that you lose half of your disk capacity. If you have two 4 GB drives, you don't have a total of 8 GB of space, but only 4 GB. So you are losing half of the disk space that you paid for. But on the other hand disk drives are fairly inexpensive today. What has to be considered is the cost of downtime if a drive fails on your server. The downtime cost is probably much more than the cost of the additional drive.

Requirements

RAID 1 can be accomplished by simply adding another drive, though you may also need a new controller that supports RAID. It is possible to use a RAID software controller, but we don't recommend it. Windows servers and some versions of the Linux/UNIX/AIX operating systems provide the mirroring software. Configuration and installation are fairly simple. So, for a few hundred dollars you can quickly have RAID 1 up and working.

RAID 5

What is it?

RAID 5 combines the striping of RAID 0 with the redundancy goal of RAID 1, but it gets that redundancy from parity information spread across the drives rather than from a full mirror. There are other benefits of RAID 5 but let's leave that discussion to the techies. You'll just have to take my word for it. RAID 5 requires a minimum of three drives and it is recommended that all drives on the system be of the same size. The more drives you have on the server, the better RAID 5 will perform. We usually recommend having five drives where one is used as a spare. This will allow for up to two drives to fail, one after the other, and the system can keep running: the array survives the first failure, rebuilds onto the spare, and can then survive a second.

RAID 5 is the version most often recommended. Because the price of disk drives has dropped drastically, the cost of implementing RAID 5 is now within most companies' budgets.

Speed

There is a decrease in write speed due to calculations that have to be made before data is written to the drives. If you want to increase read/write speed and have the benefits of RAID 5, you need to implement RAID 10.

Disk Usage

The loss of disk space is basically 100 divided by the number of disk drives. With 3 drives, there is a 33% loss of disk space. With 5 drives, there is a 20% loss of disk space.
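The rule of thumb above is easy to check with a short calculation. This is a minimal sketch that assumes all drives are the same size and ignores any hot spare; the function name `raid5_space` and the 4 GB drive size are just for illustration.

```python
def raid5_space(num_drives: int, drive_size_gb: float) -> tuple[float, float]:
    """Return (usable GB, percent lost to parity) for equal-sized drives."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    usable = (num_drives - 1) * drive_size_gb   # one drive's worth holds parity
    percent_lost = 100 / num_drives
    return usable, percent_lost


for n in (3, 5):
    usable, lost = raid5_space(n, drive_size_gb=4)
    print(f"{n} drives: {usable:g} GB usable, {lost:.0f}% lost")
```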

Requirements

RAID 5 is more expensive to implement. You will need additional drives and a RAID controller. The cost of implementing RAID 5 could be in the range of $1000 to $5000 depending on the total number of drives, type of drives and controller required.

Is it worth it?

If you ...

value your data
can't afford any downtime
consider the cost of recovery when you have a drive failure
Yes it is!

Until they have a drive failure, most companies don't view this technology as being valuable. Don't let yourself fall into that trap. Give the benefits of RAID some thought. To help prove the benefits, you might want to shut down your server for an hour and discover how unproductive your organization can become. Of course this is a very short time compared to an actual drive replacement and data recovery that could take two or more days.


#-##-##-##-##-##-##-##-##-##-##-##-##-##-##-##-#-

http://cuddletech.com/veritas/raidtheory/x31.html


RAID: The Details

2.1. RAID Type: Concatenation

Concatenations are also known as "Simple" RAIDs. A Concatenation is a collection of disks that are "welded" together. Data in a concatenation is laid across the disks in a linear fashion from one disk to the next. So if we've got 3 9G (gig) disks that are made into a Simple RAID, we'll end up with a single 27G virtual disk (volume). When you write data to the disk you'll write to the first disk, and you'll keep writing your data to the first disk until it's full, then you'll start writing to the second disk, and so on. All this is done by the Volume Manager, which is "keeper of the RAID". Concatenation is the cornerstone of RAID.

Now, do you see the problem with this type of RAID? Because we're writing data linearly across the disks, if we only have 7G of data on our RAID we're only using the first disk! The 2 other disks are just sitting there bored and useless. This sucks. We got the big disk we wanted, but it's not any better than a normal disk drive you can buy off the shelves in terms of performance. There has got to be a better way..........
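To make the linear layout concrete, here is a minimal Python sketch of how a position on the big virtual disk maps to a real disk in a concatenation. The function `locate_concat` is a made-up name for this example, not anything a Volume Manager actually exposes.

```python
def locate_concat(offset: int, disk_sizes: list[int]) -> tuple[int, int]:
    """Map a logical offset on the volume to (disk index, offset on that disk)."""
    if offset < 0:
        raise ValueError("offset must not be negative")
    for disk, size in enumerate(disk_sizes):
        if offset < size:
            return disk, offset
        offset -= size          # skip past this disk and try the next one
    raise ValueError("offset is past the end of the volume")


G = 1024 ** 3
disks = [9 * G, 9 * G, 9 * G]   # three 9G disks -> one 27G volume

print(locate_concat(7 * G - 1, disks))   # last byte of 7G of data
print(locate_concat(10 * G, disks))      # only past 9G do we reach disk two
```

With 7G of data every byte lands on disk index 0, which is exactly the "bored disks" problem.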

2.2. RAID Type: Striping (RAID-0)

Striping is similar to Concatenation because it will turn a bunch of little disks into a big single virtual disk (volume), but the difference here is that when we write data we write it across ALL the disks. So, when we need to read or write data we're moving really really fast, in fact faster than any one disk could move. There are 2 things to know about RAID-0: stripe width and columns. They sound scary, but they're totally sweet, let me show you. So, if we're going to read and write across multiple disks in our RAID we need an organized way to go about it. First, we'll have to agree on how much data should be written to a disk before moving to the next; we call that our "stripe width". Then we'll need a far kooler term for each disk, a term that allows us to visualize our new RAID better..... "column" sounds kool! Alright, so each disk is a "column" and the amount of data we put on each "column" before moving to the next is our "stripe width".

Let's solidify this. If we're building a RAID-0 with 4 columns, and a stripe width of 128k, what do I have? It might look something like this:

Look good? So, when we start writing to our new RAID, we'll write the first 128k to the first column, then the next 128k to the second column, then the next 128k to the third column, then the next 128k to the fourth column, THEN the next 128k to the first column, and keep going till all the data is written. See? If we were writing a 1M file we'd wrap that one file around all 4 disks exactly 2 times (1M is eight 128k pieces, four per pass)! Can you see now where our speed up comes from? SCSI drives can write data at about (depending on what type of drive and what type of SCSI) 20M/s. On our Striped RAID we'd be writing at up to 80M/s! Kool huh!?
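Here is a small Python sketch of that column arithmetic, using the 4 columns and 128k stripe width from the example. The function `locate_stripe` is a hypothetical name for this illustration, and the mapping is the plain round-robin one described above; it also counts how many passes a 1M file makes over the columns.

```python
KB = 1024


def locate_stripe(offset: int, columns: int, stripe_width: int) -> tuple[int, int]:
    """Map a logical offset to (column, offset within that column)."""
    unit = offset // stripe_width          # which stripe unit overall
    column = unit % columns                # round-robin across columns
    row = unit // columns                  # how many full passes so far
    return column, row * stripe_width + offset % stripe_width


# The example from the text: 4 columns, 128k stripe width, a 1M file.
columns, width, file_size = 4, 128 * KB, 1024 * KB
for start in range(0, file_size, width):
    column, col_offset = locate_stripe(start, columns, width)
    print(f"file offset {start // KB:4d}k -> column {column + 1}, "
          f"offset {col_offset // KB}k")

passes = file_size / (width * columns)
print(f"passes over all columns: {passes:g}")
```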

But, now we've got ANOTHER problem. In a Simple RAID if we had, say, 3 9G disks, we'd have 27G of space. Now, if I only wrote 9G of data to that RAID and the third disk died, so what, there is no data on it. (See where I'm going with this?) We'd only be using one of our three disks in a Simple RAID. BUT, in a Striped RAID, we could write only 10M of data to the RAID, but if even ONE disk failed, the whole thing would be trash because we wrote it on ALL of the disks. So, how do we solve this one?

2.3. RAID Type: Mirroring (RAID-1)

Mirroring isn't actually a "RAID" like the other forms, but it's a critical component to RAID, so it was honored by being given its own number. The concept is to create a separate RAID (Simple or RAID0) that is used to duplicate an existing RAID. So, it's literally a mirror image of your RAID. This is done so that if a disk crashes in your RAID the mirror will take over. If one RAID crashes, then the other RAID takes its place. Simple, right?

There's not much to it. However, there is a new problem! This is expensive... really expensive. Let's say you wanted a 27G RAID. So you bought 3 9G drives. In order to mirror it you'll need to buy 3 more 9G drives. If you ever get depressed you'll start thinking: "You know, I just shelled out $400 for 3 more drives, and I don't even get more usable space!". Well, in this industry we all get depressed a lot so, they thought of another kool idea for a RAID......

2.4. RAID Type: Striping plus Mirroring (RAID-0+1)

When we talk about mirroring (RAID-1) we're not explicitly specifying whether we're mirroring a Simple RAID or a Striped (RAID-0) RAID. RAID-0+1 is a term used to explicitly say that we're mirroring a Striped RAID. The only thing you need to know about it is this...

A mirror is nothing more than another RAID identical to the RAID we're trying to protect. So when we build a mirror we'll need the mirror to be the same type of RAID as the original RAID. If the RAID we want to mirror is a Simple RAID, our mirror then will be a Simple RAID. If we want to mirror a Striped RAID, then we'll want another Striped RAID to mirror the first. Right? So, if you say to me, we're building a RAID-0+1, I know that we're going to mirror a Striped RAID, and the mirror itself is going to be striped as well.

You'll see this term used more often than "RAID-1" simply because a mirror, in and of itself, isn't useful. Again, it's not really a "RAID" in the sense that we mean to use the word.

2.5. RAID Type: RAID-5 (Striping with Parity)

RAID-5 is the ideal solution for maximizing disk space and disk redundancy. It's like Striping (RAID-0) in the fact that we have columns and stripe widths, but when we write data two interesting things happen: the data is written to multiple disks at the same time, and parity is written with the data.

Okay, let's break it down a bit. Let's say we build a RAID-5 out of 4 9G drives. So we'll have 4 columns, and let's say our stripe width is 128k again. The first three 128k pieces of data are written to disks one, two and three. At the same time a little magic number is calculated from those three pieces and written to disk four. That magic number is called the parity. Then, for the next row, (watch carefully) the parity moves to a different disk, and the next three 128k pieces of data go on the other three. On the row after that the parity moves again. (See, it rotates around; the exact order depends on the implementation.) And data keeps being written like that.

Here's the beauty of it. Every row in the RAID carries enough information to rebuild any one missing piece! Let's look back at our 4 disk RAID. We're working normally, writing along, and then SNAP! Disk 3 fails! Are we worried? Not particularly. For each row, the RAID is smart enough to read the pieces on the 3 surviving disks and work out what was on disk 3. Then, once we replace the bad disk with a new one, the RAID "floods" all the data back onto the disk by doing that same calculation for every row. But, you ask, how does the RAID get the missing data back? Because of our parity. The parity is just the XOR of the data pieces in its row, so XORing the surviving pieces together (the computer does this automatically) gives back exactly the piece that was lost. Kool huh?

Now, as you might expect, this isn't perfect either. Why? Okay, number 1, remember that parity that saves our butt? Well, as you might expect the system's CPU (or the RAID controller) has to calculate that, which isn't hard but we're still spending cycles on the RAID, which means if the system is really loaded we may need to (eek!) wait. This is the "performance hit" you'll hear people talk about. Also, even a small write means reading the old data and old parity and then writing the new data and new parity, which means we're using up I/O bandwidth and not getting a real boost out of it.
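The XOR trick is easier to believe once you see it run. Below is a minimal Python sketch, assuming the usual RAID-5 arrangement of one parity piece per row computed as the XOR of that row's data pieces. The helper `xor_chunks` and the 4-byte pieces are made up for the example; real stripe units are far larger.

```python
from functools import reduce


def xor_chunks(chunks: list[bytes]) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# One row of a 4-disk RAID-5: three data pieces plus their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_chunks(data)
row = data + [parity]            # what sits on disks 1..4 for this row

# Disk 3 fails: rebuild its piece from the three surviving disks.
failed = 2                       # zero-based index of disk 3
survivors = [chunk for i, chunk in enumerate(row) if i != failed]
rebuilt = xor_chunks(survivors)

print(rebuilt == row[failed])
```

The same calculation works no matter which single disk is lost, including the one holding the parity, but it cannot recover from two missing pieces in the same row.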

2.6. RAID Comparison: RAID0+1 vs RAID5

There are battles fought in the storage arena, much like the old UNIX vs NT battles. We tend to fight over RAID0+1 vs RAID5. The fact is that RAID5 is advantageous because we use fewer disks in the endeavor to provide large amounts of disk space, while still having protection. All that means is that RAID5 is inexpensive compared to RAID0+1: with RAID0+1 we'll need double the amount of disk we expect to use, while with RAID5 (in our 4 disk example) we'll only need a third more disks rather than twice as many. But, then RAID5 is also slower than RAID0+1 because of that damned parity. If you really want speed, you'll need to bite the bullet and use RAID0+1 because even though you need more disks, you don't need to calculate anything, you just dump the data to the disks. In my estimates (this isn't scientific, just what I've noticed by experience) RAID0+1 is about 20%-30% faster than RAID5.

Now, in the real world, you rarely have much choice, and the way to go is clear. If you're given 10 9G disks and are told to create a 60G RAID, and you can't buy more disks, you'll need to either go RAID5, or be unprotected. However, if you've got those same disks and they only want a 36G RAID you can go RAID0+1, with the only drawback that they won't have much room to grow. It's all up to you as an admin, but always take growth into account. Look at what you've got, downtime availability to grow when needed, budget, performance needs, etc, etc, etc. Welcome to the world of capacity planning!
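The 10 disk example can be worked through with a few lines of Python. This is a back-of-the-envelope sketch only: it assumes one RAID5 group across all the disks with a single disk's worth of parity, an even split for RAID0+1, and no hot spares. The function names are hypothetical.

```python
def usable_raid5(disks: int, size_g: int) -> int:
    """One disk's worth of space goes to parity."""
    return (disks - 1) * size_g


def usable_raid01(disks: int, size_g: int) -> int:
    """Half the disks hold the stripe, the other half mirror it."""
    return (disks // 2) * size_g


def options(disks: int, size_g: int, needed_g: int) -> list[str]:
    """List the protected layouts that are big enough for the request."""
    layouts = {
        "RAID5": usable_raid5(disks, size_g),
        "RAID0+1": usable_raid01(disks, size_g),
    }
    return [name for name, usable in layouts.items() if usable >= needed_g]


print(options(10, 9, needed_g=60))   # 10 9G disks, 60G requested
print(options(10, 9, needed_g=36))   # same disks, 36G requested
```

With these assumptions RAID0+1 tops out at 45G on those disks, which is why the 60G request leaves only RAID5 and why the 36G request fits but leaves little room to grow.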
