Solaris RAID Level and Array Type Primer
A simple article highlighting the RAID levels and array types available for the Solaris operating system.
Terminology
- RAID: redundant array of independent disks
- JBOD: just a bunch of disks
- SCSI: small computer system interface
- LUN: logical unit number
Commands
- format
- luxadm
- ssaadm
- lad
- healthck
Understanding Disk names and Paths
Solaris uses two means of naming disks attached to the system.
- The first is the actual (physical) path to the character device file. All disk and tape character device files are found in /devices. For example:
/devices/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020373378a7,0
This path describes where the device can be found on the system:
- sbus@2,0: the board ID in the system
- SUNW,socal@d,10000: the slot that the card sits in (in this case, "d" indicates the onboard slot)
- sf@0,0: indicates the right GBIC
- ssd@w21000020373378a7: indicates the disk. The long number is the World Wide Number of the disk (i.e., its unique identifying number).
- The second is the easy-to-read name found in /dev/dsk and /dev/rdsk. These file names are links to the actual devices found in /devices. For example:
s4u-ds# ls /dev/rdsk/c0t0d0*
/dev/rdsk/c0t0d0s0  /dev/rdsk/c0t0d0s3  /dev/rdsk/c0t0d0s6
/dev/rdsk/c0t0d0s1  /dev/rdsk/c0t0d0s4  /dev/rdsk/c0t0d0s7
/dev/rdsk/c0t0d0s2  /dev/rdsk/c0t0d0s5
s4u-ds# ls -l /dev/rdsk/c0t0d0s2
lrwxrwxrwx 1 root root 78 Jun 3 2000 /dev/rdsk/c0t0d0s2 -> ../../devices/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020373378a7,0:c,raw
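The short names under /dev/dsk and /dev/rdsk follow the pattern cXtYdZsN: controller, target, disk, and slice. That reading is the usual Solaris convention rather than something spelled out above, and `parse_disk_name` below is a hypothetical helper written for illustration, not a Solaris tool. A minimal sketch in Python, assuming every field is a plain decimal number:

```python
import re
from typing import NamedTuple, Optional


class DiskName(NamedTuple):
    controller: int
    target: int
    disk: int
    slice_no: Optional[int]  # None when the name carries no slice suffix


# Assumption: decimal fields only; names without a target field are not handled.
_NAME = re.compile(r"^c(\d+)t(\d+)d(\d+)(?:s(\d+))?$")


def parse_disk_name(name: str) -> DiskName:
    """Split a logical disk name such as /dev/rdsk/c0t0d0s2 into its fields."""
    base = name.rsplit("/", 1)[-1]  # drop any leading directory
    match = _NAME.match(base)
    if match is None:
        raise ValueError(f"not a cXtYdZ[sN] disk name: {name!r}")
    c, t, d, s = match.groups()
    return DiskName(int(c), int(t), int(d), None if s is None else int(s))


if __name__ == "__main__":
    print(parse_disk_name("/dev/rdsk/c0t0d0s2"))
```

Applied to the listing above, every entry shares controller 0, target 0, and disk 0, and differs only in the slice number.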
Understanding Disk slices
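Each disk on a Solaris system can be divided into up to eight sections, numbered 0 through 7. These sections are called partitions or slices. Slice 2 conventionally represents the entire disk, which leaves seven slices for data. To set up the slices, use the format command. For example: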
s4u-ds# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <ST34321A cyl 8892 alt 2 hd 15 sec 63>
          /pci@1f,0/pci@1,1/ide@3/dad@0,0
Specify disk (enter its number): 0

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        show       - translate a disk address
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> partition

PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> print
Current partition table (original):
Total disk cylinders available: 8892 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 8335        3.76GB    (8336/0/0) 7877520
  1       swap    wu    8336 - 8890      256.09MB    (555/0/0)   524475
  2     backup    wm       0 - 8891        4.01GB    (8892/0/0) 8402940
  3 unassigned    wm       0               0         (0/0/0)          0
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wm       0               0         (0/0/0)          0

partition>
Illustration:
This partition table shows that disk c0t0d0 is split into two slices. Slice 0 is used for root and slice 1 is used for swap. Slice 2 is the "map" of the entire disk and should never be changed.
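The Blocks column in the partition table can be checked against the disk geometry that format reports (hd 15, sec 63). The sketch below multiplies cylinders by heads by sectors per track. The 512-byte block size is an assumption (it is not stated in the output), and the function names are made up for this illustration.

```python
HEADS = 15              # "hd 15" from the format disk line
SECTORS_PER_TRACK = 63  # "sec 63" from the format disk line
BYTES_PER_BLOCK = 512   # assumption: conventional 512-byte disk blocks


def blocks_in(cylinders: int) -> int:
    """Number of blocks covered by a run of whole cylinders."""
    return cylinders * HEADS * SECTORS_PER_TRACK


def size_in_gb(blocks: int) -> float:
    """Size in gigabytes, counting 1 GB as 1024**3 bytes."""
    return blocks * BYTES_PER_BLOCK / 1024 ** 3


if __name__ == "__main__":
    # Cylinder counts taken from the (cyl/0/0) column of the partition table.
    for tag, cylinders in (("root", 8336), ("swap", 555), ("backup", 8892)):
        blocks = blocks_in(cylinders)
        print(f"{tag:8s} {cylinders:5d} cyl {blocks:8d} blocks {size_in_gb(blocks):.2f} GB")
```

Under those assumptions the computed block counts agree with the table: 7877520 for root, 524475 for swap, and 8402940 for the whole-disk slice.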
If you need to change the slice sizes, you must back up all data first. When slices are resized, all data on those slices is lost. Slices are changed from the partition menu shown above. For example:
partition> 1
Part      Tag    Flag     Cylinders        Size            Blocks
  1       swap    wu    8336 - 8890      256.09MB    (555/0/0)   524475

Enter partition id tag[swap]:
Enter partition permission flags[wu]:
Enter new starting cyl[8336]:
Enter partition size[524475b, 555c, 256.09mb, 0.25gb]:
RAID overview
Multiple disks or disk slices can be grouped together to make a single virtual disk. Different algorithms, called RAID levels, are used to create virtual disks with varying levels of redundancy and performance.
Hardware RAID
- RAID functionality provided by dedicated hardware and firmware.
- Decreases the CPU load on the host.
- Generally better performance, but more expensive.
Software RAID
- RAID functionality provided by software on the host.
- Considerable host CPU overhead.
- Generally more flexible and less expensive.
RAID levels
- RAID 0: concatenation/stripe
- RAID 1: mirroring
- RAID 0+1: striping plus mirroring
- RAID 1+0: mirroring plus striping
- RAID 3: striping with dedicated parity (rarely used)
- RAID 5: striping with distributed parity
RAID 0: concatenation
Concatenation takes disk resources (pieces) and combines them into one virtual disk. Data is written from the first piece to the last in sequential order. RAID 0 concatenation offers no redundancy: if you lose one disk, your data must be restored from backup.
Purpose:
The primary reason for using concatenation is to create a virtual disk that is larger than a single disk. Concatenation also allows you to grow the virtual disk by adding more pieces to it.
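To make the sequential layout concrete, the sketch below maps a block number on the virtual disk to the piece that holds it. `locate_concat` and the piece sizes are hypothetical, chosen only for illustration. Real volume managers also reserve space for their own metadata, which this sketch ignores.

```python
from typing import Sequence, Tuple


def locate_concat(block: int, piece_sizes: Sequence[int]) -> Tuple[int, int]:
    """Return (piece index, block offset within that piece) for a virtual block.

    Pieces are filled in order: the second piece is used only after the
    first is exhausted, and so on.
    """
    if block < 0:
        raise IndexError("block number must be non-negative")
    offset = block
    for index, size in enumerate(piece_sizes):
        if offset < size:
            return index, offset
        offset -= size  # skip past this piece and keep looking
    raise IndexError("block lies beyond the end of the virtual disk")


if __name__ == "__main__":
    pieces = [100, 50]  # hypothetical piece sizes, in blocks
    print(locate_concat(99, pieces))   # last block of the first piece
    print(locate_concat(100, pieces))  # first block of the second piece
```

Growing the virtual disk corresponds to appending another size to `piece_sizes`: existing blocks keep their locations and the new blocks land on the added piece.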
RAID 0: striping
Striping breaks up a data stream and places it across multiple disks in equal-sized chunks. Each chunk is written to successive drives in the virtual disk. RAID 0 striping provides no redundancy: if you lose one disk, your data must be restored from backup.
Purpose:
The primary purpose of a stripe is to improve IOPS (I/O operations per second) performance. If the data stream is larger than the chunk size, multiple disks can be written to in parallel.
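The round-robin placement of chunks can be sketched as a small address calculation. The function name, the three-disk layout, and the four-block chunk size are all assumptions made for this example, not values from any particular volume manager.

```python
from typing import Tuple


def locate_stripe(block: int, n_disks: int, chunk_blocks: int) -> Tuple[int, int]:
    """Return (disk index, block offset on that disk) for a virtual block."""
    if block < 0 or n_disks < 1 or chunk_blocks < 1:
        raise ValueError("invalid stripe parameters")
    chunk = block // chunk_blocks   # which chunk of the virtual disk
    disk = chunk % n_disks          # chunks rotate across the disks
    row = chunk // n_disks          # how many chunks already sit on that disk
    return disk, row * chunk_blocks + block % chunk_blocks


if __name__ == "__main__":
    # Hypothetical layout: 3 disks, 4 blocks per chunk.
    for block in (0, 4, 8, 12):
        print(block, locate_stripe(block, n_disks=3, chunk_blocks=4))
```

A 12-block write starting at block 0 touches all three disks once, which is why a data stream larger than the chunk size can be serviced in parallel.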
RAID 1: mirroring
Mirroring uses twice the disk resources of RAID 0 alone. Thus, hardware requirements are doubled.
Purpose:
Mirroring provides redundancy by writing data to two or more independent pieces. RAID 1 is typically used in combination with RAID 0: a RAID 1 virtual disk typically contains two RAID 0 pieces, and data is written to both. If one piece loses a disk, the other piece continues to service data I/O.
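The write-to-all, read-from-any behaviour can be modelled in a few lines. The `Mirror` class below is a toy in-memory model with invented names. It leaves out resynchronisation and read balancing, which real mirroring software has to handle.

```python
class Mirror:
    """Toy RAID 1 model: each piece is a list of blocks held in memory."""

    def __init__(self, n_pieces: int, n_blocks: int) -> None:
        self.pieces = [[None] * n_blocks for _ in range(n_pieces)]
        self.failed = set()  # indexes of pieces that have been lost

    def fail(self, piece: int) -> None:
        """Mark one piece as lost."""
        self.failed.add(piece)

    def write(self, block: int, data: bytes) -> None:
        """Write the same data to every piece that is still working."""
        for index, piece in enumerate(self.pieces):
            if index not in self.failed:
                piece[block] = data

    def read(self, block: int) -> bytes:
        """Read from the first piece that is still working."""
        for index, piece in enumerate(self.pieces):
            if index not in self.failed:
                return piece[block]
        raise IOError("all pieces of the mirror have failed")


if __name__ == "__main__":
    mirror = Mirror(n_pieces=2, n_blocks=8)
    mirror.write(3, b"payload")
    mirror.fail(0)         # lose one side of the mirror
    print(mirror.read(3))  # the data is still served by the other side
```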
RAID 0+1: striped and mirrored virtual disk
The most common use of mirroring is in combination with a stripe. The stripe is the highest-performance RAID type, and mirroring it adds high redundancy.
Purpose:
Mirrored stripes can lose all the disk resources on one side of the mirror and continue functioning. This holds true as long as the disk resources on each side of the mirror are independent of each other.
RAID 1+0: striped mirrors
Striped mirrors combine performance, redundancy, and high availability. A striped mirror contains at least four pieces. It is the reverse of RAID 0+1: the first two pieces make up the first RAID 1 mirror, the second two pieces make up the next RAID 1 mirror, and the two mirrors are then striped together. This gives the performance of striping and very high redundancy, because the virtual disk keeps functioning as long as each mirror retains one working piece. As with RAID 0+1, disk resource needs are double those of a plain stripe.
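The difference in redundancy between the two combined levels shows up when counting which disk failures each layout survives. The sketch below assumes the usual definitions: RAID 0+1 mirrors two stripes and survives while one stripe is fully intact, while RAID 1+0 stripes across mirrors and survives while every mirror keeps at least one disk. The four-disk grouping and all function names are hypothetical.

```python
from itertools import combinations
from typing import Callable, Sequence, Set, Tuple

Groups = Sequence[Tuple[int, ...]]


def survives_0_plus_1(failed: Set[int], stripes: Groups) -> bool:
    """Mirror of stripes: at least one stripe must have no failed disk."""
    return any(not (set(stripe) & failed) for stripe in stripes)


def survives_1_plus_0(failed: Set[int], mirrors: Groups) -> bool:
    """Stripe of mirrors: every mirror must keep at least one working disk."""
    return all(set(mirror) - failed for mirror in mirrors)


def count_survivable(check: Callable[[Set[int], Groups], bool],
                     groups: Groups, n_failures: int) -> int:
    """Count the failure combinations of a given size that the layout survives."""
    disks = sorted(disk for group in groups for disk in group)
    return sum(1 for combo in combinations(disks, n_failures)
               if check(set(combo), groups))


if __name__ == "__main__":
    groups = [(0, 1), (2, 3)]  # hypothetical four-disk layout, two groups of two
    print("0+1:", count_survivable(survives_0_plus_1, groups, 2), "of 6 double failures")
    print("1+0:", count_survivable(survives_1_plus_0, groups, 2), "of 6 double failures")
```

With four disks, both layouts survive any single failure. Of the six possible double failures, the 0+1 layout survives two and the 1+0 layout survives four.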
RAID 3: striping with dedicated parity
Purpose:
Provides a stripe with a means of redundancy that is cheaper than a mirror. A disk dedicated to parity is added. The parity disk is used to rebuild data if a disk in the stripe is lost.
- Random I/O performance is poor: every write must also update the single parity disk, which becomes a bottleneck.
- This RAID level is rarely used.
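Parity in this context is conventionally the bytewise XOR of the data blocks, which is what makes a rebuild possible: XOR-ing the surviving blocks with the parity block yields the missing one. A minimal sketch with made-up two-byte blocks follows. `xor_blocks` is a name chosen for this example.

```python
from functools import reduce
from operator import xor
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Hypothetical stripe of three data disks plus one parity disk.
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_blocks(data)

    # Lose the middle data disk, then rebuild it from the survivors and parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

The same calculation rebuilds whichever single disk is lost, but it cannot recover from two simultaneous failures.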
RAID 5: striping with distributed parity
Purpose:
Provide redundancy without the resource overhead of mirroring.
- Better I/O performance than RAID 3.
- Data and parity are striped across all drives.
- Overall random I/O performance depends on the percentage of writes.
- A small write requires four disk operations: read old data, read old parity, write new data, and write new parity. The new parity is calculated in between from the old data, the old parity, and the new data.
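The write penalty can be illustrated with the standard read-modify-write update, in which the new parity is the old parity XOR the old data XOR the new data. The sketch below is a simplified model: the function name, block contents, and operation log are invented, and the rotation of parity across drives is left out.

```python
from typing import List, Sequence, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(data_blocks: Sequence[bytes], parity: bytes,
                      index: int, new_data: bytes
                      ) -> Tuple[List[bytes], bytes, List[str]]:
    """Update one data block in a stripe and log the disk operations used."""
    ops = []
    old_data = data_blocks[index]
    ops.append("read old data")
    old_parity = parity
    ops.append("read old parity")
    # The parity calculation happens in memory; it is not a disk operation.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    updated = list(data_blocks)
    updated[index] = new_data
    ops.append("write new data")
    ops.append("write new parity")
    return updated, new_parity, ops


if __name__ == "__main__":
    stripe = [b"\x01", b"\x02", b"\x04"]  # hypothetical one-byte data blocks
    parity = b"\x07"                      # XOR of the three blocks above
    stripe, parity, ops = raid5_small_write(stripe, parity, 1, b"\x08")
    print(ops)
```

Only the target data disk and the parity disk are touched, so the other drives remain free for other requests. The cost is two reads and two writes for every small write.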
Array types
JBOD – Just a bunch of disks
- All disks in the array are available to the operating system.
- Requires disk management software in order to construct virtual disks.
Examples:
- MultiPack
- SPARCstorage Array
- D1000
- A5x00
Hardware RAID
- All disks are hidden from the host.
- Virtual disks are created by the RAID controller.
- Virtual disks are presented to the host.
- Software on the host interacts with the controller to manage the array.
Examples:
- A1000
- A3x00