Solaris RAID Level and Array Type Primer

A simple article highlighting the RAID levels and array types available for the Solaris operating system.

Terminology

  • RAID — redundant array of independent disks
  • JBOD — just a bunch of disks
  • SCSI — small computer system interface
  • LUN — logical unit number

Commands

  • format
  • luxadm
  • ssaadm
  • lad
  • healthck

Understanding Disk names and Paths

Solaris uses two means of naming disks attached to the system.

  • The first is the actual (physical) path to the character device file. All disk and tape character device files are found in /devices. For example:
    /devices/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020373378a7,0
    This path describes where the device can be found on the system:
    • sbus@2,0 — the board id in the system
    • SUNW,socal@d,10000 — the slot that the card sits in (in this case a "d" indicates the onboard slot)
    • sf@0,0 — indicates the right GBIC
    • ssd@w21000020373378a7 — indicates the disk. The long number is the World Wide Name (WWN) of the disk, i.e. its unique identifying number.
  • The second is the easy-to-read name found in /dev/dsk and /dev/rdsk. These file names are symbolic links to the actual device files found in /devices. For example:
    s4u-ds# ls /dev/rdsk/c0t0d0*
    /dev/rdsk/c0t0d0s0  /dev/rdsk/c0t0d0s3  /dev/rdsk/c0t0d0s6
    /dev/rdsk/c0t0d0s1  /dev/rdsk/c0t0d0s4  /dev/rdsk/c0t0d0s7
    /dev/rdsk/c0t0d0s2  /dev/rdsk/c0t0d0s5
    
    s4u-ds# ls -l /dev/rdsk/c0t0d0s2
    lrwxrwxrwx   1 root     root          78 Jun  3  2000 /dev/rdsk/c0t0d0s2 -> ../../devices/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020373378a7,0:c,raw
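
The logical names follow the pattern cXtYdZsN: controller, target, disk (LUN) and slice. As a minimal sketch, the Python helper below splits such a name into its fields. The function name and the returned dictionary layout are illustrative only, and the pattern assumes purely numeric fields, so WWN-style targets are not handled.

```python
import re

# Pattern for logical names such as c0t0d0s2. The target field is optional
# because some controllers omit it (for example c0d0s0), and the slice is
# optional so that a bare disk name such as c0t0d0 also parses.
DISK_NAME = re.compile(r"^c(\d+)(?:t(\d+))?d(\d+)(?:s(\d+))?$")


def parse_disk_name(name):
    """Split a /dev/dsk style name into controller, target, disk and slice."""
    match = DISK_NAME.match(name)
    if match is None:
        raise ValueError(f"not a cXtYdZsN name: {name!r}")
    controller, target, disk, slice_no = match.groups()
    return {
        "controller": int(controller),
        "target": None if target is None else int(target),
        "disk": int(disk),
        "slice": None if slice_no is None else int(slice_no),
    }


print(parse_disk_name("c0t0d0s2"))
```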

Understanding Disk slices

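Each disk on a Solaris system can be divided into up to 8 sections, numbered 0 through 7. By convention section 2 represents the whole disk, which leaves 7 for data. These sections are called partitions or slices. To set up the slices you use the format command. For example: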

s4u-ds# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
      0. c0t0d0 <ST34321A cyl 8892 alt 2 hd 15 sec 63>
         /pci@1f,0/pci@1,1/ide@3/dad@0,0
Specify disk (enter its number): 0

FORMAT MENU:
       disk       - select a disk
       type       - select (define) a disk type
       partition  - select (define) a partition table
       current    - describe the current disk
       format     - format and analyze the disk
       repair     - repair a defective sector
       show       - translate a disk address
       label      - write label to the disk
       analyze    - surface analysis
       defect     - defect list management
       backup     - search for backup labels
       verify     - read and display labels
       save       - save new disk/partition definitions
       volname    - set 8-character volume name
       !<cmd>     - execute <cmd>, then return
       quit
format> partition

PARTITION MENU:
       0      - change `0' partition
       1      - change `1' partition
       2      - change `2' partition
       3      - change `3' partition
       4      - change `4' partition
       5      - change `5' partition
       6      - change `6' partition
       7      - change `7' partition
       select - select a predefined table
       modify - modify a predefined partition table
       name   - name the current table
       print  - display the current table
       label  - write partition map and label to the disk
       !<cmd> - execute <cmd>, then return
       quit
partition> print

Current partition table (original):
Total disk cylinders available: 8892 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
 0       root    wm       0 - 8335        3.76GB    (8336/0/0) 7877520
 1       swap    wu    8336 - 8890      256.09MB    (555/0/0)   524475
 2     backup    wm       0 - 8891        4.01GB    (8892/0/0) 8402940
 3 unassigned    wm       0               0         (0/0/0)          0
 4 unassigned    wm       0               0         (0/0/0)          0
 5 unassigned    wm       0               0         (0/0/0)          0
 6 unassigned    wm       0               0         (0/0/0)          0
 7 unassigned    wm       0               0         (0/0/0)          0

partition>

Illustration:

This partition table shows that disk c0t0d0 is split into 2 data sections (slices). Slice 0 is used for root and slice 1 is used for swap. Slice 2 is the "map" of the entire disk and should never be changed.
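
The Blocks and Size columns can be checked by hand from the disk geometry that format reports (hd 15, sec 63). The Python sketch below redoes that arithmetic. It assumes 512-byte blocks, and the function names are illustrative only.

```python
# Geometry taken from the format listing above: 15 heads, 63 sectors per track.
HEADS = 15
SECTORS_PER_TRACK = 63
BYTES_PER_BLOCK = 512  # assumption: standard 512-byte disk blocks


def slice_blocks(cylinders):
    """Number of blocks in a slice that spans the given number of cylinders."""
    return cylinders * HEADS * SECTORS_PER_TRACK


def blocks_to_mb(blocks):
    """Convert a block count to megabytes (2**20 bytes)."""
    return blocks * BYTES_PER_BLOCK / 2**20


# Cylinder counts copied from the partition table: root, swap and backup.
for tag, cylinders in (("root", 8336), ("swap", 555), ("backup", 8892)):
    blocks = slice_blocks(cylinders)
    print(f"{tag:7s} {cylinders:5d} cyl {blocks:8d} blocks {blocks_to_mb(blocks):9.2f} MB")
```

The block counts come out as 7877520, 524475 and 8402940, matching the table.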

If you need to change the slice sizes, you must back up all data first. When slices are resized, all data on those slices will be lost. Slices are changed from the partition menu as seen above. For example:

partition> 1
Part      Tag    Flag     Cylinders        Size            Blocks
 1       swap    wu    8336 - 8890      256.09MB    (555/0/0)   524475

Enter partition id tag[swap]: 
Enter partition permission flags[wu]: 
Enter new starting cyl[8336]: 
Enter partition size[524475b, 555c, 256.09mb, 0.25gb]: 

Raid overview

Multiple disks or disk slices can be grouped together to make a single virtual disk. Different algorithms, called RAID levels, are used to create virtual disks with varying levels of redundancy and performance.

Hardware raid

  • Raid functionality provided by dedicated hardware and firmware.
  • Decreases the cpu load on the host.
  • Generally better performance, but more expensive.

Software raid

  • Raid functionality provided by software on the host.
  • Adds host cpu overhead, since the host performs the raid processing itself.
  • Generally more flexible and less expensive.

    Raid Levels

    • raid 0 — concatenation/striping
    • raid 1 — mirroring
    • raid 0+1 — striping plus mirroring
    • raid 1+0 — mirroring plus striping
    • raid 3 — striping with dedicated parity (rarely used)
    • raid 5 — striping with distributed parity.
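
As a rough guide to what each level costs in disk space, the Python sketch below computes usable capacity for a set of equal-sized disks. The function name and the level labels are illustrative only, and it assumes the usual definitions: a mirror stores one copy per disk, the mirrored combinations keep two copies, and the parity levels give up one disk's worth of space.

```python
def usable_capacity(level, disks, disk_size):
    """Usable space for `disks` equal-sized disks at the given raid level."""
    if disks < 1:
        raise ValueError("at least one disk is required")
    if level == "0":
        # Concatenation or stripe: every block holds user data.
        return disks * disk_size
    if level == "1":
        # Mirror: every disk holds a full copy of the same data.
        return disk_size
    if level in ("0+1", "1+0"):
        # Two copies of everything, so half the raw space is usable.
        if disks < 4 or disks % 2:
            raise ValueError("needs an even number of disks, at least 4")
        return disks // 2 * disk_size
    if level in ("3", "5"):
        # One disk's worth of space is spent on parity.
        if disks < 3:
            raise ValueError("needs at least 3 disks")
        return (disks - 1) * disk_size
    raise ValueError(f"unknown raid level: {level!r}")


for level in ("0", "0+1", "1+0", "3", "5"):
    print(level, usable_capacity(level, 4, 9))
```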

    raid 0 — Concatenation

    Concatenation takes disk resources (pieces) and combines them into one virtual disk. Data is written from the first piece to the last in sequential order. Raid 0 concatenation offers no redundancy. If you lose one disk, your data needs to be restored from backup.

    Purpose:

    The primary reason for using concatenation is to create a virtual disk that is larger than a single disk. Concatenation also allows you to grow the virtual disk by adding more pieces to it. For example:

    [Figure: raid 0 concatenation]
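
The sequential fill order can be expressed as a simple address mapping. The Python sketch below is illustrative only, with hypothetical piece sizes measured in blocks. It finds which piece holds a given block of the virtual disk.

```python
def concat_locate(block, piece_sizes):
    """Map a virtual-disk block number to (piece index, block within piece)."""
    if block < 0:
        raise ValueError("block number must be non-negative")
    for index, size in enumerate(piece_sizes):
        if block < size:
            return index, block
        # Not in this piece: skip past it and try the next one.
        block -= size
    raise ValueError("block is beyond the end of the virtual disk")


pieces = [1000, 500, 2000]  # hypothetical piece sizes, in blocks
print(concat_locate(0, pieces))
print(concat_locate(1200, pieces))
```

Growing the virtual disk corresponds to appending another size to the list; the existing blocks keep their locations.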

    raid 0 — striping

    Striping is a term for breaking up a data stream and placing it across multiple disks in equal-sized chunks. Each chunk is written to successive drives in the virtual disk. Raid 0 striping provides no redundancy. If you lose one disk, your data needs to be restored from backup.

    Purpose:

    The primary purpose for using a stripe is to improve iops (i/o operations per second) performance. If the data stream is larger than the chunk size, multiple disks can be written to in parallel. For example:

    [Figure: raid 0 stripe]
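
The chunk rotation can be sketched as an address calculation. The Python function below is a minimal illustration, not the layout of any particular volume manager. The disk count and chunk size in the example are hypothetical.

```python
def stripe_locate(block, disks, chunk_blocks):
    """Map a virtual-disk block to (disk index, block offset on that disk)."""
    if block < 0 or disks < 1 or chunk_blocks < 1:
        raise ValueError("invalid stripe parameters")
    chunk = block // chunk_blocks   # which chunk of the virtual disk
    within = block % chunk_blocks   # position inside that chunk
    disk = chunk % disks            # chunks rotate across the disks
    row = chunk // disks            # number of full stripes before this chunk
    return disk, row * chunk_blocks + within


# Hypothetical stripe: 3 disks, 64-block chunks.
for block in (0, 64, 128, 192, 200):
    print(block, stripe_locate(block, disks=3, chunk_blocks=64))
```

Blocks 0, 64 and 128 land on three different disks, which is why a request larger than one chunk can keep several disks busy at once.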

    raid 1 — mirroring

    Mirroring uses twice the disk resources as raid 0 alone. Thus, hardware requirements are doubled.

    Purpose:

    Mirroring provides redundancy by writing data to 2 or more independent pieces. Raid 1 is used in combination with raid 0. A raid 1 typically contains 2 raid 0 pieces. Data is written to both pieces. If one piece loses a disk, the other piece continues to serve data i/o.

    raid 0+1 — striped and mirrored virtual disk

    The most common use of mirroring is in combination with a stripe. A stripe is the highest-performance raid type and, combined with a mirror, also provides high redundancy.

    Purpose:

    Mirrored stripes can lose all the disk resources on one side of the mirror and continue functioning. This holds true as long as the disk resources on each side of the mirror are independent from each other. For example:

    [Figure: raid 0+1 striped and mirrored]

    raid 1+0 — Striped mirrors

    Striped mirrors combine performance, redundancy and high availability. A striped mirror contains at least 4 pieces. The first 2 pieces make up the first raid 1 mirror. The second 2 pieces make up the next raid 1 mirror. Then the two mirrors are striped together. This gives the performance of striping and very high redundancy, because the virtual disk keeps working as long as each mirror still has one good piece. Disk resource needs are doubled compared with a plain stripe, the same as for raid 0+1. For example:

    [Figure: raid 1+0 striped mirrors]
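
The difference between the two combined levels shows up when more than one disk fails. The Python sketch below enumerates failures on a hypothetical 4-disk layout. It assumes raid 0+1 is a mirror of two 2-disk stripes and raid 1+0 is a stripe across two 2-disk mirrors. The names are illustrative only.

```python
from itertools import combinations

DISKS = range(4)
GROUPS = [{0, 1}, {2, 3}]  # hypothetical layout: two groups of two disks


def survives_0_plus_1(failed):
    """Mirror of stripes: at least one whole stripe must be untouched."""
    return any(not (group & failed) for group in GROUPS)


def survives_1_plus_0(failed):
    """Stripe of mirrors: every mirror must keep at least one good disk."""
    return all(bool(group - failed) for group in GROUPS)


def count_survivable(survives, failures):
    """Count the failure combinations of the given size that are survived."""
    return sum(survives(set(combo)) for combo in combinations(DISKS, failures))


for failures in (1, 2):
    print(failures, "failed:",
          "0+1 survives", count_survivable(survives_0_plus_1, failures),
          "| 1+0 survives", count_survivable(survives_1_plus_0, failures))
```

Under these assumptions both layouts survive any single failure, while raid 1+0 survives 4 of the 6 possible double failures and raid 0+1 survives 2.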

    raid 3 — striping with dedicated parity

    Purpose:

    Provides a stripe with a means of redundancy that is cheaper than a mirror. A disk dedicated for parity is added. The parity disk is used to rebuild data if a disk in the stripe is lost.

    • Write performance is poor because every write must also update the single parity disk.
    • This raid level is rarely used.
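
Parity in these raid levels is normally a byte-wise XOR of the data chunks, which is what makes a rebuild possible. The Python sketch below uses made-up chunk contents to compute a parity chunk and then rebuild a lost data chunk from the survivors. The helper name is illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"SUN1", b"disk", b"raid"]  # hypothetical chunks on three data disks
parity = xor_blocks(data)           # what the dedicated parity disk would hold

# Pretend the second data disk is lost, then rebuild its chunk from the
# two surviving data chunks plus the parity chunk.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```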

    raid 5 — striping with distributed parity

    Purpose:

    Provide redundancy without the resource overhead of mirroring.

    • Better i/o performance than raid 3
    • Data and parity are striped across all drives.
    • Overall random i/o performance is dependent on the percentage of writes.
    • Small writes require 4 disk operations: read old data, read old parity, write new data and write new parity. The new parity is calculated between the reads and the writes.

    For example:

    [Figure: raid 5 striped with distributed parity]
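
The four-operation write can be sketched with XOR parity. In the Python example below, the new parity is derived from only the old data and old parity, without reading the other data disks. The chunk contents and function names are hypothetical, and XOR parity is assumed.

```python
def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data, old_parity, new_data):
    """New parity for a read-modify-write update of a single chunk."""
    # Remove the old data's contribution from the parity, then add the new.
    return xor(xor(old_parity, old_data), new_data)


# Hypothetical stripe row: three data chunks and their parity chunk.
chunks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor(xor(chunks[0], chunks[1]), chunks[2])

new_chunk = b"\xaa\xbb"
new_parity = small_write_parity(chunks[1], parity, new_chunk)  # 2 reads done
chunks[1] = new_chunk                                          # 2 writes follow

# Recomputing parity from all data chunks gives the same result.
print(new_parity == xor(xor(chunks[0], chunks[1]), chunks[2]))
```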

    Array types

    JBOD – Just a bunch of disks

    • All disks in the array are available to the operating system.
    • Requires disk management software in order to construct virtual disks.

    Examples:

    • MultiPack
    • SPARCstorage Array
    • D1000
    • A5x00

    Hardware Raid

    • All disks are hidden from the host
    • Virtual disks are created by the raid controller.
    • Virtual disks are presented to the host.
    • Software on the host interacts with the controller to manage the array.

    Examples:

    • A1000
    • A3x00