RAID

From Alessandro's Wiki

RAID, literally "Redundant Array of Inexpensive Disks", is a technology used to abstract data volumes away from the physical disks.

I will go through the open-source implementation of this technique, in particular how to manage RAID from a generic Unix/Linux system.

mdadm

The most common and simple manager of software RAID devices.

quick commands

  • create a write-intent bitmap
mdadm /dev/md3 -Gb internal

or

mdadm --grow --bitmap=internal /dev/md3
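A quick check that the bitmap is in place (assuming /dev/md3 as above):

mdadm --detail /dev/md3 | grep -i bitmap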
  • mark /dev/sda1 as faulty in /dev/md0
mdadm /dev/md0 -f /dev/sda1
  • hot-remove /dev/sda1 from /dev/md0
mdadm /dev/md0 -r /dev/sda1
  • hot-add /dev/sda1 back to /dev/md0
mdadm /dev/md0 -a /dev/sda1
  • create a multipath RAID with 4 disks
mdadm -C /dev/md0 --level=multipath --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
Continue creating array? yes
mdadm: array /dev/md0 started.
  • check array consistency:
echo check > /sys/block/mdX/md/sync_action
watch -n .1 cat /proc/mdstat
  • check the RAID resync speed limit
cat /proc/sys/dev/raid/speed_limit_max

to set a maximum of 30 MB/sec:

echo "30000" > /proc/sys/dev/raid/speed_limit_max
  • check the mismatch count found by the last consistency check on the raid
cat /sys/block/md0/md/mismatch_cnt
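If mismatch_cnt is non-zero after a check, a repair pass can be requested through the same sync_action interface (a sketch, assuming /dev/md0):

echo repair > /sys/block/md0/md/sync_action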

recover a raid1

livecd ~ # mdadm --assemble /dev/md1 /dev/hda5 /dev/hdc5
mdadm: /dev/md1 has been started with 2 drives.
livecd ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda5[0] hdc5[1]
     19542976 blocks [2/2] [UU]
     bitmap: 0/150 pages [0KB], 64KB chunk
unused devices: <none>
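When the member devices are not known in advance, mdadm can also scan all partitions for RAID superblocks and assemble whatever it finds:

mdadm --assemble --scan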

Install the system on software RAID

mknod /dev/md1 b 9 1
mknod /dev/md2 b 9 2
mknod /dev/md3 b 9 3
mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create --verbose /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create --verbose /dev/md3 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --detail --scan > /etc/mdadm.conf
mdadm /dev/md1 -Gb internal
mdadm /dev/md2 -Gb internal
mdadm /dev/md3 -Gb internal
grub
 grub> root (hd0,x)
 grub> setup (hd0)
 grub> quit
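For reference, the ARRAY lines that --detail --scan writes into /etc/mdadm.conf look roughly like this (the UUIDs here are placeholders, and the metadata version depends on the mdadm release):

ARRAY /dev/md1 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md2 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md3 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx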

Create a raid0 (Disk+Disk+Disk..)

DON'T DO IT: RAID 0 gives no redundancy, so a single failed disk destroys the whole array.

localhost ~ # mknod /dev/md0 b 9 0
localhost ~ # mdadm --create /dev/md0 --chunk=64 --level=raid0 --raid-devices=2 /dev/hda7 /dev/hdc1
mdadm: /dev/hda7 appears to contain a reiserfs file system
    size = 4891648K
Continue creating array? y
mdadm: array /dev/md0 started.
localhost ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid0 hdc1[1] hda7[0]
     303363200 blocks 64k chunks
unused devices: <none>

RAID 1

  • partitioning: using two identical hard disks:
fdisk -l /dev/hd[bc]
  Device Boot      Start         End      Blocks   Id  System
/dev/hdb1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1       19929   160079661   fd  Linux raid autodetect
  • build and start the raid:
[root@elwood ~]# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hdb1 /dev/hdc1
mdadm: array /dev/md0 started.
  • do the checks:
[root@elwood ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc1[1] hdb1[0]
     160079552 blocks [2/2] [UU]
     [>....................]  resync =  0.4% (652864/160079552) finish=44.7min speed=59351K/sec
  • Enjoy the speed of NVMe: > 2 GB/s
md1 : active raid1 nvme2n1p1[1] nvme1n1p1[0]
     1855726592 blocks super 1.2 [2/2] [UU]
     [=======>.............]  check = 36.2% (673459008/1855726592) finish=7.9min speed=2462455K/sec
  • write the configuration to the mdadm.conf file; useful in some unusual boot conditions
[root@elwood ~]# vi /etc/mdadm.conf
DEVICE /dev/hd[bc]1
ARRAY /dev/md0 devices=/dev/hdb1,/dev/hdc1
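Once created (the resync can finish in the background), the array is used like any other block device; a minimal sketch, assuming ext4 and a /data mountpoint:

mkfs.ext4 /dev/md0
mount /dev/md0 /data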

RAID 5

  • partitioning: using three identical hard disks:
fdisk -l /dev/sd[bcd]
  Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       19929   160079661   fd  Linux raid autodetect


  • the command:
[root@elwood ~]# mdadm --create /dev/md5 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: array /dev/md5 started.
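A hot spare can be requested at creation time with --spare-devices; a variant of the command above, assuming a fourth disk /dev/sde1 that will automatically replace a failed member:

mdadm --create /dev/md5 --level=5 --raid-devices=3 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1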

RAID 10

  • partitioning: using four identical hard disks
fdisk -l /dev/sd[bcde]
  Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       19929   160079661   fd  Linux raid autodetect
  Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       19929   160079661   fd  Linux raid autodetect


  • the command:
[root@elwood ~]# mdadm --create /dev/md10 --level=10 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md10 started.
  • do the checks:
cat /proc/mdstat
md0 : active raid10 sdd2[3] sdb2[2] sdc2[1] sde2[0]
     1423179776 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
     [=========>...........]  resync = 47.4% (675489472/1423179776) finish=187.0min speed=66616K/sec
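The "2 near-copies" shown above is the default RAID 10 layout; it can also be set explicitly at creation time, a sketch using the same four disks:

mdadm --create /dev/md10 --level=10 --layout=n2 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1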

Remove a disk

  • remove the partition /dev/hdb1 from the RAID /dev/md5
fricco ~ # mdadm --manage /dev/md5 --fail /dev/hdb1
mdadm: set /dev/hdb1 faulty in /dev/md5
fricco ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md5 : active raid1 hdd1[1] hdb1[2](F)
     244187904 blocks [2/1] [_U]
md0 : active raid0 hdc1[1] hda7[0]
     303363200 blocks 64k chunks
unused devices: <none>
fricco ~ # mdadm --manage /dev/md5 -r /dev/hdb1
mdadm: hot removed /dev/hdb1

reset the raid information of a disk

mdadm --zero-superblock /dev/<disk>
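Before wiping, it can be worth confirming that a superblock is actually there; --examine prints it if present (assuming /dev/sdb1):

mdadm --examine /dev/sdb1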

Replace a corrupted disk in a raid1

  • The configuration is the following:
macchina ~ # cat /proc/mdstat |grep ^md
md10 : active raid1 hdd2[1] hda2[0]
md100 : active raid1 hdd3[1] hda3[0]
md160 : active raid1 hdd5[1] hdb1[0]
md0 : active raid1 hdd1[1] hda1[0]
  • the installed disks are:
macchina ~ # fdisk -l 2> /dev/null|grep 'Disk /dev/hd'
Disk /dev/hda: 80.0 GB, 80000000000 bytes
Disk /dev/hdb: 163.9 GB, 163928604672 bytes
Disk /dev/hdd: 251.0 GB, 251000193024 bytes
  • the disk /dev/hdd is damaged:
macchina ~ # dmesg |grep hdd|tail
end_request: I/O error, dev hdd, sector 162802377
raid1: hdd5: rescheduling sector 100823408
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=162802376, high=9, low=11807432,  sector=162802369
end_request: I/O error, dev hdd, sector 162802369
raid1:md160: read error corrected (8 sectors at 100823536 on hdd5)
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=162802377, high=9, low=11807433,  sector=162802377
end_request: I/O error, dev hdd, sector 162802377
raid1:md160: read error corrected (8 sectors at 100823544 on hdd5)
  • Back up the partition table:
sfdisk -d /dev/hdd > hdd_partition_table
  • Power off the machine and fit the new disk:

[... reboot ...]

  • load the old disk's partition table onto the new one:
sfdisk /dev/hdd < hdd_partition_table
  • Add the new disk's partitions to the RAIDs, one by one:
aledg ~ # mdadm /dev/md160 -a /dev/hdd5
mdadm: added /dev/hdd5
aledg ~ # mdadm /dev/md0 -a /dev/hdd1
mdadm: added /dev/hdd1
aledg ~ # mdadm /dev/md10 -a /dev/hdd2
mdadm: added /dev/hdd2
aledg ~ # mdadm /dev/md100 -a /dev/hdd3
mdadm: added /dev/hdd3
  • Check whether the RAIDs are rebuilding: the first one re-added will be resyncing, while the others should be in "DELAYED" state, meaning they will be rebuilt one at a time:
aledg ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [multipath] [faulty]
md10 : active raid1 hdd2[2] hda2[0]
     10241344 blocks [2/1] [U_]
       resync=DELAYED
     bitmap: 28/157 pages [112KB], 32KB chunk
md100 : active raid1 hdd3[2] hda3[0]
     20482752 blocks [2/1] [U_]
       resync=DELAYED
     bitmap: 17/157 pages [68KB], 64KB chunk
md160 : active raid1 hdd5[2] hdb1[0]
     160079552 blocks [2/1] [U_]
     [>....................]  recovery =  0.4% (690112/160079552) finish=141.1min speed=18812K/sec
md0 : active raid1 hdd1[2] hda1[0]
     264960 blocks [2/1] [U_]
       resync=DELAYED
     bitmap: 0/33 pages [0KB], 4KB chunk
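If the replaced disk is one the machine boots from, remember to reinstall the bootloader on it once the resync has finished, as in the grub example above (hd1 and x are assumptions for the new disk and its /boot partition):

 grub> root (hd1,x)
 grub> setup (hd1)
 grub> quit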

performance script (not tested yet)

#!/bin/bash
### EDIT THESE LINES ONLY
RAID_NAME="${1:-md0}"
# member drive letters separated by spaces, i.e. "b c d ..." between "(" and ")"
RAID_DRIVES=(b c d e f g)
SPEED_MIN=${2:-5000}
SPEED_MAX=${3:-50000}
### DO NOT EDIT THE LINES BELOW
echo $SPEED_MIN > /proc/sys/dev/raid/speed_limit_min
echo $SPEED_MAX > /proc/sys/dev/raid/speed_limit_max
# loop through the drives that make up the raid -> /dev/sdb, /dev/sdc...
for index in "${RAID_DRIVES[@]}"
do
        # per-member read-ahead, in KiB
        echo 1024 > /sys/block/sd${index}/queue/read_ahead_kb
        echo 256 > /sys/block/sd${index}/queue/nr_requests
        # disabling NCQ on the disk
        echo 1 > /sys/block/sd${index}/device/queue_depth
done
# Set read-ahead on the array device (65536 sectors of 512 bytes = 32 MiB).
echo "Setting read-ahead to 32 MiB for /dev/${RAID_NAME}"
blockdev --setra 65536 /dev/${RAID_NAME}
# Set stripe_cache_size (in pages per member device; RAID 5/6 arrays only).
echo "Setting stripe_cache_size to 16384 pages for /dev/${RAID_NAME}"
echo 16384 > /sys/block/${RAID_NAME}/md/stripe_cache_size
# stripe_cache_active is read-only: it reports how many stripes are in use.
cat /sys/block/${RAID_NAME}/md/stripe_cache_active
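A possible invocation, assuming the script is saved as raid-tune.sh: the array name and the two speed limits are positional arguments, falling back to md0, 5000 and 50000 when omitted:

./raid-tune.sh md0 5000 50000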