RAID
RAID, literally "Redundant Array of Inexpensive Disks", is a technology used to abstract data partitions away from the physical disks.
I will go through the open-source side of this technique, in particular how to manage software RAID from a generic Unix/Linux system.
mdadm
The most common and simplest manager for software RAID devices.
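A quick first look at what the kernel already knows, before touching anything (standard commands; /dev/md0 is just an example device):
cat /proc/mdstat
mdadm --detail /dev/md0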
Quick commands
- create a write-intent bitmap:
mdadm /dev/md3 -Gb internal
or
mdadm --grow --bitmap=internal /dev/md3
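- a sketch with standard mdadm options: verify the bitmap is in place, or remove it again:
mdadm --detail /dev/md3 | grep -i bitmap
mdadm --grow --bitmap=none /dev/md3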
- mark /dev/sda1 as faulty in /dev/md0:
mdadm /dev/md0 -f /dev/sda1
- hot-remove /dev/sda1 from /dev/md0:
mdadm /dev/md0 -r /dev/sda1
- hot-add /dev/sda1 back to /dev/md0:
mdadm /dev/md0 -a /dev/sda1
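- the fail and remove steps can also be chained in one manage-mode call, a sketch of standard mdadm usage:
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1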
- create a multipath array with 4 disks:
mdadm -C /dev/md0 --level=multipath --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
Continue creating array? yes
mdadm: array /dev/md0 started.
- verify consistency:
echo check >> /sys/block/mdX/md/sync_action
watch -n .1 cat /proc/mdstat
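- a running check can be aborted by writing idle to the same file (standard md sysfs behaviour):
echo idle > /sys/block/mdX/md/sync_action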
- check the RAID resync speed limit:
cat /proc/sys/dev/raid/speed_limit_max
to set a maximum of 30 MB/s:
echo "30000" > /proc/sys/dev/raid/speed_limit_max
- check for mismatched blocks on the RAID (populated by a check run):
cat /sys/block/md0/md/mismatch_cnt
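- if the counter is non-zero after a check, a repair pass rewrites the inconsistent blocks (repair is a standard sync_action value):
echo repair > /sys/block/md0/md/sync_action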
Recovering a RAID 1
livecd ~ # mdadm --assemble /dev/md1 /dev/hda5 /dev/hdc5
mdadm: /dev/md1 has been started with 2 drives.
livecd ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda5[0] hdc5[1]
      19542976 blocks [2/2] [UU]
      bitmap: 0/150 pages [0KB], 64KB chunk

unused devices: <none>
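If the member partitions are not known in advance, mdadm can find them by itself, a sketch using standard scan mode (it reads /etc/mdadm.conf or scans the superblocks):
mdadm --assemble --scan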
Installing the system on software RAID
mknod /dev/md1 b 9 1
mknod /dev/md2 b 9 2
mknod /dev/md3 b 9 3
mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create --verbose /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create --verbose /dev/md3 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --detail --scan > /etc/mdadm.conf
mdadm /dev/md1 -Gb internal
mdadm /dev/md2 -Gb internal
mdadm /dev/md3 -Gb internal
grub
grub> root (hd0,x)
grub> setup (hd0)
grub> quit
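A step worth adding here, as a sketch (not part of the original recipe): install GRUB on the second disk as well, so the machine can still boot if the first disk dies:
grub
grub> root (hd1,x)
grub> setup (hd1)
grub> quit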
Creating a RAID 0 (Disk+Disk+Disk...)
DON'T DO IT (RAID 0 has no redundancy: one failed disk loses the whole array)
localhost ~ # mknod /dev/md0 b 9 0
localhost ~ # mdadm --create /dev/md0 --chunk=64 --level=raid0 --raid-devices=2 /dev/hda7 /dev/hdc1
mdadm: /dev/hda7 appears to contain a reiserfs file system
    size = 4891648K
Continue creating array? y
mdadm: array /dev/md0 started.
localhost ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid0 hdc1[1] hda7[0]
      303363200 blocks 64k chunks

unused devices: <none>
RAID 1
- partitioning: using two identical hard disks:
fdisk -l /dev/hd[bc]

   Device Boot      Start         End      Blocks   Id  System
/dev/hdb1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1       19929   160079661   fd  Linux raid autodetect
- build and start the array:
[root@elwood ~]# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hdb1 /dev/hdc1
mdadm: array /dev/md0 started.
- do the checks:
[root@elwood ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc1[1] hdb1[0]
      160079552 blocks [2/2] [UU]
      [>....................]  resync =  0.4% (652864/160079552) finish=44.7min speed=59351K/sec
- enjoy the speed of NVMe (> 2 GB/s):
md1 : active raid1 nvme2n1p1[1] nvme1n1p1[0]
      1855726592 blocks super 1.2 [2/2] [UU]
      [=======>.............]  check = 36.2% (673459008/1855726592) finish=7.9min speed=2462455K/sec
- write the configuration to /etc/mdadm.conf; it is useful in some unusual boot conditions:
[root@elwood ~]# vi /etc/mdadm.conf

DEVICE /dev/hd[bc]1
ARRAY /dev/md0 devices=/dev/hdb1,/dev/hdc1
RAID 5
- partitioning: using three identical hard disks:
fdisk -l /dev/sd[bcd]

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       19929   160079661   fd  Linux raid autodetect
- the command:
[root@elwood ~]# mdadm --create /dev/md5 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: array /dev/md5 started.
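- a variant sketch: the same array plus a hot spare, so the rebuild starts automatically when a disk fails (--spare-devices is a standard mdadm option; /dev/sde1 is a hypothetical fourth disk):
mdadm --create /dev/md5 --level=5 --raid-devices=3 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1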
RAID 10
- partitioning: using four identical hard disks:
fdisk -l /dev/sd[bcde]

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       19929   160079661   fd  Linux raid autodetect

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       19929   160079661   fd  Linux raid autodetect
- the command:
[root@elwood ~]# mdadm --create /dev/md10 --level=10 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md10 started.
- do the checks:
cat /proc/mdstat
md0 : active raid10 sdd2[3] sdb2[2] sdc2[1] sde2[0]
      1423179776 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [=========>...........]  resync = 47.4% (675489472/1423179776) finish=187.0min speed=66616K/sec
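- the output above shows the default near-copies layout (n2); a far layout can improve sequential reads at the cost of slower writes. A sketch with the standard --layout option:
mdadm --create /dev/md10 --level=10 --layout=f2 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1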
Removing a disk
- remove the partition /dev/hdb1 from the RAID /dev/md5:
fricco ~ # mdadm --manage /dev/md5 --fail /dev/hdb1
mdadm: set /dev/hdb1 faulty in /dev/md5
fricco ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md5 : active raid1 hdd1[1] hdb1[2](F)
      244187904 blocks [2/1] [_U]
md0 : active raid0 hdc1[1] hda7[0]
      303363200 blocks 64k chunks

unused devices: <none>
fricco ~ # mdadm --manage /dev/md5 -r /dev/hdb1
mdadm: hot removed /dev/hdb1
Resetting the RAID information of a disk
mdadm --zero-superblock /dev/<disk>
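Before zeroing, it is worth checking what is actually on the disk (standard examine mode):
mdadm --examine /dev/<disk>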
Replacing a corrupted disk in a RAID 1
- the configuration is the following:
macchina ~ # cat /proc/mdstat | grep ^md
md10 : active raid1 hdd2[1] hda2[0]
md100 : active raid1 hdd3[1] hda3[0]
md160 : active raid1 hdd5[1] hdb1[0]
md0 : active raid1 hdd1[1] hda1[0]
- the installed disks are:
macchina ~ # fdisk -l 2> /dev/null | grep 'Disk /dev/hd'
Disk /dev/hda: 80.0 GB, 80000000000 bytes
Disk /dev/hdb: 163.9 GB, 163928604672 bytes
Disk /dev/hdd: 251.0 GB, 251000193024 bytes
- the disk /dev/hdd is failing:
macchina ~ # dmesg | grep hdd | tail
end_request: I/O error, dev hdd, sector 162802377
raid1: hdd5: rescheduling sector 100823408
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=162802376, high=9, low=11807432, sector=162802369
end_request: I/O error, dev hdd, sector 162802369
raid1:md160: read error corrected (8 sectors at 100823536 on hdd5)
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x40 { UncorrectableError }, LBAsect=162802377, high=9, low=11807433, sector=162802377
end_request: I/O error, dev hdd, sector 162802377
raid1:md160: read error corrected (8 sectors at 100823544 on hdd5)
- back up its partition table:
sfdisk -d /dev/hdd > hdd_partition_table
- shut down the machine and install the new disk:
[... reboot ...]
- load the old disk's partition table onto the new one:
sfdisk /dev/hdd < hdd_partition_table
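Note: this sfdisk dump/restore covers MBR partition tables; on a GPT disk the same trick can be done with sgdisk from the gdisk package, as a sketch:
sgdisk --backup=hdd_partition_table.gpt /dev/hdd
sgdisk --load-backup=hdd_partition_table.gpt /dev/hdd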
- add the new disk's partitions to the RAIDs, one by one:
aledg ~ # mdadm /dev/md160 -a /dev/hdd5
mdadm: added /dev/hdd5
aledg ~ # mdadm /dev/md0 -a /dev/hdd1
mdadm: added /dev/hdd1
aledg ~ # mdadm /dev/md10 -a /dev/hdd2
mdadm: added /dev/hdd2
aledg ~ # mdadm /dev/md100 -a /dev/hdd3
mdadm: added /dev/hdd3
- check that the arrays are rebuilding: the first one I re-added will be resyncing, while the others should be in "DELAYED" state, i.e. they will be rebuilt one at a time:
aledg ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [multipath] [faulty]
md10 : active raid1 hdd2[2] hda2[0]
      10241344 blocks [2/1] [U_]
        resync=DELAYED
      bitmap: 28/157 pages [112KB], 32KB chunk

md100 : active raid1 hdd3[2] hda3[0]
      20482752 blocks [2/1] [U_]
        resync=DELAYED
      bitmap: 17/157 pages [68KB], 64KB chunk

md160 : active raid1 hdd5[2] hdb1[0]
      160079552 blocks [2/1] [U_]
      [>....................]  recovery =  0.4% (690112/160079552) finish=141.1min speed=18812K/sec

md0 : active raid1 hdd1[2] hda1[0]
      264960 blocks [2/1] [U_]
        resync=DELAYED
      bitmap: 0/33 pages [0KB], 4KB chunk
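- instead of polling /proc/mdstat by hand, mdadm can watch the arrays and send mail when something fails (standard monitor mode; the address is an example):
mdadm --monitor --scan --daemonise --mail=root@localhost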
Performance script (not tested yet)
#!/bin/bash
### EDIT THESE LINES ONLY
RAID_NAME="${1:-md0}"
# devices separated by spaces, i.e. "a b c d ..." between "(" and ")"
RAID_DRIVES=(b c d e f g)
# per-disk read-ahead; the glob must match RAID_DRIVES above
# (note: the loop below writes read_ahead_kb on the same disks, overriding this value)
blockdev --setra 16384 /dev/sd[bcdefg]
SPEED_MIN=${2:-5000}
SPEED_MAX=${3:-50000}
### DO NOT EDIT THE LINES BELOW
echo $SPEED_MIN > /proc/sys/dev/raid/speed_limit_min
echo $SPEED_MAX > /proc/sys/dev/raid/speed_limit_max
# loop through the drives that make up the raid -> /dev/sdb, /dev/sdc, ...
for index in "${RAID_DRIVES[@]}"
do
echo 1024 > /sys/block/sd${index}/queue/read_ahead_kb
echo 256 > /sys/block/sd${index}/queue/nr_requests
# Disabling NCQ on all disks...
echo 1 > /sys/block/sd${index}/device/queue_depth
done
# Set read-ahead.
echo "Setting read-ahead to 64 MiB for /dev/${RAID_NAME}"
blockdev --setra 65536 /dev/${RAID_NAME}
# Set stripe_cache_size (RAID 4/5/6 only; the value is in pages per device)
echo "Setting stripe_cache_size to 16384 pages for /dev/${RAID_NAME}"
echo 16384 > /sys/block/${RAID_NAME}/md/stripe_cache_size
# stripe_cache_active is read-only: it just reports how many stripe entries are in use
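A hypothetical invocation, once RAID_DRIVES has been edited to match the machine and the script saved e.g. as raid-tune.sh (name is an assumption); the arguments are array name, minimum and maximum resync speed:
./raid-tune.sh md0 5000 50000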