UbuntuRaid1Log, logging the setup (Ubuntu 10.04.1) and use

The goal is to convert an existing Ubuntu installation into a RAID1 array in order to improve reliability.
Because the manual setup of a RAID1 array is very confusing, it is important to log the setup.

After several tries I got it working.

Important: In order to minimize downtime of the server, it is wise to prepare a spare hard disk for quick replacement without data loss in case of a failed hard disk; please see #Replace_broken_hard_disk.

Prerequisites

Analysis of the situation

Because I need a running system, I installed Ubuntu Desktop Edition 10.04.1 on a 4 GB USB stick and additionally installed the ...generic-pae kernel, the program mdadm for RAID support, and an SSH server for remote access and documentation.
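
A minimal sketch of the package installation on the USB-stick system (assuming the standard Ubuntu 10.04 package names; not copied from the original session):

$ sudo apt-get update
$ sudo apt-get install linux-generic-pae   # kernel with PAE support
$ sudo apt-get install mdadm               # Linux software RAID tools
$ sudo apt-get install openssh-server      # remote access for documentation

The first action is to check the current situation: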

$ mdadm -V
mdadm - v2.6.7.1 - 15th October 2008

$ sudo mdadm --examine --scan  # -Es  = short form
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=d865e07f:12394ce5:a78ec32f:6eb6df8d
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f71087e3:4fcf4b18:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=bd4de79e:318a0b3b:a78ec32f:6eb6df8d
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=b4cfe9fb:11886e44:a78ec32f:6eb6df8d

$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md_d2 : inactive sdd3[1](S)
      242035200 blocks
       
md_d1 : inactive sdd2[1](S)
      1550208 blocks
       
md_d0 : inactive sdc[1](S)
      244198464 blocks
       
unused devices: <none>

$ blkid
/dev/sda1: LABEL="SYSTEM" UUID="9935b2a9-4f99-43ba-bcee-c6488b0e63ba" TYPE="ext2" 
/dev/sda2: LABEL="USER" UUID="52196d5e-a59f-46a3-97fa-1fc81ec21f57" SEC_TYPE="ext2" TYPE="ext3" 
/dev/sda3: SEC_TYPE="msdos" LABEL="BIOS" UUID="4901-AA1A" TYPE="vfat" 
/dev/sdb1: UUID="0b201bbc-110f-40ca-ae55-7808bc253fd0" TYPE="ext4" 
/dev/sdb5: UUID="53d5a553-3588-4338-afa1-7430d84f57fd" TYPE="swap" 
/dev/sdc1: UUID="f71087e3-4fcf-4b18-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc2: UUID="bd4de79e-318a-0b3b-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc3: UUID="b4cfe9fb-1188-6e44-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd1: UUID="f71087e3-4fcf-4b18-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd2: UUID="bd4de79e-318a-0b3b-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd3: UUID="b4cfe9fb-1188-6e44-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 

The situation is confusing; it has to be cleaned up.

Note: All messages "mdadm: metadata format 00.90 unknown, ignored." have been removed from the output below.
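
These messages usually come from "metadata=00.90" entries in /etc/mdadm/mdadm.conf, which this mdadm version no longer accepts; changing them to "metadata=0.90" is the usual fix (a sketch, assuming such entries are present in that file):

$ sudo sed -i 's/metadata=00\.90/metadata=0.90/' /etc/mdadm/mdadm.conf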

$ sudo mdadm --stop --scan
mdadm: stopped /dev/md/d2
mdadm: stopped /dev/md/d1
mdadm: stopped /dev/md/d0

$ sudo mdadm --assemble /dev/md0 /dev/sdc1 /dev/sdd1
mdadm: /dev/md0 has been started with 2 drives.

$ sudo mdadm --assemble /dev/md1 /dev/sdc2 /dev/sdd2
mdadm: /dev/md1 has been started with 2 drives.

$ sudo mdadm --assemble /dev/md2 /dev/sdc3 /dev/sdd3
mdadm: /dev/md2 has been started with 2 drives.

rudi@rudi-usb:~$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md2 : active raid1 sdc3[0] sdd3[1]
      242035200 blocks [2/2] [UU]
      
md1 : active raid1 sdc2[0] sdd2[1]
      1550208 blocks [2/2] [UU]
      
md0 : active raid1 sdc1[0] sdd1[1]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

$ blkid
/dev/sda1: LABEL="SYSTEM" UUID="9935b2a9-4f99-43ba-bcee-c6488b0e63ba" TYPE="ext2" 
/dev/sda2: LABEL="USER" UUID="52196d5e-a59f-46a3-97fa-1fc81ec21f57" SEC_TYPE="ext2" TYPE="ext3" 
/dev/sda3: SEC_TYPE="msdos" LABEL="BIOS" UUID="4901-AA1A" TYPE="vfat" 
/dev/sdb1: UUID="0b201bbc-110f-40ca-ae55-7808bc253fd0" TYPE="ext4" 
/dev/sdb5: UUID="53d5a553-3588-4338-afa1-7430d84f57fd" TYPE="swap" 
/dev/sdc1: UUID="f71087e3-4fcf-4b18-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc2: UUID="bd4de79e-318a-0b3b-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc3: UUID="b4cfe9fb-1188-6e44-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd1: UUID="f71087e3-4fcf-4b18-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd2: UUID="bd4de79e-318a-0b3b-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd3: UUID="b4cfe9fb-1188-6e44-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/md0: UUID="8753ea07-f22f-4bb1-81ed-ddb389652047" TYPE="ext2" 
/dev/md1: UUID="ad16b658-952b-4276-9623-37d0246c1e06" TYPE="swap" 
/dev/md2: LABEL="MX250" UUID="624cbccb-5008-40aa-94a1-35d23000b776" SEC_TYPE="ext2" TYPE="ext3" 

# That looks OK now.

The next step is to mount the root partition /dev/md2 on /mnt. To avoid any mistyping, I wrote a little shell script.

rudi@rudi-usb:~$ cat chroot_md2.sh 
#!/bin/sh
# chroot_md2.sh - chroot to /mnt
# 2010-12-28 RR
sudo mount -t ext3 /dev/md2 /mnt
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
sudo chroot /mnt 
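
When done in the chroot, type "exit" and release the bind mounts again in reverse order (a small companion sketch, matching the mounts above):

$ sudo umount /mnt/sys /mnt/proc /mnt/dev
$ sudo umount /mnt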

rudi@rudi-usb:~$ sudo ./chroot_md2.sh 
root@rudi-usb:/# 

root@rudi-usb:/# cat /etc/mdadm/mdadm.conf 
...
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=d865e07f:12394ce5:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=3e1f9b05:cc404299:a78ec32f:6eb6df8d
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=b4cfe9fb:11886e44:a78ec32f:6eb6df8d

root@rudi-usb:/# mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=d865e07f:12394ce5:a78ec32f:6eb6df8d
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f71087e3:4fcf4b18:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=bd4de79e:318a0b3b:a78ec32f:6eb6df8d
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=b4cfe9fb:11886e44:a78ec32f:6eb6df8d

# Now the question is: which UUID is correct, and why does md0 appear twice?

Cleanup try #1: correct the UUIDs in mdadm.conf

$ man mdadm  # The chapter "DEVICE NAMES" says that from version 2.6 on, the
             # device name should be /dev/md_dNN or /dev/md/dNN

$ sudo reboot

# Check for double /dev/md0:
$ cat /proc/mdstat        
md_d0 : inactive sdc[1](S)
# Analysis: A superblock was written erroneously to /dev/sdc 

# Solution:
mdadm --zero-superblock /dev/sdc
mdadm: Couldn't open /dev/sdc for write - not zeroing
rudi@rudi-usb:~$ sudo mdadm --stop --scan
mdadm: stopped /dev/md/d2
mdadm: stopped /dev/md/d1
mdadm: stopped /dev/md/d0
rudi@rudi-usb:~$ sudo mdadm --zero-superblock /dev/sdc
# OK now.

# Partition /dev/sdc1 was auto-mounted to /media, so unmount:
rudi@rudi-usb:~$ ls /media
8753ea07-f22f-4bb1-81ed-ddb389652047_
rudi@rudi-usb:~$ sudo umount /media/8753ea07-f22f-4bb1-81ed-ddb389652047_/

# Restore the first RAID1 partition /dev/md_d0:
$ sudo mdadm --create /dev/md_d0 --level=1 --raid-disks=2 /dev/sdc1 /dev/sdd1 
mdadm: /dev/sdc1 appears to contain an ext2fs file system
    size=514048K  mtime=Thu Jan  1 01:00:00 1970
mdadm: /dev/sdd1 appears to contain an ext2fs file system
    size=514048K  mtime=Sat Jan 29 15:50:46 2011
Continue creating array? y
mdadm: array /dev/md_d0 started.

$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md_d0 : active raid1 sdd1[1] sdc1[0]
      513984 blocks [2/2] [UU]
      
md_d1 : active raid1 sdd2[1] sdc2[0]
      1550208 blocks [2/2] [UU]
      
md_d2 : active raid1 sdd3[1] sdc3[0]
      242035200 blocks [2/2] [UU]
      
unused devices: <none>

$ sudo mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f874ee18:944046d4:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=bd4de79e:318a0b3b:a78ec32f:6eb6df8d
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=b4cfe9fb:11886e44:a78ec32f:6eb6df8d
# OK

$ sudo reboot

$ sudo mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f874ee18:944046d4:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=21bbb030:79e49a7c:a78ec32f:6eb6df8d
ARRAY /dev/md128 level=raid1 num-devices=2 UUID=ccf55e08:5107a4ac:a78ec32f:6eb6df8d
# That was wrong, why? 

# Restore device names
$ sudo mdadm --stop --scan 
mdadm: stopped /dev/md/d1
mdadm: stopped /dev/md/d128
mdadm: stopped /dev/md/d0

$ sudo mdadm --assemble /dev/md_d0 /dev/sdc1 /dev/sdd1
mdadm: /dev/md_d0 has been started with 2 drives.
rudi@rudi-usb:~$ sudo mdadm --assemble /dev/md_d1 /dev/sdc2 /dev/sdd2
mdadm: /dev/md_d1 has been started with 2 drives.
rudi@rudi-usb:~$ sudo mdadm --assemble /dev/md_d2 /dev/sdc3 /dev/sdd3
mdadm: /dev/md_d2 has been started with 2 drives.

Try the setup again with the new device names

$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md_d0 : active raid1 sdc1[0] sdd1[1]
      513984 blocks [2/2] [UU]
      
md_d1 : inactive sdc2[0](S)
      1550208 blocks
       
md_d2 : inactive sdc3[0](S)
      242051264 blocks
       
unused devices: <none>
# Why was only RAID1 device /dev/md_d0 activated?
# Try with a new create:
$ sudo mdadm --stop /dev/md_d1
mdadm: stopped /dev/md_d1
$ sudo mdadm --create /dev/md_d1 --level=1 --raid-disks=2 /dev/sdc2 /dev/sdd2
mdadm: /dev/sdc2 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Fri Jan 28 10:57:49 2011
mdadm: /dev/sdd2 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Fri Jan 28 10:57:49 2011
Continue creating array? y
mdadm: array /dev/md_d1 started.
$ sudo mdadm --stop /dev/md_d2
mdadm: stopped /dev/md_d2
# Because I know the RAID1 array /dev/md_d2 is in sync, "--assume-clean" can be used
# in order to avoid 3 hours of sync time.
$ sudo mdadm --create --assume-clean /dev/md_d2 --level=1 --raid-disks=2 /dev/sdc3 /dev/sdd3
mdadm: metadata format 00.90 unknown, ignored.
mdadm: metadata format 00.90 unknown, ignored.
mdadm: /dev/sdc3 appears to contain an ext2fs file system
    size=242035200K  mtime=Sat Jan 29 14:43:43 2011
mdadm: /dev/sdc3 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Sat Jan 29 16:33:22 2011
mdadm: /dev/sdd3 appears to contain an ext2fs file system
    size=242035200K  mtime=Sat Jan 29 14:43:43 2011
mdadm: /dev/sdd3 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Sat Jan 29 16:33:22 2011
Continue creating array? y
mdadm: array /dev/md_d2 started.
rudi@rudi-usb:~$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md_d2 : active raid1 sdd3[1] sdc3[0]
      242035200 blocks [2/2] [UU]
      
md_d1 : active raid1 sdd2[1] sdc2[0]
      1550208 blocks [2/2] [UU]
      
md_d0 : active raid1 sdc1[0] sdd1[1]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

$ sudo reboot  # In order to check that the RAID1 arrays are assembled automatically at boot

$ sudo mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f874ee18:944046d4:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=21bbb030:79e49a7c:a78ec32f:6eb6df8d
ARRAY /dev/md128 level=raid1 num-devices=2 UUID=ccf55e08:5107a4ac:a78ec32f:6eb6df8d
# Why the different RAID1 array device names?
# Either GRUB2 or mdadm has its own naming scheme; I could not figure out which.

$ sudo mdadm --examine --scan >> /etc/mdadm/mdadm.conf
# If there is a problem "Permission denied", set group=disk and allow group write
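
Note: the "Permission denied" usually comes from the shell redirection itself, which is performed by the non-root shell even though mdadm runs under sudo. Two common workarounds (a sketch):

$ sudo sh -c 'mdadm --examine --scan >> /etc/mdadm/mdadm.conf'
# or:
$ sudo mdadm --examine --scan | sudo tee -a /etc/mdadm/mdadm.conf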

# Delete the old UUIDs with the editor "nano"
$ sudo nano /etc/mdadm/mdadm.conf

# Embed the new mdadm.conf into the "initramfs"
$ sudo update-initramfs -u -k all


$ sudo reboot

$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdd2[1] sdc2[0]
      1550208 blocks [2/2] [UU]
      
md128 : active raid1 sdd3[1] sdc3[0]
      242035200 blocks [2/2] [UU]
      
md0 : active raid1 sdd1[1] sdc1[0]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

# OK, now the RAID1 arrays are all active after boot, 
# and the names are the same in all places: /etc/mdadm/mdadm.conf and mdadm -Es


rudi@rudi-usb:~$ cat chroot_md128.sh 
#!/bin/sh
# chroot_md128.sh - chroot to /mnt
# 2010-12-28 RR
sudo mount -t ext3 /dev/md128 /mnt
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
sudo chroot /mnt 

rudi@rudi-usb:~$ sudo ./chroot_md128.sh 

root@rudi-usb:/# cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdd2[1] sdc2[0]
      1550208 blocks [2/2] [UU]
      
md128 : active raid1 sdd3[1] sdc3[0]
      242035200 blocks [2/2] [UU]
      
md0 : active raid1 sdd1[1] sdc1[0]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

# Delete the old UUIDs with the editor "nano"
root@rudi-usb:/# nano /etc/mdadm/mdadm.conf

# Edit /etc/fstab and correct the /dev/mdx names
# Edit /etc/mtab and correct the /dev/mdx names

# Create file "09_swraid1_setup"
root@rudi-usb:/# cat /etc/grub.d/09_swraid1_setup 
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
menuentry 'Ubuntu, with Linux 2.6.32-27-generic-pae' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        insmod raid
        insmod mdraid
        insmod ext2
        set root='(md128)'
        linux   /vmlinuz-2.6.32-27-generic-pae root=/dev/md128 ro   splash bootdegraded=yes 
        initrd  /initrd.img-2.6.32-27-generic-pae
}

root@rudi-usb:/# update-grub

# Embed the new mdadm.conf into the "initramfs"
root@rudi-usb:/# update-initramfs -u -k all

root@rudi-usb:/# grub-install /dev/sdc
Installation finished. No error reported.
root@rudi-usb:/# grub-install /dev/sdd
Installation finished. No error reported.

root@rudi-usb:/# exit
$ sudo poweroff

Last try, with the system-given RAID1 array names

Remove the USB stick and boot from the hard disk.

Error: no such disk
grub rescue> ls
(md0) (md1) (hd0) (hd0,3) (hd0,2) (hd0,1) (hd1) (hd1,4) (hd1,3) (hd1,2) (hd1,1)
(hd2) (hd2,3) (hd2,2) (hd2,1)
# Why is (md128) missing? 
# Answer: There was a file system size conflict, figured out with "e2fsck /dev/md0"

# Edit boot/grub/grub.cfg on /dev/md0
# root must be (md0)
# The kernel location must be specified with the UUID, because /dev/sdx can move.
menuentry 'Ubuntu, mit Linux 2.6.32-27-generic-pae' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        insmod raid
        insmod mdraid
        insmod ext2
        set root='(md0)'
        search --no-floppy --fs-uuid --set 624cbccb-5008-40aa-94a1-35d23000b776
        linux   /boot/vmlinuz-2.6.32-27-generic-pae root=UUID=624cbccb-5008-40aa-94a1-35d23000b776 ro   splash bootdegraded=yes
        initrd  /boot/initrd.img-2.6.32-27-generic-pae
}
# That works now

# Check the UUID's
rudi@rudiswiki:~$ mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=f874ee18:944046d4:a78ec32f:6eb6df8d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=21bbb030:79e49a7c:a78ec32f:6eb6df8d
ARRAY /dev/md128 level=raid1 num-devices=2 UUID=ccf55e08:5107a4ac:a78ec32f:6eb6df8d

rudi@rudiswiki:~$ blkid | tail
/dev/sdb5: UUID="53d5a553-3588-4338-afa1-7430d84f57fd" TYPE="swap" 
/dev/sdc1: UUID="f874ee18-9440-46d4-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc2: UUID="21bbb030-79e4-9a7c-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdc3: UUID="ccf55e08-5107-a4ac-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd1: UUID="f874ee18-9440-46d4-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd2: UUID="21bbb030-79e4-9a7c-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/sdd3: UUID="ccf55e08-5107-a4ac-a78e-c32f6eb6df8d" TYPE="linux_raid_member" 
/dev/md0: UUID="8753ea07-f22f-4bb1-81ed-ddb389652047" TYPE="ext2" 
/dev/md1: UUID="ad16b658-952b-4276-9623-37d0246c1e06" TYPE="swap" 
/dev/md128: LABEL="MX250" UUID="624cbccb-5008-40aa-94a1-35d23000b776" TYPE="ext3" 

# Check for UUID, OK
rudi@rudiswiki:~$ file -s /dev/md0
/dev/md0: Linux rev 1.0 ext2 filesystem data (mounted or unclean), UUID=8753ea07-f22f-4bb1-81ed-ddb389652047

# Remove the special file /etc/grub.d/09_swraid1_setup, it was not useful.
rudi@rudiswiki:~$ sudo rm /etc/grub.d/09_swraid1_setup
rudi@rudiswiki:~$ sudo update-grub

# Edit /etc/fstab, insert the UUIDs, because the "/dev/sdx" device names can move
# values of the RAID1 installation (sudo mdadm --examine --scan)
# 2011-01-30 RudolfReuter
# /dev/md128
UUID=624cbccb-5008-40aa-94a1-35d23000b776 /               ext3    errors=remount-ro 0       1
# /dev/md1
UUID=ad16b658-952b-4276-9623-37d0246c1e06 none            swap    sw              0       0
# /dev/md0; GRUB2 uses ONLY partition #1 for booting
UUID=8753ea07-f22f-4bb1-81ed-ddb389652047 /boot         ext2    defaults        0       2
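
The UUIDs entered here can be cross-checked against the filesystem UUIDs reported by blkid (a quick sketch):

$ sudo blkid | grep /dev/md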

rudi@rudiswiki:~$ sudo update-initramfs -u

$ sudo reboot

error: file not found
grub rescue>

# Reboot with USB Stick
rudi@rudiswiki:~$ sudo e2fsck /dev/md0
Error: The filesystem size (according to the superblock) is xxx ...
For a repair look at [[UbuntuRaid1#raid1tipps|RAID1 Tipps]].

# "chroot" to /dev/md128
root@rudi-usb:/# grub-install /dev/sdc
root@rudi-usb:/# grub-install /dev/sdd
root@rudi-usb:/# exit

# Reboot with RAID1 array hard disk
# Finally it works!

For mounting the RAID1 system as root, see the chroot scripts above.

The last problem was the differing sizes of the /dev/md0 partitions.
The very last command in the setup sequence must be grub-install /dev/sd[cd].

Summary

So, why did it take such a long time to set up the RAID1 array with USB hard disks?

In principle I see no software problems, just documentation problems:

  1. Problem was to figure out which kernel has RAID support: ...generic-pae

  2. Problem was to find the system names for the RAID1 arrays (e.g. /dev/md128).

  3. Problem was to figure out that you have to use UUIDs instead of /dev/sdx.

  4. Problem was to find out that the boot loader must be in partition #1 only.
    All examples on the Internet had the system partition last, in order to allow LVM, which I
    do not need for a home server. It would be much easier to put the system partition in place #1.

  5. Problem was to find out the right setup sequence for the GRUB2 boot loader.

  6. Problem was to find the right tools for fixing abnormal behavior (e.g. mdadm --stop --scan).

    The RAID resync works very hard in the background,
    so you have to check often what it is doing with cat /proc/mdstat.

  7. Problem was to find all the commands needed to check the setup stages (e.g. bootinfoscript, mdadm -Es, blkid, file); see the quick-check sketch after this list.
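
For reference, these are the check commands used most often in this log (a short sketch):

$ cat /proc/mdstat                  # current array state and resync progress
$ sudo mdadm -Es                    # arrays as recorded in the superblocks
$ sudo mdadm --detail /dev/md0      # details of one array
$ blkid                             # filesystem UUIDs
$ watch -n 10 cat /proc/mdstat      # follow a running resync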

Replace broken hard disk

On 2012-02-01 a two-year-old Maxtor 250 GB 2.5" hard disk started to reallocate sectors, until it had to be removed.

What to do for replacement:

# hdparm DOES NOT WORK with USB disk
$ sudo hdparm -i /dev/sdb
/dev/sdb:
 HDIO_DRIVE_CMD(identify) failed: Invalid exchange
 HDIO_GET_IDENTITY failed: Invalid argument

# smartctl does work PARTIALLY, the Serial Number is not shown
$ smartctl -i /dev/sdc 
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: WD       2500BEV External Version: 1.75
$ smartctl -i /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: Seagate  Portable         Version: 0130
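
# As an additional way to tell the USB disks apart (an assumption, not part of the
# original procedure), the persistent udev names include vendor and serial number:
$ ls -l /dev/disk/by-id/ | grep -i usb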

# fdisk shows the disk identifier; for an explanation see the link
$ sudo fdisk -l /dev/sdb  # Seagate
Platte /dev/sdb: 250.1 GByte, 250059350016 Byte
...
Disk identifier: 0x0004b866
$ sudo fdisk -l /dev/sdc  # Western Digital
Platte /dev/sdc: 250.1 GByte, 250059350016 Byte
...
Disk identifier: 0x0003c605
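
# If the failing disk is still a member of the arrays, it is usually marked as
# failed and removed before it is unplugged (a sketch; /dev/sdX stands for the
# failing disk, adjust the partition numbers to the layout):
$ sudo mdadm /dev/md128 --fail /dev/sdX3 --remove /dev/sdX3
$ sudo mdadm /dev/md1   --fail /dev/sdX2 --remove /dev/sdX2
$ sudo mdadm /dev/md0   --fail /dev/sdX1 --remove /dev/sdX1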

$ sudo mdadm --add /dev/md128 /dev/sdc3
mdadm: re-added /dev/sdc3

# check for sync
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdc3[2] sdb3[0]
      242035200 blocks [2/1] [U_]
      [>....................]  recovery =  0.0% (135424/242035200) finish=327.4min speed=12311K/sec
      
md1 : inactive sdc2[1](S) sdb2[0](S)
      3100416 blocks
       
md0 : active raid1 sdb1[0]
      513984 blocks [2/1] [U_]
      
unused devices: <none>

# sync process has finished, /dev/md128 looks OK, /dev/md1 = SWAP, /dev/md0 = boot
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdc3[1] sdb3[0]
      242035200 blocks [2/2] [UU]
      
md1 : inactive sdc2[1](S) sdb2[0](S)
      3100416 blocks
       
md0 : active raid1 sdc1[1] sdb1[0]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

Hard Disk Western Digital 250 GB, disk identifier 0x0003c606

# copy the partition table from /dev/sdb to the new disk /dev/sdd
$ sudo sfdisk -d /dev/sdb | sfdisk /dev/sdd
Checking that no-one is using this disk right now ...
BLKRRPART: Permission denied
OK

Disk /dev/sdd: 30401 cylinders, 255 heads, 63 sectors/track
Old situation:
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/sdd1          0+  30401-  30402- 244197560    7  HPFS/NTFS
/dev/sdd2          0       -       0          0    0  Empty
/dev/sdd3          0       -       0          0    0  Empty
/dev/sdd4          0       -       0          0    0  Empty
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdd1            63   1028159    1028097  fd  Linux raid autodetect
/dev/sdd2       1044225   4144769    3100545  fd  Linux raid autodetect
/dev/sdd3   *   4160835 488263544  484102710  fd  Linux raid autodetect
/dev/sdd4             0         -          0   0  Empty
Successfully wrote the new partition table
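
# Note: only the first sfdisk in the pipe above ran with root privileges; the
# "Permission denied" on BLKRRPART most likely comes from the second one.
# A variant that runs both as root (an assumption, not what was actually typed):
$ sudo sfdisk -d /dev/sdb | sudo sfdisk /dev/sdd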

# check the new hard disk
$ sudo mdadm --manage /dev/md0 --add /dev/sdd1
mdadm: Cannot open /dev/sdd1: Device or resource busy

# check for mounted
$ ls /media
Elements

# unmount new hard disk
$ sudo umount /media/Elements/

# check the new hard disk
$ sudo mdadm --manage /dev/md0 --add /dev/sdd1
mdadm: added /dev/sdd1
rudi@rudiswiki:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdc3[1] sdb3[0]
      242035200 blocks [2/2] [UU]
      
md1 : inactive sdc2[1](S) sdb2[0](S)
      3100416 blocks
       
md0 : active raid1 sdd1[2](S) sdc1[1] sdb1[0]
      513984 blocks [2/2] [UU]
      
unused devices: <none>

$ sudo mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Sat Jan 29 16:13:03 2011
     Raid Level : raid1
     Array Size : 513984 (502.02 MiB 526.32 MB)
  Used Dev Size : 513984 (502.02 MiB 526.32 MB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Feb  2 19:47:19 2012
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           UUID : f874ee18:944046d4:a78ec32f:6eb6df8d
         Events : 0.206

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1

       2       8       49        -      spare   /dev/sdd1

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdb3[0]
      242035200 blocks [2/1] [U_]
      
md1 : active raid1 sdb2[0]
      1550208 blocks [2/1] [U_]
      
md0 : active raid1 sdb1[0] sdc[2]
      513984 blocks [2/1] [U_]
      [==================>..]  recovery = 94.0% (483392/513984) finish=0.0min speed=9691K/sec

# sync the system partition
$ sudo mdadm --manage /dev/md128 --add /dev/sdc3
# Error: the disk /dev/sdc is busy. How to fix it?
# Solution: stop the /dev/md0 array.
$ sudo mdadm --stop /dev/md0
# OK
$ sudo mdadm --add /dev/md128 /dev/sdc3
# OK

# sync the swap partition
$ sudo mdadm --manage /dev/md1 --add /dev/sdc2
[sudo] password for rudi: 
mdadm: added /dev/sdc2
# have a look if the sync process works
rudi@rudiswiki:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdc3[1] sdb3[0]
      242035200 blocks [2/2] [UU]
      
md1 : active raid1 sdc2[2] sdb2[0]
      1550208 blocks [2/1] [U_]
      [=========>...........]  recovery = 48.2% (748416/1550208) finish=0.7min speed=18545K/sec
      
unused devices: <none>
# OK
# have a look if the sync process has finished
rudi@rudiswiki:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md128 : active raid1 sdc3[1] sdb3[0]
      242035200 blocks [2/2] [UU]
      
md1 : active raid1 sdc2[1] sdb2[0]
      1550208 blocks [2/2] [UU]
      
unused devices: <none>
# OK

# Last, sync the boot partition, with the help of /usr/share/doc/mdadm/README.recipes
$ sudo mdadm --assemble /dev/md0 /dev/sdb1
mdadm: /dev/md0 assembled from 1 drive - need all 2 to start it (use --run to insist).
# That does not work, try again
$ sudo mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
mdadm: no RAID superblock on /dev/sdc1
mdadm: /dev/sdc1 has no superblock - assembly aborted
# That does not work, try again
$ sudo mdadm --run /dev/md0 /dev/sdb1
mdadm: error opening /dev/md0: No such file or directory
mdadm: /dev/sdb1 does not appear to be an md device
# That does not work, try again
$ sudo mdadm --assemble --auto=yes --run /dev/md0 /dev/sdb1
mdadm: /dev/md0 has been started with 1 drive (out of 2).
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdb1[0]
      513984 blocks [2/1] [U_]
# OK
# add drive /dev/sdc1
$ sudo mdadm --add /dev/md0 /dev/sdc1
mdadm: added /dev/sdc1
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdc1[1] sdb1[0]
      513984 blocks [2/2] [UU]
# OK

Change disk identifier in MBR (DOS)

In order to differentiate the hard disks, I use the disk identifier (4 bytes in the MBR, see link). Unfortunately it was cleared by accident, so I will describe how to set it up again (WD 250 GB /dev/sdc):
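
For reference, the identifier is the 4-byte disk signature stored at offset 440 of the MBR (little-endian); it can also be read directly, e.g. (a sketch):

$ sudo dd if=/dev/sdc bs=1 skip=440 count=4 2>/dev/null | od -An -tx1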

# show old "disk identifier"
$ sudo fdisk -l /dev/sdc
...
Disk identifier: 0x00000000

# setup new "disk identifier" 0x0003c606
$ sudo fdisk /dev/sdc
# type x for expert menu
Expert command (m for help): i
New disk identifier (current 0x00000000): 0x0003c606
Disk identifier: 0x0003c606
# type r to return to main menu

# change a partition type back and forth to force the MBR write!
Disk identifier: 0x0003c606

   Device Boot      Start         End     Blocks   Id  System
/dev/sdc1               1          64      514048+  fd  Linux raid autodetect
/dev/sdc2              66         258     1550272+  fd  Linux raid autodetect
/dev/sdc3   *         260       30393   242051355   fd  Linux raid autodetect

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): 83
Changed system type of partition 1 to 83 (Linux)

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

# write changed MBR to disk
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

# check for new "disk identifier"
$ sudo fdisk -l /dev/sdc
...
Disk identifier: 0x0003c606
# OK


-- RudolfReuter 2011-01-29 19:45:12


Go back to CategoryHowTo or FrontPage ; KontaktEmail (ContactEmail)
