RAID1 recovery
One of the MA database systems went down. It wouldn’t boot, so we put in the install disk, typed ‘linux rescue’ at the prompt, and it booted into rescue mode. The system files live on a pair of SATA drives in RAID1. Drive /dev/sda was gone: fdisk found no partitions on it. /dev/sdb was fine. I looked around for traces of a break-in but found nothing. /var/log/messages indicated the system had shut down for a reboot two days before. We hadn’t done it, so how and why?
First, I re-partitioned /dev/sda to look like /dev/sdb using the same ‘fd’ RAID partition type.
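One way to copy a partition layout from one disk to another is to dump it with sfdisk and feed the dump back in. This is only a sketch and not necessarily how it was done here: the function name clone_partition_table and the DRY_RUN variable are made up for illustration, and by default it only prints the command instead of running it.

```shell
# Hypothetical helper: copy the partition table of one disk onto another.
# With DRY_RUN unset or set to 1 it only prints the command it would run.
clone_partition_table() {
    cpt_src="$1"   # disk with the good layout, e.g. /dev/sdb
    cpt_dst="$2"   # disk to overwrite, e.g. /dev/sda

    # Refuse obviously wrong input before touching anything.
    if [ -z "$cpt_src" ] || [ -z "$cpt_dst" ]; then
        echo "usage: clone_partition_table SOURCE_DISK TARGET_DISK" >&2
        return 1
    fi
    if [ "$cpt_src" = "$cpt_dst" ]; then
        echo "source and target are the same disk" >&2
        return 1
    fi

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sfdisk -d $cpt_src | sfdisk $cpt_dst"
    else
        # sfdisk -d dumps the table in a form sfdisk can read back,
        # partition types (such as fd) included.
        sfdisk -d "$cpt_src" | sfdisk "$cpt_dst"
    fi
}
```

Running clone_partition_table /dev/sdb /dev/sda prints the pipeline; setting DRY_RUN=0 makes it write to the target disk, so check which drive is which first.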
To bring it back up, I shut down and swapped the sda and sdb cables, so the good drive became /dev/sda and we could boot off it and then have RAID rebuild onto the second drive. The original /dev/sdb didn’t have grub installed in its MBR, so I had to reboot with the rescue disk and reinstall grub.
/mnt/sysimage/sbin/grub
grub> root (hd0,0)
grub> setup (hd0)
I had to use the grub shell directly because grub-install wasn’t available in the rescue environment, and /mnt/sysimage/sbin/grub-install couldn’t find /sbin/grub.
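The same two grub commands can also be fed to the grub shell non-interactively. The sketch below assumes GRUB legacy, which has a --batch option; the helper name grub_setup_commands is invented for illustration, and it only prints the commands so they can be inspected before being piped anywhere.

```shell
# Hypothetical helper: print the GRUB legacy shell commands that install
# the boot loader.
#   $1  partition holding /boot in grub notation, e.g. "(hd0,0)"
#   $2  disk whose MBR gets the boot loader, e.g. "(hd0)"
grub_setup_commands() {
    printf 'root %s\nsetup %s\nquit\n' "$1" "$2"
}

# From the rescue environment the output could be piped into the grub
# binary of the installed system (not run here):
#   grub_setup_commands '(hd0,0)' '(hd0)' | /mnt/sysimage/sbin/grub --batch
```

Another option that may work is to chroot into /mnt/sysimage first, so that grub-install finds /sbin/grub at the path it expects.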
After a reboot, grub came up and the system booted. The root array /dev/md1 (RAID1) was degraded, as the query below shows, so I added /dev/sdb back:
mdadm --query --detail /dev/md1
...degraded...
mdadm --add /dev/md1 /dev/sdb
And 20 minutes later the array was clean!
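Besides mdadm --detail, the kernel reports array state in /proc/mdstat, where a healthy two-disk mirror shows [UU] and a mirror with a missing member shows an underscore, as in [U_]. Here is a minimal sketch for listing degraded arrays from that file; the function name degraded_arrays is made up for illustration, and it assumes the usual mdstat layout of an "mdN :" line followed by a "blocks" line.

```shell
# Hypothetical helper: print the name of every md array whose status field
# (for example [U_]) shows a missing member. Reads /proc/mdstat by default;
# pass a file name to check a saved copy instead.
degraded_arrays() {
    awk '
        # Array header, e.g. "md1 : active raid1 sda2[0]"
        /^md[0-9]+ :/ { name = $1 }

        # Status line, e.g. "77023232 blocks [2/1] [U_]"
        /blocks/ && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument checks the live system and prints nothing when every array is complete. While a rebuild is running, watch cat /proc/mdstat shows the recovery progress.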