On one of my Solaris 9 servers, I lost a hard drive in a large array that had been built with no redundancy. RAID 0 is a tempting option for admins trying to squeeze the most space out of the capacity they have, but it is almost always the wrong choice. In any case, a drive died and had to be replaced. The array had been built with Veritas Volume Manager, so here are the steps I used, all from the command line. There is also a Veritas Enterprise Administrator GUI (vea) that you can use, but it has problems occasionally. Keep in mind that my array was already destroyed, so I didn't worry about any data loss. If you are trying to keep your data, don't follow my steps! You've been warned.
Legend: gendg is the disk group, genlv is the logical volume, gendg04 is the failed drive, gendg07 is the replacement drive, and genlv-01 is the failed plex. You can find most of this information with the vxprint command.
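For a readable picture of the disk group before you start, vxprint's hierarchical listing shows the volumes, plexes, and subdisks together; something like this is what I'd run:
/usr/sbin/vxprint -g gendg -ht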
Unmount the affected file system
/sbin/umount /u0
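If the umount fails because the file system is busy, fuser (standard Solaris, not Veritas) can show which processes are holding it open:
/usr/sbin/fuser -c /u0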
Add Disk_13 (gendg07) to gendg
/usr/sbin/vxdg -g gendg adddisk gendg07=Disk_13
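If you aren't sure which access name the new drive came up with (Disk_13 in my case), vxdisk lists every disk VxVM can see along with its group and status:
/usr/sbin/vxdisk list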
Replace gendg04 with gendg07
/usr/sbin/vxdg -g gendg repldisk gendg04=gendg07
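To confirm the swap took, you can dump just the disk records for the group and check that gendg04 is no longer listed as failed:
/usr/sbin/vxprint -g gendg -dt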
Dissociate the failed plex
/usr/sbin/vxplex -g gendg dis genlv-01
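For the record, if you were scrapping the plex instead of re-using it, I believe the usual way to delete a dissociated plex and its subdisks is vxedit. I did not do this, since I wanted VxVM to rebuild the plex on the new disk:
/usr/sbin/vxedit -g gendg -rf rm genlv-01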
Re-associate the plex; it will be rebuilt
/usr/sbin/vxplex -g gendg att genlv genlv-01
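The attach kicks off a resync, which can take a while on a big array; vxtask shows the progress of running VxVM operations:
/usr/sbin/vxtask list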
Recover the disk group gendg and the logical volume genlv
/usr/sbin/vxrecover -b -g gendg -sE genlv
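Once the recovery finishes, vxinfo gives a quick health summary of the volume; it should report genlv as Started:
/usr/sbin/vxinfo -g gendg genlv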
Since this was RAID 0, the data on the volume was unrecoverable, so recreate the file system
/usr/sbin/mkfs -F vxfs /dev/vx/rdsk/gendg/genlv
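mkfs picks its own defaults here; if you care about the block size, you can set it explicitly. I believe the vxfs option looks like this, but check mkfs_vxfs(1M) for your version:
/usr/sbin/mkfs -F vxfs -o bsize=8192 /dev/vx/rdsk/gendg/genlv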
Check the file system
/usr/sbin/fsck -F vxfs /dev/vx/rdsk/gendg/genlv
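By default the vxfs fsck just replays the intent log, which is plenty for a brand-new file system; if you ever want the exhaustive structural check, it is:
/usr/sbin/fsck -F vxfs -o full /dev/vx/rdsk/gendg/genlv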
Mount the file system again
/sbin/mount /u0
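A quick df confirms the file system is mounted and sized the way you expect:
/usr/bin/df -k /u0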
I will try to post a similar experience using Sun’s Solaris Volume Manager next.