Replace a Bad Drive, or Replace With Larger Disks – Software RAID1 (mdadm)

Replace a Bad Drive in Software RAID1, or Replace Drives With Larger Drives in Software RAID1

Over the years, I’ve had to replace failed RAID1 drives, or swap drives so that I could use larger disks in an existing RAID1 array.  Since it seems I have to google the process every time, I figured I’d take a moment to jot it down.

1. My Configuration

In my example I have two hard drives in software RAID1 (mdadm), /dev/sda and /dev/sdb.  The partitions are /dev/sda1, /dev/sda3, /dev/sdb1 and /dev/sdb3.  They are laid out like this:

/dev/sda1 and /dev/sdb1 are RAID1 array /dev/md0
/dev/sda3 and /dev/sdb3 are RAID1 array /dev/md1
/dev/md0 is my /boot partition
/dev/md1 is my / partition

 

This example will cover replacing a single failed drive, and also replacing both drives with larger disks, while maintaining the RAID1 array.

For now, I’ll pretend that /dev/sdb has failed and we will replace it.

Note: If you are replacing /dev/sda as you follow along, you’ll want to be sure that you have Grub installed on /dev/sdb first.  See further down for details on how to do that.

 

2. Removing the Drive
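
If the failing disk still shows up as a member in /proc/mdstat, it is common practice to mark its partitions as failed and remove them from the arrays before powering down.  The sketch below only prints the mdadm commands so they can be reviewed before running them.  fail_remove_cmds is a hypothetical helper name, and it assumes sdX-style device names like the ones in this example.

```shell
# Hypothetical helper: for every array that contains a partition of the
# given disk, print (do not run) the mdadm command that marks the
# partition as failed and removes it from the array.
fail_remove_cmds() {
    awk -v disk="$1" '
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                member = $i
                sub(/\[.*$/, "", member)   # "sdb1[1]" or "sdb1[1](F)" -> "sdb1"
                if (member ~ ("^" disk "[0-9]+$"))
                    printf "mdadm --manage /dev/%s --fail /dev/%s --remove /dev/%s\n", $1, member, member
            }
        }
    ' "${2:-/proc/mdstat}"
}

# Usage on a live system (review the output, then run it as root):
#   fail_remove_cmds sdb
```

The second argument is optional and defaults to /proc/mdstat; passing a saved copy makes it easy to try the helper without touching a real array.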

Shut down your server

shutdown -h now

Remove the /dev/sdb hard drive from the server and replace it with a new drive.  Be sure the new drive is the same size or larger.  If you add a drive that is nominally the same size as the original, it is recommended that it be the same model as /dev/sda, or you may run into complications (not all same-size drives contain the same number of sectors).

Once the server has booted back up, you’ll want to copy the exact partition structure from /dev/sda to your new drive at /dev/sdb

* If your server did not boot and got stuck, it’s likely that grub was not installed on the remaining drive.

** If the replacement drive had an OS and this OS is booting instead of the one you want, go into your BIOS and change which drive is the primary boot drive.

 

3. Replicating Partition Structure

Now we’ll replicate the partition structure by copying it from the existing raid1 disk to the new disk you’ve just installed.

sfdisk -d /dev/sda | sfdisk /dev/sdb

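Make sure you get the order of these correct or you’ll destroy your data!  The partition table is read from the first disk (/dev/sda, the healthy one) and written to the second (/dev/sdb, the new one).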
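
One way to guard against mixing up the order is to check that the target disk is not an active array member before writing to it.  This is only a sketch: disk_in_array is a hypothetical helper name, and the pattern assumes sdX-style device names as used in this example.

```shell
# Hypothetical guard: succeeds (exit 0) if the named disk, or any of its
# partitions, is currently listed as a member of an md array.
disk_in_array() {
    grep -Eq "(^|[[:space:]])${1}[0-9]*\[" "${2:-/proc/mdstat}"
}

# Usage on a live system (as root), with sdb as the blank replacement disk:
#   if disk_in_array sdb; then
#       echo "sdb is still part of an array; not touching it" >&2
#   else
#       sfdisk -d /dev/sda | sfdisk /dev/sdb
#   fi
```

Since the healthy disk is still in the array and the new one is not, the check should fail for the disk you read from and pass for the disk you write to.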

To check that both drives have an identical partition structure, do:

fdisk -l

If they are different, then you’re likely using two disks that are the same nominal size but different models.  The new drive needs to contain at least as many sectors as the old drive.
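
Comparing fdisk -l output by eye is easy to get wrong, so the check can also be scripted by comparing the two sfdisk dumps with the disk names stripped out.  This is a minimal sketch assuming MBR partition tables and sdX-style names; normalize_dump and same_layout are hypothetical helper names.

```shell
# Hypothetical helper: print only the partition lines of an sfdisk dump,
# with the disk name removed so that two disks can be compared.
normalize_dump() {
    grep '^/dev/' "$1" | sed -E 's#^/dev/sd[a-z]+##'
}

# Succeeds (exit 0) when both dump files describe the same partition layout.
same_layout() {
    [ "$(normalize_dump "$1")" = "$(normalize_dump "$2")" ]
}

# Usage on a live system (as root):
#   sfdisk -d /dev/sda > /tmp/sda.dump
#   sfdisk -d /dev/sdb > /tmp/sdb.dump
#   same_layout /tmp/sda.dump /tmp/sdb.dump && echo "layouts match"
```

Note that two dumps with no partition lines at all also compare as equal, so it is worth glancing at the dump files as well.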

 

4. Adding the Disk to the RAID1 Array

Now we’ll add /dev/sdb into the raid array so that it can begin synchronizing.

mdadm --manage /dev/md0 --add /dev/sdb1

Do the same for the / partition

mdadm --manage /dev/md1 --add /dev/sdb3

Both arrays should now be syncing, though your md0 may already be complete if it’s a small partition.

cat /proc/mdstat

This will show you the current status of the synchronization process.  Sample output (the member numbers and the position of the _ depend on which disk is rebuilding):

Personalities : [raid1] 
md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]
      
md1 : active raid1 sda3[2] sdb3[1]
      73850688 blocks [2/1] [_U]
      [=>...................]  recovery =  5.6% (4180352/73850688) finish=281.6min speed=4120K/sec
      
unused devices: <none>

When that is complete, you should see [UU] in all arrays, like in the md0 example above.  An underscore in that field marks a member that is missing or still rebuilding.
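
Rather than re-running cat /proc/mdstat by hand, the [UU] check can be scripted.  The sketch below treats any underscore in the status field as "not yet in sync"; arrays_in_sync is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) when no array in the given mdstat
# text has a missing or rebuilding member (shown as "_" in the [UU] field).
arrays_in_sync() {
    ! grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# Usage on a live system:
#   until arrays_in_sync; do sleep 60; done
#   echo "all arrays are in sync"
```

The pattern only matches brackets made up of U and _ characters, so the [2/1] counter and the progress bar are ignored.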

 

5. Install Grub on Secondary Drive

Now that the sync is done, we should install the Grub bootloader on the new drive as a failover.  If your primary drive fails you want to be able to boot off the secondary, right?

# grub
grub> find /grub/stage1

You’ll likely see:

(hd0,0)
(hd1,0)

This means both /dev/sda1 and /dev/sdb1 contain the grub files, but the bootloader itself is only installed in the boot sector of /dev/sda right now.

grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

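This tells grub to treat /dev/sdb as (hd0), points it at the /boot partition on that disk (hd0,0), and then installs the bootloader onto /dev/sdb.  Mapping it to (hd0) matters because, if /dev/sda dies, the BIOS will present /dev/sdb as the first disk.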
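
The same grub shell commands can be fed in non-interactively, which is handy if you want to keep the steps in a script.  This is a sketch for GRUB Legacy only, assuming /boot is the first partition on the disk as in this example; grub_setup_script is a hypothetical helper name.

```shell
# Hypothetical helper: print the grub shell commands that install the
# bootloader on the given disk while mapping that disk to (hd0).
grub_setup_script() {
    printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\nquit\n' "$1"
}

# Usage on a live system (as root):
#   grub_setup_script /dev/sdb | grub --batch
```

Running grub_setup_script on its own just prints the commands, so you can check them before piping them into grub.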

You May be Done Already!  See below.

If you’re only replacing a failed drive, you can stop here.  You should be done.  However, if you’re replacing both drives and installing larger ones, continue on, but first be sure that your raid synchronization process is complete…

 

6. Removing the Second Old RAID1 Disk

Shut down the server so that you can remove the second disk.

shutdown -h now

Pull out the /dev/sda drive and replace it with your new larger drive.  You may need to go into your BIOS and set the secondary drive as the primary boot drive.  Since we’ve already completed syncing to /dev/sdb in the process above, it’s now the drive with the data we want.

If your server still doesn’t boot up after that, it’s likely grub wasn’t installed correctly on /dev/sdb.  Plug /dev/sda back in, boot it up, and be sure to follow the grub install steps mentioned above.

 

7. Replicate the Partition Structure

Ok, so your server is back online now.  We’ll need to match the partition structure from /dev/sdb onto /dev/sda

sfdisk -d /dev/sdb | sfdisk /dev/sda

Verify they’re identical using

fdisk -l

8. Add the Disk to the RAID1 Array

Now we’ll add /dev/sda1 to /dev/md0 and /dev/sda3 to /dev/md1

mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md1 --add /dev/sda3

To see the progress…

cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]
      
md1 : active raid1 sda3[2] sdb3[1]
      73850688 blocks [2/1] [_U]
      [=>...................]  recovery =  5.6% (4180352/73850688) finish=281.6min speed=4120K/sec
      
unused devices: <none>

Wait for the synchronization process to complete by checking /proc/mdstat every now and then.
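
To keep an eye on a long rebuild, the progress line can be pulled out of /proc/mdstat and printed as one short line per array.  This is a minimal sketch based on the output format shown above; rebuild_progress is a hypothetical helper name.

```shell
# Hypothetical helper: for each array that is rebuilding, print its name,
# the percentage done and the estimated time remaining.
rebuild_progress() {
    awk '
        $1 ~ /^md/ && $2 == ":" { array = $1 }   # remember the current array
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       { pct = $i; sub(/^=/, "", pct) }
                if ($i ~ /^finish=/) { eta = substr($i, 8) }
            }
            print array, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}

# Usage on a live system:
#   while sleep 60; do rebuild_progress; done
```

The helper prints nothing once no array is rebuilding.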

 

9. Installing Grub on the New Primary Disk

Now install Grub onto /dev/sda.  You may not need to do this, depending on how you got here, but it generally won’t hurt to do it if you’re not sure.

# grub
grub> find /grub/stage1

will display:

(hd0,0)
(hd1,0)

Again, this is because both disks contain the grub files, but the bootloader itself is currently only installed on /dev/sdb (hd1).

grub> root (hd0,0)
grub> setup (hd0)
grub> quit

 

We’re done!

You should now have two new drives in your raid1 array, both with the original data from your old drives.  In addition, both drives have grub installed, so should the primary disk fail, the secondary will still be bootable.
