Some time ago I created a RAIDZ pool with ZFS, which is roughly the equivalent of a traditional RAID5. It consisted of three disks, and as time went by, the resulting array (with a usable capacity of roughly two disks) was nearly full.
I bought a fourth disk of the same vendor and size as the first three and added it to the case. But the RAIDZ expansion feature is, at the time of this writing, still not finished and far from being included in Ubuntu releases.
So I came up with a plan to get a bigger array without losing any data and without backing up everything to an external system (I simply didn't have *that* many spare disks).
mdadm has a "--grow" feature, so the idea was to copy everything over to an mdadm RAID, but without any additional disks.
The plan looked like this (a condensed command sketch follows the list):
Take one of the three initial RAIDZ disks offline.
Use the offlined disk and the newly added fourth disk to create a new mdadm RAID-5, degraded from the start.
Copy everything from the RAIDZ to the mdadm array.
Destroy the remaining ZFS pool and add the remaining two disks to the mdadm array, growing, reshaping and resyncing all in one big step.
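Condensed into commands, the plan looks roughly like this. This is only a sketch: the device names /dev/sda through /dev/sdd and the mount point /mnt/new are placeholders, not my actual devices; the demo script below walks through the exact same steps on loop devices.
# Sketch only; device names and mount point are placeholders.
zpool offline origtank /dev/sdc                  # take one RAIDZ member offline
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd missing
mkfs.ext4 /dev/md0 && mount /dev/md0 /mnt/new    # new, degraded RAID-5 on old + new disk
rsync -avPH /origtank/ /mnt/new                  # copy the data
zpool destroy origtank                           # point of no return
mdadm --add /dev/md0 /dev/sda /dev/sdb           # hand the former ZFS disks to mdadm
mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/grow.backup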
The only real risk would be a drive failure in the degraded ZFS pool, or a failure of one of the first two disks in the new mdadm array; either would have been a complete disaster. So, fingers crossed, and everything worked fine. It took a few days (copying and resyncing are slow), but in the end it worked.
To make sure I wouldn't mess up, I created a demo script to test the individual steps and to see whether the idea worked at all.
It comes in two parts: the first one runs up to the start of the mdadm resync, the second one should be started AFTER the resync has finished. With the small demo files used here this happens very quickly; in reality it can take days.
#!/bin/bash
mkdir -p /test
cd /test   # separate working directory
# Clean up leftovers from a previous run: unmount and stop the array first, then detach the loop devices, then delete the image files.
umount /test/mnt
mdadm --stop /dev/md0
mdadm --remove /dev/md0
losetup -D
rm -f 1.disk
rm -f 2.disk
rm -f 3.disk
rm -f 4.disk
rm -f /test/backupfile.mdadm
echo "##### Creating images"
dd if=/dev/zero of=1.disk bs=1M count=256
dd if=/dev/zero of=2.disk bs=1M count=256
dd if=/dev/zero of=3.disk bs=1M count=256
dd if=/dev/zero of=4.disk bs=1M count=256
DISK1=$(losetup --find --show ./1.disk)
DISK2=$(losetup --find --show ./2.disk)
DISK3=$(losetup --find --show ./3.disk)
DISK4=$(losetup --find --show ./4.disk)
# The images get a GPT label and one partition for realism; note that the rest of the demo uses the whole loop devices, not the partitions.
parted -s ./1.disk mklabel gpt
parted -s ./2.disk mklabel gpt
parted -s ./3.disk mklabel gpt
parted -s ./4.disk mklabel gpt
parted -s -a optimal -- ./1.disk mkpart primary 0% 100%
parted -s -a optimal -- ./2.disk mkpart primary 0% 100%
parted -s -a optimal -- ./3.disk mkpart primary 0% 100%
parted -s -a optimal -- ./4.disk mkpart primary 0% 100%
echo "##### Starting zfs pool on disk 1, 2, 3"
zpool create origtank raidz ${DISK1} ${DISK2} ${DISK3}
echo "##### zpool status"
zpool status -v origtank
echo "##### Creating test file on /origtank"
dd if=/dev/zero of=/origtank/data bs=1M count=300
echo "##### Setting third disk as faulty"
zpool offline origtank ${DISK3}
echo "##### zpool status"
zpool status -v origtank
echo "##### ls -lA /origtank; df -h /origtank"
ls -lA /origtank; df -h /origtank
echo "##### Creating new md0 from disk3 and disk4"
#parted -s ./3.disk mklabel gpt
#parted -s ./4.disk mklabel gpt
#parted -s -a optimal -- ./3.disk mkpart primary 0% 100%
#parted -s -a optimal -- ./4.disk mkpart primary 0% 100%
wipefs -a ${DISK3}
wipefs -a ${DISK4}
# wipefs has just removed the GPT labels again, so setting partition flags would fail here; mdadm gets the whole loop devices anyway.
#parted -s ${DISK3} set 1 raid on
#parted -s ${DISK4} set 1 raid on
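# 'missing' deliberately creates the array degraded: the third slot stays empty until one of the remaining ZFS disks is added later.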
mdadm --create /dev/md0 -f --auto md --level=5 --raid-devices=3 ${DISK3} ${DISK4} missing
echo "##### mdstat"
cat /proc/mdstat
mdadm --detail /dev/md0
echo "## # ## Formatting /dev/md0"
sleep 2
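# ext4 on top of md0; it can be grown later with resize2fs once the array has been reshaped (see part two).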
mkfs.ext4 /dev/md0
echo "##### Mount md0"
mkdir /test/mnt
mount /dev/md0 /test/mnt
echo "##### ls -lA /test/mnt; df -h /test/mnt"
ls -lA /test/mnt; df -h /test/mnt
echo "## # ## Copy data"
sleep 2
# rsync --delete -avPH /origtank/ /test/mnt
rsync -avPH /origtank/ /test/mnt
echo "##### ls -lA /test/mnt; df -h /test/mnt"
ls -lA /test/mnt; df -h /test/mnt
echo "##### Creating NEW test file on /origtank"
dd if=/dev/zero of=/origtank/dataNEW bs=1M count=30
echo "## # ## Copy NEW data"
sleep 2
# rsync --delete -avPH /origtank/ /test/mnt
rsync -avPH /origtank/ /test/mnt
echo "##### ls -lA /test/mnt; df -h /test/mnt"
ls -lA /test/mnt; df -h /test/mnt
echo "## # ## destroying pool"
sleep 2
zpool destroy origtank
echo "## # ## Adding disks to md0"
sleep 2
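# Of the two added disks, one rebuilds the missing slot and the other becomes the spare used by the reshape from 3 to 4 devices; the backup file safeguards the reshape against interruption.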
mdadm --add /dev/md0 ${DISK1} ${DISK2}
mdadm --grow --raid-devices=4 /dev/md0 --backup-file=/test/backupfile.mdadm
cat /proc/mdstat
After the mdadm resync has finished successfully, you can resize the filesystem:
resize2fs /dev/md0
cat /proc/mdstat
This only takes a few minutes (even on very large disks), but be patient! You can watch the progress by looking at mdadm --detail.
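On the real array, where the reshape runs for hours or days, something like the following is handy for keeping an eye on it (just an example, not part of the demo script):
# Example only: periodically show the reshape/resync progress of the real array.
watch -n 60 'cat /proc/mdstat; mdadm --detail /dev/md0 | grep -i reshape'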
echo "##### mdstat"
cat /proc/mdstat
mdadm --detail /dev/md0
echo "##### ls -lA /test/mnt; df -h /test/mnt"
ls -lA /test/mnt; df -h /test/mnt