Please note:
Nothing is exactly the same as a physical disk pull.
A physical disk pull can trigger expander errors that an emulated pull will not reproduce.
Hardware failures are complex, so do not skip physical testing.
If your company keeps the servers in another location, ask to have servers next to you, or travel to that location and spend enough hands-on time with them.
The following commands are very handy for simulating a physical drive pull when you have no physical access to the server, or when you are working within a VM.
To delete a disk (Linux stops seeing it until the next rescan or reboot/power cycle):
echo "1" > /sys/block/${device_name}/device/delete
Set a disk offline:
echo "offline" > /sys/block/${device_name}/device/state
Set the disk back online:
echo "running" > /sys/block/${device_name}/device/state
Rescan all SCSI hosts:
for host in /sys/class/scsi_host/host*; do echo '- - -' > "$host/scan"; done
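The delete and rescan commands above can be wrapped in two small helpers, so that a simulated pull and the later "reinsert" are one call each. This is only a sketch: the function names and the SYSFS_ROOT variable are hypothetical (they are not part of any tool), and SYSFS_ROOT exists only so the helpers can be tried against a scratch directory before pointing them at the real /sys.

```shell
# SYSFS_ROOT is a hypothetical override for dry runs against a scratch
# directory; on a real system leave it unset so it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# pull_disk NAME: remove the disk (e.g. sdc) from the kernel's view,
# exactly like the "delete" command above.
pull_disk() {
    dev_dir="${SYSFS_ROOT}/block/$1/device"
    if [ ! -e "${dev_dir}/delete" ]; then
        echo "Device $1 not found" >&2
        return 1
    fi
    echo "1" > "${dev_dir}/delete"
}

# rescan_hosts: ask every SCSI host to rescan so deleted disks reappear.
rescan_hosts() {
    for host in "${SYSFS_ROOT}"/class/scsi_host/host*; do
        if [ -e "${host}/scan" ]; then
            echo '- - -' > "${host}/scan"
        fi
    done
    return 0
}

# Usage (as root): pull_disk sdc; sleep 60; rescan_hosts
```

After the rescan the disk may come back under a different device name, so check before reusing the old one.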
Disabling the port in the expander
This is closer to physically pulling the drive.
In order to use the commands, install the package smp_utils. This is now
installed on the 4602 and the 4U60.
The command to disable a port on the expander:
smp_phy_control --phy=${phy_number} --op=dis /dev/bsg/${expander_id}
You will need to know the phy number of the drive. There may be a better
way, but to get it I used:
smp_discover /dev/bsg/${expander_id}
Look for the sas_address of the drive in the output of the
smp_discover command. You may need to try all the expanders to find it.
You can get the sas_address of your drive with:
cat /sys/block/${device_name}/device/sas_address
To re-enable the port use:
smp_phy_control --phy=${phy_number} --op=lr /dev/bsg/${expander_id}
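The manual search for the phy number described above can be scripted with a small filter. This is a sketch under assumptions: find_phy is a hypothetical helper name, and it assumes that smp_discover prints one line per phy containing both the text "phy <number>" and the attached SAS address. Check that against the output of your smp_utils version before relying on it.

```shell
# find_phy SAS_ADDRESS: read smp_discover output on stdin and print the
# phy number of every line that mentions the given address.
# Assumption: each phy is reported on one line that contains "phy <n>"
# and the attached address without the "0x" prefix that sysfs prints.
find_phy() {
    awk -v addr="${1#0x}" '
        BEGIN { addr = tolower(addr) }
        index(tolower($0), addr) && match($0, /phy +[0-9]+/) {
            num = substr($0, RSTART, RLENGTH)
            sub(/phy +/, "", num)
            print num
        }'
}

# Usage (as root), trying one expander:
#   addr=$(cat /sys/block/${device_name}/device/sas_address)
#   smp_discover /dev/bsg/${expander_id} | find_phy "$addr"
```

If nothing is printed, the drive is not attached to that expander and the next one should be tried.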
Some handy scripts when working with ZFS
To kill one drive given its id (the device name may change between reboots):
TO_REMOVE="wwn-0x5000c500a6134007"
# Resolve the stable by-id link to the current kernel name (e.g. sdc).
# readlink -e prints nothing if the link does not exist.
DRIVE=$(readlink -e "/dev/disk/by-id/${TO_REMOVE}")
DRIVE=${DRIVE##*/}
if [[ -n "${DRIVE}" ]]; then
    echo "1" > "/sys/block/${DRIVE}/device/delete"
else
    echo "Drive not found"
fi
Loop to see the status of the pool:
while true; do zpool status carles-N58-C3-D16-P3-S1 | head --lines=20; sleep 5; done
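When the pull is part of a scripted test, it can be more convenient to wait for a given pool state than to watch the loop above. A minimal sketch, assuming the usual "state:" line of zpool status; pool_state and wait_for_state are hypothetical helper names, and the default of 60 tries is an arbitrary choice.

```shell
# pool_state: read `zpool status` output on stdin and print the value of
# the first "state:" line (for example ONLINE or DEGRADED).
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# wait_for_state POOL STATE [TRIES]: poll every 5 seconds until POOL
# reports STATE; return 1 if it never does within TRIES attempts.
wait_for_state() {
    tries="${3:-60}"
    while [ "$tries" -gt 0 ]; do
        if [ "$(zpool status "$1" | pool_state)" = "$2" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}

# Usage: wait_for_state carles-N58-C3-D16-P3-S1 DEGRADED && echo "Pool degraded"
```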