Ok, so I lent one of my Servers to two of my colleagues in the States, who needed to prepare some tests for a customer. I always try to be nice and to encourage sales across the organizations I help, so if they need a Server for a PoC and a demo to a customer, they know they can count on me.
It is important to remark that the Servers I was using had two motherboards, each with its own CPU and RAM, and Dual Port SAS drives. We had those Servers so we could implement High Availability. Dual Port SAS allows two different computers or IO controllers to access the same drive at the same time.
I work with Declustered RAID (DRAID) and ZFS.
The Server was a 4U90, that is, a 4U Server with 90 SAS3 spinning drives and 4 SSDs. The drives are Dual Ported, and the two Controllers (motherboard + CPU + RAM) have simultaneous access to the drives for HA.
After their tests my colleagues returned me the Server, and when I needed to use it I got a surprise: I tried to provision with ZFS and encountered problems, with not much in the logs. Please note I was using only one node (or controller); the other was not in use, but they asked me to keep its OS and data (on 2 x MD drives). I shut down node A after the Engineers in San Jose powered the Server off, so only my node was working.
I checked:
cat /proc/mdstat
And that was the thing: nine MD Arrays were there.
[root@4u90-B ~]# cat /proc/mdstat
Personalities :
md2 : inactive sdba1[9](S) sdag1[7](S) sdaf1[3](S)
11720629248 blocks super 1.2
md1 : inactive sdax1[7](S) sdad1[5](S) sdac1[1](S) sdae1[9](S)
12056071168 blocks super 1.2
md0 : inactive sdat1[1](S) sdav1[9](S) sdau1[5](S) sdab1[7](S) sdaa1[3](S)
19534382080 blocks super 1.2
md4 : inactive sdbf1[9](S) sdbe1[5](S) sdbd1[1](S) sdal1[7](S) sdak1[3](S)
19534382080 blocks super 1.2
md5 : inactive sdam1[1](S) sdan1[5](S) sdao1[9](S)
11720629248 blocks super 1.2
md8 : inactive sdcq1[7](S) sdz1[2](S)
7813752832 blocks super 1.2
md7 : inactive sdbm1[7](S) sdar1[1](S) sdy1[9](S) sdx1[5](S)
15627505664 blocks super 1.2
md3 : inactive sdaj1[9](S) sdai1[5](S) sdah1[1](S)
11720629248 blocks super 1.2
md6 : inactive sdaq1[7](S) sdap1[3](S) sdr1[8](S) sdp1[0](S)
15627505664 blocks super 1.2
Ok. So I stop the Arrays:
mdadm --stop /dev/md127
And then I zero the superblock:
mdadm --zero-superblock /dev/sdb1
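As this has to be repeated for every array and for every member partition, a small loop saves a lot of typing. This is just a sketch, assuming the array and partition names follow the /proc/mdstat output shown above:
# capture the member partitions before stopping the arrays (field layout as in the mdstat output above)
PARTS=$(awk '/^md[0-9]/ { for (i=4; i<=NF; i++) { d=$i; sub(/\[.*/, "", d); print "/dev/" d } }' /proc/mdstat)
# stop every array still listed in /proc/mdstat
for MD in $(awk '/^md[0-9]/ { print $1 }' /proc/mdstat); do mdadm --stop /dev/${MD}; done
# zero the md superblock of every former member partition
for PART in ${PARTS}; do mdadm --zero-superblock ${PART}; done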
After doing this for all of them I try to provision and… surprise! It does not work. /dev/md127 has respawned, like the monsters in the old Doom video game.
I check the mdmonitor service and even disable it:
systemctl disable mdmonitor
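One detail worth remembering: disable only affects the next boot, so the instance that is already running should be stopped as well (standard systemd behaviour, nothing specific to md):
# stop the running service; disable alone does not touch it
systemctl stop mdmonitor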
I repeat the process.
And /dev/md127 appears again, using another device.
At this point, just in case, I check the other controller, which should be powered off.
Ok, it was on. It had a different IP, so it was not answering my pings, but I still had access to the BMC/IPMI. After confirming with my colleagues that I could shut down that node (apparently they had not turned it on), I launch the poweroff command, repeat the process, and… same result!
I see that the poweroff command on the second Controller is doing a reboot, not a poweroff; it turns out to be a Firmware issue. So I access the Linux on that node from the management tool and launch the halt command, which finally makes it stop responding to ping.
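For reference, the power state of the other node can also be checked and forced off directly from its BMC. This is a generic sketch with ipmitool (which may or may not be the management tool available in your environment); the BMC address and credentials are placeholders:
# query the power state of the other controller through its BMC
ipmitool -I lanplus -H 10.0.0.2 -U ADMIN -P PASSWORD chassis power status
# force the chassis off from the BMC if the OS-level poweroff misbehaves
ipmitool -I lanplus -H 10.0.0.2 -U ADMIN -P PASSWORD chassis power off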
I repeat the process, and still the ghost md array appears there, and blocks me from doing my zpool create.
The /etc/mdadm.conf file did not exist (by default it is not created).
I try a more aggressive approach, selecting all the data drives by their size in /proc/partitions:
DRIVES=`cat /proc/partitions | grep 3907018584 | awk '{ print $4; }'`
for DRIVE in $DRIVES; do echo "Trying /dev/${DRIVE}1"; mdadm --examine /dev/${DRIVE}1; done
Ok. And destruction time:
for DRIVE in $DRIVES; do echo "Trying /dev/${DRIVE}"; wipefs -a -f /dev/${DRIVE}; done
for DRIVE in $DRIVES; do echo "Trying /dev/${DRIVE}1"; mdadm --zero-superblock /dev/${DRIVE}1; done
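A quick way to confirm that nothing was left behind is to run wipefs without options, which only lists the signatures it finds (no output means the device is clean):
# list any remaining RAID/filesystem signatures without erasing anything
for DRIVE in $DRIVES; do echo "Checking /dev/${DRIVE}"; wipefs /dev/${DRIVE}; done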
Apparently the system is clean, but I still cannot provision, and /dev/md127 keeps respawning all the time.
After googling and not finding anything about this problem, and with my colleagues having no clue about what was causing it, I just proceed with a simple solution, as I need the Server so my company can complete the tests in the next 24 hours.
So I create the file /etc/mdadm.conf with this content:
[root@draid-08 ~]# cat /etc/mdadm.conf
AUTO -all
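For context on why this works: AUTO -all tells mdadm not to auto-assemble any array, which stops the udev-triggered incremental assembly that was most likely recreating /dev/md127 every time a member device appeared. Creating the file is a one-liner; on distributions where the initramfs carries its own copy of mdadm.conf you may also want to rebuild it, although in my case a plain reboot was enough:
# disable all md auto-assembly
echo "AUTO -all" > /etc/mdadm.conf
# optional, distribution-dependent: rebuild the initramfs so early boot sees the same setting
dracut -f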
After that I rebooted the Server, the infamous /dev/md127 was no longer there, and I was able to provision.
I share the solution as it may help other people.
The most straightforward procedure would have been a clean reinstall of the OS, but this operation is very slow over a remotely mounted Virtual CD, so it was worth fixing it at the OS level, as it saved me a day of delay in my work.