
Split Brain when logged in user CWDed into ZFS volume #38

Open
rbicelli opened this issue Apr 17, 2021 · 5 comments

rbicelli commented Apr 17, 2021

Hi,
Consider this scenario:

  • a user (say, an admin user) is logged in and working with their CWD on a ZFS volume served by the cluster
  • a failover event is triggered (i.e. the secondary node leaves standby mode and tries to take over its resources)

I observed that a fence action is triggered.

The worst part is that the fence action doesn't work as expected: the volume stays mounted on both nodes, causing ZFS errors (and file corruption). I assume the SCSI reservations are somehow not being honored.

I triple-checked the configuration and it looks OK.

Since I'm planning to add sanoid/syncoid for snapshots and replica sends, I would like to avoid a split brain if a failover happens while a process on a node is still using the filesystem.

I think this behaviour is easily reproducible.

@rbicelli (Author)

The relevant portion of the log is below (sorry for the cuts, I was in a split-screened shell):

Apr 17 16:51:25 zsan02 crmd[3513]:  notice: Result of stop operation for vol1-ip on zsan02: 0 (ok)                                                                                    
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [ /usr/lib/ocf/resource.d/heartbeat/ZFS: line 35: [: : integer expression expected ]                             
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [ umount: /vol1: target is busy. ]                                                                               
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [         (In some cases useful info about processes that use ]                                                  
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [          the device is found by lsof(8) or fuser(1)) ]                                                         
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [ cannot unmount '/vol1': umount failed ]                                                                        
Apr 17 16:51:25 zsan02 lrmd[3510]:  notice: vol1_stop_0:34973:stderr [ /usr/lib/ocf/resource.d/heartbeat/ZFS: line 35: [: : integer expression expected ]                             
Apr 17 16:51:25 zsan02 crmd[3513]:  notice: Result of stop operation for vol1 on zsan02: 1 (unknown error)                                                                            
Apr 17 16:51:25 zsan02 crmd[3513]:  notice: zsan02-vol1_stop_0:63 [ /usr/lib/ocf/resource.d/heartbeat/ZFS: line 35: [: : integer expression expected\numount: /vol1: target is busy.\ 
n        (In some cases useful info about processes that use\n         the device is found by lsof(8) or fuser(1))\ncannot unmount '/vol1': umount failed\n/usr/lib/ocf/resource.d/he 
artbeat/ZFS: line 35: [: : integer expression expected\n ]                                                                                                                            
Apr 17 16:51:25 zsan02 stonith-ng[3509]:  notice: fence-vol1 can fence (reboot) zsan02: static-list                                                                                   
Apr 17 16:51:25 zsan02 stonith-ng[3509]:  notice: fence-vol2 can fence (reboot) zsan02: static-list                                                                                   
Apr 17 16:51:25 zsan02 stonith-ng[3509]:  notice: fence-vol3 can fence (reboot) zsan02: static-list                                                                                   
Apr 17 16:51:26 zsan02 stonith-ng[3509]:  notice: Operation 'reboot' targeting zsan02 on zsan01 for [email protected]: OK

It looks like that when something is using the filesystem locally, the resource agent is unable to stop the filesystem, fails, and triggers a fence event. The fencing then doesn't actually happen (I've configured iDRAC, but it doesn't power cycle the node when I trigger a fence), but that's another story.

The same behaviour occurs with a zfs send in progress.
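Before the stop operation fails you can see what is keeping the mountpoint busy; a quick check, for example (taking /vol1 from the log above, output format of these tools may vary):

# list processes using the mounted filesystem
fuser -vm /vol1
# same information via lsof
lsof /vol1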

To mitigate this issue I wrote a helper script, which I put in /usr/lib/ocf/lib/heartbeat/helpers/zfs-helper:

#!/bin/bash
# Pre-export script for a ZFS pool.
# Checks for processes using files in the zpool and kills them.
# Requires lsof, ps, awk, sed

zpool_pre_export () {

        # Forcibly terminate all PIDs using the zpool
        ZPOOL=$1
        # Exit gracefully anyway, for now
        RET=0

        # Kill anything holding files open under the pool mountpoint
        # (sed drops the lsof header line)
        lsof /$ZPOOL{*,/*} | awk '{print $2}' | sed -e "1d" | \
        while read PID
        do
                echo "Terminating PID $PID"
                kill -9 "$PID"
        done

        # Check if some blocking ZFS operations are running, such as
        # zfs send ..., excluding the grep and this script itself
        ps aux | grep "$ZPOOL" | grep -v grep | awk -v self=$$ '$2 != self {print $2}' | \
        while read PID
        do
                echo "Terminating PID $PID"
                kill -9 "$PID"
        done

        exit $RET
}

case $1 in
        pre-export)
                zpool_pre_export "$2"
                ;;
esac
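For manual testing, the script can be run like this (vol1 is just an example pool name here):

/usr/lib/ocf/lib/heartbeat/helpers/zfs-helper pre-export vol1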

@intentions

Wouldn't using the multihost protection prevent the second host from mounting the pool?

rbicelli commented Apr 17, 2021

Wouldn't using the multihost protection prevent the second host from mounting the pool?

I wasn't aware of this feature. I've enabled it and I'm testing it.
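For reference, this is roughly what I did to turn it on (vol1 as an example pool name; it assumes each node already has a unique, persistent hostid, which zgenhostid can generate):

# each node needs its own hostid so MMP can detect another importer
zgenhostid
# enable multihost (MMP) protection on the pool
zpool set multihost=on vol1
# verify the property
zpool get multihost vol1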

Nooby1 commented Nov 1, 2021

I have put it in /usr/lib/ocf/lib/heartbeat/zfs-helper.sh, as there is no helpers directory in RHEL8 and there are other scripts in that directory.

Does anything else have to be done for this on RHEL8?

@rbicelli (Author)

I don't remember since months have passed, but it's possible that I needed to create the required directory.
