Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic: excl->share in zfs_clone_range when block_cloning is enabled #16789

Closed
asomers opened this issue Nov 20, 2024 · 4 comments · Fixed by #16796 or #16797
Closed

panic: excl->share in zfs_clone_range when block_cloning is enabled #16789

asomers opened this issue Nov 20, 2024 · 4 comments · Fixed by #16796 or #16797
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@asomers
Copy link
Contributor

asomers commented Nov 20, 2024

System information

Type Version/Name
Distribution Name FreeBSD
Distribution Version 15.0-CURRENT
Kernel Version main-n273749-4b65481ac68a-dirty (git main branch as of today)
Architecture amd64
OpenZFS Version zfs-2.3.0-rc1-FreeBSD_g3a9fca901 zfs-kmod-2.3.99-64-FreeBSD_g1c9a4c8cb

Describe the problem you're observing

I can immediately reproduce a panic: excl->share on FreeBSD 15.0-CURRENT with witness enabled by using fsx to copy part of a file to another offset of the same file, with copy_file_range. Note that zpool create by default enables the block_cloning feature. If I disable that with zpool create -o feature@block_cloning=disabled then I cannot reproduce the crash.

Describe how to reproduce the problem

$ sudo pkg install -y devel/fsx
$ sudo zpool create testpool vtbd5
$ sudo zfs create testpool/fsx
$ sudo chmod 1777 /testpool/fsx
$ cd /testpool/fsx
$ cat > fsx.toml <<HERE
flen = 1048576
nomsyncafterwrite = false
nosizechecks = false
blockmode = false

[opsize]
max = 262144
min = 0
align = 1

[weights]
close_open = 1
read = 10
write = 10
mapread = 10
mapwrite = 10
invalidate = 1
truncate = 1
fsync = 1
fdatasync = 1
posix_fallocate = 0
punch_hole = 1
sendfile = 1
posix_fadvise = 1
copy_file_range = 1
HERE
$ fsx -f fsx.toml -v fsx.bin

Include any warning/errors/backtraces from the system logs

Stack trace:
(kgdb) #0  __curthread ()
    at /usr/home/somers/src/freebsd.org/src/sys/amd64/include/pcpu_aux.h:57
        td = <optimized out>
#1  doadump (textdump=textdump@entry=0)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_shutdown.c:404
        error = 0
        coredump = <optimized out>
#2  0xffffffff80a4feed in db_dump (dummy=<optimized out>, 
    dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:596
        error = <optimized out>
#3  0xffffffff80a4f50b in db_command (last_cmdp=<optimized out>, 
    cmd_table=<optimized out>, dopager=true)
    at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:508
        modif = '\000' <repeats 11 times>, "\200\000\376\377\377\000\240\000\206\377\377\377\377\000\000\000\000\000\000\000\000\024\000ޅ\377\377\377\377\270\237\030\266\000\376\377\377\b", '\000' <repeats 23 times>, "\270\237\030\266\000\376\377\377\\\016<\203\377\377\377\377", '\000' <repeats 24 times>, "@\267(\264\000\376\377\377"
        addr = -2093216164
        count = -1
        cmd = 0xffffffff85c1a938 <db_cmds+616>
        have_addr = <optimized out>
        t = <optimized out>
        result = <optimized out>
#4  0xffffffff80a4e5e0 in db_command_loop ()
    at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:555
No locals.
#5  0xffffffff80a604c2 in db_trap (type=type@entry=3, code=code@entry=0)
    at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_main.c:267
        jb = {{_jb = {0, -2196000688248, -2196000688048, 0, -2195968188488, 
              -2196000688240, 0, -2136603956, 0, -2050388976, -2056072656, 
              -2047184000}}}
        bkpt = false
        watchpt = false
        prev_jb = 0x0
        why = 0xffffffff855999e1 "panic"
#6  0xffffffff833c44c2 in kdb_trap (type=type@entry=3, code=0, 
    tf=tf@entry=0xfffffe00b428bc20)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/subr_kdb.c:790
        __pc = 0x0
        __pc = 0x0
        other_cpus = {__bits = {18446741877708864128, 18446741877708864608, 0, 
            18446741877741359112, 18446744071616731088, 0, 0, 0, 0, 0, 0, 0, 
            18, 3023517235, 18446744069414584320, 18446741877708864600}}
        be = 0xffffffff85c1b4d0 <ddb_dbbe>
        intr = 2
        did_stop_cpus = <optimized out>
        handled = <optimized out>
#7  0xffffffff84fdfad4 in trap (frame=0xfffffe00b428bc20)
    at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/trap.c:606
        __pc = 0x0
        __pc = 0x0
        __pc = 0x0
        ksi = {ksi_link = {tqe_next = 0xffffffff80a60bb6 <db_printf+630>, 
            tqe_prev = 0x0}, ksi_info = {si_signo = 93489948, 
            si_errno = 93489948, si_code = 93489948, si_pid = 93489948, 
            si_uid = 93489948, si_status = 93489948, si_addr = 0x5755560, 
            si_value = {sival_int = 91780296, sival_ptr = 0x57874c8057874c8, 
              sigval_int = 91780296, sigval_ptr = 0x57874c8057874c8}, 
            _reason = {_fault = {_trapno = -1148842708}, _timer = {
                _timerid = -1148842708, _overrun = 1122754903}, _mesgq = {
                _mqd = -1148842708}, _poll = {_band = 4822195592954776876}, 
              _capsicum = {_syscall = -1148842708}, __spare__ = {
                __spare1__ = 4822195592954776876, __spare2__ = {-1239904248, 
                  -512, -1272399104, -512, 0, 0, -1272399216}}}}, 
          ksi_flags = 0, ksi_sigq = 0xfffffe00b428bbb0}
        signo = -1
        ucode = -1239904256
        td = <optimized out>
        p = <optimized out>
        dr6 = <unavailable>
        type = 3
        addr = <optimized out>
        pf = <optimized out>
        i = <optimized out>
#8  <signal handler called>
No locals.
#9  kdb_enter (why=0xffffffff855999e1 "panic", msg=0xffffffff855999e1 "panic")
    at /usr/home/somers/src/freebsd.org/src/sys/kern/subr_kdb.c:556
No locals.
#10 0xffffffff83207d0b in vpanic (fmt=<optimized out>, 
    fmt@entry=0xffffffff8555f098 "excl->share", ap=ap@entry=0xfffffe00b428bf70)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_shutdown.c:967
        buf = "excl->share", '\000' <repeats 244 times>
        __pc = 0x0
        __pc = 0x0
        __pc = 0x0
        other_cpus = <optimized out>
        td = 0xfffffe00b125d740
        bootopt = <optimized out>
        newpanic = <optimized out>
#11 0xffffffff832076dd in panic (fmt=0xffffffff8555f098 "excl->share")
    at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_shutdown.c:892
        ap = {{gp_offset = 8, fp_offset = 48, 
            overflow_arg_area = 0xfffffe00b428bff0, 
            reg_save_area = 0xfffffe00b428bf40}}
#12 0xffffffff834cf988 in witness_checkorder (
    lock=lock@entry=0xfffffe00b8781e30, flags=flags@entry=1, 
    file=file@entry=0xffffffff89dbd04b "/usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h", line=line@entry=101, 
    interlock=interlock@entry=0x0)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/subr_witness.c:1183
        __pc = 0x0
        w = 0xfffff8043fd86a80
        class = 0xffffffff85cd2ac8 <lock_class_lockmgr>
        td = <optimized out>
        lock_list = 0xffffffff86052938 <w_locklistdata+198696>
        lock1 = <optimized out>
        iclass = <optimized out>
        plock = <optimized out>
        w1 = <optimized out>
        lle = <optimized out>
        j = <optimized out>
        i = <optimized out>
        lock2 = <optimized out>
#13 0xffffffff830f999c in lockmgr_slock (lk=0xfffffe00b8781e30, flags=2098176, 
    file=0xffffffff89dbd04b "/usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h", line=101)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_lock.c:1224
        x = 0
#14 0xffffffff85481897 in VOP_LOCK1_APV (
    vop=vop@entry=0xffffffff89e60190 <zfs_vnodeops>, 
    a=a@entry=0xfffffe00b428c3e0) at vnode_if.c:2241
        rc = <optimized out>
#15 0xffffffff83827f44 in VOP_LOCK1 (vp=0xfffffe00b8781dc0, flags=2098176, 
    file=0xffffffff89dbd04b "/usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h", line=101) at ./vnode_if.h:1118
        a = {a_gen = {a_desc = 0xffffffff85d7e570 <vop_lock1_desc>}, 
          a_vp = 0xfffffe00b8781dc0, a_flags = 2098176, 
          a_file = 0xffffffff89dbd04b "/usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h", a_line = 101}
#16 _vn_lock (vp=vp@entry=0xfffffe00b8781dc0, flags=flags@entry=2098176, 
    file=0xffffffff89dbd04b "/usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h", line=line@entry=101)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/vfs_vnops.c:1829
        error = <optimized out>
#17 0xffffffff89a6043b in vn_flush_cached_data (
    vp=vp@entry=0xfffffe00b8781dc0, sync=1)
    at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/include/os/freebsd/spl/sys/vnode.h:101
        flags = 1
        flags = <optimized out>
#18 0xffffffff89a5d5b4 in zfs_clone_range (inzp=inzp@entry=0xfffffe00d549aae0, 
    inoffp=inoffp@entry=0xfffffe00b428ccc0, 
    outzp=outzp@entry=0xfffffe00d549aae0, 
    outoffp=outoffp@entry=0xfffffe00b428ccc8, 
    lenp=lenp@entry=0xfffffe00b428c8b8, cr=cr@entry=0xfffffe003e1b4200)
    at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1472
        bulk = {{sa_data = 0xfffffe00b428c610, 
            sa_data_func = 0xffffffff834d8c5e <witness_unlock+3838>, 
            sa_length = 50720, sa_attr = 46120, 
            sa_addr = 0xffffffff86052984 <w_locklistdata+198772>, 
            sa_buftype = 255, sa_size = 0}, {sa_data = 0xfffffe00b8781e38, 
            sa_data_func = 0x0, sa_length = 21210, sa_attr = 34145, 
            sa_addr = 0xfffffe00b428c680, sa_buftype = 17785, 
            sa_size = 33613}, {sa_data = 0xffffffff00000009, 
            sa_data_func = 0x0, sa_length = 0, sa_attr = 0, 
            sa_addr = 0xffffffff85cd2ac8 <lock_class_lockmgr>, 
            sa_buftype = 10552, sa_size = 34309}}
        mtime = {18446744071671194496, 18446741877741359112}
        ctime = {18446741875728397312, 0}
        nbps = 18446741877658343232
        clear_setid_bits_txg = 0
        count = 0
        last_synced_txg = 0
        inoff = 416811
        outoff = 461290
        len = 44479
        done = 0
        inzfsvfs = 0xfffffe00bdd7f000
        outzfsvfs = 0xfffffe00bdd7f000
        error = <optimized out>
        inos = 0xfffffe00bdc25000
        outos = 0xfffffe00bdc25000
        inlr = <optimized out>
        outlr = <optimized out>
        zilog = <optimized out>
        maxblocks = <optimized out>
        uid = <optimized out>
        gid = <optimized out>
        projid = <optimized out>
        bps = <optimized out>
        size = <optimized out>
        tx = <optimized out>
        db = <optimized out>
        outsize = <optimized out>
        inblksz = <optimized out>
        __sdt_probe36 = <optimized out>
        __sdt_probe37 = <optimized out>
        __sdt_probe38 = <optimized out>
#19 0xffffffff891ef5a0 in zfs_freebsd_copy_file_range (
    ap=ap@entry=0xfffffe00b428c9e8)
    at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6179
        mp = 0xfffffe00a8b62100
        len = 44479
        invp = 0xfffffe00b8781dc0
        outvp = 0xfffffe00b8781dc0
        outzfsvfs = <optimized out>
        error = <optimized out>
#20 0xffffffff85492627 in VOP_COPY_FILE_RANGE_APV (
    vop=vop@entry=0xffffffff89e60190 <zfs_vnodeops>, 
    a=a@entry=0xfffffe00b428c9e8) at vnode_if.c:4471
        rc = <optimized out>
#21 0xffffffff8383b16e in VOP_COPY_FILE_RANGE (invp=<optimized out>, 
    inoffp=0xfffffe00b428ccc0, outvp=0xfffffe00b8781dc0, 
    outoffp=0xfffffe00b428ccc8, lenp=0xfffffe00b428cba0, flags=0, 
    incred=<optimized out>, outcred=<optimized out>, fsizetd=<optimized out>)
    at ./vnode_if.h:2419
        a = {a_gen = {a_desc = 0xffffffff85d831c0 <vop_copy_file_range_desc>}, 
          a_invp = 0xfffffe00b8781dc0, a_inoffp = 0xfffffe00b428ccc0, 
          a_outvp = 0xfffffe00b8781dc0, a_outoffp = 0xfffffe00b428ccc8, 
          a_lenp = 0xfffffe00b428cba0, a_flags = 0, 
          a_incred = 0xfffffe003e1b4200, a_outcred = 0xfffffe003e1b4200, 
          a_fsizetd = 0xfffffe00b125d740}
#22 vn_copy_file_range (invp=invp@entry=0xfffffe00b8781dc0, 
    inoffp=inoffp@entry=0xfffffe00b428ccc0, 
    outvp=outvp@entry=0xfffffe00b8781dc0, 
    outoffp=outoffp@entry=0xfffffe00b428ccc8, 
    lenp=lenp@entry=0xfffffe00b428cba0, flags=flags@entry=0, 
    incred=0xfffffe003e1b4200, outcred=0xfffffe003e1b4200, 
    fsize_td=0xfffffe00b125d740)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/vfs_vnops.c:3141
        invpl = 0xfffffe00b8781dc0
        outvpl = 0xfffffe00b8781dc0
        len = <optimized out>
        error = <optimized out>
        uval = <optimized out>
        inmp = 0xfffffe00a8b62100
        outmp = 0xfffffe00a8b62100
#23 0xffffffff8381b113 in kern_copy_file_range (
    td=td@entry=0xfffffe00b125d740, infd=infd@entry=3, inoffp=<optimized out>, 
    inoffp@entry=0xfffffe00b428ccc0, outfd=outfd@entry=3, 
    outoffp=outoffp@entry=0xfffffe00b428ccc8, len=<optimized out>, 
    len@entry=44479, flags=0)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/vfs_syscalls.c:5004
        infp = 0xfffffe00be2a5640
        outfp = 0xfffffe00be2a5640
        retlen = 44479
        rl_wcookie = 0xfffffe00af7c1240
        rl_rcookie = 0xfffffe00af7c0d80
        savinoff = 416811
        error = <optimized out>
        savoutoff = 461290
        invp = 0xfffffe00b8781dc0
        outvp = 0xfffffe00b8781dc0
#24 0xffffffff8381bfd1 in sys_copy_file_range (td=td@entry=0xfffffe00b125d740, 
    uap=uap@entry=0xfffffe00b125db40)
    at /usr/home/somers/src/freebsd.org/src/sys/kern/vfs_syscalls.c:5042
        inoff = 416811
        outoff = 461290
        outoffp = 0xfffffe00b428ccc8
        inoffp = 0xfffffe00b428ccc0
        error = <optimized out>
#25 0xffffffff84fe7aaa in syscallenter (td=0xfffffe00b125d740)
    at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/../../kern/subr_syscall.c:161
        se = 0xffffffff85cca640 <sysent+18208>
        p = 0xfffffe00d4686020
        sa = 0xfffffe00b125db30
        error = <optimized out>
        sy_thr_static = <optimized out>
        traced = <optimized out>
        _audit_entered = <optimized out>
#26 amd64_syscall (td=0xfffffe00b125d740, traced=0)
    at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/trap.c:1192
        ksi = {ksi_link = {tqe_next = 0xfffffe00d4686160, 
            tqe_prev = 0xfffffe00b125d748}, ksi_info = {
            si_signo = -2092083493, si_errno = -1, si_code = 8, si_pid = 0, 
            si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {
              sival_int = 0, sival_ptr = 0x0, sigval_int = 0, 
              sigval_ptr = 0x0}, _reason = {_fault = {_trapno = -1148842708}, 
              _timer = {_timerid = -1148842708, _overrun = 1122754903}, 
              _mesgq = {_mqd = -1148842708}, _poll = {
                _band = 4822195592954776876}, _capsicum = {
                _syscall = -1148842708}, __spare__ = {
                __spare1__ = 4822195592954776876, __spare2__ = {0, 0, 0, -1, 
                  0, -512, -1239904248}}}}, ksi_flags = 1069188108, 
          ksi_sigq = 0xfffffe00b428ce60}
#27 <signal handler called>
No locals.
#28 0x00000076404d133a in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x763d1f8198

Links

Downstream bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282878

@asomers asomers added the Type: Defect Incorrect behavior (e.g. crash, hang) label Nov 20, 2024
@amotin
Copy link
Member

amotin commented Nov 20, 2024

As I can see, it is a lock recursion on a source of copy_file_range(). As I can see the first lock is taken by zfs_freebsd_copy_file_range(), calling zfs_clone_range(), while the second lock is taken by zn_flush_cached_data() called by zfs_clone_range(). Seems we need some reorganization there. The problem should happen only if mmap() and copy_file_range() are mixed on the same file.

@asomers
Copy link
Contributor Author

asomers commented Nov 20, 2024

The problem should happen only if mmap() and copy_file_range() are mixed on the same file.

Yes, fsx does that. It mixes ordinary writes, mmap writes, and copy_file_range writes on the same file.

@amotin
Copy link
Member

amotin commented Nov 21, 2024

It seems zn_flush_cached_data() takes vnode lock since it's other caller zfs_ioctl() on FreeBSD unlike Linux does not lock it. I think proper solution would be to lock it there instead, though it is not my strongest area. Meanwhile from another side it seems to help to change zfs_freebsd_copy_file_range() from LK_EXCLUSIVE to LK_SHARED, which should be good by itself.

amotin added a commit to amotin/zfs that referenced this issue Nov 21, 2024
Previously vnode was not locked there, unlike Linux.  It required
locking it in vn_flush_cached_data(), which recursed on the lock
if called from zfs_clone_range(), having the vnode locked.

Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes	openzfs#16789
amotin added a commit to amotin/zfs that referenced this issue Nov 22, 2024
Linux locks copy_file_range() source as shared.  FreeBSD was doing
it also, but then was changed to exclusive, partially because KPI
of that time was doing so, and partially seems out of caution.
Considering zfs_clone_range() uses range locks on both source and
destination, neither should require exclusive vnode locks. But one
step at a time, just sync it with Linux for now.

Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes	openzfs#16789
@amotin
Copy link
Member

amotin commented Nov 22, 2024

@asomers I kind of fixed it twice, so would be nice if you could test and review #16796 and #16797 .

behlendorf pushed a commit that referenced this issue Nov 23, 2024
Linux locks copy_file_range() source as shared.  FreeBSD was doing
it also, but then was changed to exclusive, partially because KPI
of that time was doing so, and partially seems out of caution.
Considering zfs_clone_range() uses range locks on both source and
destination, neither should require exclusive vnode locks. But one
step at a time, just sync it with Linux for now.

Reviewed-by: Alan Somers <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes #16789
Closes #16797
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Nov 23, 2024
Previously vnode was not locked there, unlike Linux.  It required
locking it in vn_flush_cached_data(), which recursed on the lock
if called from zfs_clone_range(), having the vnode locked.

Reviewed-by: Alan Somers <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#16789
Closes openzfs#16796
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Nov 23, 2024
Linux locks copy_file_range() source as shared.  FreeBSD was doing
it also, but then was changed to exclusive, partially because KPI
of that time was doing so, and partially seems out of caution.
Considering zfs_clone_range() uses range locks on both source and
destination, neither should require exclusive vnode locks. But one
step at a time, just sync it with Linux for now.

Reviewed-by: Alan Somers <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#16789
Closes openzfs#16797
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
2 participants