The following reply was made to PR kern/125149; it has been noted by GNATS.

From: Jaakko Heinonen
To: Volker Werth
Cc: Weldon Godfrey , bug-followup@FreeBSD.org
Subject: Re: kern/125149: [zfs][nfs] changing into .zfs dir from nfs client
causes endless panic loop
Date: Tue, 7 Oct 2008 18:36:30 +0300

Hi,

On 2008-10-02, Volker Werth wrote:
> > #8 0xffffffff804f06fa in vput (vp=0x0) at atomic.h:142
> > #9 0xffffffff8060670d in nfsrv_readdirplus (nfsd=0xffffff000584f100,
> > slp=0xffffff0005725900,
> > td=0xffffff00059a0340, mrq=0xffffffffdf761af0) at
> > /usr/src/sys/nfsserver/nfs_serv.c:3613
> > #10 0xffffffff80615a5d in nfssvc (td=Variable "td" is not available.
> > ) at /usr/src/sys/nfsserver/nfs_syscalls.c:461
> > #11 0xffffffff8072f377 in syscall (frame=0xffffffffdf761c70) at
> > /usr/src/sys/amd64/amd64/trap.c:852
> > #12 0xffffffff807158bb in Xfast_syscall () at
> > /usr/src/sys/amd64/amd64/exception.S:290
> > #13 0x000000080068746c in ?? ()
> > Previous frame inner to this frame (corrupt stack?)

>
> I think the problem is the NULL pointer to vput. A maintainer needs to
> check how nvp can get a NULL pointer (judging by assuming my fresh
> codebase is not too different from yours).


The bug is reproducible with nfs clients using readdirplus. FreeBSD
client doesn't use readdirplus by default but you can enable it with -l
mount option. Here are steps to reproduce the panic with FreeBSD nfs
client:

- nfs export a zfs file system
- on client mount the file system with -l mount option and list the
zfs control directory
# mount_nfs -l x.x.x.x:/tank /mnt
# ls /mnt/.zfs

I see two bugs here:

1) nfsrv_readdirplus() doesn't check VFS_VGET() error status properly.
It only checks for EOPNOTSUPP but other errors are ignored. This is the
final reason for the panic and in theory it could happen for other
file systems too. In this case VFS_VGET() returns EINVAL and results
NULL nvp.
2) zfs VFS_VGET() returns EINVAL for .zfs control directory entries.
Looking at zfs_vget() it tries find corresponding znode to fulfill
the request. However control directory entries don't have backing
znodes.

Here is a patch which fixes 1). The patch prevents system from panicing
but a fix for 2) is needed to make readdirplus work with .zfs directory.

%%%
Index: sys/nfsserver/nfs_serv.c
================================================== =================
--- sys/nfsserver/nfs_serv.c (revision 183511)
+++ sys/nfsserver/nfs_serv.c (working copy)
@@ -3597,9 +3597,12 @@ again:
* Probe one of the directory entries to see if the filesystem
* supports VGET.
*/
- if (VFS_VGET(vp->v_mount, dp->d_fileno, LK_EXCLUSIVE, &nvp) ==
- EOPNOTSUPP) {
- error = NFSERR_NOTSUPP;
+ error = VFS_VGET(vp->v_mount, dp->d_fileno, LK_EXCLUSIVE, &nvp);
+ if (error) {
+ if (error == EOPNOTSUPP)
+ error = NFSERR_NOTSUPP;
+ else
+ error = NFSERR_SERVERFAULT;
vrele(vp);
vp = NULL;
free((caddr_t)cookies, M_TEMP);
%%%

And here's an attempt to add support for .zfs control directory entries
(bug 2)) in zfs_vget(). The patch is very experimental and it only works
for snapshots which are already active (mounted).

%%%
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
================================================== =================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c (revision 183587)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c (working copy)
@@ -759,9 +759,10 @@ zfs_vget(vfs_t *vfsp, ino_t ino, int fla
VN_RELE(ZTOV(zp));
err = EINVAL;
}
- if (err != 0)
- *vpp = NULL;
- else {
+ if (err != 0) {
+ /* try .zfs control directory */
+ err = zfsctl_vget(vfsp, ino, flags, vpp);
+ } else {
*vpp = ZTOV(zp);
vn_lock(*vpp, flags);
}
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c
================================================== =================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c (revision 183587)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c (working copy)
@@ -1047,6 +1047,63 @@ zfsctl_lookup_objset(vfs_t *vfsp, uint64
return (error);
}

+int
+zfsctl_vget(vfs_t *vfsp, uint64_t nodeid, int flags, vnode_t **vpp)
+{
+ zfsvfs_t *zfsvfs = vfsp->vfs_data;
+ vnode_t *dvp, *vp;
+ zfsctl_snapdir_t *sdp;
+ zfsctl_node_t *zcp;
+ zfs_snapentry_t *sep;
+ int error;
+
+ *vpp = NULL;
+
+ ASSERT(zfsvfs->z_ctldir != NULL);
+ error = zfsctl_root_lookup(zfsvfs->z_ctldir, "snapshot", &dvp,
+ NULL, 0, NULL, kcred);
+ if (error != 0)
+ return (error);
+
+ if (nodeid == ZFSCTL_INO_ROOT || nodeid == ZFSCTL_INO_SNAPDIR) {
+ if (nodeid == ZFSCTL_INO_SNAPDIR)
+ *vpp = dvp;
+ else {
+ VN_RELE(dvp);
+ *vpp = zfsvfs->z_ctldir;
+ VN_HOLD(*vpp);
+ }
+ /* XXX: LK_RETRY? */
+ vn_lock(*vpp, flags | LK_RETRY);
+ return (0);
+ }
+
+ sdp = dvp->v_data;
+
+ mutex_enter(&sdp->sd_lock);
+ sep = avl_first(&sdp->sd_snaps);
+ while (sep != NULL) {
+ vp = sep->se_root;
+ zcp = vp->v_data;
+ if (zcp->zc_id == nodeid)
+ break;
+
+ sep = AVL_NEXT(&sdp->sd_snaps, sep);
+ }
+
+ if (sep != NULL) {
+ VN_HOLD(vp);
+ *vpp = vp;
+ vn_lock(*vpp, flags);
+ } else
+ error = EINVAL;
+
+ mutex_exit(&sdp->sd_lock);
+
+ VN_RELE(dvp);
+
+ return (error);
+}
/*
* Unmount any snapshots for the given filesystem. This is called from
* zfs_umount() - if we have a ctldir, then go through and unmount all the
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h
================================================== =================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h (revision 183587)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ctldir.h (working copy)
@@ -60,6 +60,7 @@ int zfsctl_root_lookup(vnode_t *dvp, cha
int flags, vnode_t *rdir, cred_t *cr);

int zfsctl_lookup_objset(vfs_t *vfsp, uint64_t objsetid, zfsvfs_t **zfsvfsp);
+int zfsctl_vget(vfs_t *vfsp, uint64_t nodeid, int flags, vnode_t **vpp);

#define ZFSCTL_INO_ROOT 0x1
#define ZFSCTL_INO_SNAPDIR 0x2
%%%

--
Jaakko
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"