[patch 00/13] vfs: add helpers to check r/o bind mounts - Kernel

This is a discussion on [patch 00/13] vfs: add helpers to check r/o bind mounts - Kernel ; On Thu, Apr 24, 2008 at 04:58:00PM +0200, Miklos Szeredi wrote: > Since all the vfs_* functions will become static with path_* being the > only caller, the compiler will be happy to get rid of that stack frame > ...

+ Reply to Thread
Page 2 of 3 FirstFirst 1 2 3 LastLast
Results 21 to 40 of 49

Thread: [patch 00/13] vfs: add helpers to check r/o bind mounts

  1. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 04:58:00PM +0200, Miklos Szeredi wrote:

    > Since all the vfs_* functions will become static with path_* being the
    > only caller, the compiler will be happy to get rid of that stack frame
    > too.
    >
    > What is left is the guarantee, that the race-free r/o remounts will
    > always work and some obscure caller didn't forget to surround it with
    > the r/o checks.
    >
    > I think it's definitely worth it.


    It would be, if we had only single vfs_...() as critical sections. We
    do not - see previous mail or read the ****ing tree, already.

    We don't even have many callers of each, and with a few we do it's not
    obvious that we want to go through vfsmounts (and vfsmount-based checks)
    in all of them. So no, I don't buy your argument. Sorry.

    I'm not even convinced that they are useful as helpers, at least until
    we'd decided what to do with checks in nfsd. Until then we do, as
    far as I'm concerned, one place where they would definitely DTRT - fs/namei.c.
    And I want more than one caller before merging those, let alone removing
    the interface that doesn't require checks to be vfsmount-based for all users.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    > > > Oh, for **** sake... grep and ye shall see. Right next to setattr we
    > > > have nfsd4_set_nfs4_acl(), with pair of set_nfsv4_acl_one(). I'd rather
    > > > have those two covered by a single will/wont range, TYVM.
    > > >
    > > > nfsd_create() will happily do vfs_mkdir() and nfsd_create_setattr(). Ditto.

    > >
    > > Please read the patches? I've left exactly these
    > > mnt_want_write/drop_write() calls in place, and removed all the
    > > others.

    >
    > You've left _what_ in place? nfsd4_set_nfs4_acl() currently doesn't have
    > that single range - fh_verify() + set_nvfs4_acl_one() + set_nfsv4_acl_one().
    > And AFAICS you do nothing of that kind there.
    >


    Yes I did. See only caller of nfsd4_set_nfs4_acl() in nfs4proc.c.

    Miklos
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 05:37:39PM +0200, Miklos Szeredi wrote:
    > > > What is left is the guarantee, that the race-free r/o remounts will

    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
    > > > always work and some obscure caller didn't forget to surround it with

    ^^^^^^^^^^^^^^^^^

    > Why are those so important? Yes, if we have multiple vfs_() calls,
    > surround them with an extra want/drop pair.


    Which leaves you with the same need to audit all these suckers anyway.

    I'm in principle fine with having such helper functions, *IF* they are
    not sold as providing all protection one needs, *IF* you are not expecting
    to be able to fold all areas down into them and *IF* original ones are
    left intact.

    Modulo the like path_rename(), BTW - that one is just plain ugly API.

    > > We don't even have many callers of each, and with a few we do it's not
    > > obvious that we want to go through vfsmounts (and vfsmount-based checks)
    > > in all of them. So no, I don't buy your argument. Sorry.
    > >
    > > I'm not even convinced that they are useful as helpers, at least until
    > > we'd decided what to do with checks in nfsd. Until then we do, as
    > > far as I'm concerned, one place where they would definitely DTRT - fs/namei.c.
    > > And I want more than one caller before merging those,

    >
    > unix_bind() -> vfs_mknod()
    > sys_mq_unlink() -> vfs_unlink()
    > open.c (several) -> notify_change()
    > *setxattr() -> vfs_setxattr()
    > *removexattr() -> vfs_removexattr()


    OK.

    > > let alone removing the interface that doesn't require checks to be
    > > vfsmount-based for all users.

    >
    > What users? There are paractically _no_ other users. The ones that
    > there are (like reiserfs) should not be using them, and there are
    > already some patches cleaning that mess up.


    OK, explain me, in small words, WTF should something that wants to do
    operations on filesystem tree have a vfsmount. Slowly. And "r/o
    bind loses value if it can be bypassed" is a hogwash - fs methods are
    still there, so it *can* be bypassed just fine, thank you very much.
    It's really up to caller. "But they won't be able to do open()" also
    doesn't fly - again, it's up to whoever writes particular piece of code.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    > > > > What is left is the guarantee, that the race-free r/o remounts will
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
    > > > > always work and some obscure caller didn't forget to surround it with

    > ^^^^^^^^^^^^^^^^^
    >
    > > Why are those so important? Yes, if we have multiple vfs_() calls,
    > > surround them with an extra want/drop pair.

    >
    > Which leaves you with the same need to audit all these suckers anyway.


    Not really. Missing such calls would just make the *caller* buggy
    (i.e. racy with remount r/o), but it would not make the *filesystem*
    buggy. Big difference.

    > I'm in principle fine with having such helper functions, *IF* they are
    > not sold as providing all protection one needs,


    I'm not selling them as that.

    > *IF* you are not expecting
    > to be able to fold all areas down into them and *IF* original ones are
    > left intact.


    Left intact for whom, specifically? Another question you've managed
    to avoid answering.

    > Modulo the like path_rename(), BTW - that one is just plain ugly API.


    I'm all open to improvements.

    > > > let alone removing the interface that doesn't require checks to be
    > > > vfsmount-based for all users.

    > >
    > > What users? There are paractically _no_ other users. The ones that
    > > there are (like reiserfs) should not be using them, and there are
    > > already some patches cleaning that mess up.

    >
    > OK, explain me, in small words, WTF should something that wants to do
    > operations on filesystem tree have a vfsmount. Slowly. And "r/o
    > bind loses value if it can be bypassed" is a hogwash - fs methods are
    > still there, so it *can* be bypassed just fine, thank you very much.


    And we know what to do with such users.

    > It's really up to caller. "But they won't be able to do open()" also
    > doesn't fly - again, it's up to whoever writes particular piece of code.


    I understand your theory. But it has zero practical significance.

    IOW it doesn't matter that someone _may_ want to access the filesystem
    without a vfsmount, if that someone doesn't exist.

    Miklos
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424113950.818688067@szeredi.hu>, Miklos Szeredi writes:
    > Patches 1-3 are cleanups/bugfixes preceding the main series. The
    > description is in 4/13. This is against latest git.
    >
    >
    >
    > R/O bind mounts patches have been reviewed numerous times. Still a
    > relatively large number of places remained where checking for R/O
    > mounts was missed.
    >
    > Then I did this series, which basically guarantees, that that cannot
    > happen. Al rejected it, and rather fixed some of the remaining
    > places. He still missed several, which sort of proves my point.
    >
    > I think it's totally pointless to continue trying to fix the symptoms
    > instead of getting at the root of the problem.
    >
    > I know that VFS interfaces are a sensitive question, but it would be
    > nice it we could have some sanity back in this discussion.


    And I'll reiterate my desire (see http://lkml.org/lkml/2008/4/8/394) for
    stackable file systems to know as little as possible about vfsmounts (or
    their derivatives ala struct path). If I have to call helpers which require
    vfsmounts, but similar objects don't get passed to me somehow, then I have
    to maintain vfsmounts on my own.

    Sure, unionfs already does some of it now (and some of it can be recoded
    more cleanly), but I'd like to get rid of much of it if I could -- these new
    path_* helpers make me have to maintain vfsmounts even moreso than before.

    >
    >
    > Thanks,
    > Miklos


    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    > Sure, unionfs already does some of it now (and some of it can be recoded
    > more cleanly), but I'd like to get rid of much of it if I could -- these new
    > path_* helpers make me have to maintain vfsmounts even moreso than before.


    With the words of Al: "that is hogwash". Stacks either maintain
    vfsmounts or they don't. There's no middle ground.

    Miklos
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424124245.GC15214@ZenIV.linux.org.uk>, Al Viro writes:
    > On Thu, Apr 24, 2008 at 01:39:50PM +0200, Miklos Szeredi wrote:
    > > Then I did this series, which basically guarantees, that that cannot
    > > happen. Al rejected it, and rather fixed some of the remaining
    > > places. He still missed several, which sort of proves my point.

    >
    > Which ones have I missed?
    >
    > > I think it's totally pointless to continue trying to fix the symptoms
    > > instead of getting at the root of the problem.
    > >
    > > I know that VFS interfaces are a sensitive question, but it would be
    > > nice it we could have some sanity back in this discussion.

    >
    > Yes, it would. How about that, for starters:
    >
    > path_create() et.al. are *wrong* for nfsd; if nothing else, I'm not at
    > all convinced that even apparmour wants export path + relative there
    > _and_ r/o vs. r/w is decision that doesn't clearly map to ex_mnt flags.
    >
    > Moreover, it's not at all obvious that we want to drop write access as
    > soon as vfs_...() is over in case of nfsd. Some of the stuff done
    > immeidately afterwards might very well qualify for inclusion into
    > protected area; some of the stuff done immediately _prior_ very likely
    > needs that as well - look at fh_verify() and tell me why we don't want
    > that "I'll hold write access to vfsmount" to span the area including
    > that sucker. If we want the r/o vs r/w policy directly vfsmount-based
    > for nfsd, that is.
    >
    > For ecryptfs it's also bogus - at the very least we need to decide what
    > should happen when underlying vfsmount is remounted. Again, I'm less
    > than convinced that we want the same way to express r/o vs. r/w policy.


    I tend to agree that the mnt_want* bracketing may need to encompass more
    than one vfs_* method. When copyup is involved, the problem is only
    exacerbated. A copyup may want to perform several actions "atomically", for
    example:

    1. vfs_mkdir /a
    2. vfs_mkdir /a/b
    3. vfs_mkdir /a/b/c
    4. vfs_create /a/b/c/f
    5. vfs_read(source file) -> vfs_write(/a/b/c/f)

    These actions are often done in close proximity in unionfs, and in such a
    case, it'd be nice if I could want_write only once around this sequence.

    However, I'm not sure that for copyup (or ecryptfs), that an vfsmount-level
    want/drop write is the right interface. As you pointed out before, Al, some
    other mechanism might be needed, perhaps at the superblock level, to declare
    one's desire to write the f/s "atomically" (to avoid concurrent topology
    changes).

    Thanks,
    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message , Miklos Szeredi writes:
    > > Sure, unionfs already does some of it now (and some of it can be recoded
    > > more cleanly), but I'd like to get rid of much of it if I could -- these new
    > > path_* helpers make me have to maintain vfsmounts even moreso than before.

    >
    > With the words of Al: "that is hogwash". Stacks either maintain
    > vfsmounts or they don't. There's no middle ground.


    So, if I wanted to not maintain vfsmounts at all in unionfs, how can I use
    the proposed new path_* helpers which require vfsmounts? Will there be some
    other helpers available to perform lower-filesystem operations (e.g., mkdir,
    create, unlink, etc.) which won't require passing vfsmounts?

    The "or they don't" is not much of an option when I'm forced to use an API
    that requires a vfsmount...

    > Miklos


    Thanks,
    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424134826.GD15214@ZenIV.linux.org.uk>, Al Viro writes:
    > On Thu, Apr 24, 2008 at 03:05:21PM +0200, Miklos Szeredi wrote:

    [...]
    > And yes, we need the counterpart for superblock-level stuff, to deal with
    > remaining races (look at fs_may_remount_ro() and puke - it's still racy
    > as hell). E.g. unlink should do sb-level "will write" when it drops
    > i_nlink to 0 and final removal of inode should do "won't write".


    Al, any near-term plans for sb-level "want write" locking as we discussed
    briefly at LSF? Being able to do so for copyup in unionfs will hopefully
    allow me to prevent concurrent topology changes.

    And while we're digressing. It appears to me that when someone wants to
    change a single meta-data item in a f/s, then locking rules are simple:
    e.g., lock the directory inode. But when you want to modify more than one,
    you need a different kind of lock -- hence the s_vfs_rename_mutex. The
    latter requires calling lock_rename to find the lowest common ancestor of
    the two directories that need locking. This rename-locking protocol appears
    to be a special case of the sb-level "want write" idea, right? Moreover, I
    don't believe that cross-directory renames are that common, so replacing the
    s_vfs_rename_mutex and its use with a more generic sb-level multi-operation
    write-lock should have a negligible performance impact.

    Thanks,
    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    > > > Sure, unionfs already does some of it now (and some of it can be recoded
    > > > more cleanly), but I'd like to get rid of much of it if I could -- these new
    > > > path_* helpers make me have to maintain vfsmounts even moreso than before.

    > >
    > > With the words of Al: "that is hogwash". Stacks either maintain
    > > vfsmounts or they don't. There's no middle ground.

    >
    > So, if I wanted to not maintain vfsmounts at all in unionfs, how can I use
    > the proposed new path_* helpers which require vfsmounts? Will there be some
    > other helpers available to perform lower-filesystem operations (e.g., mkdir,
    > create, unlink, etc.) which won't require passing vfsmounts?
    >
    > The "or they don't" is not much of an option when I'm forced to use an API
    > that requires a vfsmount...


    Yes. The vfsmount is already needed for open() and that pretty much
    determines the fate of all stacking filesystems, except the ones which
    don't want to do file I/O, but that is rather hard to imagine.

    And anyway, I don't see the issue here. If the stack wants to work on
    a single filesystem directly, just do a bind mount of the original to
    a kernel private mount and use that. I'm not sure this can be done
    with the current API, but it doesn't sound difficult to implement at
    all.

    Miklos
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 01:25:58PM -0400, Erez Zadok wrote:
    > Al, any near-term plans for sb-level "want write" locking as we discussed
    > briefly at LSF? Being able to do so for copyup in unionfs will hopefully
    > allow me to prevent concurrent topology changes.


    That's pretty close to the top.

    > the two directories that need locking. This rename-locking protocol appears
    > to be a special case of the sb-level "want write" idea, right?


    Not really - that one doesn't provide any exclusion between writers and
    that's what we want from rename/rename.

    Take a look at Documentation/filesystems/directory-locking for details
    of locking scheme...
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424142857.GF15214@ZenIV.linux.org.uk>, Al Viro writes:
    > On Thu, Apr 24, 2008 at 04:09:18PM +0200, Miklos Szeredi wrote:

    [...]
    > FWIW, I'm not all that happy about the way ecryptfs_interpose() is done,
    > while we are at it. We get the sucker opened by whoever steps on given
    > place in the tree first, with subsequent operations done using the resulting
    > struct file. With fallback to r/o open. What happens to somebody who
    > tries to open it with enough permissions to do r/w?


    Yes, ecryptfs_interpose() calls ecryptfs_init_persistent_file() which calls
    dentry_open(O_RDWR). What's the proposed solution for this in the face of
    r/o vfsmounts? How could ecryptfs avoid calling this dentry_open in the
    first place?

    For unionfs, all I need is return -EROFS or the like to trigger a possible
    copyup.

    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, 2008-04-24 at 17:37 +0200, Miklos Szeredi wrote:
    > Why are those so important? Yes, if we have multiple vfs_() calls,
    > surround them with an extra want/drop pair. We do already do multiple
    > overlapping want/drop pairs with O_TRUNC and O_CREAT (AFAIR).


    The one that I remember is the pair that we take for O_CREAT and the
    nameidata_to_filp() in do_filp_open() that we do on it after the
    creation and hold while the file is open.

    The only reason for this one is that we want to shut down a potential
    race that would allow the file to be created, then still return a -EROFS
    because someone did a r/w->r/o transition between the create an the
    nameidata_to_filp().

    We could pass some information into nameidata_to_filp()->__dentry_open()
    telling it that we already have a write and it doesn't need to take one,
    but I think having the two nested writes is cleaner.

    Is that the case you're thinking of?

    -- Dave

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 01:29:49PM -0400, Erez Zadok wrote:
    > In message <20080424142857.GF15214@ZenIV.linux.org.uk>, Al Viro writes:
    > > On Thu, Apr 24, 2008 at 04:09:18PM +0200, Miklos Szeredi wrote:

    > [...]
    > > FWIW, I'm not all that happy about the way ecryptfs_interpose() is done,
    > > while we are at it. We get the sucker opened by whoever steps on given
    > > place in the tree first, with subsequent operations done using the resulting
    > > struct file. With fallback to r/o open. What happens to somebody who
    > > tries to open it with enough permissions to do r/w?

    >
    > Yes, ecryptfs_interpose() calls ecryptfs_init_persistent_file() which calls
    > dentry_open(O_RDWR). What's the proposed solution for this in the face of
    > r/o vfsmounts? How could ecryptfs avoid calling this dentry_open in the
    > first place?


    Doesn't have anything to do with vfsmounts (you have one to deal with and
    if it's r/o, it's equivalent to just doing the entire thing on top of r/o
    fs; not interesting).

    No, what I'm worried about is much simpler. Look: we have a file on
    underlying fs, owned by root.root with 644 for permissions. Comes a
    luser and tries to open the counterpart of that file in ecryptfs; that
    triggers ecryptfs_interpose() and attempts to open file. Of course,
    that's going to fail - it's not world-writable. So then it (actually
    ecryptfs_init_persistent_file()) falls back to opening with O_RDONLY.
    Which succeeds just fine and file (opened r/o) is set as ->lower_file.

    Now comes root and tries to open the damn thing r/w. It should be able
    to and if it came first it'd get it; as it is, what it gets is ->lower_file
    and that puppy is opened read-only and you have no guarantee that underlying
    fs will not go bonkers seeing write attempts on it (e.g. open for write
    doing a bit more setup of ->private_data, etc.).

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    > > Why are those so important? Yes, if we have multiple vfs_() calls,
    > > surround them with an extra want/drop pair. We do already do multiple
    > > overlapping want/drop pairs with O_TRUNC and O_CREAT (AFAIR).

    >
    > The one that I remember is the pair that we take for O_CREAT and the
    > nameidata_to_filp() in do_filp_open() that we do on it after the
    > creation and hold while the file is open.
    >
    > The only reason for this one is that we want to shut down a potential
    > race that would allow the file to be created, then still return a -EROFS
    > because someone did a r/w->r/o transition between the create an the
    > nameidata_to_filp().
    >
    > We could pass some information into nameidata_to_filp()->__dentry_open()
    > telling it that we already have a write and it doesn't need to take one,
    > but I think having the two nested writes is cleaner.


    Agreed.

    > Is that the case you're thinking of?


    Yes. And the same thing is done a little below for similar reasons
    around may_open() which does truncation.

    Miklos
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424181332.GB5882@ZenIV.linux.org.uk>, Al Viro writes:
    > On Thu, Apr 24, 2008 at 01:29:49PM -0400, Erez Zadok wrote:
    > > In message <20080424142857.GF15214@ZenIV.linux.org.uk>, Al Viro writes:
    > > > On Thu, Apr 24, 2008 at 04:09:18PM +0200, Miklos Szeredi wrote:

    > > [...]
    > > > FWIW, I'm not all that happy about the way ecryptfs_interpose() is done,
    > > > while we are at it. We get the sucker opened by whoever steps on given
    > > > place in the tree first, with subsequent operations done using the resulting
    > > > struct file. With fallback to r/o open. What happens to somebody who
    > > > tries to open it with enough permissions to do r/w?

    > >
    > > Yes, ecryptfs_interpose() calls ecryptfs_init_persistent_file() which calls
    > > dentry_open(O_RDWR). What's the proposed solution for this in the face of
    > > r/o vfsmounts? How could ecryptfs avoid calling this dentry_open in the
    > > first place?

    >
    > Doesn't have anything to do with vfsmounts (you have one to deal with and
    > if it's r/o, it's equivalent to just doing the entire thing on top of r/o
    > fs; not interesting).
    >
    > No, what I'm worried about is much simpler. Look: we have a file on
    > underlying fs, owned by root.root with 644 for permissions. Comes a
    > luser and tries to open the counterpart of that file in ecryptfs; that
    > triggers ecryptfs_interpose() and attempts to open file. Of course,
    > that's going to fail - it's not world-writable. So then it (actually
    > ecryptfs_init_persistent_file()) falls back to opening with O_RDONLY.
    > Which succeeds just fine and file (opened r/o) is set as ->lower_file.
    >
    > Now comes root and tries to open the damn thing r/w. It should be able
    > to and if it came first it'd get it; as it is, what it gets is ->lower_file
    > and that puppy is opened read-only and you have no guarantee that underlying
    > fs will not go bonkers seeing write attempts on it (e.g. open for write
    > doing a bit more setup of ->private_data, etc.).


    Ah, I see. Yes: ecryptfs_init_persistent_file does have this odd "try to
    open readwrite and if that failed, try readonly" code there. I can't really
    say why it's done that way: Mike? Maybe it was a workaround to not having
    the right permissions to open that persistent file?

    This could be a similar issue to how I had to handle the .wh.* files in
    unionfs. Because they are separate files, and they get created/destroyed
    directly by unionfs, I cannot rely on the caller's capabilities to be able
    to create/remove these .wh.* files. So in several places I have to call
    cap_raise temporarily. (Of course, if/when there'd be native whiteout
    support in the most most common file systems, I could happily rip out this
    whiteout code. :-)

    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424173013.GA5882@ZenIV.linux.org.uk>, Al Viro writes:
    > On Thu, Apr 24, 2008 at 01:25:58PM -0400, Erez Zadok wrote:
    > > Al, any near-term plans for sb-level "want write" locking as we discussed
    > > briefly at LSF? Being able to do so for copyup in unionfs will hopefully
    > > allow me to prevent concurrent topology changes.

    >
    > That's pretty close to the top.
    >
    > > the two directories that need locking. This rename-locking protocol appears
    > > to be a special case of the sb-level "want write" idea, right?

    >
    > Not really - that one doesn't provide any exclusion between writers and
    > that's what we want from rename/rename.


    When you say "that one", do you mean the sb-level idea? If the sb-level
    "want write" not provide exclusion among writers, then how can I prevent
    lower topology changes while copyup takes place?

    If your sb-level "want write" idea won't provide that exclusion at the sb
    level, then how about we elevate the s_vfs_rename_mutex into a generic
    s_vfs_multi_dir_change_mutex of sorts, which can be used by rename as well
    as copyup?

    > Take a look at Documentation/filesystems/directory-locking for details
    > of locking scheme...


    OK. lock_rename() at most locks one "parent" and one "child" inodes. So
    what happens to all the ones in b/t? Suppose I have a path /a/b/c/d/e/f and
    someone wants to mv /a/b/c/d/e to /a/b/; in that case lock_rename at least
    grabs the mutex on /a/b, right? Can someone go in the middle and try to
    muck with the "c/d/" parts in between?

    From what I gathered, lock_rename won't be enough for me to prevent topology
    changes during copyup; I'd really need to lock an entire arbitrarily deep
    path of inodes. That just seems so complex and prone to bugs, that it might
    just be easier to have a single sb-level mutex for these complex
    multi-directory operations. I'm not sure how much performance hit this'd
    be, thou, and if there's a more efficient way to do so.

    If you think lock_rename will be sufficient for me to deal with copyups
    vs. topology changes, then I'll start using it.

    Thanks,
    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  18. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 03:40:03PM -0400, Erez Zadok wrote:
    > Ah, I see. Yes: ecryptfs_init_persistent_file does have this odd
    > "try to open readwrite and if that failed, try readonly" code there.
    > I can't really say why it's done that way: Mike? Maybe it was a
    > workaround to not having the right permissions to open that
    > persistent file?


    The notion was that of "best effort," and it is sloppy.

    I think the right way to do it will be to allow up to two persistent
    files. If the user with read-only perms opens, then open the
    persistent file RO. Then a user with write-only (but not read) perms
    opens; open another persistent file WO. Allow subsequent reads or
    writes by the various users to go through whichever persistent file
    has the right access. Then a user with RW access opens the file; close
    both the RO file and the WO file and open a new file RW. All the
    while, make sure that ecryptfs_open() performs all the requisite perm
    checks.

    If this sounds reasonable, I will get working on a patch to do this.

    Mike
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  19. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    In message <20080424201630.GC18686@localhost.austin.ibm.com>, Michael Halcrow writes:
    > On Thu, Apr 24, 2008 at 03:40:03PM -0400, Erez Zadok wrote:
    > > Ah, I see. Yes: ecryptfs_init_persistent_file does have this odd
    > > "try to open readwrite and if that failed, try readonly" code there.
    > > I can't really say why it's done that way: Mike? Maybe it was a
    > > workaround to not having the right permissions to open that
    > > persistent file?

    >
    > The notion was that of "best effort," and it is sloppy.
    >
    > I think the right way to do it will be to allow up to two persistent
    > files. If the user with read-only perms opens, then open the
    > persistent file RO. Then a user with write-only (but not read) perms
    > opens; open another persistent file WO. Allow subsequent reads or
    > writes by the various users to go through whichever persistent file
    > has the right access. Then a user with RW access opens the file; close
    > both the RO file and the WO file and open a new file RW. All the
    > while, make sure that ecryptfs_open() performs all the requisite perm
    > checks.


    Correct me if I'm wrong, but what you call "persistent" file is simply (in
    my stacking-speek :-) the lower file, right? If so, then calling it
    persistent may be confusing, b/c you could be stacked on top of a volatile
    f/s (e.g., tmpfs or even *gasp* another stackable f/s) -- yes, even if it
    may not make much sense from a security perspective.

    The comment above ecryptfs_init_persistent_file() states that you only ever
    keep a single open file for every lower inode. Was this a security decision
    (say, to prevent different access modes to the same ciphertext?); or was
    that the attempt to workaround the limitation of the
    address_space_operations API (that ->writepage doesn't get a struct file,
    which you need to pass to vfs_write)?

    I'm unclear what you mean above wrt two open files: do you mean

    1. two struct file's, each pointing to a SEPARATE copy of the physical lower
    file?
    or
    2. two struct file's, BOTH pointing to the same lower physical file?

    Option #1 may be racy and I'm not clear what happens if a data file gets
    'forked' that way: do you then have to merge it later on?

    Or, are you propsing to keep alternating b/t an open struct file for RO, and
    another open struct file for RW (or WO)? If you alternate b/t the two,
    keeping only one open at a time, what happens if two threads want to access
    the same file: one reading and another writing the file. Does that mean
    that for every read(2) or write(2) you'll have to essentially release one
    open file and open another (could slow things down a lot)?

    > If this sounds reasonable, I will get working on a patch to do this.
    >
    > Mike


    Cheers,
    Erez.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  20. Re: [patch 00/13] vfs: add helpers to check r/o bind mounts

    On Thu, Apr 24, 2008 at 06:39:07PM -0400, Erez Zadok wrote:
    > In message <20080424201630.GC18686@localhost.austin.ibm.com>,
    > Michael Halcrow writes:
    > > I think the right way to do it will be to allow up to two persistent
    > > files. If the user with read-only perms opens, then open the
    > > persistent file RO. Then a user with write-only (but not read) perms
    > > opens; open another persistent file WO. Allow subsequent reads or
    > > writes by the various users to go through whichever persistent file
    > > has the right access. Then a user with RW access opens the file; close
    > > both the RO file and the WO file and open a new file RW. All the
    > > while, make sure that ecryptfs_open() performs all the requisite perm
    > > checks.

    >
    > Correct me if I'm wrong, but what you call "persistent" file is
    > simply (in my stacking-speek :-) the lower file, right? If so, then
    > calling it persistent may be confusing, b/c you could be stacked on
    > top of a volatile f/s (e.g., tmpfs or even *gasp* another stackable
    > f/s) -- yes, even if it may not make much sense from a security
    > perspective.


    So long as the eCryptfs inode is alive, it will have at least one open
    file in the corresponding inode for the lower filesystem. This file is
    "persistent" for the life of the eCryptfs inode.

    > The comment above ecryptfs_init_persistent_file() states that you
    > only ever keep a single open file for every lower inode. Was this a
    > security decision (say, to prevent different access modes to the
    > same ciphertext?); or was that the attempt to workaround the
    > limitation of the address_space_operations API (that ->writepage
    > doesn't get a struct file, which you need to pass to vfs_write)?


    I intend to use the persistent file as a means to use vfs_read() and
    vfs_write() in readpage() and writepage().

    > I'm unclear what you mean above wrt two open files: do you mean
    >
    > 1. two struct file's, each pointing to a SEPARATE copy of the physical lower
    > file?
    > or
    > 2. two struct file's, BOTH pointing to the same lower physical file?


    I intned two lower files, both opened against the same lower inode,
    via any dentry that happened to be used to get to that inode.

    > Option #1 may be racy and I'm not clear what happens if a data file
    > gets 'forked' that way: do you then have to merge it later on?
    >
    > Or, are you propsing to keep alternating b/t an open struct file for
    > RO, and another open struct file for RW (or WO)? If you alternate
    > b/t the two, keeping only one open at a time, what happens if two
    > threads want to access the same file: one reading and another
    > writing the file.


    I proposed two open files, one with RO access, and one with WO access,
    in the circumstance that one process only had read access on open, and
    the other process only had write access on open. A write-only
    ecryptfs_open() would result in an eCryptfs file struct that only used
    the WO persistent file, after the appropriate permission checks are
    done.

    > Does that mean that for every read(2) or write(2) you'll have to
    > essentially release one open file and open another (could slow
    > things down a lot)?


    Not really; I was proposing to keep two lower files open to use for
    write-only or read-only operations, and if I ever got RW permission on
    a file linked to the lower inode, I would just get rid of both and
    replace them with a single RW persistent file. All while maintaining
    proper permission checks in ecryptfs_open(), so a RO ecryptfs_open()
    would only ever be allowed to read from the RW persistent file.

    Really, this business with possibly two lower persistent files is
    probably too much of a game to play. Rusty suggested that perhaps I
    should just create a root-owned kernel thread in eCryptfs that would
    have the responsibility of acquiring a RW lower persistent file,
    regardless of the euid and permissions of whatever process made the
    initial syscall that brought the eCryptfs inode into existence. I am
    tempted to go that route instead.

    Mike
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread
Page 2 of 3 FirstFirst 1 2 3 LastLast