[9fans] Ideas for gc on venti - Plan9


  1. [9fans] Ideas for gc on venti


    Hi folks,


    as I'm using venti as the storage backend for a media archive, where
    content can be deleted (and this will probably happen often enough),
    I'm currently thinking about how garbage collection could be
    achieved.

    Let's assume the following premise:

    * only a few well-known apps are writing to venti (eg. only
    vac and vtstore).
    * we know all the root scores and can iterate through the
    metadata from time to time.
    * venti's storage is divided into several logs of not too large a
    size (e.g. 2GB).

    Now we introduce a "deprecated" mode for a volume: no more
    writes to that volume, and requested blocks are automatically moved
    to another volume (and cleared from the deprecated one). Maybe
    from time to time a compaction process could run which
    removes the holes in the volume.

    Well, that's not yet any form of gc - just a smooth migration of data
    from one volume to another - also good if you intend to take some
    disk offline in the near future, w/o serious interruption.
    (The deprecated volume gets emptier and emptier, and no new
    data is added.)

    GC is the next step:

    Assuming each block to keep is accessed at least once within some
    given time, we know that the data remaining on the volume will be
    trash after that time. So all we have to do is iterate
    through all archives and access all their blocks (*1). Once this
    is completely done, the deprecated volume contains only trash and
    can be safely deleted.
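
    A minimal sketch of the scheme described above, as a toy in-memory
    model. The names Volume, Store and touch are hypothetical and chosen
    for illustration only; this is not venti's API or on-disk format,
    and the "touch" operation is the proposed call from footnote *1,
    which does not exist in venti.

```python
import hashlib


class Volume:
    """Toy stand-in for one storage log, keyed by SHA-1 score."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}         # score (hex string) -> block data
        self.deprecated = False


class Store:
    """Hypothetical store with one active volume and some deprecated ones."""

    def __init__(self):
        self.active = Volume("vol1")
        self.old = []            # deprecated volumes that still hold data

    def find(self, score):
        for vol in [self.active] + self.old:
            if score in vol.blocks:
                return vol
        return None

    def write(self, data):
        score = hashlib.sha1(data).hexdigest()
        vol = self.find(score)
        if vol is None:
            self.active.blocks[score] = data
        elif vol.deprecated:
            self.touch(score)    # rewriting a known block keeps it alive
        return score

    def deprecate_active(self, new_name):
        # No more writes go to the old volume from here on.
        self.active.deprecated = True
        self.old.append(self.active)
        self.active = Volume(new_name)

    def touch(self, score):
        # Mark a block as still required: move it off a deprecated volume.
        vol = self.find(score)
        if vol is None:
            raise KeyError(score)
        if vol.deprecated:
            self.active.blocks[score] = vol.blocks.pop(score)

    def read(self, score):
        self.touch(score)        # a read also evacuates the block
        return self.active.blocks[score]


store = Store()
keep = store.write(b"keep me")
drop = store.write(b"deleted media")
store.deprecate_active("vol2")
store.touch(keep)                # the application walks its live archives
garbage = store.old[0].blocks    # whatever is left was never touched
```

    In this model, everything the walk did not touch stays behind in
    the deprecated volume. Whether that remainder is really garbage
    depends entirely on the premise that all root scores are known.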


    What do you think about that approach ?

    cu

    *1) we could introduce a new "touch" rpc call, which simply tells
    venti that some list of blocks is still required, but does not
    send back their data.

    --
    ----------------------------------------------------------------------
    Enrico Weigelt, metux IT service -- http://www.metux.de/

    cellphone: +49 174 7066481 email: info@metux.de skype: nekrad666
    ----------------------------------------------------------------------
    Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
    ----------------------------------------------------------------------


  2. Re: [9fans] Ideas for gc on venti

    > What do you think about that approach ?

    I think you will lose your data.

    The greatest strength of venti, and also of
    the worm file system, is that once data is written,
    those disk blocks are never changed again.
    That makes it virtually impossible to lose data
    due to software or human errors. This is no small thing.

    Why not just use an ordinary file system?
    What benefit are you deriving from using venti
    that is making all this rewriting worthwhile?

    If it's just that when two people upload the same
    file, you don't store it twice, you could just store
    files named by their SHA1 hashes in an ordinary
    file system.
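
    A minimal sketch of that alternative, assuming whole files are
    stored under a directory and named by the SHA-1 of their contents.
    The function names and the two-character fan-out directory are
    assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def store_file(root, data):
    """Store data under its SHA-1 name; a second upload is a no-op."""
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(root, digest[:2], digest)
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, path)    # rename into place once fully written
    return digest


def delete_file(root, digest):
    """Reclaiming space is an ordinary unlink."""
    os.remove(os.path.join(root, digest[:2], digest))


root = tempfile.mkdtemp()
a = store_file(root, b"same upload")
b = store_file(root, b"same upload")   # duplicate: nothing new is written
```

    With deduplication, the application still has to know that no
    other record refers to a hash before calling delete_file.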

    Russ



  3. Re: [9fans] Ideas for gc on venti

    > Well, that's not yet any form of gc - just a smooth migration of data
    > from one volume to another - also good if you intend to take some
    > disk offline in the near future, w/o serious interruption.
    > (The deprecated volume gets emptier and emptier, and no new
    > data is added.)


    in the original venti paper, the problems associated with disk
    management, redundancy and backup were ignored so they
    could be handled separately.

    i think this is good design. but i can't take credit for this
    opinion. i've had kernighan & plauger, elements of programming
    style on my desk for a few days. this is a book old enough to give
    examples in pl/1 but i think it still gives advice which bears repeating.

    one of the suggestions is that each function should hide something
    important.

    it makes sense for the storage management function to present an
    idealized block device while hiding details like disk replacement
    and redundancy.

    now, if i could get all my own functions to live up to this standard....

    - erik



  4. Re: [9fans] Ideas for gc on venti

    * Russ Cox wrote:

    Hi,


    > The greatest strength of venti, and also of
    > the worm file system, is that once data is written,
    > those disk blocks are never changed again.


    Yep, but my scenario is not completely worm.
    Some data might be removed/unused. Even if it might not be absolutely
    necessary, it would be nice to reclaim space.

    > Why not just use an ordinary file system?
    > What benefit are you deriving from using venti
    > that is making all this rewriting worthwhile?


    Venti makes lots of things easier, eg. it avoids duplicated data.
    For example, if some users upload already existing media, I've
    just got one more db record, but no duplicate data. Doing this
    at the fs level would require more logic on the application side.

    Another, very important, point is that I'm creating a cloud venti,
    which synchronizes with its peers on demand and distributes the
    data over the cloud. So I don't need additional logic for
    clustering the application / its data spaces.
    (I'll also use the venticloud for several other things, e.g. for
    building a distributed fs or something like S3 on it).


    cu


  5. Re: [9fans] Ideas for gc on venti

    * erik quanstrom wrote:

    > it makes sense for the storage management function to present an
    > idealized block device while hiding details like disk replacement
    > and redundancy.


    Well, I intend to make venti the storage device itself
    (e.g. in the form of a hw appliance ;-P). At this point a special
    venti could make hw RAID obsolete and also do things like bad
    block handling.

    RAID has some disadvantages, e.g. you have to nail down partition
    sizes and it's not trivial to resize or move around volumes.
    A venti-based system (which maybe presents a block device via
    venti) can make runtime configuration much easier.


    cu


  6. Re: [9fans] Ideas for gc on venti

    On Wed, 18 Jun 2008 22:57:27 +0200 Enrico Weigelt wrote:
    > * erik quanstrom wrote:
    >
    > > it makes sense for the storage management function to present an
    > > idealized block device while hiding details like disk replacement
    > > and redundancy.

    >
    > Well, I intend to make venti the storage device itself
    > (e.g. in the form of a hw appliance ;-P). At this point a special
    > venti could make hw RAID obsolete and also do things like bad
    > block handling.
    >
    > RAID has some disadvantages, e.g. you have to nail down partition
    > sizes and it's not trivial to resize or move around volumes.
    > A venti-based system (which maybe presents a block device via
    > venti) can make runtime configuration much easier.


    Have you looked at zfs (on solaris, freebsd or macos)? It
    seems to offer most of what you are looking for.

    As for venti, you can use something like venti/copy to copy a
    subset of trees to a new venti and then reuse all of the old
    venti space. This is exactly like a copying GC (only "live
    data" is copied). But why bother? For one thing you can't do
    selective file copying without a lot of extra hassle.
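
    A minimal sketch of the copying idea, using plain dicts as the old
    and new stores. The block format here (a leading b"L" for data
    blocks, b"D" plus a JSON list of child scores for pointer blocks)
    is a hypothetical stand-in, not venti's real format, and copy_tree
    does not model the interface of the real venti/copy tool.

```python
import hashlib
import json


def put(store, data):
    score = hashlib.sha1(data).hexdigest()
    store[score] = data
    return score


def put_leaf(store, payload):
    # Hypothetical data block: tag byte "L" followed by the payload.
    return put(store, b"L" + payload)


def put_dir(store, child_scores):
    # Hypothetical pointer block: tag byte "D" plus a JSON list of scores.
    return put(store, b"D" + json.dumps(child_scores).encode())


def copy_tree(src, dst, score):
    """Copy every block reachable from score; skip blocks already copied."""
    if score in dst:
        return
    data = src[score]
    dst[score] = data
    if data.startswith(b"D"):
        for child in json.loads(data[1:]):
            copy_tree(src, dst, child)


old = {}
shared = put_leaf(old, b"shared block")
live_root = put_dir(old, [shared, put_leaf(old, b"live file")])
dead_root = put_dir(old, [shared, put_leaf(old, b"deleted file")])

new = {}
copy_tree(old, new, live_root)   # only blocks under live roots are copied
```

    Blocks reachable only from dead_root never reach the new store, so
    the whole old store can be reused once every live root is copied.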


  7. Re: [9fans] Ideas for gc on venti

    one legitimate reason is the liability of keeping a user's data
    long after any business arrangements for storing such data have
    expired. this applies to kenfs too.



  8. Re: [9fans] Ideas for gc on venti

    > one legitimate reason is the liability of keeping a user's data
    > long after any business arrangements for storing such data have
    > expired. this applies to kenfs too.


    this is a good point.

    are there any fs that have mechanisms to help
    apply data retention policy? if one does offline
    backup, deleting only the stuff that needs to
    be forgotten can be quite painful.

    suppose (as a weak example) the labs' main worm
    were subject to the normal business data retention
    rules. there would be a lot of history lost.

    it's in the forgetting that memory is made useful.

    - erik



  9. Re: [9fans] Ideas for gc on venti

    > RAID has some disadvantages, eg. you have to nail-down partition
    > sizes and it's not trivial to resize or move around volumes.


    you seem to be making a general claim about all storage
    management solutions that i don't think can be backed
    up.

    as an example i have no rooting interest in, way back
    in 1996, i was able to use aix lvm to migrate a couple
    of hundred filesystems in many tens of vgs to tens of
    filesystems on a handful of vgs with mirrored lvs. i
    didn't find it hard at all to reallocate or resize
    anything. there were no partitions in sight.

    (i sure don't miss dasd.)

    > A venti-based system (which maybe presents a block device via
    > venti) can make runtime configuration much easier.


    combining functionality that is logically distinct is
    generally called unmodular, and a layering violation
    in this particular scenario.

    - erik



  10. Re: [9fans] Ideas for gc on venti

    // combining functionality that is logically distinct is
    // generally called unmodular, and a layering violation
    // in this particular scenario.

    i agree with the principle, but i'm not sure it applies in this
    case. what's described (at least the part before any "garbage"
    collection is done) is really just arena management, not disk
    management. the arenas are all defined within venti, and
    nothing underneath really has any understanding of how (or
    if) they're being used. i don't think there's anything
    conceptually wrong with asking venti to be able to manage
    which arenas are "live" or not.

    of course, i think the specific "deprecated" suggestion is
    predicated on the idea that you're going to periodically scan
    the entire data log, which doesn't seem like an assumption
    that's going to scale all that well (especially in light of the
    stated goal of eventual distribution).

    and this is certainly not a defense of the garbage collection
    idea. i'd be quite averse to any form of automated garbage
    collection in venti. i've got a few scores written down in a
    notebook which aren't in any root and don't duplicate
    blocks in any fs (unless by accident).

    it would be nice to be able to selectively & manually mark a
    given score as "deprecated" and have any blocks only
    associated with that score freed (i've got a few hundred MB
    already "wasted" on my venti based on having put a space
    in a vac command line in the wrong place, for example), but
    i find russ' point about the code to touch written blocks
    being entirely bug-free based on not existing to be pretty
    darn compelling. that level of safety is worth a lot.

    anthony

    ps: what'd make me give up on the deletion idea entirely
    is some form of authentication in venti, even if it's just
    allowing fossil to connect to it via tls using certificates. i
    can deal with my own mistakes, but it does make me
    slightly uncomfortable being open to DoS attacks.



  11. Re: [9fans] Ideas for gc on venti

    > // combining functionality that is logically distinct is
    > // generally called unmodular, and a layering violation
    > // in this particular scenario.
    >
    > i agree with the principle, but i'm not sure it applies in this
    > case. what's described (at least the part before any "garbage"
    > collection is done) is really just arena management, not disk
    > management. the arenas are all defined within venti, and
    > nothing underneath really has any understanding of how (or
    > if) they're being used. i don't think there's anything
    > conceptually wrong with asking venti to be able to manage
    > which arenas are "live" or not.


    the case given was that a disk needed replacing. i can run
    your argument the other way and say that venti doesn't care
    which disk goes where or how the storage itself is organized.
    one should be able to replace a failed drive without involving
    venti.

    slightly off topic. we use this to our advantage at coraid,
    though we are not using venti. our mail fs uses aoe storage.
    there are not a lot of people expert in the administration of
    our fs, but there are many people who can repair a degraded
    raid or perform other storage administration. this requires no
    knowledge of the fs.

    when you're on call 24/7/365 for fs problems, this is a
    wonderful thing.

    - erik


