[9fans] Streaming on venti - Plan9


  1. [9fans] Streaming on venti


    Hi folks,


    is anyone already working on a venti-based storage format
    which is optimized for streaming?


    VAC, e.g., is good for archiving, but its tree-based structure
    is probably not optimal for streaming (on large files, a lot
    of blocks IMHO have to be loaded before the first payload
    block can be reached).

    Some kind of linked list, e.g. as in the CMD-1541 filesystem
    (each block has a pointer to the next one), or a linked list
    of index blocks, would fit better.



    cu
    --
    ---------------------------------------------------------------------
    Enrico Weigelt == metux IT service - http://www.metux.de/
    ---------------------------------------------------------------------
    Please visit the OpenSource QM Taskforce:
    http://wiki.metux.de/public/OpenSource_QM_Taskforce
    Patches / Fixes for a lot dozens of packages in dozens of versions:
    http://patches.metux.de/
    ---------------------------------------------------------------------


  2. Re: [9fans] Streaming on venti

    On Thu, Jun 5, 2008 at 4:24 AM, Enrico Weigelt wrote:
    >
    > Hi folks,
    >
    >
    > is anyone already working on a venti-based storage format
    > which is optimized for streaming?
    >


    ah, well, what's this mean? What kind of data rate are you looking at?

    ron


  3. Re: [9fans] Streaming on venti

    * ron minnich wrote:
    > On Thu, Jun 5, 2008 at 4:24 AM, Enrico Weigelt wrote:
    > >
    > > Hi folks,
    > >
    > >
    > > is anyone already working on a venti-based storage format
    > > which is optimized for streaming?
    > >

    >
    > ah, well, what's this mean? What kind of data rate are you looking at?


    It's intended for video streaming. Upload speed is not critical, but
    sequential download should be fast.

    The venti behind it will be clustered, but that's another story ...


    cu
    --
    Enrico Weigelt


  4. Re: [9fans] Streaming on venti

    > It's intended for video streaming. Upload speed is not critical, but
    > sequential download should be fast.
    >
    > The venti behind it will be clustered, but that's another story ...


    there's no such thing as sequential in venti. venti is content
    addressed.
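
    To make "content addressed" concrete: venti names each block by the
    SHA-1 hash of its contents (the 20-byte "score"), so a block's address
    says nothing about which block comes next. A minimal in-memory sketch;
    `ToyVenti` and its methods are hypothetical names for illustration and
    are not the venti protocol or its API:

```python
import hashlib


class ToyVenti:
    """Toy content-addressed block store (illustrative only, not venti itself)."""

    def __init__(self):
        self._blocks = {}

    def write(self, data: bytes) -> bytes:
        # The address ("score") is derived from the content, not chosen by the caller.
        score = hashlib.sha1(data).digest()
        self._blocks[score] = data
        return score

    def read(self, score: bytes) -> bytes:
        return self._blocks[score]


store = ToyVenti()
first = store.write(b"frame 0")
second = store.write(b"frame 1")
duplicate = store.write(b"frame 0")   # same content, therefore the same score
```

    Any notion of order has to live in a higher-level structure (such as
    vac's tree) that records the scores in sequence.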

    - erik



  5. Re: [9fans] Streaming on venti

    > VAC, e.g., is good for archiving, but its tree-based structure
    > is probably not optimal for streaming (on large files, a lot
    > of blocks IMHO have to be loaded before the first payload
    > block can be reached).


    A typical venti tree has a branching factor of 409 (8192/20).
    For a 1GB file, that means you have to load two extra blocks to find
    the first one, and 322 interior blocks to find all 131,072 data blocks.
    Is improving that 0.2% really your justification for a less capable
    data structure?
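
    The figures above can be checked with a few lines of arithmetic. This
    is only a sketch of the counting argument, assuming 8192-byte blocks
    and 20-byte pointers as stated in the post; `tree_shape` is a
    hypothetical helper, not part of vac or venti:

```python
BLOCK_SIZE = 8192   # bytes per block, as in the post
SCORE_SIZE = 20     # bytes per pointer (a SHA-1 score)


def tree_shape(file_size, block_size=BLOCK_SIZE, score_size=SCORE_SIZE):
    """Return (data_blocks, interior_blocks, depth) for a hash tree over a file."""
    fanout = block_size // score_size
    data_blocks = (file_size + block_size - 1) // block_size
    interior = 0
    depth = 0
    level = data_blocks
    # Each pass adds one level of pointer blocks until a single root remains.
    while level > 1:
        level = (level + fanout - 1) // fanout
        interior += level
        depth += 1
    return data_blocks, interior, depth


data, interior, depth = tree_shape(2**30)
print(BLOCK_SIZE // SCORE_SIZE, data, interior, depth, f"{interior / data:.2%}")
```

    For a 1GB file this gives a fan-out of 409, 131,072 data blocks, 322
    interior blocks (321 pointer blocks plus the root) and a depth of 2,
    which is the "two extra blocks" before the first data block.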

    > Some kind of linked list, e.g. as in the CMD-1541 filesystem
    > (each block has a pointer to the next one), or a linked list
    > of index blocks, would fit better.


    For a 1GB file, you'd need 20*131,072 = 2,621,440 bytes of
    storage to hold the pointers. No matter where you put them,
    your entire file is now (1GB+2,621,440)/8192 = 131,392 blocks.
    The tree was using 131,394 blocks. So at best, the linked list
    has reduced the number of block loads by 0.002%, and you've
    given up random access, including streaming starting halfway
    through a file. Doesn't sound better to me.
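
    The same comparison as a short calculation, using only the numbers
    given in the post (8192-byte blocks, 20-byte pointers, 322 interior
    blocks for the tree); the variable names are illustrative:

```python
BLOCK_SIZE = 8192
SCORE_SIZE = 20
FILE_SIZE = 2**30

data_blocks = FILE_SIZE // BLOCK_SIZE                      # 131,072
pointer_bytes = SCORE_SIZE * data_blocks                   # 2,621,440
list_blocks = (FILE_SIZE + pointer_bytes) // BLOCK_SIZE    # 131,392
tree_blocks = data_blocks + 322                            # 131,394 (interior count from the post)

saving = (tree_blocks - list_blocks) / tree_blocks
print(list_blocks, tree_blocks, f"{saving:.4%}")
```

    The linked list saves two block loads out of 131,394, roughly the
    0.002% quoted above.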

    Venti's performance is dominated much more by fragmentation
    in where the blocks are laid out in the arena logs (that causes seeks)
    than anything in higher level data structures. There is a paper about
    this in the upcoming Usenix. See http://swtch.com/~rsc/papers/
    for a link to PDF and HTML. (Because the paper is targeted at a
    non-Plan 9 audience, "Venti" in that paper refers to venti as described
    in the original paper. The current venti sources implement all the
    improvements described as "Foundation" in the paper.)

    Russ



  6. Re: [9fans] Streaming on venti

    * Russ Cox wrote:
    > > VAC, e.g., is good for archiving, but its tree-based structure
    > > is probably not optimal for streaming (on large files, a lot
    > > of blocks IMHO have to be loaded before the first payload
    > > block can be reached).

    >
    > A typical venti tree has a branching factor of 409 (8192/20).


    I guess 8k is vac's index block size?
    So maybe it could even be improved (for my case) by increasing
    it to the 56k venti limit?
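
    A rough way to see what a larger block size would buy, assuming "56k"
    means 56*1024 bytes, that data and pointer blocks share that size, and
    that pointers stay at 20 bytes (all assumptions; `interior_blocks` is
    a hypothetical helper):

```python
def interior_blocks(file_size, block_size, score_size=20):
    """Count pointer blocks per level, leaves-to-root, for a hash tree."""
    fanout = block_size // score_size
    level = -(-file_size // block_size)      # data blocks (ceiling division)
    levels = []
    while level > 1:
        level = -(-level // fanout)
        levels.append(level)
    return levels


for size in (8 * 1024, 56 * 1024):
    print(size, interior_blocks(2**30, size))
```

    Under these assumptions a 1GB file needs pointer levels of [321, 1]
    with 8k blocks and [7, 1] with 56k blocks, so the tree is two levels
    deep in both cases; only the number of interior blocks shrinks.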

    > For a 1GB file, that means you have to load two extra blocks to find
    > the first one, and 322 interior blocks to find all 131,072 data blocks.
    > Is improving that 0.2% really your justification for a less capable
    > data structure?


    Well, I'll have to think about this. My primary goal is to
    make the sequential read as fast as possible. There won't be
    any non-sequential access.

    > Venti's performance is dominated much more by fragmentation
    > in where the blocks are laid out in the arena logs (that causes seeks)
    > than anything in higher level data structures. There is a paper about
    > this in the upcoming Usenix. See http://swtch.com/~rsc/papers/
    > for a link to PDF and HTML. (Because the paper is targeted at a
    > non-Plan 9 audience, "Venti" in that paper refers to venti as described
    > in the original paper. The current venti sources implement all the
    > improvements described as "Foundation" in the paper.)


    thx, I'll have a look at this (as soon as time allows ;-o).


    cu
    --
    ----------------------------------------------------------------------
    Enrico Weigelt, metux IT service -- http://www.metux.de/

    cellphone: +49 174 7066481 email: info@metux.de skype: nekrad666
    ----------------------------------------------------------------------
    Embedded Linux / Porting / Open-source QM / Distributed Systems
    ----------------------------------------------------------------------


  7. Re: [9fans] Streaming on venti

    Enrico Weigelt wrote:
    >
    > Well, I'll have to think about this. My primary goal is to
    > make the sequential read as fast as possible. There won't be
    > any non-sequential access.
    >


    IMHO, using venti to serve streams is a bad idea. None of the advantages
    of venti matter in this application. Nowadays disks are so big that
    GPT partitions are preferable for providing disk storage for big streams.

    Of course, some support from the OS is needed, which excludes Plan9.
    --
    Dipl.-Math. Wilhelm Bernhard Kloke
    Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
    Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-373
    PGP: http://vestein.arb-phys.uni-dortmund...b/mypublic.key

  8. Re: [9fans] Streaming on venti

    > Of course, some support from the OS is needed, which excludes Plan9.

    what is the basis for this claim? references?

    - erik


  9. Re: [9fans] Streaming on venti

    erik quanstrom wrote:
    >> Of course, some support from the OS is needed, which excludes Plan9.

    >
    > what is the basis for this claim? references?
    >
    > - erik
    >


    Does Plan9 support GPT partitions now? The last time I tried Plan9 on a
    GPT partitioned disk I got my disk severely screwed up.
    --
    Dipl.-Math. Wilhelm Bernhard Kloke

  10. Re: [9fans] Streaming on venti

    a gpt partitioned disk should have its mbr declaring most of the
    disk as in use. IIRC, plan 9 fdisk does not screw it up unless you
    decide to change the partitions in the mbr.


    On Fri, Jun 6, 2008 at 10:33 AM, Wilhelm B. Kloke
    wrote:
    > erik quanstrom wrote:
    >>> Of course, some support from the OS is needed, which excludes Plan9.

    >>
    >> what is the basis for this claim? references?
    >>
    >> - erik
    >>

    >
    > Does Plan9 support GPT partitions now? The last time I tried Plan9 on a
    > GPT partitioned disk I got my disk severely screwed up.



  11. Re: [9fans] Streaming on venti

    Francisco J Ballesteros wrote:
    > a gpt partitioned disk should have its mbr declaring mostly the disk
    > in use, IIRC, plan 9 fdisk does not screw it up unless you decide
    > to change the partitions in the mbr.


    On a GPT partitioned disk the mbr has to be ignored, as there is no way
    to make it look the same in every respect.

    Plan9 (and other OSs, such as FreeBSD, NetBSD, and OS/2, too) has
    idiosyncratic ideas, conflicting with my own idiosyncratic ideas, about
    how to lay out the MBR. Every time I tried to add a new OS to some
    working partition table I had some sort of bad experience, ranging
    from a reordered partition table to confusing sector addressing
    styles.

    Just not screwing up the partition table is not enough. I want to
    access the data partitions outside the plan9 partition either
    to store venti arenas on them, so that these are accessible
    from other OSs (Plan9 from User Space, e.g.) or to store other
    OS-independent file formats like multimedia, or a database.
    IIRC this was not there in Plan9 at the time when I tried it.

    But let me return to my main point: If you want to store and access
    files bigger than the size of a normal venti arena or even a large
    part of one, it looks like a bad use of venti to me.
    In his paper about Foundation Russ Cox explains some circumstances
    where you probably don't want venti to archive these files forever.
    For these sorts of files the services of other file systems are
    overkill, too. You mostly don't want to write more than one at the
    same time. You don't need the ability to append to the file once it
    is closed. If you have files several gigabytes in size, you
    probably want to have them contiguous. You cannot use BSD slices,
    BSD partitions, or plan9 partitions to store these, because there are
    too few of them. The 128 partitions on a GPT are sufficient, because
    a 500GB disk can be laid out to use one big chunk to host file systems
    and several smaller chunks, from 1GB to, say, 20GB, for multimedia files.
    Of course, you need utilities to manage these.

    A file system very suitable for this situation was that of DEC's
    RT11 (were it not limited to 16-bit block addresses). On this
    system, every file was contiguous, and there was a directory which
    looked mostly like a big partition table, because the directory had
    only one level. A new file could be allocated either with a fixed
    size in the first gap that fit, or to fill the biggest gap. The default
    was to allocate the maximum of the second-largest gap and half the
    largest gap.
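
    The allocation rules just described can be sketched as follows. This
    only illustrates the policy as stated in this message, not RT11's
    actual code; `first_fit` and `default_size` are hypothetical names,
    and gaps are modelled simply as a list of free-extent lengths in
    blocks:

```python
def first_fit(gaps, size):
    """Return the index of the first gap that can hold `size` blocks, or None."""
    for i, gap in enumerate(gaps):
        if gap >= size:
            return i
    return None


def default_size(gaps):
    """Size for a file of unknown length: max(second-largest gap, half the largest)."""
    ordered = sorted(gaps, reverse=True)
    if not ordered:
        return 0
    largest = ordered[0]
    second = ordered[1] if len(ordered) > 1 else 0
    return max(second, largest // 2)


gaps = [40, 300, 120, 1000]          # free extents in blocks, in disk order
print(first_fit(gaps, 100), default_size(gaps))
```

    With these gaps, a fixed-size request for 100 blocks lands in the
    second gap (index 1), and a file of unknown size is given 500 blocks,
    half of the largest gap, since that exceeds the second-largest gap
    of 300.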
    --
    Dipl.-Math. Wilhelm Bernhard Kloke

  12. Re: [9fans] Streaming on venti

    > erik quanstrom wrote:
    >>> Of course, some support from the OS is needed, which excludes Plan9.

    >>
    >> what is the basis for this claim? references?
    >>
    >> - erik
    >>

    >
    > Does Plan9 support GPT partitions now? The last time I tried Plan9 on a
    > GPT partitioned disk I got my disk severely screwed up.
    > -


    plan 9 doesn't support gpt.

    however, nemo is correct. when gpt partitions are
    created below the fdisk limit of 2TB, that space
    is supposed to be shown in the partition table. (at
    www.uefi.org; see §5.2.2 of the uefi specification
    2.1 on the "protective" mbr.)
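
    For illustration, a protective mbr can be recognised by a partition
    entry of type 0xEE whose 32-bit start and length fields cover the
    gpt area (32-bit sector counts with 512-byte sectors are where the
    2TB limit comes from). A small sketch over an in-memory sector;
    `protective_entries` is a hypothetical helper and the sector count
    below is an arbitrary example value:

```python
import struct

SECTOR = 512
GPT_PROTECTIVE = 0xEE   # MBR partition type used by the protective entry


def protective_entries(mbr: bytes):
    """Return (start_lba, sector_count) for each type-0xEE entry in an MBR sector."""
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not an MBR sector")
    found = []
    for i in range(4):
        # Four 16-byte entries start at offset 446; byte 4 is the type,
        # bytes 8..15 hold the little-endian start LBA and sector count.
        entry = mbr[446 + 16 * i: 446 + 16 * (i + 1)]
        start_lba, count = struct.unpack_from("<II", entry, 8)
        if entry[4] == GPT_PROTECTIVE:
            found.append((start_lba, count))
    return found


# Build a synthetic MBR with one protective entry covering LBA 1 onward.
mbr = bytearray(SECTOR)
mbr[446 + 4] = GPT_PROTECTIVE
struct.pack_into("<II", mbr, 446 + 8, 1, 976773167)
mbr[510:512] = b"\x55\xaa"
print(protective_entries(bytes(mbr)))
```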

    your claim was that plan 9 lacks something to support
    streaming. what does support for gpt partition tables
    have to do with this?

    - erik



  13. Re: [9fans] Streaming on venti

    > In his paper about Foundation Russ Cox explains some circumstances
    > where you probably don't want venti to archive these files forever.
    > For these sort of files the services of other file systems are
    > overkill, too. You mostly don't want to write more than one at the same time.
    > You don't need the ability to append to the file once it is closed.
    > If you have files several gigabytes in size, you
    > probably want to have them contiguous. You cannot use BSD slices or
    > BSD partitions or plan9 partitions to store these, because there are too few
    > of them. The 128 partitions on a GPT are sufficient, because a 500GB
    > disk can be laid out to use one big chunk to host file systems and
    > several smaller chunks from 1GB to, say 20GB for multimedia files.
    > Of course, you need utilities to manage these.


    there are several styles of devices with exactly these limitations: cd, dvd
    and bd recorders.

    the standard data rate for bd disks is 5.4mb/s (en.wikipedia.org/wiki/Blu-ray_Disc).
    venti was easily fast enough to handle this, even on a usb drive.

    your proposed set of limitations makes no sense to me. venti
    can stream fast enough and the overhead for storing blocks and
    not extents is ~1%.

    further, foundation is an archival strategy, not primary storage.
    pretty impressive to be able to stream movies directly from the backup.

    i sure wouldn't give up the ability to store, e.g., an index to which i
    append an entry each time i add a movie in my moviefs for 1%
    drop in storage efficiency. since venti goes fast enough, any extra
    speed is wasted on streaming.

    > A file system very suitable for this situation was that of DEC's
    > RT11 (were it not limited to 16-bit block addresses). On this
    > system, every file was contiguous, there was a directory, which
    > looked mostly like a big partition table, because the directory had
    > only one level. A new file could be allocated either with a fixed
    > size in the 1st gap to fit, or to fill the biggest gap. The default
    > was to fill the maximum of the 2nd largest gap and half the largest
    > gap size.


    1968 called. it wants its fs technology back.

    if you must have extents, there are much better (if more complicated)
    fses that support extents, for example, xfs.

    i'm not convinced the extra complexity of managing arbitrarily-sized
    blocks is worth it.

    - erik



  14. Re: [9fans] Streaming on venti

    erik quanstrom wrote:
    >> erik quanstrom wrote:
    >>>> Of course, some support from the OS is needed, which excludes Plan9.
    >>>
    >>> what is the basis for this claim? references?
    >>>

    ....
    >
    > plan 9 doesn't support gpt.


    This was my claim. Nothing else. You cut off a significant part of my original message.

    > however, nemo is correct. when gpt partitions are
    > created below the fdisk limit of 2TB, that space
    > is supposed to be shown in partition table. (at
    > www.uefi.org; see §5.2.2 of the uefi specification
    > 2.1 on the "protective" mbr.)


    Yes. A protective mbr is in the specification. Protective means: Not to be
    used for fiddling.

    > your claim was that plan 9 lacks something to support
    > streaming. what does support for gpt partition tables
    > have to do with this?


    Let me restate my claim:
    - Using venti for backing up a streaming application is not a good idea.
    - Contiguous storage areas may be better.
    - One potential method to provide access to contiguous disk space may be a rich partitioning
    system, e.g. GPT.
    - Plan9 does not support the last idea.
    --
    Dipl.-Math. Wilhelm Bernhard Kloke

  15. Re: [9fans] Streaming on venti

    > Yes. A protective mbr is in the specification. Protective means: Not to be
    > used for fiddling.


    the spec says that the protective mbr should include entries reserving
    the space used by gpt partitions. thus if you use fdisk to edit such
    a partition table, you will not harm gpt unless you delete the gpt
    entries.

    they are quite clear that this is for compatibility.

    if you can cite chapter and verse showing that this is incorrect,
    i'm all ears.

    > - One potential method to provide access to contiguous disk space may be a rich partitioning
    > system, e.g. GPT.
    > - Plan9 does not support the last idea.


    the plan 9 kernel doesn't read partition tables. this is done by
    external programs, e.g. disk/fdisk, disk/prep and 9load. devsd allows
    partition tables to be written to /dev/sdXX/ctl. devsd partitions
    are dynamically allocated. there is no limit to how many you can have.

    without inventing a new format, disk/prep uses just a single
    sector. so it supports about 16 partitions. however, you can
    repartition each prep partition. so with 2 levels on a drive with
    standard-sized sectors, you can have 239 partitions.

    nonetheless, using partitions as a fs is a terrible idea.

    - erik



  16. Re: [9fans] Streaming on venti

    > - Using venti for backing up a streaming application is not a good idea.
    > - Contiguous storage areas may be better.


    So far I agree with you.

    > - One potential method to provide access to contiguous
    > disk space may be a rich partitioning system, e.g. GPT.


    I can't believe what a terrible idea this is. I honestly thought
    that PC architecture couldn't get any worse; congratulations.
    We were running out of 1-byte partition types so now we're
    going to use random 16-byte identifiers that no one can
    remember or even read?

    > - Plan9 does not support the last idea.


    No, Plan 9 transcends the idea.

    As Erik pointed out, Plan 9 couldn't care less what bizarro world
    your disks come from. To keep architecture-specific disk format goo
    from infecting the kernel, the disk device presents a very simple
    interface that can be used to implement any partitioning scheme
    you care to invent, even ones as disgusting as GPT.
    You write a simple user-level program that opens the raw disk
    device, reads the partition table, and then creates the partitions
    by writing commands like

    part linux 63 11425234

    to the disk's ctl file. A GPT implementation would be only a few
    hundred lines confined to a single user-space program, if anyone
    cared to write it.
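
    As a rough idea of what such a user-level program might look like,
    here is a sketch that reads the GPT header and entry array from a
    disk image held in memory and prints `part` lines in the form shown
    above. It is not an existing Plan 9 tool. It assumes 512-byte sectors
    and an exclusive end sector in the ctl command, skips all CRC and
    backup-header checks, and does not sanitise partition names;
    `gpt_part_commands` is a hypothetical name, and the test image is
    synthetic:

```python
import struct


def gpt_part_commands(disk: bytes, sector=512):
    """Yield 'part name start end' lines for each used GPT entry in a disk image."""
    header = disk[sector: 2 * sector]            # GPT header lives at LBA 1
    if header[:8] != b"EFI PART":
        raise ValueError("no GPT header")
    entries_lba, = struct.unpack_from("<Q", header, 72)
    count, entry_size = struct.unpack_from("<II", header, 80)
    for i in range(count):
        off = entries_lba * sector + i * entry_size
        entry = disk[off: off + entry_size]
        if entry[:16] == bytes(16):              # all-zero type GUID: unused slot
            continue
        first, last = struct.unpack_from("<QQ", entry, 32)
        name = entry[56:128].decode("utf-16-le").rstrip("\x00")
        label = name or f"gpt{i}"
        # Assumption: ctl takes an exclusive end sector; GPT's last LBA is inclusive.
        yield f"part {label} {first} {last + 1}"


# Synthetic image: header at LBA 1, one used entry at LBA 2.
disk = bytearray(64 * 512)
disk[512:520] = b"EFI PART"
struct.pack_into("<Q", disk, 512 + 72, 2)        # entry array starts at LBA 2
struct.pack_into("<II", disk, 512 + 80, 4, 128)  # 4 entries of 128 bytes each
entry = 2 * 512
disk[entry: entry + 16] = bytes(range(1, 17))    # any non-zero type GUID
struct.pack_into("<QQ", disk, entry + 32, 63, 11425233)
disk[entry + 56: entry + 66] = "linux".encode("utf-16-le")

for line in gpt_part_commands(bytes(disk)):
    print(line)
```

    A real program would read the raw disk device instead of a byte
    string and write each line to the disk's ctl file.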

    Russ


  17. Re: [9fans] Streaming on venti

    On Thu, Jun 5, 2008 at 7:42 AM, Enrico Weigelt wrote:
    > * Russ Cox wrote:
    >> > VAC, e.g., is good for archiving, but its tree-based structure
    >> > is probably not optimal for streaming (on large files, a lot
    >> > of blocks IMHO have to be loaded before the first payload
    >> > block can be reached).

    >>
    >> A typical venti tree has a branching factor of 409 (8192/20).

    >
    > I guess 8k is vac's index block size?
    > So maybe it could even be improved (for my case) by increasing
    > it to the 56k venti limit?



    As Russ makes pretty clear, this stuff all involves numbers that can
    be reasoned about. Storage companies do such reasoning as their daily
    bread.

    So, rather than say stuff like "as fast as possible" (which is without
    meaning) why not attach some numbers to this kind of speculation?

    ron


  18. Re: [9fans] Streaming on venti

    On 2008-06-06, Russ Cox wrote:
    >> - Using venti for backing up a streaming application is not a good idea.
    >> - Contiguous storage areas may be better.

    >
    > So far I agree with you.
    >

    ....
    >
    > to the disk's ctl file. A GPT implementation would be only a few
    > hundred lines confined to a single user-space program, if anyone
    > cared to write it.


    I don't care, either. The only difference it makes, though a
    significant one in my eyes and for my personal use, is that these
    partitions are uniformly accessible from different OSs (of course,
    only from those which see GPT partitions). I agree that the idea of
    evolving MBR to GPT was terrible. But it is there, as there are
    different OSs and different opinions, sometimes in the same computer
    system or human head.
    --
    Dipl.-Math. Wilhelm Bernhard Kloke

  19. Re: [9fans] Streaming on venti

    On Fri, Jun 6, 2008 at 8:53 AM, Russ Cox wrote:

    >> - One potential method to provide access to contiguous
    >> disk space may be a rich partitioning system, e.g. GPT.

    >
    > I can't believe what a terrible idea this is. I honestly thought
    > that PC architecture couldn't get any worse; congratulations.


    It's part of the EFI promise: everything they touch will turn to merde :-)

    ron


  20. Re: [9fans] Streaming on venti

    > It's part of the EFI promise: everything they touch will turn to merde :-)

    optimist!

    - erik


