writing simultaneously to 2 or more network filesystems - NFS

  1. writing simultaneously to 2 or more network filesystems

    Hello,

    We have a computer (DAQ) that will generate 300GB/hour and we will
    process the data (images of 18 - 30MB) using a cluster.

    The data processing *must* occur at the same time as the "data
    acquisition" so we do not want to use the same network or file-server
    in order not to slow down either of the processes (data acquisition /
    data processing).

    My question: Is there any technology (either hardware or software
    based) that will allow the DAQ computer to simultaneously write the
    data to 2 or more different computers? Possibly using many network
    cards, special switches...

    At first I thought that maybe I could NFS mount two NFS resources to
    the same mount point on the DAQ computer with some write-only option
    but I could not find any docs on this scenario.

    Your comments are very welcome.

    Cheers,
    E.Panepucci
    Swiss Light Source


  2. Re: writing simultaneously to 2 or more network filesystems

    Begin <1150807154.903426.171800@h76g2000cwa.googlegroups.com>
    On 2006-06-20, epanepucci@gmail.com wrote:
    > The data processing *must* occur at the same time as the "data
    > acquisition" so we do not want to use the same network or file-server
    > in order not to slow down either of the processes (data acquisition /
    > data processing).


    You do need some form of transport between acquisition and processing.
    If that isn't acceptable, you need to eliminate any intermediates, which
    probably means putting both functions on the same hardware. Or perhaps
    I'm misunderstanding what you mean here?


    > My question: Is there any technology (either hardware or software
    > based) that will allow the DAQ computer to simultaneously write the
    > data to 2 or more different computers? Possibly using many network
    > cards, special switches...


    Of course. For example, you could write your own little program that
    takes the image and sends it to two (or more) client machines. There
    probably already exist solutions that do this on various levels and
    in various ways. What would suit your problem I can't say, as you
    are pretty sparse on the rest of the requirements, especially WRT
    reliability and failure modes.
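
    As an illustration of the "little program" idea, here is a minimal
    Python sketch that copies one incoming stream to several destinations
    chunk by chunk. The function name fan_out and the 1 MiB chunk size are
    assumptions made for this example, not part of any existing tool. It
    deliberately ignores the reliability and failure-mode questions: one
    slow sink stalls the whole copy, and one dead sink aborts it.

```python
import io
from typing import BinaryIO, Iterable

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def fan_out(source: BinaryIO, sinks: Iterable[BinaryIO],
            chunk_size: int = CHUNK_SIZE) -> int:
    """Copy everything from source to every sink.

    Returns the number of bytes read from source. Each sink receives the
    same bytes in the same order.
    """
    sinks = list(sinks)
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        for sink in sinks:
            sink.write(chunk)  # blocks until this sink has taken the chunk
        total += len(chunk)
    for sink in sinks:
        sink.flush()
    return total


if __name__ == "__main__":
    # Self-contained demo using in-memory buffers instead of real machines.
    image = io.BytesIO(b"\x00" * 3_000_000)
    copy_a, copy_b = io.BytesIO(), io.BytesIO()
    sent = fan_out(image, [copy_a, copy_b])
    print(sent, len(copy_a.getvalue()), len(copy_b.getvalue()))
```

    A sink can be anything with write() and flush(), for instance a file
    on a second disk or the object returned by socket.makefile("wb") on a
    connected TCP socket.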

    From what you have stated, your problem looks more to be a data transfer
    problem, and you're asking in groups with more or less a data storage
    slant, not especially data transfer.


    > At first I thought that maybe I could NFS mount two NFS resources to
    > the same mount point on the DAQ computer with some write-only option
    > but I could not find any docs on this scenario.


    Note that NFS implies that the data will be stored somewhere, where it
    can then presumably be picked up again. If that is acceptable, and you
    do want to use NFS, you're thinking about it the wrong way around: have
    the data processors mount an NFS export from the DAQ instead, and pick
    their data up from there. NFS simply doesn't work the other way around.
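
    A processing node that has mounted the DAQ's export could find new
    images by polling the mounted directory. The sketch below is one hedged
    way to do that in Python. The function name ready_files and the "not
    modified for min_age_s seconds means finished" heuristic are
    assumptions of this example. NFS attribute caching can delay what a
    client sees, so the threshold would need tuning on a real setup.

```python
import time
from pathlib import Path
from typing import List, Optional, Set


def ready_files(directory: str, seen: Set[str], min_age_s: float = 2.0,
                now: Optional[float] = None) -> List[Path]:
    """Return unseen files that have not been modified for min_age_s seconds.

    Files are returned oldest first and their names are added to `seen`,
    so a later call will not return them again.
    """
    now = time.time() if now is None else now
    candidates = []
    for path in Path(directory).iterdir():
        if not path.is_file() or path.name in seen:
            continue
        mtime = path.stat().st_mtime
        # Heuristic: a file still being written has a recent mtime.
        if now - mtime >= min_age_s:
            candidates.append((mtime, path.name, path))
    candidates.sort(key=lambda item: (item[0], item[1]))
    for _, name, _ in candidates:
        seen.add(name)
    return [path for _, _, path in candidates]
```

    A processing loop would call ready_files() on the mount point every
    few seconds and hand each returned path to a worker. Having the DAQ
    write to a temporary name and rename the file once it is complete
    would be a more robust signal than the age heuristic.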


    --
    j p d (at) d s b (dot) t u d e l f t (dot) n l .
    This message was originally posted on Usenet in plain text.
    Any other representation, additions, or changes do not have my
    consent and may be a violation of international copyright law.

  3. Re: writing simultaneously to 2 or more network filesystems

    epanepucci@gmail.com wrote:
    > Hello,
    >
    > We have a computer (DAQ) that will generate 300GB/hour and we will
    > process the data (images of 18 - 30MB) using a cluster.
    >
    > The data processing *must* occur at the same time as the "data
    > acquisition" so we do not want to use the same network or file-server
    > in order not to slow down either of the processes (data acquisition /
    > data processing).
    >
    > My question: Is there any technology (either hardware or software
    > based) that will allow the DAQ computer to simultaneously write the
    > data to 2 or more different computers?


    Sure - but what good would it do?

    In order to process the data, you've got to write at least one copy of
    it to the hardware which will do that processing, which hardware
    therefore must be capable of both accepting the incoming data and
    processing it. Once you can do that, why do you need another computer?
    And if you can't do that, it doesn't seem you can accomplish what you
    want to.

    However, that's if you look at it as competing processes. If you
    instead look at it from the storage level, all you need is:

    1) Storage capable of handling the combined bandwidth of acquisition
       and processing. This shouldn't present much of a challenge: a
       single disk comes very close to providing enough streaming
       bandwidth for acquisition, so two or more in a stripe group should
       provide sufficient streaming bandwidth for acquisition plus as much
       processing as you care to configure by extending the size of the
       stripe group.

    2) A cluster that uses shared direct access to that storage (such as
       is supported by shared-disk file systems like SANergy et al.), so
       that one or more members can do the acquisition and the rest can do
       the processing (perhaps at slightly lower disk-access priority so
       that acquisition is guaranteed), all using the same single copy of
       the data (or the same redundant data if you protect it with
       mirroring or parity).

    If you're hesitant to step up to shared-disk technology, you could do it
    with a NAS box with the same kind of striped-storage array: it
    shouldn't be too difficult to configure a NAS box that will handle
    several hundred MB/sec (a bit under 100 MB/sec for acquisition plus
    whatever you need for processing), and this nicely segregates the
    acquisition machine (which may be important if acquisition consumes a
    lot of CPU) from the processing cluster (again, you might want to give
    acquisition higher priority at the NAS box than processing to ensure you
    don't lose any incoming data).
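
    The "a bit under 100 MB/sec" figure follows directly from the stated
    300GB/hour, and the same arithmetic gives a rough stripe-group size.
    The Python sketch below works it through using decimal units. The
    helper names, the per-disk throughput of 70 MB/sec and the 200 MB/sec
    processing load are placeholder assumptions for illustration, not
    measurements.

```python
import math

GB = 10**9  # decimal gigabyte
MB = 10**6  # decimal megabyte


def rate_mb_per_s(gb_per_hour: float) -> float:
    """Convert a sustained rate in GB/hour to MB/sec."""
    return gb_per_hour * GB / 3600 / MB


def images_per_second(gb_per_hour: float, image_mb: float) -> float:
    """Images per second implied by the data rate and the image size."""
    return rate_mb_per_s(gb_per_hour) / image_mb


def disks_needed(acq_mb_s: float, proc_mb_s: float,
                 per_disk_mb_s: float) -> int:
    """Smallest stripe group whose combined streaming rate covers both loads."""
    return math.ceil((acq_mb_s + proc_mb_s) / per_disk_mb_s)


if __name__ == "__main__":
    acq = rate_mb_per_s(300)
    print(f"acquisition: {acq:.1f} MB/sec")
    print(f"30MB images: {images_per_second(300, 30):.2f} per sec")
    print(f"18MB images: {images_per_second(300, 18):.2f} per sec")
    # Placeholder assumptions: 200 MB/sec of processing reads, 70 MB/sec per disk.
    print(f"disks in stripe group: {disks_needed(acq, 200.0, 70.0)}")
```

    So acquisition alone needs about 83 MB/sec sustained, or roughly 3 to
    5 images per second, before any processing reads are added.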

    - bill
