Buffers, disk cache, and validating data - SUN


  1. Buffers, disk cache, and validating data

    If I want to validate data copy operations using commands like cmp, dircmp,
    etc, how do I know I'm actually comparing data on disk, given that the data
    might be in a buffer or disk cache?



  2. Re: Buffers, disk cache, and validating data

    sinister wrote:
    > If I want to validate data copy operations using commands like cmp, dircmp,
    > etc, how do I know I'm actually comparing data on disk, given that the data
    > might be in a buffer or disk cache?
    >
    >


    Run sync first - just an idea, I don't know if it is necessarily right.

    Although that might cause the OS to write the data to the "disk", I'm
    not sure how you can be certain the data is not buffered on the disk
    hardware, entirely independent of the operating system.

    Interesting question.
    --
    Dave K

    http://www.southminster-branch-line.org.uk/

    Please note my email address changes periodically to avoid spam.
    It is always of the form: month-year@domain. Hitting reply will work
    for a couple of months only. Later set it manually. The month is
    always written in 3 letters (e.g. Jan, not January etc)

  3. Re: Buffers, disk cache, and validating data

    sinister wrote:

    >If I want to validate data copy operations using commands like cmp, dircmp,
    >etc, how do I know I'm actually comparing data on disk, given that the data
    >might be in a buffer or disk cache?
    >
    >
    >
    >

    Well, you could follow your copy operation with a couple of "sync" commands.

    If you are in that much doubt, however, you might want to consider using
    an operating system like OpenVMS that actually commits data to disk
    before reporting the I/O as complete; it may be slower but there's no
    doubt about whether or not your data is on disk!
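
    For a single file, the per-file counterpart of running "sync" after a
    copy is to call fsync() on the destination before reporting success.
    The following is a minimal sketch, assuming a POSIX system;
    copy_and_flush is a hypothetical helper name, not something from this
    thread. Note that fsync() only covers the operating system's buffers;
    as pointed out earlier in the thread, a write cache on the drive itself
    may still be holding the data.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy src to dst; return 0 only after fsync() on dst has succeeded.
 * Returns -1 on any error (errno is set by the failing call). */
int copy_and_flush(const char *src, const char *dst)
{
    char buf[65536];
    ssize_t n = 0;
    int in, out, rc = -1;

    assert(src != NULL && dst != NULL);
    if ((in = open(src, O_RDONLY)) == -1)
        return -1;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (n > 0) {                 /* write() may be partial */
            ssize_t w = write(out, p, (size_t)n);
            if (w == -1)
                goto done;
            p += w;
            n -= w;
        }
    }
    /* n == 0 means clean EOF; n == -1 means read() failed */
    if (n == 0 && fsync(out) == 0)
        rc = 0;
done:
    close(in);
    if (close(out) == -1)
        rc = -1;
    return rc;
}
```

    A caller would treat a return value of 0 as "the OS has handed the
    data to the device", which is the most a user program can observe.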

  4. Re: Buffers, disk cache, and validating data

    In article , Richard B. Gilbert wrote:
    > sinister wrote:
    >
    >>If I want to validate data copy operations using commands like cmp, dircmp,
    >>etc, how do I know I'm actually comparing data on disk, given that the data
    >>might be in a buffer or disk cache?
    >>

    > Well, you could follow your copy operation with a couple of "sync" commands.
    >
    > If you are in that much doubt, however, you might want to consider using
    > an operating system like OpenVMS that actually commits data to disk
    > before reporting the I/O as complete; it may be slower but there's no
    > doubt about whether or not your data is on disk!


    As high of an opinion of OpenVMS's default setting as I have, I should
    point out that it's only half the story, though.

    That approach only says it's on disk, but not if it's correctly written out.

    There was a recent thread on that particular point, and I eventually
    conceded the good points raised there: ultimately, once the data has been
    written out to disk, one essentially has to hope it was the exact same data.

    (Places where data can get corrupted: in memory prior to the write or
    cabling problems.)

    The problem is that if you had hardware issues causing silent
    corruption, then how do you reliably detect it (aside from admin
    monitoring of /var/adm/messages or fmdump), since data comparisons would
    involve the same busted hardware components?

    Hence, there isn't really much you can do in that particular situation
    (from the perspective of the writer), except to cross fingers and just
    really hope data was correctly written out and matches the source data.

    ZFS, possibly in Solaris 10 Update 2, alleviates that to some degree
    with its checksummed data, as does careful monitoring of fmdump in
    Solaris 10 or /var/adm/messages in earlier OS releases for ECC errors or
    signs of cabling issues.

    -Dan
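
    The checksum idea can also be applied by hand: compute a checksum of
    the source data when it is produced, store it separately, and compare
    it with a checksum of whatever is later read back. Below is a minimal
    sketch using a bitwise CRC-32. That choice is an assumption made for
    brevity (ZFS uses its own checksum algorithms), and crc32_update is a
    hypothetical name. Like any check done on the same machine, it cannot
    rule out corruption that happens before the first checksum is taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zip/gzip).
 * Start with crc = 0 and feed successive chunks to checksum a stream. */
uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            /* shift right; XOR in the polynomial if the low bit was set */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

    Feeding the file through in chunks (passing the previous return value
    back in as crc) gives the same result as one call over the whole file.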

  5. Re: Buffers, disk cache, and validating data

    Dan Foster wrote:

    >In article , Richard B. Gilbert wrote:
    >>sinister wrote:
    >>>If I want to validate data copy operations using commands like cmp, dircmp,
    >>>etc, how do I know I'm actually comparing data on disk, given that the data
    >>>might be in a buffer or disk cache?
    >>
    >>Well, you could follow your copy operation with a couple of "sync" commands.
    >>
    >>If you are in that much doubt, however, you might want to consider using
    >>an operating system like OpenVMS that actually commits data to disk
    >>before reporting the I/O as complete; it may be slower but there's no
    >>doubt about whether or not your data is on disk!
    >
    >As high of an opinion of OpenVMS's default setting as I have, I should
    >point out that it's only half the story, though.
    >
    >That approach only says it's on disk, but not if it's correctly written out.

    But the OP merely wanted to be certain that the data he was comparing
    was being read back from disk rather than from a buffer in memory.

  6. Re: Buffers, disk cache, and validating data

    In article <43a6c604@212.67.96.135>,
    Dave writes:
    > sinister wrote:
    >> If I want to validate data copy operations using commands like cmp, dircmp,
    >> etc, how do I know I'm actually comparing data on disk, given that the data
    >> might be in a buffer or disk cache?
    >>
    >>

    >
    > Run sync first - just an idea, I don't know it is necessarily right.
    >
    > Although that might cause the OS to write the data to the "disk", I'm
    > not sure how you can be certain the data is not buffered on the disk
    > hardware, entirely independent of the operating system.
    >
    > Interesting question.


    Here's something I've used sometimes if I wanted to know that data was
    being read from disk rather than from cached copies; "memtool"
    (playground.sun.com, I think) seems to confirm that it does purge it.
    (at least on Solaris 9 on UltraSPARC...)

    As to whether it works on any other platforms, it really depends on
    whether or not MADV_DONTNEED is available and works similarly.


    /*
     * freemap.c
     *
     * If nothing else is using a file that happens to be cached in memory,
     * this should cause it to be freed entirely rather than merely left on
     * the freelist to be reclaimed.
     */

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/mman.h>

    /* Like perror(), but with up to two extra "prefix: " strings. */
    void perror3(const char *s1, const char *s2, const char *s3)
    {
        static const char colon_space[] = { ':', ' ' };

        if (s1 != NULL && *s1 != '\0') {
            write(2, s1, strlen(s1));
            write(2, colon_space, sizeof colon_space);
        }
        if (s2 != NULL && *s2 != '\0') {
            write(2, s2, strlen(s2));
            write(2, colon_space, sizeof colon_space);
        }
        perror(s3);
    }

    int main(int argc, char **argv)
    {
        int fd, x;
        struct stat s;
        caddr_t rval;

        /* Process each file named on the command line. */
        for (x = 1; x < argc; x++) {
            if ((fd = open(argv[x], O_RDONLY)) == -1) {
                perror3(argv[0], argv[x], "can't open file");
                continue;
            }
            if (fstat(fd, &s) == -1) {
                perror3(argv[0], argv[x], "can't obtain file attributes");
                close(fd);
                continue;
            }
            if (!S_ISREG(s.st_mode)) {
                fprintf(stderr, "%s: %s: not a regular file\n", argv[0], argv[x]);
                close(fd);
                continue;
            }
            if (s.st_size > 0) {
                if ((rval = mmap(NULL, s.st_size, PROT_READ, MAP_SHARED, fd, (off_t)0))
                    == MAP_FAILED) {
                    perror3(argv[0], argv[x], "can't map file into virtual memory");
                    close(fd);
                    continue;
                }
                else {
                    /* The mapping stays valid after the descriptor is closed. */
                    close(fd);
                    /* Tell the VM system the pages are no longer needed. */
                    madvise(rval, s.st_size, MADV_DONTNEED);
                    munmap(rval, s.st_size);
                }
            }
            else {
                /* Empty file: nothing to map, but don't leak the descriptor. */
                close(fd);
            }
        }
        return 0;
    }

    --
    mailto:rlhamil@smart.net http://www.smart.net/~rlhamil

    Lasik/PRK theme music:
    "In the Hall of the Mountain King", from "Peer Gynt"

  7. Re: Buffers, disk cache, and validating data

    sinister wrote:
    > If I want to validate data copy operations using commands like cmp, dircmp,
    > etc, how do I know I'm actually comparing data on disk, given that the data
    > might be in a buffer or disk cache?


    It's not documented that it does this, but my own experience leads me to
    believe that "lockfs -fa" will invalidate the cache. I'm not sure of this,
    but based on performance, it seems that lots of things have to be loaded
    after a "lockfs -fa" that would've been in cache otherwise.

    By the way, I'm not really sure that "sync" will be sufficient. It writes
    everything to disk, but that does not ensure that the cached stuff in
    RAM is invalidated. And if it's not invalidated, then when you go to
    read it again, you will be reading it from cache rather than disk, so
    you aren't verifying that the data can really be read from disk.

    - Logan
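
    One way to act on this, on systems that provide posix_fadvise(), is to
    flush the file and then hint that its cached pages are no longer needed
    before running cmp. This is a minimal sketch: flush_and_drop_cache is a
    hypothetical name, the hint is advisory so the kernel is free to ignore
    it, and whether it has any effect on a given Solaris release is an
    assumption that nothing in this thread confirms.

```c
#define _POSIX_C_SOURCE 200112L   /* for posix_fadvise() */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Flush dirty pages of `path` to the device, then hint the kernel to
 * drop its cached pages. Returns 0 on success, -1 on failure.
 * The hint is advisory: the kernel may keep the pages anyway. */
int flush_and_drop_cache(const char *path)
{
    int fd, rc = -1;

    assert(path != NULL);
    /* Assumption: fsync() is accepted on a read-only descriptor, as it is
     * on Linux; some systems may require the file to be opened for writing. */
    if ((fd = open(path, O_RDONLY)) == -1)
        return -1;
    /* Dirty pages cannot be dropped, so write them out first. */
    if (fsync(fd) == 0 &&
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) == 0)  /* len 0 = to EOF */
        rc = 0;
    close(fd);
    return rc;
}
```

    Calling this on both the source and the copy before the comparison makes
    it more likely, though not certain, that cmp reads from the device.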
