linux performance puzzle - Unix

  1. linux performance puzzle

    Hi all,

    I have a program that reads 100000 lines from stdin, performs
    processing in two worker threads (there are two processors on my
    system), and outputs the result into a file. To make the test
    cleaner, I excluded all file operations by reading from a pipe
    (created by another small program I wrote), and redirecting the output
    into /dev/null.

    Now the puzzle: while it generally takes about 11.5 sec to execute my
    test, the first run after rebuild takes _significantly_ more,
    sometimes 19 sec, sometimes 15, etc. All the subsequent runs
    consistently take around 11.5 sec.

    I noticed that I actually need to rebuild -- just touching the
    executable does not have the same effect.

    I am not using any shared objects that need to be downloaded over the
    network.

    My system is:
    Red Hat Enterprise Linux WS release 4 (Nahant Update 2)

    Can anybody suggest any explanation of this?

    Thanks in advance,
    Arkadiy

  2. Re: linux performance puzzle

    In article <83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
    Arkadiy writes:
    >Hi all,
    >
    >I have a program that reads 100000 lines from stdin, performs
    >processing in two worker threads (there are two processors on my
    >system), and outputs the result into a file. To make the test
    >cleaner, I excluded all file operations by reading from a pipe
    >(created by another small program I wrote), and redirecting the output
    >into /dev/null.
    >
    >Now the puzzle: while it generally takes about 11.5 sec to execute my
    >test, the first run after rebuild takes _significantly_ more,
    >sometimes 19 sec, sometimes 15, etc. All the subsequent runs
    >consistently take around 11.5 sec.


    This may or may not be the case for you, but I've run into this
    simply as a matter of disk caching, either of the app/libraries or
    of the test data. The first run has to load everything from disk.
    Subsequent runs may already have everything in memory.


    --
    Drew Lawson


  3. Re: linux performance puzzle

    drew@furrfu.invalid (Drew Lawson) writes:
    >In article <83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
    > Arkadiy writes:
    >>Hi all,
    >>
    >>I have a program that reads 100000 lines from stdin, performs
    >>processing in two worker threads (there are two processors on my
    >>system), and outputs the result into a file. To make the test
    >>cleaner, I excluded all file operations by reading from a pipe
    >>(created by another small program I wrote), and redirecting the output
    >>into /dev/null.
    >>
    >>Now the puzzle: while it generally takes about 11.5 sec to execute my
    >>test, the first run after rebuild takes _significantly_ more,
    >>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
    >>consistently take around 11.5 sec.

    >
    >May or may not be the case for you, but I've run into this just as
    >a matter of disk cacheing, either of the app/libraries or of the
    >test data. The first run has to load everything. Subsequent runs
    >may already have everything in memory.
    >


    Yup. And his compile is trashing the disk/file cache, resulting
    in the cold-cache startup penalty after a rebuild.

    scott

  4. Re: linux performance puzzle

    >I have a program that reads 100000 lines from stdin, performs
    >processing in two worker threads (there are two processors on my
    >system), and outputs the result into a file. To make the test
    >cleaner, I excluded all file operations by reading from a pipe
    >(created by another small program I wrote), and redirecting the output
    >into /dev/null.
    >
    >Now the puzzle: while it generally takes about 11.5 sec to execute my
    >test, the first run after rebuild takes _significantly_ more,
    >sometimes 19 sec, sometimes 15, etc. All the subsequent runs
    >consistently take around 11.5 sec.


    Do you get the same penalty after rebuilding _something else_, say
    a Linux kernel (don't install it, just build it)?
    Do you get the same penalty after rebooting?
    Do you get the same penalty after *NOT* running that program for
    a whole week, then running it?

    The disk cache is probably working well after the first run of
    the program, but the first run has to load everything.
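
The experiments suggested above all come down to timing the same command several times in a row and comparing the first run with the later ones. A minimal sketch of such a harness follows. The command it times is a hypothetical stand-in (a trivial Python one-liner), not the original poster's program, and the function name `time_runs` is made up for illustration.

```python
import subprocess
import sys
import time


def time_runs(cmd, runs=3):
    """Run cmd `runs` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded, much like redirecting to /dev/null.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Hypothetical stand-in command; substitute the real test program.
    cmd = [sys.executable, "-c", "pass"]
    for i, secs in enumerate(time_runs(cmd), start=1):
        print(f"run {i}: {secs:.3f} s")
```

If the first timing is consistently the largest after a rebuild, a reboot, or a long idle period, that would be consistent with a cold-cache effect, though it does not prove it.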


  5. Re: linux performance puzzle

    scott@slp53.sl.home (Scott Lurndal) wrote:

    > drew@furrfu.invalid (Drew Lawson) writes:
    > >In article
    > ><83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
    > > Arkadiy writes:
    > >>Hi all,
    > >>
    > >>I have a program that reads 100000 lines from stdin, performs
    > >>processing in two worker threads (there are two processors on my
    > >>system), and outputs the result into a file. To make the test
    > >>cleaner, I excluded all file operations by reading from a pipe
    > >>(created by another small program I wrote), and redirecting the output
    > >>into /dev/null.
    > >>
    > >>Now the puzzle: while it generally takes about 11.5 sec to execute my
    > >>test, the first run after rebuild takes _significantly_ more,
    > >>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
    > >>consistently take around 11.5 sec.

    > >
    > >May or may not be the case for you, but I've run into this just as
    > >a matter of disk cacheing, either of the app/libraries or of the
    > >test data. The first run has to load everything. Subsequent runs
    > >may already have everything in memory.
    > >

    >
    > yup. and his compile is trashing the disk/file cache, resulting
    > in the cold-cache startup penalty after rebuild.


    And the reason the "touch" command doesn't have the same effect is
    that the VM system works at the page level. It actually notices whether
    individual pages of a file have been modified since they were cached.
    The touch command simply updates the file's timestamp, but doesn't
    actually dirty any of the pages, so the VM system is smart enough to
    know that the cache is still valid.

    --
    Barry Margolin, barmar@alum.mit.edu
    Arlington, MA
    *** PLEASE post questions in newsgroups, not directly to me ***
    *** PLEASE don't copy me on replies, I'll read them in the group ***
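
The point about "touch" changing only metadata can be checked from user space. The sketch below is an illustration under stated assumptions: it emulates "touch" with `os.utime`, uses a throwaway temporary file rather than a real executable, and the helper name `touch_and_compare` is hypothetical. It shows only that the timestamp changes while the bytes and inode stay the same; it does not inspect the kernel's page cache.

```python
import hashlib
import os
import tempfile


def touch_and_compare(path):
    """Bump a file's timestamps and report what changed."""
    def digest():
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    before_stat, before_hash = os.stat(path), digest()
    # Move atime/mtime one hour ahead; the file's data is not written.
    new_time = before_stat.st_mtime + 3600
    os.utime(path, (new_time, new_time))
    after_stat, after_hash = os.stat(path), digest()
    return {
        "mtime_changed": after_stat.st_mtime != before_stat.st_mtime,
        "content_changed": after_hash != before_hash,
        "same_inode": after_stat.st_ino == before_stat.st_ino,
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"pretend this is an executable\n")
    try:
        print(touch_and_compare(tmp.name))
    finally:
        os.remove(tmp.name)
```

A rebuild, by contrast, writes new file contents, so the pages the kernel cached for the old binary no longer describe the file being run.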

  6. Re: linux performance puzzle

    Hello Drew, thanks for your response.

    > May or may not be the case for you, but I've run into this just as
    > a matter of disk cacheing, either of the app/libraries or of the
    > test data. The first run has to load everything. Subsequent runs
    > may already have everything in memory.


    Kind of hard to believe that loading app/libraries can take up to 7
    sec... and I don't use any disk IO in my test -- getting data from
    the pipe (the other program writes into the pipe directly from memory)
    and outputting into /dev/null.

    Or maybe I am missing something...

    Regards,
    Arkadiy

  7. Re: linux performance puzzle

    Hi Scott,

    > >May or may not be the case for you, but I've run into this just as
    > >a matter of disk cacheing, either of the app/libraries or of the
    > >test data. The first run has to load everything. Subsequent runs
    > >may already have everything in memory.

    >
    > yup. and his compile is trashing the disk/file cache, resulting
    > in the cold-cache startup penalty after rebuild.


    I am not using any disk IO in my test (reading from a pipe, writing to
    /dev/null). Is disk/file cache still involved?

    Regards,
    Arkadiy

  8. Re: linux performance puzzle

    On Wed, 22 Oct 2008 06:00:19 -0700, Arkadiy wrote:

    > Hi Scott,
    >
    >> >May or may not be the case for you, but I've run into this just as a
    >> >matter of disk cacheing, either of the app/libraries or of the test
    >> >data. The first run has to load everything. Subsequent runs may
    >> >already have everything in memory.

    >>
    >> yup. and his compile is trashing the disk/file cache, resulting in the
    >> cold-cache startup penalty after rebuild.

    >
    > I am not using any disk IO in my test (reading from a pipe, writing to
    > dev/null). Is disk/file cache still involved?


    Well, you said your program reads 100000 lines from stdin.
    Do you type that fast?

    AvK

  9. Re: linux performance puzzle

    Arkadiy writes:
    >Hello Drew, thanks for your response.
    >
    >> May or may not be the case for you, but I've run into this just as
    >> a matter of disk cacheing, either of the app/libraries or of the
    >> test data. The first run has to load everything. Subsequent runs
    >> may already have everything in memory.

    >
    >Kind of hard to believe that loading app/libraries can take up to 7
    >sec... and I don't use any disk IO in my test -- getting data from
    >the pipe (the other program writes into the pipe directly from memory)
    >and outputting into /dev/null.


    Depends on the machine, the stuff being loaded and other factors.
    At work, I work on a web server application. It grabs page
    templates, loads assorted libraries as needed and interacts with
    the back-end servers. When I first hit it after a restart, it is
    probably a good 4 seconds slower than making the same request later.

    I'm sure it is faster on the production boxes, but they don't let
    me play over there very often.


    --
    |Drew Lawson | Mrs. Tweedy! |
    | | The chickens are revolting! |

  10. Re: linux performance puzzle

    On Oct 22, 9:05 am, Moi wrote:

    > > I am not using any disk IO in my test (reading from a pipe, writing to
    > > dev/null). Is disk/file cache still involved?

    >
    > Well, you said your program reads 100000 lines from stdin.
    > Do you type that fast ?


    Stdin doesn't have to be a disk or keyboard, does it? I wrote a little
    program that writes 100000 lines to stdout (from memory), and I pipe
    its output into my test.

    Regards,
    Arkadiy
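
For readers who want to reproduce this setup, here is a rough sketch of feeding lines to a program through a pipe with no data file involved. It is not the poster's generator: the line contents, the helper name `feed_lines`, and the consumer (a Python one-liner that counts lines) are all assumptions made for illustration.

```python
import subprocess
import sys


def feed_lines(consumer_cmd, count=100000):
    """Pipe `count` in-memory lines into consumer_cmd; return its stdout."""
    # Build the whole payload in memory, so no input file is ever read.
    payload = "".join(f"line {i}\n" for i in range(count)).encode()
    result = subprocess.run(
        consumer_cmd, input=payload, stdout=subprocess.PIPE, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical consumer that just counts the lines it receives.
    counter = [sys.executable, "-c",
               "import sys; print(sum(1 for _ in sys.stdin))"]
    print(feed_lines(counter).decode().strip())
```

Note that even in this arrangement both executables, and any libraries they use, still have to be read from disk the first time they run.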

  11. Re: linux performance puzzle

    On Wed, 22 Oct 2008 07:36:56 -0700, Arkadiy wrote:


    > Stdin doesn't have to be a disk or keyboard does it? I wrote a little
    > program that writes 100000 lines into stdout (from memory), and pipe its
    > result into my test.
    >


    Sorry, I overlooked that part. My bad ...

    Anyway, for your two programs to be loaded and executed, memory is needed.
    If your amount of memory is small, the compile will have claimed most or
    all of your available memory, including disk buffers for the compiler
    itself, source files, objects, executable, libraries, and linker.

    If you *compile* the program twice, the second compile will probably
    take less time, too.

    HTH,
    AvK
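
One way to see how much memory the kernel is devoting to buffers and cached file data before and after a compile is to read /proc/meminfo on Linux. The following is a small sketch, not a diagnostic tool: the function names are hypothetical, and it assumes the usual "Name:   value kB" line layout of that file.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number.

    Most fields are reported in kB; a few are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def cache_summary(path="/proc/meminfo"):
    """Return the fields most relevant to the disk cache discussion."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    keys = ("MemTotal", "MemFree", "Buffers", "Cached")
    return {key: info.get(key) for key in keys}


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        print(cache_summary())
    else:
        print("/proc/meminfo not available on this system")
```

Comparing the summary taken just before a build with the one taken just after gives a rough picture of how the build shifted memory between free pages and cache.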

  12. Re: linux performance puzzle

    >> >May or may not be the case for you, but I've run into this just as
    >> >a matter of disk cacheing, either of the app/libraries or of the
    >> >test data. The first run has to load everything. Subsequent runs
    >> >may already have everything in memory.

    >>
    >> yup. and his compile is trashing the disk/file cache, resulting
    >> in the cold-cache startup penalty after rebuild.

    >
    >I am not using any disk IO in my test (reading from a pipe, writing to
    >dev/null). Is disk/file cache still involved?


    Are you running the *program* from disk? It has to be loaded
    into memory to run.

