linux performance puzzle - Unix
-
linux performance puzzle
Hi all,
I have a program that reads 100000 lines from stdin, performs
processing in two worker threads (there are two processors on my
system), and outputs the result into a file. To make the test
cleaner, I excluded all file operations by reading from a pipe
(created by another small program I wrote), and redirecting the output
into /dev/null.
Now the puzzle: while it generally takes about 11.5 sec to execute my
test, the first run after rebuild takes _significantly_ more,
sometimes 19 sec, sometimes 15, etc. All the subsequent runs
consistently take around 11.5 sec.
I noted that I actually need to rebuild -- just touching the
executable does not have the same effect.
I am not using any shared objects that need to be downloaded over the
network.
My system is:
Red Hat Enterprise Linux WS release 4 (Nahant Update 2)
Can anybody suggest any explanation of this?
Thanks in advance,
Arkadiy
-
Re: linux performance puzzle
In article <83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
Arkadiy writes:
>Hi all,
>
>I have a program that reads 100000 lines from stdin, performs
>processing in two worker threads (there are two processors on my
>system), and outputs the result into a file. To make the test
>cleaner, I excluded all file operations by reading from a pipe
>(created by another small program I wrote), and redirecting the output
>into /dev/null.
>
>Now the puzzle: while it generally takes about 11.5 sec to execute my
>test, the first run after rebuild takes _significantly_ more,
>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
>consistently take around 11.5 sec.
May or may not be the case for you, but I've run into this just as
a matter of disk caching, either of the app/libraries or of the
test data. The first run has to load everything. Subsequent runs
may already have everything in memory.
--
Drew Lawson
In Dr. Johnson's famous dictionary patriotism is defined as the
last resort of the scoundrel. With all due respect to an enlightened
-
Re: linux performance puzzle
drew@furrfu.invalid (Drew Lawson) writes:
>In article <83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
> Arkadiy writes:
>>Hi all,
>>
>>I have a program that reads 100000 lines from stdin, performs
>>processing in two worker threads (there are two processors on my
>>system), and outputs the result into a file. To make the test
>>cleaner, I excluded all file operations by reading from a pipe
>>(created by another small program I wrote), and redirecting the output
>>into /dev/null.
>>
>>Now the puzzle: while it generally takes about 11.5 sec to execute my
>>test, the first run after rebuild takes _significantly_ more,
>>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
>>consistently take around 11.5 sec.
>
>May or may not be the case for you, but I've run into this just as
>a matter of disk cacheing, either of the app/libraries or of the
>test data. The first run has to load everything. Subsequent runs
>may already have everything in memory.
>
yup. and his compile is trashing the disk/file cache, resulting
in the cold-cache startup penalty after rebuild.
scott
-
Re: linux performance puzzle
>I have a program that reads 100000 lines from stdin, performs
>processing in two worker threads (there are two processors on my
>system), and outputs the result into a file. To make the test
>cleaner, I excluded all file operations by reading from a pipe
>(created by another small program I wrote), and redirecting the output
>into /dev/null.
>
>Now the puzzle: while it generally takes about 11.5 sec to execute my
>test, the first run after rebuild takes _significantly_ more,
>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
>consistently take around 11.5 sec.
Do you get the same penalty after rebuilding _something else_, say
a Linux kernel (don't install it, just build it)?
Do you get the same penalty after rebooting?
Do you get the same penalty after *NOT* running that program for
a whole week, then running it?
The disk cache is probably working well after the first run of
the program, but the first run has to load everything.
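The questions above boil down to timing the same command several times under different starting conditions (right after a rebuild, after an unrelated build, after a reboot). A minimal sketch of such a timing loop follows; the helper name `time_runs` is made up for illustration, and it only measures wall-clock time, not where the time goes.

```python
import subprocess
import time


def time_runs(cmd, runs=3):
    """Run cmd several times and return the wall-clock duration of each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output goes to /dev/null, as in the original test setup.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        durations.append(time.perf_counter() - start)
    return durations
```

If the first entry of the returned list is consistently larger than the rest only after a rebuild, that points at something being reloaded on the first run rather than at the processing itself.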
-
Re: linux performance puzzle
In article ,
scott@slp53.sl.home (Scott Lurndal) wrote:
> drew@furrfu.invalid (Drew Lawson) writes:
> >In article
> ><83fab499-81a3-4c52-8d47-a57f26286ae4@t39g2000prh.googlegroups.com>
> > Arkadiy writes:
> >>Hi all,
> >>
> >>I have a program that reads 100000 lines from stdin, performs
> >>processing in two worker threads (there are two processors on my
> >>system), and outputs the result into a file. To make the test
> >>cleaner, I excluded all file operations by reading from a pipe
> >>(created by another small program I wrote), and redirecting the output
> >>into /dev/null.
> >>
> >>Now the puzzle: while it generally takes about 11.5 sec to execute my
> >>test, the first run after rebuild takes _significantly_ more,
> >>sometimes 19 sec, sometimes 15, etc. All the subsequent runs
> >>consistently take around 11.5 sec.
> >
> >May or may not be the case for you, but I've run into this just as
> >a matter of disk cacheing, either of the app/libraries or of the
> >test data. The first run has to load everything. Subsequent runs
> >may already have everything in memory.
> >
>
> yup. and his compile is trashing the disk/file cache, resulting
> in the cold-cache startup penalty after rebuild.
And the reason the "touch" command doesn't have the same effect is that
the VM system works at the page level. It actually notices whether
individual pages of a file have been modified since they were cached.
The touch command simply updates the file's timestamp, but doesn't
actually dirty any of the pages, so the VM system is smart enough to
know that the cache is still valid.
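The difference between touching and rebuilding can be seen from user space. The sketch below assumes that a rebuild writes a fresh output file and moves it into place (how a particular linker does this is an assumption here, not something stated in the thread); the function name `touch_vs_rewrite` is hypothetical.

```python
import os
import tempfile


def touch_vs_rewrite():
    """Compare what changes after a touch and after writing a fresh file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "prog")
        with open(path, "wb") as f:
            f.write(b"binary contents v1")
        before = os.stat(path)

        # Equivalent of `touch`: set a new timestamp, leave the data alone.
        os.utime(path, (before.st_atime, before.st_mtime + 10))
        touched = os.stat(path)

        # Stand-in for a rebuild: write a new file and move it into place.
        tmp = path + ".new"
        with open(tmp, "wb") as f:
            f.write(b"binary contents v2")
        os.replace(tmp, path)
        rebuilt = os.stat(path)

        return {
            "touch_changed_mtime": touched.st_mtime != before.st_mtime,
            "touch_kept_inode": touched.st_ino == before.st_ino,
            "rebuild_kept_inode": rebuilt.st_ino == before.st_ino,
        }
```

After the touch, the path still names the same file with the same data, so pages already cached for it remain usable. After the stand-in rebuild, the path names a different file, and nothing cached for the old one applies to it.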
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
-
Re: linux performance puzzle
Hello Drew, thanks for your response.
> May or may not be the case for you, but I've run into this just as
> a matter of disk cacheing, either of the app/libraries or of the
> test data. The first run has to load everything. Subsequent runs
> may already have everything in memory.
Kind of hard to believe that loading app/libraries can take up to 7
sec... and I don't use any disk IO in my test -- getting data from
the pipe (the other program writes into the pipe directly from memory)
and outputting into /dev/null.
Or maybe I am missing something...
Regards,
Arkadiy
-
Re: linux performance puzzle
Hi Scott,
> >May or may not be the case for you, but I've run into this just as
> >a matter of disk cacheing, either of the app/libraries or of the
> >test data. The first run has to load everything. Subsequent runs
> >may already have everything in memory.
>
> yup. and his compile is trashing the disk/file cache, resulting
> in the cold-cache startup penalty after rebuild.
I am not using any disk IO in my test (reading from a pipe, writing to
/dev/null). Is disk/file cache still involved?
Regards,
Arkadiy
-
Re: linux performance puzzle
On Wed, 22 Oct 2008 06:00:19 -0700, Arkadiy wrote:
> Hi Scott,
>
>> >May or may not be the case for you, but I've run into this just as a
>> >matter of disk cacheing, either of the app/libraries or of the test
>> >data. The first run has to load everything. Subsequent runs may
>> >already have everything in memory.
>>
>> yup. and his compile is trashing the disk/file cache, resulting in the
>> cold-cache startup penalty after rebuild.
>
> I am not using any disk IO in my test (reading from a pipe, writing to
> dev/null). Is disk/file cache still involved?
Well, you said your program reads 100000 lines from stdin.
Do you type that fast?
AvK
-
Re: linux performance puzzle
In article
Arkadiy writes:
>Hello Drew, thanks for your response.
>
>> May or may not be the case for you, but I've run into this just as
>> a matter of disk cacheing, either of the app/libraries or of the
>> test data. The first run has to load everything. Subsequent runs
>> may already have everything in memory.
>
>Kind of hard to believe that loading app/libraries can take up to 7
>sec... and I don't use any disk IO in my test -- getting data from
>the pipe (the other program writes into the pipe directly from memory)
>and outputting into /dev/null.
Depends on the machine, the stuff being loaded and other factors.
At work, I work on a web server application. That grabs page
templates, loads assorted libraries as needed and interacts with
the back-end servers. When I first hit it after a restart, it is
probably a good 4 seconds slower than making the same request later.
I'm sure it is faster on the production boxes, but they don't let
me play over there very often.
--
|Drew Lawson | Mrs. Tweedy! |
| | The chickens are revolting! |
-
Re: linux performance puzzle
On Oct 22, 9:05 am, Moi wrote:
> > I am not using any disk IO in my test (reading from a pipe, writing to
> > dev/null). Is disk/file cache still involved?
>
> Well, you said your program reads 100000 lines from stdin.
> Do you type that fast ?
Stdin doesn't have to be a disk or keyboard, does it? I wrote a little
program that writes 100000 lines into stdout (from memory), and pipe
its result into my test.
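For readers who want to reproduce this setup, a rough sketch of such a feeder follows. It is not the program used in the thread; the function name `feed_lines` and the line contents are invented for illustration. It builds the lines in memory, pipes them to a consumer command, and discards the consumer's output.

```python
import subprocess


def feed_lines(consumer_cmd, n_lines=100000):
    """Pipe n_lines generated in memory into consumer_cmd; discard its output."""
    # The whole payload is built in memory, so no input file is read.
    payload = "".join(f"line {i}\n" for i in range(n_lines)).encode()
    proc = subprocess.run(consumer_cmd, input=payload,
                          stdout=subprocess.DEVNULL)
    return proc.returncode
```

Note that this removes file I/O for the test data only. The consumer executable and its libraries still have to be read from disk the first time they are run.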
Regards,
Arkadiy
-
Re: linux performance puzzle
On Wed, 22 Oct 2008 07:36:56 -0700, Arkadiy wrote:
> Stdin doesn't have to be a disk or keyboard does it? I wrote a little
> program that writes 100000 lines into stdout (from memory), and pipe its
> result into my test.
>
Sorry, I overlooked that part. My bad...
Anyway, for your two programs to be loaded and executed, memory is needed.
If your amount of memory is small, the compile will have claimed most or
all of your available memory, including disk buffers for the compiler
itself, source files, objects, executable, libraries, linker.
If you *compiled* the program twice, the second compile would probably
take less time, too.
HTH,
AvK
-
Re: linux performance puzzle
>> >May or may not be the case for you, but I've run into this just as
>> >a matter of disk cacheing, either of the app/libraries or of the
>> >test data. The first run has to load everything. Subsequent runs
>> >may already have everything in memory.
>>
>> yup. and his compile is trashing the disk/file cache, resulting
>> in the cold-cache startup penalty after rebuild.
>
>I am not using any disk IO in my test (reading from a pipe, writing to
>dev/null). Is disk/file cache still involved?
Are you running the *program* from disk? It has to be loaded
into memory to run.
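One way to get a feel for the cost of loading a file from disk versus from the page cache is to read it, ask the kernel to drop its cached pages, and read it again. The sketch below assumes Linux and a Python build that exposes `os.posix_fadvise`; the helper names are made up, and the eviction request is only advisory, so the kernel may ignore it.

```python
import os
import time


def timed_read(path):
    """Read the whole file and return (bytes_read, seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        n = len(f.read())
    return n, time.perf_counter() - start


def drop_from_cache(path):
    """Ask the kernel to evict this file's cached pages (advisory)."""
    if not hasattr(os, "posix_fadvise"):
        return False  # not available on this platform
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages cannot be dropped, so flush first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True
```

Calling `drop_from_cache(path)` and then `timed_read(path)` twice on the executable under test gives a cold and a warm figure to compare against the extra seconds seen on the first run. If the cold read takes far less than the observed penalty, loading the binary alone probably does not explain it.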