[9fans] syscall spikes - Plan9

This is a discussion on [9fans] syscall spikes - Plan9 ; Hi, I've been having syscall spikes. It's really weird. There will be >40,000 in 1 second, (according to stats(8)) and it appears at a fairly regular interval. I've tried running kprof, but I'm not sure that it's running a global ...

+ Reply to Thread
Results 1 to 11 of 11

Thread: [9fans] syscall spikes

  1. [9fans] syscall spikes

    Hi,

    I've been having syscall spikes. It's really weird. There will be
    >40,000 in 1 second, (according to stats(8)) and it appears at a

    fairly regular interval. I've tried running kprof, but I'm not sure
    that it's running a global profile. Namely, when I come out of it, I
    have tons of time spent in ``halt'', with very, very few (<100)
    syscalls. Any ideas what I can do to find the culprit? It's gapping
    sound playback, which is really annoying.

    --dho

  2. Re: [9fans] syscall spikes

    fossil

    On 4/17/07, Devon H. O'Dell wrote:
    > Hi,
    >
    > I've been having syscall spikes. It's really weird. There will be
    > >40,000 in 1 second, (according to stats(8)) and it appears at a

    > fairly regular interval. I've tried running kprof, but I'm not sure
    > that it's running a global profile. Namely, when I come out of it, I
    > have tons of time spent in ``halt'', with very, very few (<100)
    > syscalls. Any ideas what I can do to find the culprit? It's gapping
    > sound playback, which is really annoying.
    >
    > --dho
    >


  3. Re: [9fans] syscall spikes

    2007/4/17, Uriel :
    > fossil


    I attempted to tprof fossil and I don't get any useful output. I even did:

    for (i in `{ps a | grep fossil | awk '{ print $2 }'})
    echo profile > /proc/$i/ctl

    [wait a while for spike]

    for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
    echo $i
    tprof $i
    }

    All I get is a bunch of lines saying ``total: 0''.

    What'm I missing?

    --dho

    > On 4/17/07, Devon H. O'Dell wrote:
    > > Hi,
    > >
    > > I've been having syscall spikes. It's really weird. There will be
    > > >40,000 in 1 second, (according to stats(8)) and it appears at a

    > > fairly regular interval. I've tried running kprof, but I'm not sure
    > > that it's running a global profile. Namely, when I come out of it, I
    > > have tons of time spent in ``halt'', with very, very few (<100)
    > > syscalls. Any ideas what I can do to find the culprit? It's gapping
    > > sound playback, which is really annoying.
    > >
    > > --dho
    > >

    >


  4. Re: [9fans] syscall spikes

    2007/4/17, Devon H. O'Dell :
    > 2007/4/17, Uriel :
    > > fossil

    >
    > I attempted to tprof fossil and I don't get any useful output. I even did:
    >
    > for (i in `{ps a | grep fossil | awk '{ print $2 }'})
    > echo profile > /proc/$i/ctl
    >
    > [wait a while for spike]
    >
    > for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
    > echo $i
    > tprof $i
    > }
    >
    > All I get is a bunch of lines saying ``total: 0''.
    >
    > What'm I missing?


    I found the [undocumented] startsyscall argument to the process ctl
    file, but I cannot get it to work. If I echo startsyscall >
    /proc/N/ctl, I get ``echo: write error: bad process or channel control
    request''.

    What gives?

    --dho

    > --dho
    >
    > > On 4/17/07, Devon H. O'Dell wrote:
    > > > Hi,
    > > >
    > > > I've been having syscall spikes. It's really weird. There will be
    > > > >40,000 in 1 second, (according to stats(8)) and it appears at a
    > > > fairly regular interval. I've tried running kprof, but I'm not sure
    > > > that it's running a global profile. Namely, when I come out of it, I
    > > > have tons of time spent in ``halt'', with very, very few (<100)
    > > > syscalls. Any ideas what I can do to find the culprit? It's gapping
    > > > sound playback, which is really annoying.
    > > >
    > > > --dho
    > > >

    > >

    >


  5. Re: [9fans] syscall spikes

    it looks as though the process must be stopped first,
    which you possibly don't want to do to your primary fossil.

  6. Re: [9fans] syscall spikes

    are you running anything that faces the outside world that
    might be doing auth? if you don't have a /sys/log/auth, you
    might want to create it and watch it for any external attacks.

    i've found that to be a source of big syscall loads on some of
    our systems.

    - erik

    On Tue Apr 17 11:38:33 EDT 2007, devon.odell@gmail.com wrote:
    > 2007/4/17, Devon H. O'Dell :
    > > 2007/4/17, Uriel :
    > > > fossil

    > >
    > > I attempted to tprof fossil and I don't get any useful output. I even did:
    > >
    > > for (i in `{ps a | grep fossil | awk '{ print $2 }'})
    > > echo profile > /proc/$i/ctl
    > >
    > > [wait a while for spike]
    > >
    > > for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
    > > echo $i
    > > tprof $i
    > > }
    > >
    > > All I get is a bunch of lines saying ``total: 0''.
    > >
    > > What'm I missing?

    >
    > I found the [undocumented] startsyscall argument to the process ctl
    > file, but I cannot get it to work. If I echo startsyscall >
    > /proc/N/ctl, I get ``echo: write error: bad process or channel control
    > request''.
    >
    > What gives?
    >
    > --dho
    >
    > > --dho
    > >
    > > > On 4/17/07, Devon H. O'Dell wrote:
    > > > > Hi,
    > > > >
    > > > > I've been having syscall spikes. It's really weird. There will be
    > > > > >40,000 in 1 second, (according to stats(8)) and it appears at a


  7. Re: [9fans] syscall spikes

    2007/4/17, erik quanstrom :
    > are you running anything that faces the outside world that
    > might be doing auth? if you don't have a /sys/log/auth, you
    > might want to create it and watch it for any external attacks.
    >
    > i've found that to be a source of big syscall loads on some of
    > our systems.


    Nope, it's just a laptop on a network, and the spikes are very short
    (1s bursts), and happen possibly once per minute or so, but they're
    fairly evenly spaced, though not always.

    Charles: I dug around in the code and it appears that it actually
    stops the process and does nothing else. I can't see where any syscall
    trace information is actually generated from it. So it doesn't seem
    like a big problem that I can't get it to work, since it's not going
    to anyway (or what am I missing?).

    I'm considering creating a trace buffer of my own and doing some
    heuristics with the timer, where if there have been more than 10,000
    syscalls since the last tick, it starts logging them all. Because
    honestly, I can't find a better way to do it. Does anybody have any
    suggestions before I try this?

    > - erik


    --dho

    > On Tue Apr 17 11:38:33 EDT 2007, devon.odell@gmail.com wrote:
    > > 2007/4/17, Devon H. O'Dell :
    > > > 2007/4/17, Uriel :
    > > > > fossil
    > > >
    > > > I attempted to tprof fossil and I don't get any useful output. I even did:
    > > >
    > > > for (i in `{ps a | grep fossil | awk '{ print $2 }'})
    > > > echo profile > /proc/$i/ctl
    > > >
    > > > [wait a while for spike]
    > > >
    > > > for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
    > > > echo $i
    > > > tprof $i
    > > > }
    > > >
    > > > All I get is a bunch of lines saying ``total: 0''.
    > > >
    > > > What'm I missing?

    > >
    > > I found the [undocumented] startsyscall argument to the process ctl
    > > file, but I cannot get it to work. If I echo startsyscall >
    > > /proc/N/ctl, I get ``echo: write error: bad process or channel control
    > > request''.
    > >
    > > What gives?
    > >
    > > --dho
    > >
    > > > --dho
    > > >
    > > > > On 4/17/07, Devon H. O'Dell wrote:
    > > > > > Hi,
    > > > > >
    > > > > > I've been having syscall spikes. It's really weird. There will be
    > > > > > >40,000 in 1 second, (according to stats(8)) and it appears at a

    >


  8. Re: [9fans] syscall spikes

    Steve Simon suggested the problem was timesync. He seems to have been
    very correct in this suggestion. Though I wonder why it needs to do >
    60k syscalls :\.

    Additionally, using dd to play my cd instead of cat seems to work
    pretty well. dd -if a000 -of /dev/audio -bs 4192 seems to be enough.
    Additionally, using madplay like:

    madplay -o RAW:'02 - Birdhouse In Your Soul.mp3' | dd -of /dev/audio -bs 4192

    seems to work well.

    --dho

  9. Re: [9fans] syscall spikes

    2007/4/17, Devon H. O'Dell :
    > Steve Simon suggested the problem was timesync. He seems to have been
    > very correct in this suggestion. Though I wonder why it needs to do >
    > 60k syscalls :\.


    And my apologies for continuing additions to this thread and flooding
    the inboxes of you fine subscribers :\. It seems that somehow I ended
    up with 2 timesyncs running. This causes the spikes I was seeing -- a
    single timesync doesn't spike terribly badly.

    Should there be some check in timesync to prevent two from running?
    There should be no reason for this to ever occur.

    --dho

  10. Re: [9fans] syscall spikes

    > I've been having syscall spikes. It's really weird. There will be
    >>40,000 in 1 second, (according to stats(8)) and it appears at a

    > fairly regular interval. I've tried running kprof, but I'm not sure
    > that it's running a global profile. Namely, when I come out of it, I
    > have tons of time spent in ``halt'', with very, very few (<100)
    > syscalls. Any ideas what I can do to find the culprit? It's gapping
    > sound playback, which is really annoying.


    You figured out that this is caused by two timesyncs.

    However, it might not be caused by the timesyncs actually
    issuing that many system calls as much as fighting over
    what time means. So probably you're not having 40,000
    system calls but in fact the clock is being slowed/stopped
    in the kernel, which makes the devaudio sleep take too long.

    Russ


  11. Re: [9fans] syscall spikes

    2007/4/18, Russ Cox :
    > > I've been having syscall spikes. It's really weird. There will be
    > >>40,000 in 1 second, (according to stats(8)) and it appears at a

    > > fairly regular interval. I've tried running kprof, but I'm not sure
    > > that it's running a global profile. Namely, when I come out of it, I
    > > have tons of time spent in ``halt'', with very, very few (<100)
    > > syscalls. Any ideas what I can do to find the culprit? It's gapping
    > > sound playback, which is really annoying.

    >
    > You figured out that this is caused by two timesyncs.
    >
    > However, it might not be caused by the timesyncs actually
    > issuing that many system calls as much as fighting over
    > what time means. So probably you're not having 40,000
    > system calls but in fact the clock is being slowed/stopped
    > in the kernel, which makes the devaudio sleep take too long.


    Why would stats(8) say that there's anywhere between 40000 and (the
    highest I've seen) 80000 syscalls in that second? Is its notion of the
    second screwed up that much due to the contention?

    > Russ


    --dho

+ Reply to Thread