unixware 7.1.4 sendmail issues - SCO


  1. unixware 7.1.4 sendmail issues

    I just discovered that all mail (local, UUCP, and SMTP) is now stuck in
    mqueue for no apparent reason. Syslog shows entries such as those below.
    Any idea what my problem might be?



    Aug 30 16:07:02 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 36
    Aug 30 16:07:17 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 36
    Aug 30 16:07:37 casedrum sendmail[5839]: runqueue: Skipping queue run -- load average too high
    Aug 30 16:07:37 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 35



  2. Re: unixware 7.1.4 sendmail issues

    Ron Kirschner typed (on Thu, Aug 30, 2007 at 04:12:29PM -0400):
    | just discovered that no all mail local, uucp, and smtp is stuck in mqueue
    | for no apparent reason - syslog shows entries such as below:
    | any idea what my problem might be?
    |
    |
    |
    | Aug 30 16:07:02 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
    | Aug 30 16:07:17 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
    | Aug 30 16:07:37 casedrum sendmail[5839]: runqueue: Skipping queue run -- load average too high
    | Aug 30 16:07:37 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 35

    Please consider the possibility that the problem is exactly what you are
    being told: the load average is too high.

    --
    JP
    ==> http://www.frappr.com/cusm <==

  3. Re: unixware 7.1.4 sendmail issues

    On Aug 30, 1:30 pm, Jean-Pierre Radley wrote:
    > Ron Kirschner typed (on Thu, Aug 30, 2007 at 04:12:29PM -0400):
    > | just discovered that no all mail local, uucp, and smtp is stuck in mqueue
    > | for no apparent reason - syslog shows entries such as below:
    > | any idea what my problem might be?
    > |
    > |
    > |
    > | Aug 30 16:07:02 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
    > | Aug 30 16:07:17 casedrum sendmail[5839]: rejecting connections on daemon Daemon0 : load average: 36
    > | Aug 30 16:07:37 casedrum sendmail[5839]: runqueue: Skipping queue run -- load average too high
    > | Aug 30 16:07:37 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 35
    >
    > Please consider the possibility that the problem is exactly what you are
    > being told: the load average is too high.
    >
    > --
    > JP
    > ==>http://www.frappr.com/cusm<==


    By which my esteemed colleague means that it seems like a sendmail
    issue, not a Unixware issue. Sendmail's configuration file specifies
    load limits beyond which it will merely queue requests or reject them
    totally (QueueLA and RefuseLA, respectively). Not sure what
    Unixware's defaults are but except in a dedicated mail server they'd
    typically be well below the 35% range.

    If you're sure that the pending mail is legitimate (that is, your
    system isn't originating or relaying spam) you can force a temporary
    override with the appropriate sendmail -O option or permanently change
    the limits.
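
    A minimal sketch of the temporary override, assuming the binary lives at
    /usr/lib/sendmail and that one-off thresholds of 40 and 50 are acceptable
    (both numbers are illustrative guesses, not UnixWare defaults). It only
    prints the command so it can be reviewed and then run by hand as root:

```shell
#!/bin/sh
# Assumed values: choose thresholds above the current load average.
QUEUE_LA=40                  # hypothetical one-off QueueLA
REFUSE_LA=50                 # hypothetical one-off RefuseLA
SENDMAIL=/usr/lib/sendmail   # assumed path; adjust for your system

# -O name=value overrides a sendmail.cf option for this invocation only,
# -q processes the queue once, -v shows the delivery dialogue.
cmd="$SENDMAIL -O QueueLA=$QUEUE_LA -O RefuseLA=$REFUSE_LA -q -v"
echo "$cmd"
```

    The permanent change would be the matching "O QueueLA=" and
    "O RefuseLA=" lines in sendmail.cf, followed by a sendmail restart.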

    --RLR


  4. Re: unixware 7.1.4 sendmail issues

    In article , Ron Kirschner wrote:
    >just discovered that no all mail local, uucp, and smtp is stuck in mqueue
    >for no apparent reason - syslog shows entries such as below:
    >any idea what my problem might be?
    >
    >
    >
    >Aug 30 16:07:02 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 36
    >Aug 30 16:07:17 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 36
    >Aug 30 16:07:37 casedrum sendmail[5839]: runqueue: Skipping queue run -- load average too high
    >Aug 30 16:07:37 casedrum sendmail[5839]: rejecting connections on daemon Daemon0: load average: 35
    >
    >


    As others said there are settings in the sendmail.cf about what to
    do at certain load averages.

    The typical default [at least on my 8.14.1 version] is to only queue
    messages once the load average reaches 8.

    When the load average is 12 it refuses connections.
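
    To see what a given box is actually configured with, something like the
    following may help. It is only a sketch: the default path
    /etc/mail/sendmail.cf is an assumption (some systems use
    /etc/sendmail.cf), and show_la_limits is a made-up helper name.

```shell
#!/bin/sh
# Print the load-average limits from a sendmail.cf, including
# commented-out defaults (lines that begin with "#O").
show_la_limits() {
    cf=${1:-/etc/mail/sendmail.cf}   # assumed default location
    grep -E '^#?O (QueueLA|RefuseLA)=' "$cf"
}
```

    Run it as "show_la_limits /path/to/sendmail.cf". A line that is still
    commented out means the compiled-in default is in effect.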

    At this time the only thing that will happen is the machine
    will try to process what is in the queue. If this is large
    you may find that you are actually running in swap space
    as the system tries to manage the mess.

    I had this happen when a client got hit by one of the bots, and I
    wound up getting spams for them at the rate of 15/second, and I
    could barely log into the machine. That's the only problem I've
    found with a 100Mbit/sec connection into a 40Gb/sec tier 1 link.
    Later I could not login and it required going to the console
    [about 5 miles away at the colo] and logging in there, as the ssh
    login just wouldn't make it.

    At that point my load average was close to 500!!!!!

    There were over 100,000 messages in the queue, and it was trying
    to process all of those.

    So check your queue.
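
    A quick way to count what is sitting there, as a rough sketch. It assumes
    the queue directory is /var/spool/mqueue and relies on each queued message
    having one qf (control) file; count_queue is a hypothetical helper name.

```shell
#!/bin/sh
# Count queued messages by counting qf (control) files.
# Uses ls piped to grep rather than a wildcard, so a very large
# queue does not overflow the command line.
count_queue() {
    qdir=${1:-/var/spool/mqueue}   # assumed queue directory
    ls "$qdir" | grep -c '^qf'
}
```

    "mailq" gives the same information with per-message detail, but it can be
    painfully slow when the queue holds tens of thousands of entries.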

    I moved everything from the queue to another place, and then
    slowly worked through the queue getting the message IDs of
    anything destined for that client, and then removing them.

    That took about 5 hours as I didn't dare lose anything for
    other clients on the machine.

    So from what you have described, you have a mqueue that
    is full.

    Kill all of sendmail. Move the mqueue to something like
    mqueue-hold, recreate mqueue, and make the permissions the same
    as the original and restart sendmail.
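
    Roughly, the move could look like the sketch below. The queue path and
    the 0700 mode are assumptions (compare against "ls -ld" of the original),
    park_queue is a made-up name, and stopping and restarting sendmail is
    left as comments because the commands differ between systems.

```shell
#!/bin/sh
# Park the current queue and create an empty replacement.
park_queue() {
    q=${1:-/var/spool/mqueue}    # assumed queue directory
    hold="$q-hold"
    # 1. Stop every sendmail process first (system-specific, not shown).
    mv "$q" "$hold" || return 1
    mkdir "$q" || return 1
    chmod 700 "$q"               # assumption: original mode was 0700
    # As root, also match owner and group of the original with chown.
    ls -ld "$hold" "$q"          # eyeball that the two entries agree
    # 2. Restart sendmail (system-specific, not shown).
}
```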

    Then start working your way through the files in mqueue-hold.

    I had so many that no utilities would work easily so I
    wildcarded with *[0-5]0, *[6-9]0, *[0-5]1, and so on.

    I'd grep for the destination and direct that to a file with the
    message IDs so I could remove the ones I needed.
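
    One way to do that grep, as a sketch rather than the exact commands used
    at the time. It assumes recipient lines in the qf files start with "R"
    and that the message ID is the file name without the "qf" prefix;
    list_ids_for and the example domain are hypothetical. find is used
    instead of wildcards so the number of files does not matter.

```shell
#!/bin/sh
# Print the IDs of queued messages that have a recipient matching $2.
# $1: directory holding the parked queue files
# $2: regular expression for the destination (escape dots, e.g. client\.example)
list_ids_for() {
    dir=$1
    dest=$2
    find "$dir" -type f -name 'qf*' -exec grep -l "^R.*$dest" {} \; |
    while read -r f; do
        basename "$f" | sed 's/^qf//'
    done | sort
}
```

    For example, "list_ids_for /var/spool/mqueue-hold 'client\.example' > ids"
    collects the IDs in a file. Removing a message means deleting both its
    qf and df files.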

    Then when it was all over I moved the files from mqueue-hold
    back to mqueue.

    Lots of luck. If the above scenario is true, then you have
    a fair amount of work ahead of you.

    Bill
    --
    Bill Vermillion - bv @ wjv . com

  5. Re: unixware 7.1.4 sendmail issues


    ----- Original Message -----
    From: "ThreeStar"
    Newsgroups: comp.unix.sco.misc
    To:
    Sent: Thursday, August 30, 2007 6:38 PM
    Subject: Re: unixware 7.1.4 sendmail issues


    >
    > By which my esteemed colleague means that it seems like a sendmail
    > issue, not a Unixware issue. Sendmail's configuration file specifies
    > load limits beyond which it will merely queue requests or reject them
    > totally (QueueLA and RefuseLA, respectively). Not sure what
    > Unixware's defaults are but except in a dedicated mail server they'd
    > typically be well below the 35% range.
    >
    > If you're sure that the pending mail is legitimate (that is, your
    > system isn't originating or relaying spam) you can force a temporary
    > override with the appropriate sendmail -O option or permanently change
    > the limits.


    A load average of 35 is not 35% of anything. What it is is astronomical.
    Something somewhere on the box is running away like crazy, or has been
    building up over time. Forcing the mail to go might resolve it if it just
    happens to be the mail server itself that is running away and making the
    load average so high, but we don't know anything of the sort at this point.
    It might be a cron job that's running every night and hanging every night,
    thus adding 1 to the load average every day. Like a tape backup asking
    /dev/null for a second tape. Or it might be a one time fluke event like a
    report that generated a zillion emails or faxes or print jobs... Or it might
    be a pc on the network with a virus slamming some service it found on the
    server like the web server or mail or facetwin/samba/visiofs. Or maybe the
    machine has a raid array in a degraded state running very slow causing
    normal ops to pile up with no obvious problem processes to account for it.
    Or maybe... anything.
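
    For reference, the load average is roughly the number of processes that
    are running or waiting to run, averaged over time, so it is best compared
    with the number of CPUs rather than read as a percentage. A small sketch
    for pulling the first figure out of uptime follows; the output format
    varies between systems, so the pattern is an assumption, and load1 is a
    made-up name.

```shell
#!/bin/sh
# Read uptime-style text on stdin and print the first (1-minute)
# load average figure. Prints nothing if the text does not match.
load1() {
    sed -n 's/.*load average[s]*: *\([0-9.]*\).*/\1/p'
}
```

    Typical use is "uptime | load1". On a single-CPU box anything that stays
    well above 1 means work is piling up.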

    So the first thing is find out what are the 35 or 36 processes trying to run
    that the cpu can't find time to get to:

    Find cpu hogs:
    ps -eopcpu,pid,tty,args |sort -n
    The worst offenders will be at the bottom of the list, so don't worry about
    the ones that scrolled off the top.

    Or if you wanna get fancy yet can't install "top" from skunkware:

    -----------------
    #!/bin/ksh
    # ptop - quick-n-dirty top process display
    # usage: ptop [n]
    # shows top n cpu hogs. Default is 10.
    # brian@aljex.com

    N=${1:-10}
    while true ; do
        clear
        uptime
        echo "Top $N CPU Hogs..."
        echo "%CPU PID TTY COMMAND"
        ps -eopcpu,pid,tty,args | sort -rn | head -n "$N"
        sleep 1
    done
    -----------------

    Except the processes currently using the most cpu might not be offending
    anything. So just look at that but don't read too much into it yet. Do look
    at this though...

    Find any non-sleeping processes:
    ps -elf |awk '($2!="S"){print $0}'


    Any non sleepers, that are eating cpu, and that have been running a long
    time, are likely culprits.
    Non-sleepers that aren't eating cpu might just be normal stuff that's being
    held up by other stuff.
    Be careful what you kill and why.

    Brian K. White brian@aljex.com http://www.myspace.com/KEYofR
    +++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
    filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!


  6. Re: unixware 7.1.4 sendmail issues

    I found a bunch of processes, started by cron, running one of my programs
    that had an error. Once I understood that the load average meant system
    load, not sendmail's own load as I had originally assumed, I found and
    killed the hung jobs. Thanks for the help.
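
    A pile-up like that tends to show as one command name repeated many
    times in the process list. A minimal sketch for spotting it, assuming
    "ps -e -o comm=" prints one command name per line on your system;
    count_by_command is a hypothetical helper name.

```shell
#!/bin/sh
# Read one command name per line on stdin and print "count name",
# most frequent first.
count_by_command() {
    awk '{ n[$1]++ } END { for (c in n) print n[c], c }' | sort -rn
}
```

    Typical use is "ps -e -o comm= | count_by_command | head". Dozens of
    copies of the same cron-started program at the top would be the giveaway.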