We have a problem here involving an "ec1 statistics overflow" error that
causes our SGI O2s to crash. Here's the situation...
After our department upgraded our switches (different manufacturer, same
configuration: 10 Mbps, half-duplex), we began seeing "ec1 statistics
overflow" messages in the system logs several times a day on our O2s with
two Ethernet interfaces. In these O2s, ec1 (the PCI card Ethernet interface)
is configured for the outside world while ec0 (the internal Ethernet
interface) is attached to an NMR device (a huge magnet used in chemical
SGI (and Bruker, the company that makes the NMR device) suggested
raising ecf_max_rxds in /var/sysgen/master.d/if_ecf from 40 to 100.
That didn't help: the machine still crashed, so we asked them what the
highest safe value was. The answer was 170, so we set it to that. We
were still seeing the ec1 statistics overflow errors, although less
often than before. The part of the network that this SGI and the NMR
device are on does carry a lot of traffic, but we don't believe the
amount has increased dramatically (we have no concrete measurements to
support this).
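For anyone wanting to put a number on "less often than before", here is a
rough sketch that tallies the overflow messages per day. It assumes the
usual syslog line layout (month and day in the first two fields) and that
the message text contains both "ec1" and "statistics overflow"; the
function name count_overflows is made up for this example, and the log
path in the usage comment is an assumption (the IRIX default, as far as I
know).

```shell
#!/bin/sh
# count_overflows (hypothetical helper): read syslog text on stdin and
# print one line per day in the form "<count> <Mon> <DD>".
# Assumes syslog-style lines whose first two fields are month and day.
count_overflows() {
    grep 'ec1' |
        grep 'statistics overflow' |
        awk '{ print $1, $2 }' |   # keep only "Mon DD"
        sort |
        uniq -c                    # prefix each day with its count
}

# Example use (log path is an assumption):
#   count_overflows < /var/adm/SYSLOG
```

Running it before and after a change to ecf_max_rxds would show whether
the change actually reduced the error rate.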
The large number of ec1 statistics overflow errors eventually caused our
O2 to crash with a CPU kernel fault. Increasing ecf_max_rxds prevented
the crashes, but a new problem appeared: the network interface (ec1)
would drop, not immediately but after a day or so. As a workaround I
wrote a script that runs "ifconfig ec1 up" every 10 minutes. SGI has
addressed the following questions of mine:
1. Will running ifconfig ec1 up every 10 minutes slow down traffic
considerably, or is the effect negligible? SGI says it should be
negligible. Is ifconfig smart enough to leave an interface alone if it
is already up, rather than toggling it off and back on? (How does
ifconfig work?) SGI hasn't gotten back to me yet on this.
2. Will running ifconfig ec1 up affect ec0 in any way? SGI says no.
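Until the toggling question is answered, one way to sidestep it is to call
"ifconfig ec1 up" only when the interface is not already flagged UP. The
sketch below is a guess at how that could look, not the script we actually
run. It assumes that "ifconfig ec1" prints a first line of the form
"ec1: flags=...<UP,...>" and that a dropped interface shows up as a missing
UP flag, which we have not confirmed. The helper names iface_is_up and
ensure_up are made up for this example.

```shell
#!/bin/sh
# iface_is_up (hypothetical helper): read ifconfig output on stdin and
# succeed only if UP appears in the flag list on the first line.
# Assumed first-line format: "ec1: flags=c43<UP,BROADCAST,RUNNING,...>"
iface_is_up() {
    read -r first_line || return 1
    case $first_line in
        *"<UP,"*|*"<UP>"*|*",UP,"*|*",UP>"*) return 0 ;;
        *) return 1 ;;
    esac
}

# ensure_up (hypothetical helper): bring the interface up only if needed,
# and note on stderr when that happened.
ensure_up() {
    ifc=$1
    if ifconfig "$ifc" | iface_is_up; then
        return 0
    fi
    echo "$(date): $ifc not flagged UP; running ifconfig $ifc up" >&2
    ifconfig "$ifc" up
}

# Intended use from cron every 10 minutes:
#   ensure_up ec1
```

A side benefit is that the stderr message records when each drop was
noticed, which could be matched against the overflow messages in the log.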
We checked that the port configuration of the old and new switches is
indeed the same. We've tried connecting the O2 to a port on the old
switch and then in turn to a port on the new switch. Same problem.
If you have any further thoughts on this, please reply.