Okay.

So I think I've found a solution.

I grabbed the latest copy of the net-snmp source code from the SVN repository.

I applied patches 1794532, 1792716 and 1805971 to the source code, but no luck. I was not able to keep the agent up and running for more than 30 seconds.

I then applied patch 1712645 (from the unofficial patch section, entitled "meaningful log message on duplicate IP address") and the agent is up and stable.
All walks work and the agent has been up for over 13 minutes.

I hope this helps others out there who are having the same issues with the agent segfaulting on startup.

Jayson
From: thefeistycadavar@hotmail.com
To: net-snmp-coders@lists.sourceforge.net
Subject: RE: More net-snmp 5.4.1 startup issues.
Date: Tue, 6 Nov 2007 11:05:02 -0500

Dave,

I applied the zones patch last night to my copy of the source code and was still having issues with it crashing. I downloaded the net-snmp 5.4.1 code this morning and applied the diff-zones patch (bug 1794532), and the agent was dying with the SIGSEGV error noted below.

I then went on and patched it with the code from bug 1792716 (diff.ipaddress-patch-541), and now it no longer dies but consumes 92-93% of the CPU and will not answer requests (one of the issues that I noted yesterday).

Now for the bad news. I cannot remove the duplicate IP addresses, as that's the way Linux handles bonded IP addresses. They all show up, but the two ethernet devices which are bonded show up as slaves while the bonded device shows up as the master.
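
To illustrate what "duplicate IP addresses" means here, below is a minimal sketch of counting addresses that are reported on more than one interface, as happens with bond0 and its slaves. It is not net-snmp code and not what any of the patches do; `struct if_addr` and `count_duplicate_addrs` are hypothetical names, and the record layout is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one IPv4 address as reported for one interface. */
struct if_addr {
    const char *ifname;   /* e.g. "bond0", "eth0", "eth4" */
    uint32_t    addr;     /* IPv4 address, in any consistent byte order */
};

/*
 * Count the entries whose address already appeared earlier in the list.
 * The scan is O(n^2), which is acceptable for the handful of interfaces
 * found on a single host.
 */
size_t count_duplicate_addrs(const struct if_addr *list, size_t n)
{
    size_t dups = 0;

    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++) {
            if (list[i].addr == list[j].addr) {
                dups++;
                break;      /* count each entry at most once */
            }
        }
    }
    return dups;
}
```

With a bonded master and two slaves that all carry one address, plus a loopback entry, this would report two duplicates. Any address values fed to it for testing are placeholders, since the real ones were sanitized.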

GDB OUTPUT PRIOR TO PATCH (captured after snmpd crashed)

Program received signal SIGSEGV, Segmentation fault.
0x0089e390 in _int_malloc () from /lib/tls/libc.so.6
(gdb) where
#0 0x0089e390 in _int_malloc () from /lib/tls/libc.so.6
#1 0x0089fc76 in calloc () from /lib/tls/libc.so.6
#2 0x00bc57ee in ipSystemStatsTable_allocate_rowreq_ctx (data=0x843eab8, user_init_ctx=0x3)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_interface.c:426
#3 0x00bc78d7 in _add_new (systemstats_entry=0x843eab8, container=0x83a9230)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_data_access.c:263
#4 0x00d0d19b in _ba_for_each (container=0x3, f=0xbc788d <_add_new>, context=0x83a9230) at container_binary_array.c:342
#5 0x00bc7c11 in ipSystemStatsTable_container_load (container=0x83a9230)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_data_access.c:377
#6 0x00bc6af8 in _cache_load (cache=0x83a91f0, vmagic=0x83a9230) at ip-mib/ipSystemStatsTable/ipSystemStatsTable_interface.c:1212
#7 0x004add54 in _cache_load (cache=0x83a91f0) at cache_handler.c:537
#8 0x00cf396f in run_alarms () at snmp_alarm.c:252
#9 0x0804c003 in main (argc=11, argv=0xbfec4c34) at snmpd.c:1210
#10 0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#11 0x08049f41 in _start ()
(gdb) list
1210            run_alarms();
1211
1212            netsnmp_check_outstanding_agent_requests();
1213
1214        }                       /* endwhile */
1215
1216        snmp_log(LOG_INFO, "Received TERM or STOP signal...  shutting down...\n");
1217        return 0;
1218
1219    }                           /* end receive() */
(gdb)


GDB OUTPUT AFTER PATCH FROM bug 1792716 (Attached to process)
#0 0x0089e090 in _int_malloc () from /lib/tls/libc.so.6
#1 0x0089ff01 in malloc () from /lib/tls/libc.so.6
#2 0x0014a0ff in _sess_read (sessp=0x9977b58, fdset=0xbff33ec0) at snmp_api.c:5567
#3 0x0014ad0b in snmp_sess_read (sessp=0x9977b58, fdset=0x38) at snmp_api.c:5791
#4 0x0014ad59 in snmp_read (fdset=0xbff33ec0) at snmp_api.c:5408
#5 0x0804bffe in main (argc=10, argv=0xbff34064) at snmpd.c:1180
#6 0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#7 0x08049f41 in _start ()
(gdb) list
1180                    snmp_read(&readfds);
1181                }
1182            } else
1183                switch (count) {
1184                case 0:
1185                    snmp_timeout();
1186                    break;
1187                case -1:
1188                    DEBUGMSGTL(("snmpd/select", "  errno = %d\n", errno));
1189                    if (errno == EINTR) {

Jayson

> Date: Tue, 6 Nov 2007 09:58:07 +0000
> From: D.T.Shield@liverpool.ac.uk
> To: thefeistycadavar@hotmail.com
> Subject: Re: More net-snmp 5.4.1 startup issues.
> CC: net-snmp-coders@lists.sourceforge.net
>
> On 05/11/2007, Jayson Robinson wrote:
> > Actually I had to sanitize the data so that it could be exported. They're
> > very careful about IP addresses leaving the building.
>
> That's fair enough.
>
> > I can confirm that bond0 / eth0 and eth4 all share the same IP address
> > though.
>
> This does sound more and more as if duplicate IP addresses is the cause.
>
>
> > Now I just patched it with 2 various patches this morning:
> >
> > official patch: 1805971
>
> That's not relevant to this particular problem.
> (Though it is worth applying anyway)
>
> > and
> > diff.ipaddress-patch-541
>
> I presume you mean the patch from Bugs #1794532/1792716 ?
> [It's always worth referring to tracker numbers, rather than patch
> file names!]
>
> From the discussion in Bug #1794532, it sounds as if this patch
> does not fix the problem either. It definitely sounds as if the later
> file diff.zones-541.pat is more promising.
>=20
> Dave

