xlC 8.0 and AIX 5.3 - Aix


  1. xlC 8.0 and AIX 5.3

    I am trying to port our enterprise C++ servers from xlC 5.x to xlC 8.0,
    and also from AIX 5.1 to AIX 5.3.

    We are experiencing interesting behavior. Some servers are coring in the
    same code all the time and when we change the order of shared libraries
    that are linked to these servers, the problem disappears.

    Also, all of our shared libraries are built with a priority of 2000.
    It's a mystery to us why the loader behaves differently depending on
    the order in which we link the shared libraries (we link about 30 of them).

    Has anyone experienced this weird behavior with AIX 5.3 and xlC_r 8.0?


    Thanks for your help
    SK

  2. Re: xlC 8.0 and AIX 5.3

    srinivas writes:

    > I am trying to port our enterprise C++ servers from xlC 5.x to xlC 8.0
    > and also from AIX 5.1 OS to AIX 5.3 OS
    >
    > We are experiencing interesting behavior. Some servers are coring in
    > the same code all the time and when we change the order of shared
    > libraries that are linked to these servers, the problem disappears.


    Your code has bugs, and changing the order of library loading makes
    the symptoms of these bugs disappear.

    It is not at all unusual for bugs to remain hidden for years, then
    start showing up on a particular machine, or with a particular
    revision of the OS or the compiler.

    You may get more helpful replies if you describe your crashes in
    more detail.

    You may also get some help from code-checking tools, such as Purify,
    Insure++, Klocwork or Coverity.

    Cheers,
    --
    In order to understand recursion you must first understand recursion.
    Remove /-nsp/ for email.

  3. Re: xlC 8.0 and AIX 5.3

    Paul Pluzhnikov wrote:
    > srinivas writes:
    >
    >> I am trying to port our enterprise C++ servers from xlC 5.x to xlC 8.0
    >> and also from AIX 5.1 OS to AIX 5.3 OS
    >>
    >> We are experiencing interesting behavior. Some servers are coring in
    >> the same code all the time and when we change the order of shared
    >> libraries that are linked to these servers, the problem disappears.

    >
    > Your code has bugs, and changing the order of library loading makes
    > the symptoms of these bugs disappear.
    >
    > It is not at all unusual for bugs to remain hidden for years, then
    > start showing up on a particular machine, or with a particular
    > revision of the OS or the compiler.
    >
    > You may get more helpful replies if you describe your crashes in
    > more detail.
    >
    > You may also get some help from code-checking tools, such as Purify,
    > Insure++, Klocwork or Coverity.
    >
    > Cheers,

    Thanks for the reply.

    The servers are built as Tuxedo C++ servers and they all start up fine.
    The crash always occurs at a fixed point in the code. (In my
    experience, memory scribbles cause crashes at random places, so I am not
    sure this is caused by a memory scribble.) The line where it crashes
    is very straightforward code: just an RWTValSlist (Rogue Wave
    container class) declaration within a function.


    The top part of the stack from the debugger is here:
    -------------------------------------------------
    Illegal instruction (illegal opcode) in . at 0x0 ($t1)
    warning: Unable to access address 0x0 from core
    (dbx) .() at 0x0
    BusinessSvc.rw_internal_slist()(0x2ff1f4f8), line 482 in "slist.h"
    BusinessSvc.RWTValSlist()(0x2ff1f4f8), line 151 in "tvslist.h"
    GetBrokerData(BusinessSvc&,const
    RWTPtrSlist >&,const
    RWCString&)(0x22aefaa8, 0x2ff1f950, 0x22a687


  4. Re: xlC 8.0 and AIX 5.3

    srinivas writes:

    > The servers are built as Tuxedo C++ servers and they all start up
    > fine. The crash always occurs at a fixed point in the code. (In my
    > experience, memory scribbles cause crashes at random places, so I am
    > not sure this is caused by a memory scribble.)


    Only if you have "random input data", have threads, or have an OS
    that does some randomization. If you have no randomness in your
    program, then "memory scribble" will cause it to crash in the same
    place every time.

    > The line where it
    > crashes is very straightforward code: just an RWTValSlist (Rogue Wave
    > container class) declaration within a function.


    What is on line 482 of slist.h?

    > The top part of the stack from debugger is here
    > -------------------------------------------------
    > Illegal instruction (illegal opcode) in . at 0x0 ($t1)
    > warning: Unable to access address 0x0 from core
    > (dbx) .() at 0x0
    > BusinessSvc.rw_internal_slist()(0x2ff1f4f8), line 482 in "slist.h"
    > BusinessSvc.RWTValSlist()(0x2ff1f4f8), line 151 in "tvslist.h"
    > GetBrokerData(BusinessSvc&,const
    > RWTPtrSlist >&,const
    > RWCString&)(0x22aefaa8, 0x2ff1f950, 0x22a687


    The rest of the stack probably points to some global object being
    constructed (you may think that the rest of the stack is irrelevant,
    but it probably *is* quite relevant).

    You are calling some function on line 482 (possibly a virtual method),
    and the function pointer is still NULL, so you try to execute
    instructions at address 0, on the first page of virtual memory, and
    promptly crash.
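
    To illustrate the mechanism, here is a minimal sketch. It is not Rogue
    Wave's actual code, and Dispatch, realHandler, install and safeCall are
    hypothetical names. An object with static storage is zero-initialized
    before any constructor runs, so a function pointer inside it is null
    until something stores a real address there. Calling through it before
    that point jumps to address 0. A guarded call makes that state visible
    instead of crashing:

```cpp
// Hypothetical names throughout; this only illustrates the failure mode.
typedef int (*Handler)();

int realHandler() { return 42; }

// Aggregate with a function pointer member. When an object of this type
// has static storage duration, the pointer is zero before any dynamic
// initialization has run.
struct Dispatch {
    Handler handler;
};

Dispatch g_dispatch;  // static storage: handler starts out as 0

// Normally done by some initializer that is assumed to run "early enough".
void install() { g_dispatch.handler = &realHandler; }

// Guarded call: returns false instead of jumping to address 0.
bool safeCall(const Dispatch& d, int& out) {
    if (d.handler == 0) {
        return false;  // not initialized yet; an unguarded call would crash
    }
    out = d.handler();
    return true;
}
```

    If safeCall returns false at the point where you crash today, something
    is running before the initializer it depends on.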

    Why that happens is impossible to say given the data you've
    provided, but possibly because you depend on the order of
    initialization of global data (the order across translation units
    and libraries is unspecified, and quite possibly changes when
    switching compiler versions and when changing the order of libraries
    on the link line).
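
    If it does turn out to be an initialization-order dependency, one common
    way to remove it is the "construct on first use" idiom. This is a sketch
    under that assumption; BrokerRegistry, registry and Registrar are
    hypothetical names, not anything from your code or from Tuxedo. The
    shared object lives inside an accessor function, so it is constructed
    the first time anyone asks for it, regardless of link order:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shared object that other globals depend on.
class BrokerRegistry {
public:
    void add(const std::string& name) { names_.push_back(name); }
    std::size_t size() const { return names_.size(); }
private:
    std::vector<std::string> names_;
};

// Construct on first use: the instance is built on the first call,
// not at whatever point the loader initializes this library.
BrokerRegistry& registry() {
    static BrokerRegistry instance;
    return instance;
}

// A global (imagine it in another shared library) whose constructor
// uses the registry. It no longer matters which library initializes first.
struct Registrar {
    explicit Registrar(const char* name) { registry().add(name); }
};

static Registrar registerDefault("default");
```

    One caveat: the language standard only guarantees thread-safe
    initialization of function-local statics from C++11 on, so with an
    older compiler it is safer to make the first call to the accessor
    before any threads are started.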

    Cheers,
    --
    In order to understand recursion you must first understand recursion.
    Remove /-nsp/ for email.
