dlopen() sees some symbols, but not others - Aix


  1. dlopen() sees some symbols, but not others

    I want to load a shared library and have it access symbols in my main
    executable. I've been able to get this to work by building the main
    executable with -brtl and -bexpall. But, even though I've told the
    linker to 'export all' symbols, not all are automatically available.

    For example, I wanted a module with some utility functions to be
    available to the loaded library, but since the main app doesn't call
    any of those functions, I added a stub module to the main app with a
    call to one of these functions in order to force the utility module to
    be linked to the main app.

    That worked for the most part - my dlopen'd code was able to call
    functions in that module. But a function called FormatElement(), which
    was in that module was unresolvable by dlopen(). In fact, another
    function called LogFormattedElement() in the same source module was
    resolvable and calls FormatElement() internally. As long as my
    dlopen'd library called this second function, it worked.

    Finally, I added a call to FormatElement() to my stub module to
    explicitly 'link it in', and now it's resolvable by dlopen(). Since
    this code was obviously there in the first place, and I used -bexpall,
    shouldn't it just have been available?
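
    For readers following along, the loading side of a setup like this
    might look roughly as follows. This is only a sketch: the helper name
    load_plugin and any path passed to it are hypothetical and do not come
    from the original post.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads a shared library whose undefined symbols are
 * expected to resolve against symbols exported by the main executable. */
void *load_plugin(const char *path)
{
    /* RTLD_NOW asks for all symbols to be resolved at load time, so a
     * missing export is reported here rather than at the first call. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        const char *msg = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path, msg ? msg : "(no message)");
    }
    return handle;
}
```

    Printing dlerror() on failure is usually the quickest way to see
    which symbol could not be resolved.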


  2. Re: dlopen() sees some symbols, but not others

    "Rob Y" writes:

    > For example, I wanted a module with some utility functions to be
    > available to the loaded library, but since the main app doesn't call
    > any of those functions, I added a stub module to the main app with a
    > call to one of these functions in order to force the utility module to
    > be linked to the main app.


    Here is your first mistake: you are transporting your knowledge of
    how UNIX linkers and object files work to AIX, which doesn't quite
    work that way.

    On UNIX, calling one routine from a .o links in that whole .o.

    On AIX, the compiler (by default) emits one CSect for each individual
    function, and the linker (by default) garbage-collects unused CSects.
    Thus referencing one function from a .o gets you that one function (and
    whatever that function itself calls), but not the rest of the .o.

    > That worked for the most part - my dlopen'd code was able to call
    > functions in that module. But a function called FormatElement(), which
    > was in that module was unresolvable by dlopen().


    Was it linked into the exe, or was it garbage-collected?
    ("nm a.out" will answer that).

    > In fact, another
    > function called LogFormattedElement() in the same source module was
    > resolvable and calls FormatElement() internally. As long as my
    > dlopen'd library called this second function, it worked.


    Hmm, that's inconsistent with my mental model of how AIX linking
    works. Are you sure? Can you construct a small test case?

    > Finally, I added a call to FormatElement() to my stub module to
    > explicitly 'link it in', and now it's resolvable by dlopen().


    That's consistent with FormatElement() having been GC'd,
    until you added an explicit reference to it ...

    > Since this code was obviously there in the first place,


    I don't think it was there in the first place ...

    > and I used -bexpall, shouldn't it just have been available?


    Yes, *if* it was there, it should have been available (provided
    its name doesn't begin with underscore -- '-bexpall' doesn't
    export those).

    Cheers,

  3. Re: dlopen() sees some symbols, but not others


    Paul Pluzhnikov wrote:

    > > Finally, I added a call to FormatElement() to my stub module to
    > > explicitly 'link it in', and now it's resolvable by dlopen().

    >
    > That's consistent with FormatElement() having been GC'd,
    > until you added explicit reference for it ...
    >


    Interesting. Now that I dig further, I find a call to LogFormattedKey
    from inside the main app, so I guess that function was there all along.
    Odd, though, that since LogFormattedKey calls FormatElement, and the
    code must be there, that symbol isn't exported. I guess the GC
    algorithm's tricky.

    Do you know of a way to bypass this symbol garbage collection? If
    not, I can just change my dummy 'dllstub.c' module to make calls to all
    the functions I want to force to be available, and I guess that'll
    force them to be pulled in and exported via -bexpall.
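
    A stub of the kind described above could be sketched as a table of
    function addresses instead of dummy calls. Everything here is
    illustrative: the signatures of FormatElement and LogFormattedElement
    are invented stand-ins (the thread never shows them), and the table
    name dllstub_keep is hypothetical. The assumption is that taking a
    function's address counts as a reference, so the linker keeps it.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the utility functions named in the thread;
 * the real signatures are not shown in the original posts. */
int FormatElement(const char *in, char *out, size_t outlen)
{
    /* Wraps the input in brackets and returns the formatted length. */
    return snprintf(out, outlen, "[%s]", in);
}

int LogFormattedElement(const char *in)
{
    char buf[64];
    int n = FormatElement(in, buf, sizeof buf);
    fprintf(stderr, "%s\n", buf);
    return n;
}

/* dllstub.c idea: a global table that references every function the
 * dlopen'd library may need. Nothing ever calls through this table;
 * it exists only to create references at link time. */
typedef void (*generic_fn)(void);

generic_fn const dllstub_keep[] = {
    (generic_fn)FormatElement,
    (generic_fn)LogFormattedElement,
};
```

    Adding a function to the table is a one-line change, which may be
    easier to maintain than a dummy function full of calls.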

    By the way, is garbage collection only done for modules pulled in by
    the linker from a library? If the .o's are explicitly fed to the
    linker, are all functions in the .o's kept even if they're not
    referenced by the app? The functions in my dllstub.c module are never
    called - they're just there to provide references to other functions I
    want included, so I'm guessing that the answer is yes.

    Thanks,
    Rob


  4. Re: dlopen() sees some symbols, but not others


    Paul Pluzhnikov wrote:

    > > In fact, another
    > > function called LogFormattedElement() in the same source module was
    > > resolvable and calls FormatElement() internally. As long as my
    > > dlopen'd library called this second function, it worked.

    >
    > Hmm, that's inconsistent with my mental model of how AIX linking
    > works. Are you sure? Can you construct a small test case?
    >


    More info. I took out the explicit call to FormatElement in my main
    app, and nm still reports it as there, but dlopen doesn't see it:

    FormatElement:F-1 - 732

    But with the explicit call in the main app, nm reports 2 entries for
    FormatElement, and dlopen does see it:

    FormatElement D 536986860 12
    FormatElement:F-1 - 732

    The 'nm' man page describes the D entry as a global data symbol, but I
    don't see any explanation for the :F-1 entry. Any idea what that
    means?

    Thanks,
    Rob


  5. Re: dlopen() sees some symbols, but not others

    Rob Y wrote:
    > I want to load a shared library and have it access symbols in my main
    > executable. I've been able to get this to work by building the main
    > executable with -brtl and -bexpall. But, even though I've told the
    > linker to 'export all' symbols, not all are automatically available.


    Right. Unless the symbols are referenced by code within the main
    app, or by any dependent modules named on the command line, the
    linker will garbage collect symbols it deems unnecessary.

    As Paul stated, granularity on AIX is by csect (control section),
    _not_ object file. The intent is to remove unneeded code. If you
    need something retained, then say so.

    > For example, I wanted a module with some utility functions to be
    > available to the loaded library, but since the main app doesn't call
    > any of those functions, I added a stub module to the main app with a
    > call to one of these functions in order to force the utility module to
    > be linked to the main app.


    Or simply create an export list naming the required symbols.

    > Finally, I added a call to FormatElement() to my stub module to
    > explicitly 'link it in', and now it's resolvable by dlopen(). Since
    > this code was obviously there in the first place, and I used -bexpall,
    > shouldn't it just have been available?


    Depends upon what the code looks like. A testcase would be helpful.

  6. Re: dlopen() sees some symbols, but not others

    Rob Y wrote:
    >
    > Do you know of a way to bypass this symbol garbage collection? If
    > not, I can just change my dummy 'dllstub.c' module to make calls to all
    > the functions I want to force to be available, and I guess that'll
    > force them to be pulled in and exported via -bexpall.


    Better to build an explicit export list (which can be used along
    with -bexpall).
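
    As a rough illustration of the export-list approach: an AIX export
    file is a plain text file with one symbol name per line, handed to
    the linker with the -bE: (or -bexport:) option. The file name app.exp
    is hypothetical, and the two symbols are simply the ones mentioned in
    this thread.

```text
FormatElement
LogFormattedElement
```

    The link step would then add something like -bE:app.exp next to the
    existing -brtl and -bexpall options. Symbols named in the file are
    exported explicitly, so they should survive garbage collection even
    if nothing in the main app references them.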

    > By the way, is garbage collection only done for modules pulled in by
    > the linker from a library? If the .o's are explicitly fed to the
    > linker, are all functions in the .o's kept even if they're not
    > referenced by the app?


    When you use -brtl and -bexpall, any named .o files should be fully
    retained, as the symbols in all of the .o files should end up
    exported.

  7. Re: dlopen() sees some symbols, but not others


    Gary R. Hook wrote:

    > Better to build an explicit export list (which can be used along
    > with -bexpall).
    >


    You anticipated my next question. I was going to use an export list,
    but wasn't sure if I'd have to make it exhaustive or whether -bexpall
    would still work along with the list.

    Thanks,
    Rob

