large file support && ! large file support - Unix


Thread: large file support && ! large file support

  1. Re: large file support && ! large file support

    On 21 May 2007 22:43:27 GMT phil-news-nospam@ipal.net wrote:
    > No, I certainly do not have that legacy. But even if I did I would not
    > take the approach POSIX LFS did. I would want to make it work for both
    > old and new programs, even if that meant having 2 or even 3 different
    > binary libraries (at least one being there for legacy ABI compatibility
    > for programs w/o source code available).


    That's nice on paper but really most software developers aren't that good
    and changing the default (to be LFS-ok) would IMHO have been a big problem.
    I can see even just requiring a re-link against a non-LFS library to be
    problematic.

    Anyway, I'll be very interested to hear of your solution.
    -frank

  2. Re: large file support && ! large file support

    On Mon, 21 May 2007 20:55:29 -0700 Frank Cusack wrote:
    | On 21 May 2007 22:43:27 GMT phil-news-nospam@ipal.net wrote:
    |> No, I certainly do not have that legacy. But even if I did I would not
    |> take the approach POSIX LFS did. I would want to make it work for both
    |> old and new programs, even if that meant having 2 or even 3 different
    |> binary libraries (at least one being there for legacy ABI compatibility
    |> for programs w/o source code available).
    |
    | That's nice on paper but really most software developers aren't that good
    | and changing the default (to be LFS-ok) would IMHO have been a big problem.
    | I can see even just requiring a re-link against a non-LFS library to be
    | problematic.
    |
    | Anyway, I'll be very interested to hear of your solution.

    My solution as I would have done POSIX LFS, or my solution for my libraries?

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-22-0723@ipal.net |
    |------------------------------------/-------------------------------------|

  3. Re: large file support && ! large file support

    On Mon, 21 May 2007 22:49:25 +0000, phil-news-nospam wrote:

    > On Mon, 21 May 2007 22:18:30 -0000 James Antill
    > | Doesn't _just_ having _LARGEFILE64_SOURCE=1 do the right thing here?
    > | All of the (informal) documentation I can find implies it does.
    >
    > But that is not the only way a calling program might do this.


    Right, I probably wasn't clear. My solution was to do:

    if off64_t is available in the environment, all library interfaces
    explicitly use off64_t. Otherwise library interfaces use off_t.

    This works for:

    .. 32bit programs where off_t == off64_t
    .. 32bit programs where off_t == 32bit, and off64_t is defined.
    .. 64bit programs with just off_t.

    ...and it only doesn't work for:

    .. 32bit programs where no LFS support is enabled.

    ...my feeling was that the latter are broken anyway, so doing the right
    thing in all the other cases was enough.

    --
    James Antill -- james@and.org
    C String APIs use too much memory? ustr: length, ref count, size and
    read-only/fixed. Ave. 55% overhead over strdup(), for 0-20B strings
    http://www.and.org/ustr/
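
Editor's sketch (not code from the thread): which of the cases listed above a given build falls into depends on how wide `off_t` is in that translation unit. The helper names `off_t_bits()` and `off_t_is_large()` are made up for illustration. On glibc, a 32-bit build normally reports 32 unless it is compiled with `-D_FILE_OFFSET_BITS=64`, and a 64-bit build reports 64.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width of off_t, in bits, as seen by this translation unit. */
static int off_t_bits(void)
{
    /* POSIX requires off_t to be a signed integer type. */
    assert((off_t)-1 < 0);
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Nonzero when plain off_t can already hold offsets beyond 2 GiB. */
static int off_t_is_large(void)
{
    return off_t_bits() >= 64;
}
```

A library could log or test these values at build time to see which case it is being compiled under. This only reports the width and does not choose an interface.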

  4. Re: large file support && ! large file support

    On 22 May 2007 12:23:42 GMT phil-news-nospam@ipal.net wrote:
    > On Mon, 21 May 2007 20:55:29 -0700 Frank Cusack wrote:
    > | Anyway, I'll be very interested to hear of your solution.
    >
    > My solution as I would have done POSIX LFS, or my solution for my libraries?


    for your own libraries

  5. Re: large file support && ! large file support

    On Tue, 22 May 2007 15:16:21 -0000 James Antill wrote:
    | On Mon, 21 May 2007 22:49:25 +0000, phil-news-nospam wrote:
    |
    |> On Mon, 21 May 2007 22:18:30 -0000 James Antill
    |> | Doesn't _just_ having _LARGEFILE64_SOURCE=1 do the right thing here?
    |> | All of the (informal) documentation I can find implies it does.
    |>
    |> But that is not the only way a calling program might do this.
    |
    | Right, I probably wasn't clear. My solution was to do:
    |
    | if off64_t is available in the environment, all library interfaces
    | explicitly use off64_t. Otherwise library interfaces use off_t.

    How does one tell if that is so? LARGEFILE64_SOURCE?

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-22-2215@ipal.net |
    |------------------------------------/-------------------------------------|

  6. Re: large file support && ! large file support

    On Wed, 23 May 2007 03:16:30 +0000, phil-news-nospam wrote:

    > On Tue, 22 May 2007 15:16:21 -0000 James Antill wrote:
    > | On Mon, 21 May 2007 22:49:25 +0000, phil-news-nospam wrote:
    > |
    > |> On Mon, 21 May 2007 22:18:30 -0000 James Antill
    > |> | Doesn't _just_ having _LARGEFILE64_SOURCE=1 do the right thing here?
    > |> | All of the (informal) documentation I can find implies it does.
    > |>
    > |> But that is not the only way a calling program might do this.
    > |
    > | Right, I probably wasn't clear. My solution was to do:
    > |
    > | if off64_t is available in the environment, all library interfaces
    > | explicitly use off64_t. Otherwise library interfaces use off_t.
    >
    > How does one tell if that is so? LARGEFILE64_SOURCE?


    I basically[1] used:

    AC_CHECK_TYPE(off64_t, AC_DEFINE(HAVE_OFF64_T),
    AC_DEFINE_UNQUOTED(off64_t, off_t))

    ...you can't really use _LARGEFILE64_SOURCE because that relies on the
    user of your lib. defining it. If you don't use autoconf, you could try:

    getconf LFS64_CFLAGS

    ...and see what the output is, but I'm not sure how portable that is.


    [1] A little bit of sed post output to give the output a namespace.

    --
    James Antill -- james@and.org
    C String APIs use too much memory? ustr: length, ref count, size and
    read-only/fixed. Ave. 55% overhead over strdup(), for 0-20B strings
    http://www.and.org/ustr/
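
Editor's sketch (not code from the thread): the same value that `getconf LFS64_CFLAGS` prints can be read from C with `confstr()`, at least on glibc, where the name is exposed as `_CS_LFS64_CFLAGS`. The function name `lfs64_cflags()` is hypothetical, and the `#ifdef` guard assumes the libc publishes the name as a macro. Where it does not, the function simply reports "unknown".

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns a malloc'd copy of the LFS64_CFLAGS configuration string.
 * NULL means the name is not exposed by this libc, has no value, or
 * memory ran out; an empty string means "no extra flags reported".
 * The caller frees the result. */
static char *lfs64_cflags(void)
{
#ifdef _CS_LFS64_CFLAGS
    /* First call: ask for the required size, including the NUL. */
    size_t n = confstr(_CS_LFS64_CFLAGS, NULL, 0);
    char *buf;

    if (n == 0)
        return NULL;
    buf = malloc(n);
    if (buf != NULL) {
        /* Second call: fetch the value into the buffer. */
        size_t m = confstr(_CS_LFS64_CFLAGS, buf, n);
        assert(m <= n);
        (void)m;
    }
    return buf;
#else
    return NULL;   /* name not exposed as a macro here */
#endif
}
```

An empty or missing value should not be read as "no large file support"; it only means the system did not report any flags for this name.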

  7. Re: large file support && ! large file support

    On Wed, 23 May 2007 14:40:53 -0000 James Antill wrote:
    | On Wed, 23 May 2007 03:16:30 +0000, phil-news-nospam wrote:
    |
    |> On Tue, 22 May 2007 15:16:21 -0000 James Antill wrote:
    |> | On Mon, 21 May 2007 22:49:25 +0000, phil-news-nospam wrote:
    |> |
    |> |> On Mon, 21 May 2007 22:18:30 -0000 James Antill
    |> |> | Doesn't _just_ having _LARGEFILE64_SOURCE=1 do the right thing here?
    |> |> | All of the (informal) documentation I can find implies it does.
    |> |>
    |> |> But that is not the only way a calling program might do this.
    |> |
    |> | Right, I probably wasn't clear. My solution was to do:
    |> |
    |> | if off64_t is available in the environment, all library interfaces
    |> | explicitly use off64_t. Otherwise library interfaces use off_t.
    |>
    |> How does one tell if that is so? LARGEFILE64_SOURCE?
    |
    | I basically[1] used:
    |
    | AC_CHECK_TYPE(off64_t, AC_DEFINE(HAVE_OFF64_T),
    | AC_DEFINE_UNQUOTED(off64_t, off_t))
    |
    | ...you can't really use _LARGEFILE64_SOURCE because that relies on the
    | user of your lib. defining it. If you don't use autoconf, you could try:

    if they don't define it, then how would they end up with off_t defined
    larger than "normal" and/or off64_t defined?


    | getconf LFS64_CFLAGS
    |
    | ...and see what the output is, but I'm not sure how portable that is.

    It comes up blank on some machines that do have large files. So it seems
    to not be reliable. Is it a program possibly compiled with other headers?

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-23-2004@ipal.net |
    |------------------------------------/-------------------------------------|

  8. Re: large file support && ! large file support

    On Thu, 24 May 2007 01:06:39 +0000, phil-news-nospam wrote:

    > On Wed, 23 May 2007 14:40:53 -0000 James Antill
    > |
    > | AC_CHECK_TYPE(off64_t, AC_DEFINE(HAVE_OFF64_T),
    > | AC_DEFINE_UNQUOTED(off64_t, off_t))
    > |
    > | ...you can't really use _LARGEFILE64_SOURCE because that relies on the
    > | user of your lib. defining it. If you don't use autoconf, you could
    > try:
    >
    > if they don't define it, then how would they end up with off_t defined
    > larger than "normal" and/or off64_t defined?


    Yes, when you've decided the environment has an off64_t build all your
    interfaces to use that ... then when a user of your library comes along,
    then before it gets to that piece of code someone will need to have
    defined _LARGEFILE64_SOURCE.

    What I mean is that this can only really be treated as a result, not
    something you should check IMO. The order goes roughly like:

    1. Compile of library must know if off64_t exists or not.
    2. Library header must know if off64_t exists or not.
    3. Someone must define _LARGEFILE64_SOURCE, if off64_t exists.
    4. Library should include the headers it needs.
    5. Library uses off_t or off64_t in its own definitions.

    ...in theory you can swap 2 and 3, but then you might be giving out the
    wrong interface definitions if the user doesn't do the right thing
    (instead of doing a #warning or #error). The above order also means that
    the library headers can "do everything", which is much nicer to use,
    unless the user requests otherwise.

    --
    James Antill -- james@and.org
    C String APIs use too much memory? ustr: length, ref count, size and
    read-only/fixed. Ave. 55% overhead over strdup(), for 0-20B strings
    http://www.and.org/ustr/
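
Editor's sketch (not James's actual header): the five-step order above could look roughly like this in a library's public header. `HAVE_OFF64_T`, `mylib_off_t` and `mylib_clamp()` are hypothetical names, and the sketch assumes a glibc-style system where `off64_t` becomes visible once `_LARGEFILE64_SOURCE` is defined before the first system header.

```c
/* Steps 1-2: the library's build records whether off64_t exists.
 * HAVE_OFF64_T stands in for whatever a generated config header sets. */

/* Step 3: define _LARGEFILE64_SOURCE on the user's behalf, before any
 * system header is pulled in by this file. */
#if defined(HAVE_OFF64_T) && !defined(_LARGEFILE64_SOURCE)
#  define _LARGEFILE64_SOURCE 1
#endif

/* Step 4: include what the library's own declarations need. */
#include <assert.h>
#include <sys/types.h>

/* Step 5: spell the interface in terms of the selected type. */
#ifdef HAVE_OFF64_T
typedef off64_t mylib_off_t;   /* explicit 64-bit offsets */
#else
typedef off_t mylib_off_t;     /* whatever the platform provides */
#endif

/* Example interface: clamp a position into the range [0, size]. */
static mylib_off_t mylib_clamp(mylib_off_t pos, mylib_off_t size)
{
    assert(size >= 0);
    if (pos < 0)
        return 0;
    return pos > size ? size : pos;
}
```

One caveat with this arrangement: if the calling program has already included `<sys/types.h>` without the macro, the header guard of the system file keeps `off64_t` hidden and the typedef fails to compile. That failure acts as a crude diagnostic, but it is not a friendly one.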

  9. Re: large file support && ! large file support

    On Fri, 25 May 2007 00:00:30 -0000 James Antill wrote:
    | On Thu, 24 May 2007 01:06:39 +0000, phil-news-nospam wrote:
    |
    |> On Wed, 23 May 2007 14:40:53 -0000 James Antill
    |> |
    |> | AC_CHECK_TYPE(off64_t, AC_DEFINE(HAVE_OFF64_T),
    |> | AC_DEFINE_UNQUOTED(off64_t, off_t))
    |> |
    |> | ...you can't really use _LARGEFILE64_SOURCE because that relies on the
    |> | user of your lib. defining it. If you don't use autoconf, you could
    |> try:
    |>
    |> if they don't define it, then how would they end up with off_t defined
    |> larger than "normal" and/or off64_t defined?
    |
    | Yes, when you've decided the environment has an off64_t build all your
    | interfaces to use that ... then when a user of your library comes along,
    | then before it gets to that piece of code someone will need to have
    | defined _LARGEFILE64_SOURCE.

    They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    they can choose not to. I want my library to work for both, and correctly.
    Correctly is defined as presenting the same files as the interfaces made
    available to them can do. But I also want _my_ library to achieve that by
    having no added function names, no added variable types, etc. Because the
    underlying functionality will have to be very slightly different for the
    two cases I see, there will be two different library files produced. They
    will have to link to the correct one.


    | What I mean is that this can only really be treated as a result, not
    | something you should check IMO. The order goes roughly like:
    |
    | 1. Compile of library must know if off64_t exists or not.
    | 2. Library header must know if off64_t exists or not.
    | 3. Someone must define _LARGEFILE64_SOURCE, if off64_t exists.
    | 4. Library should include the headers it needs.
    | 5. Library uses off_t or off64_t in its own definitions.
    |
    | ...in theory you can swap 2 and 3, but then you might be giving out the
    | wrong interface definitions if the user doesn't do the right thing
    | (instead of doing a #warning or #error). The above order also means that
    | the library headers can "do everything", which is much nicer to use,
    | unless the user requests otherwise.

    If they don't define _LARGEFILE64_SOURCE, they are expecting a legacy
    interface with legacy semantics. While my library doesn't really have a
    legacy interface of its own, per se, it will still need to present the
    relevant matching semantics. For example, a failure to get stat info for
    a large file is one of those semantics.

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-25-0708@ipal.net |
    |------------------------------------/-------------------------------------|

  10. Re: large file support && ! large file support

    On 25 May 2007 12:14:39 GMT phil-news-nospam@ipal.net wrote:
    > They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    > they can choose not to. I want my library to work for both, and correctly.
    > Correctly is defined as presenting the same files as the interfaces made
    > available to them can do. But I also want _my_ library to achieve that by
    > having no added function names, no added variable types, etc. Because the
    > underlying functionality will have to be very slightly different for the
    > two cases I see, there will be two different library files produced. They
    > will have to link to the correct one.


    What's the advantage of that over having *64 interfaces? Two libraries
    with the same interface but differently sized data types may also be
    confusing to debug (for users of the library).

    -frank

  11. Re: large file support && ! large file support

    On Fri, 25 May 2007 08:38:45 -0700, Frank Cusack wrote:

    > On 25 May 2007 12:14:39 GMT phil-news-nospam@ipal.net wrote:
    >> They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    >> they can choose not to. I want my library to work for both, and
    >> correctly. Correctly is defined as presenting the same files as the
    >> interfaces made available to them can do. But I also want _my_ library
    >> to achieve that by having no added function names, no added variable
    >> types, etc. Because the underlying functionality will have to be very
    >> slightly different for the two cases I see, there will be two different
    > library files produced. They will have to link to the correct one.
    >
    > What's the advantage of that over having *64 interfaces? Two libraries
    > with the same interface but differently sized data types may also be
    > confusing to debug (for users of the library).


    Not only that but it means that shared libraries are much less likely to
    use your library (because then _their_ users won't be able to use/not-use
    LFS), and it sounds like random users of the library are likely to screw
    up compiling the library (I can only hope that the library name they link
    against is the same on i386 and x86-64).

    --
    James Antill -- james@and.org
    C String APIs use too much memory? ustr: length, ref count, size and
    read-only/fixed. Ave. 55% overhead over strdup(), for 0-20B strings
    http://www.and.org/ustr/

  12. Re: large file support && ! large file support

    On Fri, 25 May 2007 08:38:45 -0700 Frank Cusack wrote:
    | On 25 May 2007 12:14:39 GMT phil-news-nospam@ipal.net wrote:
    |> They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    |> they can choose not to. I want my library to work for both, and correctly.
    |> Correctly is defined as presenting the same files as the interfaces made
    |> available to them can do. But I also want _my_ library to achieve that by
    |> having no added function names, no added variable types, etc. Because the
    |> underlying functionality will have to be very slightly different for the
    |> two cases I see, there will be two different library files produced. They
    |> will have to link to the correct one.
    |
    | What's the advantage of that over having *64 interfaces? Two libraries
    | with the same interface but differently sized data types may also be
    | confusing to debug (for users of the library).

    No one _needs_ two libraries or two interfaces _unless_ they are mixing
    types of programs on the same system. Either of these two approaches is
    _supposed_ to be a transition. However, because programs get coded to
    use *64 interface, it really is not a true transition. This is a legacy
    that will be difficult to get rid of. This is why I do not want to create
    it in the first place ... because it will end up being there forever.

    I stand by MHO that the whole *64 interface idea (at the POSIX layer) was
    a terribly bad idea. At the kernel layer (ABI or even API), it does not
    matter so much, as few programs should be touching that layer (system
    utilities and the core libc or alternative stub library, and that's all).

    A system might be pure 32-bit (in terms of file offset/size referencing
    facility). Or a system might be pure 64-bit (regardless of whether the
    pointer size is 32-bit or 64-bit). Or a system could have a mix of 32-bit
    and 64-bit (on pointer size being 32-bit or 64-bit). The mix would be to
    support old programs that have not been, or cannot yet be, converted to
    64-bit for some reason (badly coded, lack of source, etc). It is only the
    mix systems that would need two libraries (or three or four if it is a
    64-bit pointer architecture that also supports 32-bit pointers).

    The thing is, a system "in transition" could eventually get itself out of
    transition by replacing all programs that need legacy interface sizing with
    programs that use the latest interface sizing ... and removing the no longer
    needed additional libraries. With *64 interface symbols, it will be next
    to impossible to exit from the transition.

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-26-1950@ipal.net |
    |------------------------------------/-------------------------------------|

  13. Re: large file support && ! large file support

    On 27 May 2007 01:05:31 GMT phil-news-nospam@ipal.net wrote:
    > On Fri, 25 May 2007 08:38:45 -0700 Frank Cusack wrote:
    > | On 25 May 2007 12:14:39 GMT phil-news-nospam@ipal.net wrote:
    > |> They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    > |> they can choose not to. I want my library to work for both, and correctly.
    > |> Correctly is defined as presenting the same files as the interfaces made
    > |> available to them can do. But I also want _my_ library to achieve that by
    > |> having no added function names, no added variable types, etc. Because the
    > |> underlying functionality will have to be very slightly different for the
    > |> two cases I see, there will be two different library files produced. They
    > |> will have to link to the correct one.
    > |
    > | What's the advantage of that over having *64 interfaces? Two libraries
    > | with the same interface but differently sized data types may also be
    > | confusing to debug (for users of the library).
    >
    > No one _needs_ two libraries or two interfaces _unless_ they are mixing
    > types of programs on the same system. Either of these two approaches is
    > _supposed_ to be a transition. However, because programs get coded to
    > use *64 interface, it really is not a true transition. This is a legacy
    > that will be difficult to get rid of. This is why I do not want to create
    > it in the first place ... because it will end up being there forever.


    Programs are coded to use the LFS *64 interfaces mostly transparently,
    through macros. It's entirely possible to come along later and add
    *32 interfaces and drop the *64 names. Of course this will break all
    existing compiled programs expecting the unadorned names to have
    32-bit data types, but that is something you apparently find
    acceptable. It just means all existing programs have to be
    recompiled.

    > I stand by MHO that the whole *64 interface idea (at the POSIX layer) was

    [yadda]

    You're awful hung up on how bad the so-called LFS transition is. OK,
    maybe it sucks! But you are stuck with it.

    Since your library depends on LFS types, the users of your library are
    stuck with *64 interfaces, regardless of how your library handles it.
    Using a macro to select a 32- or 64-bit interface is pretty easy,
    especially since users of your library *already* have to do that.

    If you have identical interface names for the 2 different cases, how
    will users of your library determine which library to link against?
    No matter how you do it, it will be extra work on the user's part
    and IMHO more confusing to debug. It will also be unreliable; you
    won't be able to guarantee that the correct library is linked.

    -frank

  14. Re: large file support && ! large file support

    On Sun, 27 May 2007 14:27:24 -0700 Frank Cusack wrote:
    | On 27 May 2007 01:05:31 GMT phil-news-nospam@ipal.net wrote:
    |> On Fri, 25 May 2007 08:38:45 -0700 Frank Cusack wrote:
    |> | On 25 May 2007 12:14:39 GMT phil-news-nospam@ipal.net wrote:
    |> |> They don't need to define _LARGEFILE64_SOURCE. They can choose to. Or
    |> |> they can choose not to. I want my library to work for both, and correctly.
    |> |> Correctly is defined as presenting the same files as the interfaces made
    |> |> available to them can do. But I also want _my_ library to achieve that by
    |> |> having no added function names, no added variable types, etc. Because the
    |> |> underlying functionality will have to be very slightly different for the
    |> |> two cases I see, there will be two different library files produced. They
    |> |> will have to link to the correct one.
    |> |
    |> | What's the advantage of that over having *64 interfaces? Two libraries
    |> | with the same interface but differently sized data types may also be
    |> | confusing to debug (for users of the library).
    |>
    |> No one _needs_ two libraries or two interfaces _unless_ they are mixing
    |> types of programs on the same system. Either of these two approaches is
    |> _supposed_ to be a transition. However, because programs get coded to
    |> use *64 interface, it really is not a true transition. This is a legacy
    |> that will be difficult to get rid of. This is why I do not want to create
    |> it in the first place ... because it will end up being there forever.
    |
    | Programs are coded to use the LFS *64 interfaces mostly transparently,
    | through macros. It's entirely possible to come along later and add
    | *32 interfaces and drop the *64 names. Of course this will break all
    | existing compiled programs expecting the unadorned names to have
    | 32-bit data types, but that is something you apparently find
    | acceptable. It just means all existing programs have to be
    | recompiled.
    |
    |> I stand by MHO that the whole *64 interface idea (at the POSIX layer) was
    | [yadda]
    |
    | You're awful hung up on how bad the so-called LFS transition is. OK,
    | maybe it sucks! But you are stuck with it.

    Yes, I am stuck with it. Yes, I am hung up on how bad it is. But that
    doesn't mean I have to further it.


    | Since your library depends on LFS types, the users of your library are
    | stuck with *64 interfaces, regardless of how your library handles it.

    They are stuck with the POSIX names having *64 versions, if they select
    such.


    | Using a macro to select a 32- or 64-bit interface is pretty easy,
    | especially since users of your library *already* have to do that.

    _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, and _FILE_OFFSET_BITS=64 ?


    | If you have identical interface names for the 2 different cases, how
    | will users of your library determine which library to link against?
    | No matter how you do it, it will be extra work on the user's part
    | and IMHO more confusing to debug. It will also be unreliable; you
    | won't be able to guarantee that the correct library is linked.

    Users could be making use of non-*64 names AND not define any of the macros
    on a 32-bit only system (has not begun the transition) and on a 64-bit only
    system (has completed the transition). What should my library do in these
    two cases? I see no reason to export to the calling program any *64 variant
    name. I won't be able to use *64 variant POSIX names. Each of these cases
    will mean a quite normal compile just as if the LFS strategy had never been
    created in the first place. The end result will be programs that expect to
    link to dynamic symbols with no *64 variant names, yet have two different
    ABIs. They will have to be linked to a library with the correct ABI, which
    on these two system cases would be expected to be the only library.

    Now take these two dynamically linked executable binaries over to a third
    system which is in transition.

    I don't have an issue with making my library link to *64 POSIX names.
    Well, I do in the sense that I think the whole approach is wrong, but I
    can deal with it. I just don't want to create new names to export to
    the calling program. I don't want to expand on the POSIX mistake.

    I think what you are trying to suggest isn't so much that I have to use
    my own *64 names at the ABI layer (to dynamically link with), but rather,
    that it is difficult to sort out what library name will be linked with
    for systems with both ABIs available. And that will end up being a lot
    of systems for a long time because of the lack of an exit strategy in
    the POSIX LFS design. So I take it that you are suggesting that what I
    need to do is have both filesystem size interfaces in a single library
    file, which then requires distinct names for the two different sizes.
    If I am correct in that assessment of your position, could you tell me
    your feeling about using *32 names for the 32-bit version of symbols,
    and leave the 64-bit versions as plain names?

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-27-1726@ipal.net |
    |------------------------------------/-------------------------------------|

  15. Re: large file support && ! large file support

    On 27 May 2007 23:15:02 GMT phil-news-nospam@ipal.net wrote:
    > On Sun, 27 May 2007 14:27:24 -0700 Frank Cusack wrote:
    > | Since your library depends on LFS types, the users of your library are
    > | stuck with *64 interfaces, regardless of how your library handles it.
    >
    > They are stuck with the POSIX names having *64 versions, if they select
    > such.


    Yes.

    > | Using a macro to select a 32- or 64-bit interface is pretty easy,
    > | especially since users of your library *already* have to do that.
    >
    > _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, and _FILE_OFFSET_BITS=64 ?


    It depends on the platform. Generally you would use
    `getconf LFS_CFLAGS`. Does autoconf have a built-in check for the
    correct flags?

    My point was that users have to select the correct macros anyway, so
    your use of those macros to switch between plain and *64 interface
    names is no extra burden. But I was assuming users correctly choose
    the macros to set. Which is not at all a certainty. I've seen lots
    and lots of open source software which gets it wrong.

    > | If you have identical interface names for the 2 different cases, how
    > | will users of your library determine which library to link against?
    > | No matter how you do it, it will be extra work on the user's part
    > | and IMHO more confusing to debug. It will also be unreliable; you
    > | won't be able to guarantee that the correct library is linked.
    >
    > Users could be making use of non-*64 names AND not define any of the macros
    > on a 32-bit only system (has not begun the transition) and on a 64-bit only
    > system (has completed the transition). What should my library do in these
    > two cases? I see no reason to export to the calling program any *64 variant
    > name.


    Well, sure (by definition). For the 32-bit non-LFS case and the 64-bit
    case, you just use the plain interface and you get 32-bit or 64-bit
    data types and file size support.

    > I won't be able to use *64 variant POSIX names. Each of these cases
    > will mean a quite normal compile just as if the LFS strategy had never been
    > created in the first place. The end result will be programs that expect to
    > link to dynamic symbols with no *64 variant names, yet have two different
    > ABIs.


    These are 2 different ABIs because they are 2 different architectures.

    > They will have to be linked to a library with the correct ABI, which
    > on these two system cases would be expected to be the only library.


    Not on a multilib system. In which case the user is still prevented
    from linking against the wrong library (32-bit apps can't link against
    64-bit libraries, and vice versa).

    > Now take these two dynamically linked executable binaries over to a third
    > system which is in transition.


    And they will work, because on the transitional system, the 32-bit
    apps with 64-bit versions of the interface have *64 names.

    But if you don't have a different interface name for 64-bit data types
    on a 32-bit system, you can't compile on a transitional system and use
    the resultant binary on a non-transitional system. Or rather, the
    binary will execute but will be broken.

    > I don't have an issue with making my library link to *64 POSIX names.
    > Well, I do in the sense that I think the whole approach is wrong, but I
    > can deal with it. I just don't want to create new names to export to
    > the calling program. I don't want to expand on the POSIX mistake.
    >
    > I think what you are trying to suggest isn't so much that I have to use
    > my own *64 names at the ABI layer (to dynamically link with), but rather,
    > that it is difficult to sort out what library name will be linked with
    > for systems with both ABIs available.


    Yes. And impossible to tell if you are linked against the correct
    library; which is a bigger problem if you compile on one system
    and run on another (as you would typically do with pkg mgmt).

    > And that will end up being a lot
    > of systems for a long time because of the lack of an exit strategy in
    > the POSIX LFS design. So I take it that you are suggesting that what I
    > need to do is have both filesystem size interfaces in a single library
    > file, which then requires distinct names for the two different sizes.


    Yes.

    > If I am correct in that assessment of your position, could you tell me
    > your feeling about using *32 names for the 32-bit version of symbols,
    > and leave the 64-bit versions as plain names?


    Sounds great.

    -frank

  16. Re: large file support && ! large file support

    On Sun, 27 May 2007 17:14:31 -0700 Frank Cusack wrote:
    | On 27 May 2007 23:15:02 GMT phil-news-nospam@ipal.net wrote:
    |> On Sun, 27 May 2007 14:27:24 -0700 Frank Cusack wrote:
    |> | Since your library depends on LFS types, the users of your library are
    |> | stuck with *64 interfaces, regardless of how your library handles it.
    |>
    |> They are stuck with the POSIX names having *64 versions, if they select
    |> such.
    |
    | Yes.
    |
    |> | Using a macro to select a 32- or 64-bit interface is pretty easy,
    |> | especially since users of your library *already* have to do that.
    |>
    |> _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, and _FILE_OFFSET_BITS=64 ?
    |
    | It depends on the platform. Generally you would use
    | `getconf LFS_CFLAGS`. Does autoconf have a built-in check for the
    | correct flags?

    I do not use autoconf and fully intend to avoid it.


    | My point was that users have to select the correct macros anyway, so
    | your use of those macros to switch between plain and *64 interface
    | names is no extra burden. But I was assuming users correctly choose
    | the macros to set. Which is not at all a certainty. I've seen lots
    | and lots of open source software which gets it wrong.

    Given there are 8 different ways to do this, and apparently only 3 of
    them make any sense, I can imagine the difficulties.
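
    To make the macro question concrete, here is a minimal sketch of the one
    combination that changes the plain names, assuming glibc-style headers.
    The helper name off_t_bits is made up for illustration, and in a real
    build these macros would normally be passed as -D flags rather than
    written into the source.

```c
/* These must come before every #include to have any effect. */
#ifndef _LARGEFILE_SOURCE
#define _LARGEFILE_SOURCE 1      /* ask for fseeko() and ftello() */
#endif
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64     /* plain off_t, lseek(), stat() become 64-bit */
#endif

#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width in bits of the off_t this translation unit was compiled with. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

    On a 32-bit system, dropping the _FILE_OFFSET_BITS line would be expected
    to give a 32-bit off_t; on a 64-bit system off_t should be 64 bits either
    way.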


    |> | If you have identical interface names for the 2 different cases, how
    |> | will users of your library determine which library to link against?
    |> | No matter how you do it, it will be extra work on the user's part
    |> | and IMHO more confusing to debug. It will also be unreliable; you
    |> | won't be able to guarantee that the correct library is linked.
    |>
    |> Users could be making use of non-*64 names AND not define any of the macros
    |> on a 32-bit only system (has not begun the transition) and on a 64-bit only
    |> system (has completed the transition). What should my library do in these
    |> two cases? I see no reason to export to the calling program any *64 variant
    |> name.
    |
    | Well, sure (by definition). For the 32-bit non-LFS case and the 64-bit
    | case, you just use the plain interface and you get 32-bit or 64-bit
    | data types and file size support.

    But then there is the library issue. Either you need two libraries or
    you need to make one set of interface symbols use variant names at the
    ABI layer.


    |> I won't be able to use *64 variant POSIX names. Each of these cases
    |> will mean a quite normal compile just as if the LFS strategy had never been
    |> created in the first place. The end result will be programs that expect to
    |> link to dynamic symbols with no *64 variant names, yet have two different
    |> ABIs.
    |
    | These are 2 different ABIs because they are 2 different architectures.

    2 different ABIs, or 2 different sets of names in the same ABI?
    At least the implementation I see in Linux has 2 different sets
    of names and a single ABI (all in one libc.so).

    I want to avoid the 2 different sets of names for my library. I do not
    want to end up with binary executables that are trying to link to names
    that have "64" in them, especially not on machines that have only one
    supported way to access the filesystem (machines that are beyond the
    LFS transition ... e.g. "LFS pure").


    |> They will have to be linked to a library with the correct ABI, which
    |> on these two system cases would be expected to be the only library.
    |
    | Not on a multilib system. In which case the user is still prevented
    | from linking against the wrong library (32-bit apps can't link against
    | 64-bit libraries, and vice versa).

    Is this a confusion with pointer size differences? There are 4 different
    sub-architecture possibilities:

    32-bit pointer plus 32-bit file offset
    32-bit pointer plus 64-bit file offset
    64-bit pointer plus 32-bit file offset (doubtful any of these exist)
    64-bit pointer plus 64-bit file offset

    For the sake of clarity, and because the 64/32 cases probably do not even
    exist anywhere, I'll be just discussing 32-bit pointer based architectures
    unless I say otherwise.
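
    A quick way to see which of the four cases a given compile lands in is to
    report the two sizes. This is only a sketch; describe_subarch is a
    hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Writes e.g. "32-bit pointer plus 64-bit file offset" into buf.
 * Returns what snprintf() returns. */
static int describe_subarch(char *buf, size_t len)
{
    int ptr_bits = (int)(sizeof(void *) * CHAR_BIT);
    int off_bits = (int)(sizeof(off_t) * CHAR_BIT);

    return snprintf(buf, len, "%d-bit pointer plus %d-bit file offset",
                    ptr_bits, off_bits);
}
```

    The result depends on both the target and the LFS macros in effect for
    that compile, which is exactly why one source file can end up in more
    than one of the cases.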


    |> Now take these two dynamically linked executable binaries over to a third
    |> system which is in transition.
    |
    | And they will work, because on the transitional system, the 32-bit
    | apps with 64-bit versions of the interface have *64 names.

    What do you mean by "32-bit apps with 64-bit versions"?


    | But if you don't have a different interface name for 64-bit data types
    | on a 32-bit system, you can't compile on a transitional system and use
    | the resultant binary on a non-transitional system. Or rather, the
    | binary will execute but will be broken.

    Link to a different transitional library. But that is hard because no
    standard emerged on how to have this set up to make it easy.


    |> I don't have an issue with making my library link to *64 POSIX names.
    |> Well, I do in the sense that I think the whole approach is wrong, but I
    |> can deal with it. I just don't want to create new names to export to
    |> the calling program. I don't want to expand on the POSIX mistake.
    |>
    |> I think what you are trying to suggest isn't so much that I have to use
    |> my own *64 names at the ABI layer (to dynamically link with), but rather,
    |> that it is difficult to sort out what library name will be linked with
    |> for systems with both ABIs available.
    |
    | Yes. And impossible to tell if you are linked against the correct
    | library; which is a bigger problem if you compile on one system
    | and run on another (as you would typically do with pkg mgmt).

    Yes, that is a problem. And it is a problem because POSIX chose to not
    specify a way to identify the libraries (probably because the scope of
    what POSIX is about is the API). IMHO, it should never have been an
    API issue (aside from fseek and ftell being broken in POSIX, fixed by
    the change to fseeko and ftello). It should have been an architecture
    selection issue, with a better means to identify sub-architectures on
    mixed-sub-architecture systems (mixed during transition from one to
    another).


    |> And that will end up being a lot
    |> of systems for a long time because of the lack of an exit strategy in
    |> the POSIX LFS design. So I take it that you are suggesting that what I
    |> need to do is have both filesystem size interfaces in a single library
    |> file, which then requires distinct names for the two different sizes.
    |
    | Yes.
    |
    |> If I am correct in that assessment of your position, could you tell me
    |> your feeling about using *32 names for the 32-bit version of symbols,
    |> and leave the 64-bit versions as plain names?
    |
    | Sounds great.

    I might do this, then. Since no common facility exists to properly select
    the correct library, it would be difficult to ensure that from just one
    library. And the other alternative is to forego any 32-bit offset support
    altogether.
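
    A rough sketch of what that naming scheme could look like inside one
    library file. Everything here (struct foo, foo_tell, foo_tell32,
    FOO_USE_32BIT_OFFSETS) is hypothetical and only illustrates the idea of
    keeping the plain name for the 64-bit version.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct foo { int64_t pos; };     /* hypothetical handle with a file position */

/* Plain name: the 64-bit offset version. */
int64_t foo_tell(const struct foo *f)
{
    return f->pos;
}

/* Variant name: the 32-bit offset version for callers built without LFS.
 * Fails with EOVERFLOW when the position does not fit in 32 bits. */
int32_t foo_tell32(const struct foo *f)
{
    if (f->pos > INT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    return (int32_t)f->pos;
}

/* In the public header, a 32-bit-offset compile would be redirected so the
 * source still says foo_tell but the binary links to foo_tell32. */
#ifdef FOO_USE_32BIT_OFFSETS
#define foo_tell foo_tell32
#endif
```

    Binaries built on an "LFS pure" system would then only ever reference the
    plain names.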

    How do they do this for mixing 32-bit pointer programs and 64-bit pointer
    programs on the same machine where it can support both sub-architectures?
    Are there different library files? Or do all the syscalls get *64 variant
    names, too?

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-27-2238@ipal.net |
    |------------------------------------/-------------------------------------|

  17. Re: large file support && ! large file support

    On 28 May 2007 04:03:08 GMT phil-news-nospam@ipal.net wrote:
    > On Sun, 27 May 2007 17:14:31 -0700 Frank Cusack wrote:
    >
    > | My point was that users have to select the correct macros anyway, so
    > | your use of those macros to switch between plain and *64 interface
    > | names is no extra burden. But I was assuming users correctly choose
    > | the macros to set. Which is not at all a certainty. I've seen lots
    > | and lots of open source software which gets it wrong.
    >
    > Given there are 8 different ways to do this, and apparently only 3 of
    > them make any sense, I can imagine the difficulties.


    Part of the reason it's difficult is that the Linux folks haven't
    really learned of 'getconf LFS_CFLAGS'.

    > |> Users could be making use of non-*64 names AND not define any of
    > |> the macros on a 32-bit only system (has not begun the transition)
    > |> and on a 64-bit only system (has completed the transition). What
    > |> should my library do in these two cases? I see no reason to
    > |> export to the calling program any *64 variant name.
    > |
    > | Well, sure (by definition). For the 32-bit non-LFS case and the 64-bit
    > | case, you just use the plain interface and you get 32-bit or 64-bit
    > | data types and file size support.
    >
    > But then there is the library issue. Either you need two libraries or
    > you need to make one set of interface symbols use variant names at the
    > ABI layer.


    When you wrote "64-bit only system" I took that to mean a 64-bit
    architecture, not a 32-bit architecture using the 64-bit data types.

    So most of the rest of my response (and your followup) doesn't make
    sense and I'll just elide it.

    .....
    > I might do this, then. Since no common facility exists to properly select
    > the correct library, it would be difficult to ensure that from just one
    > library. And the other alternative is to forego any 32-bit offset support
    > altogether.


    Yeah, in your header file you could just have something like

    #if defined(_ILP32) && (_FILE_OFFSET_BITS != 64)
    #error "32-bit offset (non-LFS) not supported by libfoo"
    #endif
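
    If relying on _ILP32 being predefined is a concern (not every compiler is
    guaranteed to define it), a size-based check is one possible alternative.
    This is only a sketch: the typedef name is made up, it uses the old
    negative-array-size trick so it does not need C11, and by design it only
    compiles when off_t is at least 64 bits.

```c
#include <assert.h>
#include <sys/types.h>

/* Compilation fails here (array of size -1) if off_t is narrower than 8 bytes. */
typedef char libfoo_requires_64bit_off_t[sizeof(off_t) >= 8 ? 1 : -1];
```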

    > How do they do this for mixing 32-bit pointer programs and 64-bit pointer
    > programs on the same machine where it can support both sub-architectures?
    > Are there different library files? Or do all the syscalls get *64 variant
    > names, too?


    Are you asking how does a 64-bit CPU run 32- and 64-bit programs
    simultaneously?

    First, both the linker and the runtime loader disallow linking between
    different architectures (this is done via ELF magic number stuff). So
    there are 2 different versions of each library. But because the same
    library names are used (e.g. libc.so), this means that 32-bit and
    64-bit programs have different DT_RPATH settings (either embedded in
    the app or default chosen by the runtime linker).
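
    For illustration, the word-size part of that check comes down to one byte
    in the ELF header. The sketch below looks only at e_ident[EI_CLASS]; a
    real linker or loader also compares things like the machine type, and
    elf_class_bits is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 32 or 64 for a buffer that starts with a valid ELF identification,
 * or 0 if it is not ELF or the class byte is unknown. */
static int elf_class_bits(const unsigned char *hdr, size_t len)
{
    if (len < 5 || memcmp(hdr, "\177ELF", 4) != 0)
        return 0;

    switch (hdr[4]) {            /* e_ident[EI_CLASS] */
    case 1:  return 32;          /* ELFCLASS32 */
    case 2:  return 64;          /* ELFCLASS64 */
    default: return 0;
    }
}
```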

    32-bit programs make syscalls to the 64-bit kernel via a different
    entry point than 64-bit programs make (which allows clearing the upper
    half of registers, etc). So I would guess that there are 2 different
    kernel functions for the 2 different LFS data sizes. Solaris on SPARC
    seems to do it that way:

    $ dis -F lseek /usr/lib/libc.so
    **** DISASSEMBLER ****


    disassembly for /usr/lib/libc.so

    section .text
    lseek()
    lseek: 82 10 20 13 mov 0x13, %g1
    lseek+0x4: 91 d0 20 08 ta %icc, %g0 + 8
    lseek+0x8: 0a bd 85 dc blu __cerror
    lseek+0xc: 01 00 00 00 nop
    lseek+0x10: 81 c3 e0 08 retl
    lseek+0x14: 01 00 00 00 nop
    $ dis -F llseek /usr/lib/libc.so
    **** DISASSEMBLER ****


    disassembly for /usr/lib/libc.so

    section .text
    llseek()
    llseek: 82 10 20 af mov 0xaf, %g1
    llseek+0x4: 91 d0 20 08 ta %icc, %g0 + 8
    llseek+0x8: 0a bd 85 ec blu __cerror64
    llseek+0xc: 01 00 00 00 nop
    llseek+0x10: 81 c3 e0 08 retl
    llseek+0x14: 01 00 00 00 nop
    $

    So for lseek, it's syscall #0x13 and for llseek it's syscall #0xaf
    (lseek maps to llseek in the Solaris LFS environment).

    The only problem with that is that the 64-bit libc calls syscall #0x13
    for BOTH lseek() and llseek(). Therefore it would seem to me that the
    lseek() syscall does in fact return 64-bit data.

    I suppose in this case, since a 32-bit non-LFS app cannot pass 64-bit
    data to lseek(), it is only the return value we care about validating.
    And if there is a return value >= 2^31, the different syscall entry
    point handler can throw the [valid] return from the syscall away as
    invalid. I don't know how to trace an app at that level to verify
    this one way or the other. That doesn't seem quite right because the
    lseek() would still have been done, and the handler would have to know
    whether to expect signed or unsigned return values, and that for signed
    values -1 is the only acceptable result. Unless of course 2^31-1 is
    the largest possible return value for a 32-bit syscall.

    Well, as you can see, I don't really know how the syscall part works.

    -frank

  18. Re: large file support && ! large file support

    On Mon, 28 May 2007 00:00:50 -0700 Frank Cusack wrote:
    | On 28 May 2007 04:03:08 GMT phil-news-nospam@ipal.net wrote:
    |> On Sun, 27 May 2007 17:14:31 -0700 Frank Cusack wrote:
    |>
    |> | My point was that users have to select the correct macros anyway, so
    |> | your use of those macros to switch between plain and *64 interface
    |> | names is no extra burden. But I was assuming users correctly choose
    |> | the macros to set. Which is not at all a certainty. I've seen lots
    |> | and lots of open source software which gets it wrong.
    |>
    |> Given there are 8 different ways to do this, and apparently only 3 of
    |> them make any sense, I can imagine the difficulties.
    |
    | Part of the reason it's difficult is that the Linux folks haven't
    | really learned of 'getconf LFS_CFLAGS'.

    ================================================================
    phil@varuna:/home/phil 916> getconf LFS_CFLAGS
    -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
    phil@varuna:/home/phil 917>
    ================================================================

    It seems to do something. But what is that telling me? I thought that
    the macros that provide information from the system to the program had
    different names, e.g. _LFS_LARGEFILE and _LFS64_LARGEFILE.

    How can getconf know whether a program is written to use 64 bits with
    the traditional non-64 names, or is written to use the alternate names?
    It certainly cannot if not given any such program to examine.

    All that getconf can be expected to do is provide information about what
    the system implementation is capable of providing.


    |> |> Users could be making use of non-*64 names AND not define any of
    |> |> the macros on a 32-bit only system (has not begun the transition)
    |> |> and on a 64-bit only system (has completed the transition). What
    |> |> should my library do in these two cases? I see no reason to
    |> |> export to the calling program any *64 variant name.
    |> |
    |> | Well, sure (by definition). For the 32-bit non-LFS case and the 64-bit
    |> | case, you just use the plain interface and you get 32-bit or 64-bit
    |> | data types and file size support.
    |>
    |> But then there is the library issue. Either you need two libraries or
    |> you need to make one set of interface symbols use variant names at the
    |> ABI layer.
    |
    | When you wrote "64-bit only system" I took that to mean a 64-bit
    | architecture, not a 32-bit architecture using the 64-bit data types.

    Sorry. I'm talking about the opaque data types to deal with file offsets
    and file sizes, structs that contain them as members, and the functions
    that work with these types and structs.


    | So most of the rest of my response (and your followup) doesn't make
    | sense and I'll just elide it.

    Oh.


    | ....
    |> I might do this, then. Since no common facility exists to properly select
    |> the correct library, it would be difficult to ensure that from just one
    |> library. And the other alternative is to forego any 32-bit offset support
    |> altogether.
    |
    | Yeah, in your header file you could just have something like
    |
    | #if defined(_ILP32) && (_FILE_OFFSET_BITS != 64)
    | #error "32-bit offset (non-LFS) not supported by libfoo"
    | #endif

    But I will support it. If my library gets compiled on a system that has no
    LFS support, I want it to work just fine with the size that is available.


    |> How do they do this for mixing 32-bit pointer programs and 64-bit pointer
    |> programs on the same machine where it can support both sub-architectures?
    |> Are there different library files? Or do all the syscalls get *64 variant
    |> names, too?
    |
    | Are you asking how does a 64-bit CPU run 32- and 64-bit programs
    | simultaneously?
    |
    | First, both the linker and the runtime loader disallow linking between
    | different architectures (this is done via ELF magic number stuff). So
    | there are 2 different versions of each library. But because the same
    | library names are used (e.g. libc.so), this means that 32-bit and
    | 64-bit programs have different DT_RPATH settings (either embedded in
    | the app or default chosen by the runtime linker).
    |
    | 32-bit programs make syscalls to the 64-bit kernel via a different
    | entry point than 64-bit programs make (which allows clearing the upper
    | half of registers, etc). So I would guess that there are 2 different
    | kernel functions for the 2 different LFS data sizes. Solaris on SPARC
    | seems to do it that way:

    At the kernel ABI layer, there does not necessarily need to be different
    syscalls. With a translating library dynamically linked in, a call to a
    32-bit function can still call a 64-bit kernel interface in most cases.
    A few cases like open() would still need to inform the kernel so it can
    flag the descriptor to limit access to only those files that 32-bit
    offsets can work with.

    For example, a program expecting struct stat with 32-bit members calls
    stat() via being linked to a translating library. The stub code in that
    library for stat() will call the kernel interface with a 64-bit expectation
    (passes a pointer to a 64-bit struct stat in most implementations). If the
    kernel returns an error, the stub returns that error. Else it then tests
    the values to see if they properly fit in a 32-bit struct stat. If any do
    not, it returns an error. If they all do fit, it fills in the caller's
    32-bit struct stat and returns success.
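
    The stub logic described above could be sketched roughly as follows. The
    structs are cut down to two members, kernel_stat64 is a stand-in that
    fakes the kernel side from a size argument, and all the names are
    hypothetical; this is not how any particular libc does it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Cut-down stand-ins; a real struct stat has many more members. */
struct stat64_sketch { int64_t size; int64_t blocks; };
struct stat32_sketch { int32_t size; int32_t blocks; };

/* Pretend kernel interface: always answers in 64-bit fields. */
static int kernel_stat64(int64_t file_size, struct stat64_sketch *out)
{
    out->size = file_size;
    out->blocks = (file_size + 511) / 512;
    return 0;
}

/* Translating stub presented to programs that expect 32-bit members. */
static int stat32_stub(int64_t file_size, struct stat32_sketch *out)
{
    struct stat64_sketch wide;
    int rc = kernel_stat64(file_size, &wide);

    if (rc != 0)
        return rc;               /* pass kernel errors through unchanged */

    if (wide.size > INT32_MAX || wide.blocks > INT32_MAX) {
        errno = EOVERFLOW;       /* a value does not fit the 32-bit struct */
        return -1;
    }

    out->size = (int32_t)wide.size;
    out->blocks = (int32_t)wide.blocks;
    return 0;
}
```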

    Thus a kernel ABI could have mostly only 64-bit file offset/size interfaces
    and still support programs expecting 32-bit offset/size interfaces by means
    of a different library that presents a 32-bit program ABI and talks to the
    kernel's 64-bit ABI. A similar method could be used for 32-bit pointer ABI
    interfaces as well. Either there would be separate libraries loaded for
    each set of functions (the dynamic linker would have to know how to present
    these as a single faked libc), or there would be a combination of libraries
    to handle the variant mixes of pointer bits and offset bits that might exist.



    |
    | $ dis -F lseek /usr/lib/libc.so
    | **** DISASSEMBLER ****
    |
    |
    | disassembly for /usr/lib/libc.so
    |
    | section .text
    | lseek()
    | lseek: 82 10 20 13 mov 0x13, %g1
    | lseek+0x4: 91 d0 20 08 ta %icc, %g0 + 8
    | lseek+0x8: 0a bd 85 dc blu __cerror
    | lseek+0xc: 01 00 00 00 nop
    | lseek+0x10: 81 c3 e0 08 retl
    | lseek+0x14: 01 00 00 00 nop
    | $ dis -F llseek /usr/lib/libc.so
    | **** DISASSEMBLER ****
    |
    |
    | disassembly for /usr/lib/libc.so
    |
    | section .text
    | llseek()
    | llseek: 82 10 20 af mov 0xaf, %g1
    | llseek+0x4: 91 d0 20 08 ta %icc, %g0 + 8
    | llseek+0x8: 0a bd 85 ec blu __cerror64
    | llseek+0xc: 01 00 00 00 nop
    | llseek+0x10: 81 c3 e0 08 retl
    | llseek+0x14: 01 00 00 00 nop
    | $
    |
    | So for lseek, it's syscall #0x13 and for llseek it's syscall #0xaf
    | (lseek maps to llseek in the Solaris LFS environment).
    |
    | The only problem with that is that the 64-bit libc calls syscall #0x13
    | for BOTH lseek() and llseek(). Therefore it would seem to me that the
    | lseek() syscall does in fact return 64-bit data.

    That's plausible.


    | I suppose in this case, since a 32-bit non-LFS app cannot pass 64-bit
    | data to lseek(), it is only the return value we care about validating.
    | And if there is a return value >= 2^31, the different syscall entry
    | point handler can throw the [valid] return from the syscall away as
    | invalid. I don't know how to trace an app at that level to verify
    | this one way or the other. That doesn't seem quite right because the
    | lseek() would still have been done, and the handler would have to know
    | whether to expect signed or unsigned return values, and that for signed
    | values -1 is the only acceptable result. Unless of course 2^31-1 is
    | the largest possible return value for a 32-bit syscall.
    |
    | Well, as you can see, I don't really know how the syscall part works.

    As long as we are interfacing to a standard API via a library, we should
    not have to know in order to achieve correct operation on a system that
    has the capability to perform as expected.

    I'll eventually dig my old 32-bit Sparc machines out and set them up.
    I have an old Solaris 7 to run on them (as well as OpenBSD and Splack).
    And I got some Ultra 10's being tossed out from work and could download
    a Solaris 10 for those (hopefully it still supports those machines).
    Then I'd have at least a couple diverse test points to see how well my
    code does, in addition to the x86 Linux systems I have now, with x86-64
    coming soon, plus all the emulation architectures in QEMU and Hercules.

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-28-0647@ipal.net |
    |------------------------------------/-------------------------------------|

  19. Re: large file support && ! large file support

    On 28 May 2007 12:46:03 GMT phil-news-nospam@ipal.net wrote:
    > On Mon, 28 May 2007 00:00:50 -0700 Frank Cusack wrote:
    > | Part of the reason it's difficult is that the Linux folks haven't
    > | really learned of 'getconf LFS_CFLAGS'.
    >
    > ================================================================
    > phil@varuna:/home/phil 916> getconf LFS_CFLAGS
    > -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
    > phil@varuna:/home/phil 917>
    > ================================================================
    >
    > It seems to do something. But what is that telling me? I thought that
    > the macros that provide information from the system to the program had
    > different names, e.g. _LFS_LARGEFILE and _LFS64_LARGEFILE.


    Maybe you're thinking of sysconf()?

    > How can getconf know whether a program is written to use 64 bits with
    > the traditional non-64 names, or is written to use the alternate names?
    > It certainly cannot if not given any such program to examine.
    >
    > All that getconf can be expected to do is provide information about what
    > the system implementation is capable of providing.


    'getconf LFS_CFLAGS' tells you what to pass to the preprocessor (it
    should really be 'getconf LFS_CPPFLAGS') to enable the LFS
    environment. My point about Linux is that much software simply makes
    up what they think the flags should be, rather than calling getconf to
    find out. This makes it difficult for you to correctly write a
    portable library which DTRT, since the #ifdef'd flags you use to
    enable LFS (and which choose functionality in the user-included header
    file for your library), even though correct, may not be the ones a
    user of your library uses.
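
    From C, the same string should be obtainable with confstr() on systems
    that define _CS_LFS_CFLAGS. A minimal sketch (lfs_cflags is a
    hypothetical helper, and a NULL return just means the value is not
    available there):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* make sure confstr() is declared */
#endif

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns a malloc'd copy of the LFS_CFLAGS string, or NULL if the system
 * does not provide one. The caller frees the result. */
static char *lfs_cflags(void)
{
#ifdef _CS_LFS_CFLAGS
    size_t n = confstr(_CS_LFS_CFLAGS, NULL, 0);   /* size incl. the NUL */
    char *buf;

    if (n == 0)
        return NULL;

    buf = malloc(n);
    if (buf != NULL && confstr(_CS_LFS_CFLAGS, buf, n) == 0) {
        free(buf);
        buf = NULL;
    }
    return buf;
#else
    return NULL;
#endif
}
```

    On a system where off_t is already 64 bits the string may well be empty,
    since no extra flags are needed there.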

    > | ....
    > |> I might do this, then. Since no common facility exists to properly select
    > |> the correct library, it would be difficult to ensure that from just one
    > |> library. And the other alternative is to forego any 32-bit offset support
    > |> altogether.
    > |
    > | Yeah, in your header file you could just have something like
    > |
    > | #if defined(_ILP32) && (_FILE_OFFSET_BITS != 64)
    > | #error "32-bit offset (non-LFS) not supported by libfoo"
    > | #endif
    >
    > But I will support it. If my library gets compiled on a system that has no
    > LFS support, I want it to work just fine with the size that is available.


    You just said, "the other alternative is to forego 32-bit offset
    support". I thought you meant you might do that.

    -frank

  20. Re: large file support && ! large file support

    On Mon, 28 May 2007 11:03:12 -0700 Frank Cusack wrote:
    | On 28 May 2007 12:46:03 GMT phil-news-nospam@ipal.net wrote:
    |> On Mon, 28 May 2007 00:00:50 -0700 Frank Cusack wrote:
    |> | Part of the reason it's difficult is that the Linux folks haven't
    |> | really learned of 'getconf LFS_CFLAGS'.
    |>
    |> ================================================================
    |> phil@varuna:/home/phil 916> getconf LFS_CFLAGS
    |> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
    |> phil@varuna:/home/phil 917>
    |> ================================================================
    |>
    |> It seems to do something. But what is that telling me? I thought that
    |> the macros that provide information from the system to the program had
    |> different names, e.g. _LFS_LARGEFILE and _LFS64_LARGEFILE.
    |
    | Maybe you're thinking of sysconf()?

    No. I was referring to the macros _LFS_LARGEFILE and _LFS64_LARGEFILE as
    defined in the POSIX LFS extensions.

    http://www.unix.org/version2/whatsnew/lfs20mar.html
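
    For reference, a program can test those two macros after including
    <unistd.h>. A small sketch, assuming the macros are defined to 1 when the
    corresponding interfaces are supported, as I read the document above; the
    function names are made up.

```c
#include <assert.h>
#include <unistd.h>

/* 1 if the system advertises the LFS changes to the plain interfaces. */
static int has_lfs_largefile(void)
{
#if defined(_LFS_LARGEFILE) && _LFS_LARGEFILE == 1
    return 1;
#else
    return 0;
#endif
}

/* 1 if the system advertises the transitional *64 interfaces. */
static int has_lfs64_largefile(void)
{
#if defined(_LFS64_LARGEFILE) && _LFS64_LARGEFILE == 1
    return 1;
#else
    return 0;
#endif
}
```

    These tell the program what the system offers, which is a separate matter
    from which macros the program itself sets to request it.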



    |> How can getconf know whether a program is written to use 64 bits with
    |> the traditional non-64 names, or is written to use the alternate names?
    |> It certainly cannot if not given any such program to examine.
    |>
    |> All that getconf can be expected to do is provide information about what
    |> the system implementation is capable of providing.
    |
    | 'getconf LFS_CFLAGS' tells you what to pass to the preprocessor (it
    | should really be 'getconf LFS_CPPFLAGS') to enable the LFS
    | environment. My point about Linux is that much software simply makes
    | up what they think the flags should be, rather than calling getconf to
    | find out. This makes it difficult for you to correctly write a
    | portable library which DTRT, since the #ifdef'd flags you use to
    | enable LFS (and which choose functionality in the user-included header
    | file for your library), even though correct, may not be the ones a
    | user of your library uses.

    I don't think they make it up. I get it from the document I read:

    http://www.unix.org/version2/whatsnew/lfs20mar.html

    If someone else might be using different flags that are not listed in
    the above document, and not in the POSIX standard per se, then they are
    beyond my interest in supporting them (unless they can make a good
    argument on why I should adopt what they are doing).


    |> |> I might do this, then. Since no common facility exists to properly select
    |> |> the correct library, it would be difficult to ensure that from just one
    |> |> library. And the other alternative is to forego any 32-bit offset support
    |> |> altogether.
    |> |
    |> | Yeah, in your header file you could just have something like
    |> |
    |> | #if defined(_ILP32) && (_FILE_OFFSET_BITS != 64)
    |> | #error "32-bit offset (non-LFS) not supported by libfoo"
    |> | #endif
    |>
    |> But I will support it. If my library gets compiled on a system that has no
    |> LFS support, I want it to work just fine with the size that is available.
    |
    | You just said, "the other alternative is to forego 32-bit offset
    | support". I thought you meant you might do that.

    I thought about it in terms of the impact. I want to get to a point where
    32-bit file offset support is fully deprecated. But that is going to be a
    very very long time, especially with embedded systems.

    The whole purpose of opaque datatypes was to allow correctly written code
    to simply be recompiled in another architecture or subarchitecture (where
    a change in file offset size is a subarchitecture) and just work. But too
    many people write bad code, and too many people made special hacks to some
    systems and became dependent on it, and the standards people fell behind
    and ended up being swayed to adopt bad choices in a hurry. That and the
    standard itself had some defects that should have been fixed earlier (e.g.
    the ftell/ftello and fseek/fseeko issue) like in version 1.

    We haven't achieved true opaque datatypes. Many people envision that it
    will be quite a while before we need 128 bit file offset types. Do you
    think we'll have things fixed by then? That should be plenty of time,
    right?

    I've periodically imagined putting together my own low level language much
    like C (very much like C, actually, but a few important differences) which
    would avoid many of the issues C has. That thought keeps bubbling up as
    it might also be a way to drive a new cleaner system interface as well.
    One thing my language would have is the ability to define data types as
    any size you wish (from among what is available) rather easily, but only
    by defining a type. You have to define your own type, then define variables
    to that type. You can't simply make foo be an integer of 32 bits; you must
    define a type of your own as a 32 bit integer then make foo be an instance
    of that new type. It will force at least the motion of going about creating
    opaque types (and hopefully some programmers would even do it sensibly).

    --
    |---------------------------------------/----------------------------------|
    | Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
    | first name lower case at ipal.net / spamtrap-2007-05-28-1527@ipal.net |
    |------------------------------------/-------------------------------------|
