struct stat - Unix

Thread: struct stat

  1. struct stat

    Something which has bothered me when I write programs in assembly
    language is the stat syscall, which takes a pointer to a struct stat.

    How do I know the size of this structure so that I can allocate the
    right amount of space for it? I can look it up in the C library header
    where it is declared, but this is not only some detective work involving
    several headers, but the size of the struct might also change with a new
    version of the operating system.

    Is there a clean way to handle this? Am I missing something?


    Bjarni
    --

    INFORMATION WANTS TO BE FREE

  2. Re: struct stat

    On October 26, 2008 14:44, in comp.unix.programmer, Bjarni Juliusson
    (bjarni@update.uu.se) wrote:

    > Something which has bothered me when I write programs in assembly
    > language is the stat syscall, which takes a pointer to a struct stat.
    >
    > How do I know the size of this structure so that I can allocate the
    > right amount of space for it? I can look it up in the C library header
    > where it is declared, but this is not only some detective work involving
    > several headers, but the size of the struct might also change with a new
    > version of the operating system.
    >
    > Is there a clean way to handle this? Am I missing something?


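    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    int main(void)
    {
        /* sizeof lets the compiler supply the structure size */
        struct stat *MyStat = malloc(sizeof *MyStat);

        if (MyStat != NULL)
        {
            if (stat("/etc/passwd", MyStat) == 0)
            {
                printf("/etc/passwd is linked %u times\n",
                       (unsigned int)MyStat->st_nlink);
            }
            free(MyStat);
        }
        return 0;
    }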





    --
    Lew Pitcher

    Master Codewright & JOAT-in-training | Registered Linux User #112576
    http://pitcher.digitalfreehold.ca/ | GPG public key available by request
    ---------- Slackware - Because I know what I'm doing. ------



  3. Re: struct stat

    Lew Pitcher wrote a bunch of the usual
    CLC hyper-correct nonsense, while, in true CLC style, totally missing
    the point, leading up to:
    ....
    >> Is there a clean way to handle this? Am I missing something?

    >
    >#include <stdio.h>
    >#include <stdlib.h>
    >#include <sys/types.h>
    >#include <sys/stat.h>


    Um, what part of "When I write programs in assembly language"
    seems to have escaped you?


  4. Re: struct stat

    Bjarni Juliusson wrote:

    > Something which has bothered me when I write programs in assembly
    > language is the stat syscall, which takes a pointer to a struct stat.


    Isn't this a problem for all functions that take or return a structure
    parameter? What's so special about stat()?

    And why are you programming something like this in assembly in the first
    place?

    --
    Barry Margolin, barmar@alum.mit.edu
    Arlington, MA
    *** PLEASE post questions in newsgroups, not directly to me ***
    *** PLEASE don't copy me on replies, I'll read them in the group ***

  5. Re: struct stat

    Bjarni Juliusson wrote:

    > Something which has bothered me when I write programs in assembly
    > language is the stat syscall, which takes a pointer to a struct stat.
    >
    > How do I know the size of this structure so that I can allocate the
    > right amount of space for it? I can look it up in the C library header
    > where it is declared, but this is not only some detective work involving
    > several headers, but the size of the struct might also change with a new
    > version of the operating system.


    You have to allocate the memory on the stack or other memory, allotting the
    same size for the struct as C would use. In practice when writing a
    graphical programming language that had a compiler that took a list of
    lists and generated assembly, I found it easier to work with C for
    syscalls. I just linked in some code that would work with the syscalls to
    my assembly code.

    You get little benefit from doing syscalls directly in assembly, because the
    systems supporting x86/Intel processors aren't consistent. Some use int
    0x80, while others use sysenter, some support both. Also, systems like
    Linux don't use the SysV syscall ABI. Linux passes the first arguments to
    syscalls in registers, unlike systems like NetBSD or OpenBSD. This means
    you often have to spill some registers elsewhere, if you're working with
    Linux syscalls, and happen to cache some critical pointer or value in a
    register that the kernel interprets as a syscall argument.

    In practice if you're against using C for members of struct stat, i.e.
    access functions, you can use .equ.

    .equ dev_t_offset, 0
    .equ inode_offset, 4
    .equ mode_offset, 8

    /* Let's say that %eax holds the start of the struct stat and we are
    using AT&T-compatible syntax. */
    movl $mode_offset(%eax),%ecx

    #define macros are another alternative.

    It gets a bit more complex when you realize that some systems also have
    64-bit types in struct stat. glibc/Linux supports 2 types of struct stat
    depending on compilation flags (32-bit types and 64-bit) on 32-bit
    architectures. Some of the BSD family broke compatibility years ago for
    64-bit values in struct stat.


    > Is there a clean way to handle this? Am I missing something?


    Write or preferably generate functions that access struct stat, and call
    those to get the values. You can automatically discover such things with a
    script and sizeof with offsetof.


    George

  6. Re: struct stat

    GPS wrote:
    > In practice if you're against using C for members of struct stat, i.e.
    > access functions, you can use .equ.
    >
    > .equ dev_t_offset, 0
    > .equ inode_offset, 4
    > .equ mode_offset, 8
    >
    > /*Let's say that %eax has the start of struct stat and are using
    > AT&T-compatible syntax. */
    > movl $mode_offset(%eax),%ecx


    Oops, that should just be: movl mode_offset(%eax),%ecx

    $ is for a literal value, which wouldn't make sense with the indirection.

    I program in a variety of languages at times, so $ has a different meaning
    in say Tcl than it does in gas/GNU as...


    George

  7. Re: struct stat

    Barry Margolin wrote:
    > Bjarni Juliusson wrote:
    >
    >> Something which has bothered me when I write programs in assembly
    >> language is the stat syscall, which takes a pointer to a struct stat.

    >
    > Isn't this a problem for all functions that take or return a structure
    > parameter? What's so special about stat()?


    Nothing special about stat, it's just the one that was bothering me.

    > And why are you programming something like this in assembly in the first
    > place?


    But you don't know what it is that I'm programming...

    If GPS is right, there is no way to know the size of the structure in
    assembly language, and the only way to do it is to use C for the system
    calls.

    Also note that I didn't say I would necessarily perform the system calls
    directly in assembly; but even if I call the libc wrappers, I still need
    to know the size of any structures in advance.

    Another question: Does this mean that if the kernel is upgraded,
    changing the size of struct stat or some other structure, all C programs
    on the system need to be relinked with a new version of libc, or at
    least a new build of the same libc?


    Bjarni
    --

    INFORMATION WANTS TO BE FREE

  8. Re: struct stat

    Bjarni Juliusson wrote:
    > [...]
    > If GPS is right, there is no way to know the size of the structure in
    > assembly language, and the only way to do it is to use C for the system
    > calls.
    >
    > Also note that I didn't say I would necessarily perform the system calls
    > directly in assembly; but even if I call the libc wrappers, I still need
    > to know the size of any structures in advance.


    ... and the offsets, sizes, and meanings of its elements, too.
    If you really wanted to keep the C parts separate from the assembly
    parts (to avoid needing to thumb through the definitions of the
    various structs and the typedefs that contribute to them), you'd
    probably want to write a C wrapper that exported the size as a global
    variable and provided getters and/or setters for the fields.

    > Another question: Does this mean that if the kernel is upgraded,
    > changing the size of struct stat or some other structure, all C programs
    > on the system need to be relinked with a new version of libc, or at
    > least a new build of the same libc?


    Different Unix versions have different ideas about the importance
    of version-to-version binary compatibility. But redefining a "public"
    interface like `struct stat' seems like a drastic step that no Unix
    would undertake except in direst need. Note that such a redefinition
    would require not only that all programs link with updated libraries,
    but that many programs be recompiled from source.

    You may well find differing `struct stat' sizes and layouts on
    different systems, but it seems highly unlikely that the `struct stat'
    on any particular system would ever change.

    --
    Eric.Sosman@sun.com

  9. Re: struct stat

    Bjarni Juliusson writes:

    > Barry Margolin wrote:
    >> Bjarni Juliusson wrote:
    >>
    >>> Something which has bothered me when I write programs in assembly
    >>> language is the stat syscall, which takes a pointer to a struct
    >>> stat.

    >>
    >> Isn't this a problem for all functions that take or return a
    >> structure parameter? What's so special about stat()?

    >
    > Nothing special about stat, it's just the one that was bothering me.
    >
    >> And why are you programming something like this in assembly in the
    >> first place?

    >
    > But you don't know what it is that I'm programming...
    >
    > If GPS is right, there is no way to know the size of the structure in
    > assembly language, and the only way to do it is to use C for the
    > system calls.


    Well, there's not any standard *automatic* way to know. You can count
    up the bytes used by all the fields of the structure, taking into
    account padding and alignment required by your system's ABI. You could
    also write a tiny C program that prints out sizeof(struct stat) (and
    maybe some offsetof() values) and then hardcode those values into your
    assembler program (or a header file). Or you can link your program with
    a C module containing a function that does these things and call it at
    run time. Et cetera.

    You can think of it this way: the size of the structure is documented by
    the kernel, but the documentation is in the form of a C header file.
    It's up to you to use this "documentation" to make your program conform;
    it happens that in C this is very easy, but requires some more manual
    labor in other languages.

    > Also note that I didn't say I would necessarily perform the system
    > calls directly in assembly; but even if I call the libc wrappers, I
    > still need to know the size of any structures in advance.
    >
    > Another question: Does this mean that if the kernel is upgraded,
    > changing the size of struct stat or some other structure, all C
    > programs on the system need to be relinked with a new version of libc,
    > or at least a new build of the same libc?


    That's right. For this reason, new versions of the kernel will normally
    *not* change the size or layouts of these structures, at least not as
    such.

    What's usually done when they want to introduce new fields into these
    structures is to make a new system call. For instance, suppose stat(2)
    had been system call number 42, and used a certain layout for struct
    stat. Now the kernel authors want to add a st_foobar field. They will
    leave system call 42 untouched, still using the old layout, and make a
    new system call, number 99 say, which uses the new layout including
    st_foobar. <sys/stat.h> will be altered to reflect the new layout, and
    the stat() function in libc will be changed to use system call 99, so
    newly compiled programs will get the new layout and can use st_foobar.
    Programs which are still linked against the old libc will continue to
    use system call 42, and will work the same as before, though they won't
    have access to the fancy new st_foobar feature.

    You can see evidence of this process in Linux's sources, for example.

    For dynamically linked programs, where libc might be changed without the
    program being recompiled, there are shared library versioning schemes to
    ensure that everyone stays on the same page.

    Obviously there is a certain amount of overhead and redundancy
    introduced when all this happens, so the OS vendors try not to do it
    without due consideration and a good reason.

  10. Re: struct stat

    >Something which has bothered me when I write programs in assembly
    >language is the stat syscall, which takes a pointer to a struct stat.
    >
    >How do I know the size of this structure so that I can allocate the
    >right amount of space for it? I can look it up in the C library header
    >where it is declared, but this is not only some detective work involving
    >several headers, but the size of the struct might also change with a new
    >version of the operating system.


    My suggestion is to use an assembly-language header file which
    defines such things as sizes of structures, offsets of fields from
    the beginning of structures, and other stuff you need from C header
    files. (struct stat isn't the only problem you'll run into. Among
    others, there is struct tm. You may need some preprocessor symbols,
    too.)

    Now write a C program that will output this header file with the
    right values, letting the C compiler figure out sizes and offsets.
    If you use make, write a rule to use the program to construct the
    header file.

    Operating systems tend to try to preserve binary compatibility.
    However, if they do break it, you can just use your program to
    construct a new header file.


  11. Re: struct stat

    Bjarni Juliusson wrote:
    [SNIP]
    > Another question: Does this mean that if the kernel is upgraded,
    > changing the size of struct stat or some other structure, all C programs
    > on the system need to be relinked with a new version of libc, or at
    > least a new build of the same libc?

    Yes.

    64-bit inodes are an example of this.

    The XFS file system (IRIX and Linux) can have 64-bit inodes switched on,
    but doing so can break programs (the Intel C compiler, for example).

    Trivially avoidable by either using all 64-bit file access, or calling
    stat64() directly, but apparently the poor helots at Intel can't quite
    grasp the idea!

    Cheers,
    Gary B-)

    --
    __________________________________________________ ____________________________
    Armful of chairs: Something some people would not know
    whether you were up them with or not
    - Barry Humphries

  12. Re: struct stat

    On Oct 27, 1:04 pm, Bjarni Juliusson wrote:

    > Another question: Does this mean that if the kernel is upgraded,
    > changing the size of struct stat or some other structure, all C programs
    > on the system need to be relinked with a new version of libc, or at
    > least a new build of the same libc?


    No. C programs that use the old version of libc will not know that the
    'stat' system call has changed and will instead call the system call
    previously known as 'stat' (and now probably known as 'old_stat').

    DS

  13. Re: struct stat

    Eric Sosman writes:

    > Different Unix versions have different ideas about the importance
    >of version-to-version binary compatibility. But redefining a "public"
    >interface like `struct stat' seems like a drastic step that no Unix
    >would undertake except in direst need. Note that such a redefinition
    >would require not only that all programs link with updated libraries,
    >but that many programs be recompiled from source.


    This was done with the transition from SVR3 to SVR4. The
    original stat kernel interface was kept for compatibility with existing
    a.outs, and a new internal 'xstat' system call was created which had
    a version argument in addition to the path and struct stat arguments.

    Newly compiled programs calling stat would get the new structure format
    and would continue to call the libc stat wrapper. libc stat wrapper
    would invoke the xstat system call with the correct version.

    This was done to transition several fields from 16-bit types to 32-bit types.

    >
    > You may well find differing `struct stat' sizes and layouts on
    >different systems, but it seems highly unlikely that the `struct stat'
    >on any particular system would ever change.


    see above.

    scott

  14. Re: struct stat

    Nate Eldredge wrote:
    > Well, there's not any standard *automatic* way to know. You can count
    > up the bytes used by all the fields of the structure, taking into
    > account padding and alignment required by your system's ABI. You could
    > also write a tiny C program that prints out sizeof(struct stat) (and
    > maybe some offsetof() values) and then hardcode those values into your
    > assembler program (or a header file). Or you can link your program with
    > a C module containing a function that does these things and call it at
    > run time. Et cetera.
    >
    > You can think of it this way: the size of the structure is documented by
    > the kernel, but the documentation is in the form of a C header file.
    > It's up to you to use this "documentation" to make your program conform;
    > it happens that in C this is very easy, but requires some more manual
    > labor in other languages.


    Yes. I'll probably do it the way Gordon proposed in his post.

    > What's usually done when they want to introduce new fields into these
    > structures is to make a new system call. For instance, suppose stat(2)
    > had been system call number 42, and used a certain layout for struct
    > stat. Now the kernel authors want to add a st_foobar field. They will
    > leave system call 42 untouched, still using the old layout, and make a
    > new system call, number 99 say, which uses the new layout including
    > st_foobar. <sys/stat.h> will be altered to reflect the new layout, and
    > the stat() function in libc will be changed to use system call 99, so
    > newly compiled programs will get the new layout and can use st_foobar.
    > Programs which are still linked against the old libc will continue to
    > use system call 42, and will work the same as before, though they won't
    > have access to the fancy new st_foobar feature.
    >
    > You can see evidence of this process in Linux's sources, for example.


    In fact, it's mentioned in the man page. Thanks!

    And thanks to everyone else who posted in this thread! I appreciate your
    help. My questions have been answered.


    Bjarni
    --

    INFORMATION WANTS TO BE FREE

  15. Re: struct stat

    On Oct 27, 9:20 pm, Eric Sosman wrote:
    > Bjarni Juliusson wrote:
    > > [...]
    > > If GPS is right, there is no way to know the size of the
    > > structure in assembly language, and the only way to do it is
    > > to use C for the system calls.


    > > Also note that I didn't say I would necessarily perform the
    > > system calls directly in assembly; but even if I call the
    > > libc wrappers, I still need to know the size of any
    > > structures in advance.


    > ... and the offsets, sizes, and meanings of its elements, too.
    > If you really wanted to keep the C parts separate from the
    > assembly parts (to avoid needing to thumb through the
    > definitions of the various structs and the typedefs that
    > contribute to them), you'd probably want to write a C wrapper
    > that exported the size as a global variable and provided
    > getters and/or setters for the fields.


    You could also write a small C++ program which generates
    assembler declarations, using sizeof and offsetof. Something
    like:

    std::cout << "stat_size = "
    << sizeof( struct stat ) << std::endl ;
    std::cout << "stat_dev_offset = "
    << offsetof( struct stat, st_dev ) << std::endl ;
    // ...

    (Using the GNU assembler syntax.)

    > > Another question: Does this mean that if the kernel is
    > > upgraded, changing the size of struct stat or some other
    > > structure, all C programs on the system need to be relinked
    > > with a new version of libc, or at least a new build of the
    > > same libc?


    > Different Unix versions have different ideas about the
    > importance of version-to-version binary compatibility. But
    > redefining a "public" interface like `struct stat' seems like
    > a drastic step that no Unix would undertake except in direst
    > need. Note that such a redefinition would require not only
    > that all programs link with updated libraries, but that many
    > programs be recompiled from source.


    I was under the impression that binary compatibility of
    executables was maintained by the use of dynamically linked
    objects. If a new version breaks binary compatibility of
    executables, it changes the magic in the executable file (or
    some other information), so that it can detect the version of a
    binary, and adjust the .so it dynamically loads accordingly.

    Of course, he would have to recompile and regenerate all of his
    program if he wanted to recompile anything. But he could still
    manage the problem by putting the generation of his assembler in
    his makefile, along with some sort of test on the results.

    > You may well find differing `struct stat' sizes and layouts on
    > different systems, but it seems highly unlikely that the
    > `struct stat' on any particular system would ever change.


    I doubt that it will change often; it's not like the function is
    a new innovation, still somewhat experimental. But in general,
    I wouldn't count on a struct at the system API level never
    changing.

    --
    James Kanze (GABI Software) email:james.kanze@gmail.com
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

  16. Re: struct stat

    On Oct 28, 4:38 pm, sc...@slp53.sl.home (Scott Lurndal) wrote:
    > Eric Sosman writes:


    [re modification of struct stat...]
    > This was done to transition several fields from 16-bit types
    > to 32-bit types.


    Which makes me think that we'll see such things on most systems,
    sometime in the near future. When time_t moves to 64 bits.

    --
    James Kanze (GABI Software) email:james.kanze@gmail.com
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

  17. Re: struct stat

    Eric Sosman wrote:
    > Bjarni Juliusson wrote:
    >> Another question: Does this mean that if the kernel is upgraded,
    >> changing the size of struct stat or some other structure, all C programs
    >> on the system need to be relinked with a new version of libc, or at
    >> least a new build of the same libc?

    >
    > Different Unix versions have different ideas about the importance
    > of version-to-version binary compatibility. But redefining a "public"
    > interface like `struct stat' seems like a drastic step that no Unix
    > would undertake except in direst need. Note that such a redefinition
    > would require not only that all programs link with updated libraries,
    > but that many programs be recompiled from source.


    Not necessarily. As someone pointed out elsewhere in the thread; systems
    can often be upgraded by renaming the kernel function that implements the
    old syscall to something else, and making the new stat function use a
    different offset. The libc is also upgraded to match and use the new
    struct stat. NetBSD calls this "syscall versioning."

    http://www.netbsd.org/docs/internals...all_versioning

    One of the downsides is if you are transferring a struct stat through shared
    memory, and both programs don't agree on the size of the struct, it may
    cause a failure.

    > You may well find differing `struct stat' sizes and layouts on
    > different systems, but it seems highly unlikely that the `struct stat'
    > on any particular system would ever change.


    It can even change at compile-time, due to LFS support. See this:
    http://www.unix.org/version2/whatsnew/lfs20mar.html

    That's where stat64 and its other ugly friends come from. Unfortunately the
    LFS support makes some things impossible in GNU/Linux, like
    using -D_FILE_OFFSET_BITS=64 with the fts.h functions.

    George

  18. Re: struct stat

    GPS wrote:
    > Eric Sosman wrote:
    >> Bjarni Juliusson wrote:
    >>> Another question: Does this mean that if the kernel is upgraded,
    >>> changing the size of struct stat or some other structure, all C programs
    >>> on the system need to be relinked with a new version of libc, or at
    >>> least a new build of the same libc?

    >> Different Unix versions have different ideas about the importance
    >> of version-to-version binary compatibility. But redefining a "public"
    >> interface like `struct stat' seems like a drastic step that no Unix
    >> would undertake except in direst need. Note that such a redefinition
    >> would require not only that all programs link with updated libraries,
    >> but that many programs be recompiled from source.

    >
    > Not necessarily. As someone pointed out elsewhere in the thread; systems
    > can often be upgraded by renaming the kernel function that implements the
    > old syscall to something else, and making the new stat function use a
    > different offset. The libc is also upgraded to match and use the new
    > struct stat. NetBSD calls this "syscall versioning."


    "It depends on what the meaning of the word 'is' is." Or in
    this case, what the meaning of `struct stat' is. In schemes like
    the one you describe, "the" `struct stat' hasn't changed, but the
    name `struct stat' has been reassigned to a new struct. In fact,
    for backwards compatibility you'll find that the original struct
    and the system call(s) that use it still exist, even if the names
    have changed.

    In a way, it's like domain-name squatting ...

    >> [...] it seems highly unlikely that the `struct stat'
    >> on any particular system would ever change.


    I should have been clearer about what I meant by this. It is
    in fact fairly common for the compile-time name `struct stat' to
    refer to different data structures over time, and for parallel sets
    of system services to grow up around the new structures as they
    appear. The point I was trying to make (not too well, I guess) is
    that even when a new `struct stat' appears, the old `struct stat'
    will almost certainly not vanish, nor will the system services that
    went with it. Bjarni was worried about a kernel upgrade changing
    `struct stat' in such a way that an already-compiled program would
    suddenly cease to function; it was this failure mode I was trying
    to characterize as unlikely.

    --
    Eric.Sosman@sun.com

  19. Re: struct stat

    > [re modification of struct stat...]
    >> This was done to transition several fields from 16-bit types
    >> to 32-bit types.

    >
    >Which makes me think that we'll see such things on most systems,
    >sometime in the near future. When time_t moves to 64 bits.


    Some systems, for example FreeBSD, already have 64 bits reserved
    for the time fields, although the designation for the second
    half seems to be for fractional-second use.

    I'd prefer that timestamps go directly to 256 bits (with at least
    64 bits for fractions of a second). That, however, breaks POSIX
    requirements for timestamps in units of a second.


  20. Re: struct stat

    James Kanze writes:
    >On Oct 27, 9:20 pm, Eric Sosman wrote:
    >> Bjarni Juliusson wrote:
    >> > [...]
    >> > If GPS is right, there is no way to know the size of the
    >> > structure in assembly language, and the only way to do it is
    >> > to use C for the system calls.

    >
    >> > Also note that I didn't say I would necessarily perform the
    >> > system calls directly in assembly; but even if I call the
    >> > libc wrappers, I still need to know the size of any
    >> > structures in advance.

    >
    >> ... and the offsets, sizes, and meanings of its elements, too.
    >> If you really wanted to keep the C parts separate from the
    >> assembly parts (to avoid needing to thumb through the
    >> definitions of the various structs and the typedefs that
    >> contribute to them), you'd probably want to write a C wrapper
    >> that exported the size as a global variable and provided
    >> getters and/or setters for the fields.

    >
    >You could also write a small C++ program which generates
    >assembler declarations, using sizeof and offsetof. Something
    >like:


    Probably easier, at least with the gnu toolchain, is to get
    the 'pahole' application from the dwarves package and post
    process its output into assembler macros which define the
    structure. (pahole will print offset and size values for each
    field based upon the dwarf info (-g) compiled into the app).

    In the DEC VMS days, all data structures were defined in abstract
    (MDL) and processed by the 'mdl' application into assembler
    definitions, C definitions, Bliss-32 definitions and Pascal
    definitions automatically.

    scott
