Reading /proc/<pid>/maps file - Linux


  1. Reading /proc/<pid>/maps file

    Hi,

    I want to dump the virtual memory map of my process, and it seems like
    I can get this from the /proc/<pid>/maps file. The trouble is, when I
    try to loop over this file until EOF, I never get EOF and if I try to
    calculate the length of the file, it's turning out to be zero!

    Here is a small snippet that demonstrates the problem with calculating
    the filelength.

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char* argv[])
    {
        char filename[1024];
        /* argv[1] is the pid of the process to inspect */
        sprintf(filename, "/proc/%s/maps", argv[1]);
        printf("filename is %s\n", filename);

        FILE* fp = fopen(filename, "r");
        assert(fp);
        fseek(fp, 0, SEEK_END);
        long pos = ftell(fp);
        assert(pos);        /* this is the assertion that fails: pos is 0 */
        char* buf = malloc(pos);
        fseek(fp, 0, SEEK_SET);
        fread(buf, pos, 1, fp);
        puts(buf);
        fclose(fp);
        return 0;
    }

    This one fails the assertion for pos.

    Any ideas on what I'm doing wrong? Am I missing something here?

    Thanks in advance,
    pramod

  2. Re: Reading /proc/<pid>/maps file

    Pramod Subramanyan wrote in
    news:0d56f4fa-bf8b-4d0c-b661-f73b4b38ac38@r37g2000prr.googlegroups.com:

    > I want to dump the virtual memory map of my process, and it seems like
    > I can get this from the /proc/<pid>/maps file. The trouble is, when I
    > try to loop over this file until EOF, I never get EOF and if I try to
    > calculate the length of the file, it's turning out to be zero!


    It's not a real file and hence the fseek(...SEEK_END) cannot calculate the
    end of the file. (Likewise, "ls -l /proc/xx/maps" always shows a size of
    0.)
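    The zero size can also be seen from C. The following is a minimal
    sketch, assuming Linux with /proc mounted; reported_size() and
    is_regular_file() are hypothetical helper names, not library calls.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns the size that stat() reports for path, or -1 if stat() fails.
 * reported_size() is a hypothetical helper name. */
long long reported_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Returns 1 if stat() classifies path as a regular file, 0 if not,
 * and -1 if stat() fails. is_regular_file() is a hypothetical helper name. */
int is_regular_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

    Calling reported_size("/proc/self/maps") should give 0 even though the
    file has readable content, so a buffer sized from the reported length
    ends up empty.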

    However, you can read it and get EOF at the end. For example, you should
    be able to successfully execute "rd </proc/xx/maps", where rd is built
    from this trivial rd.c:

    #include <stdio.h>

    int main(int ac, char **av)
    {
        int c;
        /* copy stdin to stdout until getchar() reports EOF */
        while ((c = getchar()) != EOF)
            putchar(c);
        return 0;
    }


    GH

  3. Re: Reading /proc/<pid>/maps file

    > I want to dump the virtual memory map of my process, and it seems like
    > I can get this from the /proc/<pid>/maps file. The trouble is, when I
    > try to loop over this file until EOF, I never get EOF and if I try to
    > calculate the length of the file, it's turning out to be zero!


    $ ls -l /proc/self/maps
    -r--r--r-- 1 user user 0 2008-10-16 09:33 /proc/self/maps
    #----------------------^

    Most of the "information-generating" files in /proc do not support *seek(),
    and support only some of the fields of *stat().

    Instead, read() such a file until the return count is non-positive.
    Do not use fread(), or other stdio functions.
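    A minimal sketch of such a read() loop, assuming a POSIX system;
    copy_to_fd() is a hypothetical helper name. It stops when read()
    returns 0 and treats a negative return as an error, except for EINTR.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Copies everything readable from path to out_fd using only open(),
 * read() and write(). Returns the number of bytes copied, or -1 on error.
 * copy_to_fd() is a hypothetical helper name. */
long copy_to_fd(const char *path, int out_fd)
{
    char buf[4096];
    long total = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n == 0)              /* read() returning 0 marks end-of-file */
            break;
        if (n < 0) {
            if (errno == EINTR)  /* interrupted by a signal: retry */
                continue;
            close(fd);
            return -1;
        }

        /* write() may be partial too, so loop until the chunk is out. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(fd);
    return total;
}
```

    For example, copy_to_fd("/proc/self/maps", STDOUT_FILENO) would dump
    the calling process's own map to standard output.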

    --

  4. Re: Reading /proc/<pid>/maps file

    Pramod Subramanyan wrote:

    > The trouble is, when I
    > try to loop over this file until EOF, I never get EOF


    I'm just guessing here, but it looks like you are looping over the process's
    own maps (i.e. virtual memory). If that is the case, could the reading
    process be allocating more virtual memory? Kind of like a cat chasing its
    own tail.

    --
    Chris

  5. Re: Reading /proc/<pid>/maps file

    Chris wrote in
    news:NbKJk.2030$%%2.812@edtnps82:

    >> The trouble is, when I
    >> try to loop over this file until EOF, I never get EOF

    >
    > I'm just guessing here, but it looks like you are looping over the
    > process's own maps (i.e. virtual memory). If that is the case, could
    > the reading process be allocating more virtual memory? Kind of like a
    > cat chasing its own tail.


    I don't see how that could happen. First, the map only contains an entry
    for each virtual memory "region", not for each page so it would have to be
    growing by adding new regions (not sure how you'd do that -- presumably by
    creating additional separate mmap regions). Second, even if it were
    continuously adding regions, this would at worst simply be extending the
    apparent file size. That's not the same as "never get EOF". You'd get EOF
    eventually (when you ran out of VM space).

    Anyway, that's no different than reading a file that another process is
    appending to. It might go on for a long time, but not forever.
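    To make the "one entry per region" point concrete, here is a rough
    sketch that counts the region lines in a maps file. It assumes the usual
    Linux line layout ("start-end perms offset dev inode path");
    count_regions() is a hypothetical helper name, and stdio is used only
    for brevity.

```c
#include <stdio.h>

/* Counts the lines of a maps-style file that parse as "start-end perms ...".
 * Each such line describes one region, not one page.
 * count_regions() is a hypothetical helper name. */
int count_regions(const char *path)
{
    char line[4352];   /* room for the fixed fields plus a long pathname */
    int regions = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Example line: 00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3 && end > start)
            regions++;
    }

    fclose(fp);
    return regions;
}
```

    Since each line covers (end - start) bytes, a few dozen lines can
    describe a very large address space, which is why the file stays short
    unless the process keeps creating separate mappings.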

    GH

  6. Re: Reading /proc/<pid>/maps file

    John Reiser wrote in
    news:48f76f26$0$3714$39cecf19@news.twtelecom.net:

    >> I want to dump the virtual memory map of my process, and it seems
    >> like I can get this from the /proc/<pid>/maps file. The trouble is,
    >> when I try to loop over this file until EOF, I never get EOF and if I
    >> try to calculate the length of the file, it's turning out to be zero!


    > Instead, read() such a file until the return count is non-positive.
    > Do not use fread(), or other stdio functions.


    Why should one not use stdio?

    GH

  7. Re: Reading /proc/<pid>/maps file

    With respect to /proc/<pid>/maps, or any other file in /proc
    whose contents are dynamically generated:
    > while ((c = getchar()) != EOF)
    > putchar(c);


    Run that code under strace, then look at the last read():
    read(0, "", 1024) = 0
    Concluding that getchar() should return EOF in this case is brash.
    For some kinds of files, getchar() should keep trying!
    Generated files in /proc do not supply all fields of struct stat,
    and saying that the file type is S_IFREG (regular file) is
    a stretch: seeking is not supported, etc. Thus, when getchar()
    concludes that (0==read(fd, buf, size)) implies EOF, it is almost
    a coincidence. It is safer to avoid stdio here.
    Use open() and read() directly, and apply specific knowledge
    of /proc to conclude that 0==read(,,) implies EOF.

    --

  8. Re: Reading /proc/<pid>/maps file

    John Reiser writes:

    > With respect to /proc/<pid>/maps, or any other file in /proc
    > whose contents are dynamically generated:
    >> while ((c = getchar()) != EOF)
    >> putchar(c);

    >
    > Run that code under strace, then look at the last read():
    > read(0, "", 1024) = 0
    > Concluding that getchar() should return EOF in this case is brash.
    > For some kinds of files, getchar() should keep trying!
    > Generated files in /proc do not supply all fields of struct stat,
    > and saying that the file type is S_IFREG (regular file) is
    > a stretch: seeking is not supported, etc. Thus, when getchar()
    > concludes that (0==read(fd, buf, size)) implies EOF, it is almost
    > a coincidence.


    I don't think that's correct. /proc/pid/maps is a "regular file" in the
    sense of stat(); it's not a pipe, device, or any other such thing. In
    my experience, getc()/getchar() on a regular file returns EOF
    *precisely* when read() returns 0. Indeed, read() returning 0 is the
    only way that the operating system indicates end-of-file, and the /proc
    files are implemented with this in mind. read() returns the bytes,
    then returns 0, which is exactly what would happen for an ordinary disk
    file, and getc() does the right thing.

    It's possible I'm mistaken; I tried to find the relevant code in glibc
    but got lost in a maze of twisty little jump tables. But on FreeBSD,
    for instance, that's what happens. If Linux is different, could you
    give a test case, or point out the source?

    And for non-regular files, usually it's the reverse of your statement
    that's true. getc() is generally not "smart enough" to keep reading
    after read() returns 0, even when that's what you want to do. It's
    those cases where you must use read() manually.

    The OP's problem was caused by trying to seek in the file, which
    certainly doesn't work. Arguably it was a mistake to make the /proc
    files appear as regular files when you can't seek on them, but I think
    we're stuck with that now.
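    One way to check this claim on a given system is a small stdio reader
    that verifies why the loop ended. This is only a sketch under the
    assumption that the file can be opened with fopen();
    count_bytes_stdio() is a hypothetical helper name.

```c
#include <stdio.h>

/* Reads path one character at a time through stdio.
 * Returns the byte count if the loop ended at end-of-file, or -1 if the
 * file could not be opened or a read error occurred.
 * count_bytes_stdio() is a hypothetical helper name. */
long count_bytes_stdio(const char *path)
{
    long count = 0;
    int c;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    while ((c = getc(fp)) != EOF)
        count++;

    /* getc() returns EOF for both end-of-file and errors; tell them apart. */
    if (ferror(fp) || !feof(fp))
        count = -1;

    fclose(fp);
    return count;
}
```

    A positive result for "/proc/self/maps" means getc() reached a clean
    end-of-file without any seeking involved.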

  9. Re: Reading /proc/<pid>/maps file

    On Oct 16, 10:09 am, Gil Hamilton wrote:
    > John Reiser wrote in news:48f76f26$0$3714$39cecf19@news.twtelecom.net:


    > > Do not use fread(), or other stdio functions.


    > Why should one not use stdio?


    Because this file has special properties that the stdio code may not
    understand how to handle. The 'stdio' code is buffered, and reading a
    file through a buffer while the contents of that file are changing
    gives unpredictable results.

    DS

  10. Re: Reading /proc/<pid>/maps file

    David Schwartz writes:

    > On Oct 16, 10:09 am, Gil Hamilton wrote:
    >> John Reiser wrote in news:48f76f26$0$3714$39cecf19@news.twtelecom.net:

    >
    >> > Do not use fread(), or other stdio functions.

    >
    >> Why should one not use stdio?

    >
    > Because this file has special properties that the stdio code may not
    > understand how to handle. The 'stdio' code is buffered, and reading a
    > file through a buffer while the contents of that file are changing
    > gives unpredictable results.


    Wait a minute. What does the buffering have to do with it?

    The question is, what happens at the level of read()?

    If you read() part of a /proc file, and then call read() again a little
    later to get some more, is the second one guaranteed to be consistent
    with the first one, or not?

    If the former is the case, then buffering can't hurt you. If the latter
    is the case, then not only can you not use stdio, you also can't use
    read() directly, unless you are careful to read() the whole thing at one
    go, which is hard to do when you don't know how big it is.

    It appears to me that the first case holds, that multiple read()s on a
    /proc file are consistent even if time elapses in between. It looks
    like the kernel generates the appropriate text in a buffer when the file
    is opened (or first read?), and further read()s just read out of that
    buffer. I assume if someone else opens it in the meantime, they'll get
    their own buffer with updated text. So the file isn't actually changing
    as you read it, in a sense.

    To test this, I used /proc/stat, which contains some cpu usage
    statistics.

    nate@archdiocese:/proc$ cat /proc/stat
    cpu 36129403 41649 2410347 292406206 6269588 44754 204578 0
    cpu0 36129403 41649 2410347 292406206 6269588 44754 204578 0
    intr 3462800761 3375578369 53483 0 7 288328 21988 2 286 0 2 2 86850069 5907 0 2229 89
    ctxt 1383672407
    btime 1216639718
    processes 1407293
    procs_running 2
    procs_blocked 0

    This is a single cpu system, so the cpu and cpu0 lines should always be
    identical. The first four numbers on those lines count cpu time, in
    clock ticks, that the system spent in various modes, so they should
    change very quickly. Now I do

    nate@archdiocese:/proc$ ( dd bs=50 count=1 ; sleep 10; cat ) < /proc/stat
    cpu 36129651 41649 2410565 292421296 6269732 44751+0 records in
    1+0 records out
    50 bytes (50 B) copied, 0.000412338 s, 121 kB/s
    4 204578 0
    cpu0 36129651 41649 2410565 292421296 6269732 44754 204578 0
    intr 3462957794 3375735402 53483 0 7 288328 21988 2 286 0 2 2 86850069 5907 0 2229 89
    ctxt 1383800523
    btime 1216639718
    processes 1407371
    procs_running 1
    procs_blocked 0

    We see that the two lines still agree, even though they come from read()
    calls 10 seconds apart, from different processes no less.

    So it seems to me that using stdio on /proc files should be safe,
    provided that you don't try to seek.
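    The same experiment can be sketched in C with plain read() calls on one
    descriptor. This is an illustration, not a proof of what the kernel
    guarantees; read_in_two_steps() is a hypothetical helper name, and
    output longer than the caller's buffer is silently truncated.

```c
#include <fcntl.h>
#include <unistd.h>

/* Reads up to `first` bytes from path, sleeps delay_sec seconds, then reads
 * the remainder into the same buffer through the same descriptor.
 * Returns the total number of bytes stored (buf is NUL-terminated),
 * or -1 on error. read_in_two_steps() is a hypothetical helper name. */
long read_in_two_steps(const char *path, size_t first, unsigned delay_sec,
                       char *buf, size_t cap)
{
    size_t total = 0;
    ssize_t n;
    int fd;

    if (cap == 0 || first >= cap)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Step 1: a single short read, like "dd bs=50 count=1". */
    n = read(fd, buf, first);
    if (n < 0) {
        close(fd);
        return -1;
    }
    total = (size_t)n;

    sleep(delay_sec);

    /* Step 2: drain the rest, like "cat" on the same open file. */
    while (total < cap - 1 &&
           (n = read(fd, buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;

    close(fd);
    if (n < 0)
        return -1;
    buf[total] = '\0';
    return (long)total;
}
```

    Calling it on "/proc/stat" with first = 50 and delay_sec = 10 mirrors
    the shell test; on a single-CPU machine the cpu and cpu0 lines in buf
    can then be compared.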


  11. Re: Reading /proc/<pid>/maps file

    Thanks for pointing out that SEEK_END won't work. It looks like
    lseek(fd, 0, SEEK_SET) works. So what I'm doing now is trying to read
    the whole file in one slurp, and if that doesn't work, double the
    buffer size, seek to the beginning, and read again until read returns
    a count that is less than the buffer size. My first solution was to
    use fscanf and feof. I'm still not sure why that wasn't working. Maybe
    there was some other bug in my code.
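    A sketch of that doubling approach follows, assuming lseek(fd, 0,
    SEEK_SET) works on the file as described. One caution: a single read()
    is allowed to return fewer bytes than requested, so a short count alone
    is not taken as end-of-file here; the sketch keeps reading until read()
    returns 0. slurp_file() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads all of path into a malloc()ed, NUL-terminated buffer.
 * On success returns the buffer (caller frees) and stores its length in
 * *len; returns NULL on error. slurp_file() is a hypothetical helper name. */
char *slurp_file(const char *path, size_t *len)
{
    size_t cap = 4096;
    char *buf = NULL;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;

    for (;;) {
        size_t total = 0;
        ssize_t n = 0;
        char *bigger = realloc(buf, cap);

        if (bigger == NULL)
            break;
        buf = bigger;

        /* Start over from the beginning with the (possibly larger) buffer. */
        if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
            break;

        /* One read() may return less than is available, so keep reading
         * until it returns 0 or the buffer (minus the NUL byte) is full. */
        while (total < cap - 1 &&
               (n = read(fd, buf + total, cap - 1 - total)) > 0)
            total += (size_t)n;
        if (n < 0)
            break;

        if (total < cap - 1) {   /* saw end-of-file with room to spare */
            buf[total] = '\0';
            *len = total;
            close(fd);
            return buf;
        }
        cap *= 2;                /* buffer filled up: double it and retry */
    }

    free(buf);
    close(fd);
    return NULL;
}
```

    Because the file is re-read from the start after each resize, the
    contents returned come from the final pass only.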

    Thanks everybody,
    Pramod

  12. Re: Reading /proc/<pid>/maps file

    On Oct 16, 7:29 pm, Nate Eldredge wrote:

    > David Schwartz writes:


    > > Because this file has special properties that the stdio code may not
    > > understand how to handle. The 'stdio' code is buffered, and reading a
    > > file through a buffer while the contents of that file are changing
    > > gives unpredictable results.


    > Wait a minute. What does the buffering have to do with it?


    > The question is, what happens at the level of read()?


    Really? How do you even know that "fread" calls "read"? All you know is
    that "fread" works for normal files "somehow".

    > If you read() part of a /proc file, and then call read() again a little
    > later to get some more, is the second one guaranteed to be consistent
    > with the first one, or not?


    That guarantee would be very helpful if you were calling "read".

    > If the former is the case, then buffering can't hurt you. If the latter
    > is the case, then not only can you not use stdio, you also can't use
    > read() directly, unless you are careful to read() the whole thing at one
    > go, which is hard to do when you don't know how big it is.


    You are making all kinds of assumptions about what stdio buffering
    looks like. Those assumptions may be true, or they may not. All you
    are assured is that "fread" does lots of complicated things that are
    efficient and safe for unchanging normal files.

    > It appears to me that the first case holds, that multiple read()s on a
    > /proc file are consistent even if time elapses in between. It looks
    > like the kernel generates the appropriate text in a buffer when the file
    > is opened (or first read?), and further read()s just read out of that
    > buffer. I assume if someone else opens it in the meantime, they'll get
    > their own buffer with updated text. So the file isn't actually changing
    > as you read it, in a sense.


    Correct, if you don't close/reopen it or otherwise try to be too
    clever. All of these things are things you can assure if you make the
    system calls and are things you cannot assure if you use stdio.

    > So it seems to me that using stdio on /proc files should be safe,
    > provided that you don't try to seek.


    How do you know stdio doesn't internally put a seek call before every
    read? How do you know it doesn't open for append and call pread? Yes,
    if you use your internal knowledge about how stdio works, but no in
    general. The 'stdio' code is for normal files and terminals, but it's
    simply wrong for 'magic' files.

    DS

  13. Re: Reading /proc/<pid>/maps file

    On 2008-10-17, David Schwartz wrote:

    > Really? How do you even know that "fread" calls "read"? All you know is
    > that "fread" works for normal files "somehow".


    The existence of fileno(3) gives a strong hint. The source for libc
    should prove conclusive.

    Bye.
    Jasen

  14. Re: Reading /proc/<pid>/maps file

    On Oct 18, 3:18 am, Jasen Betts wrote:

    > On 2008-10-17, David Schwartz wrote:


    > > Really? How do you even know that "fread" calls "read"? All you know is
    > > that "fread" works for normal files "somehow".


    > The existence of fileno(3) gives a strong hint.


    It's possible that this function opens a file, rather than returning
    one that's already opened.

    > The source for libc
    > should prove conclusive.


    Right, until on the next version of libc, the source is different. If
    you want to make your source dependent on how a particular version of
    libc is implemented, it's important to add conditional compilation
    directives to abort the compilation if you're compiling on an untested
    version of libc.

    You may want to check the source of libc every time it updates to make
    sure your code is still safe, but I sure don't.

    DS



  15. Re: Reading /proc/<pid>/maps file

    David Schwartz writes:
    > On Oct 18, 3:18 am, Jasen Betts wrote:
    >> On 2008-10-17, David Schwartz wrote:

    >
    >> > Really? How do you even know that "fread" calls "read"? All you know is
    >> > that "fread" works for normal files "somehow".

    >
    >> The existence of fileno(3) gives a strong hint.

    >
    > It's possible that this function opens a file, rather than returning
    > one that's already opened.


    That would contradict both its documented behaviour:

    You can get the underlying file descriptor for an existing
    stream with the `fileno' function.

    [...]

    -- Function: int fileno (FILE *STREAM)
    This function returns the file descriptor associated with the
    stream STREAM.
    (glibc documentation)

    and its required behaviour:

    The fileno() function shall return the integer file descriptor
    associated with the stream pointed to by stream
    (SUS)

    It is technically correct that, just because a file descriptor is
    supposed to be associated with any particular stream, the
    stdio-implementation itself need not necessarily use any particular
    set of I/O-primitives provided in some operating environment, eg stdio
    could be implemented as a set of system calls operating on special
    purpose kernel objects which are purposely incompatible with the other
    I/O-interfaces provided by this kernel (and this is probably not much
    different from the situation in 'certain operating environments' ...)
    but for UNIX(*) and anything like it, this is simple: stdio is a
    library sitting on top of the kernel system call API and is
    implemented by using the same facilities any other lump of userspace
    code could use, too.

    So, for as long as the Ex-egcs-team doesn't start to maintain the
    Linux C-library, assuming that things behave in a 'traditionally
    considered to be reasonable way' in order to maintain compatibility
    with existing code is IMO ok. Should the (nowadays) gcc-team ever
    start to maintain a 'maximally incompatible ISO-compliant C-library
    "because WE CAN!"' a sensible suggestion would be to file this as

    Wegen-ISO-C++-Gehtnix

    and ignore it ('stupid other operating system gratuitously
    incompatible with UNIX(*) whose users can certainly care for their own
    problems, one hopes')

    >> The source for libc should prove conclusive.

    >
    > Right, until on the next version of libc, the source is different. If
    > you want to make your source dependent on how a particular version of
    > libc is implemented,


    This particular version of a C library documents this as:

    13 Low-Level Input/Output
    *************************

    This chapter describes functions for performing low-level
    input/output operations on file descriptors. These functions
    include the primitives for the higher-level I/O functions
    described in *Note I/O on Streams::, as well as functions for
    performing low-level control operations for which there are no
    equivalents on streams.

    Of course, again assuming that the 'fork gcc'-people could take over
    this, yesterday's documented features are today's laughing stock, but
    in the meantime, the documented behaviour is good enough.

  16. Re: Reading /proc/<pid>/maps file

    Rainer Weikusat writes:

    > It is technically correct that, just because a file descriptor is
    > supposed to be associated with any particular stream, the
    > stdio-implementation itself need not necessarily use any particular
    > set of I/O-primitives provided in some operating environment, eg stdio
    > could be implemented as a set of system calls operating on special
    > purpose kernel objects which are purposely incompatible with the other
    > I/O-interfaces provided by this kernel (and this is probably not much
    > different from the situation in 'certain operating environments' ...)
    > but for UNIX(*) and anything like it, this is simple: stdio is a
    > library sitting on top of the kernel system call API and is
    > implemented by using the same facilities any other lump of userspace
    > code could use, too.


    Any modern Unix-type kernel has many system calls outside the ones
    required by SUS. A libc designed specifically for, say, Linux is free
    to use Linux-specific system calls to implement any functionality.
    For example, there is no guarantee that fread() ever invokes a read()
    system call, although it seems to (I haven't read the spec carefully)
    be required to have an open file descriptor that *could* be used with
    read().

    --
    Måns Rullgård
    mans@mansr.com

  17. Re: Reading /proc/<pid>/maps file

    On 2008-10-20, David Schwartz wrote:
    > On Oct 18, 3:18 am, Jasen Betts wrote:
    >
    >> On 2008-10-17, David Schwartz wrote:

    >
    >> > Really? How do you even know that "fread" calls "read"? All you know is
    >> > that "fread" works for normal files "somehow".

    >
    >> The existence of fileno(3) gives a strong hint.

    >
    > It's possible that this function opens a file, rather than returning
    > one that's already opened.


    The function fileno() examines the argument stream and
    returns its integer descriptor.

    I think they would have used different wording if there was a chance
    of any side-effects.

    Then there is the fact that fileno(stdin) returns the same number as
    the macro STDIN_FILENO and also the behavior of fdopen() ...

    Bye.
    Jasen

  18. Re: Reading /proc/<pid>/maps file

    Måns Rullgård writes:
    > Rainer Weikusat writes:
    >> It is technically correct that, just because a file descriptor is
    >> supposed to be associated with any particular stream, the
    >> stdio-implementation itself need not necessarily use any particular
    >> set of I/O-primitives provided in some operating environment, eg stdio
    >> could be implemented as a set of system calls operating on special
    >> purpose kernel objects which are purposely incompatible with the other
    >> I/O-interfaces provided by this kernel (and this is probably not much
    >> different from the situation in 'certain operating environments' ...)
    >> but for UNIX(*) and anything like it, this is simple: stdio is a
    >> library sitting on top of the kernel system call API and is
    >> implemented by using the same facilities any other lump of userspace
    >> code could use, too.

    >
    > Any modern Unix-type kernel has many system calls outside the ones
    > required by SUS.


    Eh ... yes. So what?

    > A libc designed specifically for, say, Linux is free
    > to use Linux-specific system calls to implement any functionality.


    Please name the thing and the people responsible for it, so that I can
    be certain to never inadvertently use it. Generally, 'the Linux
    C-library', ie the C-library used by default on a Linux-distribution
    is the GNU C library. The GNU C-library has

    a) a certain documented behaviour.
    b) an open source implementation.

    > For example, there is no guarantee that fread() ever invokes a read()
    > system call,


    I already wrote in my first sentence that the UNIX(*)-specification
    does not demand that stdio uses any particular other facilities of a
    system. But this is of little interest to me (or anyone with an actual
    problem), because one does not program the UNIX(*)-specification (it
    is actually of really little interest to me, because stdio is
    historical cruft and its use should IMO be avoided).

  19. Re: Reading /proc/<pid>/maps file

    Jasen Betts writes:
    > On 2008-10-20, David Schwartz wrote:
    >> On Oct 18, 3:18 am, Jasen Betts wrote:
    >>> On 2008-10-17, David Schwartz wrote:

    >>
    >>> > Really? How do you even know that "fread" calls "read"? All you know is
    >>> > that "fread" works for normal files "somehow".

    >>
    >>> The existence of fileno(3) gives a strong hint.

    >>
    >> It's possible that this function opens a file, rather than returning
    >> one that's already opened.

    >
    > The function fileno() examines the argument stream and
    > returns its integer descriptor.
    >
    > I think they would have used different wording if there was a chance
    > of any side-effects.


    Each function that operates on a stream is said to have zero
    or more ``underlying functions''. This means that the stream
    function shares certain traits with the underlying functions,
    but does not require that there be any relation between the
    implementations of the stream function and its underlying
    functions.
    (SUS 2.5.1 'Interaction of File Descriptors and Standard I/O
    Streams')

    For the GNU C-library, the existence of such a 'relation between the
    implementation of the stdio-routines' and the I/O-interfaces provided
    by the kernel is documented to exist. Its existence can also
    empirically be verified by looking at the source code. This does not
    preclude hypothetical other implementations, where this would be
    different, including hypothetical other libraries also called 'GNU
    C-library'.


