mmap() function call... - Unix


  1. mmap() function call...

    Hi Everyone,

    I searched for mmap() and found the following on Wikipedia:


    'Anonymous mappings are mappings of physical RAM to virtual memory.
    This is similar to malloc, and is used in some malloc implementations
    for certain allocations'


    I understood that the memory contents are mapped to files using
    mmap()


    However, from the following link,


    http://ou800doc.caldera.com/en/man/html.2/mmap.2.html


    I understand that mmap() just avoids the read/write concept of a file
    and makes sure that files are accessed as raw memory.


    Can anyone help me as to what is correct about mmap()?


    Thanks in advance!!!


  2. Re: mmap() function call...

    On Apr 23, 1:42 am, sam_...@yahoo.co.in wrote:
    > Hi Everyone,
    >
    > I searched for mmap() and i found the following in wikipedia,
    >
    > 'Anonymous mappings are mappings of physical RAM to virtual memory.
    > This is similar to malloc, and is used in some malloc implementations
    > for certain allocations'
    >
    > I understood that the memory contents are mapped to files using
    > mmap()
    >
    > However, from the following link,
    >
    > http://ou800doc.caldera.com/en/man/html.2/mmap.2.html
    >
    > I understand that mmap() just avoids read/write concept of a file
    > and
    > makes sure that files are accessed as raw memory.
    >
    > Can anyone help me as to what is correct about mmap()?


    They are both correct. What conflict do you see between them?

    You could use 'read' and 'write' on a file instead of 'malloc' and
    memory access if you wanted to.

    DS


  3. Re: mmap() function call...


    >
    > They are both correct. What conflict do you see between them?
    >
    > You could use 'read' and 'write' on a file instead of 'malloc' and
    > memory access if you wanted to.
    >


    Well, I see that the link says that a file can be mapped to a page in
    memory, but from Wikipedia it seems to be the other way: mapping from
    memory to virtual memory (disk)... Mapping in both directions confuses
    me... :-(

    Second, if I have a file of 100 MB and RAM of 64 MB, will mmap()
    work? I think the entire content of the file can't be mapped onto a
    page in RAM. Hence, is it correct to assume that when there is a
    page fault, the OS will automatically take care of it and bring in the
    new page with the remaining 36 MB of data? And is it correct to
    understand that the application need not worry about all these facts,
    and can still assume that the complete 100 MB of data is available at
    the location pointed to by the pointer returned by mmap(), irrespective
    of whether a page fault has happened or not?



  4. Re: mmap() function call...

    On Apr 23, 3:23 am, sam_...@yahoo.co.in wrote:

    > Well, i see that the link says that a file can be mapped to a page in
    > memory but from Wikipedia it seems to be the other way, mapping from
    > memory to virtual memory (disk)... mapping in both directions confuses
    > me... :-(


    You can map a file into memory. That just means that pages from the
    file are loaded into memory and that memory is mapped into your
    process's address space. An anonymous mapping acts like a mapping of a
    file, except there's no actual file. So it's just a way to have some
    memory around.

    > Second, if i have a file of 100 Mb and RAM of 64 MB, will mmap()
    > work? I think the entire content of the file can't be mapped on to a
    > page in RAM. Hence, is it correct to assume that when there is a
    > fault, OS will automatically take care of page fault and bring in the
    > new page with the remaining 36MB of data? and is it correct to
    > understand that the application need not worry about all these facts,
    > and still can assume that the complete 100 Mb of data is available in
    > the location pointed by the pointer returned by mmap(), irrespective
    > of whether page fault has happened or not?


    Correct. The mapping is actually into the process' virtual address
    space. Physical memory will be used to back the mapping as needed.

    DS
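
    To make the answer above concrete, here is a minimal sketch in C,
    assuming a POSIX system. count_byte_mmap is a hypothetical helper, not
    something from this thread or from any library. It maps a whole file and
    walks it like an array; the code never calls read(), and any page faults
    along the way are handled by the OS without the program noticing.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count occurrences of byte `c` in the file at `path` by mapping the whole
 * file and walking it like an ordinary array. Returns -1 on error.
 * count_byte_mmap is a hypothetical helper name, not a library function. */
long count_byte_mmap(const char *path, unsigned char c)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {           /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    /* Sets up the address range; typically no file data is read yet. */
    const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++) /* touching a new page may fault it in */
        if (p[i] == c)
            count++;

    munmap((void *)p, len);
    return count;
}
```

    The same call should behave identically whether the file is smaller or
    larger than physical RAM, as long as the file fits in the process'
    virtual address space.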


  5. Re: mmap() function call...

    >
    > You can map a file into memory. That just means that pages from the
    > file are loaded into memory and that memory is mapped into your
    > process's address space. An anonymous mapping acts like a mapping of a
    > file, except there's no actual file. So it's just a way to have some
    > memory around.
    >


    Thanks for the clarification. I have now tried to map a file into
    memory using mmap() and it works fine.
    However, I'm not sure about mapping memory to an anonymous file... :-(

    Can you give a small example to illustrate how it can be done? And
    if I'm correct, the purpose would be to have dynamic memory when the
    RAM doesn't have enough memory to be allocated to a process.

    Second, if I mmap() and don't munmap(), will that cause a memory
    leak?


  6. Re: mmap() function call...

    Let's start from scratch... This may be a bit inaccurate, but anyway:

    A process has a virtual address space (virtual memory).

    Unless the process is swapped out, some or all of that virtual memory
    is mapped to physical memory.

    The virtual memory is also backed by the swap file so that the process'
    memory can be swapped/paged out if the system needs more memory. So if
    you update some memory and that page is later paged out, the change is
    written to the swap file. If you read some allocated virtual memory which
    is not mapped to physical memory, the corresponding data is read from the
    swap file into a new physical memory page, and the virtual memory page is
    mapped to that.

    mmap() with a file maps a file other than the swap file to some virtual
    memory. Depending on the mmap() flags (MAP_SHARED rather than
    MAP_PRIVATE), if you then update that memory, the file is updated. If
    you read from that virtual memory, a chunk of the file is read into
    physical memory if necessary and mapped to the appropriate virtual
    memory page.
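
    The effect of the flags can be sketched as follows, assuming a POSIX
    system. poke_first_byte is a hypothetical helper written for this
    illustration. With MAP_SHARED a plain store into the mapping ends up in
    the file; with MAP_PRIVATE the process only changes its own copy and the
    file is left alone. The msync() call is there so the result does not
    depend on when the kernel decides to write the page back.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Overwrite the first byte of `path` through a one-page mapping created with
 * `flags` (MAP_SHARED or MAP_PRIVATE). Returns 0 on success, -1 on error.
 * The file must be non-empty. poke_first_byte is a hypothetical name. */
int poke_first_byte(const char *path, int flags, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, flags, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    p[0] = value;                   /* an ordinary store, no write() call */

    int rc = 0;
    if (flags & MAP_SHARED)         /* push the modified page to the file */
        rc = msync(p, (size_t)page, MS_SYNC);

    munmap(p, (size_t)page);
    return rc;
}
```

    Reading the file back with read() after each call is a simple way to
    see the difference between the two flags.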

    > if i have a file of 100 Mb and RAM of 64 MB, will mmap() work?


    If your _virtual_ address space is larger than 100 MB (+ whatever the
    process is already using), that's no problem. If a process is using
    200 MB of virtual memory but you only have 64 MB of physical memory, then
    only some of the virtual memory pages are mapped to physical memory at
    any given time. The rest has been saved to the swap file or your mmapped
    file. If the process tries to access an address which is not in memory
    but is on file, the OS reads that address' page into a free physical
    memory page from the swap/mmap file, likely after throwing some other
    page out from physical memory to a file.

    --
    Hallvard

  7. Re: mmap() function call...


    >
    > mmap() with a file maps another file than the swap file to some virtual
    > memory. Depending on the mmap() flags, if you then update that memory,
    > the file is updated. If you read from that virtual memory, a chunk of
    > the file is if necessary read into physical memory and mapped to the
    > appropriate virtual memory page.
    >


    So you mean to say mmap() makes sure that instead of the default
    swap file, a new file (described by the file descriptor passed to
    mmap) is used for swapping.
    But I have a question here. Assume I have a program sample.c which
    invokes mmap() with fd1, and the executable is sample.exe.

    Now when sample.exe is executed, a process is created and a default
    swap file would be associated with it. However, during the execution
    of the process, mmap() with fd1 is invoked and, according to your
    statement, this new file would become the swap file. But this is just
    for the data in the file, so what about the original swap file of the
    process sample.exe?

    Sorry if I'm not making sense; I would appreciate it if you could
    explain this case in detail.

    sample.c

    int main()
    {
    int fd1;
    fd1 = open(...);
    mmap(...,fd1,...);
    }


  8. Re: mmap() function call...

    sam_cit@yahoo.co.in wrote:
    >> They are both correct. What conflict do you see between them?
    >>
    >> You could use 'read' and 'write' on a file instead of 'malloc' and
    >> memory access if you wanted to.
    >>

    >
    > Well, i see that the link says that a file can be mapped to a page in
    > memory but from Wikipedia it seems to be the other way, mapping from
    > memory to virtual memory (disk)... mapping in both directions confuses
    > me... :-(


    Mapping can be 1:1, hence you can't tell what's mapping to what.

    >
    > Second, if i have a file of 100 Mb and RAM of 64 MB, will mmap()
    > work?


    Yes.

    > ... I think the entire content of the file can't be mapped on to a
    > page in RAM. Hence, is it correct to assume that when there is a
    > fault, OS will automatically take care of page fault and bring in the
    > new page with the remaining 36MB of data?


    Not quite, but yes, kind of.

    Calling mmap() usually does not read the file; it's the subsequent page
    faults that read the file. If your algorithm to read/write the file is
    sequential, then it will basically do as you said. However, if access is
    random, it goes all over the place.

    > ... and is it correct to
    > understand that the application need not worry about all these facts,
    > and still can assume that the complete 100 Mb of data is available in
    > the location pointed by the pointer returned by mmap(), irrespective
    > of whether page fault has happened or not?
    >


    Most of the time, yes. Some of the time (seldom), it's important to
    perform some kind of read-ahead to avoid disk thrashing. For example,
    if you're comparing two files, it may be important to touch many pages
    of one area of memory (file) before reading the other.

  9. Re: mmap() function call...

    >
    > Most of the time, yes. Some of the time (seldom), it's important to
    > perform some kind of read-ahead to avoid disk thrashing. For example,
    > if you're comparing two files, it may be important to touch many pages
    > of one area of memory (file) before reading the other.
    >


    OK, from all of the above posts I can infer that mmap() can be used
    for my purpose. However, I have one more question: does mmap() allocate
    memory in the heap of the process or in some other region outside the
    process in which mmap() is called? And is it mandatory to invoke
    munmap() after every call to mmap()?




  10. Re: mmap() function call...

    >
    > Most of the time, yes. Some of the time (seldom), it's important to
    > perform some kind of read-ahead to avoid disk thrashing. For example,
    > if you're comparing two files, it may be important to touch many pages
    > of one area of memory (file) before reading the other.


    Does mmap() allocate any memory? If so, is it in the heap segment of
    the process which invokes the mmap() or is it in the global memory?
    Is there any link that explains the way mmap() works?


  11. Re: mmap() function call...

    sam_cit@yahoo.co.in writes:
    >> Most of the time, yes. Some of the time (seldom), it's important to
    >> perform some kind of read-ahead to avoid disk thrashing. For example,
    >> if you're comparing two files, it may be important to touch many pages
    >> of one area of memory (file) before reading the other.

    >
    > Does mmap() allocate any memory?


    'mmap' is one of the system-supplied primitives that can be used by
    memory allocators (eg malloc) to request more virtual memory from the
    kernel (the other is brk). It basically extends the set of valid
    memory locations of the process calling it. The first time a process
    accesses a location that has been 'declared to be valid' this way, a
    page fault happens, which causes the kernel page fault handler to run
    (somewhat simplified). This page fault handler then allocates a
    physical page of memory, fills it with the required content (data from
    a 'mapped file' or all-zeroes for anonymous mappings) and assigns it
    to the process whose access faulted. After that, the faulting
    instruction is restarted and completes as if the newly allocated page
    had always been there.

    > If so, is it in the heap segment of the process which invokes the
    > mmap() or is it in the global memory?


    'The heap segment' doesn't technically exist. 'Heap' is usually memory
    managed by the C library memory allocator (malloc), which acquires
    'valid virtual memory' from the kernel 'somehow'.

  12. Re: mmap() function call...

    On Apr 23, 4:58 am, sam_...@yahoo.co.in wrote:

    > Thanks for the clarification, i have tried now to map a file to the
    > memory using mmap and it works fine.
    > However, i'm not sure about mapping memory to a anonymous file... :-(


    > Can you give a small example to illustrate how it can be done? and
    > if i'm correct the purpose would be to have dynamic memory when the
    > RAM doesn't have enough memory to be allocated to a process.


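    /* Needs <sys/mman.h>, <stdio.h>, <stdlib.h> and <string.h>. */
    char *ptr;
    ptr = mmap(NULL, 16384, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {    /* mmap() reports failure as MAP_FAILED, not NULL */
        perror("mmap");
        exit(EXIT_FAILURE);
    }
    strcpy(ptr, "Hello, world!");
    printf("%s\n", ptr);
    munmap(ptr, 16384);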

    Your explanation of the purpose doesn't seem to make very much sense.
    If there isn't enough virtual memory, 'mmap' will fail. If there is,
    there's no reason functions like 'malloc' or 'sbrk' should fail.

    > Second, if i mmap() and don't munmap(), will that cause a memory
    > leak?


    Yes, until the process terminates.

    DS


  13. Re: mmap() function call...

    On Apr 23, 3:18 pm, David Schwartz wrote:

    > Your explanation of the purpose doesn't seem to make very much sense.
    > If there isn't enough virtual memory, 'mmap' will fail. If there is,
    > there's no reason functions like 'malloc' or 'sbrk' should fail.


    I understand what you're saying now. Yes, if you 'mmap' a file
    *shared* then the file will normally be used to back your 'mmap'
    rather than swap. So it can be a way to get the effect of allocating
    more memory without sufficient RAM+swap being available.

    In addition, you can use 'mmap' backed by real files to deal with
    insufficient address space. You can 'mmap' in only the files (or file
    ranges) you need at the time, switching them in and out as required.

    DS
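
    The idea of mapping only the file range needed at the moment can be
    sketched like this, assuming a POSIX system. byte_via_window is a
    hypothetical helper that maps a single page around the requested offset
    and unmaps it again, so the address space cost stays at one page however
    large the file is.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one byte at file offset `off` by mapping only the page that contains
 * it, then unmapping again. Returns the byte (0..255) or -1 on error or if
 * `off` is outside the file. byte_via_window is a hypothetical name. */
int byte_via_window(const char *path, off_t off)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || off < 0 || off >= st.st_size) {
        close(fd);
        return -1;
    }

    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t base = off - (off % page);    /* mmap offsets must be page-aligned */

    const unsigned char *win = mmap(NULL, (size_t)page, PROT_READ,
                                    MAP_SHARED, fd, base);
    close(fd);
    if (win == MAP_FAILED)
        return -1;

    int value = win[off - base];
    munmap((void *)win, (size_t)page);  /* release the address range */
    return value;
}
```

    A real program would keep the file open and use a larger window than
    one page; the single-page window here just keeps the arithmetic easy to
    follow.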


  14. Re: mmap() function call...

    sam_cit@yahoo.co.in wrote:
    > Now when the sample.exe is executed, a process is created a default
    > swap file would be associated.


    Basically, except that there is one swap file (or set of swap files)
    shared by the whole system. They are not associated with any one
    single process.

    > However, during the execution of the
    > process, mmap() with fd1 is invoked and according to your statement,
    > this new file would become the swap file,


    It doesn't replace the swap file. Instead, every page in the process's
    virtual address space has an associated file to be used as backing store
    when the physical RAM is not available to hold that page. Pages allocated
    with malloc() use the system swap space as backing store. Pages put into
    the address space with mmap() use the specified file as the backing store.

    In other words, the kernel maintains a table. This table keeps track of
    the backing store for every (virtual) page. If the kernel decides it
    wants to reclaim the physical page (of RAM), it simply consults the
    table to know what to do with the virtual page's data. If it is mmap()ed,
    then it can write the data to the file named in the mmap(). If it is
    an anonymous page, it can use swap space.

    - Logan

  15. Re: mmap() function call...

    sam_cit@yahoo.co.in wrote:
    >> Most of the time, yes. Some of the time (seldom), it's important to
    >> perform some kind of read-ahead to avoid disk thrashing. For example,
    >> if you're comparing two files, it may be important to touch many pages
    >> of one area of memory (file) before reading the other.

    >
    > Does mmap() allocate any memory?


    It does add to the set of virtual addresses that your process can
    access.

    > If so, is it in the heap segment of
    > the process which invokes the mmap() or is it in the global memory?


    It is in neither. The heap is simply a data structure used by a
    library to break up pieces of memory. Heap management happens
    entirely in user code with no involvement from the kernel. However,
    if you call malloc() and there is insufficient space available in
    the heap, it will ask the kernel for more memory.

    Globals are complicated. If they are read-only, they could in fact
    be loaded by mmap()! After all, globals are just variables in your
    address space, and they are coming from a file (the executable), and
    they need to appear in memory. So mmap() is perfect for that.

    In general, though, you should stop thinking in terms of heap and
    globals. They are valid concepts, but they exist at a higher layer
    than mmap() and other mechanisms provided by the kernel.

    > Is there any link that explains the way mmap() works?


    It is easiest to understand if you understand the whole memory
    architecture. It helps to understand virtual memory versus
    physical memory and how the mapping between those is done
    (including having a basic idea of the concepts used in the
    hardware). And then it helps to understand the difference
    between library calls and system calls. Finally, you want to
    understand the interface between kernel and user code so that
    you know what must take place at each layer.

    The point is that all the information is probably not available
    in one place. You may need to get some of it from several
    different places and piece it together in your mind.

    Anyway, I found some BSD-related documentation helpful when I
    last tried to understand this. I can't remember what it was
    exactly, but I think this may have been the paper I found helpful:

    http://www.usenix.org/publications/l.../silvers_html/

    - Logan
