Thread: Re: [9fans] plan 9 overcommits memory?

  1. Re: [9fans] plan 9 overcommits memory?

    > if we allow overcommitted memory, *any* access of brk'd memory might page
    > fault. this seems like a real step backwards in error recovery as most programs
    > assume that malloc either returns n bytes of valid memory or fails. since
    > this assumption is false, either we need to make it true or fix most programs.


    in the Inferno environment, i concluded that exceptions were the only way of handling
    that (in Plan 9 you could use notes), and that deals with both explicit
    and implicit allocations. it's more obvious in Inferno because implicit allocations
    are common there: the run-time system allocates memory dynamically, and not
    just for stack frames.

    the exception handlers are placed, optionally, fairly high up in the application processes, with further supervising sets
    towards the root of the system (eg, to encapsulate individual applications within the window system).
    an unhandled exception within an application process causes that process and perhaps others in its group
    to die, and the exception is propagated to a process that's the nominated process group leader.

    note that the process that incurs the exception is just the one that ran out of memory, not the one
    that `really' caused the problem. there needs to be some extra mechanism to ensure that important
    functions survive in any case.

    i looked at quota systems, but they are far too pessimistic for memory, just as they
    are for discs (at least for most embedded devices, which is where you typically care
    most about this). typically you end up either over-committing (which is where we
    started) or with poor utilisation (which also isn't great for small embedded systems).

    that left some form of allocation accounting, but we found that most programmers,
    for one reason or another, found the systems analysis needed to make allocation
    accounting work quite hard (although the degree of pessimism is typically much less
    than with quotas, which are too coarse-grained).

    i used a variant that reserved a given amount of memory for use by distinguished processes
    critical to system operation or recovery. (perhaps this protected memory structure should have nested,
    but it seemed better to see if that would really be useful.)

    systems analysis at this level is much easier, though still neglected.
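
    A minimal sketch, in Plan 9 C, of what note-based handling of allocation failure
    could look like. It assumes the kernel posted an "out of memory" note instead of
    returning nil from brk; no current kernel does this, and the note string is invented.

        #include <u.h>
        #include <libc.h>

        /* assumption: the kernel raises a (hypothetical) "out of memory" note
           on allocation failure instead of returning nil from brk */
        static int
        memnote(void *ureg, char *note)
        {
            USED(ureg);
            if(strcmp(note, "out of memory") != 0)
                return 0;       /* not ours; let other handlers see it */
            fprint(2, "out of memory; giving up cleanly\n");
            return 0;           /* unhandled: the default action kills the process,
                                   i.e. the sysfatal-style recovery; a supervising
                                   handler higher up could do better */
        }

        void
        main(void)
        {
            atnotify(memnote, 1);   /* install the handler */
            /* ... application work; any allocation might now raise the note ... */
            exits(nil);
        }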


  2. Re: [9fans] plan 9 overcommits memory?

    One option for Erik: try changing the segment allocator so that it
    faults in all segment pages on creation. Would this do what you want?
    I will try this if I get time later today. Assuming it is as simple as
    my simple-minded description makes it sound.

    If it would, maybe a simple
        echo faultall > /proc/pid/ctl
    would be useful.

    It would be interesting: iterate over all segments, and make sure each
    has a real page for every page in every segment.

    I can see the need for not overcommitting, and also for actually
    creating and filling out the pages on malloc or other allocation.
    Indeed, lack of OS thrashing due to paging is one feature cited by
    proponents of this:
    http://www.cs.sandia.gov/~smkelly/SA...ntDualCore.pdf

    ron
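
    Pending a kernel-side change, the effect can be approximated from user space by
    touching one byte per page of a fresh allocation, so an overcommit failure surfaces
    at a known point right after malloc rather than at some arbitrary later store.
    A minimal sketch, assuming 4K pages; mallocfault is an invented name, and this does
    not turn the fault into a nil return:

        #include <u.h>
        #include <libc.h>

        enum { Pagesize = 4096 };   /* assumption: 4K pages, as on the pc */

        /* touch every page of a fresh allocation so that any overcommit
           failure shows up here rather than at a later store */
        void*
        mallocfault(ulong n)
        {
            char *p, *q;

            p = malloc(n);
            if(p == nil)
                return nil;
            for(q = p; q < p+n; q += Pagesize)
                *q = 0;
            if(n > 0)
                p[n-1] = 0;     /* last page, if n is not page-aligned */
            return p;
        }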

  3. Re: [9fans] plan 9 overcommits memory?

    I was thinking that this was probably what we wanted to do for HPC....
    also, having the option of turning off zero-filling pages....

    -eric


    On 9/3/07, ron minnich wrote:
    > One option for Erik: try changing the segment allocator so that it
    > faults in all segment pages on creation. Would this do what you want?
    > I will try this if I get time later today. Assuming it is as simple as
    > my simple-minded description makes it sound.
    >
    > If it would, maybe a simple
    > echo faultall > /proc/pid/ctl
    > would be useful
    >
    > would be interesting: iterate over all segments, and make sure each
    > has a real page for all pages in all segments.
    >
    > I can see the need for not overcommitting, and also for actually
    > creating and filling out the pages on malloc or other allocation.
    > Indeed, lack of OS thrashing due to paging is one feature cited by
    > proponents of this:
    > http://www.cs.sandia.gov/~smkelly/SA...ntDualCore.pdf
    >
    > ron
    >


  4. Re: [9fans] plan 9 overcommits memory?

    > One option for Erik: try changing the segment allocator so that it
    > faults in all segment pages on creation. Would this do what you want?
    > I will try this if I get time later today. Assuming it is as simple as
    > my simple-minded description makes it sound.


    grudgingly, i admit it would -- assuming that malloc then returns 0 and
    doesn't send the proc a note.

    you've got me motivated to think about this some more.

    - erik

  5. Re: [9fans] plan 9 overcommits memory?

    On Mon Sep 3 16:48:59 EDT 2007, geoff@plan9.bell-labs.com wrote:
    > If your machines are regularly running out of VM, something is wrong
    > in your environment. I would argue that we'd be better off fixing
    > upas/fs to be less greedy with memory than contorting the system to
    > try to avoid overcommitting memory.


    well, yes. the problem is 400MB mailboxes. but i'll let you tell
    folk with mailboxes that large, that that's too large. ;-)

    it'd be nice to be able to use more than 3.75-pcispace GB of memory.

    but i don't see this as a "fix upasfs" problem. i see this as a general
    problem that upas/fs's huge memory usage highlights. this can happen
    to any process. suppose i start a program that allocates 8k but between
    the malloc and the memset, another program uses the last available
    page in memory, then my original program faults.

    > If one did change the system to
    > enforce a limit of 16MB for the aggregate of all system stacks, what
    > would happen when a process needed to grow its stack and the 16MB were
    > full? Checking malloc returns cannot suffice.


    no, it wouldn't. obviously one doesn't malloc the stack -- at least not today.
    but this is no worse than the current situation for stacks.
    and an improvement for the heap.

    if one made the limit settable at runtime, one could verify reasonable
    stack usage while testing.

    here i think ron's idea of pre-faulting makes even more sense for
    the stack than the heap, as stack allocation is implicit.

    - erik
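
    The race described above is easy to restate in code: with overcommitted memory the
    failure, if it comes, arrives at the memset as a page fault rather than at the malloc
    as a nil return. A sketch, using only standard libc calls:

        #include <u.h>
        #include <libc.h>

        void
        main(void)
        {
            char *p;

            p = malloc(8*1024);         /* may succeed even with no free page left */
            if(p == nil)
                sysfatal("malloc: %r"); /* the failure we know how to handle */

            /* if another process takes the last free page here ... */

            memset(p, 0, 8*1024);       /* ... the fault lands here instead,
                                           as a page fault, not a nil return */
            exits(nil);
        }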

  6. Re: [9fans] plan 9 overcommits memory?

    > to any process. suppose i start a program that allocates 8k but between
    > the malloc and the memset, another program uses the last available
    > page in memory, then my original program faults.


    yes, and you'll always have to deal with that in some form or another.
    i've started a program, it allocates some memory, is guaranteed to have it (unlike the current system),
    but later, some other program allocates enough memory that mine can't get
    any more, memory that mine needs to finish (perhaps during an output phase).
    my original program fails, even though the system guarantees physical memory
    for all virtual memory allocations.


  7. Re: [9fans] plan 9 overcommits memory?

    >> to any process. suppose i start a program that allocates 8k but between
    >> the malloc and the memset, another program uses the last available
    >> page in memory, then my original program faults.

    >
    > yes, and you'll always have to deal with that in some form or another.
    > i've started a program, it allocates some memory, is guaranteed to have it (unlike the current system),
    > but later, some other program allocates enough memory that mine can't get
    > any more, memory that mine needs to finish (perhaps during an output phase).
    > my original program fails, even though the system guarantees physical memory
    > for all virtual memory allocations.


    that would be perfect.

    perhaps i've been unclear. i don't have any problem dealing with failed
    alloc. malloc has always been able to return 0.

    dealing with a page fault due to overcommit is a different story.

    - erik


  8. Re: [9fans] plan 9 overcommits memory?

    > perhaps i've been unclear. i don't have any problem dealing with failed
    > alloc. malloc has always been able to return 0.
    >
    > dealing with a page fault due to overcommit is a different story.


    that's a slightly different aspect. the note should not be "page fault" but
    "out of memory" (or some such thing). that's much better than a nil return.
    most errors on shared resources are better expressed as exceptions (notes),
    because that's what they are: they are a failure of the underlying physical or virtual machine
    to handle an exceptional case. the code shouldn't have to deal with it explicitly everywhere,
    except in C to detect and propagate the exception to code that knows what's going on.

    exceptions have acquired a bad name in some circles because of the way that some
    people tried to use them for situations, usually in interfaces, that are hardly exceptional (eg, Ada and Java).
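
    The "detect and propagate to code that knows what's going on" pattern has a standard
    shape in Plan 9 C, using notify and notejmp to get from a note handler back to a
    recovery point. A sketch only; the "out of memory" note is again an assumption, not
    something the kernel currently posts:

        #include <u.h>
        #include <libc.h>

        jmp_buf recovery;

        static void
        handler(void *ureg, char *note)
        {
            if(strcmp(note, "out of memory") == 0)  /* invented note string */
                notejmp(ureg, recovery, 1);         /* jump to the recovery point */
            noted(NDFLT);                           /* anything else: default action */
        }

        void
        main(void)
        {
            notify(handler);
            if(setjmp(recovery)){
                /* code that knows what's going on: shed caches,
                   shrink buffers, or exit cleanly */
                exits("no memory");
            }
            /* ... work that may allocate ... */
            exits(nil);
        }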


  9. Re: [9fans] plan 9 overcommits memory?

    > No, and it would be hard to do it because you'd need ways to compact
    > fragmented memory after a lot of mallocs and frees. And then, you'd
    > need a way to fix the pointers after compacting.


    Is it all localised, or is the code scattered across multiple kernel
    modules? Many years ago I put a lot of effort into a scheme for
    automatic compacting of memory that worked pretty efficiently on an
    8088 without being particularly architecture dependent. When I tried
    to implement it on NetBSD, it turned out that I did not understand
    NetBSD particularly well and that was the last of my attempts.

    The code has long been lost, but I remember the strategy very well, so
    I could try again.

    ++L


  10. Re: [9fans] plan 9 overcommits memory?

    > On the pc, Plan 9 currently limits user-mode stacks to 16MB.
    > On a CPU server with 200 processes (fairly typical), that's
    > 3.2GB of VM one would have to commit just for stacks. With
    > 2,000 processes, that would rise to 32GB just for stacks.


    There's probably no simple answer which is correct for all goal
    sets.

    For an embedded widget, you might want to create a small number
    of processes and be utterly sure none of them would run out of
    RAM (which might be small). If you think your stuff fits in
    small stacks you'd probably like to know as early as possible
    if it doesn't, so the kernel "helpfully" giving you 16-meg
    stacks might not be so helpful.

    For a web server you probably want some number of parallel
    requests to run to completion and excess requests to be queued
    and/or retried by the (remote) browser. Overcommitting seems
    likely to be harmful here, since each process which dies when
    it tries to grow a stack page won't complete, and may return
    a nonsense answer to the client. It seems like you could thrash,
    with most processes running for a while before getting killed.

    Overcommitted 16-meg stacks are probably fine for lightly-loaded
    CPU servers running random mixes of programs... but I suspect
    other policies would also be fine for this case.

    Personally I am not a fan of programs dying randomly because of
    the behavior of other programs. So I guess my vote would be for
    a small committed stack (a handful of pages) with an easy way for
    processes with special needs to request a (committed) larger size.

    But I'd probably prefer an OHCI USB driver first :-)

    Dave Eckhardt

  11. Re: [9fans] plan 9 overcommits memory?

    wrote in message
    news:01b719eaabe004a9073ccb4b3425e1d0@plan9.bell-labs.com...
    > ... The big exception is stack growth. ...


    That has indeed been a longstanding problem, and if OSes want to grow up
    they need to solve it. It is obvious how to solve it if speed isn't an
    issue: just test upon each function invocation. I bet there are efficient
    hacks for that.
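
    One crude version of the per-call test: record a floor address for the stack and
    compare it against the address of a local at function entry. A sketch only, assuming
    a downward-growing stack and the 16MB pc limit mentioned elsewhere in this thread;
    STACKCHECK, stackfloor and Margin are invented names, and a compiler-emitted probe
    would be cheaper:

        #include <u.h>
        #include <libc.h>

        enum { Margin = 8*1024 };       /* invented safety margin */

        static char *stackfloor;        /* lowest address the stack may reach */

        /* call at entry to any deeply recursive function */
        #define STACKCHECK() \
            do{ char here; if(&here < stackfloor) sysfatal("stack overflow"); }while(0)

        static int
        fib(int n)
        {
            STACKCHECK();
            if(n < 2)
                return n;
            return fib(n-1) + fib(n-2);
        }

        void
        main(void)
        {
            char base;

            /* assume 16MB of stack below &base, as on the pc */
            stackfloor = (char*)((uintptr)&base - 16*1024*1024 + Margin);
            print("%d\n", fib(30));
            exits(nil);
        }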

  12. Re: [9fans] plan 9 overcommits memory?

    wrote...
    > there's no conceivable reason anyone would want swap, and operating
    > systems with working swap suck


    Actually there have been many successful OSes with swapping/demand paging.

    A way to make it work is for process initiation to include resource
    allocation, especially memory limits. Or, you could implement the
    "working set" concept, according to which a process swaps only
    against itself, and the maximum working-set RAM is either guaranteed or
    the attempt to execute the process reports failure at the outset.

  13. Re: [9fans] plan 9 overcommits memory?

    > that's a slightly different aspect. the note should not be "page fault" but
    > "out of memory" (or some such thing). that's much better than a nil return.
    > most errors on shared resources are better expressed as exceptions (notes),
    > because that's what they are: they are a failure of the underlying physical or virtual machine
    > to handle an exceptional case. the code shouldn't have to deal with it explicitly everywhere,
    > except in C to detect and propagate the exception to code that knows what's going on.


    if one wishes to be remotely standards-compliant, sending a note on allocation
    failure is not an option. k&r 2nd ed. p. 252.

    - erik

  14. Re: [9fans] plan 9 overcommits memory?

    > if one wishes to be remotely standards-compliant, sending a note on allocation
    > failure is not an option. k&r 2nd ed. p. 252.


    i was discussing something about it in practice, and not in a 1970's environment,
    where the approach didn't really work well even then. the `recovery' that resulted was almost
    invariably equivalent to sysfatal.


  15. Re: [9fans] plan 9 overcommits memory?

    Two cases so far: running out of stack and allocating overcommited memory.

    You can easily catch stack growth failure in the OS. The proc gets a
    note. The proc has the
    option of the equivalent of 'echo growstack xyz > /proc/me/ctl'.

    For overcommits, 'echo faultall > /proc/me/ctl'.

    can we catch everything this way?

    ron
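
    As a usage sketch, both proposed messages would just be writes to the proc ctl file.
    Neither "growstack" nor "faultall" exists in the current kernel; they are the proposals
    from this thread:

        #include <u.h>
        #include <libc.h>

        /* write a (hypothetical) control message to our own ctl file */
        static void
        procctl(char *msg)
        {
            char path[64];
            int fd;
            long n;

            snprint(path, sizeof path, "/proc/%d/ctl", getpid());
            fd = open(path, OWRITE);
            if(fd < 0)
                sysfatal("open %s: %r", path);
            n = strlen(msg);
            if(write(fd, msg, n) != n)
                sysfatal("ctl write: %r");
            close(fd);
        }

        void
        main(void)
        {
            procctl("faultall");            /* commit pages in all segments */
            procctl("growstack 1048576");   /* reserve, say, 1MB of stack */
            exits(nil);
        }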

  16. Re: [9fans] plan 9 overcommits memory?

    i can't remember whether anyone pointed this out as well (probably):
    you'll need to ensure that each fork reserves as many physical pages as are currently
    shared in the data space, for the life of the shared data,
    so that every subsequent copy-on-write is guaranteed to succeed.
    this will prevent some large processes from forking to exec much smaller images.


  17. Re: [9fans] plan 9 overcommits memory?

    > > if one wishes to be remotely standards-compliant, sending a note on allocation
    > > failure is not an option. k&r 2nd ed. p. 252.

    >
    > i was discussing something about it in practice, and not in a 1970's environment,
    > where the approach didn't really work well even then. the `recovery' that resulted was almost
    > invariably equivalent to sysfatal.


    sysfatal is a reasonable recovery strategy for many programs. for many others,
    there may be something useful to do, like allocating smaller or fewer buffers.

    - erik
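
    A sketch of the second strategy, halving the request until the allocator cooperates;
    allocsome is an invented name, not library code:

        #include <u.h>
        #include <libc.h>

        /* try for want bytes, fall back by halving down to least (least >= 1);
           report the size actually obtained through *got */
        static void*
        allocsome(ulong want, ulong least, ulong *got)
        {
            void *p;

            for(; want >= least; want /= 2){
                p = malloc(want);
                if(p != nil){
                    *got = want;
                    return p;
                }
            }
            return nil;     /* even the minimum failed; the caller decides */
        }

    A caller might ask for a 4MB i/o buffer but settle for 64KB, and only fall back to
    sysfatal if even that fails.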

  18. Re: [9fans] plan 9 overcommits memory?

    On 9/4/07, erik quanstrom wrote:
    > > Also, [swap is] broken, broken, broken on Plan 9

    >
    > but could you describe what antisocial behavior it exhibits and how one
    > could reproduce this behavior?


    My cpu/auth/file server is a poor little headless P100 with 24MB RAM
    (there's 32 in there but apparently one of the sticks is faulty). I
    have a 192MB swap partition set up; man -P hoses the box (gs was the
    biggest memory user IIRC). Actually, hoses is a bit misleading...
    I hear the box reading the disk for a time, then my drawterm locks
    up, then it carries on with the disk activity, changes sound slightly
    (guess it's into swap), and finally goes silent. drawterm is still
    locked up, it's like something swap related is deadlocked or somesuch.
    Now, I'm sure that in the past, if I ^T^Tr drawterm at this point,
    some time later the box recovers in a flourish of disk activity and I
    can reconnect. But apparently this is not guaranteed, as last night
    when I tried it to get accurate timings it really was hosed, and still
    dead when I woke up just now. I dragged a monitor over to it but my ps
    hung, so I guess fossil or something else important bit it.
    Unfortunately I forgot about ^T^Tp until just now.

    So yeah, I've probably got a decent test bed for swapping.
    -sqweek

  19. Re: [9fans] plan 9 overcommits memory?

    Charles Forsyth wrote:
    > you'll need to ensure that each fork reserves as many physical pages as are currently
    > shared in the data space, for the life of the shared data,
    > so that every subsequent copy-on-write is guaranteed to succeed.
    > this will prevent some large processes from forking to exec much smaller images.


    That's why many OSes have a "spawn" primitive that combines fork-and-exec.

  20. Re: [9fans] plan 9 overcommits memory?

    > That's why many OSes have a "spawn" primitive that combines fork-and-exec.

    the problem with spawn is that it requires a mechanism to replace that little
    block of code between the fork and the exec. that code is hardly ever the
    same, so spawn keeps growing arguments.

    - erik
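
    For concreteness, the kind of block spawn would have to subsume might look like the
    child half of this sketch (runls and its particular redirections are invented for
    illustration):

        #include <u.h>
        #include <libc.h>

        /* run ls in dir with output redirected to outfile: the chdir, create
           and dup between rfork and exec are exactly the code that varies
           from caller to caller */
        static void
        runls(char *dir, char *outfile)
        {
            static char *argv[] = { "ls", nil };
            int fd;

            switch(rfork(RFPROC|RFFDG)){
            case -1:
                sysfatal("rfork: %r");
            case 0:
                if(chdir(dir) < 0)
                    sysfatal("chdir: %r");
                fd = create(outfile, OWRITE, 0666);
                if(fd < 0)
                    sysfatal("create: %r");
                dup(fd, 1);
                close(fd);
                exec("/bin/ls", argv);
                sysfatal("exec: %r");
            default:
                waitpid();
                break;
            }
        }

        void
        main(void)
        {
            runls("/tmp", "/tmp/listing");
            exits(nil);
        }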

