[9fans] odd out-of-memory behavior - Plan9


  1. [9fans] odd out-of-memory behavior

    i'm seeing some out-of-memory behavior i don't quite understand.
    there is no swap configured. the machine is a cpuserver.
    the symptom is this message is repeated on the console
    maybe 20x. fs in this case is upas/fs. (the standard one.)

    311954: fs killed: out of memory
    out of physical memory; no swap configured
    311954: fs killed: out of memory
    out of physical memory; no swap configured
    311954: fs killed: out of memory

    there's a sleep of 5 seconds after killbig
    is called. so, though it's hard to imagine,
    it must be taking 100s to clean up this process.

    i'm not sure i have any ideas on how this could
    happen.

    - erik



  2. Re: [9fans] odd out-of-memory behavior

    > 311954: fs killed: out of memory
    > out of physical memory; no swap configured
    > 311954: fs killed: out of memory
    > out of physical memory; no swap configured
    > 311954: fs killed: out of memory
    >
    > there's a sleep of 5 seconds after killbig
    > is called. so, though it's hard to imagine,
    > it must be taking 100s to clean up this process.


    /sys/src/9/port/proc.c:/^killbig marks the process
    to be killed, but if it can't acquire the lock on that
    process's segments, the memory is not actually
    freed immediately:

    kp->procctl = Proc_exitbig;
    for(i = 0; i < NSEG; i++) {
        s = kp->seg[i];
        if(s != 0 && canqlock(&s->lk)) {
            mfreeseg(s, s->base, (s->top - s->base)/BY2PG);
            qunlock(&s->lk);
        }
    }

    Perhaps another upas/fs proc sharing the same
    segment is holding the segment lock and
    blocking on something else.
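
    The effect of the try-lock can be sketched in user space. This is
    not the kernel code: Seg, trylock, unlock, freesegs and demo are
    hypothetical names, a C11 atomic_flag stands in for the segment
    qlock, page counts replace real pages, and NSEG = 4 is made up.
    It only illustrates that a segment whose lock is held elsewhere
    is skipped, so its memory stays in use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum { NSEG = 4 };	/* hypothetical value, not the kernel's */

/* Hypothetical stand-in for a kernel segment: a lock and a page count. */
typedef struct Seg Seg;
struct Seg {
	atomic_flag lk;		/* set while somebody holds the segment */
	size_t npages;		/* pages still resident */
};

/* Try-lock in the spirit of canqlock: never blocks, reports success. */
bool
trylock(Seg *s)
{
	return !atomic_flag_test_and_set(&s->lk);
}

void
unlock(Seg *s)
{
	atomic_flag_clear(&s->lk);
}

/*
 * Same shape as the loop quoted above: free each segment whose lock
 * can be taken at once, silently skip the rest.
 * Returns the number of pages freed on this pass.
 */
size_t
freesegs(Seg *seg[NSEG])
{
	size_t freed = 0;

	for(int i = 0; i < NSEG; i++){
		Seg *s = seg[i];
		if(s != NULL && trylock(s)){
			freed += s->npages;
			s->npages = 0;
			unlock(s);
		}
	}
	return freed;
}

/* One pass while "another proc" holds the big segment. */
size_t
demo(void)
{
	Seg text = { ATOMIC_FLAG_INIT, 10 };
	Seg bss = { ATOMIC_FLAG_INIT, 1000 };
	Seg *seg[NSEG] = { &text, NULL, &bss, NULL };
	bool got = trylock(&bss);	/* simulate the other holder */
	size_t first;

	assert(got);
	first = freesegs(seg);		/* frees text only; bss is skipped */
	unlock(&bss);
	return first;
}
```

    In this sketch demo() frees only the 10 small pages; the 1000
    pages behind the held lock stay put until a later pass finds the
    lock free.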

    If you can make it happen again, you could try
    to run

    acid -k -l kernel 1 /386/9pccpu # or your kernel image
    stacks()

    though of course without any memory it's going to
    be hard to start acid. You might be able to pull it off
    if you cpu somewhere else, bind /mnt/term/proc /proc,
    and then start acid there before you run the machine
    out of memory. As long as the exportfs serving /mnt/term
    doesn't need any new memory pages, it should be able
    to serve /proc well enough to the remote acid.

    Russ



  3. Re: [9fans] odd out-of-memory behavior

    unfortunately or fortunately this is a rare problem.
    hopefully the caching upas will mature faster than
    our mailboxes grow.

    thanks for re-pointing out the acid tricks. i shall
    lay a trap. but in the interest of covering the careful
    thought bit ...

    > /sys/src/9/port/proc.c:/^killbig marks the process
    > to be killed, but if it can't acquire the lock on that
    > process's segments, the memory is not actually
    > freed immediately:
    >
    > kp->procctl = Proc_exitbig;
    > for(i = 0; i < NSEG; i++) {
    >     s = kp->seg[i];
    >     if(s != 0 && canqlock(&s->lk)) {
    >         mfreeseg(s, s->base, (s->top - s->base)/BY2PG);
    >         qunlock(&s->lk);
    >     }
    > }
    >
    > Perhaps another upas/fs proc sharing the same
    > segment is holding the segment lock and
    > blocking on something else.


    how would that happen? upas/fs -p doesn't fork.
    (it's being run from imap4d.)

    is there some other reason that segments would
    be shared?

    i originally thought someone else might be sitting
    on the shared segments, but i couldn't explain how
    that might be happening. i also thought the purpose
    of this loop was to hunt down relatives sharing memory
    with killbig's vic:

    for(p = procalloc.arena; p < ep; p++) {
        if(p->state == Dead || p->kp)
            continue;
        if(p != kp && p->seg[BSEG] && p->seg[BSEG] == kp->seg[BSEG])
            p->procctl = Proc_exitbig;
    }

    so much for the "careful" thought. what am i missing?

    - erik



  4. Re: [9fans] odd out-of-memory behavior

    > how would that happen? upas/fs -p doesn't fork.
    > (it's being run from imap4d.)


    maybe more than one process isn't involved.
    that would make your job easier. ☺

    > i originally thought someone else might be sitting
    > on the shared segments, but i couldn't explain how
    > that might be happening. i also thought the purpose
    > of this loop was to hunt down relatives sharing memory
    > with killbig's vic:
    >
    > for(p = procalloc.arena; p < ep; p++) {
    >     if(p->state == Dead || p->kp)
    >         continue;
    >     if(p != kp && p->seg[BSEG] && p->seg[BSEG] == kp->seg[BSEG])
    >         p->procctl = Proc_exitbig;
    > }
    >
    > so much for the "careful" thought. what am i missing?


    that loop identifies and marks them, but it doesn't kill them.
    they won't die until the next time they attempt to cross
    the kernel-user boundary.
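
    The gap between marking and dying can be sketched as a toy model.
    This is not the kernel code: the Proc struct here, its bseg,
    blocked and dead fields, and the markrelatives and tick functions
    are all hypothetical, and the check for kernel procs is left out.
    Only the names Proc_exitbig and procctl are taken from the loop
    quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { Proc_nothing, Proc_exitbig };	/* values are arbitrary here */

/* Hypothetical, much reduced process record. */
typedef struct Proc Proc;
struct Proc {
	int procctl;		/* pending control request */
	const void *bseg;	/* identity of the bss segment, NULL if none */
	bool blocked;		/* asleep in the kernel, not returning to user */
	bool dead;
};

/*
 * Mark every live proc that shares kp's bss segment.
 * Nothing dies here; the request is only recorded.
 */
int
markrelatives(Proc *procs, size_t n, Proc *kp)
{
	int marked = 0;

	for(size_t i = 0; i < n; i++){
		Proc *p = &procs[i];
		if(p->dead)
			continue;
		if(p != kp && p->bseg != NULL && p->bseg == kp->bseg){
			p->procctl = Proc_exitbig;
			marked++;
		}
	}
	return marked;
}

/*
 * One round in which every proc that is not blocked crosses the
 * kernel-user boundary and acts on its procctl.
 * Returns how many procs died in this round.
 */
int
tick(Proc *procs, size_t n)
{
	int died = 0;

	for(size_t i = 0; i < n; i++){
		Proc *p = &procs[i];
		if(p->dead || p->blocked)
			continue;
		if(p->procctl == Proc_exitbig){
			p->dead = true;
			died++;
		}
	}
	return died;
}
```

    In this model a marked proc that stays blocked survives every
    tick, which is one way a victim could linger after being marked.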

    i also wonder if perhaps there is some way that you
    can manage to end up sleeping for the pager in fixfault
    while holding the lock of the big segment. i don't see
    one immediately, but that doesn't mean it's not there.

    if you can get acid running, 100 seconds should be plenty
    of time to get stack traces that would solve this.

    russ



  5. Re: [9fans] odd out-of-memory behavior

    > If you can make it happen again, you could try
    > to run
    >
    > acid -k -l kernel 1 /386/9pccpu # or your kernel image
    > stacks()
    >


    it's not immediately obvious what i am doing wrong:

    akin# acid -k -l kernel 1 /386/9pccpu
    /386/9pccpu:386 plan 9 boot image
    /sys/lib/acid/port
    /sys/lib/acid/386
    /sys/lib/acid/kernel
    acid: include("acid")
    acid: include("procacid")
    acid: stacks()
    =========================================================
    0xf0312008 1: init dennis pc 0x00008984 Await (Wakeme) ut 2 st 2 qpc 0x00000000
    :5: (error) no stack frame: can't translate address 0xf001bf30

    - erik


  6. Re: [9fans] odd out-of-memory behavior

    > it's not immediately obvious what i am doing wrong:
    >
    > akin# acid -k -l kernel 1 /386/9pccpu
    > /386/9pccpu:386 plan 9 boot image
    > /sys/lib/acid/port
    > /sys/lib/acid/386
    > /sys/lib/acid/kernel
    > acid: include("acid")
    > acid: include("procacid")
    > acid: stacks()
    > =========================================================
    > 0xf0312008 1: init dennis pc 0x00008984 Await (Wakeme) ut 2 st 2 qpc 0x00000000
    > :5: (error) no stack frame: can't translate address 0xf001bf30


    i forgot to say you should

    mappc()

    first. the pc kernel maps some extra data memory
    below the text segment, which isn't accounted for
    in the default acid map.

    russ



  7. Re: [9fans] odd out-of-memory behavior

    Apologies for the double-post. Now that I look,
    the function to do the mapping is now called kinit.

    Russ


