Re: x86: 4kstacks default - Kernel


Thread: Re: x86: 4kstacks default

  1. Re: x86: 4kstacks default

    On Sun, Apr 20, 2008 at 09:09:37AM -0500, Eric Sandeen wrote:
    > Mark Lord wrote:
    > > Willy Tarreau wrote:
    > >> What would really help would be to have 8k stacks with the lower page
    > >> causing a fault and print a stack trace upon first access. That way,
    > >> the safe setting would still report us useful information without
    > >> putting users into trouble.

    > > ..
    > >
    > > That's the best suggestion from this thread, by far!
    > > Can you produce a patch for 2.6.26 for this?
    > > Or perhaps someone else here, with the right code familiarity, could?
    > >
    > > Some sort of CONFIG option would likely be wanted to
    > > either enable/disable this feature, of course.

    >
    > Changing the default warning threshold is easy, it's just a #define.


    I thought it was checked only at a few places (e.g. during IRQs). If so,
    maybe it can miss some call chains?

    > Although setting it too low would spam syslogs on some setups.


    We should set it slightly below the 4k limit if we want users to switch
    to 4k.

    > When I was trying to cram stuff into 4k in the past, I had a patch which
    > added a sysctl to dynamically change the warning threshold, and
    > optionally BUG() when I hit it for crash analysis. It was good for
    > debugging, at least. If something along those lines is desired, I could
    > resurrect it.


    While it's good for debugging, having users tweak the limit to eliminate
    the warning is the opposite of what we're looking for. We just want to
    have them report the warning without their service being disrupted.
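
    For reference, the kind of check being discussed can be modeled in
    ordinary userspace C. This is an illustrative sketch under assumed
    constants, not the kernel's actual code: it mirrors the idea of a
    THREAD_SIZE-aligned stack whose free space is computed by masking the
    stack pointer, with a #define'd warning threshold (STACK_WARN and
    check_stack are stand-in names here). Because such a check only runs at
    a few points (e.g. IRQ entry), call chains between checks can indeed be
    missed.

    ```c
    #include <assert.h>
    #include <stdio.h>

    #define THREAD_SIZE 8192              /* 8k stack: one order-1 allocation */
    #define STACK_WARN  (THREAD_SIZE / 8) /* warn when less than 1k remains */

    /* Stacks are THREAD_SIZE-aligned, so masking the stack pointer with
     * (THREAD_SIZE - 1) yields its offset from the stack base, i.e. how
     * many bytes of stack remain free (stacks grow down toward the base). */
    static long stack_left(unsigned long sp)
    {
        return (long)(sp & (THREAD_SIZE - 1));
    }

    /* Returns 1 and prints a warning when the remaining stack is below
     * the threshold; a real kernel would run this at a few fixed points
     * such as IRQ entry, which is why deep chains between checks can
     * slip through. */
    static int check_stack(unsigned long sp)
    {
        long left = stack_left(sp);

        if (left < STACK_WARN) {
            printf("stack overflow warning: only %ld bytes left\n", left);
            return 1;
        }
        return 0;
    }

    int main(void)
    {
        unsigned long base = 0x100000;         /* pretend 8k-aligned stack base */

        assert(check_stack(base + 4096) == 0); /* half the stack still free */
        assert(check_stack(base + 512) == 1);  /* under the warning threshold */
        return 0;
    }
    ```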

    Willy

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: x86: 4kstacks default

    Willy Tarreau wrote:
    > On Sun, Apr 20, 2008 at 09:27:32AM -0400, Mark Lord wrote:
    >> Willy Tarreau wrote:
    >>> What would really help would be to have 8k stacks with the lower page
    >>> causing a fault and print a stack trace upon first access. That way,
    >>> the safe setting would still report us useful information without
    >>> putting users into trouble.

    >> ..
    >>
    >> That's the best suggestion from this thread, by far!


    Only if you believe that 4K stack pages are a worthy goal.
    As far as I can figure out they are not. They might have been
    a worthy goal on crappy 2.4 VMs, but these times are long gone.

    The "saving memory on embedded" argument also does not quite convince
    me; it is unclear whether that is really a significant amount of memory
    on these systems and whether it couldn't be addressed better (e.g. by
    generally running fewer kernel threads). I don't have numbers on this,
    but then the people who made this argument didn't have any either.

    If anybody has concrete statistics on this
    (including other kernel memory users in realistic situations)
    please feel free to post them.


    >> Can you produce a patch for 2.6.26 for this?

    >
    > Unfortunately, I can't. I wouldn't know where to start from.


    The problem with his suggestion is that the lower 4K of the stack pages
    are accessed in normal operation too, because it contains the
    thread_struct. That could be changed, but it would be a relatively
    large change, because you would need to audit/change a lot of code that
    assumes thread_struct and stack are contiguous.

    If that were changed, implementing Willy's suggestion would not be that
    difficult using cpa(), at the cost of some general slowdown from
    increased TLB misses, much higher thread creation/teardown cost, etc.
    Using the alternative vmalloc way has other issues too.
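
    To make the layout issue concrete, here is a small userspace model (the
    struct and names below are simplified stand-ins I invented, not the
    kernel's real definitions): the per-thread data lives at the lowest
    addresses of the THREAD_SIZE-aligned stack allocation and is located by
    masking the stack pointer, so the bottom page is touched constantly in
    normal operation and cannot simply be made non-present.

    ```c
    #include <assert.h>
    #include <stdio.h>

    #define THREAD_SIZE 8192

    /* Simplified stand-in for the per-thread data kept at the stack base. */
    struct thread_info_model {
        int cpu;
        unsigned long flags;
    };

    /* Model of the current-thread lookup: round the stack pointer down to
     * the start of the THREAD_SIZE-aligned allocation -- i.e. into the
     * lowest page of the stack. */
    static struct thread_info_model *ti_from_sp(unsigned long sp)
    {
        return (struct thread_info_model *)(sp & ~(unsigned long)(THREAD_SIZE - 1));
    }

    int main(void)
    {
        /* Carve a THREAD_SIZE-aligned "stack" out of a static buffer. */
        static unsigned char buf[2 * THREAD_SIZE];
        unsigned long base = ((unsigned long)buf + THREAD_SIZE - 1)
                             & ~(unsigned long)(THREAD_SIZE - 1);

        /* Wherever the stack pointer sits in the allocation, the lookup
         * lands on the bottom page... */
        unsigned long sp = base + THREAD_SIZE - 64; /* near the top */
        assert((unsigned long)ti_from_sp(sp) == base);

        /* ...so marking that page not-present would fault on every lookup,
         * which is why the guard-page idea first requires moving this data
         * out of the stack allocation. */
        printf("thread info at %#lx, stack spans up to %#lx\n",
               base, base + THREAD_SIZE);
        return 0;
    }
    ```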

    But still, the fundamental problem is that it would likely only hit the
    interesting cases in real production setups, and I don't think the
    production users would be very happy to slow down their kernels and
    handle strange backtraces just to act as guinea pigs for something
    dubious.

    -Andi


  3. Re: x86: 4kstacks default

    On Sun, Apr 20, 2008 at 09:05:40AM -0500, Eric Sandeen wrote:
    > Adrian Bunk wrote:
    >
    > > But the more users will get 4k stacks the more testing we have, and the
    > > better both existing and new bugs get shaken out.
    > >
    > > And if there were only 4k stacks in the vanilla kernel, and therefore
    > > all people on i386 testing -rc kernels would get it, that would give a
    > > better chance of finding stack regressions before they get into a
    > > stable kernel.

    >
    > Heck, maybe you should make it 2k by default in all -rc kernels; that
    > way when people run -final with the 4k it'll be 100% bulletproof, right?
    > 'cause all those piggy drivers that blow a 2k stack will finally have
    > to get fixed?


    I'm arguing for aiming at having all 32bit architectures with 4k page
    size using the same stack size. Not for having -rc kernels differ from
    release kernels.

    > Or leave it at 2k and find a way to share pages for
    > stacks, think how much memory you could save and how many java threads
    > you could run!


    The only architecture that already defaults to 4k stacks is m68knommu,
    and I doubt they do it for many java threads...

    >...
    > -Eric
    >...


    cu
    Adrian

    --

    "Is there not promise of rain?" Ling Tan asked suddenly out
    of the darkness. There had been need of rain for many days.
    "Only a promise," Lao Er said.
    Pearl S. Buck - Dragon Seed


  4. Re: x86: 4kstacks default

    Willy Tarreau wrote:
    > On Sun, Apr 20, 2008 at 09:09:37AM -0500, Eric Sandeen wrote:
    >> Mark Lord wrote:
    >>> Willy Tarreau wrote:
    >>>> What would really help would be to have 8k stacks with the lower page
    >>>> causing a fault and print a stack trace upon first access. That way,
    >>>> the safe setting would still report us useful information without
    >>>> putting users into trouble.
    >>> ..
    >>>
    >>> That's the best suggestion from this thread, by far!
    >>> Can you produce a patch for 2.6.26 for this?
    >>> Or perhaps someone else here, with the right code familiarity, could?
    >>>
    >>> Some sort of CONFIG option would likely be wanted to
    >>> either enable/disable this feature, of course.

    >> Changing the default warning threshold is easy, it's just a #define.

    >
    > I thought it was checked only at a few places (eg: during irqs). If so,
    > maybe it can miss some call chains ?


    Ah, OK, I skimmed your first suggestion too quickly. Getting 100%
    coverage reports on the initial access to the 2nd 4k that way would be
    nice. Well, it would be nice if we all really wanted 4k stacks some day...

    -Eric

  5. Re: x86: 4kstacks default

    Adrian Bunk wrote:
    > On Sun, Apr 20, 2008 at 09:05:40AM -0500, Eric Sandeen wrote:
    >> Adrian Bunk wrote:
    >>
    >>> But the more users will get 4k stacks the more testing we have, and the
    >>> better both existing and new bugs get shaken out.
    >>>
    >>> And if there were only 4k stacks in the vanilla kernel, and therefore
    >>> all people on i386 testing -rc kernels would get it, that would give a
    >>> better chance of finding stack regressions before they get into a
    >>> stable kernel.

    >> Heck, maybe you should make it 2k by default in all -rc kernels; that
    >> way when people run -final with the 4k it'll be 100% bulletproof, right?
    >> 'cause all those piggy drivers that blow a 2k stack will finally have
    >> to get fixed?

    >
    > I'm arguing for aiming at having all 32bit architectures with 4k page
    > size using the same stack size. Not for having -rc kernels differ from
    > release kernels.


    Oh, I know. I'm just saying that 4k seems chosen out of convenience for
    memory management, without any real correlation to what you might
    actually need to run a thread. They do happen to be roughly equivalent
    for many cases, but not all. Setting a default which is not safe for
    several common use cases does not seem wise...

    I guess what I'm saying is, I don't agree that any callchain which needs
    more than 4k of stack indicates brokenness that must be fixed, as
    various posts in this thread seem to suggest.

    Sure, 1k char buffers on the stack and massive structs and unlimited
    recursion we can agree on as things to fix, but complex/deep/stacked
    callchains which don't fit in 4k are much more of a grey area.

    -Eric

  6. Re: x86: 4kstacks default

    On Sun, 20 Apr 2008 09:05:40 -0500
    Eric Sandeen wrote:

    >
    > 4K just happens to be the page size; other than that it's really just
    > some random/magic number picked, and now dictated that if you (and
    > everything around you) don't fit, you're broken.


    It wasn't randomly picked; it was based on 2.4 kernels (where we had
    8KB, but that was roughly 2.5KB or so for the task struct, which was on
    the stack back then, then 4KB for user context and 2KB for IRQ context).

    >
    > That bugs me.
    >


    Yes. Adrian is way off in the weeds on this one. Nobody but him is
    suggesting removing 8KB stacks. I think everyone else agrees that having
    both options is valuable, and there are better ways to find and fix
    stack bloat than removing this config option.


    --
    If you want to reach me at my work email, use arjan@linux.intel.com
    For development, discussion and tips for power savings,
    visit http://www.lesswatts.org

  7. Re: x86: 4kstacks default

    On Sunday 20 April 2008 08:27:14 Andi Kleen wrote:
    > Adrian Bunk writes:
    > > 6k is known to work, and there aren't many problems known with 4k.
    > >
    > > And from a QA point of view the only way of getting 4k thoroughly tested

    >
    > But you have to first ask why do you want 4k tested? Does it serve
    > any useful purpose in itself? I don't think so. Or you're saying
    > it's important to support 50k kernel threads on 32bit kernels?
    >
    > -Andi


    Andi, you're the only one I've seen seriously pounding the "50k threads"
    thing - I don't think anyone is really fooled by the straw-man, so I'd
    suggest you drop it.

    The real issue is that you think (and are correct in thinking) that people
    are idiots. Yes, there will be breakages if the default is changed to 4k
    stacks - but if people are running new kernels on boxes that'll hit stack
    use problems (that *AREN'T* related to ndiswrapper) and haven't made sure
    that they've configured the kernel properly, then they deserve the
    outcome. It isn't the job of the Linux kernel to protect the incompetent -
    nor is it the job of Linux kernel developers to do so.

    If people are doing a "zcat /proc/config.gz > .config && make oldconfig"
    (or similar) the problem shouldn't even appear, really. They'll get
    whatever setting was in their old config for the stack size. And until
    the problems with deep-stack setups - like nfs+xfs+raid - get resolved,
    I'd think that the option to configure the stack size would remain.

    Since the second-most-common reason for stack overages is ndiswrapper...
    Well, with there being so much more hardware now supported directly by
    the Linux kernel, I'm stunned every time someone tells me "I can't run
    Linux on my laptop; there is hardware that isn't supported without me
    having to get ndiswrapper". The last time someone said that to me I
    pointed out that their hardware is supported by the latest kernel and
    even offered to build & install it for them.

    DRH

    --
    Dialup is like pissing through a pipette. Slow and excruciatingly painful.

  8. Re: x86: 4kstacks default

    On Sun, Apr 20, 2008 at 08:41:27AM -0700, Arjan van de Ven wrote:
    >...
    > yes. Adrian is waay off in the weeds on this one. Nobody but him is suggesting to remove
    > 8Kb stacks. I think everyone else agrees that having both options is valuable; and there
    > are better ways to find+fix stack bloat than removing this config option.


    I'm not arguing for removing the option immediately, but long-term we
    shouldn't need it.

    This comes from my experience with removing obsolete drivers for
    hardware for which a more recent driver also exists:
    As long as there is some workaround (e.g. using an older driver or
    8k stacks), the workaround will be used instead of us getting proper
    bug reports and fixes.

    As far as I know all problems that are known with 4k stacks are some
    nested things with XFS in the trace.

    If this class of issues were fixed one day, why would it be valuable to
    also offer 8k stacks long-term? Especially weighed against the fact that
    with only 4k stacks we will have more people running into stack problems
    in -rc kernels if any new ones pop up, resulting in more such problems
    getting fixed during -rc.

    cu
    Adrian


  9. Re: x86: 4kstacks default

    Jörn Engel wrote:
    > On Sun, 20 April 2008 16:19:29 +0200, Andi Kleen wrote:
    >> Only if you believe that 4K stack pages are a worthy goal.
    >> As far as I can figure out they are not. They might have been
    >> a worthy goal on crappy 2.4 VMs, but these times are long gone.
    >>
    >> The "saving memory on embedded" argument also does not
    >> quite convince me, it is unclear if that is really
    >> a significant amount of memory on these systems and if that
    >> couldn't be addressed better (e.g. in running generally
    >> less kernel threads). I don't have numbers on this,
    >> but then the people who made this argument didn't have any
    >> either

    >
    > It is not uncommon for embedded systems to be designed around 16MiB.


    But these are SoC systems. Do they really run x86?
    (note we're talking about an x86 default option here)

    Also I suspect in a true 16MB system you have to strip down
    everything kernel side so much that you're pretty much outside
    the "validated by testers" realm that Adrian cares about.

    > When dealing in those dimensions, savings of 100k are substantial. In
    > some causes they may be the difference between 16MiB or 32MiB, which
    > translates to manufacturing costs. In others it simply means that the
    > system can cache


    If you need the stack you don't have any less cache footprint.
    If you don't need it you don't have any either.

    -Andi

  10. Re: x86: 4kstacks default

    On Sun, Apr 20, 2008 at 10:42:28AM +0300, Adrian Bunk wrote:

    > We are going from 6k to 4k.


    6k?

    > Why should users have to poke with such deeply internal things?
    > That doesn't sound right.


    they shouldn't, so a 4k default is a problem for them

    > Excessive stack usage in the kernel is considered to be a bug.


    define excessive

    > We should identify and fix all remaining problems (if any).


    let's see your patches then

  11. Re: x86: 4kstacks default

    Daniel Hazelton wrote:

    > Andi, you're the only one I've seen seriously pounding the "50k threads"
    > thing. I don't think anyone is really fooled by the straw-man, so I'd
    > suggest you drop it.


    Ok, perhaps we can settle this properly. Like historians, we study the
    original sources.

    The primary source is the original commit adding the 4k stack code.
    You cannot find this in the latest git because it predates 2.6.12, but
    it is available in one of the historic trees imported from BitKeeper,
    like
    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

    Here's the log:
    >>

    commit 95f238eac82907c4ccbc301cd5788e67db0715ce
    Author: Andrew Morton
    Date: Sun Apr 11 23:18:43 2004 -0700

    [PATCH] ia32: 4Kb stacks (and irqstacks) patch

    From: Arjan van de Ven

    Below is a patch to enable 4Kb stacks for x86. The goal of this is to

    1) Reduce footprint per thread so that systems can run many more threads
    (for the java people)

    2) Reduce the pressure on the VM for order > 0 allocations. We see
    real life workloads (granted with 2.4 but the fundamental fragmentation
    issue isn't solved in 2.6 and isn't solvable in theory) where this can
    be a problem. In addition order > 0 allocations can make the VM
    "stutter" and give more latency due to having to do much much more work
    trying to defragment

    ....
    <<

    This gives us two reasons, as you can see: one of them many threads,
    and another mostly only relevant to 2.4.

    Now I was also assuming that nobody took (1) really seriously and
    attacked (2) in an earlier thread; in particular in

    http://article.gmane.org/gmane.linux.kernel/665584

    >>

    Actually the real reason the 4K stacks were introduced IIRC was that
    the VM is not very good at allocation of order > 0 pages and that only
    using order 0 and not order 1 in normal operation prevented some stalls.

    This rationale also goes back to 2.4 (especially some of the early 2.4
    VMs were not very good) and the 2.6 VM is generally better and on
    x86-64 I don't see much evidence that these stalls are a big problem
    (but then x86-64 also has more lowmem).
    <<

    This was corrected by Ingo who was one of the primary authors of the patch:

    http://thread.gmane.org/gmane.linux.kernel/665420:

    >>

    no, the primary motivation Arjan and me started working on 4K stacks and
    implemented it was what Denys mentioned: i had a testcase that ran
    50,000 threads before it ran out of memory - i wanted it to run 100,000
    threads. The improved order-0 behavior was just icing on the cake.

    Ingo
    <<

    and then from Arjan:

    http://thread.gmane.org/gmane.linux.kernel/665420

    >>

    > no, the primary motivation Arjan and me started working on 4K stacks
    > and implemented it was what Denys mentioned: i had a testcase that


    well that and the fact that RH had customers who had major issues at
    fewer threads with 8Kb versus fragmentation.
    <<

    So both the primary authors of the patch state that 50k threads
    was the main reason. I didn't believe it at first either, but after
    these forceful corrections I do now.

    You're totally wrong when you call it a straw man.

    -Andi


  12. Re: x86: 4kstacks default

    Jörn Engel wrote:
    > On Sun, 20 April 2008 19:19:26 +0200, Andi Kleen wrote:
    >> But these are SoC systems. Do they really run x86?
    >> (note we're talking about an x86 default option here)
    >>
    >> Also I suspect in a true 16MB system you have to strip down
    >> everything kernel side so much that you're pretty much outside
    >> the "validated by testers" realm that Adrian cares about.

    >
    > Maybe. I merely showed that embedded people (not me) have good reasons
    > to care about small stacks.


    Sure, but I don't think they're x86 embedded people. Right now there
    are very few x86 SoCs, if any (IIRC there is only some obscure Rise
    core), and future SoCs will likely have more RAM.

    Anyway, I don't have a problem with giving these people any special
    options they need to do whatever they want. I just object to changing
    the default options on important architectures to force people in
    completely different setups to do part of their testing.


    > Whether they care enough to actually spend work on it - doubtful.
    >>> When dealing in those dimensions, savings of 100k are substantial. In
    >>> some causes they may be the difference between 16MiB or 32MiB, which
    >>> translates to manufacturing costs. In others it simply means that the
    >>> system can cache

    >> If you need the stack you don't have any less cache footprint.
    >> If you don't need it you don't have any either.

    >
    > This part I don't understand.


    I was just objecting to your claim that small stacks imply a smaller
    cache footprint. Smaller stacks rarely give you a smaller cache
    footprint in my kernel coding experience:

    First, some stack is always safety margin and in practice unused. It
    won't be in cache.

    Then, the typical standard kernel stack pigs are just too-large buffers
    on the stack which are not fully used. These also don't have much cache
    footprint.

    Or if you have a complicated call stack, the typical fix is to move
    parts of it into another thread. But that doesn't give you less cache
    footprint, because the cache footprint is just in someone else's stack.
    In fact you'll likely have slightly more cache footprint from that, due
    to the context of the other thread.

    In theory if you e.g. convert a recursive algorithm to iterative you
    might save some cache footprint, but I don't think that really happens
    in kernel code.
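
    As a toy illustration of that conversion (the linked-chain walk below
    is invented for the example; it is not the kernel's symlink or
    block-layer code): the recursive form burns one stack frame per
    element, while the iterative form runs in constant stack space, which
    is where the stack saving, and any footprint change, comes from.

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct node {
        struct node *next;
    };

    /* Recursive walk: consumes one stack frame per element of the chain,
     * so a deep chain means deep stack usage. */
    static int depth_recursive(const struct node *n)
    {
        if (n == NULL)
            return 0;
        return 1 + depth_recursive(n->next);
    }

    /* Iterative walk: the same result in constant stack space. */
    static int depth_iterative(const struct node *n)
    {
        int depth = 0;

        while (n != NULL) {
            depth++;
            n = n->next;
        }
        return depth;
    }

    int main(void)
    {
        struct node c = { NULL }, b = { &c }, a = { &b };

        assert(depth_recursive(&a) == 3);
        assert(depth_iterative(&a) == 3);
        return 0;
    }
    ```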

    -Andi

  13. Re: x86: 4kstacks default

    On Sun, 20 Apr 2008 19:26:10 +0200
    Andi Kleen wrote:

    > Daniel Hazelton wrote:
    >
    > > Andi, you're the only one I've seen seriously pounding the "50k
    > > threads" thing. I don't think anyone is really fooled by the
    > > straw-man, so I'd suggest you drop it.

    >
    > Ok, perhaps we can settle this properly. Like historicans. We study
    > the original sources.
    >
    > The primary resource is the original commit adding the 4k stack code.
    > You cannot find this in latest git because it predates 2.6.12, but it
    > is available in one of the historic trees imported from BitKeeper like
    > git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
    >
    > Here's the log:
    > >>

    > commit 95f238eac82907c4ccbc301cd5788e67db0715ce
    > Author: Andrew Morton
    > Date: Sun Apr 11 23:18:43 2004 -0700
    >
    > [PATCH] ia32: 4Kb stacks (and irqstacks) patch
    >
    > From: Arjan van de Ven
    >
    > Below is a patch to enable 4Kb stacks for x86. The goal of this
    > is to
    >
    > 1) Reduce footprint per thread so that systems can run many more
    > threads (for the java people)
    >
    > 2) Reduce the pressure on the VM for order > 0 allocations. We see
    > real life workloads (granted with 2.4 but the fundamental fragmentation
    > issue isn't solved in 2.6 and isn't solvable in theory) where this can
    > be a problem. In addition order > 0 allocations can make the VM
    > "stutter" and give more latency due to having to do much much more
    > work trying to defragment
    >
    > ...
    > <<
    >
    > This gives us two reasons as you can see, one of them many threads
    > and another mostly only relevant to 2.4
    >
    > Now I was also assuming that nobody took (1) really serious and


    I'm sorry but I really hope nobody shares your assumption here.
    These are real customer workloads: java-based "many things going on at
    a time" showed several thousand threads in the system (a dozen or two
    per request, multiplied by the number of outstanding connections) for
    *real customers*. That you don't take that seriously is fair enough; you
    can take seriously whatever you want.


    > attacked (2) in earlier thread; in particular in


    Yes, you did attack. But let's please use friendlier language here than
    words like "attack". This is not a war, and we really shouldn't be
    hostile in this forum, neither in words nor in intention.

    >
    > http://article.gmane.org/gmane.linux.kernel/665584
    >
    > >>

    > Actually the real reason the 4K stacks were introduced IIRC was that
    > the VM is not very good at allocation of order > 0 pages and that only
    > using order 0 and not order 1 in normal operation prevented some
    > stalls.
    >
    > This rationale also goes back to 2.4 (especially some of the early 2.4
    > VMs were not very good) and the 2.6 VM is generally better and on
    > x86-64 I don't see much evidence that these stalls are a big problem
    > (but then x86-64 also has more lowmem).
    > <<


    What you didn't atta^Waddress was the observation that fragmentation is fundamentally unsolvable.
    Yes 2.4 sucked a lot more than 2.6 does. But even 2.6 will (and does) have fragmentation issues.
    We don't have effective physical address based reclaim yet for higher order allocs.

    >
    > http://thread.gmane.org/gmane.linux.kernel/665420:
    >
    > >>

    > no, the primary motivation Arjan and me started working on 4K stacks
    > and implemented it was what Denys mentioned: i had a testcase that ran
    > 50,000 threads before it ran out of memory - i wanted it to run
    > 100,000 threads. The improved order-0 behavior was just icing on the
    > cake.
    >
    > Ingo
    > <<
    >
    > and then from Arjan:
    >
    > http://thread.gmane.org/gmane.linux.kernel/665420
    >
    > >>

    > > no, the primary motivation Arjan and me started working on 4K stacks
    > > and implemented it was what Denys mentioned: i had a testcase that

    >
    > well that and the fact that RH had customers who had major issues at
    > fewer threads
    > with 8Kb versus fragmentation.
    > <<
    >
    > So both the primary authors of the patch state that 50k threads
    > was the main reason. I didn't believe it at first either, but after
    > these forceful corrections I do now.


    I'm sorry but I fail to entirely understand where your "So" or the rest of your
    conclusion comes from in terms of "both the authors". Which part of "fewer threads" and
    "8kb versus fragmentation" did you misunderstand to get to your conclusion?


  14. Re: x86: 4kstacks default

    On Sun, 20 Apr 2008 20:19:30 +0200
    Andi Kleen wrote:

    > In theory if you e.g. convert a recursive algorithm to iterative you
    > might save some cache footprint, but I don't think that really happens
    > in kernel code.
    >


    this is what Al did for the symlink recursion thing, and Jens did for the block layer...
    so yes this conversion does happen for real.


  15. Re: x86: 4kstacks default



    > These are real customer workloads: java-based "many things going on at
    > a time" showed several thousand threads in the system (a dozen or two
    > per request, multiplied by the number of outstanding connections) for
    > *real customers*.


    Several thousand or 50k? Several thousand sounds large but not entirely
    unreasonable; it is far from 50k, though.

    > That you don't take that seriously is fair enough; you can take
    > seriously whatever you want.


    No, I don't take 50k threads on 32bit seriously. And I hope you do not
    either.

    Why I don't take it seriously: on 32bit, 50k threads will lead to
    lowmem exhaustion if the threads are actually doing something (like
    keeping select pages around or similar and having some thread-local
    data). You'll easily be at 16-32K/thread, and that is already far
    beyond the lowmem available on any 3:1 split 32bit kernel, likely even
    beyond 2:2. Even with 3:1 it could be tight.

    So you can say about customer workloads what you want, but you'll
    have a hard time convincing me they really run 50k threads
    doing something on 32bit.
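
    The arithmetic behind that argument, written out as a worked sketch
    (the ~896MB figure for usable lowmem on a 3:1 split is the conventional
    i386 value and is my assumption; the 16-32K/thread figures are the ones
    from the post):

    ```c
    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        const long threads = 50000;
        const long per_thread_low  = 16L * 1024;    /* optimistic 16K/thread */
        const long per_thread_high = 32L * 1024;    /* pessimistic 32K/thread */
        const long lowmem_3_1 = 896L * 1024 * 1024; /* ~896MB lowmem, 3:1 split */

        long need_low  = threads * per_thread_low;  /* 50k * 16K */
        long need_high = threads * per_thread_high; /* 50k * 32K */

        printf("50k threads need %ld-%ld MB of lowmem, vs ~%ld MB available\n",
               need_low >> 20, need_high >> 20, lowmem_3_1 >> 20);

        /* The optimistic figure already approaches the lowmem ceiling, and
         * the pessimistic one exceeds it outright. */
        assert(need_low < lowmem_3_1);
        assert(need_high > lowmem_3_1);
        return 0;
    }
    ```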

    Now if we take the realistic overhead of a thread into account, 4k more
    or less doesn't really matter all that much, and the decreased safety
    from the 4k stack starts to look like a very bad bargain.

    >> attacked (2) in earlier thread; in particular in

    >
    > yes you did attack.
    > But lets please use more friendly conversation here than words like
    > "attack". This is not a war, and we really shouldn't be hostile in this forum, neither
    > in words nor in intention.


    Ok what word would you prefer?

    There is no war involved, right, just a technical argument. I previously
    always assumed that "attacking" was a standard term in discussions, but
    if you don't like it I can switch to another one.

    Regarding war-like terminology: I used to think that people who commonly
    talk about "nuking code" went a little too far, but at some point
    I adapted to them, I think. Perhaps it comes from that.


    > What you didn't atta^Waddress


    Fine, I will call it "address" from now on.

    > was the observation that fragmentation is fundamentally unsolvable.


    Where was that observation?

    > Yes 2.4 sucked a lot more than 2.6 does. But even 2.6 will (and does) have fragmentation issues.
    > We don't have effective physical address based reclaim yet for higher order allocs.


    I don't see any evidence that there are serious order-1 fragmentation
    issues on 2.6. If you have any, please post it.

    -Andi

  16. Re: x86: 4kstacks default

    Arjan van de Ven wrote:

    > this is what Al did for the symlink recursion thing,


    AFAIK most symlink lookups are still recursive.

    -Andi

  17. Re: x86: 4kstacks default

    On Sun, 20 April 2008 20:19:30 +0200, Andi Kleen wrote:
    > >
    > >>> When dealing in those dimensions, savings of 100k are substantial. In
    > >>> some causes they may be the difference between 16MiB or 32MiB, which
    > >>> translates to manufacturing costs. In others it simply means that the
    > >>> system can cache
    > >> If you need the stack you don't have any less cache foot print.
    > >> If you don't need it you don't have any either.

    > >
    > > This part I don't understand.

    >
    I was just objecting to your claim that a small stack implies a smaller
    cache footprint.


    The cache I referred to is called DRAM, not L1.

    Jörn

    --
    Don't worry about people stealing your ideas. If your ideas are any good,
    you'll have to ram them down people's throats.
    -- Howard Aiken quoted by Ken Iverson quoted by Jim Horning quoted by
    Raph Levien, 1979

  18. Re: x86: 4kstacks default

    On Sun, 20 April 2008 20:19:30 +0200, Andi Kleen wrote:
    >
    > Sure but I don't think they're x86 embedded people. Right now there
    > are very few x86 SoCs, if any (IIRC there is only some obscure Rise
    > core), and future SoCs will likely have more RAM.
    >
    > Anyways I don't have a problem to give these people any special options
    > they need to do whatever they want. I just object to changing the
    > default options on important architectures to force people in completely
    > different setups to do part of their testing.


    Ah, ok. The question of whether 4k stacks should become the default is
    one I prefer not to touch with an 80' pole.

    Jörn

    --
    Why do musicians compose symphonies and poets write poems?
    They do it because life wouldn't have any meaning for them if they didn't.
    That's why I draw cartoons. It's my life.
    -- Charles Schulz

  19. Re: x86: 4kstacks default

    On Sunday 20 April 2008 16:01:46 Andi Kleen wrote:
    > > These are real customer workloads; java based "many things going on" at a
    > > time showed several thousands of threads in the system (a dozen or two
    > > per request, multiplied by the number of outstanding connections) for
    > > *real customers*.

    >
    > Several thousands or 50k? Several thousands sounds large, but not entirely
    > unreasonable, but it is far from 50k.
    >


    At 12 threads per request it would only take about 4200 outstanding requests.
    That is high, but I can see it happening. At 24 threads per request the
    number of outstanding requests needed is cut in half, to about 2100. That
    number is more realistic. Since all outstanding requests aren't going to sit
    at either extreme, assume a midpoint between the two - say somewhere around
    3150 outstanding requests.
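    The arithmetic in the paragraph above can be sketched like this (12 and 24 threads per request are the "dozen or two" figures quoted earlier in the thread; nothing here is measured):

```python
# Outstanding requests needed to reach 50k threads, at "a dozen
# or two" threads per request.
target = 50_000

counts = {}
for per_request in (12, 24):
    counts[per_request] = target // per_request
    print(f"{per_request} threads/request -> "
          f"~{counts[per_request]} outstanding requests")

# Midpoint between the two extremes.
midpoint = (counts[12] + counts[24]) // 2
print(f"midpoint -> ~{midpoint} outstanding requests")
```

    This reproduces the ~4200, ~2100, and ~3150 figures above (to rounding).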

    While that is a rather high number, if a decently sized company is using a
    piece of Java code internally for some reason, it could easily see that
    level of requests coming in from its users. For a website with a decent
    load that routes a common request to the machine running the code, it'd be
    even easier to hit that limit. So yes, 50K threads *IS* actually pretty easy
    to reach and could be a common workload.

    > > That you don't take that serious, fair, you can take serious whatever you
    > > want.

    >
    > No I don't take 50k threads on 32bit serious. And I hope you do not
    > either.


    That just makes you sound foolish. Run the numbers yourself and you'll see
    that it is easy for a machine running highly threaded code to hit 50K
    threads.


    > > Yes 2.4 sucked a lot more than 2.6 does. But even 2.6 will (and does)
    > > have fragmentation issues. We don't have effective physical address based
    > > reclaim yet for higher order allocs.

    >
    > I don't see any evidence that there are serious order 1 fragmentation
    > issues on 2.6. If you have any please post it.


    Due to me screwing up the configuration of Apache (2) and MySQL, I have seen
    a machine I own hit problems with memory fragmentation - and it's running a
    2.6 series kernel (a distro 2.6.17).

    Because I could see that it was a problem I caused, I didn't even *THINK*
    about posting information about it to LKML. I didn't keep the logs of that
    around - it happened more than three months ago and I clean the logs out
    every three months or so.

    DRH

    --
    Dialup is like pissing through a pipette. Slow and excruciatingly painful.

  20. Re: x86: 4kstacks default

    Daniel Hazelton wrote:

    > At 12 threads per request it'd only take about 4200 outstanding requests. That
    > is high, but I can see it happening.


    If it happens it just won't work on 32bit.

    > Just makes you sound foolish. Run the numbers yourself and you'll see that it
    > is easy for a machine running highly threaded code to easily hit 50K threads.


    I ran the numbers, and they showed that you need > 1.5GB of lowmem
    in a somewhat realistic scenario (32K per thread) at 50k threads. And
    subtracting 4k from that 32k number won't make any significant
    difference (still over 1.3GB).
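    These figures work out as follows (a sketch; 32 KiB is the per-thread overhead assumed in this message, 28 KiB the same minus the 4k stack saving):

```python
# Lowmem needed for 50k threads, with and without the 4 KiB
# saved by the smaller kernel stack.
threads = 50_000

for per_thread_kib in (32, 28):
    need_gib = threads * per_thread_kib / (1024 * 1024)
    print(f"{per_thread_kib} KiB/thread -> {need_gib:.2f} GiB of lowmem")
```

    Either way the total dwarfs the few hundred MiB of lowmem actually free on a 3:1 split, which is the point being made here.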

    If you claim that works on a 32bit system with typically 300-600MB of
    lowmem available (which is also shared by other subsystems), I know who
    sounds foolish.

    -Andi
