Thread: Re: Direction of Stack Growth

  1. Re: Direction of Stack Growth

    Nick Maclaren wrote:
    > In article <_uGdnTqXWaEPHYLanZ2dnUVZ_hWdnZ2d@rcn.net>,
    > Jerry Avins writes:


    ...

    > |> Ayup! The Forths I use have three stacks.
    >
    > Up, down and sideways? I always knew that Forth was a bizarre
    > language :-)


    Return (and auxiliary), parameter (for passing data) and floating point.
    Some implementations put floats on the parameter stack.

    Many Forth multitaskers are cooperative (round robin) and provide
    separate stacks for each task. This is done by giving each task its own
    stack pointers, and making those effective as appropriate. There are
    still three (or two) stacks in use at any one time. One system I used
    had a separate interrupt stack.
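
    In C terms, the task switch of such a cooperative multitasker is just a
    stack-pointer swap. A minimal sketch of the idea (mine, not any
    particular Forth; pause_switch and the Task layout are made up):

        #include <stdio.h>

        #define TASKS 2
        #define DEPTH 64

        typedef struct {
            long data[DEPTH], ret[DEPTH]; /* per-task data and return stacks */
            long *dsp, *rsp;              /* that task's saved pointers      */
        } Task;

        static Task task[TASKS];
        static int cur = 0;
        static long *dsp, *rsp;           /* the pointers actually in effect */

        static void pause_switch(void)    /* like Forth's PAUSE: round robin */
        {
            task[cur].dsp = dsp; task[cur].rsp = rsp; /* save current task   */
            cur = (cur + 1) % TASKS;
            dsp = task[cur].dsp; rsp = task[cur].rsp; /* next task effective */
        }

        int main(void)
        {
            for (int t = 0; t < TASKS; t++) {         /* all stacks empty    */
                task[t].dsp = task[t].data;
                task[t].rsp = task[t].ret;
            }
            dsp = task[0].dsp; rsp = task[0].rsp;

            *dsp++ = 42;                  /* task 0 pushes on its data stack */
            pause_switch();               /* task 1 sees its own empty stack */
            printf("task %d depth %ld\n", cur, (long)(dsp - task[cur].data));
            pause_switch();               /* back to task 0, depth 1 again   */
            printf("task %d depth %ld\n", cur, (long)(dsp - task[cur].data));
            return 0;
        }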

    Jerry
    --
    Engineering is the art of making what you want from things you can get.
    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

  2. Re: Direction of Stack Growth

    On 24 Oct 2007 17:53:45 GMT, Nick Maclaren wrote:

    > In article <_uGdnTqXWaEPHYLanZ2dnUVZ_hWdnZ2d@rcn.net>,
    > Jerry Avins writes:
    >|>
    >|> > As some people know, I have been banging on about double stacks for
    >|> > years. I first encountered them in Algol 68, was suspicious, but
    >|> > discovered that they need negligible extra code, stack space and
    >|> > register use, and vastly improve locality. Now, in these days of
    >|> > 90% of performance being due to cachability, why does nobody pick
    >|> > them up again?
    >|> >
    >|> > It's insane.
    >|>
    >|> Ayup! The Forths I use have three stacks.
    >
    > Up, down and sideways? I always knew that Forth was a bizarre
    > language :-)
    >

    The Forth I had for the RCA Cosmac 1802 had one stack growing up and the
    other growing down ;-)

    --
    Coos

    CHForth, 16 bit DOS applications
    http://home.hccnet.nl/j.j.haak/forth.html

  3. Re: Direction of Stack Growth

    Coos Haak wrote:
    > On 24 Oct 2007 17:53:45 GMT, Nick Maclaren wrote:
    >
    >> In article <_uGdnTqXWaEPHYLanZ2dnUVZ_hWdnZ2d@rcn.net>,
    >> Jerry Avins writes:
    >> |>
    >> |> > As some people know, I have been banging on about double stacks for
    >> |> > years. I first encountered them in Algol 68, was suspicious, but
    >> |> > discovered that they need negligible extra code, stack space and
    >> |> > register use, and vastly improve locality. Now, in these days of
    >> |> > 90% of performance being due to cachability, why does nobody pick
    >> |> > them up again?
    >> |> >
    >> |> > It's insane.
    >> |>
    >> |> Ayup! The Forths I use have three stacks.
    >>
    >> Up, down and sideways? I always knew that Forth was a bizarre
    >> language :-)
    >>

    > The Forth I had for the RCA Cosmac 1802 had one stack growing up and the
    > other growing down ;-)


    And register 3 was always the interrupt stack, whatever had been
    assigned as the stack pointer for normal operation. There were standard
    register assignments, IIRC: 1, DMA (and PC?); 2, data pointer; 3, stack
    pointer; 4, CALL routine PC; 5, RETURN routine PC; 6, link register; the
    rest unspoken for. These were merely convention; except for DMA and
    during interrupts, any register could serve any purpose.

    The math ROM followed the standard convention. Since the 1802 has no
    relative addressing capability, addressing an element of the stack
    requires pointing a register to it. The genius who wrote the routines
    incremented and decremented the stack pointer as needed instead of
    (laboriously) copying it to a free register and diddling that.

    I wrote a controller that needed the math package, and I used a
    non-standard register as my stack pointer. One fellow from the software
    side berated me for not adhering to the convention. I explained that I
    was using both the math ROM and interrupts, that the ROM code diddled
    the stack pointer with the expectation of restoring it later, and that
    if an interrupt happened while the pointer was out of place .... At that
    point, my accuser said, "Don't tell me what it does, I wrote that
    package. So *that's* why the system sometimes crashes." I guess you
    could say that my fancy controller, as implemented, also had a separate
    interrupt stack.

    Jerry
    --
    Engineering is the art of making what you want from things you can get.
    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

  4. Re: Direction of Stack Growth

    mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) writes:
    > While on the subject of return addresses, isn't the ARM a throwback
    > to the old BALR days? It's granted that you can save a push operation
    > for leaf functions, but otherwise you spend the same or more operations
    > saving the return address than just having the hardware do it
    > automagically.



    a separate issue showed up in os/360 days in the transition from real
    storage to virtual memory. part of the problem was that a very
    ingrained pointer-passing convention permeated the infrastructure. in
    the transition to virtual memory on 370s ... the idea was to give each
    process possibly a full, maximum 16mbyte virtual address space. the
    downside was that the ingrained pointer-passing convention contributed
    to the kernel image being included in every (process) virtual address
    space ... which took up 8mbytes (of the 16mbytes).

    then there were a lot of operating system "subsystems" that lived
    outside the kernel ... but provided services to applications and where
    the pointer-passing paradigm was also ingrained. these got moved into
    their own virtual address spaces. now the pointer-passing paradigm got a
    little more difficult. to address this ... there was something called a
    "common segment" that started out as one megabyte ... appearing in every
    virtual address space. an application would find some space in the
    common segment, stuff in arguments, and make a kernel call with a
    pointer to the stuff in the common segment ... which would pass thru the
    kernel and switch to the subsystem address space. the subsystem then
    could pass back information ... also in the common segment. the problem
    here was that for larger installations, with numerous "subsystems", the
    common segment would grow to five or possibly even six megabytes.

    so you now have the kernel image taking 8mbytes of every 16mbyte
    application virtual address space ... and the common segment taking
    another 5-6 mbytes of every application 16mbyte virtual address space
    ... leaving possibly as little as 2mbytes for the application.

    something called "dual-address space" mode was introduced with the 3033
    ... it allowed semi-privileged applications to execute with two virtual
    address space pointers ... one was the private virtual address space ...
    and one typically belonged to the calling application's virtual address
    space. in this scenario ... each instance of the subsystem execution
    actually required a register with the address of the passed parameter
    pointer ... and a (control) register with the corresponding calling
    application virtual address space. however, this still required passing
    thru the kernel ... executing software instructions to do the necessary
    permission checking and switching around of address pointers (along with
    instructions that accessed storage in other virtual address spaces).

    later came "access registers" and program call/return. access registers
    generalized dual-address space mode to multiple virtual address spaces.
    there was a kernel hardware table referenced by program call/return that
    supported applications being able to effectively make "BALR"-type calls
    to subsystem services (in different virtual address spaces) and have all
    the various control registers fiddled with (w/o the overhead of
    executing instructions in the kernel) ... and then perform the return
    ... again w/o requiring kernel software execution. program call/return
    started out as a hierarchical infrastructure ... but later mesh-type
    operation was added.

    current description of address types, virtual address types, access
    register specified virtual address, etc
    http://publibz.boulder.ibm.com/cgi-b...04121320&CASE=

    more address space discussion
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    more discussion of being able to handle up to 16 address spaces,
    the primary address space and up to 15 AR-specified address spaces
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    access-register introduction
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    linkage-stack introduction
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    linkage-stack operations
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    program call instruction
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    program return instruction
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    program transfer and program transfer with instance instructions
    http://publibz.boulder.ibm.com/cgi-b...20040504121320

    misc. past posts mentioning dual-address space, access registers,
    program call/return, etc
    http://www.garlic.com/~lynn/98.html#36 What is MVS/ESA?
    http://www.garlic.com/~lynn/2000c.html#84 Is a VAX a mainframe?
    http://www.garlic.com/~lynn/2000d.html#28 RS/6000 vs. System/390 architecture?
    http://www.garlic.com/~lynn/2000e.html#58 Why not an IBM zSeries workstation?
    http://www.garlic.com/~lynn/2001d.html#28 Very CISC Instuctions (Was: why the machine word size ...)
    http://www.garlic.com/~lynn/2001h.html#73 Most complex instructions
    http://www.garlic.com/~lynn/2001i.html#13 GETMAIN R/RU (was: An IEABRC Adventure)
    http://www.garlic.com/~lynn/2001k.html#16 Minimalist design (was Re: Parity - why even or odd)
    http://www.garlic.com/~lynn/2002d.html#51 Hardest Mistake in Comp Arch to Fix
    http://www.garlic.com/~lynn/2002g.html#17 Black magic in POWER5
    http://www.garlic.com/~lynn/2002g.html#18 Black magic in POWER5
    http://www.garlic.com/~lynn/2002l.html#57 Handling variable page sizes?
    http://www.garlic.com/~lynn/2002n.html#58 IBM S/370-168, 195, and 3033
    http://www.garlic.com/~lynn/2002n.html#74 Everything you wanted to know about z900 from IBM
    http://www.garlic.com/~lynn/2002q.html#1 Linux paging
    http://www.garlic.com/~lynn/2003c.html#13 Unused address bits
    http://www.garlic.com/~lynn/2003d.html#53 Reviving Multics
    http://www.garlic.com/~lynn/2003d.html#69 unix
    http://www.garlic.com/~lynn/2003e.html#0 Resolved: There Are No Programs With >32 Bits of Text
    http://www.garlic.com/~lynn/2003e.html#12 Resolved: There Are No Programs With >32 Bits of Text
    http://www.garlic.com/~lynn/2003g.html#13 Page Table - per OS/Process
    http://www.garlic.com/~lynn/2003m.html#29 SR 15,15
    http://www.garlic.com/~lynn/2004c.html#6 If the x86 ISA could be redone
    http://www.garlic.com/~lynn/2004e.html#41 Infiniband - practicalities for small clusters
    http://www.garlic.com/~lynn/2004f.html#27 [Meta] Marketplace argument
    http://www.garlic.com/~lynn/2004f.html#53 Infiniband - practicalities for small clusters
    http://www.garlic.com/~lynn/2004n.html#26 PCIe as a chip-to-chip interconnect
    http://www.garlic.com/~lynn/2004n.html#54 CKD Disks?
    http://www.garlic.com/~lynn/2004o.html#18 Integer types for 128-bit addressing
    http://www.garlic.com/~lynn/2004o.html#57 Integer types for 128-bit addressing
    http://www.garlic.com/~lynn/2005.html#3 [Lit.] Buffer overruns
    http://www.garlic.com/~lynn/2005b.html#53 The mid-seventies SHARE survey
    http://www.garlic.com/~lynn/2005c.html#63 intel's Vanderpool and virtualization in general
    http://www.garlic.com/~lynn/2005d.html#62 Misuse of word "microcode"
    http://www.garlic.com/~lynn/2005f.html#7 new Enterprise Architecture online user group
    http://www.garlic.com/~lynn/2005f.html#41 Moving assembler programs above the line
    http://www.garlic.com/~lynn/2005f.html#57 Moving assembler programs above the line
    http://www.garlic.com/~lynn/2005p.html#18 address space
    http://www.garlic.com/~lynn/2005p.html#19 address space
    http://www.garlic.com/~lynn/2005q.html#41 Instruction Set Enhancement Idea
    http://www.garlic.com/~lynn/2005q.html#48 Intel strikes back with a parallel x86 design
    http://www.garlic.com/~lynn/2006.html#39 What happens if CR's are directly changed?
    http://www.garlic.com/~lynn/2006b.html#25 Multiple address spaces
    http://www.garlic.com/~lynn/2006b.html#28 Multiple address spaces
    http://www.garlic.com/~lynn/2006i.html#33 virtual memory
    http://www.garlic.com/~lynn/2006p.html#10 What part of z/OS is the OS?
    http://www.garlic.com/~lynn/2006r.html#26 A Day For Surprises (Astounding Itanium Tricks)
    http://www.garlic.com/~lynn/2006r.html#32 MIPS architecture question - Supervisor mode & who is using it?
    http://www.garlic.com/~lynn/2006s.html#42 Ranking of non-IBM mainframe builders?
    http://www.garlic.com/~lynn/2006t.html#23 threads versus task
    http://www.garlic.com/~lynn/2006x.html#23 Multiple mappings
    http://www.garlic.com/~lynn/2006y.html#39 Multiple mappings
    http://www.garlic.com/~lynn/2007g.html#39 Wylbur and Paging
    http://www.garlic.com/~lynn/2007g.html#59 IBM to the PCM market(the sky is falling!!!the sky is falling!!)
    http://www.garlic.com/~lynn/2007k.html#14 Some IBM 3033 information
    http://www.garlic.com/~lynn/2007k.html#27 user level TCP implementation
    http://www.garlic.com/~lynn/2007k.html#28 IBM 360 Model 20 Questions
    http://www.garlic.com/~lynn/2007l.html#71 IBM 360 Model 20 Questions
    http://www.garlic.com/~lynn/2007o.html#10 IBM 8000 series
    http://www.garlic.com/~lynn/2007p.html#21 Newsweek article--baby boomers and computers



  5. Re: Direction of Stack Growth

    Nick Maclaren wrote:
    (snip)

    > Not just that. During my spells at IBM, I tried very hard to persuade
    > the Fortran people that using a stack would IMPROVE performance, based
    > on my experiences on the System/370 with many languages.


    > They were wedded to static allocation, because it enabled many of the
    > address constants to be calculated by the compiler, which also eased
    > pressure on the registers. This was Fortran 77, of course, and didn't
    > support reentrancy. That argument is a good one, but I was arguing
    > that the better locality properties of a stack gave MORE of a gain.


    I haven't looked at what VS Fortran does lately, but IBM did eventually
    figure this out. Later processors have a wide cache, with separate
    I and D cache. The cost of accessing memory already in the other
    cache is fairly high, so it is necessary to keep instructions and
    data apart. The old static save areas and local variables were
    right next to the code. There is even an old trick of putting the
    static save area just before the code so that R13 can be a save
    area address and code base register!

    I don't know how many other processors have the wide cache and
    separate I and D cache problem, though.

    -- glen


  6. Re: Direction of Stack Growth

    On Oct 24, 12:54 pm, moja...@mojaveg.lsan.mdsg-pacwest.com (Everett M.
    Greene) wrote:
    > While on the subject of return addresses, isn't the ARM a throwback
    > to the old BALR days? It's granted that you can save a push operation
    > for leaf functions, but otherwise you spend the same or more operations
    > saving the return address than just having the hardware do it
    > automagically.



    Having the subroutine call leave the return address in a register is
    almost universal for RISCs.

    In any event, even if you're only avoiding the push and pop for calls
    to leaf functions, that's approximately half the total (dynamic)
    number of calls.


  7. Re: Direction of Stack Growth

    On Wed, 24 Oct 2007 15:35:12 -0400, Jerry Avins wrote:

    > Coos Haak wrote:


    >> The Forth I had for the RCA Cosmac 1802 had one stack growing up and the
    >> other growing down ;-)

    >
    > And register 3 was always the interrupt stack, whatever had been
    > assigned as the stack pointer for normal operation. There were standard
    > register assignments, IIRC: 1, DMA (and PC?); 2, data pointer; 3, stack
    > pointer; 4, CALL routine PC; 5, RETURN routine PC; 6, link register; the
    > rest unspoken for. These were merely convention; except for DMA and
    > during interrupts, any register could serve any purpose.


    R0 was the startup PC, but was immediately (on my ELF II) assigned for
    DMA. R1 was the interrupt PC (hardware assigned) and, with standard CALL
    and RETURN, R3 was the PC. And yes, R4 for CALL, R5 for RETURN, and R6
    as the link register, holding the address after the previous call so
    that inline parameters could easily be taken up.

    > The math ROM followed the standard convention. Since the 1802 has no
    > relative addressing capability, addressing an element of the stack
    > requires pointing a register to it. The genius who wrote the routines
    > incremented and decremented the stack pointer as needed instead of
    > (laboriously) copying it to a free register and diddling that.
    >

    The problem was that there were post-increment instructions, but
    nothing the other way round, only a post-decrement instruction ;-(

    --
    Coos

    CHForth, 16 bit DOS applications
    http://home.hccnet.nl/j.j.haak/forth.html

  8. Re: Direction of Stack Growth

    > Having the subroutine call leave the return address in a register is
    > almost universal for RISCs.


    Some of these processors also don't use predefined push/pop type
    instructions, but support autoinc/autodec with both pre- and
    post-modification of the pointer (i.e. matching what the C compiler
    requests), and thus the growth direction of the stack is defined by the
    compiler rather than by the hardware.
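
    For instance (a toy illustration of mine, not any particular ISA or
    compiler): the push idiom that gets emitted, not the hardware, fixes
    the direction.

        #include <stdio.h>

        static long mem[8];

        int main(void)
        {
            long *down = mem + 8;  /* pre-decrement push: stack grows down */
            long *up   = mem;      /* post-increment push: stack grows up  */

            *--down = 1;
            *up++   = 2;

            printf("%ld %ld\n", down[0], up[-1]);  /* prints: 1 2 */
            return 0;
        }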

    -Michael

  9. Re: Direction of Stack Growth

    Michael Schnell wrote:
    >> Having the subroutine call leave the return address in a register is
    >> almost universal for RISCs.

    >
    > Some of these processors also don't use predefined push/pop type of
    > instructions but support autoinc/autodec both with pre and post
    > modification of the pointer (i.e. matching what the C compiler requests)
    > and thus the growth direction of the stack is defined by the compiler.


    À la PDP-11?

    Jerry
    --
    Engineering is the art of making what you want from things you can get.
    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

  10. Re: Direction of Stack Growth

    On Oct 24, 6:14 pm, "robertwess...@yahoo.com" wrote:
    > Having the subroutine call leave the return address in a register is
    > almost universal for RISCs.
    >
    > In any event, even if you're only avoiding the push and pop for calls
    > to leaf functions, that's approximately half the total (dynamic)
    > number of calls.


    The BALR approach permits several link registers, so that register
    saving and restoring can be avoided for several levels of
    statically-nested subroutines (each level uses a different link
    register; argument and return value registers are also used in a
    complementary way).

    PowerPC has a single link register, so it has to be saved on a nested
    call. It does have two possible return link registers. S/360 has 15
    registers which can be used for linking and returning as well as for
    parameter and return value passing.

    The BALR mechanism also permits very efficient coroutine linkage, where
    one side's call-target register is the other side's link-return
    register.

    Michel.


  11. Re: Direction of Stack Growth

    nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
    > mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) writes:
    > |> nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
    > |> > Niklas Holsti writes:
    > |> > |>
    > |> > |> > As some people know, I have been banging on about double stacks for
    > |> > |> > years. I first encountered them in Algol 68, was suspicious, but
    > |> > |> > discovered that they need negligible extra code, stack space and
    > |> > |> > register use, and vastly improve locality. Now, in these days of
    > |> > |> > 90% of performance being due to cachability, why does nobody pick
    > |> > |> > them up again?
    > |> > |>
    > |> > |> If I've understood the "double stack" method correctly, it is like
    > |> > |> the "secondary stack" method that is used in some current Ada
    > |> > |> compilers, including the GNU Ada compiler GNAT, to manage
    > |> > |> dynamically sized stack-allocated objects.
    > |> >
    > |> > Yes, it is. I had forgotten about those. Thanks for the correction.
    > |> > The "nobody" should be "nobody except the Ada community"!
    > |>
    > |> If you're talking about separate stacks for return addresses and data,
    > |> there are C compilers extant using that method under the hood.
    >
    > No, we aren't.


    Are you going to give at least a hint as to what you are talking about
    then?

    > The method isn't much use for C89 or C++, but is for C99.


  12. Re: Direction of Stack Growth


    In article <20071025.7A02CA8.BFC2@mojaveg.lsan.mdsg-pacwest.com>,
    mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) writes:
    |>
    |> > |> If you're talking about separate stacks for return addresses and data,
    |> > |> there are C compilers extant using that method under the hood.
    |> >
    |> > No, we aren't.
    |>
    |> Are you going to give at least a hint as to what you are talking about
    |> then?

    Read the earlier postings in the thread. All will be explained.


    Regards,
    Nick Maclaren.


  13. Re: Direction of Stack Growth

    Everett M. Greene wrote:
    > nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
    >
    >>mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) writes:
    >>|> nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
    >>|> > Niklas Holsti writes:
    >>|> > |>
    >>|> > |> > As some people know, I have been banging on about double stacks for
    >>|> > |> > years. I first encountered them in Algol 68, was suspicious, but
    >>|> > |> > discovered that they need negligible extra code, stack space and
    >>|> > |> > register use, and vastly improve locality. Now, in these days of
    >>|> > |> > 90% of performance being due to cachability, why does nobody pick
    >>|> > |> > them up again?
    >>|> > |>
    >>|> > |> If I've understood the "double stack" method correctly, it is like
    >>|> > |> the "secondary stack" method that is used in some current Ada
    >>|> > |> compilers, including the GNU Ada compiler GNAT, to manage
    >>|> > |> dynamically sized stack-allocated objects.
    >>|> >
    >>|> > Yes, it is. I had forgotten about those. Thanks for the correction.
    >>|> > The "nobody" should be "nobody except the Ada community"!
    >>|>
    >>|> If you're talking about separate stacks for return addresses and data,
    >>|> there are C compilers extant using that method under the hood.
    >>
    >>No, we aren't.

    >
    >
    > Are you going to give at least a hint as to what you are talking about
    > then?


    Briefly, as I understand it, you have two stacks per process. One
    stack holds the data objects that have a static (compile-time
    known) size, and therefore have static frame offsets. This stack is
    usually the "native" processor stack and also holds return
    addresses. The second stack holds all stack-allocated data objects
    that have a dynamic (run-time computed) size, and is defined and
    managed entirely by the software run-time system. Objects on the
    second (dynamic-sized) stack are accessed through pointers located
    in the first (static-sized) stack.

    The second stack can be thought of as a special kind of heap for
    objects that have FIFO allocation order.
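
    In C the scheme might be sketched like this (my own illustration, not
    any particular compiler's runtime; sec_alloc and the sizes are made up):

        #include <stdio.h>
        #include <string.h>

        static char secondary[1 << 16];        /* contiguous secondary stack */
        static char *sec_top = secondary;

        static void *sec_alloc(size_t n)       /* push a dynamic-size object */
        {
            void *p = sec_top;
            sec_top += (n + 15) & ~(size_t)15; /* keep alignment             */
            return p;
        }

        static void demo(size_t n)         /* n is known only at run time    */
        {
            char *mark = sec_top;          /* secondary frame mark on entry  */
            int len = (int)n;              /* static-size local: C stack     */
            char *buf = sec_alloc(n + 1);  /* dynamic object: secondary
                                              stack; only the pointer lives
                                              in the primary frame           */
            memset(buf, 'x', n);
            buf[n] = '\0';
            printf("%d: %s\n", len, buf);
            sec_top = mark;                /* freeing = one pointer move     */
        }

        int main(void) { demo(5); demo(12); return 0; }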

    For a description of one such implementation of Algol 68, see
    http://birrell.org/andrew/papers/A68...nt-Sigplan.pdf

    The GNAT User Guide,

    http://gcc.gnu.org/onlinedocs/gnat_u...-gnatbind.html

    says: "The secondary stack is used to deal with functions that
    return a variable sized result, for example a function returning an
    unconstrained String. ... For most targets, the secondary stack is
    growing on demand and is allocated as a chain of blocks in the heap".


    --
    Niklas Holsti
    Tidorum Ltd
    niklas holsti tidorum fi
    . @ .

  14. Re: Direction of Stack Growth


    In article <4721a047$0$3492$39db0f71@news.song.fi>,
    Niklas Holsti writes:
    |>
    |> Briefly, as I understand it, ...

    You are correct. That was explained by several people earlier in the
    thread, too.

    |> The second stack can be thought of as a special kind of heap for
    |> objects that have FIFO allocation order.

    It can be, but that is a serious mistake. Unfortunately, the mistake
    is not obvious and so most people don't realise it.

    |> For a description of one such implementation of Algol 68, see
    |> http://birrell.org/andrew/papers/A68...nt-Sigplan.pdf

    Ah! ALGOL68C - that is the one I used most.

    |> The GNAT User Guide,
    |>
    |> http://gcc.gnu.org/onlinedocs/gnat_u...-gnatbind.html
    |>
    |> says: "The secondary stack is used to deal with functions that
    |> return a variable sized result, for example a function returning an
    |> unconstrained String. ... For most targets, the secondary stack is
    |> growing on demand and is allocated as a chain of blocks in the heap".

    And that is where they have made their mistake. Where at all possible,
    it should be allocated as a separate segment. There are two reasons
    for that, plus one point where they have made a related mistake:

    1) If it uses the heap, then freeing a frame needs a heap call to
    do so (or garbage collection). If it is a proper, independent, stack,
    no action is needed to free a frame.

    2) If it uses the heap, then certain common constructions (and a
    LOT of ones that use derived types) cause the most ghastly fragmentation.
    That wastes space, destroys locality, and needs fairly fancy garbage
    collection to avoid problems. Those simply do not arise if a proper,
    independent, stack is used.

    3) When returning an unconstrained object, it is generally better
    to use the heap, and to copy to the secondary stack in the calling
    procedure, if that is appropriate. That, again, was discovered in
    Algol 68 implementations. But why? Consider code like:

    VECTOR a;
    a := Makevector();

    Makevector starts by allocating quite a lot of dynamically-sized
    objects, and then calls Operate(), which creates the VECTOR that it
    returns. Oops. That vector is a long way down the secondary stack,
    but the caller of Makevector wants it much higher up. You then have
    a potentially overlapping copy.
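
    A toy version of that trap in C (my sketch; the secondary stack here
    grows toward higher addresses, so "down" and "up" are flipped relative
    to the description above):

        #include <stdio.h>
        #include <string.h>

        static char sec[4096];
        static char *top = sec;

        static char *sec_alloc(size_t n) { char *p = top; top += n; return p; }

        static char *make_vector(size_t n)  /* result returned on 2nd stack */
        {
            char *mark = top;               /* caller expects result here   */
            char *scratch = sec_alloc(100); /* temporaries get pushed first */
            memset(scratch, 0, 100);
            char *result = sec_alloc(n);    /* result lands past them       */
            memset(result, 'v', n);
            memmove(mark, result, n);       /* slide it back: the regions
                                               overlap whenever n > 100     */
            top = mark + n;                 /* pop the temporaries          */
            return mark;
        }

        int main(void)
        {
            char *v = make_vector(200);     /* n > 100: overlapping copy    */
            printf("%c at offset %ld\n", v[0], (long)(v - sec));
            return 0;
        }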

    Of course, there are several other solutions to this problem than
    using the heap, and some are considerably better, but all are more
    complicated.


    Regards,
    Nick Maclaren.

  15. Re: Direction of Stack Growth

    nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
    > mojaveg@mojaveg.lsan.mdsg-pacwest.com (Everett M. Greene) writes:


    > |> > |> If you're talking about separate stacks for return addresses and data,
    > |> > |> there are C compilers extant using that method under the hood.
    > |> >
    > |> > No, we aren't.
    > |>
    > |> Are you going to give at least a hint as to what you are talking about
    > |> then?
    >
    > Read the earlier postings in the thread. All will be explained.


    It wasn't in the postings received here.

  16. Re: Direction of Stack Growth

    Nick Maclaren wrote:

    > In article <4721a047$0$3492$39db0f71@news.song.fi>,
    > Niklas Holsti writes:


    > |> says: "The secondary stack is used to deal with functions that
    > |> return a variable sized result, for example a function returning an
    > |> unconstrained String. ... For most targets, the secondary stack is
    > |> growing on demand and is allocated as a chain of blocks in the heap".
    >
    > And that is where they have made their mistake. Where at all possible,
    > it should be allocated as a separate segment. There are two reasons


    Nick, I would rather understand it differently - the GNAT stack
    is not completely contiguous but contains a linked chain of big
    chunks, later suballocated in stack fashion. This involves
    some work, but has some of the heap's flexibility. Your comment
    assumes that every stack object is independently allocated.
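
    In C, such a chunked secondary stack might look like this (my guess at
    the general shape, not GNAT's actual code; sec_alloc and the 4096-byte
    default chunk are invented):

        #include <stdlib.h>

        typedef struct Chunk {
            struct Chunk *prev;      /* chain of big heap-allocated chunks */
            size_t used, size;
            char data[];             /* suballocated in stack fashion      */
        } Chunk;

        static Chunk *cur = NULL;

        static void *sec_alloc(size_t n)
        {
            if (!cur || cur->used + n > cur->size) { /* grow by one chunk  */
                size_t sz = n > 4096 ? n : 4096;
                Chunk *c = malloc(sizeof *c + sz);
                if (!c) abort();
                c->prev = cur; c->used = 0; c->size = sz;
                cur = c;
            }
            void *p = cur->data + cur->used;         /* else: pointer bump */
            cur->used += n;
            return p;
        }

        int main(void)
        {
            char *a = sec_alloc(100);   /* fits in the first chunk         */
            char *b = sec_alloc(8000);  /* too big: gets a chunk to itself */
            return a && b ? 0 : 1;
        }

    The heap is touched only when a chunk fills up, and cutting the stack
    back stays cheap (reset used to a saved mark, popping whole chunks as
    needed), so most of the stack discipline survives.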

    Archi

  17. Re: Direction of Stack Growth


    In article ,
    Archi writes:
    |> > In article <4721a047$0$3492$39db0f71@news.song.fi>,
    |> > Niklas Holsti writes:
    |>
    |> > |> says: "The secondary stack is used to deal with functions that
    |> > |> return a variable sized result, for example a function returning an
    |> > |> unconstrained String. ... For most targets, the secondary stack is
    |> > |> growing on demand and is allocated as a chain of blocks in the heap".
    |> >
    |> > And that is where they have made their mistake. Where at all possible,
    |> > it should be allocated as a separate segment. There are two reasons
    |>
    |> Nick, I would rather understand it differently - the GNAT stack
    |> is not completely contiguous but contains a linked chain of big
    |> chunks, later suballocated in stack fashion. This involves
    |> some work, but has some of the heap's flexibility. Your comment
    |> assumes that every stack object is independently allocated.

    Nope. I stand by what I said. Of course, I quite agree that no
    such decisions are black and white, but only shades of grey. There
    are doubtless good reasons for their decision - indeed, I am pretty
    sure that I know what they are, because I have implemented stacks
    like that myself.

    But, where we are at present (and where we could have foreseen we were
    going even a decade ago), the correct solution is a separate
    segment. Note that it wasn't the correct decision a decade ago,
    because memory was too tight, but it was clear that (a) we were
    going to 64 bits and (b) cache locality was going to become THE
    memory performance issue.


    Regards,
    Nick Maclaren.
