Integer multiply performance of UltraSPARC chips - SUN


  1. Integer multiply performance of UltraSPARC chips

    There are a few very negative comments about the performance of
    UltraSPARC processors on the pages of the GNU multi precision library
    (GMP), which is used for multiplying large integers.

    On this page, which has some benchmarks:
    http://www.swox.com/gmp/gmpbench.html

    it says "UltraSPARC 3's terrible scores are a result of its uniquely
    poor integer multiply support (unsuitable architectural support +
    simplistic integer multiply implementation)."

    Then on this page, which discusses the performance of 32-bit vs 64-bit
    processors for computing with very large integers (>> 64 bits):

    http://www.swox.com/gmp/32vs64.html
    it says:

    "Now, UltraSPARC is a particularly poor example for showing the
    superiority of 64-bit processors for this problem domain, since this
    processor has a uniquely poor instruction set for bignum operations."

    Certainly the GMP benchmark results are very poor on UltraSPARC
    processors, but I wonder how much of this is due to a misunderstanding
    on the part of the GMP developers. I would have expected the use of
    64-bit instructions to improve performance considerably in this task,
    but on my Ultra 80 the gains are only a few percent.

    Looking at the source distribution
    http://ftp.sunet.se/pub/gnu/gmp/gmp-4.1.4.tar.gz

    there is a README (in the directory gmp-4.1.4/mpn/sparc64) that again
    has some very negative comments about the chips.

    I'd be interested to hear from anyone who knows more about the chips.
    If the assembler routines are badly broken, perhaps they could let
    the GMP developers know.

    That library is used as part of some expensive commercial software -
    Mathematica being one example.

    --
    Dave K

    http://www.southminster-branch-line.org.uk/

    Please note my email address changes periodically to avoid spam.
    It is always of the form: month-year@domain. Hitting reply will work
    for a couple of months only. Later set it manually. The month is
    always written in 3 letters (e.g. Jan, not January etc)

  2. Re: Integer multiply performance of UltraSPARC chips

    Dave writes:

    >"Now, UltraSPARC is a particularly poor example for showing the
    >superiority of 64-bit processors for this problem domain, since this
    >processor has a uniquely poor instruction set for bignum operations."


    I think they're complaining about the lack of a 64x64->128 bit multiply.
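
    To make the complaint concrete: without an instruction that returns
    the high half of a 64x64 product, the full 128-bit result has to be
    assembled from four smaller multiplies plus carry handling. The
    following is only a sketch of that general technique in portable C,
    not GMP's code; the struct and function names are made up for
    illustration.

```c
#include <stdint.h>

/* Hypothetical result type: the two 64-bit halves of a 128-bit product. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_parts;

/* Hypothetical helper: full 64x64->128 product using only multiplies
   whose results fit in 64 bits (32x32->64 partial products). */
u128_parts mul_64x64_128(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    /* Four partial products, each at most (2^32 - 1)^2 < 2^64. */
    uint64_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint64_t p1 = a_lo * b_hi;   /* weight 2^32 */
    uint64_t p2 = a_hi * b_lo;   /* weight 2^32 */
    uint64_t p3 = a_hi * b_hi;   /* weight 2^64 */

    /* Sum everything that lands in bits 32..63; this is at most
       3 * (2^32 - 1), so it cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    u128_parts r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

    For example, mul_64x64_128(UINT64_MAX, UINT64_MAX) gives
    hi = 0xFFFFFFFFFFFFFFFE and lo = 1. A bignum inner loop does this
    for every pair of limbs, so four multiplies and the extra adds per
    limb product add up quickly.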

    Casper
    --
    Expressed in this posting are my opinions. They are in no way related
    to opinions held by my employer, Sun Microsystems.
    Statements on Sun products included here are not gospel and may
    be fiction rather than truth.

  3. Re: Integer multiply performance of UltraSPARC chips

    Casper H.S. Dik wrote:
    > Dave writes:
    >
    >>"Now, UltraSPARC is a particularly poor example for showing the
    >>superiority of 64-bit processors for this problem domain, since this
    >>processor has a uniquely poor instruction set for bignum operations."
    >
    > I think they're complaining about the lack of a 64x64->128 bit multiply.
    >
    > Casper


    If so, it is far from the only gripe of the GMP developers, who at one
    point use floating-point instructions rather than integer ones! They
    also talk about "Integer conditional move instructions". I really
    should not have put the word "multiply" in the subject line, but I
    will not change it now.

    Here are some comments from one of the README files, with a note that
    the UltraSPARC 3 is slower (I assume compared to UltraSPARC 2).

    -- From gmp-4.1.4/mpn/sparc64/README --
    The 64-bit integer multiply instruction mulx takes from 5 cycles to 35
    cycles, depending on the position of the most significant bit of the
    first source operand. When used for 32x32->64 multiplication, it needs
    20 cycles. Furthermore, it stalls the processor while executing. We
    stay away from that instruction, and instead use floating-point operations.

    Integer conditional move instructions cannot dual-issue with other
    integer instructions. No conditional move can issue 1-5 cycles after a
    load. (Or something such bizarre.) Useless.

    The UltraSPARC-3 pipeline seems similar, but is somewhat more rigid.
    Branches execute slower, and there may be other new stalls. Integer
    multiply doesn't halt the CPU and also has a much lower latency. But
    it's still not pipelined, and thus useless for our needs.
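
    As a rough idea of what "use floating-point operations" can mean
    here: a double has a 53-bit mantissa, so a product of a 32-bit value
    and a 16-bit value (under 2^48) is computed exactly by the FPU. The
    sketch below is my own illustration of that idea in C, not the
    actual GMP sparc64 assembler; the function name is made up.

```c
#include <stdint.h>

/* Hypothetical helper: 32x32->64 multiply done with floating-point
   multiplies instead of an integer multiply instruction. */
uint64_t mul_32x32_64_fp(uint32_t a, uint32_t b)
{
    double da   = (double)a;               /* exact: a < 2^32          */
    double b_lo = (double)(b & 0xFFFFu);   /* low 16 bits of b         */
    double b_hi = (double)(b >> 16);       /* high 16 bits of b        */

    /* Each product is below 2^48, so it fits in the 53-bit mantissa
       and the conversion back to an integer loses nothing. */
    uint64_t p_lo = (uint64_t)(da * b_lo);
    uint64_t p_hi = (uint64_t)(da * b_hi);

    /* Recombine: a*b = a*b_lo + (a*b_hi << 16), which is below 2^64. */
    return p_lo + (p_hi << 16);
}
```

    The point of doing it this way would be that the floating-point
    multiplier is pipelined, so several such products can be in flight
    at once, whereas the README says the integer multiply is not.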

    --
    Dave K
