Using altivec - Powerpc

  1. Using altivec

    Hi,
    Does code have to explicitly use libraries to make use of the altivec unit
    in the PowerPC970 or are there compilers that can send some instructions to
    this unit to improve throughput (eg. 16 single byte additions in 1
    dispatch)?

    Thanks for any help,
    --
    Raymond.



  2. Re: Using altivec

    "Raymond Martin" writes:

    > Hi,
    > Does code have to explicitly use libraries to make use of the altivec unit
    > in the PowerPC970 or are there compilers that can send some instructions to
    > this unit to improve throughput (eg. 16 single byte additions in 1
    > dispatch)?


    Hi,

    gcc has -maltivec and -mabi=altivec switches, but when I tested them on a
    simple program they made no difference in the resulting executable
    (tested with diff(1)). This may mean either that altivec is always
    used or never used... I don't know!

    --
    Stefano | Department of Psychology, University of Bologna
    Ghirlanda | Interdisciplinary cultural research, Stockholm University
    http://www.intercult.su.se/~stefano

  3. Re: Using altivec

    Thanks for replying, I don't have access to a 970 but I'm writing a report
    on the processor and need to try and find detail on the compiler
    translation. I think a quick compilation with -S and a check of the
    assembly should determine if altivec instructions are being used though if
    you're interested.

    --
    Raymond.



  4. Re: Using altivec

    "Raymond Martin" writes:

    > Thanks for replying, I don't have access to a 970 but I'm writing a
    > report on the processor and need to try and find detail on the
    > compiler translation.


    I overlooked that you are looking for info about a specific
    processor. I seem to have a 7455, not 970.

    > I think a quick compilation with -S and a check of the assembly
    > should determine if altivec instructions are being used though if
    > you're interested.


    I don't know enough. I include below the .s file so you can have a
    look yourself, if you are interested. Note that I don't know whether
    altivec should be used with the code, perhaps not! The test was a very
    quick test prompted by your post. The code is split in stdnorm.h and
    test-stdnorm.c:

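    /* stdnorm.h */
    #include <stdlib.h>   /* drand48 */
    #define INVSQRT2 (1./1.4142135623730951455)
    /* Approximate standard normal deviate: the sum of 24 uniform(0,1) draws
       has mean 12 and variance 2, so centre it and divide by sqrt(2). */
    double stdnorm(void) {
        double x = 0.;
        int i=24;
        while( i-- ) x += drand48();
        return (x-12.)*INVSQRT2;
    }

    /* test-stdnorm.c */
    #include <stdio.h>    /* printf */
    #include <time.h>     /* time */
    #include "stdnorm.h"  /* stdnorm; also pulls in stdlib.h for srand48 */
    int main(void) {
        long i=50000;
        srand48( time(0) );   /* seed the generator from the clock */
        while( i-- ) {
            printf( "%f\n", stdnorm() );
        }
        return 0;
    }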

    Assembler code:

    .file "test-stdnorm.c"
    .section .rodata
    .align 3
    .LC0:
    .long 1072079006
    .long 1719614412
    .align 3
    .LC1:
    .long 1076363264
    .long 0
    .section ".text"
    .align 2
    .globl stdnorm
    .type stdnorm,@function
    stdnorm:
    stwu 1,-48(1)
    mflr 0
    stw 31,44(1)
    stw 0,52(1)
    mr 31,1
    li 9,0
    li 10,0
    stw 9,8(31)
    stw 10,12(31)
    li 0,24
    stw 0,16(31)
    .L2:
    lwz 9,16(31)
    addi 0,9,-1
    mr 9,0
    stw 9,16(31)
    li 0,-1
    cmpw 0,9,0
    bne 0,.L4
    b .L3
    .L4:
    bl drand48
    fmr 0,1
    lfd 13,8(31)
    fadd 0,13,0
    stfd 0,8(31)
    b .L2
    .L3:
    lfd 13,8(31)
    lis 9,.LC1@ha
    la 9,.LC1@l(9)
    lfd 0,0(9)
    fsub 0,13,0
    lis 9,.LC0@ha
    lfd 13,.LC0@l(9)
    fmul 0,0,13
    fmr 1,0
    lwz 11,0(1)
    lwz 0,4(11)
    mtlr 0
    lwz 31,-4(11)
    mr 1,11
    blr
    .Lfe1:
    .size stdnorm,.Lfe1-stdnorm
    .section .rodata
    .align 2
    .LC2:
    .string "%f\n"
    .section ".text"
    .align 2
    .globl main
    .type main,@function
    main:
    stwu 1,-32(1)
    mflr 0
    stw 31,28(1)
    stw 0,36(1)
    mr 31,1
    li 0,0
    ori 0,0,50000
    stw 0,8(31)
    li 3,0
    bl time
    mr 0,3
    mr 3,0
    bl srand48
    .L6:
    lwz 9,8(31)
    addi 0,9,-1
    mr 9,0
    stw 9,8(31)
    li 0,-1
    cmpw 0,9,0
    bne 0,.L8
    b .L7
    .L8:
    bl stdnorm
    fmr 0,1
    lis 9,.LC2@ha
    la 3,.LC2@l(9)
    fmr 1,0
    creqv 6,6,6
    bl printf
    b .L6
    .L7:
    li 0,0
    mr 3,0
    lwz 11,0(1)
    lwz 0,4(11)
    mtlr 0
    lwz 31,-4(11)
    mr 1,11
    blr
    .Lfe2:
    .size main,.Lfe2-main
    .ident "GCC: (GNU) 3.2.3"


    --
    Stefano | Department of Psychology, University of Bologna
    Ghirlanda | Interdisciplinary cultural research, Stockholm University
    http://www.intercult.su.se/~stefano

  5. Re: Using altivec

    Thanks for posting the code; there are no altivec instructions being used.
    I've done more reading and I think the only way to use altivec instructions
    is by using the libraries that call them or direct assembly code. I haven't
    found any evidence of an optimising compiler that makes use of altivec "on
    the fly", which is probably a good thing in many cases since the unit
    doesn't throw exceptions.

    Also, altivec doesn't support double precision, so some of the instructions
    won't be suitable for that unit. I'm not sure whether some of the
    integer operations inside the loops could be offloaded to it to ease the
    bottleneck on the load-store/integer unit queues (the 2 integer units and 2
    load-store units share 2 queues going into the 4 units, so
    integer/load-stores reduce the grouping of instructions). This is the first
    PPC assembly I've looked at ever so I'm not the best person to advise on
    optimising atm.

    --
    Raymond.



  6. Re: Using altivec

    "Raymond Martin" writes:

    > This is the first PPC assembly I've looked at ever so I'm not the
    > best person to advise on optimising atm


    I have never worked with assembly, so I'm the last person to consult!
    Good luck with your work, and do post on this group if you find
    relevant information.

    --
    Stefano | Department of Psychology, University of Bologna
    Ghirlanda | Interdisciplinary cultural research, Stockholm University
    http://www.intercult.su.se/~stefano
