Please do not use the previously posted routine; it missed one corner
case, where num_bits_word(d) = 32.

Here is a revised routine that passes (cd test; make bntest).
All I had to do was add one more instruction to the routine.

Please test on your ppc32 machines.

Once we are all happy, it's a matter of adding the core dump at the
beginning.
Thus you have a fast, easy to understand, predictable bn_div_words, as
opposed to that monster in 0.9.8.

#
# Handcrafted version of bn_div_words
#
# r3 = h
# r4 = l
# r5 = d

cmplwi 0,r5,0 # compare r5 and 0
bc BO_IF_NOT,CR0_EQ,.Lppcasm_div1 # proceed if d != 0
li r3,-1 # d = 0, return -1
bclr BO_ALWAYS,CR0_LT
.Lppcasm_div1:
cmplwi 0,r3,0 # compare r3 and 0
bc BO_IF_NOT,CR0_EQ,.Lppcasm_div2 # proceed if h != 0
divwu r3,r4,r5 # ret_q = l/d
bclr BO_ALWAYS,CR0_LT # return result in r3
.Lppcasm_div2:
divwu r9,r3,r5 # i_q = h/d
mullw r10,r9,r5 # i_r = h - (i_q*d)
subf r10,r10,r3
mr r3,r9 # ret_q = i_q
.Lppcasm_set_ctr:
li r12,32 # ctr = bitsizeof(d)
mtctr r12
.Lppcasm_div_loop:
addc r4,r4,r4 # l = l << 1 -> i_carry
adde r11,r10,r10 # i_h = (i_r << 1) | i_carry
divwu r9,r11,r5 # i_q = i_h/d
addze r9,r9 # very important! - DKWH
mullw r10,r9,r5 # i_r = i_h - (i_q*d)
subf r10,r10,r11
add r3,r3,r3 # ret_q = ret_q << 1 | i_q
add r3,r3,r9
bc BO_dCTR_NZERO,CR0_EQ,.Lppcasm_div_loop
.Lppc_div_end:
bclr BO_ALWAYS,CR0_LT # return result in r3
.long 0x00000000
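
For anyone who wants to sanity-check the loop without a ppc32 box, here is
a rough C model of the routine as I read it. It is only a sketch: the name
div_words_model is made up for illustration and is not part of OpenSSL, and
it assumes the carry out of the adde is what the addze folds back into the
quotient bit (the case where d has its top bit set).

```c
#include <stdint.h>

/* Hypothetical C model of the assembly above; not OpenSSL code. */
uint32_t div_words_model(uint32_t h, uint32_t l, uint32_t d)
{
    if (d == 0)
        return 0xFFFFFFFFu;               /* li r3,-1 */
    if (h == 0)
        return l / d;                     /* single divwu is enough */

    uint32_t q = h / d;                   /* ret_q = i_q = h/d */
    uint32_t r = h - q * d;               /* i_r = h - (i_q*d) */

    for (int i = 0; i < 32; i++) {
        uint32_t carry_l = l >> 31;       /* addc: bit shifted out of l */
        l <<= 1;
        uint32_t carry_r = r >> 31;       /* adde: bit shifted out of i_r */
        uint32_t ih = (r << 1) | carry_l; /* i_h = (i_r << 1) | i_carry */
        uint32_t iq = ih / d + carry_r;   /* divwu, then addze */
        r = ih - iq * d;                  /* mullw + subf, modulo 2^32 */
        q = (q << 1) + iq;                /* ret_q = ret_q << 1 | i_q */
    }
    return q;
}
```

When h >= d the true quotient does not fit in 32 bits; in this model the
initial h/d is shifted out over the 32 iterations, so only the low 32 bits
of the quotient come back.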


Regards,
David


On 7/5/05, Peter Waltenberg wrote:
>
> Thanks for finding and fixing this. Particularly for finding and fixing it
> before 0.9.8 hit the streets.
>
> Peter
>
> Peter Waltenberg
> Architect
> IBM Crypto for C Team
> IBM/Tivoli Gold Coast Office
>
> Andy Polyakov
> Sent by: owner-openssl-dev@openssl.org
> 06/07/2005 07:49 AM
> Please respond to: openssl-dev
>
> To: openssl-dev@openssl.org
> cc: linuxppc-embedded@ozlabs.org
> Subject: Re: PPC bn_div_words routine rewrite
>
> > Okay, having actually done what Andy suggested, i.e. the one liner fix
> > in the assembly code, bn_div_words returns the correct results.
>
> Note that the final version, one committed to all relevant OpenSSL
> branches since couple of days ago and one which actually made to just
> released 0.9.8, is a bit different from originally suggested one-line
> fix, see for example
> http://cvs.openssl.org/chngview?cn=14199.
>
> > At this point, my conclusion is, up to openssl-0.9.8-beta6, the ppc32
> > bn_div_words routine generated from crypto/bn/ppc.pl is still busted.
>
> Yes. Though it should be noted that 0.9.8 was inadvertently avoiding the
> bug condition. Recall that original problem report was for 0.9.7.
>
> > Why do you signal an overflow condition when it appears functions that
> > call bn_div_words do not check for overflow conditions?
>
> That's question to IBM. By the time they submitted the code, I've
> explicitly asked what would be appropriate way to generate *fatal*
> condition at that point, i.e. one which would result in a core dump, and
> it came out as division by 0 instruction. By that time I had no access
> to any PPC machine and had to just go with it. Now it actually came as
> surprise that division by 0 does not raise an exception, but silently
> returns implementation-specific value... A.
> ______________________________________________________________________
> OpenSSL Project http://www.openssl.org
> Development Mailing List openssl-dev@openssl.org
> Automated List Manager majordomo@openssl.org
