Re: Which CPUTYPE for a dualcore Xeon on AMD64
In <20070626195031.GA29545@cons.org>, Martin Cracauer typed:
> Martin Turgeon wrote on Sun, Jun 24, 2007 at 07:32:22PM -0400:
> > Hi,
> > I recently installed AMD64 6.2 Release on 2 PowerEdge servers, both with
> > dual core Xeon (3070 and 5110).
> I extensively benchmarked different compiler options on Xeon 5160 (3.0
> GHz Core2) with gcc-4.1.2 and gcc-4.2.
Using what benchmark? That makes a *lot* of difference.
> The result was within a percent of all highly tuned CPU-specific
> options like -march=k8 -msse3 -mfpmath=sse -ffast-math, and I went
> through most iterations. This means that locking your code to one
> x86_64 implementation and locking out either AMD or Intel is not worth
> the trouble.
I don't think you've reached the correct conclusions. In particular,
note that doing -mtune instead of -march won't lock you to a specific
CPU, but will instead choose instructions/sequences optimized for your
CPU. So it's a minor win with no downside.
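One way to check the -mtune / -march difference yourself is to ask the compiler which instruction-set macros it defines for a given set of flags. The following is only a minimal sketch: it assumes GCC's predefined macros `__SSE2__`, `__SSE3__` and `__3dNOW__`, and the helper name `isa_summary` and the `HAVE_*` macros are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Map GCC's predefined ISA macros to 0/1 so they can be printed.
 * The HAVE_* names are hypothetical, not part of any real header. */
#ifdef __SSE2__
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

#ifdef __SSE3__
#define HAVE_SSE3 1
#else
#define HAVE_SSE3 0
#endif

#ifdef __3dNOW__
#define HAVE_3DNOW 1
#else
#define HAVE_3DNOW 0
#endif

/* Write a one-line summary of the instruction-set extensions this
 * file was compiled for into buf. Returns what snprintf returns. */
int isa_summary(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "sse2=%d sse3=%d 3dnow=%d",
                    HAVE_SSE2, HAVE_SSE3, HAVE_3DNOW);
}
```

Built with `-mtune=nocona` on amd64, this should still report sse3=0, because -mtune only changes instruction selection and scheduling within the baseline instruction set. Built with `-march=nocona`, it should report sse3=1, and that binary is then no longer guaranteed to run on a CPU without SSE3.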
With the x86_64 architecture, you have three choices: unset (x86 +
MMX/SSE/SSE2), nocona (Intel, with SSE3) and athlon64 (AMD, with
3DNow!). So changing your Xeon to nocona will just enable SSE3. The
SSE3 extensions are mostly things for doing "horizontal" computations
inside the SSE register file. So unless your benchmark was doing lots
of work on arrays of floats, it's unlikely you actually tested the
SSE3 extensions, in which case all you did was test -mtune. Without
testing the extra instructions, we don't know whether using them is
worth the trouble or not, and you didn't say what your test was.
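To make "horizontal" concrete, here is the sort of loop where an SSE3 instruction could be used at all. It is a sketch under assumptions, not anything from the benchmark being discussed: `sum_floats` is a hypothetical helper, the SSE3 path uses the `_mm_hadd_ps` intrinsic from `<pmmintrin.h>`, and it is only compiled in when the compiler defines `__SSE3__` (for example with -msse3 or -march=nocona). Otherwise the plain scalar loop does all the work.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __SSE3__
#include <pmmintrin.h>   /* SSE3 intrinsics, including _mm_hadd_ps */
#endif

/* Sum n floats. Hypothetical helper for illustration only. */
float sum_floats(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    float total = 0.0f;
    size_t i = 0;
#ifdef __SSE3__
    /* "Vertical" part: four independent running sums, one per lane. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(x + i));
    /* "Horizontal" part: add the lanes of one register together.
     * After the first hadd: (a0+a1, a2+a3, a0+a1, a2+a3).
     * After the second: every lane holds a0+a1+a2+a3. */
    acc = _mm_hadd_ps(acc, acc);
    acc = _mm_hadd_ps(acc, acc);
    total = _mm_cvtss_f32(acc);
#endif
    /* Scalar loop: handles the tail, or everything without SSE3. */
    for (; i < n; i++)
        total += x[i];
    return total;
}
```

Even in this example the SSE3 instruction only runs twice, at the end, so a benchmark would need a lot of this kind of reduction over float arrays before SSE3 itself could show up in the numbers.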
3DNow! is an alternative to SSE/SSE2 (and maybe SSE3), rather than an
extension of them. People who hack such things tell me it's much spiffier
than the SSE instructions, so possibly enabling it would cause those
instructions to be used instead of the SSE instructions the compiler
currently uses. But you didn't test this case, so we don't know how
much difference it would make, and hence whether or not it's worth
locking your code to AMD to get it.
Independent Network/Unix/Perforce consultant, email for more information.