One of the lesser known "secret sauces" of Oracle Solaris Studio is perhaps one of its easiest-to-use and highest performance components: Performance Library (what we commonly call Perflib). Sun Performance Library is a set of optimized, high-speed mathematical subroutines for solving linear algebra and other numerically intensive problems. Sun Performance Library is based on a collection of public domain applications available from Netlib. Sun has enhanced these public domain applications and bundled them as the Sun Performance Library. Sun ensures that the performance of each routine is optimal for the underlying hardware and that the routines are parallelized to take advantage of multiple cores.

If words like BLAS, LAPACK, FFTPACK, SuperLU, ScaLAPACK, SparseBLAS and SPSOLVE get you excited or at least curious, read on. For the rest of you, there are only a couple of headliners I'd like you to remember:
  • Sun Performance Library comes optimized for every Sun HW platform. This means there are optimized versions for the V8, sparcvis, sparcvis2, and sparcfmaf architectures on the SPARC side, and there are also optimized versions for x86/x64 architectures: AMD/Opteron, AMD/Barcelona and Intel/Xeon.
  • Sun Performance Library works on Solaris SPARC, Solaris x86/x64, OEL, RedHat and SuSE.
  • These highly optimized versions are hand-tuned for the best performance. That means linking into these routines will automagically give you scalability across multiple cores and the best possible performance on each HW brand you could be running.
  • Scalability across multiple cores is automatically guaranteed by the parallelized routines, which means code can automatically scale up on newer machines without having to parallelize code by hand (a very tedious task, in most cases).
Of course, these advantages apply only to numeric codes that can take advantage of these popular routines.
For the die-hards who want to know more, here is a classification of the kind of Linear Algebra and Numerical solvers that are part of Perflib:
  • Elementary vector and matrix operations - Vector and matrix products; plane rotations; 1-, 2-, and infinity-norms; rank-1, 2, k, and 2k updates
  • Linear systems - Solve full-rank systems, compute error bounds, solve Sylvester equations, refine a computed solution, equilibrate a coefficient matrix
  • Least squares - Full-rank, generalized linear regression, rank-deficient, linear equality constrained
  • Eigenproblems - Eigenvalues, generalized eigenvalues, eigenvectors, generalized eigenvectors, Schur vectors, generalized Schur vectors
  • Matrix factorizations or decompositions - SVD, generalized SVD, QL and LQ, QR and RQ, Cholesky, LU, Schur
  • Support operations - Condition number, in-place or out-of-place transpose, inverse, determinant, inertia
  • Sparse matrices - Solve symmetric, structurally symmetric, and unsymmetric coefficient matrices using direct methods and a choice of fill-reducing ordering algorithms, and user-specified orderings
  • Convolution and correlation in one and two dimensions
  • Fast Fourier transforms, Fourier synthesis, cosine and quarter-wave cosine transforms, cosine and quarter-wave sine transforms
  • Complex vector FFTs and FFTs in two and three dimensions
  • Interval BLAS routines
  • Sorting operations
See a complete list of routines here.
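
For readers less familiar with the vocabulary in the first bullet, here is a minimal pure-Python sketch of what the 1-, 2-, and infinity-norms and a rank-1 update compute. It is not Perflib code and makes no attempt at performance; the function names (norm1, norm2, norm_inf, rank1_update) are hypothetical and chosen only for illustration.

```python
import math


def norm1(x):
    """1-norm: sum of the absolute values of the entries."""
    return sum(abs(v) for v in x)


def norm2(x):
    """2-norm: Euclidean length of the vector."""
    return math.sqrt(sum(v * v for v in x))


def norm_inf(x):
    """Infinity-norm: largest absolute value among the entries."""
    return max(abs(v) for v in x)


def rank1_update(alpha, x, y, a):
    """Return A + alpha * x * y^T, with A stored as a list of rows."""
    return [[a[i][j] + alpha * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]


if __name__ == "__main__":
    x = [3.0, -4.0]
    print(norm1(x), norm2(x), norm_inf(x))  # 7.0 5.0 4.0
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(rank1_update(2.0, x, [1.0, 1.0], identity))
```

A tuned library performs the same arithmetic, but with blocking, vector instructions and threading chosen for the hardware, which is where the speedup comes from.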

Taking full advantage of the increased accuracy, performance and parallelism of these routines often requires code change. However, in many cases, such code change can result in more readable code as well (here is a good example of that). The performance improvements are often dramatic, and well worth the time taken to change code to take advantage of these routines.
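
To make the readability point concrete, here is a small sketch, in plain Python rather than against Perflib itself, of the kind of rewrite involved. The hypothetical gemm function stands in for a BLAS-style matrix-multiply routine, which computes alpha*A*B + beta*C; the two residual functions are likewise illustrative names. A hand-written triple loop that computes the residual B - A*X collapses into a single call.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * A * B + beta * C for matrices stored as lists of rows.

    Stand-in for a BLAS-style GEMM routine (hypothetical helper, not Perflib).
    """
    m, k, n = len(a), len(b), len(b[0])
    return [[alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
             for j in range(n)]
            for i in range(m)]


def residual_by_hand(a, x, b):
    """Compute R = B - A * X with explicit loops."""
    m, k, n = len(a), len(x), len(x[0])
    r = [row[:] for row in b]  # copy B so the input is not modified
    for i in range(m):
        for j in range(n):
            for p in range(k):
                r[i][j] -= a[i][p] * x[p][j]
    return r


def residual_with_routine(a, x, b):
    """Same result expressed as one GEMM-style call: R = -1 * A * X + 1 * B."""
    return gemm(-1.0, a, x, 1.0, b)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    x = [[1.0], [1.0]]
    b = [[4.0], [6.0]]
    print(residual_by_hand(a, x, b))
    print(residual_with_routine(a, x, b))
```

In real code the single call would go to the library's tuned routine, so the shorter version is also the one that picks up the parallelism described above.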

Want to know more? There are several places you can look:
