Efficient Parallelization of a Three-Dimensional Navier-Stokes Solver on MIMD Multiprocessors
Basic Results in Automatic Transformations of Shared Memory Parallel Programs into Sequential Programs
A Parallel Adaptive Gauss-Jordan Algorithm
Improving Memory Traffic by Assembly-Level Exploitation of Reuses for Vector Registers
Efficient Address Generation for Affine Subscripts in Data-Parallel Programs