This page briefly describes the projects that constitute CSCAPES and points to the individual project-websites for more information. Some of these projects predate CSCAPES and involve many colleagues in addition to current members of CSCAPES.

Partitioning, Load Balancing, and Ordering
Automatic Differentiation
Coloring in Scientific Computing
Matching in Scientific Computing
Performance Improvement via Runtime Reordering
Multiscale Algorithms for Combinatorial Optimization

Partitioning, Load Balancing, and Ordering

In parallel computation, data and tasks need to be partitioned among processors such that workload is balanced and communication cost is low. In dynamic or adaptive applications, these goals must be met as computation and communication change over time, and the problem is then referred to as dynamic load balancing (or repartitioning). The load balancing problem is pervasive in parallel scientific computing, and is critical for achieving high performance in many SciDAC applications including structural mechanics, chemical engineering, groundwater flow, biological systems, electronic circuit simulations, and molecular dynamics.

CSCAPES members are developing new static as well as dynamic load balancing algorithms and extending the capabilities of Zoltan to support petascale applications. Zoltan is a software toolkit for load balancing and parallel data management developed mainly at Sandia National Laboratories. It contains implementations of several different kinds of algorithms for load balancing. Roughly, the algorithms can be divided into two groups: geometric- and connectivity-based methods. The geometric methods use geometric coordinates to partition space into regions, while the connectivity methods use a graph or a hypergraph to represent data and their dependencies. Geometric partitioning algorithms are fast, but graph/hypergraph algorithms typically reduce the communication volume more. CSCAPES members have shown that hypergraph models are more expressive than graph models for load balancing, and can model communication costs more accurately. A hypergraph-based dynamic repartitioning method with superior performance compared to earlier methods has recently been added to Zoltan; the work earned the Best Algorithms Paper Award at IPDPS07. Zoltan also supports third-party libraries (ParMetis and PT- Scotch) for partitioning and ordering. Zoltan recently became available as a package in the Trilinos framework. The package Isorropia provides a tight integration between Zoltan and the rest of Trilinos, and supports matrix-based partitioning, ordering, and coloring.

Representative presentations:
Partitioning, Load Balancing, and Ordering for Petascale Applications
Dynamic Load Balancing (Repartitioning) and Matrix Partitioning
Tutorial: The Zoltan Toolkit
All three presented at the CSCAPES Workshop in Santa Fe, NM, in June 2008.

Project website: Load Balancing (Zoltan)

Automatic Differentiation (AD)

AD is a technology for transforming subprograms that compute some mathematical function into subprograms that compute the derivatives of that function. The resulting derivatives are used for uncertainty quantification, optimization algorithms, nonlinear solvers for discretized differential equations, and solution of inverse problems using nonlinear least squares. AD techniques combine rules for differentiating the functions intrinsic to a given programming language with strategies for applying the chain rule while respecting the control flow of the original program.

The computation of a function and its derivatives via the chain rule can be represented by a directed acyclic graph (DAG); vertices of the DAG represent variables or intermediate expressions, and the edge weights correspond to partial derivatives. The associativity of the chain rule of differential calculus leads to exponentially many possible "modes," sequences for combining intermediate operands to compute partial derivatives. This choice of modes can be used to reduce the number of operations and storage needed for computing the partial derivatives. Choosing a mode can be interpreted as selecting an order in which vertices and edges are "eliminated" (corresponding to partial evaluation of the associated variables) on the DAG. Finding an optimal mode, one that minimizes the number of operations, for evaluating a Jacobian matrix is intractable (NP-complete). In practice, heuristics are used to identify accumulation strategies with low operations counts and/or storage requirements. CSCAPES researchers are pursuing new combinatorial heuristics to further reduce the cost of gradient, Jacobian, and Hessian computations. Coloring algorithms are also being used to reduce the computational effort associated with evaluating sparse Jacobians and Hessians. Checkpointing in irregular computation is another source of combinatorial problems in AD. CSCAPES is developing efficient algorithms for these problems, and is extending the capabilities of the Argonne-housed AD tools ADIC and ADIFOR within the OpenAD framework.

Representative presentation:
Combinatorial Aspects of Automatic Differentiation
A tutorial presented at the CSCAPES Workshop in Santa Fe, NM, in June 2008.

Project websites: Computational Differentiation at Argonne and OpenAD

Graph Coloring in Scientific Computing

Graph coloring models of various kinds have proved to be powerful abstractions for minimizing the execution time, memory, and storage space needed in computing sparse derivative matrices using automatic differentiation. Depending on the structure of the derivative matrix of interest---whether it is Jacobian (nonsymmentric) or Hessian (symmetric)---and the specifics of the computational techniques employed, as many as five variants of coloring problems exist. Graph coloring models are also useful in discovering concurrency in parallel scientific computation; examples include iterative methods for sparse linear systems, adaptive mesh refinement, preconditioning, and sparse tiling.

CSCAPES researchers are developing novel sequential as well as parallel algorithms for a variety of coloring problems. ColPack, a software package consisting of implementations of sequential algorithms for coloring and related problems in sparse derivative computation, has been released recently. Implementations of the parallel algorithms are being deployed via Zoltan.

Representative presentation:
Graph Coloring in Parallel Processing and Scientific Computing
Presented at the CSCAPES Workshop in Santa Fe, NM, in June 2008.

Project websites: Coloring. See also Zoltan for parallel coloring.

Matching in Scientific Computing

Various kinds of matchings in graphs are needed in several scientific or high-performance computing contexts. Examples include sparse linear systems (numerical preprocessing and block triangular decomposition) and graph partitioning (coarsening phase of multilevel methods). Traditional matching algorithms compute optimal solutions in superlinear time, but current trends are toward algorithms that find approximate solutions faster. CSCAPES is developing parallel matching algorithms based on approximation techniques.

Representative presentation:
A Parallel Half-Approximation Algorithm for the Weighted Matching Problem
Presented at a CSCAPES Seminar in Sept 2008.

Project website: Matching

Performance Improvement via Runtime Reordering

Modern microprocessors are highly sensitive to spatial and temporal locality of data. In a mesh smoothing application, for example, reordering the vertices and elements of a mesh can lead to a significant improvement in single processor performance. CSCAPES is developing several data and iteration reordering algorithms based on hypergraph models.

Representative presentation:
Parallelizing Gauss-Seidel Using Full Sparse Tiling
Presented at a CSCAPES Seminar in May 2007.

Project website : Performance Improvement

Multiscale Algorithms for Combinatorial Optimization

Originally invented for solving elliptic partial differential equations, the Multiscale method is currently receiving increasing attention as a framework for solving combinatorial optimization problems. In a generic sense, a multiscale algorithm attacks a problem by creating a hierarchy of successively "coarser" problems (each of which represents the original problem but has successively fewer degrees of freedom), solving the problem at the coarsest level (using a method that would have been too expensive or infeasible on the original problem), and projecting the solution back to the original problem with possible refinements along the way. One class of the graph partitioning algorithms in Zoltan falls under the Multiscale framework. Investigating the prospects of this framework for the design of algorithms for other graph problems is an activity within CSCAPES.

Project website: Mutiscale Algorithms