CSC

Parallel graph coloring algorithms for distributed GPU environments
FROSch Preconditioners for Land Ice Simulations of Greenland and Antarctica
Mixed Precision in Trilinos
Mixed-Precision Schemes for Linear Algebra Kernels on GPUs
A Block-Based Triangle Counting Algorithm on Heterogeneous Environments
A survey of numerical methods utilizing mixed precision arithmetic
EXAGRAPH: Graph and combinatorial methods for enabling exascale applications
Kokkos Kernels: Performance Portable Sparse/Dense Linear Algebra and Graph Kernels
Performance-portable graph coarsening for efficient multilevel graph analysis
Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems
An algebraic sparsified nested dissection algorithm using low-rank approximations
Distributed Memory Graph Coloring Algorithms for Multiple GPUs
Performance portable supernode-based sparse triangular solver for manycore architectures
Preparing sparse solvers for exascale computing
Scalable asynchronous domain decomposition solvers
Scalable, multi-constraint, complex-objective graph partitioning
SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-enabled systems
A Parallel Graph Algorithm for Detecting Mesh Singularities in Distributed Memory Ice Sheet Simulations
A robust hierarchical solver for ill-conditioned systems with applications to ice sheet modeling
Geometric Mapping of Tasks to Processors on Parallel Computers with Mesh or Torus Networks
Linear algebra-based triangle counting via fine-grained tasking on heterogeneous environments (Update on Static Graph Challenge)
Scalable generation of graphs for benchmarking HPC community-detection algorithms
Scalable triangle counting on distributed-memory systems
A distributed-memory hierarchical solver for general sparse linear systems
Asynchronous one-level and two-level domain decomposition solvers
Fast triangle counting using Cilk
FROSch: a fast and robust overlapping Schwarz domain decomposition preconditioner based on Xpetra in Trilinos
Geometric partitioning and ordering strategies for task mapping on parallel computers
Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures
Tacho: memory-scalable task parallel sparse Cholesky factorization
Basker: Parallel sparse LU factorization utilizing hierarchical parallelism and data layouts
Distributed graph layout for scalable small-world network analysis
Fast linear algebra-based triangle counting with KokkosKernels
Partitioning trillion-edge graphs in minutes
Performance-portable sparse matrix-matrix multiplication for many-core architectures
A survey of direct methods for sparse linear systems
Basker: A threaded sparse LU factorization utilizing hierarchical parallelism and data layouts
Complex network partitioning using label propagation
Parallel graph coloring for manycore architectures