Siva Rajamanickam

LAPIS: A Performance Portable, High Productivity Compiler Framework

Brian Kelley, Sivasankaran Rajamanickam

Beyond Exascale: Dataflow Domain Translation on a Cerebras Cluster

Tomas Oppelstrup, Nicholas Giamblanco, Delyan Z Kalchev, Ilya Sharapov, Mark Taylor, Dirk Van Essendelft, Sivasankaran Rajamanickam, Michael James

Distributed Sparse Tensor Computations in MLIR

Miheer Vaidya, Shreya Singh, Devanshu Mantri, Michael Shannon Eydenberg, Brian Michael Kelley, Sivasankaran Rajamanickam, Atanas Rountev, P Sadayappan

Cello: Co-Designing Schedule and Hybrid Implicit/Explicit Buffer for Complex Tensor Reuse

Raveesh Garg, Michael Pellauer, Sivasankaran Rajamanickam, Tushar Krishna

Imperfect Recognition: A Study of OCR Limitations in the Context of Scientific Documents

Chinmay Sahasrabudhe, Yang Ho, Nick Winovich, Sivasankaran Rajamanickam

TenSQL: An SQL Database Built on GraphBLAS

Jon Roose, Miheer Vaidya, Ponnuswamy Sadayappan, Sivasankaran Rajamanickam

A Comparison of Spectral and Spatial Graph Convolutional Neural Network Kernels Using GraphSAGE-Sparse

Michael Eydenberg, Mark Plagge, Sivasankaran Rajamanickam

An Experimental Study of Two-level Schwarz Domain-Decomposition Preconditioners on GPUs

Ichitaro Yamazaki, Alexander Heinlein, Sivasankaran Rajamanickam

High-Performance GMRES Multi-Precision Benchmark: Design, Performance, and Challenges

Ichitaro Yamazaki, Christian Glusa, Jennifer Loe, Piotr Luszczek, Sivasankaran Rajamanickam, Jack Dongarra

Parallel, Portable Algorithms for Distance-2 Maximal Independent Set and Graph Coarsening

Brian Kelley, Sivasankaran Rajamanickam

Understanding the design-space of sparse/dense multiphase GNN dataflows on spatial accelerators

Raveesh Garg, Eric Qin, Francisco Munoz-Martinez, Robert Guirado, Akshay Jain, Sergi Abadal, Jose Abellan, Manuel Acacio, Eduard Alarcon, Sivasankaran Rajamanickam, Tushar Krishna

Concentric Spherical Neural Network for 3D Representation Learning

James Fox, Bo Zhao, Beatriz Gonzalez Del Rio, Sivasankaran Rajamanickam, Rampi Ramprasad, Le Song

Experimental evaluation of multiprecision strategies for GMRES on GPUs

Jennifer A Loe, Christian A Glusa, Ichitaro Yamazaki, Erik G Boman, Sivasankaran Rajamanickam

Extending Sparse Tensor Accelerators to Support Multiple Compression Formats

Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, Sudarshan Srinivasan, Dipankar Das, Gordon E Moon, Sivasankaran Rajamanickam, Tushar Krishna

Performance-portable graph coarsening for efficient multilevel graph analysis

Michael S Gilbert, Seher Acer, Erik G Boman, Kamesh Madduri, Sivasankaran Rajamanickam

Union: A unified HW-SW Co-Design ecosystem in MLIR for evaluating tensor operations on spatial accelerators

Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna

A Performance-Portable Nonhydrostatic Atmospheric Dycore for the Energy Exascale Earth System Model Running at Cloud-Resolving Resolutions

Luca Bertagna, Oksana Guba, Mark A Taylor, James G Foucar, Jeff Larkin, Andrew M Bradley, Sivasankaran Rajamanickam, Andrew G Salinger

ADELUS: A Performance-Portable Dense LU Solver for Distributed-Memory Hardware-Accelerated Systems

Vinh Q Dang, Joseph D Kotulski, Sivasankaran Rajamanickam

Distributed Memory Graph Coloring Algorithms for Multiple GPUs

Ian Bogle, Erik G Boman, Karen Devine, Sivasankaran Rajamanickam, George M Slota

Performance portable supernode-based sparse triangular solver for manycore architectures

Ichitaro Yamazaki, Sivasankaran Rajamanickam, Nathan Ellingwood

SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-enabled systems

Seher Acer, Erik G Boman, Sivasankaran Rajamanickam

A Parallel Graph Algorithm for Detecting Mesh Singularities in Distributed Memory Ice Sheet Simulations

Ian Bogle, Karen Devine, Mauro Perego, Sivasankaran Rajamanickam, George M Slota

A Portable SIMD Primitive Using Kokkos for Heterogeneous Architectures

Damodar Sahasrabudhe, Eric T Phipps, Sivasankaran Rajamanickam, Martin Berzins

Linear algebra-based triangle counting via fine-grained tasking on heterogeneous environments: (Update on static graph challenge)

Abdurrahman Yaşar, Sivasankaran Rajamanickam, Jonathan Berry, Michael Wolf, Jeffrey S Young, Ümit V Çatalyürek

Scalable generation of graphs for benchmarking HPC community-detection algorithms

George M Slota, Jonathan W Berry, Simon D Hammond, Stephen L Olivier, Cynthia A Phillips, Sivasankaran Rajamanickam

Scalable inference for sparse deep neural networks using Kokkos kernels

J Austin Ellis, Sivasankaran Rajamanickam

Scalable triangle counting on distributed-memory systems

Seher Acer, Abdurrahman Yaşar, Sivasankaran Rajamanickam, Michael Wolf, Ümit V Çatalyürek

Asynchronous one-level and two-level domain decomposition solvers

Christian Glusa, Erik G Boman, Edmond Chow, Sivasankaran Rajamanickam, Paritosh Ramanan

Experimental design of work chunking for graph algorithms on high bandwidth memory architectures

George M Slota, Siva Rajamanickam

Fast triangle counting using Cilk

Abdurrahman Yaşar, Sivasankaran Rajamanickam, Michael Wolf, Jonathan Berry, Ümit V Çatalyürek

FROSch: a fast and robust overlapping Schwarz domain decomposition preconditioner based on Xpetra in Trilinos

Alexander Heinlein, Axel Klawonn, Sivasankaran Rajamanickam, Oliver Rheinbach

Tacho: memory-scalable task parallel sparse Cholesky factorization

Kyungjoo Kim, H Carter Edwards, Sivasankaran Rajamanickam

Designing vector-friendly compact BLAS and LAPACK kernels

Kyungjoo Kim, Timothy B Costa, Mehmet Deveci, Andrew M Bradley, Simon D Hammond, Murat E Guney, Sarah Knepper, Shane Story, Sivasankaran Rajamanickam

Fast linear algebra-based triangle counting with KokkosKernels

Michael M Wolf, Mehmet Deveci, Jonathan W Berry, Simon D Hammond, Sivasankaran Rajamanickam

Order or shuffle: Empirically evaluating vertex order impact on parallel graph computations

George M Slota, Sivasankaran Rajamanickam, Kamesh Madduri

Partitioning trillion-edge graphs in minutes

George M Slota, Sivasankaran Rajamanickam, Karen Devine, Kamesh Madduri

Performance-portable sparse matrix-matrix multiplication for many-core architectures

Mehmet Deveci, Christian Trott, Sivasankaran Rajamanickam

A case study of complex graph analysis in distributed memory: Implementation and optimization

George M Slota, Sivasankaran Rajamanickam, Kamesh Madduri

A comparison of high-level programming choices for incomplete sparse factorization across different architectures

Joshua Dennis Booth, Kyungjoo Kim, Sivasankaran Rajamanickam

Basker: a threaded sparse LU factorization utilizing hierarchical parallelism and data layouts

Joshua Dennis Booth, Sivasankaran Rajamanickam, Heidi Thornquist

Parallel graph coloring for manycore architectures

Mehmet Deveci, Erik G Boman, Karen D Devine, Sivasankaran Rajamanickam

High-performance graph analytics on manycore processors

George M Slota, Sivasankaran Rajamanickam, Kamesh Madduri

Building blocks for graph based network analysis

Vladimir Ufimtsev, Sanjukta Bhowmick, Sivasankaran Rajamanickam

A hybrid approach for parallel transistor-level full-chip circuit simulation

Heidi K Thornquist, Sivasankaran Rajamanickam

BFS and coloring-based parallel algorithms for strongly connected components and related problems

George M Slota, Sivasankaran Rajamanickam, Kamesh Madduri

Domain decomposition preconditioners for communication-avoiding Krylov methods on a hybrid CPU/GPU cluster

Ichitaro Yamazaki, Sivasankaran Rajamanickam, Erik G Boman, Mark Hoemmen, Michael A Heroux, Stanimire Tomov

Exploiting geometric partitioning in task mapping for parallel computers

Mehmet Deveci, Sivasankaran Rajamanickam, Vitus J Leung, Kevin Pedretti, Stephen L Olivier, David P Bunde, Ümit V Çatalyürek, Karen Devine

PuLP: Scalable multi-objective multi-constraint partitioning for small-world networks

George M Slota, Kamesh Madduri, Sivasankaran Rajamanickam

Towards extreme-scale simulations with next-generation Trilinos: a low Mach fluid application case study

Paul Lin, Matthew Bettencourt, Stefan Domino, Travis Fisher, Mark Hoemmen, Jonathan Hu, Eric Phipps, Andrey Prokopenko, Sivasankaran Rajamanickam, Christopher Siefert, and others

Scalable matrix computations on large scale-free graphs using 2D graph partitioning

Erik G Boman, Karen D Devine, Sivasankaran Rajamanickam