John Owens
University of California, Davis
H-index: 63
North America-United States
Top articles of John Owens
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Helping Faculty Teach Software Performance Engineering | | John D Owens Bruce Hoppe | 2024/5/27 |
Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms | arXiv preprint arXiv:2404.12674 | Zhongyi Lin Ning Sun Pallab Bhattacharya Xizhou Feng Louis Feng | 2024/4/19 |
The EDGE Language: Extended General Einsums for Graph Algorithms | arXiv preprint arXiv:2404.11591 | Toluwanimi O Odemuyiwa Joel S Emer John D Owens | 2024/4/17 |
Dynamic Mesh Processing on the GPU | | Ahmed H Mahmoud Serban D Porumbescu John D Owens | 2024/1/29 |
A programming model for GPU load balancing | | Muhammad Osama Serban D Porumbescu John D Owens | 2023/2/25 |
The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks | arXiv preprint arXiv:2310.00496 | Cameron Shinn Collin McCarthy Saurav Muralidharan Muhammad Osama John D Owens | 2023/9/30 |
Stream-k: Work-centric parallel decomposition for dense matrix-matrix multiplication on the GPU | | Muhammad Osama Duane Merrill Cris Cecka Michael Garland John D Owens | 2023/2/25 |
Boba: A parallel lightweight graph reordering algorithm with heavyweight implications | arXiv preprint arXiv:2306.10410 | Matthew Drescher Muhammad A Awad Serban D Porumbescu John D Owens | 2023/6/17 |
Analyzing and implementing GPU hash tables | | Muhammad A Awad Saman Ashkiani Serban D Porumbescu Martín Farach-Colton John D Owens | 2023 |
Maximum Clique Enumeration on the GPU | | Afton Geil Serban D Porumbescu John D Owens | 2023/5/15 |
Accelerating sparse data orchestration via dynamic reflexive tiling | | Toluwanimi O Odemuyiwa Hadi Asghari-Moghaddam Michael Pellauer Kartik Hegde Po-An Tsai | 2023/3/25 |
Harmonic CUDA: Asynchronous Programming on GPUs | | Jonathan D Wapman Sean Treichler Serban D Porumbescu John D Owens | 2023/2/25 |
Atos: A task-parallel GPU scheduler for graph analytics | | Yuxin Chen Benjamin Brock Serban Porumbescu Aydın Buluç Katherine Yelick | 2022/8/29 |
Supporting Unified Shader Specialization by Co-opting C++ Features | Proceedings of the ACM on Computer Graphics and Interactive Techniques | Kerry A Seitz Theresa Foley Serban D Porumbescu John D Owens | 2022/7/27 |
Introduction to GraphBLAS | | Jeremy Kepner Peter Aaltonen David Bader Aydın Buluç Franz Franchetti | 2022/7/20 |
Building a performance model for deep learning recommendation model training on GPUs | | Zhongyi Lin Louis Feng Ehsan K Ardestani Jaewon Lee John Lundell | 2022/12/18 |
Essentials of parallel graph analytics | | Muhammad Osama Serban D Porumbescu John D Owens | 2022/5/30 |
Scalable irregular parallelism with GPUs: getting CPUs out of the way | | Yuxin Chen Benjamin Brock Serban Porumbescu Aydın Buluç Katherine Yelick | 2022/11/13 |
GraphBLAST: A High-Performance Linear Algebra-based Graph Framework on the GPU | ACM Transactions on Mathematical Software (TOMS) | Carl Yang Aydın Buluç John D Owens | 2022/2/16 |
A GPU Multiversion B-Tree | | Muhammad A Awad Serban D Porumbescu John D Owens | 2022/10/8 |