Kunle Olukotun
Stanford University
H-index: 80
North America-United States
Top articles of Kunle Olukotun
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Implementing and Optimizing the Scaled Dot-Product Attention on Streaming Dataflow | arXiv preprint arXiv:2404.16629 | Gina Sohn Nathan Zhang Kunle Olukotun | 2024/4/25 |
Revet: A language and compiler for dataflow threads | | Alexander C Rucker Shiv Sundram Coleman Smith Matthew Vilim Raghu Prabhakar | 2024/3/2 |
Baco: A fast and portable Bayesian compiler optimization framework | | Erik Orm Hellsten Artur Souza Johannes Lenfers Rubens Lacouture Olivia Hsu | 2023/3/25 |
The sparse abstract machine | | Olivia Hsu Maxwell Strange Ritvik Sharma Jaeyeon Won Kunle Olukotun | 2023/3/25 |
Homunculus: Auto-generating efficient data-plane ML pipelines for datacenter networks | | Tushar Swamy Annus Zulfiqar Luigi Nardi Muhammad Shahbaz Kunle Olukotun | 2023/3/25 |
Sigma: Compiling Einstein summations to locality-aware dataflow | | Tian Zhao Alexander Rucker Kunle Olukotun | 2023/1/27 |
Mosaic: An Interoperable Compiler for Tensor Algebra | Proceedings of the ACM on Programming Languages | Manya Bansal Olivia Hsu Kunle Olukotun Fredrik Kjolstad | 2023/6/6 |
Accelerating SLIDE: Exploiting Sparsity on Accelerator Architectures | | Sho Ko Alexander Rucker Yaqi Zhang Paul Mure Kunle Olukotun | 2022/5/30 |
Taurus: a data plane architecture for per-packet ML | | Tushar Swamy Alexander Rucker Muhammad Shahbaz Ishan Gaur Kunle Olukotun | 2022/2/28 |
Efficient Memory Partitioning in Software Defined Hardware | arXiv preprint arXiv:2202.01261 | Matthew Feldman Tian Zhao Kunle Olukotun | 2022/2/2 |
Stardust: Compiling Sparse Tensor Algebra to a Reconfigurable Dataflow Architecture | arXiv preprint arXiv:2211.03251 | Olivia Hsu Alexander Rucker Tian Zhao Kunle Olukotun Fredrik Kjolstad | 2022/11/7 |
Compilation of sparse array programming models | Proceedings of the ACM on Programming Languages | Rawn Henry Olivia Hsu Rohan Yadav Stephen Chou Kunle Olukotun | 2021/10/15 |
Chopping off the tail: Bounded non-determinism for real-time accelerators | IEEE Computer Architecture Letters | Alexander Rucker Muhammad Shahbaz Kunle Olukotun | 2021/8/4 |
Sara: Scaling a reconfigurable dataflow accelerator | | Yaqi Zhang Nathan Zhang Tian Zhao Matt Vilim Muhammad Shahbaz | 2021/6/14 |
Aurochs: An architecture for dataflow threads | | Matthew Vilim Alexander Rucker Kunle Olukotun | 2021/6/14 |
High performance lattice regression on FPGAs via a high level hardware description language | | Nathan Zhang Matthew Feldman Kunle Olukotun | 2021/12/6 |
Bayesian optimization with a prior for the optimum | | Artur Souza Luigi Nardi Leonardo B Oliveira Kunle Olukotun Marius Lindauer | 2021 |
Capstan: A vector RDA for sparsity | | Alexander Rucker Matthew Vilim Tian Zhao Yaqi Zhang Raghu Prabhakar | 2021/10/18 |
Gorgon: Accelerating machine learning from relational data | | Matthew Vilim Alexander Rucker Yaqi Zhang Sophia Liu Kunle Olukotun | 2020/5/30 |
Plasticine—A Universal Data Analytics Accelerator | Leland Stanford Junior University Stanford United States | Kunle Olukotun | 2020/3/1 |