Sheng-Chun Kao
Georgia Institute of Technology
H-index: 8
United States, North America
Top articles of Sheng-Chun Kao
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
NonGEMM Bench: Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads | arXiv preprint arXiv:2404.11788 | Rachid Karami, Hemanth Kota, Sheng-Chun Kao, Hyoukjun Kwon | 2024/4/17 |
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers | arXiv preprint arXiv:2402.04744 | Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal | 2024/2/7 |
JaxPruner: A concise library for sparsity research | | Joo Hyung Lee, Wonpyo Park, Nicole Elyse Mitchell, Jonathan Pilault, Johan Samir Obando Ceron | 2024/1/8 |
DNNFuser: Generative pre-trained transformer as a generalized mapper for layer fusion in DNN accelerators | arXiv preprint arXiv:2201.11218 | Sheng-Chun Kao, Xiaoyu Huang, Tushar Krishna | 2022/1/26 |
A Formalism of DNN Accelerator Flexibility | Proceedings of the ACM on Measurement and Analysis of Computing Systems | Sheng-Chun Kao, Hyoukjun Kwon, Michael Pellauer, Angshuman Parashar, Tushar Krishna | 2022/6/6 |
MAGMA: An optimization framework for mapping multiple DNNs on multiple accelerator cores | | Sheng-Chun Kao, Tushar Krishna | 2022/4/2 |
Demystifying map space exploration for NPUs | | Sheng-Chun Kao, Angshuman Parashar, Po-An Tsai, Tushar Krishna | 2022/11/6 |
DiGamma: Domain-aware genetic algorithm for HW-mapping co-optimization for DNN accelerators | | Sheng-Chun Kao, Michael Pellauer, Angshuman Parashar, Tushar Krishna | 2022/3/14 |
Training recipe for N:M structured sparsity with decaying pruning mask | arXiv preprint arXiv:2209.07617 | Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci | 2022/9/15 |
FLAT: An Optimized Dataflow for Mitigating Attention Performance Bottlenecks | | Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna | 2023/1/27 |
Extending sparse tensor accelerators to support multiple compression formats | | Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon | 2021/5/17 |
E3: A HW/SW co-design neuroevolution platform for autonomous learning in edge device | | Sheng-Chun Kao, Tushar Krishna | 2021/3/28 |
ConfuciuX: Autonomous hardware resource assignment for DNN accelerators using reinforcement learning | | Sheng-Chun Kao, Geonhwa Jeong, Tushar Krishna | 2020 |
Generative Design of Hardware-aware DNNs | arXiv preprint arXiv:2006.03968 | Sheng-Chun Kao, Arun Ramamurthy, Tushar Krishna | 2020/6/6 |
Conditional Neural Architecture Search | arXiv preprint arXiv:2006.03969 | Sheng-Chun Kao, Arun Ramamurthy, Reed Williams, Tushar Krishna | 2020/6/6 |
MINT: Microarchitecture for Efficient and Interchangeable Compression Formats on Tensor Algebra | | Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon | 2020/5/1 |
Bloom filter and implementation method thereof | | | 2020/3/31 |
GAMMA: Automating the HW mapping of DNN models on accelerators via genetic algorithm | | Sheng-Chun Kao, Tushar Krishna | 2020/11/2 |