Depei Qian
Beihang University
H-index: 30
Asia-China
Top articles of Depei Qian
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU | | Xiaoyan Liu Xuegui Zheng Hailong Yang Zhongzhi Luan Depei Qian | 2024/3/2 |
Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding | arXiv preprint arXiv:2402.15678 | Siqi Wang Hailong Yang Xuezhu Wang Tongxuan Liu Pengbo Wang | 2024/2/24 |
GNNSched: A Scheduling Framework for Graph Neural Network Inference Tasks on GPU | Computer Engineering & Science/Jisuanji Gongcheng yu Kexue | Qingxiao Sun Yi Liu Hailong Yang Yiqing Wang Jie Jia Zhongzhi Luan Depei Qian | 2024/1/1 |
INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems | arXiv preprint arXiv:2404.03226 | Yiqing Wang Xiaoyan Liu Hailong Yang Xinyu Yang Pengbo Wang | 2024/4/4 |
Building a domain-specific compiler for emerging processors with a reusable approach | Science China Information Sciences | Mingzhen Li Yi Liu Bangduo Chen Hailong Yang Zhongzhi Luan | 2024/1 |
Towards optimized tensor code generation for deep learning on Sunway many-core processor | Frontiers of Computer Science | Mingzhen Li Changxi Liu Jianjin Liao Xuegui Zheng Hailong Yang | 2024/4 |
AtRec: Accelerating Recommendation Model Training on CPUs | IEEE Transactions on Parallel and Distributed Systems | Siqi Wang Tianyu Feng Hailong Yang Xin You Bangduo Chen | 2024/3/25 |
TrivialSpy: Identifying Software Triviality via Fine-grained and Dataflow-based Value Profiling | | Xin You Hailong Yang Kelun Lei Zhongzhi Luan Depei Qian | 2023/11/12 |
Accelerating Big Data Application by Eliminating Redundancy on Hadoop Cluster | | Kelun Lei Shaokang Du Xin You Zhibo Xuan Haoran Kong | 2023/12/17 |
HAOTuner: A Hardware Adaptive Operator Auto-Tuner for Dynamic Shape Tensor Compilers | IEEE Transactions on Computers | Pengyu Mu Yi Liu Rui Wang Guoxiang Liu Zhonghao Sun | 2023/6/23 |
Adapting combined tiling to stencil optimizations on Sunway processor | CCF Transactions on High Performance Computing | Biao Sun Mingzhen Li Hailong Yang Jun Xu Zhongzhi Luan | 2023/9 |
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs | | Mingzhen Li Wencong Xiao Hailong Yang Biao Sun Hanyu Zhao | 2023/11/12 |
Efficient Deep Molecular Dynamic Model Training on Heterogeneous System | | Shaokang Du Xin You Hailong Yang Jing Shang Zhiwen Xiao | 2023/12/17 |
BiRFIA: Selective Binary Rewriting for Function Interception on ARM | | Kelun Lei Xin You Hailong Yang Zhongzhi Luan Depei Qian | 2023/6/21 |
CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU–GPU system | The Journal of Supercomputing | Qi Zhang Yi Liu Tao Liu Depei Qian | 2023/9 |
gGMED: Towards GPU Accelerated Geometric Modeling Evaluation and Derivative Processes | | Zhibo Xuan Hailong Yang Pengbo Wang Xin Sun Jiwei Hao | 2023/10/20 |
Large-Scale Parallelization and Optimization of Lattice QCD on Tianhe New Generation Supercomputer | | Junlin Chen Chaojing Liu Zhongzhi Luan Ming Gong Qingfeng Li | 2023/12/17 |
Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU | | Jianjin Liao Mingzhen Li Hailong Yang Qingxiao Sun Biao Sun | 2023/5/15 |
Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs | | Mingzhen Li Hailong Yang Shanjun Zhang Fengwei Yu Ruihao Gong | 2023/8/7 |
Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs | IEEE Transactions on Parallel and Distributed Systems | Qingxiao Sun Yi Liu Hailong Yang Zhonghui Jiang Zhongzhi Luan | 2023/10/18 |