I am an LLM System Researcher at ByteDance Seed. I earned my Ph.D. from Tsinghua University, advised by Jidong Zhai. My passion lies in building efficient and reliable machine learning systems, with research interests covering large-scale LLM training, compilation optimization, and long-term operations.
Projects
- InfiniTensor is a high-performance inference engine tailored for GPUs and AI accelerators, designed for efficient deployment and rapid academic validation.
- Vapro is a performance profiler that detects and diagnoses performance variance (i.e., performance degradation and jitter) in production-run parallel applications.
Publications
[OSDI’26] Safeguarding LLM Training at Scale: Online SDC Detection and Insights from 35 Million GPU Hours
Kinman Lei*, Liyan Zheng*, Xiang Li, Hongmin Chen, Yun Zhang, Gaohong Liu, Zuquan Song, Zixuan Ma, Zhiyu Xue, Minghui Yu, Shuguang Wang, Wencong Xiao, Haibin Lin, Yuyang Jin, Jidong Zhai, Bo Liu, Xin Liu
[PPoPP’26] Difflow: A Data-Characteristic-Aware Serving System for Diffusion Models
Chengzhang Wu*, Liyan Zheng*, Haojie Wang, Kezhao Huang, Zixuan Ma, Dong Dong, Jidong Zhai
[ATC’25] mTuner: Accelerating Parameter-Efficient Fine-Tuning on Multi-GPU Servers with Elastic Tensor
Kezhao Huang, Siqi Zhu, Mingshu Zhai, Liyan Zheng, Kinman Lei, Jiaao He, Yuyang Jin, Jidong Zhai
[CGO’25] IntelliGen: Instruction-Level Auto-tuning for Tensor Program with Monotonic Memory Optimization
Zixuan Ma, Haojie Wang, Jingze Xing, Shuhong Huang, Liyan Zheng, Chen Zhang, Huanqi Cao, Kezhao Huang, Mingshu Zhai, Shizhi Tang, Penghan Wang, Jidong Zhai
[EuroSys’24] WiseGraph: Optimizing GNN with Joint Workload Partition of Graph and Operations
Kezhao Huang, Jidong Zhai, Liyan Zheng, Haojie Wang, Yuyang Jin, Qihao Zhang, Runqing Zhang, Zhen Zheng, Youngmin Yi, Xipeng Shen
[OSDI’23] EinNet: Optimizing Tensor Programs with Derivation-Based Transformations
Liyan Zheng, Haojie Wang, Jidong Zhai, Muyan Hu, Zixuan Ma, Tuowei Wang, Shuhong Huang, Xupeng Miao, Shizhi Tang, Kezhao Huang, Zhihao Jia
[Paper] [Slides] [Poster] [Code]
[PPoPP’22] Vapro: performance variance detection and diagnosis for production-run parallel applications
Liyan Zheng, Jidong Zhai, Xiongchao Tang, Haojie Wang, Teng Yu, Yuyang Jin, Shuaiwen Leon Song, Wenguang Chen
[Paper] [Slides] [Code]
[TPDS’22] Detecting Performance Variance for Parallel Applications Without Source Code (Best Paper Award Runner-Up)
Jidong Zhai, Liyan Zheng, Jinghan Sun, Feng Zhang, Xiongchao Tang, Xuehai Qian, Bingsheng He, Wei Xue, Wenguang Chen, Weimin Zheng
[Paper]
[TPDS’22] Leveraging Code Snippets to Detect Variations in the Performance of HPC Systems
Jidong Zhai, Liyan Zheng, Jinghan Sun, Feng Zhang, Xiongchao Tang, Xuehai Qian, Bingsheng He, Wei Xue, Wenguang Chen, Weimin Zheng
[Paper]
[PPoPP’22] BaGuaLu: targeting brain scale pretrained models with over 37 million cores
Zixuan Ma, Jiaao He, Jiezhong Qiu, Huanqi Cao, Yuanwei Wang, Zhenbo Sun, Liyan Zheng, Haojie Wang, Shizhi Tang, Tianyu Zheng, Junyang Lin, Guanyu Feng, Zeqiang Huang, Jie Gao, Aohan Zeng, Jianwei Zhang, Runxin Zhong, Tianhui Shi, Sha Liu, Weimin Zheng, Jie Tang, Hongxia Yang, Xin Liu, Jidong Zhai, Wenguang Chen
[PLDI’22] FreeTensor: a free-form DSL with holistic optimizations for irregular tensor programs
Shizhi Tang, Jidong Zhai, Haojie Wang, Lin Jiang, Liyan Zheng, Zhenhao Yuan, Chen Zhang
[OSDI’21] PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, Liyan Zheng, Yuanzhi Li, Kaiyuan Rong, Yuanyong Chen, Zhihao Jia
Talks
EinNet: Optimizing Tensor Programs with Derivation-Based Transformations
- ChinaSys, TURC, Wuhan, July 2023
- OSDI, Boston, July 2023
Vapro: performance variance detection and diagnosis for production-run parallel applications
- Meta, Online, August 2023
- PPoPP, Online, April 2022
