Zhuohao Li

I am a second-year Ph.D. student at UCLA EECS. I help build the SGLang project, a fast serving framework for large language models, with Ying Sheng and Lianmin Zheng. My research interests lie in efficient and trustworthy large language models (LLMs) and their domain-specific applications.

Previously, I obtained my Bachelor's degree from Shanghai Jiao Tong University. During my undergraduate studies, I spent a great junior year with Prof. Jingwen Leng at the SJTU EPCC Lab. I was a visiting researcher at UT Austin with Prof. Calvin Lin. I also spent a wonderful undergraduate exchange at HKUST with Prof. Wei Wang.

During my career, I have been fortunate to work and intern at NVIDIA, AWS AI Lab, Alibaba Cloud, and Shanghai AI Laboratory. My Ph.D. research is supported by the Samueli Fellowship.

Email: zhuohaol [at] ucla [dot] edu
Office: Engineering VI 63-205

I am always interested in potential collaborations on efficient and trustworthy ML. I am also happy to mentor highly motivated undergraduate and master's students in EECS. If you want to discuss ideas and research opportunities, please feel free to send me an email. If you are a UCLA student and would like an office hour, please book an appointment with me.

News

[9/2024] 🎉 I am at the PyTorch Dev Conference 2024, see you in San Francisco!
[6/2024] 🎉 I am joining AWS AI Lab as an Applied Scientist Intern, see you in New York!
[4/2024] I am visiting Duke in April, see you in Durham!
[3/2024] I am at NDSS 2024 this week, see you in San Diego!
[3/2024] I will give a talk at UCI, see you in Irvine!
[2/2024] I am at the vLLM meetup, see you in the Bay Area!
[12/2023] I am at ICCV 2023 and visiting École polytechnique telecom this winter, see you in Paris!
[5/2023] 🎉 My thesis received the Merit Honor at SJTU!
[3/2023] 🎉 I am joining Shanghai AI Lab as a Research Scientist Intern this summer!
[12/2023] I will be an exchange student at HKUST CSE this fall, see you in Hong Kong!
[6/2022] 🎉 I am joining NVIDIA as a SWE Intern, see you in the Bay Area!
[4/2022] I am visiting UT Austin this summer, see you in Austin!

Research Interests

I focus on building confidential yet efficient algorithms and systems for machine learning tasks at various scales. I am currently working on efficient large language models and trustworthy machine learning. I would like to accelerate AGI in every way I can. My dream is to build interactive, human-friendly, fault-tolerant, and low-cost end-to-end frameworks, from models to infrastructure, that help empower AGI.

In my past research experience, I have also explored these directions:

  • ML Computing Paradigms: How to efficiently manage emerging workloads in heterogeneous datacenters and construct systems for new computing paradigms in the cloud.
  • SW-HW Co-design: How hardware, software, compilers, and data interact so that specific tasks can be optimized, and how ML tasks can be optimized, either specifically or generally, on emerging hardware.
  • LLM Agents: How to apply, retrieve, and fine-tune general models for domain-specific uses such as vulnerability extraction, security, and privacy.

Publications

    Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement
    Yongji Wu, Wenjie Qu, Tianyang Tao, Zhuang Wang, Wei Bai, Zhuohao Li, Yuan Tian, Jiaheng Zhang, Matthew Lentz,
    (under review), 2025
    code / project page / arXiv / Poster
    FaaSwap: SLO-Aware, GPU-Efficient Serverless Inference via Model Swapping
    Minchen Yu, Ao Wang, Dong Chen, Haoxuan Yu, Xiaonan Luo, Zhuohao Li, Wei Wang, Ruichuan Chen, Dapeng Nie, Haoran Yang
    (under review), 2024
    code / project page / arXiv / Poster
    RealNet: Combining Optimized Object Detection with Information Fusion Depth Estimation on IoT Devices
    Zhuohao Li, Leqi Ding, Yuanhang Zhang, et al.
    arXiv preprint, 2022
    code / project page / arXiv / Poster

Experience

    Amazon Web Services, New York, NY, USA
    Applied Scientist Intern at AWS AI Lab
    Jun 2024 ~ Sep 2024
    NVIDIA, Santa Clara, CA, USA
    CUDA Software Engineering Intern
    Jun 2022 ~ Dec 2022
    Alibaba Cloud, China
    Software Engineering Intern
    Jun 2021 ~ Dec 2021

Highlighted Honors

    Samueli School of Engineering Fellowship, UCLA, 2024
    Shanghai Outstanding Graduates, SJTU, 2023
    Cockrell School of Engineering Fellowship, UT Austin, 2023
    SenseTime Scholarship, SenseTime, 2022
    Stanford Irving T Ho Fellowship, Irving T Ho Foundation, 2022
    Dean's List, SJTU, 2020 - 2022
    Google Cup, Google, 2020

Misc.

    I love snowboarding, tennis, and formula racing. If you are looking for a tennis partner, you are very welcome to reach out (I struggle to find one these days). You can find me on 小红书 and Instagram.
    I participated in the Mathematics Olympiad in high school.