|
Barbarians at the Gate: How AI is Upending Systems Research
Audrey Cheng*, Shu Liu*, Melissa Pan*, Zhifei Li, Bowen Wang, Alex Krentsel, Tian Xia, Mert Cemri, Jongseok Park, Shuo Yang, Jeff Chen, Lakshya Agrawal, Aditya Desai, Jiarong Xing, Koushik Sen, Matei Zaharia, Ion Stoica
arXiv
[paper]
[code]
[website]
|
|
PrefillOnly: An Inference Engine for Prefill-only Workloads in LLM Applications
Kuntai Du, Bowen Wang, Chen Zhang, Yiming Cheng, Qing Lan, Hejian Sang, Yihua Cheng, Jiayi Yao, Xiaoxuan Liu, Yifan Qiao, Ion Stoica, Junchen Jiang
SOSP 2025
[paper]
[code]
|
|
APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding
Mingdao Liu*, Aohan Zeng*, Bowen Wang, Peng Zhang, Jie Tang, Yuxiao Dong
arXiv
[paper]
[code]
|
|
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Aohan Zeng*, Mingdao Liu*, Rui Lu*, Bowen Wang, Xiao Liu, Yuxiao Dong, Jie Tang
ACL 2024 Findings
[paper]
[code]
[website]
|
|
vLLM Expert Parallelism Load Balancer (EPLB)
[repo]
[pr]
Implemented a load balancer that rearranges the experts dynamically based on observed expert usage to solve the load imbalance problem of sparse Mixture-of-Experts (MoE) inference.
Supports redundant experts, which spread the load of heavy-hitter experts across more compute resources.
Achieves up to 30% higher throughput and 25% lower latency in sparse MoE inference. A core component of large-scale expert-parallel inference.
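A minimal Python sketch of the idea described above, not the vLLM implementation: the function names, the greedy heuristics, and the equal-slots-per-GPU constraint are all assumptions chosen for illustration. It first gives redundant slots to the experts with the highest observed load, then places the resulting replicas heaviest-first onto the least-loaded GPU.

```python
import heapq


def replicate_experts(loads, num_redundant):
    """Hypothetical helper: give each redundant slot to the expert whose
    per-replica load is currently the highest."""
    replicas = [1] * len(loads)
    # heapq is a min-heap, so store negated per-replica load.
    heap = [(-load, e) for e, load in enumerate(loads)]
    heapq.heapify(heap)
    for _ in range(num_redundant):
        _, e = heapq.heappop(heap)
        replicas[e] += 1
        heapq.heappush(heap, (-loads[e] / replicas[e], e))
    return replicas


def place_replicas(loads, replicas, num_gpus):
    """Hypothetical helper: greedy placement, heaviest replica first, onto
    the least-loaded GPU that still has a free slot."""
    items = [
        (loads[e] / replicas[e], e)
        for e in range(len(loads))
        for _ in range(replicas[e])
    ]
    if len(items) % num_gpus != 0:
        raise ValueError("replica slots must divide evenly across GPUs")
    slots_per_gpu = len(items) // num_gpus
    items.sort(reverse=True)

    placement = [[] for _ in range(num_gpus)]
    gpu_load = [0.0] * num_gpus
    for load, e in items:
        free = [g for g in range(num_gpus) if len(placement[g]) < slots_per_gpu]
        g = min(free, key=lambda i: gpu_load[i])
        placement[g].append(e)
        gpu_load[g] += load
    return placement, gpu_load
```

Calling `replicate_experts` on observed per-expert token counts and passing the result to `place_replicas` yields one expert list per GPU. The sketch assumes a replica's load is the expert's load divided evenly among its replicas, and it does not prevent two replicas of the same expert from landing on one GPU.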
|
 |
Sky Computing Lab, University of California, Berkeley
2024.12 - Present
Visiting Student Researcher
Advisor: Prof. Ion Stoica
|
 |
Undergraduate Visiting Research (UGVR), Stanford University
2024.07 - 2024.08
Research Intern
Advisor: Prof. Tsachy Weissman
|
 |
Knowledge Engineering Group (KEG), Tsinghua University
2023.07 - 2024.06
Research Intern
Advisors: Prof. Jie Tang & Prof. Yuxiao Dong
|
 |
Z.ai, Beijing, China
2023.07 - 2024.06
Research Intern, Member of the GLM Training Team
|
|
NOP-Processor: Out-of-Order LoongArch Core
Competition, NSCSCC 2023 (LoongArch Track)
[video]
[code]
Special Prize (National Top 1), 7th "Loongson Cup" CPU Design Competition.
NOP-Processor is a high-performance out-of-order processor core for the LoongArch architecture, used as a strong baseline for the National Student Computer System Capability Challenge (NSCSCC 2023).
The project implements a modern pipeline with speculative execution, branch prediction, and an on-chip memory hierarchy, targeting both high frequency and competitive performance on the LoongArch benchmark suite.
|
|
Dino Fit Adventure: Chrome Dino with Full-Body Control
Coursework, Digital Logic Design, Tsinghua University
[video]
[code]
Dino Fit Adventure lets you play Chrome Dino in the real world using your body movements, built as a digital design course project on FPGA.
The system reads acceleration data from a wearable sensor to detect jumps and other actions, decodes the data in hardware, and renders the game through a smooth VGA pipeline, handling clock-domain crossing, buffering, and timing closure in SystemVerilog.
|
|
Simple RDBMS from Scratch
Coursework, Introduction to Database Management Systems, Tsinghua University
[code]
This project implements a simple relational database management system (RDBMS) from scratch, supporting core features such as CRUD operations, indexing, constraints, aggregation, and join queries.
It focuses on building the end-to-end query pipeline, including a storage layer, execution engine, and basic optimization, to expose the full lifecycle of SQL query processing in a compact system.
|
|
Wordle in Rust with WebAssembly
Coursework, Programming Training, Tsinghua University
[website]
[code]
This project is an implementation of the Wordle word game in Rust, compiled to WebAssembly so that it can run efficiently in the browser.
It explores the ergonomics of Rust for game logic and safe state management, as well as the toolchain for building, optimizing, and deploying Rust+WASM applications to the web.
|
Miscellaneous
|
I have a passion for learning languages, both programming and natural.
Some languages I'm using / learning:
| 🇨🇳 中文 |
你好！ |
| 🇺🇸 English |
Hello! |
| 🇯🇵 日本語 |
こんにちは！ |
| 🇰🇷 한국어 |
안녕하세요! |
| 🇪🇸 Español |
¡Hola! |
| 🐍 Python |
print('Hello, world!') |
| 🦀 Rust |
println!("Hello, world!"); |
|
|