Biography

I am a second-year Ph.D. student in the Department of Computer Science at the University of Maryland, College Park, advised by Prof. Ang Li. Previously, I was a research intern advised by Dr. Tianlong Chen at Massachusetts Institute of Technology (CSAIL@MIT). I used to be a research assistant at the Research Institute of Intelligent Complex Systems at Fudan University, supervised by Prof.Siqi Sun. Before that, I was a research intern at JD Explore Academy, supervised by Dr. Liang Ding and Prof. Dacheng Tao. My research interests primarily lie in the area of deep learning, model compression, natural language processing (NLP), and AI + X (e.g., health, finance).

News

[08/2025]: Router-Tuning is accepted at EMNLP 2025.
[05/2025]: 🏆 Awarded the Qualcomm Innovation Fellowship (QIF) North America for the proposal “Less Attention, Much Faster: Toward a Future of Efficiency-Optimized Transformer Architectures.”
[09/2024]: Two papers accepted: (1) Efficient Attention at NeurIPS 2024, and (2) Reformat Alignment at EMNLP 2024.
[10/2023]: Merging Experts into One is accepted by EMNLP 2023.
[05/2023]: PAD-Net is accepted by ACL 2023.
[04/2023]: NeuralSlice is accepted by ICML 2023.
[10/2022]: SparseAdapter is accepted by EMNLP 2022.
[08/2022]: SD-Conv is accpeted by WACV 2023.
[07/2022]: 🏆 Ranked 1st (Chinese<=>English, German<=>English, Czech<=>English, English=>Russian), 2nd (Russian=>English, Japanese=>English), and 3rd (English=>Japanese) in General Translation Task in WMT 2022.
[01/2022]: One paper is accepted by AAAI-22 KDF.

Research Experience


Tencent AI Lab, Bellevue, WA: 06/2023 - 08/2024; Efficient ML


IICS, Fudan University: 07/2022 - 03/2023; AI for Protein, Computational Biology


NLP Group, JD Explore Academy: 02/2022 - 10/2022; Machine Learning, Efficient Methods for NLP

Selected Publications

Shwai He*, Guoheng Sun*, Zheyu Shen, Ang Li, “What Matters in Transformers? Not All Attention is Needed”, arXiv. [Paper] [Code]
Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Dong Yu, “What Matters in Transformers? Not All Attention is Needed”, Proceedings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025) [Paper] [Code]
Shwai He*, Daize Dong*, Liang Ding, Ang Li, “Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques”, Transactions on Machine Learning Research (TMLR). [Paper] [Code]
Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, Dacheng Tao, “Merging Experts into One: Improving Computational Efficiency of Mixture of Experts”, Proceedings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Oral). [Paper] [Code]
Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, Dacheng Tao, “PAD-Net: An Efficient Framework for Dynamic Networks”, Proceedings of The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023). [Paper] [Code]
Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao, “SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters”, Findings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [Paper] [Code]
Shwai He, Chenbo Jiang, Daize Dong, Liang Ding, “SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution”, IEEE/CVF Winter Conference on Applications of Computer Vision, 2023 (WACV 2023). [Paper]
Shwai He, Shi Gu, “Multi-modal Attention Network for Stock Movements Prediction”, The AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in Financial Service (KDF 2022). [Paper]
Chenbo Jiang, Jie Yang, Shwai He, Yu-Kun Lai and Lin Gao. “NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes.”, Proceedings of the 40th International Conference on Machine Learning, 2023 (ICML 2023). [Paper] [Code]
Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan and Dacheng Tao, “Vega-MT: The JD Explore Academy Translation System for WMT”, The Conference on Machine Translation, 2022 (WMT 2022). [Paper]

Shwai He

News

Research Experience

Selected Publications