Biography

I am a first-year Ph.D. student in the Department of Computer Science at the University of Maryland, College Park. Previously, I was a research intern at MIT CSAIL, advised by Dr. Tianlong Chen. Before that, I was a research assistant at the Research Institute of Intelligent Complex Systems at Fudan University, supervised by Prof. Siqi Sun, and a research intern at JD Explore Academy, supervised by Dr. Liang Ding and Prof. Dacheng Tao.

My research interests lie primarily in deep learning, model compression, natural language processing (NLP), and AI + X (e.g., health, finance). Starting from data, models, objectives, optimization, and adaptation to downstream tasks, I investigate how to efficiently, sufficiently, and trustworthily transfer knowledge from large-scale data into the parameters of pre-trained models.

News

[10/2023]: One paper (Merging Experts into One) is accepted by EMNLP 2023.
[05/2023]: One paper (PAD-Net) is accepted by ACL 2023.
[04/2023]: One paper (NeuralSlice) is accepted by ICML 2023.
[10/2022]: One paper (SparseAdapter) is accepted by EMNLP 2022.
[08/2022]: One paper (SD-Conv) is accepted by WACV 2023.
[07/2022]: 🏆 Ranked 1st (Chinese<=>English, German<=>English, Czech<=>English, English=>Russian), 2nd (Russian=>English, Japanese=>English), and 3rd (English=>Japanese) in the General Translation Task at WMT 2022.
[01/2022]: One paper is accepted by AAAI-22 KDF.

Research Experience

CSAIL, Massachusetts Institute of Technology
11/2023 - 04/2024
Research Intern, advised by Dr. Tianlong Chen
Efficient ML
IICS, Fudan University
07/2022 - 03/2023
Research Assistant, advised by Prof. Siqi Sun
AI for Protein, Computational Biology
NLP Group, JD Explore Academy
02/2022 - 10/2022
Research Intern, advised by Dr. Liang Ding and Prof. Dacheng Tao
Machine Learning, Efficient Methods for NLP

Selected Publications

  1. Shwai He, Tianlong Chen, “RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation”, arXiv.
  2. Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, Dacheng Tao, “Merging Experts into One: Improving Computational Efficiency of Mixture of Experts”, Proceedings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Oral).
  3. Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, Dacheng Tao, “PAD-Net: An Efficient Framework for Dynamic Networks”, Proceedings of The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023).
  4. Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao, “SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters”, Findings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022).
  5. Shwai He, Chenbo Jiang, Daize Dong, Liang Ding, “SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution”, IEEE/CVF Winter Conference on Applications of Computer Vision, 2023 (WACV 2023).
  6. Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, Dacheng Tao, “MerA: Merging Pretrained Adapters For Few-Shot Learning”, arXiv.
  7. Shwai He, Shi Gu, “Multi-modal Attention Network for Stock Movements Prediction”, The AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in Financial Service (KDF 2022).
  8. Chenbo Jiang, Jie Yang, Shwai He, Yu-Kun Lai, Lin Gao, “NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes”, Proceedings of the 40th International Conference on Machine Learning, 2023 (ICML 2023).
  9. Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, Dacheng Tao, “Vega-MT: The JD Explore Academy Translation System for WMT”, The Conference on Machine Translation, 2022 (WMT 2022).