My current research interests span distributed machine learning systems and algorithms, and intelligent elderly/child care technologies. My research combines performance modeling and algorithm design with the design and implementation of various (networked) systems, drawing on optimization theory and machine learning methods.
Distributed Machine Learning Algorithms
We seek a deep understanding of distributed machine learning and design efficient algorithms for it.
Sheng Wang, Liheng Chen, Pengan Chen, Jingwei Dong, Boyang Xue, Jiyue Jiang, Lingpeng Kong, Chuan Wu. "MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards," in The Thirteenth International Conference on Learning Representations (ICLR), Singapore, April 24-28, 2025.
Junwei Su, Chuan Wu. "On the Topology Awareness and Generalization Performance of Graph Neural Networks," in The 18th European Conference on Computer Vision (ECCV), Milan, Italy, Sep 29-Oct 4, 2024.
Sheng Wang, Liheng Chen, Jiyue Jiang, Boyang Xue, Lingpeng Kong, Chuan Wu. "LoRA Meets Dropout under a Unified Framework," in The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), Bangkok, Thailand, August 11 - 16, 2024.
Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu. "PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA," in The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), Bangkok, Thailand, August 11 - 16, 2024.
Junwei Su, Difan Zou, Chuan Wu. "On the Limitation and Experience Replay for GNNs in Continual Learning," in the Third Conference on Lifelong Learning Agents (CoLLAs), Pisa, Italy, July 29 - August 01, 2024.
Junwei Su, Difan Zou, Chuan Wu. "PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks," in The Twelfth International Conference on Learning Representations (ICLR), Vienna, Austria, May 7-11, 2024.
Junwei Su, Difan Zou, Zijun Zhang, Chuan Wu. "Towards Robust Graph Incremental Learning on Evolving Graphs," in the Fortieth International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA, July 23-29, 2023.
Yangrui Chen, Jiaxuan You, Jun He, Yuan Lin, Yanghua Peng, Chuan Wu, Yibo Zhu. "SP-GNN: Learning Structure and Position Information from Graphs Learning Systems," in Elsevier Neural Networks Journal, vol. 161, pp. 505-514, April 2023.
Yangrui Chen, Cong Xie, Meng Ma, Juncheng Gu, Yanghua Peng, Haibin Lin, Chuan Wu, Yibo Zhu. "SAPipe: Staleness-Aware Pipeline for Data Parallel DNN Training," in the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), New Orleans, USA, November 29 - December 1, 2022.
Hanpeng Hu, Dan Wang, Chuan Wu. "Distributed Machine Learning through Heterogeneous Edge Systems," in AAAI, New York, USA, February 7-12, 2020.
Distributed Machine Learning Systems
We extensively study how to expedite training and inference in large-scale distributed machine learning from various perspectives, and build efficient distributed machine learning systems and AI cloud schedulers.
◼ DNN Systems
Borui Wan, Mingji Han, Yiyao Sheng, Yanghua Peng, Haibin Lin, Mofan Zhang, Zhichao Lai, Menghan Yu, Junda Zhang, Zuquan Song, Xin Liu, Chuan Wu. "ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development," in the 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI), Philadelphia, PA, USA, April 28-30, 2025.
Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, Chuan Wu. "HybridFlow: A Flexible and Efficient RLHF Framework," to appear in EuroSys 2025, Rotterdam, The Netherlands, March 30 - April 3, 2025. [project code available]
Yuchen Zhong, Guangming Sheng, Juncheng Liu, Jinhui Yuan, Chuan Wu. "SWIFT: Expedited Failure Recovery for Large-scale DNN Training," in IEEE Transactions on Parallel and Distributed Systems, vol. 35, pp. 1644-1656, September 2024. [project code available]
Xiaoyang Zhao, Zhe Zhang, Chuan Wu. "AdapCC: Making Collective Communication in Distributed Machine Learning Adaptive," in the 44th International Conference on Distributed Computing Systems (ICDCS), Jersey City, New Jersey, USA, July 23 - 26, 2024. [project code available]
Juntao Zhao, Borui Wan, Yanghua Peng, Haibin Lin, Yibo Zhu, Chuan Wu. "QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices," in the 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS), San Francisco, USA, May 27-31, 2024. [project code available]
Ye Tian, Zhen Jia, Ziyue Luo, Yida Wang, Chuan Wu. "DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines," in the Seventh Conference on Machine Learning and Systems (MLSys), Santa Clara, USA, May 13 - 16, 2024.
Chenyu Jiang, Ye Tian, Zhen Jia, Chuan Wu, Yida Wang, Shuai Zheng. "Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication," in the Seventh Conference on Machine Learning and Systems (MLSys), Santa Clara, USA, May 13 - 16, 2024.
Chenyu Jiang, Zhen Jia, Shuai Zheng, Yida Wang, Chuan Wu. "DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines," in EuroSys, Athens, Greece, April 22-25, 2024.
Hanpeng Hu, Junwei Su, Juntao Zhao, Yanghua Peng, Yibo Zhu, Haibin Lin, Chuan Wu. "CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs," in EuroSys, Athens, Greece, April 22-25, 2024.
Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin. "HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis," in EuroSys, Athens, Greece, April 22-25, 2024.
Shiwei Zhang, Xiaodong Yi, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin. "Expediting Distributed DNN Training with Device Topology-Aware Graph Deployment," in IEEE Transactions on Parallel and Distributed Systems, vol. 34, No. 4, pp. 1281-1293, April 2023. [project code available]
Xiaodong Yi, Shiwei Zhang, Lansong Diao, Chuan Wu, Zhen Zheng, Shiqing Fan, Siyu Wang, Jun Yang, Wei Lin. "Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion," in IEEE Transactions on Parallel and Distributed Systems, vol. 33, No. 12, pp. 4694-4706, December 2022. [project code available]
Shiwei Zhang, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin. "Accelerating Large-Scale Distributed Neural Network Training with SPMD Parallelism," in the 13th ACM Symposium on Cloud Computing (SOCC'22), San Francisco, CA, November 7-11, 2022. [project code available]
Hanpeng Hu, Chenyu Jiang, Yuchen Zhong, Yanghua Peng, Chuan Wu, Yibo Zhu, Haibin Lin. "dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training," in the Fifth Conference on Machine Learning and Systems (MLSys), August 29 - September 1, 2022. [project code available]
Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Wei Lin. "Efficient Pipeline Planning for Expedited Distributed DNN Training," in IEEE INFOCOM, online, May 2-5, 2022.
Zhe Zhang, Chuan Wu, Zongpeng Li. "Near-Optimal Topology-adaptive Parameter Synchronization in Distributed DNN Training," in IEEE INFOCOM, May 10-13, 2021.
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin. "DAPPLE: A Pipelined Data Parallel Approach for Large Models Training," in the 26th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP'21), Seoul, South Korea, February 27-March 3, 2021.
Xiaodong Yi, Shiwei Zhang, Ziyue Luo, Guoping Long, Lansong Diao, Chuan Wu, Zhen Zheng, Jun Yang, Wei Lin. "Optimizing Distributed Training Deployment in Heterogeneous GPU Clusters," in ACM CoNEXT, Barcelona, Spain, December 1-4, 2020. [project code available]
Xiaodong Yi, Ziyue Luo, Chen Meng, Mengdi Wang, Guoping Long, Chuan Wu, Jun Yang, Wei Lin. "Fast Training of Deep Learning Models over Multiple GPUs," in ACM/IFIP Middleware, Delft, The Netherlands, December 7-11, 2020. [project code available]
Yangrui Chen, Yanghua Peng, Yixin Bao, Chuan Wu, Yibo Zhu, Chuanxiong Guo. "Elastic Parameter Server Load Distribution in Deep Learning Clusters," in ACM SOCC, Renton, WA, USA, October 19-21, 2020.
Yanghua Peng, Yibo Zhu, Yangrui Chen, Yixin Bao, Bairen Yi, Chang Lan, Chuan Wu, Chuanxiong Guo. "A Generic Communication Scheduler for Distributed DNN Training Acceleration," in ACM SOSP, Huntsville, Ontario, Canada, October 27-30, 2019. [project code available]
◼ GNN Systems
Guangming Sheng, Junwei Su, Chao Huang, Chuan Wu. "MSPipe: Efficient Temporal GNN Training via Staleness-aware Pipeline," in ACM KDD, Barcelona, Spain, August 25-29, 2024. [project code available]
Bingqian Du, Jun Liu, Ziyue Luo, Chuan Wu, Qiankun Zhang, Hai Jin. "Expediting Distributed GNN Training with Feature-only Partition and Optimized Communication Planning," in IEEE INFOCOM, Vancouver, Canada, May 20-23, 2024.
Borui Wan, Juntao Zhao, Chuan Wu. "Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training," in the Sixth Conference on Machine Learning and Systems (MLSys), Miami, USA, May 4-8, 2023. [project code available]
Zhe Zhang, Ziyue Luo, Chuan Wu. "Two-level Graph Caching for Expediting Distributed GNN Training," in IEEE INFOCOM, New York, USA, May 17-20, 2023.
Tianfeng Liu*, Yangrui Chen*, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo. "BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing," in the 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Boston, MA, USA, April 17-19, 2023. (*: co-first authors)
Ziyue Luo, Yixin Bao, Chuan Wu. "Optimizing Task Placement and Online Scheduling for Distributed GNN Training Acceleration," in IEEE INFOCOM, online, May 2-5, 2022.
◼ AI Cluster Schedulers
Mengfan Liu, Wei Wang, Chuan Wu. "Optimizing Distributed Deployment of Mixture-of-Experts Model Inference in Serverless Computing," in IEEE INFOCOM, London, United Kingdom, May 19-22, 2025.
Xiaoyang Zhao, Siran Yang, Jiamang Wang, Lansong Diao, Lin Qu, Chuan Wu. "FaPES: Enabling Efficient Elastic Scaling for Serverless Machine Learning Platforms," in the 15th ACM Symposium on Cloud Computing (SOCC), Redmond, Washington, USA, November 20-22, 2024.
Xiaoyang Zhao, Chuan Wu, Xia Zhu. "Dynamic Flow Scheduling for DNN Training Workloads in Data Centers," to appear in IEEE Transactions on Network and Service Management. [project code available]
Xiaoyang Zhao, Chuan Wu. "Large-scale Machine Learning Cluster Scheduling via Multi-agent Graph Reinforcement Learning," in IEEE Transactions on Network and Service Management, vol. 19, No. 4, pp. 4962-4974, December 2022.
Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, Chen Meng, Wei Lin. "DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters," in IEEE Transactions on Parallel and Distributed Systems, vol. 32, No. 8, pp. 1947-1960, August 2021. [project code available]
Yixin Bao, Yanghua Peng, Yangrui Chen, Chuan Wu. "Preemptive All-reduce Scheduling for Expediting Distributed DNN Training," in IEEE INFOCOM, Toronto, Canada, July 6-9, 2020.
Yixin Bao, Yanghua Peng, Chuan Wu. "Deep Learning-based Job Placement in Distributed Machine Learning Clusters," in IEEE INFOCOM, Paris, France, April 29-May 2, 2019.
Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, Chuanxiong Guo. "Optimus: An Efficient Dynamic Resource Scheduler for Deep Learning Clusters," in EuroSys 2018, Porto, Portugal, April 23-26, 2018. [project code available]
Yixin Bao, Yanghua Peng, Chuan Wu, Zongpeng Li. "Online Job Scheduling in Distributed Machine Learning Clusters," in IEEE INFOCOM, Honolulu, HI, USA, April 15-19, 2018.
Smart Elderly/Health Care Technologies
We are actively working on a number of machine learning/AI technologies for smart elderly care systems, including elderly walking support, safe-living activity monitoring, and psychotherapy.
Chongyu Zhao, Lingyu Guo, Rongwei Wen, Yanrui Wang, Chuan Wu. "Depth-Temporal Attention with Dual Modality Data for Walking Intention Prediction in Close-Proximity Front-Following," in the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, USA, May 19-23, 2025.
Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu. "A Cognitive Stimulation Therapy Dialogue System with Multi-Source Knowledge Fusion for Elders with Cognitive Impairment," in the 61st Annual Meeting of the Association for Computational Linguistics (ACL), Toronto, Canada, July 9-14, 2023.
Chongyu Zhao, Wenzhi Guo, Rongwei Wen, Zheng Wang, Chuan Wu. "Deep Learning-driven Front-Following within Close Proximity: a Hands-Free Control Model on a Smart Walker," in the 2022 IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, USA, May 23-27, 2022.
Le Fang, Yu Wu, Chuan Wu, Yizhou Yu. "A Non-Intrusive Elderly Home Monitoring System," in IEEE Internet of Things Journal, vol. 8, No. 4, pp. 2603-2614, February 2021.
Xiaoyang Zhao, Zhi Zhu, MingShan Liu, Chongyu Zhao, Yafei Zhao, Jia Pan, Zheng Wang, Chuan Wu. "A Smart Robotic Walker with Intelligent Close-proximity Interaction Capabilities for Elderly Mobility Safety," in Frontiers in Neurorobotics, October 2020.
Resource Scheduling, Scaling, Pricing, and Data Migration in Cloud Data Centers or Geo-distributed Clouds
We have studied cloud resource scheduling, scaling and pricing in various cloud systems. We design online and machine learning algorithms to achieve long-term optimal operation of the systems.
Bingqian Du, Chuan Wu, Zhiyi Huang. "Learning Resource Allocation and Pricing for Cloud Profit Maximization," in the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, USA, January 27-February 1, 2019.
Xiaoxi Zhang, Chuan Wu, Zhiyi Huang, Zongpeng Li. "Occupation-Oblivious Pricing of Cloud Jobs via Online Learning," in the Proceedings of IEEE INFOCOM, Honolulu, HI, USA, April 15-19, 2018.
Ruiting Zhou, Zongpeng Li, Chuan Wu, Zhiyi Huang. "An Efficient Cloud Market Mechanism for Computing Jobs with Soft Deadlines," in IEEE/ACM Transactions on Networking, vol. 25, No. 2, pp. 793-805, April 2017.
Weijie Shi, Chuan Wu, Zongpeng Li. "A Shapley-value Mechanism for Bandwidth On Demand between Datacenters," to appear in IEEE Transactions on Cloud Computing.
Hongxing Li, Chuan Wu, Zongpeng Li, Francis C.M. Lau. "Virtual Machine Trading in a Federation of Clouds: Individual Profit and Social Welfare Maximization," in IEEE/ACM Transactions on Networking, 2016.
Yu Wu, Zhizhong Zhang, Chuan Wu, Chuanxiong Guo, Zongpeng Li, Francis C.M. Lau. "Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters," in IEEE Transactions on Cloud Computing, 2017.
Weijie Shi, Chuan Wu, Zongpeng Li. "An Online Mechanism for Dynamic Virtual Cluster Provisioning in Geo-Distributed Clouds," in IEEE INFOCOM 2016.
Shengkai Shi, Chuan Wu, Zongpeng Li. "Cost-Minimizing Online VM Purchasing for Application Service Providers with Arbitrary Demands," in IEEE Cloud 2015.
Xiaoxi Zhang, Zhiyi Huang, Chuan Wu, Zongpeng Li, Francis C.M. Lau. "Online Auctions in IaaS Clouds: Welfare and Profit Maximization with Server Costs," in ACM SIGMETRICS 2015.
Xiaoxi Zhang, Chuan Wu, Zongpeng Li, Francis C.M. Lau. "A Truthful (1 - epsilon)-Optimal Mechanism for On-demand Cloud Resource Provisioning," in IEEE INFOCOM 2015.
Weijie Shi, Linquan Zhang, Chuan Wu, Zongpeng Li, Francis C.M. Lau. "An Online Auction Framework for Dynamic Resource Provisioning in Cloud Computing," in ACM SIGMETRICS 2014.
Linquan Zhang, Zongpeng Li, Chuan Wu. "Dynamic Resource Provisioning in Cloud Computing: A Randomized Auction Approach," in IEEE INFOCOM 2014.
Weijie Shi, Chuan Wu, Zongpeng Li. "RSMOA: A Revenue and Social Welfare Maximizing Online Auction for Dynamic Cloud Resource Provisioning," in ACM/IEEE IWQoS 2014.
Xuanjia Qiu, Chuan Wu, Hongxing Li, Zongpeng Li, Francis C.M. Lau. "Federated Private Clouds via Broker's Marketplace: A Stackelberg-Game Perspective," in IEEE Cloud 2014.
Jian Zhao, Hongxing Li, Chuan Wu, Zongpeng Li, Zhizhong Zhang, Francis C.M. Lau. "Dynamic Pricing and Profit Maximization for the Cloud with Geo-distributed Data Centers," in IEEE INFOCOM 2014.
Linquan Zhang, Zongpeng Li, Chuan Wu, Minghua Chen. "Online Algorithms for Uploading Deferrable Big Data to The Cloud," in IEEE INFOCOM 2014.
Linquan Zhang, Chuan Wu, Zongpeng Li, Chuanxiong Guo, Minghua Chen, Francis C.M. Lau. "Moving Big Data to The Cloud: An Online Cost-Minimizing Approach," in IEEE JSAC Special Issue on Networking Challenges in Cloud Computing Systems and Applications, 2013.
Xuanjia Qiu, Wai Leong Yeow, Chuan Wu, Francis C.M. Lau. "Cost-Minimizing Preemptive Scheduling of MapReduce Workloads on Hybrid Clouds," in ACM/IEEE IWQoS 2013.