CarbonDIS: Carbon-Aware DNN Inference Scheduling on Heterogeneous GPUs
Document Type
Conference Proceeding
Date of Original Version
1-1-2025
Abstract
Deploying deep neural network (DNN) inference applications in data centers and clouds to power various services is increasingly common. The significant power consumption of these applications contributes to the carbon emissions of the underlying infrastructure; minimizing their emissions therefore reduces the carbon footprint of data centers and clouds. This paper introduces CarbonDIS, a carbon-aware inference scheduler designed to maintain the latency of DNN inference applications while minimizing their carbon emissions by leveraging heterogeneous GPUs. CarbonDIS targets an inference architecture in which high-end and low-end GPUs are used in tandem to serve inference jobs. By exploiting the varying computing capability and power consumption of heterogeneous GPUs, CarbonDIS effectively balances performance and carbon emissions. Evaluation using three types of GPUs, 10 DNN models, and real-world carbon intensity traces shows that CarbonDIS finds a trade-off between the carbon footprint and latency of inference jobs and reduces carbon emissions by 16% compared with a performance-centric approach.
Publication Title, e.g., Journal
Proceedings of the 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2025)
Citation/Publisher Attribution
Nabavinejad, Seyed Morteza, Weiwei Jia, Shubbhi Taneja, and Tian Guo. "CarbonDIS: Carbon-Aware DNN Inference Scheduling on Heterogeneous GPUs." Proceedings of the 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2025) (2025). doi: 10.1109/CCGRID64434.2025.00032.