Duba: Cost-Efficient Serverless Cloud-Edge Collaborative ML Serving with Dual-Batching

Jian-Xiong Liao; Jing Peng; Zhi Zhou; Fei Xu

doi:10.1007/s11390-025-5623-5

Jian-Xiong Liao, Jing Peng, Zhi Zhou, Fei Xu. Duba: Cost-Efficient Serverless Cloud-Edge Collaborative ML Serving with Dual-Batching[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-025-5623-5

Abstract

The integration of edge and serverless cloud computing, which combines the low-latency advantages of edge processing with the cost efficiency and scalability of serverless cloud architectures, provides an ideal foundation for serving machine learning (ML) applications. While batching has demonstrated significant improvements in resource utilization through parallel execution, current approaches that independently optimize batching for edge or serverless cloud environments overlook their synergistic potential, leading to suboptimal end-to-end performance. To bridge this gap, we present Duba, a serverless cloud-edge collaborative system designed for cost-efficient ML serving. At its core, Duba introduces a novel dual-batching mechanism that harmonizes batching strategies across edge and serverless cloud environments. To realize this design, Duba combines lightweight configuration optimization with an adaptive scheduling policy, delivering substantial improvements in both cost efficiency and performance. Experimental results demonstrate that Duba consistently outperforms existing systems, reducing the serving costs by up to 74.1% and improving SLO compliance by over 6.9%.
