Apollo: Rapidly Picking the Optimal Cloud Configurations for Big Data Analytics Using a Data-Driven Approach

Yue-Wen Wu; Yuan-Jia Xu; Heng Wu; Lin-Gang Su; Wen-Bo Zhang; Hua Zhong

doi:10.1007/s11390-021-0232-4

Yue-Wen Wu, Yuan-Jia Xu, Heng Wu, Lin-Gang Su, Wen-Bo Zhang, Hua Zhong. Apollo: Rapidly Picking the Optimal Cloud Configurations for Big Data Analytics Using a Data-Driven Approach[J]. Journal of Computer Science and Technology, 2021, 36(5): 1184-1199. DOI: 10.1007/s11390-021-0232-4


Abstract

Big data analytics applications are increasingly deployed on cloud computing infrastructures, yet picking the optimal cloud configurations in a cost-effective way remains a major challenge. In this paper, we address this problem with high accuracy and low overhead. We propose Apollo, a data-driven approach that rapidly picks the optimal cloud configurations by reusing data from similar workloads. We first classify 12 typical workloads in BigDataBench by characterizing pairwise correlations in our offline benchmarks. When a new workload arrives, we run it with several small datasets to rank its key characteristics and identify its similar workloads. Based on this ranking, we then limit the search space of cloud configurations through a classification mechanism. Finally, we leverage a hierarchical regression model to measure which cluster is more suitable and use a local search strategy to pick the optimal cloud configurations within a few extra tests. Our evaluation on 12 typical workloads in HiBench shows that, compared with state-of-the-art approaches, Apollo improves search accuracy by up to 30% while reducing the overhead of picking the optimal cloud configurations by as much as 50%.
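The step of finding similar workloads through pairwise correlations can be illustrated with a minimal sketch. The abstract does not specify which characteristics are measured or which correlation measure is used, so everything here is an assumption made for illustration: the Pearson coefficient, the four-element utilization profiles, the numeric values, and the function names `pearson` and `most_similar` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def most_similar(new_profile, known_profiles, k=2):
    """Return the names of the k known workloads best correlated with new_profile."""
    ranked = sorted(
        known_profiles,
        key=lambda name: pearson(new_profile, known_profiles[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical profiles: (CPU, memory, disk I/O, network) utilization,
# as might be collected from offline benchmark runs. Values are made up.
known = {
    "sort": [0.30, 0.50, 0.90, 0.60],
    "kmeans": [0.90, 0.60, 0.20, 0.10],
    "pagerank": [0.70, 0.80, 0.30, 0.40],
}

# Hypothetical profile of a new workload, measured on small datasets.
new = [0.85, 0.65, 0.25, 0.15]

print(most_similar(new, known, k=2))
```

In a scheme like this, the configurations that performed well for the most similar known workloads would be the natural starting point for narrowing the search space, which is the role the abstract assigns to the classification mechanism.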
