TY - GEN
T1 - Lifting the Fog of Uncertainties
T2 - 14th ACM Symposium on Cloud Computing, SoCC 2023
AU - Zhang, Yuqiu
AU - Zhang, Tongkun
AU - Zhang, Gengrui
AU - Jacobsen, Hans-Arno
N1 - Publisher Copyright: © 2023 Copyright held by the owner/author(s).
PY - 2023/10/30
Y1 - 2023/10/30
N2 - The advances in virtualization technologies have sparked a growing transition from virtual machine (VM)-based to container-based infrastructure for cloud computing. From the resource orchestration perspective, containers’ lightweight and highly configurable nature not only enables opportunities for more optimized strategies, but also poses greater challenges due to additional uncertainties and a larger configuration parameter search space. Towards this end, we propose Drone, a resource orchestration framework that adaptively configures resource parameters to improve application performance and reduce operational cost in the presence of cloud uncertainties. Built on Contextual Bandit techniques, Drone is able to achieve a balance between performance and resource cost on public clouds, and optimize performance on private clouds where a hard resource constraint is present. We show that our algorithms can achieve sub-linear growth in cumulative regret, a theoretically sound convergence guarantee, and our extensive experiments show that Drone achieves an up to 45% performance improvement and a 20% resource footprint reduction across batch processing jobs and microservice workloads.
UR - http://www.scopus.com/inward/record.url?scp=85178519554&partnerID=8YFLogxK
U2 - 10.1145/3620678.3624646
DO - 10.1145/3620678.3624646
M3 - Conference contribution
AN - SCOPUS:85178519554
T3 - SoCC 2023 - Proceedings of the 2023 ACM Symposium on Cloud Computing
SP - 48
EP - 64
BT - SoCC 2023 - Proceedings of the 2023 ACM Symposium on Cloud Computing
PB - Association for Computing Machinery, Inc
Y2 - 30 October 2023 through 1 November 2023
ER -