TY - GEN
T1 - Scalable rollback for cloud operations using AI planning
AU - Satyal, Suhrid
AU - Weber, Ingo
AU - Bass, Len
AU - Fu, Min
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015
Y1 - 2015
N2 - Human-induced faults play a large role in systems reliability. In cloud platforms, system administrators may inadvertently make catastrophic mistakes, such as deleting a virtual disk with important data. Providing rollback for cloud operations can reduce the severity and impact of such mistakes by allowing administrators to revert to a known good state. In this paper, we present a scalable approach to rolling back operations that change the state of a system on proprietary cloud platforms. In our previous work, we provided a system that augments cloud APIs and provides a rollback operation using an AI planner. However, the previous system eventually suffers from the exponential complexity inherent to AI planning tasks. In this paper, we divide and parallelize rollback plan generation, based on characteristics unique to the rollback scenario. Through experimental evaluation, we show that this approach scales better than the previous, naïve approach and effectively avoids the exponential behavior.
KW - AI planning
KW - Cloud computing
KW - Reliability
KW - Web service
UR - http://www.scopus.com/inward/record.url?scp=85078511953&partnerID=8YFLogxK
U2 - 10.1109/ASWEC.2015.34
DO - 10.1109/ASWEC.2015.34
M3 - Conference contribution
AN - SCOPUS:85078511953
T3 - Proceedings - 2015 24th Australasian Software Engineering Conference, ASWEC 2015
SP - 195
EP - 202
BT - Proceedings - 2015 24th Australasian Software Engineering Conference, ASWEC 2015
PB - Institute of Electrical and Electronics Engineers Inc.
ER -