TY - GEN
T1 - Capacity Planning for Dependable Services
AU - Faqeh, Rasha
AU - Martin, André
AU - Schiavoni, Valerio
AU - Bhatotia, Pramod
AU - Felber, Pascal
AU - Fetzer, Christof
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Fault-tolerance techniques depend on replication to enhance availability, albeit at increased infrastructure cost. This results in a fundamental trade-off: fault-tolerant services must satisfy given availability and performance constraints while minimising the number of replicated resources. These constraints pose capacity planning challenges for service operators, who must minimise replication costs without negatively impacting availability. To this end, we present PCRAFT, a practical system to enable capacity planning of dependable services. PCRAFT’s capacity planning is based on a hybrid approach that combines empirical performance measurements with probabilistic modelling of availability based on fault injection. In particular, we integrate traditional service-level availability mechanisms (active route anywhere and passive failover) and deployment schemes (cloud and on-premises) to quantify the number of nodes needed to satisfy the given availability and performance constraints. Our evaluation based on real-world applications shows that cloud deployments require fewer nodes than on-premises deployments. Additionally, for on-premises deployments, we show that passive failover requires fewer nodes than active route anywhere. Furthermore, our evaluation quantifies the quality enhancement provided by additional integrity mechanisms and how it affects the number of nodes needed.
AB - Fault-tolerance techniques depend on replication to enhance availability, albeit at increased infrastructure cost. This results in a fundamental trade-off: fault-tolerant services must satisfy given availability and performance constraints while minimising the number of replicated resources. These constraints pose capacity planning challenges for service operators, who must minimise replication costs without negatively impacting availability. To this end, we present PCRAFT, a practical system to enable capacity planning of dependable services. PCRAFT’s capacity planning is based on a hybrid approach that combines empirical performance measurements with probabilistic modelling of availability based on fault injection. In particular, we integrate traditional service-level availability mechanisms (active route anywhere and passive failover) and deployment schemes (cloud and on-premises) to quantify the number of nodes needed to satisfy the given availability and performance constraints. Our evaluation based on real-world applications shows that cloud deployments require fewer nodes than on-premises deployments. Additionally, for on-premises deployments, we show that passive failover requires fewer nodes than active route anywhere. Furthermore, our evaluation quantifies the quality enhancement provided by additional integrity mechanisms and how it affects the number of nodes needed.
UR - http://www.scopus.com/inward/record.url?scp=85142762381&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-21017-4_15
DO - 10.1007/978-3-031-21017-4_15
M3 - Conference contribution
AN - SCOPUS:85142762381
SN - 9783031210167
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 222
EP - 238
BT - Stabilization, Safety, and Security of Distributed Systems - 24th International Symposium, SSS 2022, Proceedings
A2 - Devismes, Stéphane
A2 - Petit, Franck
A2 - Altisen, Karine
A2 - Di Luna, Giuseppe Antonio
A2 - Fernandez Anta, Antonio
PB - Springer Science and Business Media Deutschland GmbH
T2 - 24th International Symposium on Stabilization, Safety, and Security of Distributed Systems, SSS 2022
Y2 - 15 November 2022 through 17 November 2022
ER -