Capacity Planning for Dependable Services

Rasha Faqeh, André Martin, Valerio Schiavoni, Pramod Bhatotia, Pascal Felber, Christof Fetzer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Fault-tolerance techniques depend on replication to enhance availability, albeit at increased infrastructure cost. This results in a fundamental trade-off: fault-tolerant services must satisfy given availability and performance constraints while minimising the number of replicated resources. These constraints pose capacity planning challenges for service operators, who must minimise replication costs without negatively impacting availability. To this end, we present PCRAFT, a practical system to enable capacity planning of dependable services. PCRAFT’s capacity planning is based on a hybrid approach that combines empirical performance measurements with probabilistic modelling of availability based on fault injection. In particular, we integrate traditional service-level availability mechanisms (active route anywhere and passive failover) and deployment schemes (cloud and on-premises) to quantify the number of nodes needed to satisfy the given availability and performance constraints. Our evaluation based on real-world applications shows that cloud deployments require fewer nodes than on-premises deployments. Additionally, for on-premises deployments, we show that passive failover requires fewer nodes than active route anywhere. Furthermore, our evaluation quantifies the quality enhancement provided by additional integrity mechanisms and how this affects the number of nodes needed.
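
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not PCRAFT's model: it assumes independent node failures, a single per-node availability figure, and a service that counts as available whenever enough nodes are up to carry the peak load. The function names and all numbers (peak_load, per_node_throughput, node_availability, target) are hypothetical and chosen only for illustration.

```python
from math import ceil, comb


def service_availability(n, k, node_availability):
    """Probability that at least k of n nodes are up.

    Assumption: nodes fail independently with identical availability.
    """
    a = node_availability
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


def nodes_needed(peak_load, per_node_throughput, node_availability, target,
                 max_nodes=1000):
    """Smallest node count meeting both the performance and availability goals.

    k is the number of nodes required to carry the peak load (performance
    constraint); nodes are then added until the probability of having at
    least k nodes up reaches the availability target.
    """
    k = ceil(peak_load / per_node_throughput)
    for n in range(k, max_nodes + 1):
        if service_availability(n, k, node_availability) >= target:
            return n
    raise ValueError("target not reachable within max_nodes")


if __name__ == "__main__":
    # Hypothetical inputs: 1000 req/s peak, 400 req/s per node,
    # 99% per-node availability, 99.9% service availability target.
    print(nodes_needed(1000, 400, 0.99, 0.999))
```

With these hypothetical inputs, three nodes are needed for performance alone, and a fourth is needed to reach the 99.9% availability target. In the approach the abstract describes, the per-node throughput would come from empirical measurements and the availability figures from fault injection rather than from fixed constants.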

Original language: English
Title of host publication: Stabilization, Safety, and Security of Distributed Systems - 24th International Symposium, SSS 2022, Proceedings
Editors: Stéphane Devismes, Franck Petit, Karine Altisen, Giuseppe Antonio Di Luna, Antonio Fernandez Anta
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 222-238
Number of pages: 17
ISBN (Print): 9783031210167
State: Published - 2022
Event: 24th International Symposium on Stabilization, Safety, and Security of Distributed Systems, SSS 2022 - Clermont-Ferrand, France
Duration: 15 Nov 2022 → 17 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13751 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Symposium on Stabilization, Safety, and Security of Distributed Systems, SSS 2022
Country/Territory: France
City: Clermont-Ferrand
Period: 15/11/22 → 17/11/22
