On Observability and Monitoring of Distributed Systems – An Industry Interview Study

Sina Niedermaier, Falko Koetter, Andreas Freymann, Stefan Wagner

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

32 Scopus citations

Abstract

Business success of companies heavily depends on the availability and performance of their client applications. Due to modern development paradigms such as DevOps and microservice architectural styles, applications are decoupled into services with complex interactions and dependencies. Although these paradigms enable individual development cycles with reduced delivery times, they cause several challenges to manage the services in distributed systems. One major challenge is to observe and monitor such distributed systems. This paper provides a qualitative study to understand the challenges and good practices in the field of observability and monitoring of distributed systems. In 28 semi-structured interviews with software professionals we discovered increasing complexity and dynamics in that field. Especially observability becomes an essential prerequisite to ensure stable services and further development of client applications. However, the participants mentioned a discrepancy in the awareness regarding the importance of the topic, both from the management as well as from the developer perspective. Besides technical challenges, we identified a strong need for an organizational concept including strategy, roles and responsibilities. Our results support practitioners in developing and implementing systematic observability and monitoring for distributed systems.

Original language: English
Title of host publication: Service-Oriented Computing - 17th International Conference, ICSOC 2019, Proceedings
Editors: Sami Yangui, Khalil Drira, Ismael Bouassida Rodriguez, Zahir Tari
Publisher: Springer
Pages: 36-52
Number of pages: 17
ISBN (Print): 9783030337018
State: Published - 2019
Externally published: Yes
Event: 17th International Conference on Service-Oriented Computing, ICSOC 2019 - Toulouse, France
Duration: 28 Oct 2019 to 31 Oct 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11895 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Service-Oriented Computing, ICSOC 2019
Country/Territory: France
City: Toulouse
Period: 28/10/19 to 31/10/19

Keywords

  • Cloud
  • Distributed systems
  • Industry
  • Monitoring
  • Observability
