
Broom: Sweeping out garbage collection from big data systems

  • Ionel Gog
  • Jana Giceva
  • Malte Schwarzkopf
  • Kapil Vaswani
  • Dimitrios Vytiniotis
  • Ganesan Ramalingam
  • Manuel Costa
  • Derek Murray
  • Steven Hand
  • Michael Isard
  • University of Cambridge
  • ETH Zurich
  • Microsoft Research
  • Google Inc
  • Unaffiliated

Research output: Contribution to conference, Paper, peer-review

77 Scopus citations

Abstract

Many popular systems for processing “big data” are implemented in high-level programming languages with automatic memory management via garbage collection (GC). However, high object churn and large heap sizes put severe strain on the garbage collector. As a result, applications underperform significantly: GC increases the runtime of typical data processing tasks by up to 40%. We propose to use region-based memory management instead of GC in distributed data processing systems. In these systems, many objects have clearly defined lifetimes. Hence, it is natural to allocate these objects in fate-sharing regions, obviating the need to scan a large heap. Regions can be memory-safe and could be inferred automatically. Our initial results show that region-based memory management reduces emulated Naiad vertex runtime by 34% for typical data analytics jobs.
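The idea of a fate-sharing region can be illustrated with a minimal sketch. The code below is not taken from the paper: the `Region` class, the `RECORD` layout, and `process_batch` are hypothetical names chosen for illustration. It assumes that all temporaries created while processing one batch die together, so they can live in a single preallocated buffer that is released in one step instead of being traced individually.

```python
import struct


class Region:
    """Hypothetical fate-sharing region: a bump allocator over one buffer.

    Every record placed in the region is released together by reset(),
    so no per-record tracing or freeing is needed.
    """

    def __init__(self, capacity_bytes):
        self._buf = bytearray(capacity_bytes)
        self._top = 0  # offset of the next free byte

    def alloc(self, fmt, *values):
        """Pack values into the region and return the record's offset."""
        size = struct.calcsize(fmt)
        if self._top + size > len(self._buf):
            raise MemoryError("region exhausted")
        offset = self._top
        struct.pack_into(fmt, self._buf, offset, *values)
        self._top += size
        return offset

    def read(self, fmt, offset):
        """Unpack the record stored at the given offset."""
        return struct.unpack_from(fmt, self._buf, offset)

    def used(self):
        """Return the number of bytes currently allocated."""
        return self._top

    def reset(self):
        """Free every record at once by rewinding the bump pointer."""
        self._top = 0


# Assumed record layout: 64-bit integer key, 64-bit float value.
RECORD = "<qd"


def process_batch(region, batch):
    """Sum the values of one batch using region-allocated temporaries."""
    offsets = [region.alloc(RECORD, key, value) for key, value in batch]
    total = sum(region.read(RECORD, off)[1] for off in offsets)
    region.reset()  # the temporaries share the batch's lifetime
    return total
```

In this sketch the cost of releasing a batch's temporaries is a single pointer rewind, independent of how many records were allocated. Python still manages the surrounding objects with its own collector, so this only emulates the allocation pattern and says nothing about the performance figures reported in the abstract.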

Original language: English
State: Published - 2015
Externally published: Yes
Event: 15th Workshop on Hot Topics in Operating Systems, HotOS 2015 - Warth-Weiningen, Switzerland
Duration: 18 May 2015 to 20 May 2015

Conference

Conference: 15th Workshop on Hot Topics in Operating Systems, HotOS 2015
Country/Territory: Switzerland
City: Warth-Weiningen
Period: 18/05/15 to 20/05/15
