Broom: Sweeping out garbage collection from big data systems

Ionel Gog, Jana Giceva, Malte Schwarzkopf, Kapil Vaswani, Dimitrios Vytiniotis, Ganesan Ramalingam, Manuel Costa, Derek Murray, Steven Hand, Michael Isard

Research output: Contribution to conference › Paper › peer-review

71 Scopus citations

Abstract

Many popular systems for processing “big data” are implemented in high-level programming languages with automatic memory management via garbage collection (GC). However, high object churn and large heap sizes put severe strain on the garbage collector. As a result, applications underperform significantly: GC increases the runtime of typical data processing tasks by up to 40%. We propose to use region-based memory management instead of GC in distributed data processing systems. In these systems, many objects have clearly defined lifetimes. Hence, it is natural to allocate these objects in fate-sharing regions, obviating the need to scan a large heap. Regions can be memory-safe and could be inferred automatically. Our initial results show that region-based memory management reduces emulated Naiad vertex runtime by 34% for typical data analytics jobs.

Original language: English
State: Published - 2015
Externally published: Yes
Event: 15th Workshop on Hot Topics in Operating Systems, HotOS 2015 - Warth-Weiningen, Switzerland
Duration: 18 May 2015 → 20 May 2015

Conference

Conference: 15th Workshop on Hot Topics in Operating Systems, HotOS 2015
Country/Territory: Switzerland
City: Warth-Weiningen
Period: 18/05/15 → 20/05/15
