Abstract
Many popular systems for processing “big data” are implemented in high-level programming languages with automatic memory management via garbage collection (GC). However, high object churn and large heap sizes put severe strain on the garbage collector. As a result, applications underperform significantly: GC increases the runtime of typical data processing tasks by up to 40%. We propose to use region-based memory management instead of GC in distributed data processing systems. In these systems, many objects have clearly defined lifetimes. Hence, it is natural to allocate these objects in fate-sharing regions, obviating the need to scan a large heap. Regions can be memory-safe and could be inferred automatically. Our initial results show that region-based memory management reduces emulated Naiad vertex runtime by 34% for typical data analytics jobs.
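The idea of fate-sharing regions can be illustrated with a minimal sketch. It assumes a fixed-size record layout and a per-batch region. The names `Region`, `alloc_record`, and `process_batch` are hypothetical and are not taken from the paper or from Naiad. Python is itself garbage-collected, so this only illustrates the lifetime discipline (bump allocation into one buffer, then a single bulk release) and is not a performance demonstration.

```python
import struct

# Assumed record layout for illustration: 64-bit int key, 64-bit float value.
RECORD = struct.Struct("<qd")


class Region:
    """A fixed-capacity arena whose records are all released together."""

    def __init__(self, capacity: int) -> None:
        self._buf = bytearray(capacity)  # one backing allocation for the region
        self._top = 0                    # bump pointer: next free byte offset
        self.live = True

    def alloc_record(self, key: int, value: float) -> int:
        """Pack one record into the region and return its byte offset."""
        if not self.live:
            raise RuntimeError("region already freed")
        if self._top + RECORD.size > len(self._buf):
            raise MemoryError("region exhausted")
        offset = self._top
        RECORD.pack_into(self._buf, offset, key, value)
        self._top += RECORD.size
        return offset

    def read_record(self, offset: int):
        """Return the (key, value) tuple stored at the given offset."""
        if not self.live:
            raise RuntimeError("region already freed")
        return RECORD.unpack_from(self._buf, offset)

    def free(self) -> None:
        """Release every record at once; no per-record tracing is needed."""
        self._buf = bytearray(0)
        self._top = 0
        self.live = False

    def __enter__(self) -> "Region":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.free()


def process_batch(batch):
    """Sum values per key, keeping all per-batch records in one region."""
    totals = {}
    with Region(capacity=RECORD.size * len(batch)) as region:
        offsets = [region.alloc_record(k, v) for k, v in batch]
        for off in offsets:
            key, value = region.read_record(off)
            totals[key] = totals.get(key, 0.0) + value
    # The region has been freed here; only the small result outlives it.
    return totals
```

In this sketch, the temporary records share the lifetime of the batch that created them, which is the property the abstract relies on: when the batch finishes, the whole region is dropped in one step.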
| Original language | English |
|---|---|
| State | Published - 2015 |
| Externally published | Yes |
| Event | 15th Workshop on Hot Topics in Operating Systems, HotOS 2015 - Warth-Weiningen, Switzerland; Duration: 18 May 2015 → 20 May 2015 |
Conference
| Conference | 15th Workshop on Hot Topics in Operating Systems, HotOS 2015 |
|---|---|
| Country/Territory | Switzerland |
| City | Warth-Weiningen |
| Period | 18/05/15 → 20/05/15 |
Fingerprint
Dive into the research topics of 'Broom: Sweeping out garbage collection from big data systems'. Together they form a unique fingerprint.