An Analysis of the Mismatch Between Garbage Collection Evolution and Datacenter Paradigm Shifting
Abstract
Managed languages such as Java are widely used in cloud application development because of their portability, scalability, and automatic memory management, i.e., garbage collection (GC). However, managed applications exhibit a series of performance and latency issues when running on emerging memory-disaggregated servers, mainly because of a mismatch between the design of GC and the hierarchical memory structure. This mismatch has widened as GC has evolved: newly proposed collectors keep pursuing lower pause times and higher reclamation throughput without being aware of the paradigm shift in the underlying memory hierarchy. In this work, we perform an extensive study on a wide spectrum of cloud applications to understand the sources of the performance penalties: the poor data locality of GC, the interference between GC tasks and mutators, heuristic GC policies that fail because they overlook the underlying memory hierarchy, and the lack of cross-layer co-design. Based on these observations, we demonstrate that an existing GC can achieve performance similar to that of a GC designed for disaggregated memory merely by tuning its policies to be aware of the memory layout, while showing much better robustness. Developers can benefit from our research by tuning their existing production GC, thereby avoiding the time needed to develop and debug a dedicated collector for the new hardware.