Java SE 6 introduced several improvements that further automate garbage collection (GC), the process that frees unused memory and thereby helps reduce the amount of memory an application requires. Although setting up GC has become more automated, programmers still need to understand how the different GC algorithms work when problems emerge.
GC works by examining the objects in memory to find those that the program no longer references. These unused objects can be deleted to make room for new objects. But scanning and deleting can create pauses in the application, which can be an issue for programs with large amounts of data, multiple threads, and high transaction rates.
The two major measures of garbage collection performance are throughput, the percentage of total time not spent in garbage collection, and pause time, the length of the periods when the application is unresponsive because garbage collection is running.
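As a rough illustration of the throughput measure, the sketch below reads the accumulated collection time that the JVM exposes through the standard java.lang.management API and converts it into a percentage. The class and method names (GcThroughput, throughput, totalGcMillis) are invented for this example, and the reported collection time is only an approximation of the time the application was actually paused.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {
    // Fraction of elapsed time NOT spent in GC, in the range [0, 1].
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0; // nothing has run yet, so nothing was lost to GC
        }
        long clamped = Math.max(0L, Math.min(gcMillis, uptimeMillis));
        return 1.0 - (double) clamped / uptimeMillis;
    }

    // Sum of the approximate accumulated collection time of every collector.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the value is undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms, throughput %.2f%%%n",
                gc, uptime, 100.0 * throughput(gc, uptime));
    }
}
```

Pause length is not covered by this sketch; it is usually read from the GC log (for example with the -verbose:gc option) rather than computed from these totals.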
In order to reduce the length of pauses, Java GC divides memory into three generations called young, tenured, and permanent. New objects are allocated in the young generation, and most of them have a short life span. A quick sweep of the young generation (a minor collection) occurs frequently, and the objects that survive are eventually moved to the tenured generation. Collecting the tenured and permanent generations (a major collection) usually takes much longer. When garbage collection becomes a bottleneck, the programmer may need to customize the total heap size as well as the size of each generation.
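To see how a running VM actually lays out these generations, the following sketch lists the memory pools reported by the standard java.lang.management API. The class name MemoryPools and its describe method are invented for this example. Pool names depend on the VM version and the selected collector, so the output differs between machines; newer VMs (Java 8 and later) no longer have a permanent generation at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {
    // One line per memory pool: name, heap or non-heap type, and current usage.
    static List<String> describe() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
            lines.add(pool.getName() + " [" + pool.getType() + "] used=" + usedKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```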
The Java HotSpot VM includes three different collectors. The serial collector uses a single thread for GC and is best suited to single-processor machines with data sets smaller than about 100 MB. The parallel collector performs minor collections in parallel. It is ideally suited to medium to large data sets running on multi-threaded or multi-processor hardware. The concurrent collector has been optimized to keep garbage collection pauses short when response time is more important than throughput. This mode does not normally provide any benefit on a single-core machine.
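The selection guidance above can be sketched as a small decision function. The enum Collector, the method choose, and the jar name app.jar are hypothetical names used only for illustration. The flags are the HotSpot options of that era for the three collectors; the concurrent collector's flag has since been removed from recent JDKs. As a simplification, the sketch treats either a single processor or a small data set as sufficient reason to pick the serial collector.

```java
enum Collector {
    SERIAL("-XX:+UseSerialGC"),
    PARALLEL("-XX:+UseParallelGC"),
    CONCURRENT("-XX:+UseConcMarkSweepGC");

    final String flag; // command-line option that selects this collector

    Collector(String flag) {
        this.flag = flag;
    }

    // Rule of thumb only: a real choice should be confirmed by measurement.
    static Collector choose(int processors, long dataSetMb, boolean pausesMatterMost) {
        if (processors <= 1 || dataSetMb < 100) {
            return SERIAL; // small data set or nothing to run in parallel on
        }
        return pausesMatterMost ? CONCURRENT : PARALLEL;
    }
}

final class CollectorChoice {
    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Assumed workload: about 512 MB of live data, response time matters most.
        Collector collector = Collector.choose(cpus, 512, true);
        System.out.println("java " + collector.flag + " -jar app.jar");
    }
}
```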
J2SE 5.0 introduced the concept of ergonomics, in which the VM chooses the GC algorithm, the heap size, and the runtime compiler automatically, based on the class of machine on which the application runs. The developer no longer has to specify these settings by hand. This is generally an improvement, but the selected garbage collection algorithm is not always the best choice. It is recommended that developers start with these ergonomic adjustments as presented in “Ergonomics in the 5.0 Java Virtual Machine” before using more detailed controls.
The most important variable to tune is the total heap size. Large server applications often start slowly because the initial heap is small and must be resized repeatedly. Unless pauses are a problem, Sun recommends granting as much memory as possible to the VM.
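A quick way to check which heap limits a VM is actually running with is to query the standard Runtime class, as in the sketch below. The class name HeapReport and the helper toMb are invented for this example.

```java
final class HeapReport {
    // Converts a byte count to whole megabytes (rounding down).
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory: heap currently reserved by the VM; grows toward maxMemory.
        System.out.println("current heap: " + toMb(rt.totalMemory()) + " MB");
        // maxMemory: upper limit the VM will attempt to use (set with -Xmx).
        System.out.println("maximum heap: " + toMb(rt.maxMemory()) + " MB");
        System.out.println("free in current heap: " + toMb(rt.freeMemory()) + " MB");
    }
}
```

Running it as, for example, java -Xms512m -Xmx512m HeapReport sets the initial and maximum heap to the same value, which avoids the repeated resizing during startup described above. The 512 MB figure is only a placeholder.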
The second most significant variable is the proportion of the heap dedicated to the young generation. Increasing the share of the heap assigned to the young generation reduces the frequency of minor collections, but for a fixed total heap size it also shrinks the tenured generation, which increases the frequency of major collections.
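The trade-off can be made concrete with a little arithmetic. On HotSpot the proportion is commonly set with the -XX:NewRatio=n option, which makes the tenured generation n times the size of the young generation, so the young generation receives 1/(n+1) of the combined space. The sketch below works through a few values; the class name GenerationSizes and its methods are invented for this example, the 1024 MB heap is an assumed figure, and a real VM rounds the sizes for alignment.

```java
final class GenerationSizes {
    // Young generation size when tenured : young = newRatio : 1.
    static long youngMb(long heapMb, int newRatio) {
        if (newRatio < 1) {
            throw new IllegalArgumentException("newRatio must be >= 1");
        }
        return heapMb / (newRatio + 1);
    }

    // Whatever the young generation does not take is left for the tenured generation.
    static long tenuredMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 1024; // assumed combined size of young and tenured generations
        for (int ratio : new int[] {1, 2, 3, 8}) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngMb(heapMb, ratio) + " MB"
                    + " tenured=" + tenuredMb(heapMb, ratio) + " MB");
        }
    }
}
```

A smaller ratio gives the young generation more room at the direct expense of the tenured generation, which is the tension described above.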
The Java SE 6 Performance White Paper highlights some of the advances in Java SE 6, which include a parallel compaction collector, a concurrent low pause collector, and improvements to ergonomics. The parallel compaction collector improves GC performance on machines with multiple processors by performing major collections in parallel. Previously, major collections were performed using only a single thread.
The concurrent low pause collector improves the responsiveness of applications by doing most of its collection work while the application threads continue to run.