Tuning Memory Use in ehcache

In ehcache-1.2 the DiskStore's asynchronous spooling was reworked and made much faster. It is now possible to fill the spool very quickly. This gives great cache performance but creates a new problem: temporary memory consumption in the spool thread.

The problem arises because data now hits the spool so fast that the number of elements held in the MemoryStore and the DiskStore spool combined can easily exceed the maximum MemoryStore size. You can get memory spikes. Ultimately, I might introduce soft references, so that elements in the spool can simply be reclaimed. This would need to be a configuration option, because some apps are a bit fussy about their cache elements up and disappearing.

The memory spikes, if they are high enough, can cause OutOfMemoryErrors. For some reason, these would often occur in the flushSpool() method. Ehcache 1.2.0_01 and 1.2.1 contain hardening measures so that these do not cause trouble. If one occurs, the section of the spool that is being written is discarded, so behaviour degrades to something similar to a SoftReference solution. But it would be better to minimise the occurrence of this behaviour.
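
The hardening idea described above can be sketched roughly as follows. This is not ehcache's actual flushSpool() code: HardenedFlushSketch, ElementWriter and flushBatch are hypothetical names invented for illustration, and the sketch assumes the batch being written is a plain list.

```java
import java.util.List;

// Hypothetical sketch of "discard the section being written on OutOfMemoryError".
// None of these names come from ehcache itself.
class HardenedFlushSketch {

    /** Hypothetical persistence hook; stands in for the real disk write. */
    interface ElementWriter {
        void write(Object element);
    }

    /**
     * Writes the batch element by element and returns how many were written.
     * If an OutOfMemoryError occurs, the unwritten remainder is dropped.
     */
    static int flushBatch(List<Object> batch, ElementWriter writer) {
        int written = 0;
        try {
            for (Object element : batch) {
                writer.write(element);
                written++;
            }
        } catch (OutOfMemoryError e) {
            // Assumption: losing cache entries is acceptable, crashing the spool thread is not.
        } finally {
            // Release every reference held by the batch, written or not.
            batch.clear();
        }
        return written;
    }
}
```

The effect is that elements in the failed section are simply lost from the cache, which is why this resembles what soft references would give.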

This article documents some investigations I have made and the solution I arrived at. The test involves creating 5500 objects of 10000 bytes each and putting them in a cache. These go into the MemoryStore and overflow to the DiskStore immediately. The spool thread then takes care of writing them. Memory use spikes to about 55MB and then drops. The disk store was modified to call System.gc() after every 700 Elements spooled.

Figure 1 shows the original implementation. The System.gc() calls have little effect. The memory cannot be released until the flushSpool() method completes. Though not shown, forcing a gc in the profiler after the elements have been written returns memory use to about 14MB.

Figure 1: Old Disk Store

Figure 2 shows the memory profile after one small change. As the flushSpool() method iterates through the elements to persist them, it sets each one to null in the array holding references to them, after it is written. The System.gc() call every 700 elements now actually reclaims memory.

Figure 2: Old Disk Store, with array elements dereferenced after use
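
The dereferencing change can be illustrated with a minimal sketch. It is not the real flushSpool() implementation: SpoolFlushSketch and flush are hypothetical names, and plain Java serialization to an OutputStream stands in for the DiskStore write.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the spool flush; not ehcache's actual code.
class SpoolFlushSketch {

    /** Serializes each spooled element to out, then clears its array slot. Returns the count written. */
    static int flush(Serializable[] spooled, OutputStream out) throws IOException {
        int written = 0;
        for (int i = 0; i < spooled.length; i++) {
            if (spooled[i] == null) {
                continue; // nothing in this slot
            }
            ObjectOutputStream oos = new ObjectOutputStream(out);
            oos.writeObject(spooled[i]);
            oos.flush();
            // Drop the array's reference right away so the element can be
            // collected before the method returns, not only afterwards.
            spooled[i] = null;
            written++;
        }
        return written;
    }
}
```

Without the `spooled[i] = null` line, the array keeps every element reachable until the whole flush finishes, which matches the flat profile in Figure 1.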

Figure 3 shows an implementation which uses the technique from Figure 2 plus two more:

  1. Split the work across 5 methods. The theory here is that some JDK implementations do not actually reclaim memory for references that are dereferenced in the same method. They wait for the method to return first.
  2. ByteArrayOutputStream creates lots of temporary byte[] arrays using System.arraycopy(). It starts with 32 bytes, and then shifts the size left by one bit (doubling it) each time more memory is required. So the sequence is:

    32
    64
    128
    256
    512
    1024
    2048
    4096
    8192
    16384
    32768
    65536

    For our 10000 byte objects, 10 byte arrays are created by ByteArrayOutputStream as the stream is written, totalling 32736 bytes. We create a subclass that uses as its starting point the average size of each entry in the DiskStore.
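
The buffer arithmetic and the presizing idea can be sketched as below. The doubling model assumes data arrives in small writes so that every intermediate size is actually allocated; it models the growth rather than measuring a particular JDK. BufferSizingSketch, doublingSequence and PresizedByteArrayOutputStream are hypothetical names, not the classes used in ehcache.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative only.
class BufferSizingSketch {

    /** Capacities allocated when a 32-byte buffer doubles until it can hold payloadSize bytes. */
    static List<Integer> doublingSequence(int payloadSize) {
        List<Integer> sizes = new ArrayList<>();
        int capacity = 32; // default ByteArrayOutputStream capacity
        sizes.add(capacity);
        while (capacity < payloadSize) {
            capacity <<= 1; // shift left by one bit, i.e. double
            sizes.add(capacity);
        }
        return sizes;
    }

    /** Starts at the expected entry size so a typical write needs no regrowth. */
    static class PresizedByteArrayOutputStream extends ByteArrayOutputStream {
        PresizedByteArrayOutputStream(int averageEntrySize) {
            super(averageEntrySize);
        }

        /** Current size of the internal buffer (buf is a protected field of the superclass). */
        int capacity() {
            return buf.length;
        }
    }
}
```

Under this model, `doublingSequence(10000)` yields ten buffers from 32 up to 16384 bytes, 32736 bytes in total, whereas a stream presized to 10000 bytes allocates a single buffer for the same payload.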

This implementation’s memory use is much smoother.

Figure 3: New Disk Store

The new improved implementation will be released in ehcache-1.2.0_02 and ehcache-1.2.2.

By Greg Luck

As Terracotta’s CTO, Greg (@gregrluck) is entrusted with understanding market and technology forces and the business drivers that impact Terracotta’s product innovation and customer success. He helps shape company and technology strategy and designs many of the features in Terracotta’s products. Greg came to Terracotta on the acquisition of the popular caching project Ehcache which he founded in 2003. Prior to joining Terracotta, Greg served as Chief Architect at Australian online travel giant Wotif.com. He also served as a lead consultant for ThoughtWorks on accounts in the United States and Australia, was CIO at Virgin Blue, Tempo Services, Stamford Hotels and Resorts and Australian Resorts and spent seven years as a Chartered Accountant in KPMG’s small business and insolvency divisions. He is a regular speaker at conferences and contributor of articles to the technical press.

1 comment

  1. Sometimes standalone use of System.arraycopy to copy large byte arrays causes Out of Memory error.
