Ehcache 1.6: 2 Orders of Magnitude Faster

I have been waiting for enough people to move to Java 5 to mandate it as a minimum standard for ehcache. At JavaOne 2008 I found out that a lot of people were still to make the move. Now that we are in 2009 I have decided to move to Java 5. As part of this I have done a general cleanup of the core. I can now retire backport-concurrent which has served the project well (thanks guys) and other dependencies. Ehcache-1.6 core has no dependencies.

I decided that with the improvements in concurrency support that have come along, it was time to move beyond the use of synchronized. Years ago I adopted striped locking on BlockingCache, which gave amazing results, but I left the core pretty much as it was. The rework adopts some new goodness in Java 5 such as CopyOnWriteArrayList and ConcurrentHashMap. Having said that, there is nothing in Java 5 for eviction, so the new work relies heavily on some excellent contributions to provide the performance that caching applications need and that Java 5 alone does not offer.
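
Since Java 5 has no eviction support, eviction has to be layered on top of a concurrent map. One way to do that for LFU without maintaining any ordering is to sample a handful of entries and evict the least-used one in the sample. The following is a minimal sketch of that idea only, not Ehcache's actual code: the class name SampledLfuCache, its methods, and the capacity and sampleSize parameters are all hypothetical, and the sample is simply a run of entries starting at a random offset in the map's iteration order.

```java
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of sample-based LFU eviction; not an Ehcache class. */
final class SampledLfuCache<K, V> {

    /** A cached value together with its hit counter. */
    private static final class Counted<V> {
        final V value;
        final AtomicLong hits = new AtomicLong();
        Counted(V value) { this.value = value; }
    }

    private final ConcurrentHashMap<K, Counted<V>> map =
            new ConcurrentHashMap<K, Counted<V>>();
    private final Random random = new Random();
    private final int capacity;
    private final int sampleSize;

    SampledLfuCache(int capacity, int sampleSize) {
        this.capacity = capacity;
        this.sampleSize = sampleSize;
    }

    /** Returns the cached value, or null; each hit bumps the entry's counter. */
    V get(K key) {
        Counted<V> c = map.get(key);
        if (c == null) return null;
        c.hits.incrementAndGet();
        return c.value;
    }

    /** Stores the value and evicts one sampled entry if the cache is over capacity. */
    void put(K key, V value) {
        map.put(key, new Counted<V>(value));
        if (map.size() > capacity) evictOne(key);
    }

    int size() { return map.size(); }

    /** Looks at up to sampleSize entries and removes the one with the fewest hits. */
    private void evictOne(K justAdded) {
        // Start the sample at a random offset so the same entries are not always examined.
        int skip = random.nextInt(Math.max(1, map.size() - sampleSize));
        K victim = null;
        long fewest = Long.MAX_VALUE;
        int seen = 0;
        for (Map.Entry<K, Counted<V>> e : map.entrySet()) {
            if (skip > 0) { skip--; continue; }
            if (e.getKey().equals(justAdded)) continue; // never evict the new entry
            long hits = e.getValue().hits.get();
            if (hits < fewest) { fewest = hits; victim = e.getKey(); }
            if (++seen >= sampleSize) break;
        }
        if (victim != null) map.remove(victim);
    }
}
```

When the sample covers the whole cache this behaves as exact LFU; with a small sample it only approximates LFU, which is the trade made to avoid keeping entries ordered.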
On my own concurrency tests, which use 70 threads simulating a typical load against a single cache, I get the following improvements in ehcache-1.6 over ehcache-1.5. (Note that the 70 threads are just for that one cache. Ehcache typically has many caches, so this translates to a production system with thousands of threads against all caches.)
Operation    Number of Times Faster Than Ehcache-1.5.0
get          92.5
put          30
remove       48
removeAll    80
keySet       30
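
For readers who want to try something similar, here is a minimal sketch of a multi-threaded load test. It is not the test harness behind the numbers above: LoadSketch and run are hypothetical names, a plain ConcurrentHashMap stands in for a cache, and the workload (one put followed by one get per iteration) is an assumption rather than a "typical load".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical load-test sketch; ConcurrentMap is a stand-in for a real cache. */
final class LoadSketch {

    /** Runs the given number of workers against the map and returns elapsed nanoseconds. */
    static long run(final ConcurrentMap<Integer, Integer> cache,
                    int threads, final int opsPerThread) {
        final CountDownLatch start = new CountDownLatch(1);      // releases all workers at once
        final CountDownLatch done = new CountDownLatch(threads); // counts finished workers
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int t = 0; t < threads; t++) {
                final int base = t * opsPerThread; // each worker uses its own key range
                pool.execute(new Runnable() {
                    public void run() {
                        try {
                            start.await();
                            for (int i = 0; i < opsPerThread; i++) {
                                cache.put(base + i, i);
                                cache.get(base + i);
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            done.countDown();
                        }
                    }
                });
            }
            long begin = System.nanoTime();
            start.countDown();
            done.await();
            return System.nanoTime() - begin;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Integer> cache = new ConcurrentHashMap<Integer, Integer>();
        long nanos = run(cache, 70, 10000);
        System.out.println("70 threads finished in " + (nanos / 1000000) + " ms");
    }
}
```

Timings from a sketch like this vary a lot between runs and machines, so treat them as rough indications and compare implementations under the same conditions.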

Manik Surtani maintains a cache performance benchmark tool, to which I have added ehcache-1.6. The results show the performance increases in Ehcache-1.6 dramatically.
For those with less than perfect eyesight, the second column, which is too short to even have its time printed, is the ehcache-1.6 performance.
What these charts are saying is that an ehcache, with 25 concurrent threads, is now much faster than it was. The single-threaded case is no faster. But caches are not about single threads.
Now, in case everyone gets preoccupied with the comparisons between Java caches, here is an old Ehcache versus Memcached chart.
If I redid this chart using ehcache-1.6, the barely visible columns for ehcache would completely disappear on this scale.
So what is this really saying? An in-process cache, which uses a few tens of CPU operations to access data already held in memory, is much, much, much faster than going out over the network for some data, regardless of how slick the server implementation at the other end is.
But I recognise that Memcached is about a different type of caching: massive partitioned caches. The Ehcache project has the Ehcache Server for that, with RESTful and SOAP APIs. The RESTful implementation uses a variety of tricks such as conditional GET, the ability to have hardware and software load balancers (think nginx) perform URI routing, HEAD, HTTP/1.1 compression and pipelining, plus the goodness of modern NIO Java Web Containers, to seriously give memcached a run for its money. I will be doing some performance comparisons between Memcached and Ehcache Server in the near future.
What else is next? The above numbers are for MemoryStore based caches. I am also going to give the DiskStore a workover, drawing on the many suggestions made to me in the last year. Stay tuned.
 


By Greg Luck

As Terracotta’s CTO, Greg (@gregrluck) is entrusted with understanding market and technology forces and the business drivers that impact Terracotta’s product innovation and customer success. He helps shape company and technology strategy and designs many of the features in Terracotta’s products. Greg came to Terracotta on the acquisition of the popular caching project Ehcache which he founded in 2003. Prior to joining Terracotta, Greg served as Chief Architect at Australian online travel giant Wotif.com. He also served as a lead consultant for ThoughtWorks on accounts in the United States and Australia, was CIO at Virgin Blue, Tempo Services, Stamford Hotels and Resorts and Australian Resorts and spent seven years as a Chartered Accountant in KPMG’s small business and insolvency divisions. He is a regular speaker at conferences and contributor of articles to the technical press.

3 comments

  1. Ari
    j.u.c. allows me to fix some simple things. For example, I use CopyOnWriteArrayList or CopyOnWriteArraySet for the Caches held by a CacheManager and the listeners held by a Cache. That is really great, because it means getting or iterating over those, which are very common operations, will never block.
    But the most significant things are not j.u.c. I had a great contribution from Ben Manes, and the new map class which holds elements is ConcurrentLinkedHashMap, which is not part of j.u.c. I do use ConcurrentLinkedHashMap for the LFU cache, as it takes a novel statistical sampling approach to selecting the element to evict which does not require maintenance of _any_ ordering at all.
    There are still a few wrinkles to work out in ConcurrentLinkedHashMap, and I am yet to rework DiskStore, but this is exciting stuff.
    Greg

  2. Nice work Greg, impressive performance boosts.

    A point to note though: the charts you generated using the CacheBenchFwk use an alpha release of JBC 3.0.0 – and specifically, one that predates any profiling/tuning. You will find that the current JBC GA release – 3.0.3.GA – is significantly quicker than the alpha you tried. It would be worth your while benching against this.

    Also, I tried running a few benchmarks of my own and noticed that EHCache-1.6.0-Beta3 – the current latest – was, if anything, a little slower than EHCache 1.5.0. Did you use an unreleased snapshot of EHCache 1.6.0 in the tests above?

    Cheers
    Manik
