It is that time of year again: a time for reflection and for New Year’s resolutions. I therefore thought this was a good time to reflect on what has been happening with Ehcache.
Ehcache and Terracotta got together in August 2009. We got our first combined release done with Ehcache backed by Terracotta three months later. That was really only the beginning. We have done five major releases to date, with, of course, another one in the cooker right now.
The big news is that Ehcache and Terracotta together have been a great success. Most Terracotta users now use it as a distributed backing store for Ehcache. Most of Terracotta’s revenue comes from that use case. This means that Ehcache’s needs are driving the evolution of Terracotta.
At the same time, Ehcache as an open source project has seen a huge investment. Most of the features added are designed to work in Ehcache open source as well as with Terracotta as the backing distributed store.
So let’s look at what we added in 2010 and what is in the cooker so far for 2011.
What we did in 2010
New Hibernate Provider
When we merged, Terracotta had just done a Hibernate provider and Ehcache had one too. Neither supported the new Hibernate 3.4 SPI which uses CacheRegionFactories instead of CachingProviders. So we combined the two into a new implementation and added support for the new SPI at the same time. This meant that for the first time Ehcache supported all of the cache strategies, including transactional, in Hibernate and, importantly, supported them across a caching cluster. (http://ehcache.org/documentation/hibernate.html)
Right now XA transactions have fallen from favour, partly because of flaky support for them in the XAResource implementations out there. But what if you could create a canonically correct implementation that could be absolutely relied on in the world’s most demanding transactional applications? We hired one of the Java world’s foremost experts in transactions (Ludovic Orban, the author of the Bitronix transaction manager) and came out with just that. We challenge anyone to prove that our implementation is not correct and does not fully deal with all failure scenarios. If you need to be absolutely sure that the cache is in sync with your other XAResources, with a READ_COMMITTED isolation level, you have come to the right place. (http://ehcache.org/documentation/jta.html)
Terabyte Scale Out
Ehcache backed by Terracotta initially held keys in memory in each application that had Ehcache in it. This effectively limited the size of the caching clusters that could be created. With a new storage strategy, we blew the lid off that and stopped storing the keys in memory. The result – horizontal scaling to the terabyte level.
Write-through, behind and every which way
What happens when you have off-loaded reads from your database but now your writes are killing you? The answer is write-behind. You write to the cache. The cache queues the write and periodically calls a CacheWriter, which you implement and connect to the cache, with batches of writes. In your CacheWriter you open a transaction, write the batch and then close the transaction. That is much easier on the database. And it is all done with HA: the write-behind queue is safe because it is stored on the Terracotta cluster.
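To make the write-behind flow concrete, here is a minimal sketch in plain Java. It is not the Ehcache CacheWriter API: `BatchWriter`, `WriteBehindCache` and `flush()` are hypothetical names invented for illustration, the queue lives in local memory rather than on a Terracotta cluster, and `flush()` is called by hand where a real implementation would run it on a timer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical callback that persists one batch; stands in for a user-supplied writer. */
interface BatchWriter<K, V> {
    void writeAll(List<Map.Entry<K, V>> batch);
}

/** Minimal write-behind cache: puts return immediately, writes reach the writer in batches. */
class WriteBehindCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final LinkedBlockingQueue<Map.Entry<K, V>> pending = new LinkedBlockingQueue<>();
    private final BatchWriter<K, V> writer;
    private final int maxBatchSize;

    WriteBehindCache(BatchWriter<K, V> writer, int maxBatchSize) {
        this.writer = writer;
        this.maxBatchSize = maxBatchSize;
    }

    /** Updates the cache and queues the write; the database is not touched here. */
    void put(K key, V value) {
        store.put(key, value);
        pending.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
    }

    V get(K key) {
        return store.get(key);
    }

    /**
     * Drains up to maxBatchSize queued writes and hands them to the writer as one batch.
     * The writer would typically wrap the batch in a single database transaction.
     * Returns the number of writes flushed.
     */
    int flush() {
        List<Map.Entry<K, V>> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatchSize);
        if (!batch.isEmpty()) {
            writer.writeAll(batch);
        }
        return batch.size();
    }
}
```

With a batch size of 2, three puts followed by two calls to `flush()` would reach the writer as one batch of two writes and one batch of one, so the database sees two transactions instead of three.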
More caching in more places
We were really happy to extend our support for popular frameworks. During the year:
- Ehcache became the default caching provider for Grails
- We created an OpenJPA provider
- We created Ruby Gems for JRuby and Rails 2 and 3 caching providers
- We created a Google App Engine module
Acknowledgement of the CAP Theorem
Originally Ehcache with Terracotta was consistent above all else. During the year we flexed both ways to allow CAP tradeoffs to be made by our users. We added XA and manual locking modes on the even stricter side and we added an unlocked reads view of a cache and even coherent=false for scale out without coherence on the looser side. And you can choose this cache by cache. There is a tradeoff between latency and consistency, so you choose the highest speed you can afford for a particular cache.
And rather than just blocking on a network partition, we added NonStopCache, a cache decorator which allows you to choose whether to favour availability or consistency.
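The idea of a decorator that lets the caller pick a side can be sketched as follows. This is an illustration in plain Java, not the NonStopCache implementation: `TimeoutCacheDecorator`, `TimeoutBehavior` and the `Function` standing in for a clustered lookup are all assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical policy for a lookup that does not answer in time. */
enum TimeoutBehavior {
    EXCEPTION, // favour consistency: fail loudly rather than guess
    NO_OP      // favour availability: behave like a cache miss
}

/** Wraps a possibly blocking lookup and bounds how long the caller waits for it. */
class TimeoutCacheDecorator<K, V> {
    private final Function<K, V> delegate; // stands in for a clustered cache lookup
    private final long timeoutMillis;
    private final TimeoutBehavior behavior;
    // Daemon threads so a stuck lookup never keeps the JVM alive.
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    TimeoutCacheDecorator(Function<K, V> delegate, long timeoutMillis, TimeoutBehavior behavior) {
        this.delegate = delegate;
        this.timeoutMillis = timeoutMillis;
        this.behavior = behavior;
    }

    V get(K key) {
        Callable<V> task = () -> delegate.apply(key);
        Future<V> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck lookup
            if (behavior == TimeoutBehavior.EXCEPTION) {
                throw new IllegalStateException(
                        "cache did not respond within " + timeoutMillis + " ms", e);
            }
            return null; // NO_OP: report a miss and let the caller carry on
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the cache", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("cache lookup failed", e.getCause());
        }
    }
}
```

Because the policy is set on the decorator, an application could wrap one cache with `NO_OP` and another with `EXCEPTION`, which matches the cache-by-cache choice described above.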
BigMemory was a big hit. It surprised a lot of people and frankly it surprised us. We were looking to solve our own GC issues in the Terracotta server and found something that was more generally useful than that one use case. So we added BigMemory to Ehcache standalone as well as the server. In the server we have lifted our field engineering recommendation from 20GB of storage per server partition to 100GB. And we have tested BigMemory itself out to 350GB and it works great!
A new Disk Store
Let’s say you are using 100GB in Ehcache standalone. When you restart your JVM you want the cache to be there, otherwise it might take hours or days to repopulate such a large cache. So we created a new DiskStore that keeps up with BigMemory. It writes at the rate of 10MB/s. So when it is time to shut down your JVM it just needs to do a final sync and then you are done. And it starts up straight away and gradually loads data into memory. A nice complement to BigMemory and very important.
Ehcache Monitor/Terracotta Dev Console Improvements
For those using Ehcache standalone we have only ever had a JMX API. That is fine but we found many people built their own web app to gather stats. So we did the same and the result was Ehcache Monitor. One of the highlights is the charts including a chart of estimated memory use per cache.
The Terracotta Developer Console got an Ehcache panel, and as we added features to Ehcache we added more to the panel. If you are using Ehcache with a backing Terracotta store then it is a full-featured tool which gives you deep introspection.
What is Coming in 2011
Speed, speed and more speed
What does everybody want? More speed. We are splitting hairs in our concurrency model to enable as much speed as possible for each use case. We now have two modes and will be adding more to allow the best tuning for each one.
Ehcache is based on a Map API. Maps have keys and values. They have a peculiar property – you need to know the key to access the value. What if you want to search for a key, or you want to index values in numerous ways and search those? All of this is coming to Ehcache in February 2011 and is available right now in beta. Oh, and one cool thing: search performance is O(log n/partitions). So as your data grows and spreads out onto more Terracotta server partitions, your search performance stays constant! (http://ehcache.org/documentation/search.html)
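As a rough illustration of indexing values and searching across partitions, here is a sketch in plain Java. It is not the Ehcache Search API: `IndexedPartition`, `PartitionedSearch` and `keysBetween` are hypothetical names, the index is a sorted map over a single attribute, and the partitions are queried one after another where a cluster would query them in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

/** One partition: a key/value map plus a sorted index on one attribute of the value. */
class IndexedPartition<K, V, A extends Comparable<A>> {
    private final Map<K, V> entries = new HashMap<>();
    private final TreeMap<A, Set<K>> index = new TreeMap<>(); // O(log n) lookups
    private final Function<V, A> extractor;

    IndexedPartition(Function<V, A> extractor) {
        this.extractor = extractor;
    }

    void put(K key, V value) {
        V old = entries.put(key, value);
        if (old != null) {
            // Remove the stale index entry for the overwritten value.
            A oldAttr = extractor.apply(old);
            Set<K> keys = index.get(oldAttr);
            if (keys != null) {
                keys.remove(key);
                if (keys.isEmpty()) {
                    index.remove(oldAttr);
                }
            }
        }
        index.computeIfAbsent(extractor.apply(value), a -> new HashSet<>()).add(key);
    }

    /** Keys whose indexed attribute lies in the inclusive range [from, to]. */
    Set<K> keysBetween(A from, A to) {
        Set<K> result = new HashSet<>();
        for (Set<K> keys : index.subMap(from, true, to, true).values()) {
            result.addAll(keys);
        }
        return result;
    }
}

/** Spreads entries over partitions by key hash and merges per-partition search results. */
class PartitionedSearch<K, V, A extends Comparable<A>> {
    private final List<IndexedPartition<K, V, A>> partitions = new ArrayList<>();

    PartitionedSearch(int partitionCount, Function<V, A> extractor) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new IndexedPartition<>(extractor));
        }
    }

    void put(K key, V value) {
        int slot = Math.floorMod(key.hashCode(), partitions.size());
        partitions.get(slot).put(key, value);
    }

    Set<K> keysBetween(A from, A to) {
        Set<K> result = new HashSet<>();
        for (IndexedPartition<K, V, A> partition : partitions) {
            result.addAll(partition.keysBetween(from, to));
        }
        return result;
    }
}
```

Each partition only indexes its own share of the data, so if partitions are added in step with data growth, the index that any one partition has to search stays about the same size.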
New Transaction Modes
We already did the hard one: XA. Now we are adding Local Transactions. If you just want transactionality within the caches in your CacheManager and there are no other XAResources, you can use a Local Transaction. It will be three times faster than an XA cache. (http://ehcache.org/documentation/jta.html)
Quite a few customers use Java but also some .NET, and they want to be able to share caches. We have lots of users happily using Ehcache for cross-platform use cases, but we are planning on extending our cross-platform support still further, for example with a native .NET client.
We are looking at ongoing speedups and testing against larger and larger memory sizes for BigMemory. We are also looking to provide further speed in BigMemory by allowing pluggable Serialization strategies. This will allow our users to use their Serialization framework of choice – and there are now quite a few.