Comparing Memcached and Ehcache Performance
Performance Comparison
I recently did some research on memcached and how it compares to ehcache. The following graph shows the time taken for 10,000 puts, gets and removes against 10,000 cache items, using the latest released versions of memcached and ehcache. In memcached’s case libevent is installed. The computer was a MacBook Core Duo with 1.25GB of memory. The test was the standard benchmark that ships with memcached’s Java client, which was also the client used.
The result is that puts and gets are 500 to 1000 times faster in ehcache than in memcached. The grey bar, barely visible, is ehcache with the whole cache in memory. The blue bar is ehcache with 9,900 of the cache items in the disk store. Even ehcache’s disk store is much faster than memcached, which of course is entirely in memory.
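To make the shape of such a test concrete, here is a minimal sketch of a put/get/remove timing loop. It is not the benchmark from memcached’s Java client; the class name CacheTimingSketch and the method timeOperations are made up for illustration, and a plain ConcurrentHashMap stands in for the cache under test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical harness: a plain in-process map stands in for the cache under test.
class CacheTimingSketch {

    /** Times `count` puts, gets and removes; returns elapsed nanoseconds for each phase. */
    static long[] timeOperations(Map<String, String> cache, int count) {
        long[] elapsed = new long[3];

        // Phase 1: puts
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            cache.put("key" + i, "value" + i);
        }
        elapsed[0] = System.nanoTime() - start;

        // Phase 2: gets (every key written above must be present)
        start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            if (cache.get("key" + i) == null) {
                throw new IllegalStateException("missing key" + i);
            }
        }
        elapsed[1] = System.nanoTime() - start;

        // Phase 3: removes
        start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            cache.remove("key" + i);
        }
        elapsed[2] = System.nanoTime() - start;

        return elapsed;
    }

    public static void main(String[] args) {
        long[] t = timeOperations(new ConcurrentHashMap<>(), 10_000);
        System.out.printf("put: %d us, get: %d us, remove: %d us%n",
                t[0] / 1_000, t[1] / 1_000, t[2] / 1_000);
    }
}
```

A single cold run like this includes JIT warm-up, so a fair comparison would repeat the loop several times and discard the first runs.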
Why?
The AMP people always make the claim that memcached is really fast. The memcached mailing list has a long thread arguing back and forth as to whether memcached is actually any faster than MySQL’s in-memory tables. It seems that in some circumstances they are pretty close. I think of memcached as a client-server cache.
In the diagram above you can see that each put, get or remove happens over the network, between a memcached client running on the web server and a memcached server, using the memcached wire protocol. There is network and marshalling overhead for each request.
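The marshalling part of that overhead can be sketched without any network at all. The example below is an illustration only: it uses standard Java serialization as a stand-in for whatever encoding a real client uses, it is not the memcached wire protocol, and the class name MarshallingSketch is made up. It shows that a value crossing a process boundary has to be turned into bytes and rebuilt as a new object, whereas an in-process cache can hand back the same reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: Java serialization stands in for client-side marshalling.
class MarshallingSketch {

    /** Turns a value into bytes, as a client must before sending it to a remote cache. */
    static byte[] marshal(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds a value from bytes, as a client must after a remote get. */
    static Object unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> value = new ArrayList<>(List.of("a", "b", "c"));

        // Client-server style: the value makes a round trip through a byte array.
        byte[] wire = marshal(value);
        Object remoteCopy = unmarshal(wire);

        // In-process style: the "get" is just a reference to the same object.
        Object localRef = value;

        System.out.println("bytes on the wire: " + wire.length);
        System.out.println("remote copy equal: " + value.equals(remoteCopy));
        System.out.println("remote copy is same object: " + (remoteCopy == value));
        System.out.println("local ref is same object: " + (localRef == value));
    }
}
```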
By comparison, the ehcache architecture is shown below. The cache is local and in-process. The cost of gets is that of referencing memory or loading from disk, with appropriate thread safety.
Puts are really fast because they are always synchronous to the memory store only, with transfer to disk or to other nodes in the cluster happening asynchronously in batches.
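The idea of a synchronous memory put followed by an asynchronous batched transfer can be sketched as follows. This is not ehcache’s implementation; WriteBehindSketch, flushBatch and slowStore are hypothetical names, and a second map stands in for the disk store or a replication peer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical write-behind cache; an illustration, not ehcache's actual code.
class WriteBehindSketch {
    private final Map<String, String> memoryStore = new ConcurrentHashMap<>();
    private final BlockingQueue<String> pendingKeys = new LinkedBlockingQueue<>();
    // Stand-in for a disk store or a remote node.
    private final Map<String, String> slowStore = new ConcurrentHashMap<>();

    /** Synchronous part: update memory and record that the key still needs transferring. */
    void put(String key, String value) {
        memoryStore.put(key, value);
        pendingKeys.add(key);
    }

    /** Gets are served from memory only in this sketch. */
    String get(String key) {
        return memoryStore.get(key);
    }

    /** Asynchronous part: move up to maxBatch pending entries to the slow store. */
    int flushBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        pendingKeys.drainTo(batch, maxBatch);
        for (String key : batch) {
            String value = memoryStore.get(key);
            if (value != null) {
                slowStore.put(key, value);
            }
        }
        return batch.size();
    }

    int slowStoreSize() {
        return slowStore.size();
    }

    public static void main(String[] args) throws InterruptedException {
        WriteBehindSketch cache = new WriteBehindSketch();
        for (int i = 0; i < 100; i++) {
            cache.put("key" + i, "value" + i); // returns as soon as memory is updated
        }

        // A background thread drains the queue in batches of 25.
        Thread flusher = new Thread(() -> {
            while (cache.flushBatch(25) > 0) {
                // keep draining until nothing is pending
            }
        });
        flusher.start();
        flusher.join();

        System.out.println("entries in slow store: " + cache.slowStoreSize());
    }
}
```

The caller of put never waits on the slow store, which is why the put path costs little more than a map update.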
The performance differences are due to the different architectures.
Implications
In-process caching and asynchronous replication are clear performance winners.
Ehcache and other in-process caches are very widely used in the Java world. One thing I see happening is new languages reusing Java infrastructure. An example is Grails (Groovy on Rails), which can reuse the Java stack. Grails is using JPA, which lets it use Hibernate and in turn second-level caching. Now memcached is typically used with Rails apps running on Apache with the shared-nothing approach. But JRuby on Rails can now be deployed in Java web containers (see ThoughtWorks’ Mingle product, which is deployed this way, for an example). There are notes on how to deploy JRuby on Rails apps to Glassfish in a WAR, and Glassfish V3 takes deployment simplicity even further. The JRuby on Rails (JRails anyone?) folks are planning to move to JPA (from what I heard at JavaOne).
So it seems that whether you are coding in Java, Grails or JRails, you will be able to reuse the Java infrastructure and in particular the in-process Java caches that come with it. And that is a very good thing.