Greg Luck: New US and European Tour and other Ehcache news

Rubbing the bull's nose

Being full time with Terracotta gives me an opportunity to engage with the Ehcache community like never before.

For example I just came back from two weeks in the US. I gave talks in San Francisco, Philadelphia, New York and Atlanta, this last at DevNexus. Here are the details of that tour.

Tour Dates

In May and June I will be hitting the road again. Tour dates so far:

Date    | Location      | Event      | Topic
13 May  | Sydney        | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
2 June  | San Francisco | Google JUG | Ehcache Google App Engine module and caching in GAE generally
2 June  | Jacksonville  | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
2 June  | Tampa         | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
15 June | France        | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
16 June | Frankfurt     | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
17 June | Amsterdam     | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff
18 June | Sweden        | JUG        | Scaling Hibernate and DAOs and Ehcache 2.0; New stuff

Sunrise over the Flat Iron Building, New York City

Topics Flexible

Most people are interested in scaling Hibernate, which most of the talks cover. But I am flexible. If you are interested in attending one of these events, send your topic requests to gluck AT gregluck.com.

For example, I learnt on my last tour that around 45% of shops are using JDBC, usually with a DAO layer. Because I always use ORM, and have been doing so for 7 years, this caught me by surprise. Caching DAOs offers the same benefits as Hibernate second level caching. We are developing some new docs on ehcache.org and sample code to show how to do this. So I am going to include that in my next lot of talks.
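To make the DAO caching idea concrete, here is a minimal sketch of read-through caching wrapped around a DAO lookup. It is not the sample code planned for ehcache.org. The class and method names (CachingDaoSketch, CustomerDao, CachingCustomerDao, findNameById) are hypothetical, the DAO fakes its database call, and a plain ConcurrentHashMap stands in for an Ehcache cache so that the example runs without any dependencies.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Read-through caching around a DAO lookup. All names here are hypothetical. */
final class CachingDaoSketch {

    /** Stand-in for a JDBC-backed DAO; it counts how often the "database" is hit. */
    static final class CustomerDao {
        final AtomicInteger databaseHits = new AtomicInteger();

        String findNameById(long id) {
            databaseHits.incrementAndGet();   // a real DAO would run a JDBC query here
            return "customer-" + id;
        }
    }

    /** Wraps the DAO and serves repeated lookups from memory. */
    static final class CachingCustomerDao {
        private final CustomerDao delegate;
        // Assumption: a plain map stands in for an Ehcache cache to keep the sketch dependency-free.
        private final Map<Long, String> cache = new ConcurrentHashMap<>();

        CachingCustomerDao(CustomerDao delegate) {
            this.delegate = delegate;
        }

        String findNameById(long id) {
            // Only a cache miss reaches the delegate; hits are answered from the map.
            return cache.computeIfAbsent(id, delegate::findNameById);
        }

        /** Call after an update or delete so the next read reloads from the DAO. */
        void evict(long id) {
            cache.remove(id);
        }
    }
}
```

Looking up the same id twice through CachingCustomerDao reaches the underlying DAO once. A real cache adds what the map lacks, such as size limits and expiry.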

Another popular topic is Ehcache versus Memcached. Comparing and contrasting the two is a great way to understand what is on offer with Ehcache, particularly in combination with Terracotta.

Other News

There has been a lot going on. Ehcache 2.0 was released a few weeks ago. Ehcache is doing some interesting integrations with Grails, Google App Engine and EC2. Plus there have been new releases of the RESTful server. And next week some bug fix releases are coming: ehcache 2.0.1 and ehcache-web 2.1.

Finally, we will likely be making some packaging refinements to make it much easier to get Ehcache with Terracotta integrated into your development process. Terracotta is a server. We will probably add Maven and Ant tooling support so that you can easily deploy it locally for running integration tests. Its startup time is 5 seconds, which is pretty quick and compares favourably with things like Tomcat and ActiveMQ.

Java Technology (and one other) predictions for the next few years

I went to DevNexus in Atlanta a few weeks back. Neal Ford gave a talk on predicting the future. Neal is a great speaker and I found it very thought provoking. Now as a former colleague of Neal’s I also felt challenged to think about whether I agreed with him. Neal is a super-smart polyglot coder who does not quite view the world the way most devs do.

Here is my list and justifications for each. Some of these disagree with Neal’s predictions.

Will developers have to deal with the challenges of parallelisation in CPUs? No.

Neal said yes, and that this was justification to go with something like Scala with its Actors to avoid learning threading in Java. I say no, and here’s why.

The issue is real enough. Core frequency has stopped increasing and CPUs are getting more cores. Lots more cores. For example, the Sun Fire X4640 server comes with 4 to 8 Six-Core AMD Opteron processors. That’s up to 48 cores. The normal threading approach has been to use monitors, where one thread at a time gets exclusive access to an object or a synchronized block. That works well with small numbers of cores, but not with lots. What happens is that more and more time gets spent waiting to acquire a lock. The newer approach avoids this as much as possible with a whole slew of tricks, such as copy-on-write collections like CopyOnWriteArraySet, and CAS (compare-and-swap).
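The difference between the two approaches can be sketched with a pair of counters. This is an illustration only, using hypothetical class names (LockVersusCasSketch, SynchronizedCounter, CasCounter) and nothing beyond java.util.concurrent. It shows the mechanics and does not measure contention.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class LockVersusCasSketch {

    /** Monitor-based: every call takes exclusive ownership of the object's lock. */
    static final class SynchronizedCounter {
        private long value;

        synchronized void increment() { value++; }

        synchronized long get() { return value; }
    }

    /** CAS-based: read, compute, then retry if another thread got in first. */
    static final class CasCounter {
        private final AtomicLong value = new AtomicLong();

        void increment() {
            long current;
            do {
                current = value.get();
            } while (!value.compareAndSet(current, current + 1));
        }

        long get() { return value.get(); }
    }

    /** Runs the task perThread times on each of the given number of threads, then waits. */
    static void hammer(int threads, int perThread, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    task.run();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Both counters end up with the same total. The synchronized one makes waiting threads queue for the lock, while the CAS one never blocks and simply retries when it loses a race. In practice AtomicLong.incrementAndGet() does that retry for you.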

So will devs have to deal with this? No. Why? Because in the Java world developers are mostly protected from multi-threading. Instead the JDK (think java.util.concurrent), web containers and app servers (think Glassfish, Jetty and Tomcat) and libraries, like my own Ehcache, have to deal with this. Those libraries that do get to play in this new world. Those that don’t will fall by the wayside. But it is these product and project vendors who are affected, not the vast developer population.

One ring to rule them all… Maven

William Gibson once said “The future is already here. It’s just not very evenly distributed.” A few years ago I banked on Maven becoming a big deal and decided to back it big time. I converted Ehcache to it and the corporate stuff I was doing too. Everywhere. People complain that Maven is too complicated. Well, the problem is complicated. I admit that Maven ranks up there with EJB in its difficulty. But it is getting better. Maven sets lofty ambitions for a system of world-wide component development, and is largely successful. Hats off to Jason and co.

I surveyed my audiences on my recent US tour. In Silicon Valley, 60% used Maven. In the rest of the country about 40-50%. Where is the future already here?

Swallow the red pill.

Brave new polyglot world

4 years ago, fearing I was missing out on the dynamic language revolution, I learnt Python and then Ruby, both of which were in use at my workplace. That was useful and fun.

Now, as Java developers it is difficult not to feel inadequate unless you know Java and a couple more JVM languages which in order of descending popularity I believe are: Groovy, Scala, JRuby, Clojure, Jython ….

Ok, so how many people can code fluently in Java and one of these in the same day?  Well, I asked my audiences. Answer: < 5%. Which thinking back jelled with my corporate experience. There were two guys, both brothers, who were polyglots. And a few pretenders. And the rest of us struggled.

For the record I am saying that we all need to know at least XML, HTML, CSS, JS, shell, maven, ant – no argument there.

Neal suggested that projects would be written in multiple languages because “that’s how ThoughtWorks Studios does it and that is the future”. Not. ThoughtWorks is filled with polyglots like Neal – which is not representative of the community at large.

My prediction is that a whole project will be written in one JVM language, whether it be Java or one of the others.

Of course a given project can exploit existing libraries which are in byte code form independent of the language they were written in. Incidentally, to support this world, Ehcache is in Grails, has a very useful Groovy caching framework called SpringCache (it is for Groovy not Spring though), has a JRuby gem and so on. In other words we make ourselves available.

Self Healing Open Source

I sometimes worry about the future of Java with the consolidation that has happened – not just the Oracle acquisition but things like Spring. It is a different world than the one we were in 5 years ago. Something similar happened in Unix, which started off free in the 1970s but by the late 80s was dominated by commercial vendors who were in the Unix Wars.

What happened? Linux.

Before turning off the lights, Sun open sourced pretty much everything they owned. That creates a legal basis for the forking of that code into new projects if the open source community feels it needs to self-heal.

So will it? That depends on the vendors. But I think we as developers are safe.

Virtualisation and that cloudy stuff

Love it or hate it, virtualisation is here to stay. Get over it.

Are there problems? Yes. Do the cloud environments create even more problems? Yes. But this is a sea change in our industry which is mostly about freeing us from sysadmin costs and is therefore unstoppable. Get on board or become a dinosaur.

Global Warming

I read a book 5 years ago called The Skeptical Environmentalist which argued quite convincingly a whole host of politically incorrect views, most notably that there were multiple explanations for the warming observations such as the sun spot level. Bjørn Lomborg was pilloried for this and formally investigated for scientific dishonesty. He prevailed.

So at the time my view was that the case was unproven. The nice thing about empirical science is that theories can be falsified with more data. And of course, the more data we get, the more probable it becomes that global warming is real and is not a problem that will solve itself.

So for some non-Java predictions:

  1. Global warming will eventually be proven to most people’s satisfaction (think evolution) to be correct
  2. Despite the heroic efforts of the Europeans amongst others, some serious warming will occur
  3. This will happen in a perfect storm with the rising cost of fertilizers (based on the price of oil) rolling back a chunk of the green revolution of the 60s
  4. There will therefore be lots of hot, thirsty, hungry people looking for a new home
  5. New Zealand is one of the few places in the world predicted to be little affected by global warming
  6. Lots of people will want to move to New Zealand

So the smart money would say “Apply for New Zealand citizenship now and beat the rush”.

New REST and SOAP APIs released for Ehcache

Ehcache Server and Standalone Server have been released to Sourceforge Downloads. The server provides RESTful and SOAP APIs to Ehcache. It is available as a WAR which works with all app servers, or standalone using Glassfish V3 embedded.

New in this release is integration with Terracotta Server Array for coherent, scalable and HA caching. Also the Ehcache core has been upgraded to version 2.0, Jersey to 1.1.5, Metro to 2.1.5 and for standalone, Glassfish to V3 embedded.

This release has been performance tested against memcache and gives comparable over-the-network performance. Coupled with the simplicity of coding in your HTTP client of choice in your programming language of choice and easily getting the benefits of Terracotta Server Array backing it, this is a killer combination.

See the Ehcache Server documentation to get started or download it now.

Due to an issue with external repository handling, these have not yet been released to Maven. Sonatype are completing a project for Sun whereby they will be adding their artifacts to the Central repository. This is expected to be completed this week or next, so the Maven artifacts will be released then.

Interestingly, it can take a little bit of work to tune your Java HTTP client for speed. The following code snippet shows two recommended optimisations for the Apache HTTP client: turning off the staleness check (in production code you would catch the exception on connection close and open a new connection) and ignoring cookies.

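// Apache Commons HttpClient 3.x (package org.apache.commons.httpclient).
// cacheOperations, cacheUrl, keyBase, stopWatch and LOG are assumed to come from the surrounding code.
HttpClient httpClient = new HttpClient();
// Optimisation 1: skip the stale connection check made before each request.
httpClient.getParams().setParameter("http.connection.stalecheck", false);
for (int i = 0; i < cacheOperations; i++) {
    String url = new StringBuffer(cacheUrl).append('/').append(keyBase).append(i).toString();
    HttpMethod httpMethod = new GetMethod(url);
    // Optimisation 2: ignore cookies.
    httpMethod.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
    httpClient.executeMethod(httpMethod);
    httpMethod.releaseConnection(); // hand the connection back so the next request can reuse it
}
LOG.info(cacheOperations + " gets: " + stopWatch.getElapsedTime() + "ms");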

Comparing Memcache and Ehcache Server Performance

Ehcache Server provides a RESTful API for cache operations. I am working on v0.9 and have been doing some performance benchmarks. I thought it would be interesting to compare it with the performance of that other over-the-network cache, Memcache. Now I already knew that Ehcache in-process was around 1,000 times faster than Memcache. But what would the over-the-network comparison be?

Here are the results:

Memcache and SpyMemcache Client
10000 sets: 3396ms
10000 gets: 3551ms
10000 getMulti: 2132ms
10000 deletes: 2065ms

Ehcache Server 0.9 with Ehcache 2.0.0
10000 puts: 2961ms
10000 gets: 3841ms
10000 deletes: 2685ms

So, the results are a wash. Memcache is slightly slower on put, maybe because the JVM does not have to malloc: it already has the memory in heap. And Memcache is slightly faster on get and delete.
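For anyone who wants to turn the totals above into per-operation figures, here is a small arithmetic sketch. The helper names (BenchmarkArithmetic, perOperationMillis, percentSlower) are made up for illustration, and the only inputs are the timings listed above.

```java
/** Turns the benchmark totals above into per-operation and relative figures. */
final class BenchmarkArithmetic {

    /** Mean time for one operation, in milliseconds. */
    static double perOperationMillis(long totalMillis, int operations) {
        return (double) totalMillis / operations;
    }

    /** How much longer, in percent, the slower run took than the faster one. */
    static double percentSlower(long slowerMillis, long fasterMillis) {
        return 100.0 * (slowerMillis - fasterMillis) / fasterMillis;
    }

    public static void main(String[] args) {
        System.out.printf("Memcache set: %.4f ms per operation%n", perOperationMillis(3396, 10000));
        System.out.printf("Ehcache put:  %.4f ms per operation%n", perOperationMillis(2961, 10000));
        System.out.printf("put:    Memcache %.1f%% slower%n", percentSlower(3396, 2961));
        System.out.printf("get:    Ehcache Server %.1f%% slower%n", percentSlower(3841, 3551));
        System.out.printf("delete: Ehcache Server %.1f%% slower%n", percentSlower(2685, 2065));
    }
}
```

Every operation on either server comes out at roughly 0.2 to 0.4 ms. The relative gaps are about 15% on put, 8% on get and 30% on delete.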

A few years ago there was a raging thread on the Memcache mailing list about Memcache versus MySQL with in-memory tables. They were also a wash. I think the point is that serialization and network time is more significant than the server time, provided the server is not that much different. And this is what we see with Ehcache.

And now for the implications:

  1. REST is just well-formed HTTP. Just about any programming language supports it. Without having to use a language specific client, like you need to do with Memcache. Ehcache Server was the first Java cache to support the REST API. But many others have followed. Why? It is a really good idea.
  2. Performance wise, REST backed by a Java App Server and Cache, is about as fast as Memcache.
  3. Therefore, your decision on what to use should depend on other things. And Memcache is no-frills. If you chuck Terracotta in behind Ehcache (we are one company now after all) then you get HA, persistence, and coherence expressed how you like: as a guarantee that any read sees the last write, or more formally in XA transactions.
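To illustrate the first point, that a REST cache needs nothing beyond a stock HTTP client, here is a rough sketch of a put, get and delete round trip using only the JDK (Java 11 or later). So that it runs anywhere, the server side is a throwaway in-process stub, not Ehcache Server. The URL path and the status codes returned are assumptions made for the example, not a statement of what Ehcache Server does.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RestCacheSketch {

    /** Starts a throwaway in-process stub keyed by URL path. It is NOT Ehcache Server. */
    static HttpServer startStub() throws IOException {
        Map<String, byte[]> store = new ConcurrentHashMap<>();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String key = exchange.getRequestURI().getPath();
            byte[] body = new byte[0];
            int status;
            switch (exchange.getRequestMethod()) {
                case "PUT":
                    store.put(key, exchange.getRequestBody().readAllBytes());
                    status = 201;
                    break;
                case "GET":
                    byte[] hit = store.get(key);
                    status = (hit == null) ? 404 : 200;
                    body = (hit == null) ? body : hit;
                    break;
                case "DELETE":
                    status = (store.remove(key) == null) ? 404 : 204;
                    break;
                default:
                    status = 405;
            }
            // A length of -1 tells the JDK server that no response body follows.
            exchange.sendResponseHeaders(status, body.length > 0 ? body.length : -1);
            if (body.length > 0) {
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Sends one request; a null body means the request carries no payload. */
    static HttpResponse<String> send(HttpClient client, String method, String url, String body)
            throws IOException, InterruptedException {
        HttpRequest.BodyPublisher publisher = (body == null)
                ? HttpRequest.BodyPublishers.noBody()
                : HttpRequest.BodyPublishers.ofString(body);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).method(method, publisher).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString());
    }

    /** Puts, gets and deletes one entry, then returns the observed statuses as text. */
    static String roundTrip() {
        HttpServer server = null;
        try {
            server = startStub();
            // Hypothetical cache name and key; the path layout is an assumption for this sketch.
            String url = "http://127.0.0.1:" + server.getAddress().getPort()
                    + "/ehcache/rest/sampleCache2/1";
            HttpClient client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build();
            int put = send(client, "PUT", url, "hello").statusCode();
            HttpResponse<String> get = send(client, "GET", url, null);
            int delete = send(client, "DELETE", url, null).statusCode();
            int miss = send(client, "GET", url, null).statusCode();
            return put + " " + get.statusCode() + ":" + get.body() + " " + delete + " " + miss;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }
}
```

The client half is three plain HTTP verbs, so the same calls could be written in any language with an HTTP library. Only the URL would change when pointing at a real server.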

Finally, I will be giving a Webinar in a few weeks where I compare Ehcache and Memcache.

Upcoming Webinar: Memcached vs. Ehcache: Which One is Right for Me?

Deciding upon the best caching solution can be complicated. Some questions are:

  • Which one offers the best performance?
  • Which provides the highest availability?
  • Which scales most effectively?
  • Which is best for persisting large datasets?
  • How can I guarantee that data is consistent for all users?
  • Which is best for Java applications?
  • Which is best if you have a non-Java client?

This short webcast, to be held on April 15, 2010 at 11 am PDT, will provide insight into these important topics and others by comparing the two leading open source caching solutions: Ehcache and Memcached.

Not only will you gain a better understanding of the differences between Ehcache and Memcached, you will also learn a lot about improving application performance and scalability from one of the world’s foremost authorities: Ehcache founder, Greg Luck.

Please register for the WebEx even if you are unable to attend and you will get an email with a link to the recording.