Last month we made a big splash with our news of BigMemory, our add-on to Ehcache which creates very large caches in the Java process (VLIPCCs) while still avoiding the Achilles heel (hell) of GC pauses. We released charts on ehcache.org showing our performance up to 40GB of cache.
Optimising for Byte Arrays. Why?
We will be GAing later this month and have been doing lots of optimisation work. One case we have optimised for is storage of byte arrays in the cache. Why? Because we also use BigMemory in the Terracotta Server Array (TSA), and data in the TSA is stored in byte arrays. We get to escape GC pauses in the TSA, which is generally a very good idea for a high performance server. A field recommendation to escape GC has been to use no more than 4GB of in-memory cache in the TSA, with the balance on disk. With BigMemory, for users who do not want disk persistence with the TSA, we can run the whole lot in memory.
Some New Charts
Himadri Singh, a performance engineer at Terracotta, has done some new performance runs to try out our byte array optimisation, using our published perf tester (see https://svn.terracotta.org/repo/forge/offHeap-test/ ).
These were done on a new machine with 8 cores and 350GB of RAM, of which we used up to 100GB for these tests.
This first chart below shows the mean latency for byte arrays. It is almost 4 times faster than our beta release a month ago! Now we get a flat 110 usec response time right out to 100GB, whereas a month ago we had 450 usec out to 40GB.
This next chart is the old test from the documentation page on ehcache.org and is still representative of the present performance for non-byte-array data types.
This next chart shows throughput. We also get a fourfold increase in throughput compared to the beta.
The speedup for byte arrays was achieved by optimising serialization for this one case. We know that serialization for puts and deserialization for gets is by far the biggest cost. So applying Amdahl’s law, it makes sense for us to concentrate our further optimisations there.
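To make the Amdahl’s law argument concrete, here is a minimal sketch in Java. The class and method names are hypothetical, and the 80% share of request time used in the note below is an assumed figure for illustration, not something measured in these tests.

```java
/** Illustrative only: Amdahl's law applied to one part of a cache operation. */
final class AmdahlSketch {

    /**
     * Overall speedup when a given fraction of the total time is made
     * `factor` times faster and the rest is left unchanged.
     *
     * @param fraction share of total time spent in the optimised part, in [0, 1]
     * @param factor   how many times faster the optimised part becomes, > 0
     */
    static double overallSpeedup(double fraction, double factor) {
        if (fraction < 0.0 || fraction > 1.0 || factor <= 0.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1] and factor > 0");
        }
        // Amdahl's law: untouched part stays as-is, optimised part shrinks by `factor`.
        return 1.0 / ((1.0 - fraction) + fraction / factor);
    }
}
```

If serialization were 80% of the time for an operation (an assumption), `AmdahlSketch.overallSpeedup(0.8, 50.0)` comes out at roughly 4.6. The bigger the share that serialization really has, the more of a library's raw speedup reaches the cache user, which is why it pays to work on the dominant cost first.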
The following chart is from the Java Serialization Benchmark, and shows the performance of alternative Java serialization frameworks.
Some key observations:
- Java’s built-in serialization is pretty slow
- There are lots of choices now
- We can get around a 50 times speedup with the faster libraries!
- Handwritten is the fastest
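As a rough illustration of why handwritten code does less work than the built-in mechanism, the sketch below encodes the same object both ways. `Trade` and the helper methods are hypothetical names invented for this example. It compares encoded size, not speed, and does not reproduce the benchmark's numbers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class HandwrittenVsBuiltIn {

    /** Hypothetical value type used only for this comparison. */
    static final class Trade implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String symbol;

        Trade(int id, String symbol) {
            this.id = id;
            this.symbol = symbol;
        }
    }

    /** Built-in Java serialization: writes a stream header and class metadata too. */
    static byte[] builtIn(Trade trade) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(trade);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Handwritten encoding: only the field values, in a fixed order. */
    static byte[] handwritten(Trade trade) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeInt(trade.id);
                out.writeUTF(trade.symbol);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the fields back in the same order they were written. */
    static Trade readHandwritten(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            String symbol = in.readUTF();
            return new Trade(id, symbol);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The handwritten form knows the layout in advance, so it writes no class descriptor and needs no reflection on the way back in. The price is that the reader and writer must be kept in step by hand whenever a field changes.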
So what’s next? We don’t want to hold up ehcache-enterprise 2.3, but we are planning a follow-up release to exploit this. We are thinking about these changes:
- Make Ehcache’s Element Externalizable with a pluggable strategy for readExternal and writeExternal
- Expose the pluggable serialization strategy via config on a per-cache basis or across an entire CacheManager
- Probably bundle or make an optional dependency on one of the above serialization packages. If you have a favourite you want to argue for, ping me on Twitter. I am gregrluck.
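To give a feel for the first of these changes, here is a minimal sketch of an Externalizable element with a pluggable strategy. None of these names exist in Ehcache today: `SerializationStrategy`, `JavaStrategy`, `ByteArrayStrategy` and `SketchElement` are all hypothetical, and the single static strategy field is a simplifying assumption standing in for real per-cache or per-CacheManager configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class PluggableSerializationSketch {

    /** Hypothetical plug-in point; not an existing Ehcache interface. */
    interface SerializationStrategy {
        void write(Object value, ObjectOutput out) throws IOException;
        Object read(ObjectInput in) throws IOException, ClassNotFoundException;
    }

    /** Default: delegate to built-in Java serialization. */
    static final class JavaStrategy implements SerializationStrategy {
        public void write(Object value, ObjectOutput out) throws IOException {
            out.writeObject(value);
        }
        public Object read(ObjectInput in) throws IOException, ClassNotFoundException {
            return in.readObject();
        }
    }

    /** Byte arrays only: a length prefix followed by the raw bytes. */
    static final class ByteArrayStrategy implements SerializationStrategy {
        public void write(Object value, ObjectOutput out) throws IOException {
            byte[] bytes = (byte[]) value;
            out.writeInt(bytes.length);
            out.write(bytes);
        }
        public Object read(ObjectInput in) throws IOException {
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);
            return bytes;
        }
    }

    /** Stand-in for Ehcache's Element; Externalizable needs a public no-arg constructor. */
    public static final class SketchElement implements Externalizable {
        private static final long serialVersionUID = 1L;

        // ASSUMPTION: one process-wide strategy, used by both writer and reader.
        static volatile SerializationStrategy strategy = new JavaStrategy();

        private Object key;
        private Object value;

        public SketchElement() { }

        public SketchElement(Object key, Object value) {
            this.key = key;
            this.value = value;
        }

        public Object getKey() { return key; }
        public Object getValue() { return value; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(key);
            strategy.write(value, out);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            key = in.readObject();
            value = strategy.read(in);
        }
    }

    /** Serializes an element to a byte array and reads it back. */
    static SketchElement roundTrip(SketchElement element) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(element);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SketchElement) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Setting `SketchElement.strategy` to a `ByteArrayStrategy` before calling `roundTrip` exercises the byte array path. Both sides must agree on the strategy, so a real design would probably need to record which one wrote each entry, or fix it in the cache configuration.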