OSCON2006: Databases and Caching

I often get accused of being cache centric in my ideas on performance. As I attend OSCON this year I have kept my ear out, both in and out of the sessions, for how everyone is solving performance problems.
There seem to be a couple of themes. One is to keep as much as possible away from your database. Another is to do this by using caching. In the LAMP world this generally means using memcached. You can also use mod_proxy for this, and Squid. I am not really sure why memcached is used in preference to these two but I intend to find out.
Another alternative is to replicate data between databases. This seems less popular. The scalability tutorial speaker was of the view that this is relatively hard, though of course practical. If you are talking commercial databases, you also have cost problems. The general solution here is, say you have Oracle, to replicate to read-only MySQL servers.
But solutions like memcached, and write-through cache solutions such as the approach taken by the very Movable Type I am using right now, are simpler.
The SleepyCat founder was at the keynote this morning. He suggested that some of the larger shops such as Amazon and Google use SleepyCat rather than client-server databases like Oracle or MySQL.
Another interesting comment was that when MySpace was moved from ColdFusion to ASP.net, the caching piece had to be custom written and was inspired by memcached.
So, I guess to summarise, I would say that a web caching piece is a mandatory part of each stack.
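
To make the write-through idea concrete, here is a minimal Python sketch. It is not how Movable Type or memcached is implemented. The WriteThroughCache class, the dict standing in for the database and the store_reads counter are all hypothetical names chosen for illustration.

```python
class WriteThroughCache:
    """Toy write-through cache in front of a slow backing store.

    The backing store is a plain dict standing in for a database
    (an assumption for illustration only).
    """

    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}
        self.store_reads = 0  # how many reads actually reached the store

    def get(self, key):
        # Serve from the cache when possible; fall back to the store on a miss.
        if key not in self.cache:
            self.store_reads += 1
            self.cache[key] = self.backing_store[key]
        return self.cache[key]

    def put(self, key, value):
        # Write-through: update the store and the cache in the same step,
        # so later reads of this key are served from the cache.
        self.backing_store[key] = value
        self.cache[key] = value


database = {"post:1": "Databases and Caching"}
cache = WriteThroughCache(database)

cache.get("post:1")                # miss: one read reaches the store
cache.get("post:1")                # hit: served from the cache
cache.put("post:2", "Capistrano")  # written to store and cache together
cache.get("post:2")                # hit: populated by the write itself
```

After these calls only one read has reached the "database", which is the whole point: reads that follow a write never touch the store.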

OSCON2006: Capistrano

A month or so ago we talked about how to script deployments. We do Java, Ruby and Python deployments. We sort of had some ideas but came up blank.
Capistrano, a Ruby app, uses ssh and relies on POSIX commands, so it will work on pretty much anything other than Windows. I went to the session, which was well attended. The guys have done a lot of work on it and emphasise that, while it is great for Rails, it is also well suited to all sorts of deployments.
I think this may be worth a look.

OSCON2006: Tim O’Reilly Keynote – Open Source Trends

New trends:
Ruby books are now outselling Python and Perl books. But JavaScript books have increased the most and are outselling all of the other dynamic languages. Why? Tim thinks it is driven by interest in Ajax, which is the hottest thing right now. Time to get over my JavaScript hatred. IntelliJ helps with that a lot. (Also interesting that Rhino is bundled in JDK 1.6)
The other thing new on the horizon is Django. Tim pointed out that both Rails and Django grew out of closed source projects: Rails from 37signals, and Django from the Lawrence Journal-World. Django is Python’s answer to Rails, so it will be an interesting one to watch.
Virtualization is a big new trend.
Another one is that being on someone’s platform increasingly means you will also be hosted on their infrastructure.
Open Data. Owning your own data and being able to take it with you.
Firefox is the equivalent in the browser world. (Tim did not say this but someone said something yesterday about Eclipse being the new Emacs).
Asterisk and open VoIP is a big deal.
Ubuntu is on the rise. It has huge interest relative to RedHat.

Report from OSCON2006: The Ruby Conspiracy

(Update: Wow I got a record number of comments to this blog. Answers to some common themes at the end of the post)
Who are those who are benefiting from Ruby on Rails? Answer: O’Reilly Publishing, the authors Bruce Tate and Dave Thomas and a handful of consultants.
At last year’s conference, Tim O’Reilly had carefully analysed his book sales and was desperate to identify the next big thing. Same for the pragmatic programmers and for consulting companies wishing to push the next big thing. C# had been and is a disappointment. Despite a huge push it refuses to move up the Tiobe programming index.
I get the feeling that everyone needs a next big thing, and if there is not one, they create it. So what has happened to Java, after the spate of Beyond Java books? Answer: according to tiobe it has risen higher. So what is declining? VB.net and Perl. Not Java.
So am I ignorant of Ruby on Rails? We have two production applications running on Ruby. And how is it? Well, despite being perhaps no more than 5% of the functionality of our applications, Ruby on Rails is the number one consumer of Oracle CPU and logical gets. Why? Rails does not support prepared statements, so Oracle has to reparse every time. This is something that Java has had for years and years. And ActiveRecord seems not to have learnt Hibernate’s lesson: that OR tools suck for performance and need caching tricks to make them work well. Also, our Rails apps running in (now unmaintained) FastCGI regularly go awry and fork more processes. Each one creates a new connection to Oracle. So, the opposite of connection pooling: connection denial of service. And does Ruby support Unicode? Not really. And is Rails threadsafe? No. So, is it Enterprise Ready? Absolutely, according to those with a clear vested interest in supporting the next big thing.
Are these problems solvable? Yes, in a fashion. For example, I have been told that lighttpd and Mongrel should be used rather than FastCGI. And MySQL should be used rather than Oracle.
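
The reparsing point can be illustrated without Oracle or Rails. The sketch below uses Python's built-in sqlite3 module purely as a stand-in (an assumption: it says nothing about how Oracle's parser or ActiveRecord work internally). The users table and its contents are made up. It just counts how many distinct statement texts each style sends to the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, "user%d" % i) for i in range(1, 6)],
)

# Style 1: values pasted into the SQL text. Every id yields a different
# statement string, so a server that caches parsed statements by their
# text finds nothing to reuse.
literal_statements = set()
literal_names = []
for user_id in range(1, 6):
    sql = "SELECT name FROM users WHERE id = %d" % user_id
    literal_statements.add(sql)
    literal_names.append(conn.execute(sql).fetchone()[0])

# Style 2: one statement text with a bind parameter, reused for every id.
bound_sql = "SELECT name FROM users WHERE id = ?"
bound_statements = set()
bound_names = []
for user_id in range(1, 6):
    bound_statements.add(bound_sql)
    bound_names.append(conn.execute(bound_sql, (user_id,)).fetchone()[0])

print(len(literal_statements), len(bound_statements))
```

Both styles return the same rows, but the first produces five distinct statements and the second only one. A database that keys its parsed-statement cache on the SQL text can only reuse work in the second case.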
And does it matter that Ruby is 15 times slower than Java? Of course not. How could it! Just buy more hardware. And more hosting costs. And more System Administrator salaries.
After all, the productivity benefits of Ruby are so much greater than Java you will save all of the money in development. Or do you? Our experience was that Ruby on Rails took longer than Java would have. And what about maintenance? Well, we just refactor as things change. Or do we? There are no Ruby tools that support refactoring. Nor are they expected, due to the difficulties of implementing refactoring tools for dynamic languages, or so I am advised.
And what about support? Well, there is the Ruby mailing list. Which is quite active. That should be good enough for anyone.
And what about Python? Python is arguably a more mature dynamic language with a much larger developer community and number of libraries. Why does it suck? Underscores and “self”, according to one of the leading Ruby advocates. Wow, that sounds really bad. Doesn’t Martin Fowler’s Refactoring book recommend _ for fields? Must be deprecated. Oh and what about Django? Now this really sets the Rails people off. Why? Because it is Rails-like. A ripoff, they say. Yes, but where do the Rails ideas come from? From my point of view it is exactly what I have been doing in Java for years and years. So if Ruby can rip off from Java (most welcome BTW) why cannot Django rip off Ruby on Rails? Answer: because the vested interests have decided there is money to be made from promoting Ruby, not Python.
In short, anyone who questions the benefits of Ruby on Rails is not with the program. You know, once upon a time, being open source meant being better because of improvements spurred by constructive criticism. It’s about time the Ruby on Rails community accepted some.
Anyone for Haskell? Or J2EE 5?
(Answers to common comments:
1. Where are the line breaks?
Ah, sorry for that. This was a late night post, after a drinking session. Forgot to check “Convert Line Breaks” in the Text Formatting option. Also, I have been using Writely lately to blog, until it broke :)
2. Thanks for saying what everyone is afraid to?
No problem.
3. Why are you so ignorant?
I am capable of learning. Really. Enlighten me.
4. Why are there no references to back up your claims?
Some of the claims are explored more fully in other blog posts. The one about Ruby performance is based on freely available benchmarks. How important is it? The database is normally the slowest part, but it depends on your app.
5. Who was the nasty Railroader who claims django is a rip off?
I personally did not realise that django predated Rails. The Railroader was very tetchy about any success django might be having. He described django as “Rails-like”. The rip off part, I think, was a false connection I made. I do however think that the django crowd have realised the importance of hype. So perhaps to that extent they have learnt something from the Rails crowd.
6. What about Haskell?
Sadly no one that posted was that interested in Haskell. Our speaker yesterday said it takes a year to get into it, so maybe that is why.

Rolling your own Google Maps

I attended a session here at OSCON on Rolling Your Own Google Maps. It is rolling your own Google maps without Google.

The session also covered Google Maps. My own beginner effort, showing where I live, is here.

Back in the USA

Well I am back in the USA to attend the OSCON2006 Conference.

Observations/Trends

  • Atkins is out. Glycemic Index (an Australian invention) is in
  • Burger King now has a “BK Stacker”. They go to 4. That is: four beef patties interleaved with four slices of cheese. Yours for USD9.
  • At the cinema, a small drink is 600ml, a medium is 1litre. Wow!
  • Saw a lot less large SUVs and trucks on the road. I saw a lot of them parked at people’s houses. I saw plenty of Hummers sitting in car yards. Speculation: are people driving their smaller second car as their main car? On the other hand I am in Portland, which is known for being relatively green.
  • I was here for record breaking temperatures. 11 of the highest temperature years have been in the last 14. The media seem not to be commenting on this.
  • Al Gore’s “An Inconvenient Truth” is a hit. I saw the movie at the Lloyd Centre, Portland Oregon. It was almost sold out. The movie makes a great case that Global Warming is both real and a bad thing. It covers the same material as Tim Flannery’s “The Weather Makers”. I think between them 2006 will be the turning/tipping point for Global Warming. Interestingly, the movie avoids mention of Diesel cars and Nuclear energy, neither of which is popular in the USA.
  • The record heat meant that I was unable to get a room on the coast Saturday night and had to drive all the way back to Portland. It was 40°C plus on the west coast.
  • Omega 3 has been added to the plant sterol margarines. Also to a healthy peanut paste. I am looking forward to that coming to Australia.
  • I checked out HDDVD and BluRay players. Neither was impressive. Perhaps we need to wait for new movies to be made. I thought it was in between DVD and broadcast HD. Interestingly, the HDDVD player kept crashing and had to be rebooted. Which one was M$ supporting again? :)

The title of this

The title of this document is “Writely is now stuffed”.

But my blog is not entitled that.

Also, the following should be an image. But it is broken. Why? Because Writely’s img src is missing “http://writely.com”.

So, in the last three months since Google bought Writely, it no longer works properly and is closed for new registrations. Way to go Google!

Tuning Memory Use in ehcache

In ehcache-1.2 the DiskStore asynchronous spooling was reworked and made much faster. It is now possible to fill the spool very quickly. This gives great cache performance but creates a new problem, temporary memory consumption in the spool thread.

The problem has arisen because data is hitting the spool so fast now that the actual number of elements held in the MemoryStore and the spool of the DiskStore can easily exceed the maximum MemoryStore size. You can get memory spikes. Ultimately, I might introduce soft references, so that elements in the spool can simply be reclaimed. This would need to be a configuration option, because some apps are a bit fussy about their cache elements up and disappearing.

The memory spikes, if they are high enough, can cause OutOfMemory errors. For some reason, these would often occur in the flushSpool() method. Ehcache 1.2.0_01 and 1.2.1 contain hardening measures so that these do not cause trouble. If one occurs, that section of the spool that is being written is discarded. So it degrades down to something similar to a SoftReference solution. But it would be better to minimise the occurrence of this behaviour.

This article documents some investigations I have made and a solution I have. The test involves creating 5,500 objects of 10,000 bytes each and putting them in a cache. These go into the MemoryStore and overflow to the DiskStore immediately. The spool thread then takes care of writing them. Memory use spikes to about 55MB and then drops. The disk store was modified to call System.gc() after every 700 Elements spooled.

Figure 1 shows the original implementation. The System.gc() calls have little effect. The memory cannot be released until the flushSpool() method completes. Though not shown, forcing a gc in the profiler after elements have been written will return memory back to about 14MB.

Figure 1: Old Disk Store

Figure 2 shows the memory profile after one small change. As the flushSpool() method iterates through the elements to persist them, it sets each one to null in the array holding references to them, once it has been written. The System.gc() call every 700 elements now actually reclaims memory.

Figure 2: Old Disk Store with just dereference after array element use
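
The dereference-after-use change is easy to show in miniature. The sketch below is a Python analogy, not the ehcache code: Element, flush_spool and the write callback are hypothetical names. CPython reclaims objects by reference counting, so it behaves differently from a JVM collector. What carries over is the idea that a slot in the holding array keeps its element alive until the slot is cleared.

```python
import gc
import weakref


class Element:
    """Stand-in for a cache element (hypothetical, not the ehcache class)."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload


def flush_spool(spool, write):
    """Write every element, clearing each slot as soon as it is persisted."""
    for index in range(len(spool)):
        write(spool[index])
        # Drop the spool's reference right away instead of holding all
        # elements until the whole flush has finished.
        spool[index] = None


spool = [Element(i, bytes(10000)) for i in range(5)]
# Weak references let us observe reclamation without keeping elements alive.
watchers = [weakref.ref(element) for element in spool]

written_keys = []
flush_spool(spool, lambda element: written_keys.append(element.key))

gc.collect()
reclaimed = sum(1 for watcher in watchers if watcher() is None)
print("written:", written_keys, "reclaimed:", reclaimed)
```

Without the spool[index] = None line every element would stay reachable through the list until the list itself went away, which is the situation Figure 1 describes.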

Figure 3 shows an implementation which uses the technique from Figure 2 plus two more:

  1. Split the work across 5 methods. The theory here is that some JDK implementations do not actually reclaim memory for references that are dereferenced in the same method. They wait for the method to return first.
  2. ByteArrayOutputStream creates lots of temporary byte[] using System.arraycopy(). It starts with 32 bytes, and then bit shifts to the left by one (doubling) each time more memory is required. So the sequence is:

    32
    64
    128
    256
    512
    1024
    2048
    4096
    8192
    16384
    32768
    65536

    For our 10000 byte objects, 10 byte arrays are created by ByteArrayOutputStream as the stream is written, totalling 32736 bytes. We create a subclass that uses as its starting point the average size of each entry in the DiskStore.
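
The cost of the doubling can be checked with a little arithmetic. The Python sketch below models only the growth rule described in point 2 (start at 32 bytes, double until the payload fits). It is not the JDK source, and buffers_allocated is a hypothetical helper. It also ignores any serialization overhead on top of the 10000 byte payload.

```python
def buffers_allocated(payload_size, initial_capacity=32):
    """Return the sizes of every backing array a doubling buffer allocates
    while growing to hold payload_size bytes."""
    capacity = initial_capacity
    sizes = [capacity]
    while capacity < payload_size:
        capacity <<= 1  # bit shift left by one, i.e. double
        sizes.append(capacity)
    return sizes


# Default starting point: many short-lived arrays per element.
default_sizes = buffers_allocated(10000)
print(len(default_sizes), "arrays,", sum(default_sizes), "bytes")

# Starting at the expected entry size: a single array, no copying.
presized = buffers_allocated(10000, initial_capacity=10000)
print(len(presized), "array,", sum(presized), "bytes")
```

Under this model a buffer that starts at the expected entry size allocates one array instead of a chain of progressively larger ones, which is the motivation for the pre-sized subclass.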

This implementation’s memory use is much smoother.

Figure 3: New Disk Store

The new improved implementation will be released in ehcache-1.2.0_02 and ehcache-1.2.2.

We have a Python

We have a Python based monitoring application that regularly tops the CPU list, using more CPU than the application it is monitoring.

I am a bit of a newcomer to Python. It is in stable 8th position on the Tiobe programming index (http://www.tiobe.com/tpci.htm). It is mature, has tons of libraries available for it and is suitable for a broad range of tasks.

Python is Sloooooooow

There is something wrong with the monitoring app being hungrier than the app it is monitoring.

So, how fast is Python? Benchmarks always seem to lead to lots of fights. My raw data, and my subsequent conclusion, are based on:

  1. A loop test I wrote. Java was 200 times faster. I had some Python guys look it over. The test is valid, but Python most likely does badly because ints are primitives in Java but fully fledged objects in Python.
  2. Computer Language Shootout: http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=java&lang2=python

    The graphical results for a series of different benchmarks are shown below. Java is up to 150 times faster. The average is around 10 times faster.

  3. http://www.timestretch.com/FractalBenchmark.html This benchmark is 1.25 seconds Java, 15 seconds Python. Java is about 10 times faster.
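
The loop test from point 1 is not reproduced in this post. To give a rough idea of what such a micro-benchmark looks like on the Python side, here is a hypothetical sketch. The sum_loop function and the iteration count are assumptions, and since there is no Java half it cannot reproduce the 200 times figure by itself.

```python
import timeit


def sum_loop(n):
    """Add up 0..n-1 in a plain loop; every int here is a full Python object."""
    total = 0
    for i in range(n):
        total += i
    return total


iterations = 1000000
runs = 3
# timeit returns the total time for all runs, so divide to get a per-run figure.
seconds = timeit.timeit(lambda: sum_loop(iterations), number=runs) / runs
print("%d iterations took about %.3f s per run" % (iterations, seconds))
```

The timing will vary from machine to machine. The equivalent loop in Java works on a primitive int, which is the likely reason for the large gap noted above.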

My conclusion is that Python is about 10 times slower than Java. (Though not of interest right now, I also took a look at Ruby. It seems to be about 15 times slower). Does this matter? Right now we have a problem and it does matter.

Fixing It

So, what to do? The Python books I have suggest using C libraries for the bits that are slow. So, how to tell what is slow? Fortunately Python comes with a very simple and easy to use profiler. To add profiling to your app simply:

import profile
profile.run('main()')  # or whatever your entry point is called

We did this for our monitoring app and found, sure enough, that the largest performance antipattern of all time had reared its head yet again: XML. In our case it was a Python lib called tramp xml. It works recursively and seems to hit all the bad points of Python performance. Fortunately we can change our file format to non-XML and avoid the issue.
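
For a report that points straight at the expensive call chain, the stats can be sorted by cumulative time. The sketch below uses cProfile, which has the same interface as profile with lower overhead, together with pstats from the standard library. The parse_nested function is a made-up recursive workload standing in for the XML parsing, and profile_report is a hypothetical helper.

```python
import cProfile
import io
import pstats


def parse_nested(depth):
    """Hypothetical recursive workload standing in for the XML parsing."""
    if depth == 0:
        return 1
    return parse_nested(depth - 1) + parse_nested(depth - 1)


def main():
    return parse_nested(12)


def profile_report(func, top=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(main)
print(report)
```

The recursive function shows up near the top of the listing with thousands of calls, which is the kind of signature to look for when hunting a chokepoint.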

What about psyco?

We also considered accelerating Python with psyco (http://psyco.sourceforge.net/).

We are running 64-bit Linux on AMD64. So the requirement for “A 32-bit Pentium or any other Intel 386 compatible processor. Sorry, no other processor is supported. Psyco does not support the 64-bit x86 architecture, unless you have a Python compiled in 32-bit compatibility mode.” sort of kills it for us.

Secondly, psyco is being deprecated in favour of PyPy (http://codespeak.net/pypy/dist/pypy/doc/news.html). PyPy is not yet ready for primetime.

What about a C Lib?

The usual solution to Python performance problems is to use a C library to speed up whatever is your chokepoint in Python. I think that is a valid approach, and one we would have used had we not been able to simply remove the XML.

What about Jython?

I have been playing with Jython lately. Unfortunately Jython does not use the JIT, so its performance on Java sucks.

Conclusions

  1. Python is slooooow. 10 times slower than Java, which itself is about two times slower than C.
  2. If you are not a C shop, the usual solution of porting the slow bits to C will be a bit too hard.
  3. As more production apps migrate to 64-bit AMD64 and EM64T, psyco falls away as a solution. (For non-386 it has never been an option.)
  4. Carefully consider the performance requirements of your application before you select Python as the implementation language.

How We Solved our Garbage Collection Pausing Problem

I had our main J2EE app at work with 9 second pauses. These would happen on average every 50 seconds. Needless to say this was a huge performance problem. Pauses are caused by major garbage collections. Minor garbage collections do not cause pausing. Pausing means nothing, absolutely nothing, gets done in your app. 9 seconds is a long time. The peaks were up to 15 seconds.
We tried quite a few garbage collection settings. They each behaved differently but could not be considered better. In the end we consulted some engineers at Sun who, after analysing our verbose gc logs, gave us the following piece of black magic:
java … -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:NewSize=1200m -XX:SurvivorRatio=16
The reasoning for each setting is as follows:
-XX:+DisableExplicitGC – some libs call System.gc(). This is usually a bad idea and could explain some of what we saw.
-XX:+UseConcMarkSweepGC – use the low pause collector
-XX:NewSize=1200m -XX:SurvivorRatio=16 – the black magic part. Tuning these requires empirical observation of your GC log, either from verbose gc or jstat (a JDK 1.5 tool). In particular the 1200m new size is 1/4 of our heap size of 4800MB.
What was the result? Major GCs and their attendant pauses reduced to 2 per day, from once every 50 seconds. Mean response times dropped from seconds to milliseconds. All in all, one of the best results I have achieved this year. Thanks Sun.
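
The sizing arithmetic behind the last two flags can be worked through in a few lines. The Python sketch below is only a calculator, and young_generation_layout is a hypothetical helper. It assumes HotSpot's usual meaning of SurvivorRatio, namely the ratio of eden to one survivor space, with the young generation made up of eden plus two survivor spaces. The sizes the JVM actually settles on may differ from these nominal figures.

```python
def young_generation_layout(heap_mb, new_fraction=0.25, survivor_ratio=16):
    """Return (new_size_mb, eden_mb, survivor_mb) for the given heap.

    Assumes SurvivorRatio is eden divided by ONE survivor space, and that
    the young generation holds eden plus two survivor spaces.
    """
    new_size_mb = heap_mb * new_fraction
    survivor_mb = new_size_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return new_size_mb, eden_mb, survivor_mb


new_size, eden, survivor = young_generation_layout(4800)
flags = "-XX:NewSize=%dm -XX:SurvivorRatio=%d" % (new_size, 16)
print(flags)
print("eden ~%.0f MB, each survivor space ~%.0f MB" % (eden, survivor))
```

For the 4800MB heap this gives the 1200m new size quoted above, split into roughly 1067MB of eden and two survivor spaces of about 67MB each. Treat the one-quarter fraction as a starting point to check against your own GC log, not a rule.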