We have a Python based monitoring application that regularly tops the CPU list, using more CPU than the application it is monitoring.

I am a bit of a newcomer to Python. It sits in a stable 8th position on the TIOBE programming index (http://www.tiobe.com/tpci.htm). It is mature, has tons of libraries available for it and is suitable for a broad range of tasks.

Python is Sloooooooow

There is something wrong with the monitoring app being hungrier than the app it is monitoring.

So, how fast is Python? Benchmarks always seem to lead to lots of fights. My raw data, and my subsequent conclusion, are based on:

  1. A loop test I wrote. Java was 200 times faster. I had some Python guys look it over. The test is valid, but it most likely shows Python at its worst, because ints are primitives in Java but fully fledged objects in Python.
  2. Computer Language Shootout: http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=java&lang2=python

    The graphical results for a series of different benchmarks are shown below. Java is up to 150 times faster. The average is around 10 times faster.

  3. http://www.timestretch.com/FractalBenchmark.html This benchmark takes 1.25 seconds in Java and 15 seconds in Python, so Java is about 12 times faster, which I round to about 10.
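
If you want to try a loop test of your own, something along these lines would do. This is only a sketch and not the test I ran for item 1; the names count_loop and time_loop and the loop size are made up for illustration. It times a plain integer loop with the standard timeit module, which is the kind of code where Python ints being objects hurts the most.

```python
import timeit


def count_loop(n):
    """Sum the integers 0..n-1 in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i  # every int here is a Python object, not a machine primitive
    return total


def time_loop(n=1_000_000, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    timer = timeit.Timer(lambda: count_loop(n))
    # number=1 runs the loop once per measurement; min() discards noisy runs.
    return min(timer.repeat(repeat=repeats, number=1))


if __name__ == "__main__":
    print("best of 3: %.3f s" % time_loop())
```

Taking the minimum of a few runs is the usual way to reduce noise from other processes on the box. The absolute number only means something when compared with the same loop in another language on the same machine.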

My conclusion is that Python is about 10 times slower than Java. (Though not of interest right now, I also took a look at Ruby. It seems to be about 15 times slower). Does this matter? Right now we have a problem and it does matter.

Fixing It

So, what to do? The Python books I have suggest using C libraries for the bits that are slow. So, how to tell what is slow? Fortunately Python comes with a very simple and easy to use profiler. To add profiling to your app simply:

import profile
profile.run('main()')  # or whatever your entry point is called

We did this for our monitoring app and found, sure enough, that the largest performance antipattern of all time had reared its head yet again: XML. In our case it was a Python library called tramp xml. It works recursively and seems to hit all the bad points of Python performance. Fortunately we can change our file format to something other than XML and avoid the issue.
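
For a report that is easier to read, the profiler output can be sorted so the most expensive calls come first. The sketch below uses cProfile, the lower-overhead profiler in the standard library with the same interface as profile, together with pstats. The functions parse_records and main are hypothetical stand-ins, not code from our monitoring app.

```python
import cProfile
import io
import pstats


def parse_records(n):
    """Hypothetical stand-in for the slow part of an application."""
    return [str(i).zfill(8) for i in range(n)]


def main():
    parse_records(50_000)


def profile_top(func, limit=10):
    """Run func under the profiler and return a text report of the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so callers of expensive code show up first.
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_top(main))
```

Sorting by cumulative time is what pointed us at the XML library: the parsing entry point sat at the top of the list with most of the run time underneath it.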

What about psyco?

We also considered accelerating Python with Psyco (http://psyco.sourceforge.net/).

We are running 64-bit Linux on AMD64, so Psyco's stated requirement sort of kills it for us: “A 32-bit Pentium or any other Intel 386 compatible processor. Sorry, no other processor is supported. Psyco does not support the 64-bit x86 architecture, unless you have a Python compiled in 32-bit compatibility mode.”

Secondly, Psyco is being deprecated in favour of PyPy (http://codespeak.net/pypy/dist/pypy/doc/news.html). PyPy is not yet ready for primetime.
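
If you want the same code to run on both 32-bit and 64-bit boxes, a guarded import is one way to do it. This is a sketch under assumptions: pointer_bits and try_enable_psyco are names I made up, and the only Psyco call used is psyco.full(), the documented way to turn it on for the whole program. On a 64-bit interpreter, or anywhere Psyco is not installed, the function simply reports that nothing was enabled.

```python
import struct


def pointer_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def try_enable_psyco():
    """Enable Psyco if it can work here; return True only if it was enabled."""
    if pointer_bits() != 32:
        # Psyco only supports 32-bit x86 builds of Python.
        return False
    try:
        import psyco
    except ImportError:
        # Psyco is not installed for this interpreter.
        return False
    psyco.full()
    return True


if __name__ == "__main__":
    print("interpreter is %d-bit" % pointer_bits())
    print("psyco enabled: %s" % try_enable_psyco())
```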

What about a C Lib?

The usual solution to Python performance problems is to use a C library to speed up whatever is your chokepoint in Python. I think that is a valid approach, and one we would have used had we not been able to simply remove the XML.
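
You do not always have to write the C yourself to see the effect. As an illustration only, the sketch below compares a hand-written Python loop with the built-in sum, which in CPython is implemented in C. The built-in stands in for a C library here; python_sum and compare are hypothetical names, and the list size is arbitrary.

```python
import timeit


def python_sum(values):
    """Add up values in an interpreted Python loop."""
    total = 0
    for v in values:
        total += v
    return total


def compare(n=200_000, repeats=3):
    """Return (loop_seconds, builtin_seconds), best of several runs each."""
    values = list(range(n))
    loop_time = min(
        timeit.repeat(lambda: python_sum(values), number=5, repeat=repeats)
    )
    # sum() does the same work, but the loop runs inside C code in CPython.
    builtin_time = min(
        timeit.repeat(lambda: sum(values), number=5, repeat=repeats)
    )
    return loop_time, builtin_time


if __name__ == "__main__":
    loop_time, builtin_time = compare()
    print("python loop: %.4f s, built-in sum: %.4f s" % (loop_time, builtin_time))
```

The size of the gap depends on the machine and the Python version, so treat any number you get as a rough guide to what moving a chokepoint into C might buy you.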

What about Jython?

I have been playing with Jython lately. Unfortunately Jython does not use the JIT, so its performance on the JVM sucks.

Conclusions

  1. Python is slooooow. 10 times slower than Java, which itself is about two times slower than C.
  2. If you are not a C shop, the usual solution of porting the slow bits to C will be a bit too hard.
  3. As more production apps migrate to 64-bit AMD64 and EM64T, Psyco falls away as a solution. (For non-386 architectures it has never been an option.)
  4. Carefully consider the performance requirements of your application before you select Python as the implementation language.

By Greg Luck

As Terracotta’s CTO, Greg (@gregrluck) is entrusted with understanding market and technology forces and the business drivers that impact Terracotta’s product innovation and customer success. He helps shape company and technology strategy and designs many of the features in Terracotta’s products. Greg came to Terracotta on the acquisition of the popular caching project Ehcache which he founded in 2003. Prior to joining Terracotta, Greg served as Chief Architect at Australian online travel giant Wotif.com. He also served as a lead consultant for ThoughtWorks on accounts in the United States and Australia, was CIO at Virgin Blue, Tempo Services, Stamford Hotels and Resorts and Australian Resorts and spent seven years as a Chartered Accountant in KPMG’s small business and insolvency divisions. He is a regular speaker at conferences and contributor of articles to the technical press.