
The forthcoming Ehcache 1.6.0 is compatible and works with Google App Engine. You can get it now from ehcache snapshots.
Google App Engine provides a constrained runtime which restricts networking, threading and file system access. All features of Ehcache can be used except for the DiskStore and replication. Having said that, there are workarounds for these limitations.

Why use Ehcache with Google App Engine?

Ehcache cache operations take a few µs, versus around 60ms for Google’s provided client-server cache memcacheg (as reported on cloudstatus.com). Because it uses far fewer resources, it is also cheaper.
You can also store non-Serializable objects in it. And finally there is the rich Ehcache API that you can leverage.

Recipes

Setting up Ehcache as a local cache in front of memcacheg

The idea here is that your caches are set up in a cache hierarchy. Ehcache sits in front and memcacheg behind. Combining the two lets you elegantly work around limitations imposed by Google App Engine. You get the benefits of the µs speed of Ehcache together with the unlimited size of memcacheg.
Ehcache contains the hooks to easily do this.
To update memcacheg, use a CacheEventListener.
To search against memcacheg on a local cache miss, use cache.getWithLoader() together with a CacheLoader for memcacheg.
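As a rough illustration of the hierarchy, here is a plain-JDK sketch. It deliberately does not use the Ehcache or App Engine APIs: TwoTierCache, local and remote are hypothetical names, and the remote map merely stands in for memcacheg. In a real setup the write-through step would live in a CacheEventListener and the miss handling in a CacheLoader.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical two-tier cache: a fast local map in front of a slower shared store. */
class TwoTierCache {
    private final Map<String, String> local = new ConcurrentHashMap<String, String>();
    private final Map<String, String> remote; // stand-in for memcacheg (assumption)

    public TwoTierCache(Map<String, String> remote) {
        this.remote = remote;
    }

    /** Write-through: every local put is also pushed to the remote tier. */
    public void put(String key, String value) {
        local.put(key, value);
        remote.put(key, value);
    }

    /** Read-through: on a local miss, consult the remote tier and keep a local copy. */
    public String get(String key) {
        String value = local.get(key);
        if (value == null) {
            value = remote.get(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    public boolean isLocal(String key) {
        return local.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> shared = new ConcurrentHashMap<String, String>();
        TwoTierCache nodeA = new TwoTierCache(shared);
        TwoTierCache nodeB = new TwoTierCache(shared);
        nodeA.put("user:1", "Alice");
        System.out.println(nodeB.isLocal("user:1")); // false
        System.out.println(nodeB.get("user:1"));     // Alice
        System.out.println(nodeB.isLocal("user:1")); // true
    }
}
```

The second node only pays the remote cost once; after the first miss the value is served locally.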

Using memcacheg in place of a DiskStore

In the CacheEventListener, ensure that when notifyElementEvicted() is called, which happens when a put exceeds the MemoryStore’s capacity, the key and value are put into memcacheg.
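The eviction idea can be sketched with the JDK alone. This is not Ehcache code: SpillingStore is a hypothetical name, a LinkedHashMap stands in for the MemoryStore, the overflow map stands in for memcacheg, and removeEldestEntry() plays the role that notifyElementEvicted() would play in a real listener. The sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU store that spills evicted entries to an overflow map. */
class SpillingStore {
    private final Map<String, String> overflow; // stand-in for memcacheg (assumption)
    private final Map<String, String> memory;

    public SpillingStore(final int capacity, final Map<String, String> overflow) {
        this.overflow = overflow;
        // accessOrder = true gives least-recently-used eviction order
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > capacity) {
                    // Save the entry before it is dropped from memory
                    overflow.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    public void put(String key, String value) {
        memory.put(key, value);
    }

    /** Looks in memory first, then falls back to the overflow store. */
    public String get(String key) {
        String value = memory.get(key);
        return value != null ? value : overflow.get(key);
    }

    public boolean inMemory(String key) {
        return memory.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> overflow = new HashMap<String, String>();
        SpillingStore store = new SpillingStore(2, overflow);
        store.put("a", "1");
        store.put("b", "2");
        store.put("c", "3"); // exceeds capacity, so "a" spills to overflow
        System.out.println(store.inMemory("a")); // false
        System.out.println(overflow.get("a"));   // 1
        System.out.println(store.get("a"));      // 1
    }
}
```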

Distributed Caching

Configure all notifications in CacheEventListener to proxy through to memcacheg.
Any work done by one node can then be shared by all others, with the benefit of local caching of frequently used data.

Dynamic Web Content Caching

Google App Engine provides acceleration for files declared static in appengine-web.xml.
e.g.




You can get acceleration for dynamic files using Ehcache’s caching filters as you usually would.

Getting Started

To get started see the Ehcache with Google App Engine HowTo.

Anyone with a project on SourceForge who uses Maven knows how poorly it supports Maven repositories.

You can set up repos in the Apache virtual for your project, but you are limited to 100MB. That gets chewed up very fast.
Secondly, about 6 months ago they revoked ssh access. The result is that the scp wagon cannot create directories. (On a side note, the site deploy is similarly affected – it can scp the file to SourceForge but cannot unzip.) My workaround for this problem has been very frustrating. I do a deploy, which fails, but tells me what directories it was trying to create. Then I sftp to SourceForge and create the directories, then run deploy again. This is such a pain that I have given up deploying snapshots, which then hurts my users.
Now when it comes to deploying to the Central repo I have set up with them a sync for the ehcache artifacts on SourceForge. That works fine. However recently I needed to do an update to jsr107cache, the draft API for JCACHE. This one is not synced. I have been waiting a month plus for action on the manual JIRA upload process to get that deployed to central, the result being that someone has logged a bug against the ehcache-jcache module saying there is no jsr107cache. Now it is available on my SourceForge repo but that takes more work for people to figure out.
So, in summary, SourceForge is not doing a good Maven job. As a past SourceForge project of the month, I thought I should try to get them to fix things. The last thing I want to do is to move my project, with all the hassle that entails. One thing I like about SourceForge is that you retain complete control over the projects you own. There is no benevolent dictator who can sweep in and take over your project. Ross Turk of SourceForge has been quite responsive to my suggestions, but I think they have a lot of projects to manage and Maven is a Java thing. Java is just one of the many languages their projects are written in.
I have now taken up an offer from Jason Van Zyl for free hosting of my primary repository at http://oss.sonatype.org. It is synced with central, so my jsr107cache problems are over. No distribution problems either.
<distributionManagement>
    <repository>
        <id>sourceforge-releases</id>
        <name>Sourceforge Release Repository</name>
        <url>http://oss.sonatype.org/content/repositories/sourceforge-releases</url>
    </repository>
    <snapshotRepository>
        <id>sourceforge-snapshots</id>
        <name>Sourceforge Snapshot Repository</name>
        <url>http://oss.sonatype.org/content/repositories/sourceforge-snapshots</url>
    </snapshotRepository>
</distributionManagement>
I have already started deploying snapshots to this repo.
The full contents of the old repositories at http://ehcache.sf.net/repository and http://ehcache.sf.net/snapshotrepository have been migrated to sonatype by the guys there. I have deleted the old repository to avoid confusion.
At sonatype my stuff lives in a SourceForge repository. Any other SourceForge based projects wishing to host their Maven repos there should contact the guys at Sonatype, as this will be simple for them to set up.
oss.sonatype.org runs on Nexus, a new Maven Repository Manager (“MRM”). It makes a lot of sense to use an MRM inside your company. While a dumb Apache directory works, there is so much more you can do with an MRM. Searching, proxying externals and providing developer-friendly metadata are top of my list. Nexus is dual sourced, with a community edition giving most of what you want, and things like LDAP for security available in the licensed version. And of course, it uses ehcache :)
Using a smart primary repo, which is then synced to central is better for quality too. As Jason says, 
“if we could get enough projects to use the Nexus gateway then we can really start taking real measures to have signatures, sources, javadocs and even check that transitive closures are intact. Basically prevent any shit from getting into the central repository and making it easier for projects to get artifacts to central.”  

I have been waiting for enough people to move to Java 5 to mandate it as a minimum standard for ehcache. At JavaOne 2008 I found out that a lot of people were still to make the move. Now that we are in 2009 I have decided to move to Java 5. As part of this I have done a general cleanup of the core. I can now retire backport-concurrent which has served the project well (thanks guys) and other dependencies. Ehcache-1.6 core has no dependencies.

I decided that with the improvements in concurrency support that have come along, it was time to move beyond the use of synchronized. Years ago I adopted striped locking on BlockingCache which gave amazing results but I left the core pretty much as it was. The rework adopts some new goodness in Java 5 such as CopyOnWriteArrayList and ConcurrentHashMap. Having said that, there is nothing in Java 5 for eviction, so the new work relies heavily on some excellent contributions to provide performance for caching applications that is not available in Java 5.
On my own concurrency tests, which use 70 threads simulating a typical load against a single cache, I get the following improvements in ehcache-1.6 over ehcache-1.5. (Note that the 70 threads are just for that one cache. Ehcache typically has many caches, so this translates to a production system with thousands of threads against all caches.)
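For readers who have not met striped locking, here is a minimal sketch of the general technique. It is not the BlockingCache code: StripedCounter and the stripe count are hypothetical, chosen only to show how hashing a key onto one of several locks lets threads working on different keys avoid contending on a single monitor.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical counter map guarded by striped locks instead of one global lock. */
class StripedCounter {
    private final Object[] locks;
    private final Map<String, Integer>[] segments;

    @SuppressWarnings("unchecked")
    public StripedCounter(int stripes) {
        locks = new Object[stripes];
        segments = new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
            segments[i] = new HashMap<String, Integer>();
        }
    }

    private int stripeFor(String key) {
        // Mask the sign bit so the index is never negative
        return (key.hashCode() & Integer.MAX_VALUE) % locks.length;
    }

    /** Increments the counter for key while holding only that key's stripe lock. */
    public int increment(String key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            Integer old = segments[i].get(key);
            int next = (old == null) ? 1 : old + 1;
            segments[i].put(key, next);
            return next;
        }
    }

    public int get(String key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            Integer value = segments[i].get(key);
            return value == null ? 0 : value;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final StripedCounter counter = new StripedCounter(16);
        Thread[] threads = new Thread[8];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < 1000; n++) {
                        counter.increment("key" + (n % 4));
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println(counter.get("key0")); // 2000
    }
}
```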
Operation    Times faster than Ehcache-1.5.0
get          92.5
put          30
remove       48
removeAll    80
keySet       30

Manik Surtani maintains a cache performance benchmark tool. Using that I have added ehcache-1.6. It shows dramatically the performance increases in Ehcache-1.6.
For those with less than perfect eyesight, the second column, which is too short to even have its time printed, is the ehcache-1.6 performance.
What these charts are saying is that an ehcache, with 25 concurrent threads, is now much faster than it was. The single-threaded case is no faster. But caches are not about single threads.
Now, in case everyone gets preoccupied with the comparisons between Java caches, here is an old Ehcache versus Memcached chart.
If I redid this chart using ehcache, the barely visible columns for ehcache would completely disappear on this scale.
So what is this really saying? An in-process cache, which uses a few tens of CPU operations to access data already held in memory, is much, much, much faster than going out over the network for some data, regardless of how slick the server implementation at the other end is.
But I recognise that Memcached is about a different type of caching: massive partitioned caches. The Ehcache project has the Ehcache Server for that, with RESTful and SOAP APIs. The RESTful implementation uses a variety of tricks such as conditional get, the ability to have hardware and software load balancers (think nginx) perform URI routing, HEAD, HTTP/1.1 compression and pipelining, plus the goodness of modern NIO Java Web Containers to seriously give memcached a run for its money. I will be doing some performance comparisons between Memcached and Ehcache Server in the near future.
What else is next? The above numbers are for MemoryStore based caches. I am also going to give the DiskStore a work over, with lots of suggestions made to me in the last year. Stay tuned.
 

Users of ehcache server have been discussing extending the basic CRUD operations of REST with some more advanced methods, such as deleting all elements in a cache with one DELETE operation.
You are most welcome to join what has become an informative forum thread here: https://sourceforge.net/forum/forum.php?thread_id=2546225&forum_id=322278
So far we have posts from myself, Jim Webber, Brett Dargan and others interested in creating or finding a REST convention for referring to all and specifying means of multi-get, multi-put and multi-delete.

In April Dave Whitla created a project for a Maven Glassfish Plugin.

Kohsuke Kawaguchi joined the project and copied his code in and released it. His focus was V3 Embedded. It supported one goal: run.

There was disagreement as to the features and the code to use. Dave’s plugin was to support a wide range of goals supporting integration of V2 and above into the build process.

Now to use the convenience name you normally add a pluginGroup:

<pluginGroups>
    <pluginGroup>org.glassfish</pluginGroup>
    <pluginGroup>org.glassfish.maven.plugin</pluginGroup>
</pluginGroups>

The end result is that we have two plugins called maven-glassfish-plugin which are different, but because they use the special naming of maven-name-plugin, both are invoked with mvn glassfish:goal, causing a namespace conflict.

Now when I do mvn glassfish:run I get:

mvn glassfish:run
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'glassfish'.
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Required goal not found: glassfish:run in org.glassfish.maven.plugin:maven-glassfish-plugin:2.1
[INFO] ------------------------------------------------------------------------

Until Dave’s plugin adds a run goal, you can work around it by avoiding Maven’s convenience naming conventions and fully qualifying Kohsuke’s.
mvn org.glassfish:maven-glassfish-plugin:run
It would be nice for one of Kohsuke, Dave or Byron to sort this out.
My suggestion is for Kohsuke to rename his to maven-glassfish-embedded-plugin.

I gave a talk today at the Glassfish V3 Prelude Launch Event. Ehcache Server uses Glassfish for its self contained cache server. You can watch the video of the session here.

Rick Bryant sent me some sample code he wrote which shows how to use the RESTful Cache Server from Java. Thanks Rick. To use the sample just fire up the cache server: startup.sh and then run the following Java code.

package samples;

import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/**
 * A simple example Java client which uses the built-in java.net.URLConnection.
 *
 * @author BryantR
 * @author Greg Luck
 */
public class ExampleJavaClient {

    private static String TABLE_COLUMN_BASE =
            "http://localhost:8080/ehcache/rest/tableColumn";
    private static String TABLE_COLUMN_ELEMENT =
            "http://localhost:8080/ehcache/rest/tableColumn/1";

    /**
     * Creates a new instance of EHCacheREST
     */
    public ExampleJavaClient() {
    }

    public static void main(String[] args) {
        URL url;
        HttpURLConnection connection = null;
        InputStream is = null;
        OutputStream os = null;
        int result = 0;
        try {
            //create cache
            URL u = new URL(TABLE_COLUMN_BASE);
            HttpURLConnection urlConnection = (HttpURLConnection) u.openConnection();
            urlConnection.setRequestMethod("PUT");
            int status = urlConnection.getResponseCode();
            System.out.println("Status: " + status);
            urlConnection.disconnect();

            //get cache
            url = new URL(TABLE_COLUMN_BASE);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.connect();
            is = connection.getInputStream();
            byte[] response1 = new byte[4096];
            result = is.read(response1);
            while (result != -1) {
                System.out.write(response1, 0, result);
                result = is.read(response1);
            }
            if (is != null) try {
                is.close();
            } catch (Exception ignore) {
            }
            System.out.println("reading cache: " + connection.getResponseCode()
                    + " " + connection.getResponseMessage());
            if (connection != null) connection.disconnect();

            //create entry
            url = new URL(TABLE_COLUMN_ELEMENT);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestProperty("Content-Type", "text/plain");
            connection.setDoOutput(true);
            connection.setRequestMethod("PUT");
            connection.connect();
            String sampleData = "ehcache is way cool!!!";
            byte[] sampleBytes = sampleData.getBytes();
            os = connection.getOutputStream();
            os.write(sampleBytes, 0, sampleBytes.length);
            os.flush();
            System.out.println("result=" + result);
            System.out.println("creating entry: " + connection.getResponseCode()
                    + " " + connection.getResponseMessage());
            if (connection != null) connection.disconnect();

            //get entry
            url = new URL(TABLE_COLUMN_ELEMENT);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.connect();
            is = connection.getInputStream();
            byte[] response2 = new byte[4096];
            result = is.read(response2);
            while (result != -1) {
                System.out.write(response2, 0, result);
                result = is.read(response2);
            }
            if (is != null) try {
                is.close();
            } catch (Exception ignore) {
            }
            System.out.println("reading entry: " + connection.getResponseCode()
                    + " " + connection.getResponseMessage());
            if (connection != null) connection.disconnect();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (os != null) try {
                os.close();
            } catch (Exception ignore) {
            }
            if (is != null) try {
                is.close();
            } catch (Exception ignore) {
            }
            if (connection != null) connection.disconnect();
        }
    }
}
The RESTful Ehcache Server is designed to achieve massive scaling using data partitioning – all from a RESTful interface. The largest ehcache single instances run at around 20GB in memory. The largest disk stores run at 100GB each. Add nodes together, with cache data partitioned across them, to get larger sizes. 50 nodes at 20GB gets you to 1 Terabyte.
Two deployment choices need to be made: where is partitioning performed, and is redundancy required? These choices can be mixed and matched with a number of different deployment topologies.
This topology is the simplest. It does not use a load balancer. Each node is accessed directly by the cache client using REST. No redundancy is provided.
The client can be implemented in any language because it is simply an HTTP client. It must work out a partitioning scheme. Simple key hashing, as used by memcached, is sufficient. Here is a Java code sample:
String[] cacheservers = new String[]{"cacheserver0.company.com", "cacheserver1.company.com", "cacheserver2.company.com", "cacheserver3.company.com", "cacheserver4.company.com", "cacheserver5.company.com"};
Object key = "123231";
// Mask the sign bit rather than calling Math.abs(), which stays negative for Integer.MIN_VALUE
int hash = key.hashCode() & Integer.MAX_VALUE;
int cacheserverIndex = hash % cacheservers.length;
String cacheserver = cacheservers[cacheserverIndex];
Redundancy is added as shown in the above diagram by:
  1. Replacing each node with a cluster of two nodes. One of the existing distributed caching options in ehcache is used to form the cluster. Options in ehcache 1.5 are RMI and JGroups-based clusters. Ehcache-1.6 will add JMS as a further option.
  2. Putting each ehcache cluster behind a VIP on a load balancer.
Interestingly, content-switching load balancers support URI routing using some form of regular expressions. So, you could optionally skip the client-side hashing to achieve partitioning in the load balancer itself. For example:
/ehcache/rest/sampleCache1/a* => cluster1
/ehcache/rest/sampleCache1/b* => cluster2
Things get much more sophisticated with F5 load balancers, which let you create iRules in the Tcl language. So rather than regular expression URI routing, you could implement key hashing-based URI routing. Remember that in Ehcache’s RESTful server, the key forms the last part of the URI. For example, in the URI http://cacheserver.company.com/ehcache/rest/sampleCache1/3432 , 3432 is the key. See http://devcentral.f5.com/Default.aspx?tabid=63&PageID=153&ArticleID=135&articleType=ArticleView for a sample URI hashing iRule.
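To make the routing decision concrete, here is a sketch in Java of what a key-hashing router computes. It is not an iRule and not load balancer code: KeyHashRouter and the cluster names are hypothetical, and Java's String.hashCode() is used only as a convenient stand-in for whatever hash function the load balancer provides.

```java
/** Hypothetical router: picks a cluster from the key in the last URI path segment. */
class KeyHashRouter {
    private final String[] clusters;

    public KeyHashRouter(String[] clusters) {
        this.clusters = clusters;
    }

    /** Returns the last path segment, which is the cache key in the RESTful URI. */
    public static String keyOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Maps a request path to a cluster by hashing the key. */
    public String route(String path) {
        // Mask the sign bit so the index is never negative
        int hash = keyOf(path).hashCode() & Integer.MAX_VALUE;
        return clusters[hash % clusters.length];
    }

    public static void main(String[] args) {
        KeyHashRouter router = new KeyHashRouter(new String[]{"cluster1", "cluster2"});
        String path = "/ehcache/rest/sampleCache1/3432";
        System.out.println(KeyHashRouter.keyOf(path)); // 3432
        System.out.println(router.route(path));
    }
}
```

Because the same key always hashes to the same cluster, clients need no knowledge of the partitioning scheme; they only talk to the load balancer.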

IntelliJ 8 milestone 1, a.k.a. Diana rocks! For the non-IntelliJ users of this world, 8m1 was released in the last week.
IntelliJ 7 annoyed me. It was slow and bloated. Some stuff was added in without enough thought. The facets feature, which I was unable to turn off, irritated me so much I logged a bug about it. It sort of felt like the philosophy of Idea had been abandoned.
When you start 8 it asks you what features you want. You are warned that they come at a cost. There are more features than 7. You can always turn them on later, presumably by adding the plugin. The first example is RCSs. I just added svn, as that is all I am using these days. You then go through a series of screens.
So, I added what I need and voila! – all of the 7 sluggishness has gone.
I have now changed to 8 full time.

I have just released ehcache-server-0.3, which includes a fully functional RESTful, resource-oriented implementation. The standalone-server has also been updated to 0.3.

Cache API

CacheManager Resource Operations
OPTIONS /
Lists the methods supported by the CacheManager resource
GET /
Lists the Caches in the CacheManager.
Cache Resource Operations
OPTIONS /{cache}/
Lists the methods supported by the Cache resource
GET /{cache}
Lists the elements in the cache.
PUT /{cache}
Creates a Cache using the defaultCache configuration.
DELETE /{cache}
Deletes the Cache.
Element Resource Operations
OPTIONS /{cache}/{element}
Lists the methods supported by the Element resource
HEAD /{cache}/{element}
Retrieves the same metadata a GET would receive returned as HTTP headers. There is no body returned.
GET /{cache}/{element}
Gets the element.
PUT /{cache}/{element}
Puts an element into the Cache.
DELETE /{cache}/{element}
Deletes the element from the cache.
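The operations above map directly onto java.net.HttpURLConnection. The following is a sketch only: RestCacheClient and its method names are hypothetical, and the base URL assumes a server on localhost:8080 under /ehcache/rest with a cache named sampleCache1.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical thin client for the cache and element resources. */
class RestCacheClient {
    private final String baseUrl;

    public RestCacheClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Builds the URL of a cache resource, or of an element when key is non-null. */
    public String resource(String cache, String key) {
        return key == null ? baseUrl + "/" + cache : baseUrl + "/" + cache + "/" + key;
    }

    /** Sends a request without a body (OPTIONS, HEAD, GET, DELETE, or PUT to create a cache). */
    public int send(String method, String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        try {
            connection.setRequestMethod(method);
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    /** PUTs an element body and returns the HTTP status code. */
    public int putElement(String cache, String key, String contentType, byte[] body)
            throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(resource(cache, key)).openConnection();
        try {
            connection.setRequestMethod("PUT");
            connection.setRequestProperty("Content-Type", contentType);
            connection.setDoOutput(true);
            OutputStream os = connection.getOutputStream();
            try {
                os.write(body);
            } finally {
                os.close();
            }
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        RestCacheClient client = new RestCacheClient("http://localhost:8080/ehcache/rest");
        String element = client.resource("sampleCache1", "3432");
        System.out.println(element); // http://localhost:8080/ehcache/rest/sampleCache1/3432
        try {
            System.out.println("HEAD   -> " + client.send("HEAD", element));
            System.out.println("DELETE -> " + client.send("DELETE", element));
        } catch (IOException e) {
            System.out.println("No cache server reachable: " + e.getMessage());
        }
    }
}
```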
