Classloader leaks III – “Die Thread, die!”

If you just want a quick fix to the problem without understanding the theory, jump to part VI introducing the ClassLoader Leak Prevention library.

In my previous post we looked at different categories of ClassLoader leaks, and looked at a particular example of a reference from outside the web application ClassLoader (a JVM shutdown hook pointing to a JAI class).

In this post we will look at another category: unterminated Threads running in your ClassLoader. This is a problem you can easily create yourself, but it may also come from third party libraries.

MAT analysis with running thread

When doing the “load all classes from third party JARs” test mentioned in my former post, and analyzing it with the technique outlined in my first post, I also ended up with this finding:

Batik analysis

As you can see, it is a thread still running inside my ClassLoader. We can also see that the thread seems to be part of the Batik library. I was using version 1.5 beta 4, so let’s dig into the sources.

org.apache.batik.util.SoftReferenceCache (from line 181):

    private static Thread cleanup;

    static {
        cleanup = new Thread() {
                public void run() {
                    while(true) {
...
                    }
                }
            };
        cleanup.setDaemon(true);
        cleanup.start();
    }

org.apache.batik.ext.awt.image.rendered.TileMap (from line 139):

    static Thread cleanup;

    static {
        cleanup = new Thread() {
                public void run() {
                    while(true) {
...
                    }
                }
            };
        cleanup.setDaemon(true);
        cleanup.start();
    }

So, what do we have here? Not one but two static blocks (executing as the class is loaded) starting threads that execute in a while(true) loop. Once such a Thread is started, there is no garbage collecting its ClassLoader – neither the ClassLoader having loaded the Thread class (if it is a custom subclass of java.lang.Thread), nor the Thread’s contextClassLoader. In theory, the contextClassLoader of the thread can be changed (although I believe that rarely makes sense), but to garbage collect the ClassLoader of a custom Thread subclass, the thread must stop executing.

In newer versions of Batik, the two pieces of code above have been merged together into a new class – org.apache.batik.util.CleanerThread. That’s good. What’s not good is that there is at the time of this writing still a while(true) loop… This problem has been reported, and a patch has been proposed.

Stopping the thread – gangsta style

Fortunately, a reference to the thread is held in both SoftReferenceCache and TileMap (as can be seen above). In the new CleanerThread, there is also a static reference:

public class CleanerThread extends Thread {

    static volatile ReferenceQueue queue = null;
    static CleanerThread  thread = null;

That enables us to get hold of the Thread instance using reflection (same as with the shutdown hook in the former post) and call stop() on the Thread. Note that stop() is deprecated, since it may lead to an inconsistent state. (You can read more about that in the Thread.stop() JavaDoc and the document that is linked from there.)

In our case however, leaking ClassLoaders and the eventual java.lang.OutOfMemoryError: PermGen space is a bigger problem than any inconsistent state that – if it occurs – presumably affects the abandoned instance of our web application. The best thing we can do in a generic case, is give the thread a chance to finish execution first. So in the cleanup Servlet/context listener we looked at last time, we will add this method, and call it once for every thread that needs to be stopped.

public static void forceThreadStop(Thread thread) {
  thread.interrupt(); // Make Thread stop waiting in sleep(), wait() or join()

  try {
    thread.join(2000); // Give the Thread 2 seconds to finish executing
  } catch (InterruptedException e) {
    // join failed
  }

  // If still not done, kill it
  // (stop() is deprecated and throws UnsupportedOperationException on Java 20 and later)
  if (thread.isAlive())
    thread.stop();
}
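
To make the reflection step concrete, here is a minimal sketch of fetching a Thread from a private static field and giving it a chance to finish. HypotheticalCache is a made-up stand-in for a library class such as SoftReferenceCache, and the names StaticThreadStopper, findStaticThread and interruptAndWait are illustration only. Unlike a thread stuck in an unconditional while(true), the stand-in thread exits when interrupted, so no stop() call is needed here.

```java
import java.lang.reflect.Field;

/** Hypothetical helper, not part of any library. */
class StaticThreadStopper {

  /** Reads a static Thread field via reflection; returns null if it is unset or unreadable. */
  static Thread findStaticThread(Class<?> owner, String fieldName) {
    try {
      Field field = owner.getDeclaredField(fieldName);
      field.setAccessible(true);
      return (Thread) field.get(null); // null target because the field is static
    } catch (ReflectiveOperationException e) {
      return null; // field missing or inaccessible, e.g. the JAR was updated
    }
  }

  /** Interrupts the thread and waits up to millis; returns true if it has terminated. */
  static boolean interruptAndWait(Thread thread, long millis) {
    thread.interrupt();
    try {
      thread.join(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
    return !thread.isAlive();
  }
}

/** Hypothetical stand-in for a library class that starts a thread in a static block. */
class HypotheticalCache {
  private static Thread cleanup;

  static {
    cleanup = new Thread(() -> {
      try {
        while (true) {
          Thread.sleep(1000);
        }
      } catch (InterruptedException e) {
        // Interrupted: leave the loop and let the thread end
      }
    });
    cleanup.setDaemon(true);
    cleanup.start();
  }
}
```

In the cleanup listener you would call findStaticThread() with the real class and field name, and fall back to forceThreadStop() only if interruptAndWait() returns false.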

Stopping threads gracefully

In case you spawn threads from your own code, you should make sure that there is either a definitive ending point for them or, in case they need to be executed over and over again like a watchdog thread as in the case with Batik, that there is a way to gracefully tell the Thread to stop executing.

So, instead of the while(true), you should have a boolean flag that can be altered in order to tell the thread it’s time to die.

public class MyThread extends Thread {

  private boolean running = true;

  public void run() {
    while(running) {
      // Do something
    }
  }

  public void shutdown() {
    running = false;
  }
}

It is very important to note however, that the above code is likely to still leak ClassLoaders. This is because the JVM may cache the value of fields per thread, which Heinz Kabutz explains in somewhat more detail in The Java Specialists’ Newsletter edition titled “The Law of the Blind Spot”.

As Heinz shows, the easiest solution is probably to add the volatile keyword.

public class MyThread extends Thread {

  private volatile boolean running = true;

  public void run() {
    while(running) {
      // Do something
    }
  }

  public void shutdown() {
    running = false;
  }
}

I encourage you to read Heinz’s entire article.
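
A volatile flag alone does not help while the thread is blocked in sleep() or wait(), since the flag is only checked between iterations. The following is a minimal sketch that combines the flag with interrupt() and waits for the thread to end; the class name PollingWorker, the method shutdownAndWait() and the sleep interval are assumptions for illustration.

```java
/** Hypothetical worker thread that can be told to stop, even while sleeping. */
class PollingWorker extends Thread {

  private volatile boolean running = true;

  PollingWorker() {
    setDaemon(true);
  }

  @Override
  public void run() {
    while (running) {
      try {
        Thread.sleep(60_000); // stand-in for waiting on real work
      } catch (InterruptedException e) {
        // Woken up early; the loop condition decides whether to continue
      }
    }
  }

  /** Asks the thread to stop and waits up to timeoutMillis; returns true if it terminated. */
  boolean shutdownAndWait(long timeoutMillis) {
    running = false;
    interrupt(); // wake the thread if it is blocked in sleep()
    try {
      join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
    return !isAlive();
  }
}
```

Calling shutdownAndWait() from the cleanup Servlet or context listener gives the thread a bounded amount of time to finish before the web application is considered shut down.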

That’s all for this time. Until next post, good luck killing those threads!

Links to all parts in the series

Part I – How to find classloader leaks with Eclipse Memory Analyser (MAT)

Part II – Find and work around unwanted references

Part III – “Die Thread, die!”

Part IV – ThreadLocal dangers and why ThreadGlobal may have been a more appropriate name

Part V – Common mistakes and Known offenders

Part VI – “This means war!” (leak prevention library)

Presentation on Classloader leaks (video and slides)

Classloader leaks II – Find and work around unwanted references

If you just want a quick fix to the problem without understanding the theory, jump to part VI introducing the ClassLoader Leak Prevention library.

In my previous post we learnt how to locate classloader leaks using Eclipse Memory Analyzer (MAT).

This time we will discuss different reasons for leaks, look at an example of a leak in a third party library, and see how we can fix that leak by a workaround.

Different reasons for ClassLoader leaks

In order to know what you should be looking for in your heapdump analysis, we could categorize ClassLoader leaks into three different types. In the end, they are all just variants of the first one.

  1. References from outside your webapp – that is from the application server or the JDK classes – to either the ClassLoader itself or one of the classes it has loaded (which in turn has a reference to the ClassLoader), including any instances of such classes.
     
  2. Threads running inside your webapp. If you spawn new threads from within your web application that may not terminate, they are likely to prevent your ClassLoader from being garbage collected. This can happen even if the thread does not use any of the classes loaded by your webapp’s ClassLoader. This is because threads have a context classloader, to which there is a reference (contextClassLoader) in the java.lang.Thread class. More about this in the next post.
     
  3. ThreadLocals with values whose class is loaded in your webapp. If you use ThreadLocals in your webapp, you need to explicitly clear all ThreadLocals before the webapp closes down. This is because a) the application server uses a thread pool, which means that the thread will outlive your webapp instance and b) ThreadLocal values are actually stored in the java.lang.Thread object. Therefore, this is just a variation of 1.
    (Note: This may be the case most likely created by yourself, but also exists in third party libraries.)
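
The context classloader reference in type 2 is easy to overlook, because a new thread picks it up from the thread that creates it. Here is a minimal sketch showing that inheritance; the empty URLClassLoader merely plays the role of a webapp ClassLoader, and the class and method names are made up for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical demo class, not part of any application server. */
class ContextClassLoaderInheritance {

  /** Returns true if a newly created thread picks up the creator's context ClassLoader. */
  static boolean newThreadInheritsContextLoader() {
    Thread current = Thread.currentThread();
    ClassLoader original = current.getContextClassLoader();
    // Stand-in for the ClassLoader of a web application
    ClassLoader webappLike = new URLClassLoader(new URL[0], original);
    current.setContextClassLoader(webappLike);
    try {
      // The thread does not use any class from webappLike and is never even started
      Thread spawned = new Thread(() -> { });
      return spawned.getContextClassLoader() == webappLike;
    } finally {
      current.setContextClassLoader(original); // restore the caller's state
    }
  }
}
```

As long as such a thread is running, its contextClassLoader field is a strong reference to the webapp ClassLoader, regardless of which classes the thread actually touches.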

Example of reference from outside your application

When trying to hunt down a ClassLoader leak in our web application, I created a little JSP page in which I looped through all the third party JARs of our application. I tried to load every single class that was found in a custom ClassLoader, added a ZombieMarker to the ClassLoader (see previous post) and then disposed the ClassLoader. I ran the JSP page over and over again until I got a java.lang.OutOfMemoryError: PermGen space. That is, I was able to trigger ClassLoader leaks just by loading classes from our third party libraries… 🙁 It actually turned out to be more than one of them, that triggered this behaviour.

Here is a MAT trace for one of them:

(In this picture, it’s not obvious where our ClassLoader is. The custom ClassLoader was an anonymous inner class in my JSP, so it’s the second entry with the strange class name ending with $1.)

At first glance, it may seem like this is type 2 above, with a running thread. This is not the case however, since the thread itself is not the GC root (not at the bottom level). In fact, there is a Thread involved, but it is not running.

Rather we can see that what keeps our ClassLoader from being garbage collected is a reference from outside the webapp (java.lang.*) to an instance of com.sun.media.jai.codec.TempFileCleanupThread, which in turn is loaded by our ClassLoader. From the names of the referenced and referencing (java.lang.ApplicationShutdownHook) classes, I suspected that a JVM shutdown hook was added by some Java Advanced Imaging (JAI) class when it was loaded.

The com.sun.media.jai.codec.TempFileCleanupThread class is in the Codec part of JAI; version 1.1.2_01 in our case. The sources can be found in the official SVN repo (1.1.2_01 tag). As you can see, TempFileCleanupThread.java is not in that list. That is because someone thought it was a great idea to put it as a package protected class in FileCacheSeekableStream.java.

There we can also find the source of the leak.

    // Create the cleanup thread. Use reflection to preserve compile-time
    // compatibility with JDK 1.2.
    static {
        try {
            Method shutdownMethod =
                Runtime.class.getDeclaredMethod("addShutdownHook",
                                                new Class[] {Thread.class});

            cleanupThread = new TempFileCleanupThread();

            shutdownMethod.invoke(Runtime.getRuntime(),
                                  new Object[] {cleanupThread});
        } catch(Exception e) {
            // Reset the Thread to null if Method.invoke failed.
            cleanupThread = null;
        }
    }

As suspected, there is a static block that (via reflection) adds a JVM shutdown hook, as soon as the com.sun.media.jai.codec.FileCacheSeekableStream class is loaded. Not very practical in a web application environment, since the JVM will not shut down until the application server is shut down.

The JAI TempFileCleanupThread is supposed to delete temporary files when the JVM shuts down. In a web application, what we want is probably to remove those temporary files as soon as the web application is redeployed. If this was our own code, we should have changed this. In this case it’s a third party library, and judging from the SVN trunk, this still has not been fixed, so upgrading doesn’t help. (This has been reported here.)

Cleaning up leaking references at redeploy

In order to clean up references as part of web application shutdown, to prevent ClassLoader leaks, there are two approaches. You can either put the code in the destroy() method of a Servlet that is load-on-startup

  <servlet servlet-name='cleanup' servlet-class='my.CleanupServlet'>
    <load-on-startup>1</load-on-startup>
  </servlet>

or (probably slightly more correct) you can create a javax.servlet.ServletContextListener and add the cleanup to the contextDestroyed() method.

  <listener>
    <listener-class>my.CleanupListener</listener-class>
  </listener>

The workaround

Fortunately, FileCacheSeekableStream keeps a reference to the shutdown hook in our case.

public final class FileCacheSeekableStream extends SeekableStream {

    /** A thread to clean up all temporary files on VM exit (VM 1.3+) */
    private static TempFileCleanupThread cleanupThread = null;

So let’s grab that reference and remove the shutdown hook. But we probably don’t just want to throw away the hook, since in theory that may leave us with temporary files that should have been deleted at JVM shutdown. Instead get the hook, remove it, and then run it immediately.

We may actually turn this into a generic method, to be reused for other third party shutdown hooks we want to remove. (System.out is used for logging, since logging frameworks usually need to be cleaned up too, and I suggest you do that before calling this method.)

private static void removeShutdownHook(Class clazz, String field) {
  // Note that loading the class may add the hook if not yet present... 
  try {
    // Get the hook
    final Field cleanupThreadField = clazz.getDeclaredField(field);
    cleanupThreadField.setAccessible(true);
    Thread cleanupThread = (Thread) cleanupThreadField.get(null);

    if(cleanupThread != null) {
      // Remove hook to avoid PermGen leak
      System.out.println("  Removing " + cleanupThreadField + " shutdown hook");
      Runtime.getRuntime().removeShutdownHook(cleanupThread);
      
      // Run cleanup immediately
      System.out.println("  Running " + cleanupThreadField + " shutdown hook");
      cleanupThread.start();
      cleanupThread.join(60 * 1000); // Wait up to 1 minute for thread to run
      if(cleanupThread.isAlive())
        System.out.println("STILL RUNNING!!!");
      else
        System.out.println("Done");
    }
    else
      System.out.println("  No " + cleanupThreadField + " shutdown hook");
    
  }
  catch (NoSuchFieldException ex) {
    System.err.println("*** " + clazz.getName() + '.' + field + 
      " not found; has JAR been updated??? ***");
    ex.printStackTrace();
  }
  catch(Exception ex) {
    System.err.println("Unable to unregister " + clazz.getName() + '.' + field);
    ex.printStackTrace();
  }    
}

Now we just call that method in our application shutdown (CleanupServlet.destroy() / CleanupListener.contextDestroyed()) like so:

removeShutdownHook(com.sun.media.jai.codec.FileCacheSeekableStream.class,
  "cleanupThread");

In a worst case scenario, if there is no reference kept to the shutdown hook, we may use reflection into the JVM classes. It would look like this:

final Field field = 
  Class.forName("java.lang.ApplicationShutdownHooks").getDeclaredField("hooks");
field.setAccessible(true);
Map<Thread, Thread> shutdownHooks = (Map<Thread, Thread>) field.get(null);
// Iterate copy to avoid ConcurrentModificationException
for(Thread t : new ArrayList<Thread>(shutdownHooks.keySet())) {
  if(t.getClass().getName().equals("class.name.of.ShutdownHook")) { // TODO: Set name
    // Make sure it's from this web app instance
    if(t.getClass().getClassLoader().equals(this.getClass().getClassLoader())) {
      Runtime.getRuntime().removeShutdownHook(t); // Remove hook to avoid PermGen leak
      t.start(); // Run cleanup immediately
      t.join(60 * 1000); // Wait up to 1 minute for thread to run
    }
  }
}

That’s all for this post. Next time we’ll look at threads running within your ClassLoader.

Update – Bean Validation API begs “FIXME”

I can’t help but post an additional example, that I found just the other day. Had some PermGen errors in a new webapp and this is what I found:

Looking at Validation.java and the inner class javax.validation.Validation.DefaultValidationProviderResolver it does, at least in the current revision, contain these lines of code:

		//cache per classloader for an appropriate discovery
		//keep them in a weak hashmap to avoid memory leaks and allow proper hot redeployment
		//TODO use a WeakConcurrentHashMap
		//FIXME The List<VP> does keep a strong reference to the key ClassLoader, use the same model as JPA CachingPersistenceProviderResolver
		private static final Map<ClassLoader, List<ValidationProvider<?>>> providersPerClassloader =
				new WeakHashMap<ClassLoader, List<ValidationProvider<?>>>();

Isn’t that nice? In the Bean Validation API (JSR 303) – not an implementation but the API – there is a cache that has been created with hot redeployment in mind, and still it has the potential to leak classloaders. Not only that – the authors of the code have been aware that it can leak classloaders, and still validation-api-1.0.0.GA.jar was released, without any means of manually telling the cache to release our ClassLoader. Sigh…

The leak is triggered when the API is shipped with your application server, but the implementation (Hibernate Validator in my case) is provided in your web application, and thus loaded with your classloader.

Using reflection like above, we stop the leak by getting hold of the Map and remove() our classloader. Alternatively, we could add the JAR of our Validation provider on the Application Server level, so that the cache will not reference our webapp ClassLoader at all.
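
The reflection approach could be sketched as follows. Since the real javax.validation classes are not available here, HypotheticalResolver is a made-up stand-in that mimics the cache shown above; the helper StaticCacheCleaner and its removeKey() method are likewise assumptions for illustration, not part of any API.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper that removes one key from a static Map field. */
class StaticCacheCleaner {

  /** Returns true if the key was present in the static Map and has been removed. */
  static boolean removeKey(Class<?> owner, String fieldName, Object key) {
    try {
      Field field = owner.getDeclaredField(fieldName);
      field.setAccessible(true);
      Map<?, ?> cache = (Map<?, ?>) field.get(null); // static field, so no target instance
      if (cache == null) {
        return false;
      }
      synchronized (cache) {
        boolean present = cache.containsKey(key);
        cache.remove(key);
        return present;
      }
    } catch (ReflectiveOperationException e) {
      return false; // field not found or not accessible
    }
  }
}

/** Hypothetical stand-in for a class with a per-ClassLoader cache. */
class HypotheticalResolver {
  private static final Map<ClassLoader, List<Object>> providersPerClassloader =
      new WeakHashMap<ClassLoader, List<Object>>();

  static void register(ClassLoader loader, Object provider) {
    synchronized (providersPerClassloader) {
      providersPerClassloader.computeIfAbsent(loader, k -> new ArrayList<>()).add(provider);
    }
  }
}
```

In the cleanup code you would pass the real class holding the cache, the field name as found in the sources of the JAR you actually deploy, and the ClassLoader of your web application.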

Classloader leaks I – How to find classloader leaks with Eclipse Memory Analyser (MAT)

If you just want a quick fix to the problem without understanding the theory, jump to part VI introducing the ClassLoader Leak Prevention library.

I’m planning a series of posts around classloader leaks, also known as PermGen memory leaks. You have probably arrived at this page because your Java web application crashes with the dreaded java.lang.OutOfMemoryError: PermGen space (or java.lang.OutOfMemoryError: Metaspace, if you’re on Java 8). I will not explain what this error means nor the reason it occurs, since there is lots of information about it on the net – for example, see Frank Kieviet’s blogs on the problem and its solution.

What I will focus on in this first post, is the step between the “what” and the “how” – the “where” that is often forgotten in other online discussions. After you’ve realized you have classloader leaks, you must identify where those leaks are, before you can fix them.

Not many years ago, finding the source of a classloader leak was really tricky – or at least I thought so. The tools at hand were jmap and jhat, which are quite “raw”. Later there were some commercial tools, such as YourKit, to help you in the process. Nowadays there are Open Source alternatives that make it relatively easy to find the offending code. I will show you step by step how to do it.

First things first: the heap dump

The first thing you need to do to find a classloader leak is to acquire a heap dump to analyze. The heap should be dumped after at least one ClassLoader instance has leaked, so that you can analyze which references to the leaked instance prevent it from being garbage collected.

One of the easiest ways to do this, is to add a JVM parameter that makes the (Sun/Oracle) JVM automatically create a heapdump whenever a java.lang.OutOfMemoryError occurs. The advantage of this, is that you don’t have to try to force the appearance of the leak, in case you don’t know what triggers it. This also means you won’t spend time looking for a leak in a heapdump where there is none.

The name of the parameter is -XX:+HeapDumpOnOutOfMemoryError, so add -XX:+HeapDumpOnOutOfMemoryError to your command line, script or configuration file – depending on what application server you are using and how you are starting it. Then run and redeploy the application until it crashes with java.lang.OutOfMemoryError: PermGen space / Metaspace and voilà – there is your heap dump. The name of the file will be something like java_pid18148.hprof, and it will be located in whatever was the startup directory of your application server, which may be different from the directory from where you launched the startup script. You may also decide the directory yourself using the -XX:HeapDumpPath=/directory parameter.
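
If you are unsure whether the parameter actually reached the JVM, you can ask the running JVM itself. Below is a minimal sketch assuming a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean is available; the class and method names are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

/** Hypothetical helper; only works on JVMs that expose HotSpotDiagnosticMXBean. */
class HeapDumpFlagCheck {

  /** Returns the current value of the HeapDumpOnOutOfMemoryError flag as reported by the JVM. */
  static String heapDumpOnOomSetting() {
    HotSpotDiagnosticMXBean bean =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    VMOption option = bean.getVMOption("HeapDumpOnOutOfMemoryError");
    return option.getValue();
  }
}
```

Running this inside the application server, for example from a temporary JSP, tells you whether the flag was picked up from your startup script or configuration file.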

Now that you’ve got your heap dump, download Eclipse Memory Analyzer (MAT), run it and open the heap dump you just acquired.

Open heap dump

An alternative approach is to extract the heap dump from a locally running application server, from inside MAT. Just start MAT and select “Acquire Heap Dump …” from the File menu. This will present you with a list of running Java applications.

Select your application server (make sure it’s not the application server’s bootstrapper / watchdog) and click Finish.

Find a leaked classloader

When you open or acquire a heap dump, MAT will ask you if you want to perform some kind of analysis on the dump, such as looking for memory leak suspects. This may be good for looking for heap leaks, but in my experience is not of much help when it comes to classloader leaks, since the leaked classloaders often have fewer retained (non-Class) objects than the current “non-leaked” one. Therefore I suggest you click Cancel.

Getting Started Wizard

What you should do instead depends on what application server you used when acquiring the heap dump. In case you were using a fairly recent version (>= 4.0.12) of Caucho’s Resin you’re in luck, since it has some features that significantly simplify finding the leaked classloaders. What Resin does, quite ingeniously, is add a marker to each classloader that from Resin’s perspective is ready to be garbage collected. That allows us to simply search for that marker and analyze why the marked classloaders are not garbage collected.

So click the “Open Query Browser” icon, and select “List objects” / “with incoming references”.

List objects

Now type in the class name of the marker, which for Resin version 4.0.12 – 4.0.20 is called com.caucho.loader.ZombieMarker and since Resin 4.0.21 it is called com.caucho.loader.ZombieClassLoaderMarker.

List zombie markers

Clicking Finish will present you with a list of zombie marker instances, one for every classloader that Resin considers ready for garbage collection. You can see the classloader for each of them by clicking the little arrow in front, which will unfold the incoming references.

List of zombie markers

Now you can skip the rest of this section.

I don’t know if any other application servers provide something similar to Resin’s zombie markers, but assuming yours does not, you should do this instead: click the “Open Query Browser” icon, and select “Java Basics” / “Class Loader Explorer”.

Class Loader Explorer

Unless you already know the class name of the classloaders used for each web application in your application server, just click Finish. This will present you with a list of all the classloaders in your heap dump.

Class Loader Explorer list

Hopefully you can figure out by the class names which ones are – possibly leaked – web application instances. For each such instance, you need to perform the steps in Finding the leak below to determine if that instance is a leaked one.

Different types of references

As you know, the reason for the java.lang.OutOfMemoryError: PermGen space / Metaspace is that the old, unused classloaders are not being garbage collected, and the reason they are not being garbage collected is that there is a reference from outside the classloader either to a class (including any instance of such class) loaded by that classloader, or to the classloader itself. What you might not know, is that there are actually four different types of references in Java. Before moving on to finding your classloader leak, I thought I’d take the time to explain them briefly.

There is the “normal” strong reference, which is what you have unless you make any effort to have a weaker reference. Then there is the weak reference, which you may have used – directly or indirectly, for example via a WeakHashMap. The weak reference works in such a way that the referenced object may be garbage collected whenever there are no more strong references to it. This means that weak references will not themselves cause memory leaks.

Not too long ago, I also learned about soft references and phantom references. Soft references are stronger than weak references. An object will not be garbage collected, even if the only reference to it is a soft reference. What a soft reference means, is that whenever the JVM is about to run out of memory, as a last resort it will garbage collect all the objects with only soft (and possibly weaker) references. The JavaDoc for java.lang.ref.SoftReference says

All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError.

The JavaDoc does not explicitly say whether this applies only to normal objects on the heap, or if this applies also to classes in the PermGen space. While investigating a classloader leak with a SoftReference in the mix, I downloaded the JDK 1.6 sources and tried to find out by studying them. My conclusion from the sources – that it does not apply to PermGen / class allocation – was contrary to what later testing showed… I’m still not certain how this really works, but since it was “Long time, no C” for me, I’m leaning towards believing that soft referenced objects are garbage collected before a java.lang.OutOfMemoryError: PermGen space is thrown. If you know for certain, please leave a comment! Update: I even asked a member of Oracle’s GC development team, who couldn’t give a straight answer…

This leaves us with phantom references. I haven’t really gotten a hold of phantom references yet, but they are weaker than weak references and from what I understand, so weak you cannot even reach the referenced object having only a phantom reference to it. Rather the phantom reference can be used with a ReferenceQueue to be notified when the referenced object is being garbage collected. For now we will only need to know two things. 1: You will probably never use any phantom references. 2: Phantom references will not cause classloader leaks.
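
The difference in reachability can be seen directly from what get() returns for each reference type. Here is a minimal sketch, with the class name ReferenceKindsDemo and the method probe() made up for illustration. It only shows the behaviour while the referent is still strongly reachable, since what happens after that depends on when the garbage collector runs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

/** Hypothetical demo of the three non-strong reference types. */
class ReferenceKindsDemo {

  /**
   * Returns what get() yields for a soft, weak and phantom reference (in that order)
   * while the caller still holds a strong reference to the referent.
   */
  static Object[] probe(Object referent) {
    ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Reference<Object> soft = new SoftReference<>(referent);
    Reference<Object> weak = new WeakReference<>(referent);
    // A phantom reference must be created with a queue; its get() always returns null
    Reference<Object> phantom = new PhantomReference<>(referent, queue);
    return new Object[] { soft.get(), weak.get(), phantom.get() };
  }
}
```

The soft and weak references hand back the object as long as it is alive, while the phantom reference never does, which is why it can only be used for notification via the ReferenceQueue.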

If you want to read more about the different types of references, see for example this blog entry.

Finding the leak

Now, to find out the cause of your classloader leak, right click on one of the classloaders that you found above – either one that your application server has marked as ready for garbage collection (in that case just right click the zombie marker itself), or one that might be a leaked one. If you’re in the “Class Loader Explorer” you need to first select “Class Loader”. In either case you will then select “Path To GC Roots” and then, since (presumably) only strong references will cause class loader leaks, select “exclude all phantom/weak/soft etc. references”.

Path To GC Roots

Now one of three things can happen:

I have seen cases where no strong references at all are found. In this case, the classloader should be garbage collected. I won’t discuss now why it isn’t, but might be back with a rant about that. For now, it’s enough to know that it’s not your fault, and there is nothing you can do about it.

If you did not use the zombie marker feature of Resin (or similar in another app server), you may find a totally legitimate strong reference. As an example, your ClassLoader may be the contextClassLoader of a currently executing thread, such as one from the application server’s thread pool, serving an HTTP request.

Serving HTTP request

(However, being the contextClassLoader of a thread may actually be the cause of the leak – more about that in part III).

Last but not least, we may find the cause of our leak by looking through the references and finding the unwanted one that prevents the classloader from being garbage collected. This reference may be within your own code, a third party library, your application server or the JVM. This is what it would look like, in case you have put your JDBC driver within your web application, rather than on the application server level.

JDBC driver leak

In the following posts, I intend to show a few different examples of what these references might look like, what causes the leak and how to fix or work around the leak.

Until then, good luck hunting down those nasty classloader leaks!

Agile code review – video

When I talked about agile code review on JavaForum Göteborg there was a fellow recording the event, and he was kind enough to share the recording with me so that I could put it online.

I mixed it with the slides into the video below. I might get back with the recording of the Q&A session at the end, when my Vimeo quota allows for it. Update: the Q&A part is now also available.

If you want to share the video, please link to this page rather than directly to Vimeo, in case I do some further editing and the URL of the video is changed.

Update: Consider watching a newer recording of this talk here instead.

For better readability of the slides, I recommend viewing the video in HD on Vimeo

Note that the slides and links to code review tools and further reading are in my previous post.

Agile code review

Tonight I will be giving a talk at JavaForum Göteborg on code review and how it can be used by agile teams.
(If you heard me speak, please feel free to rate me at SpeakerRate)

Here are the (Swedish) slides:


As promised, here are the links to some code review tools

In addition to this list, code review is one of the features of GitHub.

Some reading suggestions on code review:
Coding Horror blog entry
Free book from the authors of Code Collaborator
White paper from the authors of Klocwork

Update: A video recording of the talk is now available. Q&A part available here.

Unexpected side effect of LambdaJ aggregates

I have been using LambdaJ (2.3.2) for a couple of weeks now. It’s a simple yet impressive API that you should read more about if you’re not using it already.

One of the features of LambdaJ is aggregates, with which you can do stuff like the following. Assuming a class

class Person {

  // name etc ...  

  int age;

  public int getAge() {
    return age;
  }

  public void setAge(int age) {
    this.age = age;
  }

}

you could then do

Person twelveYearsOld = new Person();
twelveYearsOld.setAge(12);
Person fiftyYearsOld = new Person();
fiftyYearsOld.setAge(50);
List<Person> persons = Arrays.asList(twelveYearsOld, fiftyYearsOld);

int maxAge = maxFrom(persons).getAge(); // Will be 50
int minAge = minFrom(persons).getAge(); // Will be 12
int sumAge = sumFrom(persons).getAge(); // Will be 62

The other day I used sumFrom() for the very first time, in a layer of our application where we do not (yet) apply TDD. When I ran a manual test of the changed code, I got an exception informing me that updates could not be made in a read-only transaction! Well yes, the transaction surrounding the code that I changed was read-only, but I hadn’t made any updates to my Hibernate entities…? It didn’t take too long to narrow it down to the LambdaJ sumFrom() being what triggered the exception (which was thrown in the Hibernate flush made at transaction commit). It took me a while longer to understand what was going on. It was a bit interesting, so I will try to explain it to you.

LambdaJ uses proxies for some of its functionality, such as aggregates. If the type argument of the collection (Person in the above example) is an interface, a regular Java Proxy is used. If however the type argument is a class – as in our case above – then LambdaJ will make use of cglib, which will perform runtime bytecode instrumentation. This results in a “secret” subclass of the type argument class, for which method calls are sent to a MethodInterceptor which works just like an InvocationHandler for a regular interface based Proxy (LambdaJ’s InvocationInterceptor implements both interfaces).

In the case of LambdaJ aggregates, any method calls to the proxy will be invoked on all the objects in the collection, and the return values will be assembled by some Aggregator (min, max, average, sum etc).
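
For the interface case, the idea can be sketched with nothing but java.lang.reflect.Proxy. This is not LambdaJ’s actual implementation, just an illustration of the technique; the HasAge interface and the SumProxySketch class are made up, and only int-returning methods are handled.

```java
import java.lang.reflect.Proxy;
import java.util.List;

/** Hypothetical interface standing in for the type argument of the collection. */
interface HasAge {
  int getAge();
}

/** Hypothetical sketch of a summing aggregate built on a regular Java Proxy. */
class SumProxySketch {

  /** Returns a proxy whose int-returning methods yield the sum over all items. */
  static HasAge sumFrom(List<? extends HasAge> items) {
    return (HasAge) Proxy.newProxyInstance(
        HasAge.class.getClassLoader(),
        new Class<?>[] { HasAge.class },
        (proxy, method, args) -> {
          if (method.getDeclaringClass() == Object.class) {
            // toString(), hashCode() and equals() are answered by the list itself
            return method.invoke(items, args);
          }
          int sum = 0;
          for (HasAge item : items) {
            // Every call on the proxy is forwarded to each object in the collection
            sum += (Integer) method.invoke(item, args);
          }
          return sum;
        });
  }
}
```

Note that the handler forwards whatever method is called, getters and setters alike, which is the property that matters for the rest of this post.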

This still does not explain the behavior I was seeing, does it? No, because I left out a part, which I finally realized to be the explanation. In my case the data class had properties that were initialized with default values, not at the declaration and not with a simple assignment, but with calls to setters in the default constructor. That is, as if the Person class above had

class Person {

  ...

  public Person() {
    this.setAge(20); // Set default age to 20 ("this" is explicit for clarity)
  }

  ...

}

Since cglib proxies are subclasses of the original class, creating an instance of this proxy class will invoke the default constructor of its base class. If there are method calls from within the constructor on the object itself, these method calls will also be taken care of by the MethodInterceptor. In the case of LambdaJ, this means that the method call in the constructor will be issued on all the objects in the collection…
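
The effect can be reproduced without cglib, using a hand-written subclass in place of the generated proxy. The sketch below is an illustration under that assumption; PersonWithDefault and BroadcastingPerson are made-up names, and the static targets field plays the role of the interceptor’s collection.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data class that sets its default via a setter in the constructor. */
class PersonWithDefault {
  private int age;

  PersonWithDefault() {
    setAge(20); // overridable call from the constructor
  }

  int getAge() {
    return age;
  }

  void setAge(int age) {
    this.age = age;
  }
}

/** Hand-written stand-in for the generated proxy subclass. */
class BroadcastingPerson extends PersonWithDefault {

  // Static, because instance fields of the subclass are not yet assigned
  // while the superclass constructor is running
  static List<PersonWithDefault> targets = new ArrayList<>();

  @Override
  void setAge(int age) {
    // Like the interceptor, forward the call to every object in the collection
    for (PersonWithDefault target : targets) {
      target.setAge(age);
    }
  }
}
```

Merely executing new BroadcastingPerson() runs the base class constructor, whose setAge(20) call is dispatched to the override and therefore changes every object in targets.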

Reusing the example above, with Person having its new constructor, the result is this

List<Person> persons = Arrays.asList(twelveYearsOld, fiftyYearsOld);

int maxAge = maxFrom(persons).getAge(); // Will be 20!
int minAge = minFrom(persons).getAge(); // Will be 20!
int sumAge = sumFrom(persons).getAge(); // Will be 40 (20 + 20)!
int age12 = twelveYearsOld.getAge(); // Will be changed to 20!!!
int age50 = fiftyYearsOld.getAge(); // Will be changed to 20!!!

The workaround in this case is very simple: Don’t call setters from the constructor. Instead use “inline” property assignment (this.age = 20).

Unfortunately, in our project we have lots of classes with default values set by calling setters in the constructor. Therefore I created a patch for LambdaJ, which “deactivates” the InvocationInterceptor (or rather the concrete ProxyIterator subclass) while cglib proxy creation is in progress, and activates it before the proxy is returned for use. Hopefully the fix will make it into LambdaJ 2.3.3.