Daniel Mitterdorfer

Microbenchmarking in Java with JMH: An Introduction

Fire Breathers

Image by Zach Dischner; license: CC

This is the first post in a series about microbenchmarking on the JVM with the Java Microbenchmarking Harness (JMH).

part 2: Microbenchmarks and their environment

part 3: Common Flaws of Handwritten Benchmarks

part 4: Hello JMH

part 5: Digging Deeper

In this post I'll introduce benchmarking conceptually and describe the specific flavor microbenchmarking.

What is Benchmarking?

When software engineers are concerned with the performance of a system, they can resort to a rich variety of practices, for example:

In this series of blog posts we take a deeper look at benchmarking. As software engineers, we are interested in answers to questions like:

These examples already expose some characteristics of software benchmarks:

This leads to two commonly known types of benchmarks:

You should now have a rough understanding of microbenchmarking and where this series of blog posts is headed. In the next post, we'll discuss common flaws of microbenchmarks on the JVM and find out why writing a correct one is not a piece of cake. Stay tuned.

Questions or comments?

Just ping me on Twitter


Using Perlock with Spring

Perlock - a simple Java path watching library I have written - is available in Maven Central since a few weeks now. I thought it would be nice to provide another demo application to show how it can be used with Spring. It is provided along with Perlock on Github in examples/perlock-spring-demo.

Demo Scenario

The demo application implements a XML file processing application. Clients put XML files in an input directory (in the example, it's just the /tmp directory), a MessageDispatcher listens for new XML files in the directory with Perlock, and dispatches the file processing to a MessageProcessor which simulates some work. Below is an overview of the involved classes:

Key Classes of the Spring Perlock demo application

The most interesting part of the demo application is MessageDispatcher (slightly simplified below):

public class MessageDispatcher extends AbstractPathChangeListener {
    private final Executor executor;

    public MessageDispatcher(Executor executor) {
        this.executor = executor;
    }

    @Override
    public void onPathCreated(Path path) {
        if (path.toString().endsWith("xml")) {
            executor.execute(new MessageProcessor(path));
        }
    }
}

It is notified by Perlock whenever a path is created (callback #onPathCreated(Path)) and decides whether a corresponding message needs to be processed. As message processing might by very time-consuming, the MessageDispatcher executes each MessageProcessor in an Executor. Thus, the MessageDispatcher returns very quickly allowing the application to stay responsive.

Bootstraping Perlock with Spring

The Perlock API provides three key API elements: We have seen the first one already: PathChangeListener, which listens for changes on a file system. It is always associated to one PathWatcher, which allows applications to control the path watching life cycle. A PathWatcher is created by a PathWatcherFactory. For details refer to the README on Github or my introductory post on Perlock. The Perlock-Spring integration consists of two FactoryBean implementations; one to create the PathWatcherFactory and another one to create PathWatcher instances:

Key classes of the Spring-Perlock Integration

This might look verbose at first glance, but keep in mind that an application should only create one instance of PathWatcherFactory but may create multiple instances of PathWatcher to watch different paths, so the separation of the FactoryBeans makes sense.

Trying the Spring Demo

To try the demo yourself, follow these steps:

  1. Clone the repo: git clone https://github.com/danielmitterdorfer/perlock.git
  2. Create a JAR with all dependencies: gradle fatJar
  3. Issue: java -jar examples/perlock-spring-demo/build/libs/perlock-spring-demo-0.2.0.jar

You'll now see this output:

TRACE 2014-02-19 21:04:15,298 [main] (PathWatcherFactory.java:114) - Submitting 'PathWatcher for '/tmp'' to executor service.
 INFO 2014-02-19 21:04:15,302 [main] (SpringPathWatcherDemo.java:29) - Press any key to stop the demo
TRACE 2014-02-19 21:04:15,302 [pathWatchExecutor-1] (PathWatcherFactory.java:140) - About to run 'PathWatcher for '/tmp''.
TRACE 2014-02-19 21:04:15,308 [pathWatchExecutor-1] (AbstractRegistrationStrategy.java:33) - Registering Path: '/tmp'
TRACE 2014-02-19 21:04:15,309 [pathWatchExecutor-1] (WatchServicePathWatcher.java:91) - Waiting for file system events

cd /tmp in a new terminal window and issue touch hello{1,2}.xml && sleep 2 && touch hello{3,4}.xml:

You'll now see output like:

TRACE 2014-02-19 21:07:01,425 [pathWatchExecutor-1] (WatchServicePathWatcher.java:124) - Received watch event with kind 'ENTRY_CREATE'
 INFO 2014-02-19 21:07:01,426 [pathWatchExecutor-1] (MessageDispatcher.java:31) - Starting handling path '/tmp/hello1.xml'
TRACE 2014-02-19 21:07:01,428 [pathWatchExecutor-1] (WatchServicePathWatcher.java:124) - Received watch event with kind 'ENTRY_CREATE'
TRACE 2014-02-19 21:07:01,428 [messageHandlingExecutor-1] (MessageProcessor.java:24) - Simulating work on '/tmp/hello1.xml'
 INFO 2014-02-19 21:07:01,428 [pathWatchExecutor-1] (MessageDispatcher.java:31) - Starting handling path '/tmp/hello2.xml'
TRACE 2014-02-19 21:07:01,428 [messageHandlingExecutor-1] (MessageProcessor.java:27) - Processing '/tmp/hello1.xml'. 5 seconds left...
TRACE 2014-02-19 21:07:01,429 [pathWatchExecutor-1] (WatchServicePathWatcher.java:124) - Received watch event with kind 'ENTRY_CREATE'
TRACE 2014-02-19 21:07:01,429 [messageHandlingExecutor-2] (MessageProcessor.java:24) - Simulating work on '/tmp/hello2.xml'
TRACE 2014-02-19 21:07:02,429 [messageHandlingExecutor-1] (MessageProcessor.java:27) - Processing '/tmp/hello1.xml'. 4 seconds left...
# ... more output here ...
TRACE 2014-02-19 21:07:05,438 [messageHandlingExecutor-3] (MessageProcessor.java:27) - Processing '/tmp/hello3.xml'. 1 seconds left...
TRACE 2014-02-19 21:07:06,433 [messageHandlingExecutor-1] (MessageProcessor.java:34) - Done. Cleaning up... '/tmp/hello1.xml'
TRACE 2014-02-19 21:07:06,434 [messageHandlingExecutor-2] (MessageProcessor.java:34) - Done. Cleaning up... '/tmp/hello2.xml'
TRACE 2014-02-19 21:07:06,439 [messageHandlingExecutor-3] (MessageProcessor.java:34) - Done. Cleaning up... '/tmp/hello3.xml'
TRACE 2014-02-19 21:07:06,439 [messageHandlingExecutor-4] (MessageProcessor.java:34) - Done. Cleaning up... '/tmp/hello4.xml'

Note how all MessageProcessors run interleaved in their own thread. This allows the application to receive new events while still processing older messages.

What have we done?

The takeaways of this post are:

If you need path watching in a Spring-based application go and get the new Perlock-Spring integration! In Gradle it is as easy as:

dependencies {
    compile group: 'name.mitterdorfer.perlock', name: 'perlock-spring', version: '0.2.0'
}

or alternatively via Maven.

<dependency>
    <groupId>name.mitterdorfer.perlock</groupId>
    <artifactId>perlock-spring</artifactId>
    <version>0.2.0</version>
</dependency>

Questions or comments?

Just ping me on Twitter


Perlock - Path Watching without Headaches

Sherlock

Image by Sandra; license: CC

As I have written in my previous post the JDK 7 WatchService API is too low level to be used directly in an application. To my surprise, to this day no Java library has existed that abstracts the JDK 7 WatchService API and provides an easy high level API on top.

A Simplified API: Perlock

Therefore I created perlock (short for Path-Sherlock). It supports the standard use case of watching files or directories for modifications and performing some action then. It consists of a very small API with two interfaces and two classes:

Suppose, you want to print a message every time a file is created in the path /tmp. First, implement the callback:

public class SamplePathLogger extends AbstractPathChangeListener {
    @Override
    public void onPathCreated(Path path) {
        System.out.println("Created: '" + path + "'");    
    }
}

Next, create a PathWatcher and start watching:

Path pathToWatch = Paths.get("/tmp");
ExecutorService executorService = Executors.newFixedThreadPool(1);
PathWatcherFactory pathWatchers = new PathWatcherFactory(executorService);
PathWatcher watcher = pathWatchers.createNonRecursiveWatcher(pathToWatch, new SamplePathLogger());
watcher.start();

That's it. When you are done, call watcher.stop() and shutdown the thread pool. You might wonder why the PathWatcherFactory needs an ExecutorService. Path watching is always performed in a background thread (one per PathWatcher instance) otherwise path watching would block your entire program.

Using Perlock

If you want to try perlock, clone the Github repo, try the example application or build the binary yourself with Gradle.

Questions or comments?

Just ping me on Twitter


Watching File Changes with the JDK 7 WatchService

Have you ever needed to watch the file system for changes in a Java application? Java 7 ships with the WatchService API that is suited exactly for this use case as you might know. My journey with the WatchService API began a few months ago when I stumbled across this code I have originally written a few years ago:

//TODO: Update to JDK 7 and replace with native watcher
FileSystemManager fsManager = VFS.getManager();
watchedDirectory = fsManager.resolveFile(queueDirectory.getAbsolutePath());

DefaultFileMonitor fileMonitor = new DefaultFileMonitor(this);
fileMonitor.setRecursive(false);
fileMonitor.addFile(watchedDirectory);

In the snippet above I setup a Apache Commons VFS DefaultFileMonitor which checks for modification within the queueDirectory. The code is part of a Dropbox-based data synchronization I use for myself. Although there is nothing wrong with Apache Commons VFS, I use only a small portion of a quite large library and wanted to reduce the application's footprint by switching to native JDK facilities. Furthermore, Apache Commons VFS relies on polling whereas the Java 7 WatchService can use native kernel facilities if they are available on the platform. As Java 7 is available for quite some time, I decided to finally have a look at the WatchService API. I was not amused.

The Pain

As you can see above, the DefaultFileMonitor in Apache Commons VFS has a very straightforward API: You register for a directory to watch and receive change events afterwards. However, a look at the Javadoc of WatchService shows that Oracle has chosen a totally different approach. The code snippet below demonstrates how to watch for changes in a directory with the WatchService API. I simplified the example to its bare essentials: the code contains no precondition checks and no exception handling. Do not use this code for anything else than playing around with the API, I have warned you:

import java.io.File;
import java.nio.file.*;
import static java.nio.file.StandardWatchEventKinds.*;

public class BasicPathWatcher {
    public static void main(String[] args) throws Exception {
        //the path to watch; we assume no recursive watching (would be even more complicated)
        Path pathToWatch = Paths.get(args[0]);
        WatchService ws = pathToWatch.getFileSystem().newWatchService(); // 1.
        // we do not care about the returned WatchKey in this simple example
        pathToWatch.register(ws, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE); //2.
        boolean valid = true;
        while(valid) {
            WatchKey key = ws.take(); //3.
            for (WatchEvent &lt;?&gt; e: key.pollEvents()) { //4.
                WatchEvent.Kind&lt;?&gt; kind = e.kind();
                if (kind != OVERFLOW) {
                    WatchEvent&lt;Path&gt; event = (WatchEvent&lt;Path&gt;)e;
                    Path filename = event.context();
                    Path child = pathToWatch.resolve(filename);
                    System.out.println("Got event '" + kind + "' for path '" + child + "'");
                }
            }
            valid = key.reset(); //5.
        }
    }
}

Here is what the code does:

  1. Get the right WatchService. Many code examples get this step wrong by always obtaining the WatchService from the "default" file system. If the provided path is not on the "default" file system, the wrong WatchService is obtained. The correct way is to use Path#getFileSystem()#newWatchService() as shown in the example.
  2. Register a Path at the WatchService. You'll then get a WatchKey, which is a handle for events that you'll receive later on.
  3. Next, call WatchService#take() in a loop to get notified on file system changes. This call blocks until an event occurs.
  4. The call returns a WatchKey, which you then have to poll for events.
  5. Now you can handle the events and finally reset the WatchKey (or you won't get any further events). If you want to watch recursively and a directory has been created, you have to register the new directory again with the watch service.

Even for the simplest case, these are a lot of details that you have to get right in order to use the API correctly. A more robust implementation is as least twice as long.

There is a reason why the API is designed this way. Oracle engineer Alan Bateman gave an introductory talk on the WatchService API at Devoxx 2011 where he also described the API design (starts at around 45 minutes into the video).

This part of the API is deliberately a low-level interface

Alan Bateman at Devoxx 2011

In the video, he describes that the key decision was to design the interface very low-level to allow for many use-cases such as server applications or graphical applications.

Others must experience this pain too...

Although the API is hard to use, I still wanted to integrate it albeit not directly in the application. So I searched for libraries that abstract all the gory details. To my surprise I couldn't find a single one. These are the file watching libraries in Java I am aware of:

There also exist a few Scala libraries but nobody seems to provide an abstraction library in plain Java.

Conclusion

I refuse to use such a complex low-level API directly in my application. Since Oracle designed the WatchService API as a low level API I would have expected that someone has written a library by now that abstracts the details. It's time to do something about that ...

Questions or comments?

Just ping me on Twitter


Hello World on (20)131211

What's this all about? Well, let the pictures speak for themselves:

Geek and Poke - Concurrency

Geek and Poke - Latency

Geek and Poke - Software Engineering

Image, Image, Image by Geek&Poke; license: CC