//Java Binding for the OpenCL API

I am currently working on a Java binding for the OpenCL API using GlueGen (as used in JOGL and JOAL). The project started as part of my bachelor of CS thesis shortly after the release of the first OpenCL specification draft and is now feature complete with OpenCL 1.1. JOCL is currently in the stabilization phase, so a beta release shouldn't be far away.

Overview - How does it work?

JOCL enables applications running on the JVM to use OpenCL for massively parallel, high performance computing tasks, executed on heterogeneous hardware (GPUs, CPUs, FPGAs etc) in a platform independent manner. JOCL consists of two parts, the low level and the high level binding.

The low level bindings (LLB) are automatically generated using the official OpenCL headers as input and provide a high performance, JNI based, 1:1 mapping to the C functions.

This has the following advantages:

  • reduces maintenance overhead and ensures spec conformance
  • compiletime JNI bindings are the fastest way to access native libs from the JVM
  • makes translating OpenCL C code into Java + JOCL very easy (e.g. from books or tutorials)
  • flexibility and stability: OpenCL libs are loaded dynamically and accessed via function pointers

The hand written high level bindings (HLB) are built on top of the LLB and hide most boilerplate code (like object IDs, pointers and resource management) behind easy to use java objects. The HLB use direct NIO buffers internally for fast memory transfers between the JVM and the OpenCL implementation and are very GC friendly. Most of the API is designed for method chaining, but of course you don't have to use it this way if you don't want to. JOCL also seamlessly integrates with JOGL 2 (both are built and tested together). Just pass the JOGL context as a parameter to the JOCL context factory and you will receive a shared context. If you already know OpenCL and Java, HLB should be very intuitive for you.

The project is available on jogamp.org. Please use the mailing list / forum for feedback or questions and the bugtracker if you experience any issues. The JOCL root repository is located on github; you may also want to take a look at the jocl-demos project. (If the demos are not enough you might also want to take a look at the junit tests.)

Screenshots (sourcecode in jocl-demos project):

JOCL Julia Set high precision

More regarding OpenGL interoperability and other features in upcoming blog entries.

The following sample shows basic setup, computation and cleanup using the high level APIs.

Hello World or parallel a+b=c

/**
 * Hello Java OpenCL example. Adds all elements of buffer A to buffer B
 * and stores the result in buffer C.
 * Sample was inspired by the Nvidia VectorAdd example written in C/C++
 * which is bundled in the Nvidia OpenCL SDK.
 * @author Michael Bien
 */
public class HelloJOCL {

    public static void main(String[] args) throws IOException {
        // Length of arrays to process (arbitrary number)
        int elementCount = 11444777;
        // Local work size dimensions
        int localWorkSize = 256;
        // rounded up to the nearest multiple of the localWorkSize
        int globalWorkSize = roundUp(localWorkSize, elementCount);

        // setup
        CLContext context = CLContext.create();

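        // load the kernel source and build the program
        // (the resource name "VectorAdd.cl" is an assumption, use the name of your kernel file)
        CLProgram program = context.createProgram(
                       HelloJOCL.class.getResourceAsStream("VectorAdd.cl")).build();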

        CLBuffer<FloatBuffer> clBufferA =
                       context.createFloatBuffer(globalWorkSize, READ_ONLY);
        CLBuffer<FloatBuffer> clBufferB =
                       context.createFloatBuffer(globalWorkSize, READ_ONLY);
        CLBuffer<FloatBuffer> clBufferC =
                       context.createFloatBuffer(globalWorkSize, WRITE_ONLY);

        out.println("used device memory: "
            + (clBufferA.getSize()+clBufferB.getSize()+clBufferC.getSize())/1000000 +"MB");

        // fill read buffers with random numbers (just to have test data).
        fillBuffer(clBufferA.getBuffer(), 12345);
        fillBuffer(clBufferB.getBuffer(), 67890);

        // get a reference to the kernel function with the name 'VectorAdd'
        // and map the buffers to its input parameters.
        CLKernel kernel = program.createCLKernel("VectorAdd");
        kernel.putArgs(clBufferA, clBufferB, clBufferC).putArg(elementCount);

        // create command queue on fastest device.
        CLCommandQueue queue = context.getMaxFlopsDevice().createCommandQueue();

        // asynchronous write to GPU device,
        // blocking read later to get the computed results back.
        long time = nanoTime();
        queue.putWriteBuffer(clBufferA, false)
             .putWriteBuffer(clBufferB, false)
             .put1DRangeKernel(kernel, 0, globalWorkSize, localWorkSize)
             .putReadBuffer(clBufferC, true);
        time = nanoTime() - time;

        // cleanup all resources associated with this context.
        context.release();

        // print first few elements of the resulting buffer to the console.
        out.println("a+b=c results snapshot: ");
        for(int i = 0; i < 10; i++)
            out.print(clBufferC.getBuffer().get() + ", ");
        out.println("...; " + clBufferC.getBuffer().remaining() + " more");

        out.println("computation took: "+(time/1000000)+"ms");
    }

    private static final void fillBuffer(FloatBuffer buffer, int seed) {
        Random rnd = new Random(seed);
        while(buffer.remaining() != 0)
            buffer.put(rnd.nextFloat()*100);
        // reset the position so the buffer can be read from the start
        buffer.rewind();
    }

    private static final int roundUp(int groupSize, int globalSize) {
        int r = globalSize % groupSize;
        if (r == 0) {
            return globalSize;
        } else {
            return globalSize + groupSize - r;
        }
    }
}


    // OpenCL Kernel Function for element by element vector addition
    kernel void VectorAdd(global const float* a,
                          global const float* b,
                          global float* c, int numElements) {

        // get index into global data array
        int iGID = get_global_id(0);

        // bound check (equivalent to the limit on a 'for' loop)
        if (iGID >= numElements)  {
            return;
        }

        // add the vector elements
        c[iGID] = a[iGID] + b[iGID];
    }
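
If you want to check the results without a device, the kernel is easy to mirror on the CPU. The following is only a sketch in plain Java (the class and method names are made up for this entry and are not part of JOCL): every loop iteration plays the role of one work-item, and the bound check skips the padding work-items which exist only because the global work size was rounded up.

```java
import java.util.Arrays;

// CPU reference sketch of the VectorAdd kernel; hypothetical names, not JOCL API.
final class VectorAddReference {

    // Same rounding as in HelloJOCL: smallest multiple of groupSize >= globalSize.
    static int roundUp(int groupSize, int globalSize) {
        int r = globalSize % groupSize;
        return r == 0 ? globalSize : globalSize + groupSize - r;
    }

    // Body of one work-item, 'gid' stands in for get_global_id(0).
    static void vectorAdd(float[] a, float[] b, float[] c, int numElements, int gid) {
        if (gid >= numElements) {
            return; // padding work-item, nothing to do
        }
        c[gid] = a[gid] + b[gid];
    }

    // Runs all work-items one after another, standing in for the device.
    static float[] run(float[] a, float[] b, int localWorkSize) {
        int n = a.length;
        float[] c = new float[n];
        int globalWorkSize = roundUp(localWorkSize, n);
        for (int gid = 0; gid < globalWorkSize; gid++) {
            vectorAdd(a, b, c, n, gid);
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f, 5f};
        float[] b = {10f, 20f, 30f, 40f, 50f};
        // 5 elements with a local work size of 4 -> 8 work-items, 3 of them padding
        System.out.println(Arrays.toString(run(a, b, 4)));
    }
}
```

Comparing a few elements of such a reference result against clBufferC is a cheap sanity check while you are still experimenting with kernels.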

//New Getting Started with JOGL 2 tutorials

Thanks to Justin Stoecker, computer science graduate student at the University of Miami, JOGL gets a new set of getting started tutorials:

JOGL, or Java Bindings for OpenGL, allows Java programs to access the OpenGL API for graphics programming. The graphics code in JOGL programs will look almost identical to that found in C or C++ OpenGL programs, as the API is automatically generated from C header files. This is one of the greatest strengths of JOGL, as it is quite easy to port OpenGL programs written in C or C++ to JOGL; learning JOGL is essentially learning OpenGL[...]


Thanks Justin!

//JOGL 2 - Composable Pipeline

JOGL provides a feature called 'composable pipeline' which can be quite useful in some situations. It enables you to put additional delegating layers between your java application and the OpenGL driver. A few use cases could be:
  • performance metrics
  • logging, debugging or diagnostics
  • to ignore specific function calls
It is very easy to set up. Just put this line into your code and the DebugGL layer will throw a GLException as soon as an error occurs (this is usually what you want while developing the software).
    public void init(GLAutoDrawable drawable) {
        // wrap the composable pipeline in a Debug utility, all OpenGL error codes are automatically
        // converted to GLExceptions as soon as they appear
        drawable.setGL(new DebugGL3(drawable.getGL().getGL3()));
    }
Another predefined layer is TraceGL which intercepts all OpenGL calls and prints them to an output stream.
        drawable.setGL(new TraceGL3(drawable.getGL().getGL3(), System.out));
see also GL Profiles
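
If you wonder what such a delegating layer looks like in principle, here is a small stand-alone sketch. It does not use JOGL at all: Gfx, Driver, trace and debug are hypothetical names invented for this example, and the real DebugGL/TraceGL classes are generated and certainly look different. The idea is the same though: every layer implements the interface of the layer below it and forwards the call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Sketch of a composable pipeline; all names are hypothetical, this is not JOGL code.
final class PipelineSketch {

    // Stand-in for a (much larger) generated GL interface.
    interface Gfx {
        void clear(int mask);
        int getError();
    }

    // End of the pipeline: pretends to be the driver, a negative mask is an error.
    static final class Driver implements Gfx {
        private int error = 0;
        public void clear(int mask) { if (mask < 0) error = 0x0501; }
        public int getError() { int e = error; error = 0; return e; }
    }

    // Tracing layer: records the name of every call, then delegates.
    static Gfx trace(Gfx downstream, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            try {
                return method.invoke(downstream, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the lower layer threw
            }
        };
        return (Gfx) Proxy.newProxyInstance(
                Gfx.class.getClassLoader(), new Class<?>[] { Gfx.class }, handler);
    }

    // Debug layer: checks the error state after the call and throws if it is set.
    static Gfx debug(Gfx downstream) {
        return new Gfx() {
            public void clear(int mask) {
                downstream.clear(mask);
                int e = downstream.getError();
                if (e != 0) {
                    throw new IllegalStateException("error 0x" + Integer.toHexString(e) + " after clear");
                }
            }
            public int getError() { return downstream.getError(); }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Gfx gl = trace(debug(new Driver()), log); // layers can be stacked freely
        gl.clear(1);
        try {
            gl.clear(-1);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("traced calls: " + log);
    }
}
```

Since every layer only knows the interface, the application code does not change when a layer is added or removed, which is what makes the approach attractive for diagnostics.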

//You have won the Jackpot 3.0

You probably remember the project called Jackpot which James Gosling was initially involved with. It was basically a way to migrate client code between incompatible third party libraries by specifying refactoring rules. The project was so well integrated into NetBeans that it looked dead from the outside for a long time, since it was only used internally. NetBeans 6.9, for instance, uses Jackpot for most of the in-code hints.

There were various ways to specify the transformation rules, e.g. via a special declarative language or even in annotations directly in the library code which would cause incompatibilities (or e.g. in conjunction with @Deprecated).

Jan Lahoda recently started with the efforts to make the project usable as standalone tool again. Jackpot 3.0 is available via bitbucket for early adopters.

Back to the Future

I used this opportunity to test jackpotc (the jackpot compiler) with JOGL. What I tried was to provide transformations which turn old JOGL 1.1.1 code into client code compatible with the latest JOGL 2. So firstly thanks to Jan for fixing all the bugs we ran into while testing the experimental commandline compiler.

The first thing I did was to transform the code to properly use OpenGL profiles. As test code I will use the famous Gears OpenGL demo (but this kind of automatic transformation will only pay off if you use it on large codebases). Since it was written against JOGL 1.1.1 it can only use OpenGL up to version 2.x, which means we can simply use the GL2 profile.

Transformation Rules

'JOGL2 API change: javax.media.opengl.GL -> javax.media.opengl.GL2':
javax.media.opengl.GL=>
javax.media.opengl.GL2;;

'JOGL2 API change: new javax.media.opengl.GLCapabilities(javax.media.opengl.GLProfile)':
new javax.media.opengl.GLCapabilities()=>
new javax.media.opengl.GLCapabilities(javax.media.opengl.GLProfile.get(javax.media.opengl.GLProfile.GL2));;

'JOGL2 API change: GL gl = drawable.getGL() -> GL2 gl = drawable.getGL().getGL2()':
$d.getGL() :: $d instanceof javax.media.opengl.GLAutoDrawable=>
$d.getGL().getGL2();;

Just by looking at the transformation rules you can easily see that this is far more powerful than any simple text replacement could be. Jackpot uses javac and can therefore work with fully qualified names, instanceof and more. It will also correctly fix imports for you (there is currently an open bug in this area). The quoted strings are descriptions which jackpotc prints for every code occurrence a rule applies to.

Invoking Jackpot

jackpotc -sourcepath $SRC -cp $LIBS -d $OUTPUT\
         -Ajackpot30_extra_hints=./jogl1Tojogl2.hint $FILESET

$LIBS must contain both library versions, JOGL 1.1.1 and JOGL 2. This is not optimal, but in most situations it will probably work to just use both without thinking about a particular ordering or the need to do multiple iterations.


If everything runs fine the console output should look like the sample below for each transformation which applies for the given $FILESET:
./test/oldgears/src/jogl111/gears/Gears.java:54: warning: JOGL2 API change: GL gl = drawable.getGL() -> GL2 gl = drawable.getGL().getGL2()
    GL gl = drawable.getGL();
The final result is a diff patch located in $OUTPUT/META_INF/upgrade called upgrade.diff containing the complete changeset for the transformation. Now the only thing you have to do is to review the changes and apply them.
@@ -51,7 +51,7 @@
     // Use debug pipeline
     // drawable.setGL(new DebugGL(drawable.getGL()));
-    GL gl = drawable.getGL();
+    GL2 gl = drawable.getGL().getGL2();

You can find the complete demo and all ready-to-run shellscripts in the tools/jackpotc folder inside JOGL's git repository. The classic JOGL 2 Gears demo can be found in form of an applet here (uses latest hudson builds... can be unstable).

happy coding!

- - - -
The JOGL repositories are open for contributions. If you would like to add some rules or fix other things... feel free to fork the repos on github and commit to them. (same rule applies for all JogAmp Projects like JOCL, JOAL, GlueGen... etc)

//JogAmp at SIGGRAPH 2010

The JogAmp team will be present at SIGGRAPH this year:
3D & Multimedia Across Platforms and Devices Using JOGL
Tuesday, 27 July | 4:00 PM - 6:00 PM

This session discusses the features, contributions, and future of OpenGL, OpenCL, and OpenMax
across devices and OS exposed on top of Java using the JogAmp open-source libraries.
link to Session

hope to meet you there.

about JogAmp.
JogAmp is the home of high performance Java libraries for 3D Graphics, Multimedia and Processing. JogAmp consists currently of the projects JOGL, JOCL and JOAL which provide cross platform language bindings to the OpenGL, OpenCL, OpenAL and OpenMAX APIs.

- - - -
(yes I know I should start blogging again :))

//Object Pooling - Determinism vs. Throughput

Object pooling in java is often seen as an anti-pattern and/or wasted effort - but there are still valid reasons to think about pooling for certain kinds of applications.

The JVM allocates objects from the managed heap (young generation; contiguous and defragmented) much faster than you could ever recycle objects from a self-written pool running on top of a VM. A well-configured garbage collector is also able to delete unused objects fast. GCs in fact don't delete objects explicitly; they rather evacuate all surviving objects and sweep whole memory regions in a very efficient manner, and only when it is necessary, to reduce runtime overhead.

Object allocation (of small objects) on modern JVMs is so fast that making a copy of immutable objects sometimes outperforms modification of mutable (and often old) objects. JVM languages like Scala or Clojure make heavy use of this observation. One of the reasons for that anomaly is that generational JVMs are designed to deal with loads of short-lived objects, which makes them inexpensive compared to long-lived objects in old generations.

Performance does not always mean Throughput

Rendering a game with 60fps might be optimal throughput for a renderer but the performance might be still unacceptable when all frames are rendered in the first half of the second with the second half spent on GC ;). Even if Object Pools may not increase system throughput they can still increase determinism of your application. Here are some observations and tips which might help:

When should I consider Object Pools?

  • GC tuning did not help - you want to try something else
  • The application creates a lot of objects which die in the old generation
  • Your Objects are expensive to create but easy to recycle
  • Determinism, e.g response time (soft real time requirements) is more important for you than throughput

Pro Pooling:

  • pools reduce GC activity in peak times (worst case scenarios)
  • are easy to implement and test (it's basically an array ;))
  • are easy to disable (inject a fake pool which returns only new Objects)

Con Pooling:

  • more (old) objects are referenced when a GC kicks in (increases gc overhead)
  • memory leaks (don't forget to reclaim your objects!)
  • cause additional problems in a multi-threaded scenario (new Object() is thread safe!)
  • may decrease throughput
  • cumbersome, repetitive client code

Once you have decided to use pools you have to make sure to reclaim all objects as soon as they are no longer used. One way of doing this is to apply the static factory method pattern for object allocation and a per-object dispose method for deallocation.

/** not thread safe! */
public class Vector3f {

    private static final ObjectPool<Vector3f> pool;

    public float x, y, z;
    private boolean disposed;

    static {
        // pre-allocate the pooled instances
        pool = new ObjectPool<Vector3f>(1024);
        for(int i = 0; i < 1024; i++) {
            pool.reclaim(new Vector3f());
        }
    }

    private Vector3f() {}

    // static factory method, recycles a pooled instance if one is available
    public static Vector3f create(float x, float y, float z) {
        Vector3f v = pool.isEmpty() ? new Vector3f() : pool.get();
        v.x = x;
        v.y = y;
        v.z = z;
        v.disposed = false;
        return v;
    }

    // hands this object back to the pool, repeated calls are ignored
    public void dispose() {
        if(!disposed) {
            disposed = true;
            pool.reclaim(this);
        }
    }
}
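
The ObjectPool used above is not shown in this entry. As mentioned in the pro list it is basically an array, so a minimal, single-threaded version could look like the sketch below. Only the method names isEmpty, get and reclaim are taken from the Vector3f example; everything else is an assumption, and the pools of the terrain engine were more involved because of the threading.

```java
// Minimal array-backed pool sketch; not thread safe, not the original implementation.
final class ObjectPool<T> {

    private final Object[] items; // used as a stack
    private int size;

    ObjectPool(int capacity) {
        items = new Object[capacity];
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Hands out the most recently reclaimed object.
    @SuppressWarnings("unchecked")
    T get() {
        if (size == 0) {
            throw new IllegalStateException("pool is empty");
        }
        T item = (T) items[--size];
        items[size] = null; // the pool must not keep a reference to a handed out object
        return item;
    }

    // Takes an object back; returns false and drops it (left to the GC) if the pool is full.
    boolean reclaim(T item) {
        if (size == items.length) {
            return false;
        }
        items[size++] = item;
        return true;
    }

    public static void main(String[] args) {
        ObjectPool<StringBuilder> pool = new ObjectPool<>(2);
        StringBuilder a = new StringBuilder();
        pool.reclaim(a);
        StringBuilder b = pool.isEmpty() ? new StringBuilder() : pool.get();
        System.out.println("recycled: " + (a == b));
    }
}
```

A pool created with capacity 0 is always empty and drops everything it is asked to reclaim, so with the create() pattern above it behaves like the fake pool which only hands out new objects.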

To demonstrate the perceived performance difference I captured two flyovers of my old 3d engine. The second flyover was captured with disabled object pools. The terrain engine triangulates the ground dependent on the position and view direction of the observer, which makes object allocation hard to predict. The triangulation runs in parallel to the rendering thread, which made the pool implementations a bit more complex than the example above.

Every vertex, normal, triangle and quad-tree node is a pooled object (wireframe on mouse over)

on the left: flyover with pre allocated object pools; right: dynamic object allocation (new Object())

Notice the pauses at 7, 17 and 26s on the flyover with disabled pools (right video).

Note on the videos: The quality is very bad. The tool I used created 700MB large files for the 30s videos, so a lot of frames got skipped. I even sampled them down from 1600x1200 to 1024x768 and limited the fps to 30, but the bottleneck was still the hard disk. This is the main reason why even the left video does not look smooth. (I even had to boot windows for the first time in 2 years to use the tool!) I'll try to capture better vids next time.


Using pools requires discipline, is error prone, not good for system throughput and does not play very well with threads. However there are some attempts to make them more usable in case you think you need them. The physics engine JBullet for example uses JStackAlloc to prevent repetitive and cumbersome code by using automatic bytecode instrumentation in the build process. Type Annotations (JSR 308 targeted for OpenJDK 7) in combination with project lombok and/or the automatic resource management proposal might provide further possibilities for simplifying the usage of object pools in java and reduce the risk for memory leaks.

//Using Applets as fallback mode for video on pre html5 browsers

The upcoming html5 standard will make it very easy to embed media in non-proprietary formats in webpages. For example, video can be embedded in the same way you would embed an image. But what happens when your browser does not support html5 yet?

Firstly: don't panic! Secondly: you could consider using, for example, the 256kb large cortado applet as fallback mode. Pre html5 browsers will ignore unknown tags like the video tag, but they will still read the object tag. Using an applet as cross platform fallback mode for playing e.g. theora encoded hd movies is therefore fairly easy - you don't even have to convert the video to another format.

Read More

//Java EE 6 - The Salvation

My brother Adam Bien released his new book Real World Java EE Patterns - Rethinking Best Practices yesterday. It is available as download or softcover. I am sure you will like it. (all in English)

You can read the first two chapters here; more information is on his blog.

He will check in all samples of his book to this project on kenai.com - so make sure you bookmark it or do a mercurial refresh in your favourite IDE if you are interested.

//JavaOne 09 keynote replays and technical session slides available

Those who haven't watched the keynotes live or via live stream can now watch the replays of all general sessions (on your favourite screen of your life - sorry couldn't resist ;)).

The good thing about that is: you can hit the fast forward button as soon as the marketing guys start talking ;).

The slides of most presentations are also available [updated link] for download.

Especially recommended are James Gosling's toy show (last session), the first two keynotes and of course the pdfs of the technical sessions.

//Enabling the new java browser plugin on ubuntu

When you are using Ubuntu and upgraded from older releases to intrepid or jaunty you might have run into a setup bug which causes the browser to keep using the old java plugin despite having latest Java SE and plugin packages installed (e.g 1.6 update 13 from multiverse repository).

To fix this you will have to update some symlinks and let them point to the correct location.

One easy way of doing this is by using the update-alternatives command:

 sudo update-alternatives --all

This will iterate through all symlinks in /etc/alternatives which have more than one alternative and ask you which one to use. Simply update all links which point to:


to the location of the new plugin (e.g for i386):


For all other links just hit Return.

This is a little bit of a brute force approach but there shouldn't be many of them and it is the only way to make sure you don't overlook one of them since they are all called differently ;)

Next time you restart your browser the new plugin should be loaded and applets which use e.g jnlp for deployment (or out of process functionality) should work.

//Java - JavaScript Communication example

Communication between java applets and javascript code has been available since J2SE 1.3 (aka LiveConnect, which was btw. rewritten from scratch in Java 6 update 10 as part of the new plugin) and is really easy to implement. It is a simple way to break out of the sandbox and do things which would usually require full system access (a signed applet + user approval via security dialog). For example, applets living in a sandbox are only allowed to read mouse events via the AWT/Swing event mechanism, which works only as long as the mouse is over the applet.

To read e.g. the mouse position globally you would need to call MouseInfo.getPointerInfo().getLocation() which would cause a java.security.AccessControlException: access denied. However, in javascript it is trivial to track mouse events for the whole html document (e.g. Google ads track onclick x,y events).

All you have to do is to use the object tag instead of applet tag (which is deprecated anyway) and give the object (applet) a name via the id attribute.

<form name="FishForm">
<object width="256" height="256" type="application/x-java-applet" id="CrazyFish">
<param value="http://people.fh-landshut.de/~mbien/weblog/java_js_interop/launch.jnlp" name="jnlp_href" />
<param value="false" name="draggable" />
</object>
</form>

Now you can simply call methods as usual.

<script language="JavaScript1.2">
document.onmousemove = onMouseMoved;

var tempX = 0;
var tempY = 0;

var applet = document.FishForm.CrazyFish;

function onMouseMoved(e) {
    // javascript -> java calls
    applet.jsObjectOrigin(findPosX(applet), findPosY(applet));
    applet.jsMouseMoved(tempX, tempY);
    return true;
}
</script>

The other side is a plain old public method implemented in the java applet.

/**
 * called from javascript.
 */
public void jsMouseMoved(int x, int y) {
    //do something useful
}

 RIA/Web2.0 Observer Pattern in action ;)

(The applet won't work with JRE version < 1.6 update 10 (or the equivalent on Mac OS) since I used the jnlp deployment mechanism, but it wouldn't have been necessary for this particular applet)

... and never forget Web 2.0 is watching you

//FishFarm wins second prize in GlassFish Community Innovation Awards Program

I recently won the second prize in the GlassFish Community Innovation Awards Program with FishFarm - which is pretty cool. I would never have thought that I had a chance to win something in the GAP.

Since you probably don't know what FishFarm actually is, I will try to introduce it with this entry.


FishFarm = Shoal + Fork/Join Framework
Shoal = simple to use clustering framework currently based on JXTA and used within GlassFish

Fork/Join Framework = pretty cool concurrency framework for local parallelization of tasks (jsr166y targeted for Java 7)

=> FishFarm = simple but pretty cool solution for distributing concurrent tasks over a p2p network [q.e.d.]


Project FishFarm is a simple solution for distributing computational intensive tasks over the network based on Java SE APIs.

The goal of this project is to take any task written in the Fork/Join Framework (JSR166y targeted for Java 7) and distribute the computation over multiple nodes in a grid. FishFarm introduces no new frameworks and is also no full featured distribution system.

The initial focus was to make the ForkJoinPool (which is a core part of jsr166y) distributable with as few code changes as possible. Thanks to Doug Lea these modifications are now in trunk of his Fork/Join Framework and he even provided a handful of utility methods to make further extensions simpler.

How it works:

All you need to do to make your application distributable is to replace ForkJoinPool with FishFarm's DistributedForkJoinPool.

ForkJoinPool pool = new DistributedForkJoinPool();

// submit as many tasks as you want (nothing changed)
Future futureResult1 = pool.submit(new MyTask());
Future futureResult2 = pool.submit(new LongRunningTask());

// block until done
System.out.println("result of task 1: " + futureResult1.get());
// or ask if done
System.out.println("task 2 isDone=" + futureResult2.isDone());

Every DistributedForkJoinPool is member of a peer2peer network and automatically steals work from overstrained pools if idle. DistributedForkJoinPool extends ForkJoinPool and will complete submitted tasks even when working offline or on node failures. No additional configuration required.
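
In case you have never written a Fork/Join task: the sketch below shows the kind of divide and conquer task such a pool executes. It is not FishFarm code. It runs on a plain local ForkJoinPool and uses the java.util.concurrent package names under which the jsr166y classes later shipped with the JDK; SumTask and its threshold are made up for this example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums an int array by splitting it recursively.
final class SumTask extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1_000; // below this size sum sequentially

    private final int[] data;
    private final int from, to; // half open range [from, to)

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                          // left half becomes stealable work
        return right.compute() + left.join(); // compute right half in this thread
    }

    static long sum(int[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            ForkJoinTask<Long> result = pool.submit(new SumTask(data, 0, data.length));
            return result.join(); // block until done
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println("sum = " + sum(data));
    }
}
```

The forked subtasks are what idle worker threads steal locally; the goal of a distributed pool is to let idle pools on other nodes take over such work as well.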

I wouldn't call it ready for production yet but it should be stable enough to have fun ;-)

webstartable demos


In case you are wondering why you are reading this entry via the RSS feed of my brother's (Adam) blog: this is a bug which confuses both urls. I hope it will be fixed with the next roller deployment.

This is Michael - over and out ;-)

//Web 3.0 alias Java 6 update N + JavaFX Desktop profile announced

"The Workaround" - better known as Web 2.0 is now no longer necessary ;).

With Java 6 update 10 and later + JavaFX Desktop profile it is possible to run REAL applications (let's call them applets) out of process in the browser. The browser sandbox is now optional: if you would like the applet to persist, just drag it out of the browser and you can use it stand-alone (same process -> same state).

If you close it, it simply moves back to the browser. But what happens if you close the browser while the stand-alone applet is still up and running, you may ask? Now you have technically transformed your applet into a webstart application and installed it on your system. Isn't that cool?!! Installation has never been so easy.

In addition to that, a great demonstration of the Adobe Photoshop/Illustrator exporter plugin was shown at the JavaOne tech session, where a professional designer and a java geek developed a cool looking animated application side by side without knowing concrete implementation/design details from each other*. Now you are not only able to develop technically superior applications, you can also make them look awesome with minimal effort. (technically the exporter plugin exports each layer, so the coolest things will be possible)

But this is not everything: missing audio and video codecs are now ready to use, BDLive which brings network to your Blu-ray player has been shown and BDLive developer discs are now available at BDLive.com.

The early access JavaFX SDK will be available in June. More information on javafx.com (well, the website does not look and feel very good but that's probably because it's not written in JavaFX script... ;) )


*Sun Tech Session 3

Applets Reloaded with Ken Russell

Livedemo including sourcecode and documentation of the draggable applet feature is now available.

//NetBeans 7.0 with better desktop integration planned

Note: This entry was posted on 1 April 2008 and nothing below is true :-)

- - - - -

You probably already know that a lot of changes are planned for the NetBeans 7.0 release.

One of the bigger changes is tighter integration with the Windows Presentation Foundation for the SWT/JFace rewrite of NetBeans 7.0, similar to the Eclipse roadmap. The minimum system requirement will rise to Windows Vista Ultimate with a DirectX 10 capable graphics card and a USB stick plugged into your system (swap file for java quickstarter) to render NetBeans 7 in full HD. The primary reason for that was the out of the box Java 6 incompatibility with Apple systems (who knows, maybe it is compatible with Mac OS X but no one will tell you, because whoever tried installing SE 6 on a Mac also signed an NDA...) and the issue that many architects simply do not understand the internals of Linux distributions (e.g. Ubuntu) well enough to install NetBeans.

On older cards or other operating systems JEdit will be started in compatibility mode (motif look&feel and full shell support).

The reason for that are the new consumer guidelines and download size limitations of the Java 6 update 10 release. (There is an ArrayIndexOutOfBounds bug in the pack200 implementation which cannot be fixed because of backwards compatibility problems [to build 11] - the primary reason why the swing renderer does not fit into the NB 7 distribution anymore if started with Java 6 update 10.)

This brings several advantages. E.g. instead of playing Jake in the browser (update 10 required) NetBeans 7.0 will be capable of rendering Halo 3 in the editor pane (with full profiler integration and 16x FSAA text overlay). Regular patches will be available via update center.
In addition to that, better joystick support is planned. This should improve the navigation through larger projects and replace the "go to declaration" action. You may also activate the force feedback option in the new "user experience" tab in the options dialog to detect the files which cause unit tests to fail. (Note: not available in JEdit compatibility mode but there will be a blinking icon instead)

The higher costs of developing NetBeans 7.0 make a completely free distribution unfeasible, but it will still be free for open source developers (though not without limitations, e.g. UML diagrams will be limited to only two kinds of widgets, "hack" and "ship", while maintaining 100% SSP compatibility).

//Garbage First - It has never been so exciting to collect garbage :)

If you are reading this entry, you probably already know about G1, the new Garbage First concurrent collector currently in development for Java 7.

Jon Masamitsu recently wrote a great overview of all GCs currently integrated into the JVM of Java SE 6 and announced the new G1 collector on his weblog.

I asked him some questions about G1 in the comments and a very interesting discussion started. Tony Printezis, an expert from the HotSpot GC Group, joined the discussion and answered all the questions in great detail.

(I have aggregated the discussion here because I think it is much easier to read if the answer follows next to the question without the noise between them)

me: I just recently thought about stack allocation for special kinds of objects. Couldn't the hotspot compiler provide enough information to determine points in code when it's safe to delete certain objects? For example many methods use temporary objects. Is it really worth putting them into the young generation?

Tony: Regarding stack allocation. I believe (and I've seen data on papers that support this) that stack allocation can pay off for GCs that (a) do not compact or (b) are not generational (or both, of course).

In the case of (a), a non-compacting GC has an inherently slower allocation mechanism (e.g., free-list look-ups) than a compacting GC (e.g., "bump-the-pointer"). So, stack allocation can allow some objects to be allocated and reclaimed more cheaply (and, maybe, reduce fragmentation given that you cut down on the number of objects allocated / de-allocated from the free lists).

In the case of (b), typically objects that are stack allocated would also be short-lived (not always, but I'd guess this holds for the majority). So, effectively, you add the equivalent of a young generation to a non-generational GC.

For generational GCs, results show that stack allocation might not pay off that much, given that compaction (I assume that most generational GCs would compact the young generation through copying) allows generational GCs to allocate and reclaim short-lived objects very cheaply. And, given that escape analysis (which is the mechanism that statically discovers which objects do not "escape" a thread and hence can be safely stack allocated as no other thread will access them) might only prove that a small proportion of objects allocated by the application can be safely stack allocated (so, the benefit would be quite small overall).
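
(A small illustration from my side, not part of Tony's answer: the sketch below shows what "escaping" means in code. The class and method names are made up. Whether a particular JVM really avoids the heap allocation in the first method is up to its JIT compiler and cannot be observed from this code; it only shows which objects are candidates.)

```java
// Sketch of escaping vs. non-escaping temporaries; hypothetical example code.
final class EscapeSketch {

    static final class Vec2 {
        final double x, y;
        Vec2(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // 'tmp' is never stored or returned, so it does not escape this method.
    // Escape analysis may prove this, which makes it a candidate for stack allocation.
    static double length(double x, double y) {
        Vec2 tmp = new Vec2(x, y);
        return Math.sqrt(tmp.x * tmp.x + tmp.y * tmp.y);
    }

    static Vec2 last; // anything stored here is reachable by other threads

    // Here the object escapes through the static field and has to live on the heap.
    static double lengthAndRemember(double x, double y) {
        Vec2 v = new Vec2(x, y);
        last = v;
        return Math.sqrt(v.x * v.x + v.y * v.y);
    }

    public static void main(String[] args) {
        System.out.println(length(3, 4));
        System.out.println(lengthAndRemember(6, 8));
    }
}
```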

(BTW, your 3D engine in Java shots on your blog look really cool!)

thank you! :)

Read More