Useful git resources

In my attempt to learn git I have come across many good online resources. Here is a list of them; hopefully it will help you learn git faster, and I will never have to Google for them again:

1. Pro Git Book - good reference.

2. Git Magic book - a good condensed book with lots of examples.

3. Git for Designers - an introduction to SCM and how git fits in.

4. Git cheat sheet - full command reference - very useful.

5. Git Man Pages

6. Git from the bottom up - a pdf explaining all concepts of git from the bottom up

7. Visual Git Cheat sheet

Manual transactional demarcation in spring and hibernate

Often you are forced to write code where Spring's @Transactional simply does not cut it. You want to execute certain pieces of code in different transactions, each with its own propagation and isolation level. Manual transaction demarcation in Spring is remarkably simple after some upfront configuration, and it works uniformly by enrolling in whatever transaction manager is configured: the JTA transaction manager if you are running in a JEE server, otherwise the Hibernate transaction manager.

If you want to integrate with the JTA transaction manager and want Spring's @Transactional to work, simply put this in the application context:

<!-- configure JTA transaction manager -->
    <tx:annotation-driven transaction-manager="transactionManager" proxy-target-class="true"/>
    <bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager">
        <property name="allowCustomIsolationLevels" value="true" />
    </bean>

If you are running inside a plain servlet container like Tomcat, you can configure Hibernate transactions, vanilla JDBC transactions and so on to use the Hibernate transaction manager like this:

<!-- use the hibernate transaction manager -->
    <bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager">
        <property name="sessionFactory" ref="sessionFactory" />
        <property name="dataSource" ref="dataSource" />
    </bean>

Notice that the session factory and the data source properties are both set. This then allows jdbcTemplate etc. to all participate in the same transactions as a hibernate call to save() and load().

Once configured, it's great: any Spring-managed bean with the following annotation on its methods will work flawlessly.

@Transactional(isolation = Isolation.READ_COMMITTED, propagation = Propagation.REQUIRED)

This is where the problem starts. Even though you can get very granular with the transactions here, calls between methods within the same service implementation will not be transactionally aware. The demarcation, flushing, commits etc. happen in a proxy generated around the class, and internal calls simply bypass it.
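The effect can be reproduced without Spring at all. The sketch below uses a plain JDK dynamic proxy as a stand-in for the transactional proxy; the Service interface, its methods and the class names are hypothetical and exist only for this illustration. The proxy records every call it intercepts, and the nested call made through "this" never shows up.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class SelfInvocationDemo {
    // Hypothetical service interface, for illustration only.
    interface Service {
        void outer();
        void inner();
    }

    static class ServiceImpl implements Service {
        public void outer() {
            // Plain call on "this": it never passes through the proxy.
            inner();
        }

        public void inner() {
        }
    }

    /** Calls outer() through a proxy and returns the method names the proxy intercepted. */
    static List<String> interceptedCalls() {
        final List<String> seen = new ArrayList<String>();
        final Service target = new ServiceImpl();
        InvocationHandler handler = (proxy, method, args) -> {
            // A transactional proxy would begin/commit a transaction around this call.
            seen.add(method.getName());
            return method.invoke(target, args);
        };
        Service proxied = (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(), new Class<?>[] {Service.class}, handler);
        proxied.outer();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls()); // prints [outer]
    }
}
```

Only outer is intercepted, so any transactional settings declared on inner would be ignored when it is reached this way.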

In this case you can make use of Spring's TransactionTemplate class and the PlatformTransactionManager interface.

Using it is quite simple - inject the transaction manager into a bean:

<!-- inject the transaction manager into the service -->
    <bean id="xyzService" class="XYZServiceImpl">
        <property name="platformTransactionManager" ref="transactionManager" />
    </bean>

and then directly use transaction templates by specifying custom isolation and propagation levels like this:

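import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.TransactionDefinition;
import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.TransactionCallback;
import org.springframework.transaction.support.TransactionTemplate;

public class XYZServiceImpl implements XyzService {
    PlatformTransactionManager platformTransactionManager;

    public void doService() {
        TransactionTemplate template = new TransactionTemplate(platformTransactionManager);
        template.setIsolationLevel(TransactionDefinition.ISOLATION_READ_COMMITTED);
        template.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
        //anonymous inner class
        template.execute(new TransactionCallback() {
            public Object doInTransaction(TransactionStatus status) {
                //your business logic here
                return null; //doInTransaction must return a value; null when there is no result
            }
        });
    }

    public void setPlatformTransactionManager(PlatformTransactionManager platformTransactionManager) {
        this.platformTransactionManager = platformTransactionManager;
    }
}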

The cool thing is that you could have parts of the same method execute in different transactions with different isolation levels. No more complex XA/JTA code; Hibernate sessions will flush and transactions will commit or roll back at the demarcations you expect. Spring makes it almost too easy.

Reverting a file in ‘git’

I have just begun to learn ‘git’ and to understand the motivation behind distributed SCMs. One of the great powers of git, in addition to being super fast, is its excellent merging and branching capabilities. I use Perforce at work and SVN at home, and I have used CVS as well, so wrapping my head around a distributed SCM is challenging and requires some reorientation.

We use Perforce at work and created and merged 104 branches in the last year (all experimentation and execution is done in branches, with the trunk or mainline being stable) - I think merging in Perforce is quite good but too slow, plus it costs money. SVN and CVS are a different story, however - merging is downright painful and requires endless hours of manual merges and testing. So I was interested in seeing what git has to offer.

Installation and creation of a repository were quite straightforward. I checked in a project and was up and running.

Next I tried to edit a file and tried to revert it.

pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ vim prototype.js
pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   prototype.js
#
no changes added to commit (use "git add" and/or "git commit -a")
pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ git revert prototype.js
fatal: Cannot find 'prototype.js'
pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ 

Not what I expected. A little research taught me that “revert” in git is actually a “rollback” of a change that has already been committed. This was a little counterintuitive - I would have preferred it to be called “rollback” instead.

The real way to revert an uncommitted change to a file is to check the file out again (git status hints at this with its "git checkout --" suggestion).

pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ git checkout prototype.js
pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$ git status
# On branch master
nothing to commit (working directory clean)
pranay@pranaydesktop:~/dev/workspace/grails/testApp/web-app/js/prototype$

Hopefully I will never forget this.

High speed file splitting/integration in java using NIO

I have often been faced with a situation where I need to move very large files (ISOs or zips of 1-4 GB) but I don’t have a USB drive of that capacity and for some reason I can’t do it over the network. Splitting also helps if you want to broadcast huge files P2P (think updating 200 machines simultaneously), especially if you want to replicate a managed BitTorrent-like environment. I have found some commercial file splitters out there but they are too slow and clunky. There is no conceivable reason why they have to be so slow or why I should live without options.
So I decided to write one from scratch, plus it gave me a reason to refresh my NIO knowledge. With some tweaking and proper usage of buffers and channels I have managed to get comparable or better throughput in Java than even the native operating system tools. I tested the integrity of the file and everything was OK.

The amount of code needed is minuscule and quite straightforward. First the splitter:

package net.ahlawat.file;

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/**
 * Program that splits the file
 * User: Pranay Ahlawat
 * Date: Jan 18, 2010
 * Time: 8:14:03 PM
 */
public class Splitter {
    static long BYTE_TO_MB  = 1024 * 1024;
    static long BUFER_SIZE = 128 * 1024;

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("splitter [fileName] [split size in MB] [out dir]");
            System.exit(1);
        }

        //create the local variables to be used in the rest of the application
        File inFile = new File(args[0]);
        long partitionSize = Long.parseLong(args[1]) * BYTE_TO_MB;
        File outDir = new File(args[2]);

        //create initial counters
        final long totalFileSize = inFile.length();

        //create the out dirs if they don't exist
        if (!outDir.exists()) {
            System.out.println("Creating directory : " + outDir.getName());
            outDir.mkdirs();
        }

        FileChannel inChannel =  new FileInputStream(inFile).getChannel();

        long currentPosition = 0;
        int ctr = 0;
        ByteBuffer buff = ByteBuffer.allocate((int)BUFER_SIZE);
        long start = System.currentTimeMillis();

        while(currentPosition < totalFileSize) {
            //get the out channel for the file - roughly is the "originalFileName.ext.n" where 'n' is the partition number
            FileChannel outChannel = getChannel(inFile, outDir, ++ctr); //init the out channel
            //the size of the nth partition
            long size = currentPosition + partitionSize < totalFileSize? partitionSize : totalFileSize - currentPosition;
            //sout
            System.out.print(String.format("Creating part %s of size %s MB", ctr, size/BYTE_TO_MB));
            long start2 = System.currentTimeMillis();

            //the end position of the nth partition w.r.t the entire file
            long endPosition = currentPosition + size;

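            //write partition in buffer-sized chunks
            while(currentPosition < endPosition) {
                //limit the buffer so a read never crosses the partition boundary
                long subSize = (currentPosition + BUFER_SIZE) < endPosition ? BUFER_SIZE : endPosition - currentPosition;
                buff.limit((int) subSize);
                int read = inChannel.read(buff, currentPosition);
                if (read < 0) {
                    throw new EOFException("Unexpected end of file at position " + currentPosition);
                }
                //prepare for writing
                buff.flip();
                //write - a channel write may be partial, so drain the buffer
                while (buff.hasRemaining()) {
                    outChannel.write(buff);
                }
                //advance by the number of bytes actually read
                currentPosition += read;
                //clear the buffer - so we can read again
                buff.clear();
            }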

            outChannel.close(); //close

            //print throughput for this file partition
            double delta = (double)(System.currentTimeMillis() - start2)/1000;
            System.out.println(String.format(" -> Transferred in %.2f s @ %.2f MB/s", delta,
                    (double) size/BYTE_TO_MB/delta));
        }

        //calculate time
        double delta =  (double)(System.currentTimeMillis() - start)/1000;

        //print out the total throughput
        System.out.println(String.format("Copied %.2f MB in %.2f s @ %.2f MB/s", (double)totalFileSize/BYTE_TO_MB, delta, (double)totalFileSize/BYTE_TO_MB/delta));

        //finally close the channel
        inChannel.close();
    }

    private static FileChannel getChannel(File inFile, File outDir, int ctr) throws FileNotFoundException {
        return new FileOutputStream(new File(outDir, (inFile.getName() + "." + ctr))).getChannel();
    }
}

There are a couple of things I would like to mention about this code. First, I tried a variety of approaches: MappedByteBuffer did not give me good performance, so I reverted to vanilla byte buffers. Next, I tried a variety of buffer sizes. Unsurprisingly, too small a buffer means too many reads, and too large a buffer meant very slow buffer manipulation; vanilla byte buffers of 128K seemed to be just right and gave me great speed and memory numbers.

The file under experiment was the OpenSolaris ISO - about 700 MB in size. Here is the output:

Creating part 1 of size 100 MB -> Transferred in 0.27 s @ 366.30 MB/s
Creating part 2 of size 100 MB -> Transferred in 0.25 s @ 403.23 MB/s
Creating part 3 of size 100 MB -> Transferred in 0.24 s @ 413.22 MB/s
Creating part 4 of size 100 MB -> Transferred in 0.25 s @ 406.50 MB/s
Creating part 5 of size 100 MB -> Transferred in 1.19 s @ 84.32 MB/s
Creating part 6 of size 100 MB -> Transferred in 2.16 s @ 46.38 MB/s
Creating part 7 of size 76 MB -> Transferred in 2.21 s @ 34.85 MB/s
Copied 676.99 MB in 6.69 s @ 101.21 MB/s

Not bad - I could split the file in under 7 seconds, which is better throughput than what the native tool gives me. The result was that the big file was split into 100 MB chunks (and change).
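The part sizes in that output follow directly from the size computation in the outer loop: every part gets the full partition size except the last, which gets whatever is left. A minimal sketch of just that arithmetic is below. The class and method names are made up for illustration, and the file size used in main is an assumed value, since the exact byte count of the ISO is not given here.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionPlan {
    static final long MB = 1024L * 1024L;

    /** Returns the size in bytes of each part, mirroring the size computation in the splitter loop. */
    static List<Long> partSizes(long totalFileSize, long partitionSize) {
        if (partitionSize <= 0) {
            throw new IllegalArgumentException("partitionSize must be positive");
        }
        List<Long> sizes = new ArrayList<Long>();
        long currentPosition = 0;
        while (currentPosition < totalFileSize) {
            // full partition, or the remainder for the final part
            long size = Math.min(partitionSize, totalFileSize - currentPosition);
            sizes.add(size);
            currentPosition += size;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Assumed size: 676 MB plus a partial megabyte (not the real ISO byte count).
        long total = 676 * MB + 1000;
        List<Long> sizes = partSizes(total, 100 * MB);
        System.out.println(sizes.size() + " parts, last part is "
                + sizes.get(sizes.size() - 1) / MB + " MB (rounded down)");
    }
}
```

With a 100 MB partition size, any file between 600 MB and 700 MB yields seven parts, which matches the seven lines of output above.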

Next the integrator -

package net.ahlawat.file;

import java.io.File;
import java.io.FileOutputStream;
import java.io.FileInputStream;
import java.nio.channels.FileChannel;
import java.nio.ByteBuffer;
import static net.ahlawat.file.Splitter.*;

/**
 * Integrator - integrate files
 * User: Pranay Ahlawat
 * Date: Jan 18, 2010
 * Time: 10:51:43 PM
 */
public class Integrator {

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("integrator [fileName] [dir] [out file name]");
            System.exit(1);
        }

        //create core variables
        File dir = new File(args[1]);
        String baseFileName = args[0];
        File outFile = new File(args[2]);

        //create the out channel - to which the data will be written
        FileChannel outChannel = new FileOutputStream(outFile).getChannel();

        //core buffer
        ByteBuffer buff = ByteBuffer.allocate((int)BUFER_SIZE);

        int ctr = 0;
        long start = System.currentTimeMillis();
        while(true) {
            //some profiling
            long start2 = System.currentTimeMillis();
            //create the file and test to see if it's there
            File file = new File(dir, String.format("%s.%s", baseFileName, ++ctr));
            if (!file.exists()) { //no the file 'n' does not exist - integration complete
                break;
            }

            System.out.print(String.format("Integrating %s", file.getName()));

            //create the in channel for the partitioned file 'n'
            FileChannel inChannel = new FileInputStream(file).getChannel();

            long currentPosition = 0;
            long fileSize = file.length();

            //read the file in buffer-sized chunks
            while(currentPosition < fileSize) {
                int read = inChannel.read(buff, currentPosition);
                if (read < 0) {
                    break; //end of file reached earlier than expected
                }
                //advance by the bytes actually read - a read may return less than a full buffer
                currentPosition += read;
                buff.flip(); //flip the buffer we are ready to write
                while (buff.hasRemaining()) { //a write may be partial - drain the buffer
                    outChannel.write(buff);
                }
                buff.clear(); //clear
            }

            //close/flush the information
            inChannel.close();

            //print profiling information
            double delta = (double) (System.currentTimeMillis() - start2)/1000;
            System.out.println(String.format(" -> Integration complete in %.2f s @ %.2f MB/s",
                    delta, file.length()/BYTE_TO_MB/delta));
        }

        outChannel.close();
        double delta = (double) (System.currentTimeMillis() - start)/1000;
        System.out.println(String.format("Integration complete in %.2f @ %.2f MB/s", delta, outFile.length()/BYTE_TO_MB/delta));
    }
}

Again I tried outChannel.transferFrom() but it just blew up - the performance was horrible. The best results were when I used vanilla buffers and manipulated them myself.
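For reference, a transferFrom-based join might look roughly like the sketch below. This is not the code that was benchmarked here, and it makes no claim about performance; the class and method names are made up for illustration, and the part naming (baseName.1, baseName.2, ...) is taken from the splitter above.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class TransferJoin {
    /**
     * Appends parts baseName.1, baseName.2, ... found in dir to outFile using
     * FileChannel.transferFrom, and returns the total number of bytes written.
     */
    static long join(File dir, String baseName, File outFile) throws IOException {
        long written = 0;
        try (FileOutputStream fos = new FileOutputStream(outFile);
             FileChannel out = fos.getChannel()) {
            for (int n = 1; ; n++) {
                File part = new File(dir, baseName + "." + n);
                if (!part.exists()) {
                    break; // no part n: all parts have been consumed
                }
                try (FileInputStream fis = new FileInputStream(part);
                     FileChannel in = fis.getChannel()) {
                    long size = in.size();
                    long done = 0;
                    // transferFrom may move fewer bytes than requested, so loop
                    while (done < size) {
                        long moved = out.transferFrom(in, written + done, size - done);
                        if (moved <= 0) {
                            throw new IOException("Transfer stalled on " + part);
                        }
                        done += moved;
                    }
                    written += size;
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("TransferJoin [fileName] [dir] [out file name]");
            return;
        }
        long bytes = join(new File(args[1]), args[0], new File(args[2]));
        System.out.println("Wrote " + bytes + " bytes");
    }
}
```

The write position is passed explicitly because transferFrom does not move the target channel's own position.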

Here are the results:

Integrating osol.iso.1 -> Integration complete in 0.35 s @ 283.29 MB/s
Integrating osol.iso.2 -> Integration complete in 0.26 s @ 378.79 MB/s
Integrating osol.iso.3 -> Integration complete in 0.25 s @ 393.70 MB/s
Integrating osol.iso.4 -> Integration complete in 0.28 s @ 361.01 MB/s
Integrating osol.iso.5 -> Integration complete in 1.68 s @ 59.56 MB/s
Integrating osol.iso.6 -> Integration complete in 2.10 s @ 47.55 MB/s
Integrating osol.iso.7 -> Integration complete in 1.39 s @ 54.87 MB/s
Integration complete in 6.44 @ 104.97 MB/s

Not bad at all. Just to put how fast this is in perspective: using cygwin, just copying about 700 MB takes about 16 seconds.

deepti@aanyalaptop /cygdrive/c/test
$ time cp osol.iso cp_of_osol.iso

real    0m16.014s
user    0m0.031s
sys     0m1.825s

deepti@aanyalaptop /cygdrive/c/test
$

And I wrote this little bat script to measure the throughput of the native Windows command line.

prompt $d $t $_$P$G
copy osol.iso another_cp.iso
prompt $d $t $_$P$G

Here is the output.

C:\test>prompt $d $t $_$P$G

Tue 01/19/2010  1:07:48.86
C:\test>copy osol.iso another_cp.iso
        1 file(s) copied.

Tue 01/19/2010  1:08:00.32
C:\test>prompt $d $t $_$P$G

Tue 01/19/2010  1:08:00.32
C:\test>

Which is approximately 12 seconds… :) - java NIO rocks.

I will package this up with a UI and make it available as a tool on ahlawat.net soon for all interested.

OpenSolaris - my first experience

During the last two weeks or so I have been trying to carefully evaluate what the best platform is to run an enterprise Java app. I am not even going to consider Windows. During my research on the internet I came across this article.

I have worked with Solaris off and on, both at work and when I was at school - at Cornell our entire packet switched networks class lab was in C on Solaris - but I have never really liked it that much; I just prefer Linux. However, the differences in file IO and numeric computation were considerable (possibly because of the differences between EXT3 and ZFS), and that intrigued me enough to try running OpenSolaris. So I downloaded the ISO from the OpenSolaris site and installed it as a VirtualBox VM. The default version, like Ubuntu, starts as a live CD from which the user has the option to install it. The installation was quick and easy, with no issues at all. I logged in and was very quickly up and running with javac, ant, mvn, groovy etc. Then I went to the IDEA website to get a version of the IDE and guess what - they don’t support Solaris. Out of hope I downloaded the Linux version and it did not work.

Of course Eclipse has recently been ported over to OpenSolaris and there is NetBeans.

The package manager sucks though - it downloads stuff one at a time? I liked that there was a default AMP package that installed PHP, Apache and MySQL so you had the basics of a web server in place. Installing your own stuff as a service is radically different from Linux. The default init.d scripts have been deprecated in favor of the Service Management Facility (SMF). It seems to be very well thought out - moving services between run levels, auto restart of services gone bad etc. are awesome features - but there is a lot of admin to do here. Linux is just a heck of a lot easier: you can easily find init.d scripts, and if you are using CentOS, managing run levels and services with the chkconfig command is very simple. Sample scripts for svcadm are hard to find and frankly administering Solaris is not what interested me most, so I never bothered with it too much.

All in all, would I ever develop on OpenSolaris? Probably not, because the tooling support sucks and some of my favorite Python libraries might not install on Solaris. Would I use it as a production server? POSSIBLY - if I find that my language performs faster on Solaris, which the benchmark seems to suggest, and once I get around the initial bits of installing as a service and monitoring. Other people seem to think that OpenSolaris is too slow. I never benchmarked heavy network or file IO, but even working in a VM it seemed not that bad - unzipping, moving and copying stuff around seemed reasonable for a VM. I have mixed feelings about it; hopefully when I start using it more I will have more to say. For now Linux it is.

Spring Roo - first impressions

Ok so another productivity tool hits the market - this time it’s from the big boys at Spring. Spring Roo is an open source software tool that uses convention-over-configuration principles to provide rapid application development of Java-based enterprise software. The resulting applications use common Java technologies such as Spring Framework, Java Persistence API, Java Server Pages, Apache Maven and AspectJ. Spring Roo is a member of the Spring portfolio of projects.

I was able to install and create an application in under 5 minutes with this thing, which is great for a demo, but then I took a look at the generated source code and was not too happy. For a framework that claims to be a pure Java RAD framework there is a bit too much AspectJ in the source folder. In fact the number of files the generator spits out is silly. The documentation tells you that there is “very little” black box and that “you should never be editing” the generated ITD files directly. But as a Groovy/Java developer with a lot of web application development experience, I am just a little nervous about AspectJ polluting my class files in ways I don’t want it to. We use dynamic proxies in more frameworks and places than we care to think about, but there is something about compile-time weaving that just makes me a little uncomfortable. Of course this could just be my inexperience with ajc and AspectJ talking here.

So for one domain there are a total of 5 source files - 4 of which I should not be touching?

The next thing that was a little restrictive was the domain modelling - it’s absolutely great for simple one-to-one or many-to-one relationships but it’s not suitable for slightly more complex types. Also I don’t know how the framework will behave with slightly more advanced but common features of Hibernate (if you are using Hibernate as the provider) - like custom Hibernate types, Hibernate collections of items etc. There is little in the documentation about this, so it’s trial and error or digging through the project’s source code to figure it out.

The scaffold views are not bad but too busy from a JS/CSS point of view - frameworks should ship with minimal style sheets.

I am a little disappointed, really - I was very excited about this. If I have to choose a RAD framework that will work on a standard JVM I will stick to Tomcat + Grails.

Here are the advantages of grails as I see it:

1. Cleaner MVC - convention is great; /controller/view/id type bindings work great

2. Extremely extensible using plugins, this is a huge advantage and there are some pretty good ones out there from charting to security

3. GORM is great and for the most part works OK

4. Scriptable environment - great little tools like grails console

5. Groovy is great and the framework does not prevent you from using pure java when needed

If you use roo productively - let me know and I will give it a shot again.

Circumventing creating objects via Reflection vs Direct object manipulation

Of course it’s no secret that reflection is a lot slower than direct object manipulation, but with Java 5 and 6 the speed has dramatically improved. I wanted to find out just how much of a penalty it is to use reflection rather than direct object manipulation to create objects (when loading from a database, for example).

The answer is - substantial.

Here is some code that populates an object with random data - first completely via reflection and then by using direct Java calls:

package net.ahlawat.Main;

import java.lang.reflect.Method;
import java.util.*;

public class Main {
    static class TestClass {
        private String firstname;
        private String lastname;
        private String username;
        private String address;
        private String email;

        public String getFirstname() {
            return firstname;
        }

        public void setFirstname(String firstname) {
            this.firstname = firstname;
        }

        public String getLastname() {
            return lastname;
        }

        public void setLastname(String lastname) {
            this.lastname = lastname;
        }

        public String getUsername() {
            return username;
        }

        public void setUsername(String username) {
            this.username = username;
        }

        public String getAddress() {
            return address;
        }

        public void setAddress(String address) {
            this.address = address;
        }

        public String getEmail() {
            return email;
        }

        public void setEmail(String email) {
            this.email = email;
        }
    }

    public static void main(String[] args) {
        System.out.println("Reflection performance");

        try {
            long start = System.currentTimeMillis();
            Method[] methods = TestClass.class.getMethods();
            List<Method> objectMethods = Arrays.asList(Object.class.getMethods()); 

            List<Method> setters = new ArrayList<Method>();

            for (Method method : methods) {
                String name = method.getName();
                if (objectMethods.contains(method)) {
                    continue;
                } else if (name.startsWith("set") && method.getParameterTypes().length == 1) {
                    setters.add(method);
                }
            }

            System.out.println(String.format("Resolving all methods in %s ms", (System.currentTimeMillis()-start)));

            int objectCount = 10000;
            Random random = new Random();
            start = System.currentTimeMillis();
            for (int x=0; x<objectCount; x++) {
                Object o = TestClass.class.newInstance();
                for (Method setter : setters) {
                    setter.invoke(o, random.nextDouble() + "");
                }
            }
            long refTime = System.currentTimeMillis() - start;
            System.out.println(String.format(Locale.getDefault(),"Creating %d objects via reflection: %d ms",
                    objectCount, refTime));

            start = System.currentTimeMillis();
            for(int x = 0; x<objectCount; x++) {
                TestClass ob = new TestClass();
                ob.setAddress(random.nextDouble() + "");
                ob.setFirstname(random.nextDouble() + "");
                ob.setLastname(random.nextDouble() + "");
                ob.setUsername(random.nextDouble() + "");
                ob.setEmail(random.nextDouble() + "");
            }

            long directTime = System.currentTimeMillis() - start;
            System.out.println(String.format(Locale.getDefault(),"Creating %d objects via direct object manipulation: %d ms",
                    objectCount, directTime));

            System.out.println(String.format("Slowness: %.2f%%", (double)(refTime-directTime)*100/directTime));
        } catch (Exception e) {
            e.printStackTrace();
        }

    }
}

Quite straightforward. Here are the results:

Reflection performance
Resolving all methods in 2 ms
Creating 10000 objects via reflection: 218 ms
Creating 10000 objects via direct object manipulation: 129 ms
Slowness: 68.99%

Oddly, the JVM seems to optimize the reflective calls if you do them many times over. I don’t know exactly what causes it, but on increasing the number to 100,000 the slowness drops to about 40%.

Reflection performance
Resolving all methods in 2 ms
Creating 100000 objects via reflection: 1714 ms
Creating 100000 objects via direct object manipulation: 1213 ms
Slowness: 41.30%

If you look at just the milliseconds it might not seem like a lot - but if you use reflection excessively it will kill the performance of your app. The good thing is you don’t have to live with it.

If you must use reflection due to the dynamic nature of the framework you are putting in place, you are much better off writing a one-time code generator. Using this generator you can avoid the roughly 40-70% penalty measured above and yet keep the application dynamic.

So use reflection once to generate a .java file containing code that does the direct object transfer, then compile it and use it. I plan to use the same approach for a data access service object which reads stuff from a table and populates an object using annotations on the class (a Hibernate wannabe if you will). The milliseconds will add up pretty fast. I can share the Groovy code generator if people are interested.
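To make the idea concrete, here is a minimal sketch of such a generator in Java. It is not the Groovy generator mentioned above. The Person bean, the generated class name, and the convention that a setter named setFirstname reads the key "firstname" from a Map are all assumptions made up for this illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SetterCodeGenerator {
    // Hypothetical bean, standing in for a class such as TestClass above.
    public static class Person {
        private String firstname;
        private String email;

        public void setFirstname(String firstname) { this.firstname = firstname; }

        public void setEmail(String email) { this.email = email; }
    }

    /** Uses reflection once to emit Java source that calls every one-argument String setter directly. */
    static String generatePopulator(Class<?> type) {
        List<String> setters = new ArrayList<String>();
        for (Method m : type.getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            String name = m.getName();
            if (name.startsWith("set") && name.length() > 3
                    && params.length == 1 && params[0] == String.class) {
                setters.add(name);
            }
        }
        Collections.sort(setters); // getMethods() has no guaranteed order
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(type.getSimpleName()).append("Populator {\n");
        src.append("    public static void populate(").append(type.getCanonicalName())
           .append(" target, java.util.Map<String, String> row) {\n");
        for (String setter : setters) {
            // setFirstname -> key "firstname" (assumed naming convention)
            String key = Character.toLowerCase(setter.charAt(3)) + setter.substring(4);
            src.append("        target.").append(setter)
               .append("(row.get(\"").append(key).append("\"));\n");
        }
        src.append("    }\n}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // Prints the generated source; writing it to a .java file and compiling it is left out.
        System.out.print(generatePopulator(Person.class));
    }
}
```

The generated populate method contains only plain setter calls, so the reflective lookup cost is paid once at generation time instead of on every object.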

Groovy Adaptable Packaging Engine (GRAPE)

I think I am falling in love with Groovy all over again. GRAPE is a fantastic addition to Groovy 1.6.x. Transitive dependency management is becoming more and more common as an answer to the JAR hell that Java/Groovy application developers face. With the number of libraries growing rapidly and complex interdependence between them, starting a Java project of any complexity means maintaining a huge stash of jars, and the overhead required to maintain them is unbelievable. One reason why projects like Grails and AppFuse have been successful is that they reduce the ramp-up time to develop something interesting. For me a major part of the ramp-up time is setting up all the libraries - Spring, Hibernate, Ehcache (and all the dependent libraries). Open source build tools like Ivy and Maven have taken transitive dependency management to the next level, and GRAPE builds the capability into the language itself.

By placing a couple of really simple annotations - one can forget about the dependent jars altogether. The packages will be downloaded automatically and used before executing the script.

So let’s say I have a simple script that depends on commons-logging.

I can use it inside my script like this:

import org.apache.commons.logging.*;

@Grab(group = 'commons-logging', module = 'commons-logging', version='1.1.1')
public class GrapeTest {
	static Log logger = LogFactory.getLog(GrapeTest.class);
	public static void main(String[] args) {
		//log
		logger.info("Hello world from Grape!")

	}
}

Here the group is the maven GroupId and the module is the maven ArtifactId. On the command line all I do is groovy <fileName> and the jar download and classpath resolution etc. are handled transparently:

deepti@aanyalaptop /cygdrive/c/Users/deepti/Desktop
$ groovy test.groovy
Dec 16, 2009 12:24:49 AM sun.reflect.NativeMethodAccessorImpl invoke0
INFO: Hello world from Grape!

deepti@aanyalaptop /cygdrive/c/Users/deepti/Desktop
$

If one takes a look, the downloaded files are present in ~/.groovy/grapes/…

deepti@aanyalaptop /cygdrive/c/Users/deepti
$ find . -name commons*.jar -print
./.groovy/grapes/commons-logging/commons-logging/jars/commons-logging-1.1.1.jar

deepti@aanyalaptop /cygdrive/c/Users/deepti
$

Incredibly simple. What is cool is that Grape handles transitive dependencies transparently, so all the libraries on which commons-logging depends will be downloaded and installed automatically.

So now one can ship a groovy script (piece of code) without any dependencies. Very interesting and very useful.

A flawless Publisher-Subscriber using BlockingQueue

java.util.concurrent simply rocks. I can’t believe how simple it has made everyday programming tasks.

What is the first thing you learn when you do multi-threading? A producer-consumer. It’s a great example for learning notify and wait and for understanding the locking semantics of Java threading. It’s a pity I started earlier, because those starting out with Java 1.5/6 have it easy. BlockingQueue is a fantastic addition to the library, and using it one can implement a synchronized multi-publisher, multi-subscriber system using semantics and constructs no different from Java collections.

Here is an example with 5 publishers and 2 subscribers:

package net.ahlawat;

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.Date;

/**
 * @author Pranay Ahlawat
 */
public class PubSubTest {
    static class Publisher implements Runnable {
        BlockingQueue<String> queue;
        String name;
        public Publisher(BlockingQueue<String> queue, String name) {
            this.queue = queue;
            this.name = name;
        }

        public void run() {
            while(true) {
                try {
                    // put() blocks while the queue is full; add() would throw IllegalStateException instead
                    queue.put(String.format("Msg from %s: %s [on %s]", name, Math.random(), new Date()));
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                    break;
                }
            }
        }
    }

    static class Subscriber implements Runnable {
        BlockingQueue<String> queue;
        String name;
        public Subscriber(BlockingQueue<String> queue, String name) {
            this.queue = queue;
            this.name = name;
        }

        public void run() {
            while(true) {
                try {
                    String in = queue.take();
                    System.out.println(String.format("[%s GOT MESSAGE] %s",name,in));
                } catch (InterruptedException e) {
                    e.printStackTrace();
                    break;
                }

            }
        }
    }

    public static void main(String[] args) {
        final int numberOfPublishers = 5;
        BlockingQueue<String> blockingQueue = new ArrayBlockingQueue<String>(10);
        for (int x=1; x<=numberOfPublishers; x++) {
            Publisher publisher = new Publisher(blockingQueue, x+"");
            new Thread(publisher).start();
        }

        Subscriber subscriber1 = new Subscriber(blockingQueue, "Subscriber 1");
        new Thread(subscriber1).start();
        Subscriber subscriber2 = new Subscriber(blockingQueue, "Subscriber 2");
        new Thread(subscriber2).start();
    }
}

It’s quite straightforward - there are a total of 7 threads interacting with the queue. 5 publishers are putting messages on the queue and 2 subscribers are picking them up and printing them to standard out. What I want you to see is the number of times I have used ‘synchronized’ in the code - 0.

The output is not surprising:

[Subscriber 1 GOT MESSAGE] Msg from 4: 0.6466875854315378 [on Fri Dec 11 02:11:27 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 1: 0.33362845358296433 [on Fri Dec 11 02:11:27 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 2: 0.11207796566244055 [on Fri Dec 11 02:11:27 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 3: 0.6810655758824113 [on Fri Dec 11 02:11:27 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 5: 0.5679631128460616 [on Fri Dec 11 02:11:27 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 1: 0.6304440131162121 [on Fri Dec 11 02:11:28 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 2: 0.021117766277559014 [on Fri Dec 11 02:11:28 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 3: 0.1955294791717468 [on Fri Dec 11 02:11:28 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 4: 0.884529348835637 [on Fri Dec 11 02:11:28 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 5: 0.034690283475101946 [on Fri Dec 11 02:11:28 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 1: 0.5764439934861816 [on Fri Dec 11 02:11:29 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 2: 0.3629499102212388 [on Fri Dec 11 02:11:29 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 4: 0.3770428828123388 [on Fri Dec 11 02:11:29 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 3: 0.9450938944637225 [on Fri Dec 11 02:11:29 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 5: 0.8910317407643176 [on Fri Dec 11 02:11:29 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 1: 0.5785955008786261 [on Fri Dec 11 02:11:30 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 4: 0.9442550853581151 [on Fri Dec 11 02:11:30 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 3: 0.3308239883343358 [on Fri Dec 11 02:11:30 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 5: 0.5450057593023042 [on Fri Dec 11 02:11:30 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 2: 0.13504231409694423 [on Fri Dec 11 02:11:30 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 1: 0.1018850869879191 [on Fri Dec 11 02:11:31 EST 2009]
[Subscriber 1 GOT MESSAGE] Msg from 2: 0.7325884278324815 [on Fri Dec 11 02:11:31 EST 2009]
[Subscriber 2 GOT MESSAGE] Msg from 4: 0.8804538983093999 [on Fri Dec 11 02:11:31 EST 2009]
...

Such an elegant solution to a classic problem - the wonderful BlockingQueue …
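One detail worth knowing with a bounded queue is what happens when it is full, since the insertion methods differ. The sketch below uses a queue of capacity 1 so that it is full after a single element; the class name BoundedQueueDemo and the helper addOnFull are made up for this illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows how add, offer and timed offer behave on a full queue.
class BoundedQueueDemo {
    // Tries add() and reports whether it succeeded or threw.
    static String addOnFull(BlockingQueue<String> queue) {
        try {
            queue.add("x");
            return "added";
        } catch (IllegalStateException e) {
            return "add threw IllegalStateException";
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1);
        queue.put("first"); // the queue is now full

        System.out.println(addOnFull(queue));
        System.out.println("offer returned " + queue.offer("x"));
        System.out.println("timed offer returned "
                + queue.offer("x", 100, TimeUnit.MILLISECONDS));
        // queue.put("x") would block here until some consumer calls take()
    }
}
```

On a full queue add throws, offer returns false, and put waits, so put is usually the one a publisher wants.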

Understanding table dependencies caused by referential constraints in Oracle

At work last week I was doing some iterative development which required me to start from a clean database every time. The solution provided by the DBAs in my team was either a complete reinitialization of the schema OR a script that disables every constraint, truncates all the tables and re-enables all the constraints. Both of these solutions were dreadfully slow.

This was a bit of a hassle because we have an extremely complex schema with a lot of inter-dependencies between tables. Therefore like any self-respecting developer I decided to script my way out - so the problem was to:

1. Figure out all the referential constraints from the database

2. Order the deletes so that none of the constraints are violated

A little bit of reading and research introduced me to the all_constraints view:

SQL> desc all_constraints;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------

 OWNER                                     NOT NULL VARCHAR2(30)
 CONSTRAINT_NAME                           NOT NULL VARCHAR2(30)
 CONSTRAINT_TYPE                                    VARCHAR2(1)
 TABLE_NAME                                NOT NULL VARCHAR2(30)
 SEARCH_CONDITION                                   LONG
 R_OWNER                                            VARCHAR2(30)
 R_CONSTRAINT_NAME                                  VARCHAR2(30)
 DELETE_RULE                                        VARCHAR2(9)
 STATUS                                             VARCHAR2(8)
 DEFERRABLE                                         VARCHAR2(14)
 DEFERRED                                           VARCHAR2(9)
 VALIDATED                                          VARCHAR2(13)
 GENERATED                                          VARCHAR2(14)
 BAD                                                VARCHAR2(3)
 RELY                                               VARCHAR2(4)
 LAST_CHANGE                                        DATE
 INDEX_OWNER                                        VARCHAR2(30)
 INDEX_NAME                                         VARCHAR2(30)
 INVALID                                            VARCHAR2(7)
 VIEW_RELATED                                       VARCHAR2(14)

SQL>

The column of interest here is R_CONSTRAINT_NAME, which names the constraint (the primary or unique key of the parent table) that a referential constraint points to. A join with the user_constraints view then gives the actual table on which the table in question depends -

select a.constraint_name as acn, a.table_name as atn, b.table_name as btn,
  b.constraint_name  as bcn from all_constraints a, user_constraints b where
  a.r_constraint_name = b.constraint_name

After this it’s a cakewalk - here is the groovy script that deletes from the tables in the correct order.

import groovy.sql.Sql

def sql = Sql.newInstance("jdbc:oracle:thin:username/password@//localhost:1521/SID",
        "oracle.jdbc.driver.OracleDriver")
long start = System.currentTimeMillis()
def dependencies = [:]
def deleted = new HashSet();

sql.eachRow("select table_name from user_tables where temporary = 'N' and table_name not like 'MVC_%' and table_name not like 'MV_%'") {
  dependencies[it.table_name] = new HashSet();
}

sql.eachRow("""select a.constraint_name as acn, a.table_name as atn, b.table_name as btn,
  b.constraint_name  as bcn from all_constraints a, user_constraints b where
  a.r_constraint_name = b.constraint_name""") {
  // Null-safe: parent tables filtered out above (e.g. MV_%) have no entry in the map
  dependencies[it.btn]?.add(it.atn)
}

println "Getting all tables and resolving referential constraints finished ${System.currentTimeMillis() - start} ms"

start = System.currentTimeMillis();

int size = dependencies.keySet().size()
while (deleted.size() < size) {
  int initSize = deleted.size();
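  dependencies.keySet().each {table->
    // Skip tables already emptied in an earlier pass so the delete is not re-run
    if (deleted.contains(table)) {
      return
    }
    def reducedSet = dependencies[table].grep {!deleted.contains(it)}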
    if (reducedSet.size() ==0 || (reducedSet.size() == 1 && reducedSet.iterator().next() == table)) {
      deleted.add(table)
      println "deleting $table"
      String query = "delete from $table"
      sql.execute(query)
    }
  }

  if (initSize == deleted.size()) {
    println "PROBLEM .. "
    dependencies.keySet().each() {table->
      if (deleted.contains(table)) {
        return;
      }
      println "$table -> ${dependencies[table].grep {!deleted.contains(it)}}"
    }
    throw new IllegalStateException("Cannot delete tables because of cyclic dependencies")
  }
}

println "Deleting all tables completed in ${System.currentTimeMillis() - start} ms"
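The ordering logic can also be looked at on its own, without a database. The following is a minimal Java sketch of the same idea: a table may be emptied once every table that references it has been emptied, and a self-reference does not block it. The class name DeleteOrder, the method order and the table names are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: computes a delete order from an in-memory dependency map.
class DeleteOrder {
    /**
     * referencedBy maps each table to the tables holding a foreign key to it.
     * Every table must appear as a key. Children come before parents in the result.
     */
    static List<String> order(Map<String, Set<String>> referencedBy) {
        List<String> order = new ArrayList<String>();
        Set<String> done = new HashSet<String>();
        while (done.size() < referencedBy.size()) {
            int before = done.size();
            for (Map.Entry<String, Set<String>> entry : referencedBy.entrySet()) {
                String table = entry.getKey();
                if (done.contains(table)) {
                    continue; // already scheduled in an earlier pass
                }
                boolean blocked = false;
                for (String child : entry.getValue()) {
                    // A self-reference does not block the table's own delete.
                    if (!child.equals(table) && !done.contains(child)) {
                        blocked = true;
                        break;
                    }
                }
                if (!blocked) {
                    done.add(table);
                    order.add(table);
                }
            }
            if (before == done.size()) {
                throw new IllegalStateException("Cyclic dependencies between tables");
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Made-up schema: ORDER_ITEMS -> ORDERS -> CUSTOMERS, EMPLOYEES references itself.
        Map<String, Set<String>> referencedBy = new LinkedHashMap<String, Set<String>>();
        referencedBy.put("CUSTOMERS", new HashSet<String>(Arrays.asList("ORDERS")));
        referencedBy.put("ORDERS", new HashSet<String>(Arrays.asList("ORDER_ITEMS")));
        referencedBy.put("ORDER_ITEMS", new HashSet<String>());
        referencedBy.put("EMPLOYEES", new HashSet<String>(Arrays.asList("EMPLOYEES")));
        System.out.println(order(referencedBy));
        // prints [ORDER_ITEMS, EMPLOYEES, ORDERS, CUSTOMERS]
    }
}
```

Like the script, this simply makes repeated passes over the map until nothing is left. That is not the most efficient way to sort a dependency graph, but for a few hundred tables it hardly matters.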

I hope this saves you as much time as it has already saved me.