Efficient use of Java’s Scanner

So I’ve been debugging some code that I wrote quickly to parse a couple of hundred megabytes of log files. I wasn’t surprised to notice that the most expensive method call was nextLine() in java.util.Scanner, but I was surprised to see that hasNextLine() was the second most expensive. It appears that the line located by hasNextLine() is not kept for the following nextLine() call, so the stream is searched twice for every line, which just about doubles the processing time if you’re reading millions of lines. I can only assume the same problem is encountered with the other return types (nextLong(), etc).

Luckily it’s an easy fix, changing this code:


Scanner s = new Scanner(inputFile);
while(s.hasNextLine()){
    System.out.println(s.nextLine());
}
s.close();

to:


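// requires: import java.util.NoSuchElementException;
Scanner s = new Scanner(inputFile);
try{
    while(true){
        System.out.println(s.nextLine());
    }
}catch(NoSuchElementException e){
    // nextLine() throws this once no line is left, i.e. end of input
}finally{
    // close the Scanner however the loop ends
    s.close();
}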

For those of you used to exception-based programming this will seem obvious, but I’ve posted it here for everyone else!
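
If you want to check that the two loops really do return the same lines before swapping one for the other, something like the following should do. It is only a minimal sketch: the class and method names (ScannerLoops, readWithHasNextLine, readUntilException) are made up for illustration, and it reads from a short in-memory string rather than a real log file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical helper class, not part of any library.
public class ScannerLoops {

    // Conventional loop: hasNextLine() and nextLine() are both called per line.
    static List<String> readWithHasNextLine(Scanner s) {
        List<String> lines = new ArrayList<>();
        try {
            while (s.hasNextLine()) {
                lines.add(s.nextLine());
            }
        } finally {
            s.close();
        }
        return lines;
    }

    // Exception-terminated loop: only nextLine() is called per line.
    static List<String> readUntilException(Scanner s) {
        List<String> lines = new ArrayList<>();
        try {
            while (true) {
                lines.add(s.nextLine());
            }
        } catch (NoSuchElementException endOfInput) {
            // nextLine() throws this once no line remains
        } finally {
            s.close();
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "first\nsecond\nthird";
        List<String> a = readWithHasNextLine(new Scanner(sample));
        List<String> b = readUntilException(new Scanner(sample));
        System.out.println("same lines: " + a.equals(b));
        System.out.println("line count: " + b.size());
    }
}
```

Whether the second loop is actually faster on your data is something to measure with your own profiler; this sketch only shows that the output is the same.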

Installing Tomahawk in a Seam-gen generated project

After a fair amount of trawling around the web for one of my software hut students, I stumbled upon this excellent article about installing Tomahawk in seam-gen generated projects:
http://demetrio812.blogspot.com/2007/08/install-tomahawk-in-seam-gen-generated.html

To be honest, I thought the RichFaces library included with Seam was pretty comprehensive until today. I’ve been through the self-doubt and approached ICEfaces, but thankfully stepped right back into RichFaces once I realised I didn’t really gain anything. If only RichFaces would provide some comprehensive documentation. The examples on Exadel are excellent for showcasing Seam, but they really do fall short of demonstrating any usability, even for the basic form items.

The Tomahawk tag library adds some particularly useful tags, such as the scheduler, an incredibly handy extension to the standard calendar tags that allows information to be inserted into the days/months. It also supports nested forms, which is an ongoing battle I’ve been losing with Seam recently!

All I need to do now is get my head around integration testing!