A new XQJ draft

By Michael Kay on June 13, 2007 at 02:18p.m.

XQJ is the proposed XQuery API for Java. A new draft of the XQJ specifications came out yesterday, with the psychologically significant version number of 0.9, and I spent today updating Saxon's implementation to conform. The spec can be downloaded from http://jcp.org/en/jsr/detail?id=225

It's the first new version for about a year, and the spec is developed under conditions of absolute secrecy, so I was interested to see what was going to be in it. I wasn't expecting too much, because most of my comments on the previous draft had been politely rejected (about 8 months after I submitted them, with no open discussion). Sure enough, they've tidied up quite a lot of little things, but the overall design hasn't changed much. (It's sufficiently incompatible, however, that most applications will have to be tweaked: not rewritten, but amended here and there.)

First the criticisms. It's still uncompromisingly based on a client-server, connection-oriented model where the application lives on a different machine from the database (so 1980s!). There's all sorts of nonsense to do with connection pooling and the like, which makes no sense for Saxon at all. Results are still handled through a horrible XQResultSequence object that acts both as a sequence and as an iterator over the sequence - I thought that it had been known for decades that it was better to keep the two concepts separate. Bizarrely, prepared expressions are not thread-safe - you can't compile an expression and then run it in multiple threads simultaneously, because the XQPreparedExpression holds its own dynamic execution context. There's a major omission that most of the interfaces for parsing a source document don't allow the base URI to be specified, which means that you can't then follow relative references from one document to another. And there's a weird  way of getting input from an XMLReader that requires the user to wrap the standard parser in a private filter that has the actual input URI preinitialized within it. Fortunately you can also take input from a JAXP Source object, which makes all the other methods redundant.

Are there any good points? Yes, quite a few. The mapping of XPath values (the 19 primitive data types) to equivalent objects in Java is done reasonably well - much better than JAXP which simply uses Object to represent any XPath value, and doesn't give you any hints as to which kinds of Object might be acceptable. This version also has a full representation in Java terms of all XPath types - one could argue that this is overkill because it adds a lot of weight to the interface and isn't needed very often, but if you're going to do it then you ought to do it properly, and they have done so (contrast with the JAXP XPath API, which still only supports XPath 1.0, and makes a real mess of that). This time round there's a reasonable level of integration with the other important XML APIs in the JAXP world, without going over the top, and at the same time they've avoided getting into some of the JAXP disaster areas like the factory loading mechanism.

Still, I think one can do better. I've implemented the new interfaces, but I have to admit the test coverage is very minimal (doing better will have to await a third party test suite). I'm not going to integrate these interfaces more deeply into Saxon, however: unless the user community reacts with more enthusiasm than I've seen so far, I shall leave it as an optional API, one among several. Instead, I shall try to steal a few of the better ideas for the grand new unified API planned for a future Saxon release.