XMark Q12 examined further

In yesterday's posting, I showed some figures demonstrating that for most queries, Saxon-B and Saxon-SA have very similar performance, but for some join queries, Saxon-SA performs dramatically better. However, I left with some unexplained results for the XMark query Q12. In this posting I'll explore this query in a little more detail. I don't know if we'll come to any conclusions, but at the very least it's an opportunity to explain how to use Saxon's "explain" output, which shows the structure of the compiled expression tree, that is, the query execution plan chosen by the optimizer. You can get this diagnostic output for XQuery using the -e option on the command line, or for XSLT using saxon:explain="yes" typically on an xsl:template or xsl:function declaration.

Q12 reads like this:

for $p in /site/people/person
let $l := for $i in /site/open_auctions/open_auction/initial
          where $p/profile/@income > (5000 * $i)
          return $i
where  $p/profile/@income > 50000
return <items person="{$p/name}">{ count ($l) }</items>

The explain output is shown below, with comments interspersed.

for $p in
  path /
    path /
      path /
        /
        child::element(site)
      child::element(people)
    child::element(person)

This is a very direct translation of the first "for" clause, the optimizer has done nothing clever. It's a bit tricky to read, because the expression has effectively been rewritten in "reverse Polish" order. But you should be able to make sense of it.

return
  if (
    some $qq:13419912 in
      treat as xdt:_numeric_
        convert untyped atomic items to xdt:_numeric_
          atomize
            path /
              path /
                $p
                child::element(profile)
              attribute::attribute(income)
    satisfies
      operator gt
        $qq:13419912
        50000

This part is the "where" clause associated with the outer FLOWR expression. I'm slightly suprised that this clause wasn't turned into a filter expression, but in this case that wouldn't have given any benefit, so it's nothing to worry about. The "where" is being turned into a simple conditional expression testing every item returned by the "for". If you look ahead in the output you'll see that the "else" branch of the conditional is "else ()" - when the condition is true, some output is produced, otherwise nothing is produced.

The original condition $p/profile/@income > 50000 has first been turned into the expression some $qqq in $p/profile/@income satisfies $qqq gt 50000. This will test each selected @income in turn against the value 50000, and exit as soon as one is found that satisfies the condition. Of course, without a schema, the XQuery compiler cannot know that each person ($p) has only one profile and therefore only one income.

The other changes here implement the type-checking and type-conversion rules. Reading up the tree, the @income node is first atomized, to extract its typed value. The attribute could be list-valued, so the typed value has to be treated as a sequence. The next operation (convert) converts any untypedAtomic values in this sequence to xs:double values, leaving other values unchanged (it uses the internal type name xdt:_numeric_ in the output, but it's actually converting to doubles). Then in the "treat" operation it checks that all the values in the resulting sequence (including any values that weren't untyped and therefore haven't been converted) are now numeric, and raises an error if this check fails. (Actually, because these operations on sequences are pipelined, the error will only be raised if a non-numeric value is found before finding a value that's > 50000).

At this point we see a difference in the explain output if we've told Saxon that all nodes are untyped (using configuration.setAllNodesUntyped(true)). In this case the condition starts:

if (
  some $qq:20324370 in
    convert untyped atomic items to xdt:_numeric_
      atomize

Note the absence of the "treat" operator. This is because Saxon knows that the result of atomization will always be untyped atomic, and the result of the convert operator will therefore be numeric, so testing the values to check that they are numeric is superfluous. This is the reason that Saxon-B sometimes has a tiny edge over Saxon-SA when handling untyped data.

The "then" part of the conditional represents the "let" and "return" clauses of the outer FLWOR expression. It starts like this:

    let $l[refCount=1] :=
      let $zz:18929195[refCount=10] :=
        lazy
          atomize
            path /
              path /
                $p
                child::element(profile)
              attribute::attribute(income)
      return

The [refcount=1] is an indication of how many references to a variable there are (with a reference from inside a loop counting as "many"). You'll never see zero, because if there are no references, the expression gets optimized away. The only real distinction is between one and many (represented here by a nominal value of 10). The difference is that if there's only one reference, then evaluation of the variable and the expression that uses it can be pipelined together, with no memory allocated to storage of the variable's value. The refcount is also used when deciding whether to create an index for the variable.

The operator "lazy" means "evaluate this expression the first time its value is needed". When an expression is pulled out of a loop, it is always prefixed by this operator to ensure that in cases where the loop is evaluated zero times, the extracted expression isn't evaluated. The main reason for this is to avoid spurious errors occurring for expressions like for $x in $sequence return $x + 23 div count($sequence).

The outer "let" here is for the user-defined variable $l; the inner let is a system-allocated variable, which has been created because the optimizer has recognized that the expression $p/profile/@income doesn't depend on the value of $i, and can therefore be pulled out of the inner "for" loop. Atomization of the value (to extract the typed value of the @income attribute) also happens outside the loop. (Saxon doesn't however recognize that this expression appears twice in the query.)

The value of $l is computed by the next subexpression:

      return
        path /
          /
          filter []
            path /
              path /
                path /
                  child::element(site)
                  child::element(open_auctions)
                child::element(open_auction)
              child::element(initial)
            operator many-to-many >
              $zz:18929195
              operator *
                5000
                atomize singleton
                  .

This time the "where" clause has been turned into a filter expression (represented by the operator "filter []". The expression being filtered is the path expression site/open_auctions/open_auction/initial, and the filter is [$zz > 5000*.], where $zz is the system-allocated variable we saw earlier. Notice how the variable $i has disappeared and has been replaced by a reference to the context item "." within the predicate.

In this section of the query we see the main difference that occurs when we set the allNodesUntyped() option. In that case, firstly, after atomizing the value of @income outside the loop, the value is converted to an xs:double with the operator

convert untyped atomic items to xdt:_numeric_

Secondly, the use of "operator many-to-many >" is replaced by

            let $zz:13419912[refCount=10] :=
              lazy
                operator *
                  5000
                  atomize
                    .
            return
              some $qq:8106640 in
                $zz:29131495
              satisfies
                operator gt
                  $qq:8106640
                  $zz:13419912

That is, the expression [$zz > 5000 * .] has been rewritten as [let $zz2 := (5000 * .) return some $qq in $zz satisfies $qq gt $zz2].

It seems that this rewrite, in this particular case, is an unwise one. It's possible because in this case we have more information, for example we know that the typed value of @income will always be a single value, never a sequence, and as a result we don't need to use the potentially very expensive "many to many >" operator which tests every item on the left against every item on the right. But the actual performance results show that in this particular case, where in fact every person has only one income though we had no way of knowing this in advance, the "many to many >" operator actually performed better.

The final part of the explain output, for completeness, represents the "return" clause of the outer FLWOR expression:


    return
      element 
        name items
        content
            attribute 
              name person
              function string-join
                convert items to xs:string
                  atomize
                    path /
                      $p
                      child::element(name)
                                    " "
            function count
              $l

Notice again how Saxon, without the benefit of a schema, has no idea that a person has only one name. So the code person={$p/name} translates into a call on string-join() to output all the values of $p/name, separated by spaces.

What can we conclude so far? We've learned that for this particular query, telling Saxon that all the nodes were untyped had the opposite effect from that intended: it enabled Saxon to perform an optimization which in this particular case was counter-productive.

This doesn't actually help with our original question: why, unlike the vast majority of the queries in this benchmark, does Saxon-B achieve better results that Saxon-SA? It's not a big difference, but it needs explaining. It turns out that the "explain" output for Saxon-B is identical to the "explain" output for Saxon-SA with the "all nodes untyped" option, so it seems that there's something different happening at run-time. Sorry to leave you with an unsolved mystery, but I don't yet know what causing this!