At the XML Summer School 2013, Tony Graham presented a lightning talk about life after
libxslt 1.0. I was not present at this summer school, but it was clear from the
feedback I received about the discussions that there is a major gap in XSLT 2.0 support
for the large developer community in the C/Perl/PHP/Python/Ruby world and the associated
tools that rely on libxslt.
It is a known problem which has never, to my knowledge, been addressed. At Saxonica,
we wanted to try to plug this gap by porting the Saxon processor from Java to C/C++,
which would enable us to communicate with the languages listed above. One of our
goals, if possible, was to interface with libxml and libxslt. Providing such a bridge,
or a cross-compiled version of a fully fledged Java application in C/C++, is always a
daunting task. In this blog post I discuss the technical steps in our quest to achieve
our goals and give some details of the experience gained along the way. I will begin
by detailing the various technologies that we tried, and how we ended up using a
commercial Java native compiler after several failed attempts with tools that either
did not work, were cumbersome, or were just too error prone.
LLVM
At the summer school there were discussions suggesting that LLVM could do the job of compiling Java to native code. As claimed on the project website, LLVM is a collection of modular and reusable compiler and toolchain technologies. The LLVM project seems very active, with many projects using it for various tasks, but I found it difficult to get anything working. In particular, I tried using VMKit, which relies on LLVM, to compile some simple 'Hello World' examples to machine code, but even that seemed cumbersome.
GCJ
Secondly, I looked at the GCJ technology. GCJ is a tool that I have used before, so I was confident that it would work. However, my past experience is that the tool can be error prone and contains long-standing bugs; since the project has been dormant for several years, it seems unlikely that these bugs will be fixed. The other worrying fact is that GCJ only supports up to JDK 1.5. Nevertheless, for lack of other options, I persevered with GCJ and had much better success: I managed to compile Saxon-HE to native machine code and actually got it to execute my example stylesheets. I had some problems because of classes that were not present in the GCJ implementation of JDK 1.5, such as those in the packages java.math and javax.xml. Therefore, I had to include my own version of these packages.
The next step was to create a shared library of Saxon-HE, so that I could interface it with C/C++. This proved to be a real battle, which in the end I won. I decided to use the Compiled Native Interface (CNI), which presents a convenient way to write Java native methods using C++. The alternative was JNI (Java Native Interface), which may be viewed as more portable. Both interfaces, though, work on similar principles: you need a Java/CNI-aware C++ compiler (any recent version of G++ is capable), and you must include the header file for each Java class used. These header files, if not automatically generated, can be produced using gcjh. I soon gave up on using GCJ: I stumbled upon a few known bugs, and if I was having major issues with the setup and prerequisites required, then surely users would have the same problems.
Excelsior JET
The Excelsior JET tool is the final technology we looked at, and thankfully it is what we have ended up using in the alpha release. JET is a commercial product that provides a Java native compiler for both Linux and Windows platforms. What is good about this tool is that it provides an easy-to-use graphical interface to build native executables and shared libraries from jar file(s). It can also package up the software into an installer ready to be deployed onto its intended host machine. This was great for us!
There is a lot I could write about JET, but it would repeat the plethora of information currently available on their website and forum. However, it is worth mentioning that we started with their evaluation version, which offers 90 days of free usage before purchasing the professional edition. Another point of interest is that Excelsior offer a free-of-charge license for use in conjunction with open-source software.
We know that there will be some sections of the open-source community that dislike the dependency upon using a commercial tool, but it is not that dissimilar from the early years of Java when the Sun compiler was freely available but not open-sourced.
Implementation notes using JET
After creating the shared library, I used JNI to interface it with C/C++. It is possible to use JET's own Java interface to external functions, called xFunction, which is recommended if starting from scratch, but having used JNI with GCJ I continued with that approach. To get started there are a few examples of invoking a library from C/C++. In essence, you need to load the library and initialize the JET run-time before you can use it; see the code below (from the file xsltProcessor.cc):
/* Load dll. */
HANDLE loadDll(char* name)
{
    HANDLE hDll = LoadLibrary (name);
    if (!hDll) {
        printf ("Unable to load %s\n", name);
        exit(1);
    }
    printf ("%s loaded\n", name);
    return hDll;
}

extern "C" {
    jint (JNICALL * JNI_GetDefaultJavaVMInitArgs_func) (void *args);
    jint (JNICALL * JNI_CreateJavaVM_func) (JavaVM **pvm, void **penv, void *args);
}

/* Initialize JET run-time. */
extern "C" void initJavaRT(HANDLE myDllHandle, JavaVM** pjvm, JNIEnv** penv)
{
    int result;
    JavaVMInitArgs args;

    JNI_GetDefaultJavaVMInitArgs_func =
        (jint (JNICALL *) (void *args))
        GetProcAddress (myDllHandle, "JNI_GetDefaultJavaVMInitArgs");
    JNI_CreateJavaVM_func =
        (jint (JNICALL *) (JavaVM **pvm, void **penv, void *args))
        GetProcAddress (myDllHandle, "JNI_CreateJavaVM");

    if (!JNI_GetDefaultJavaVMInitArgs_func) {
        printf ("%s doesn't contain public JNI_GetDefaultJavaVMInitArgs\n", dllname);
        exit (1);
    }
    if (!JNI_CreateJavaVM_func) {
        printf ("%s doesn't contain public JNI_CreateJavaVM\n", dllname);
        exit (1);
    }

    memset (&args, 0, sizeof(args));
    args.version = JNI_VERSION_1_2;
    result = JNI_GetDefaultJavaVMInitArgs_func(&args);
    if (result != JNI_OK) {
        printf ("JNI_GetDefaultJavaVMInitArgs() failed with result %d\n", result);
        exit(1);
    }

    /* NOTE: no JVM is actually created
     * this call to JNI_CreateJavaVM is intended for JET RT initialization
     */
    result = JNI_CreateJavaVM_func (pjvm, (void **)penv, &args);
    if (result != JNI_OK) {
        printf ("JNI_CreateJavaVM() failed with result %d\n", result);
        exit(1);
    }
    printf ("JET RT initialized\n");
    fflush (stdout);
}

XsltProcessor::XsltProcessor(bool license) {
    /*
     * First of all, load required component.
     * By the time of JET initialization, all components should be loaded.
     */
    myDllHandle = loadDll (dllname);

    /*
     * Initialize JET run-time.
     * The handle of loaded component is used to retrieve Invocation API.
     */
    initJavaRT (myDllHandle, &jvm, &env);

    /* Look for class. */
    cppClass = lookForClass(env, "net/sf/saxon/option/cpp/XsltProcessorForCpp");
    versionClass = lookForClass(env, "net/sf/saxon/Version");
    cpp = createObject (env, cppClass, "(Z)V", license);
    jmethodID debugMID = env->GetStaticMethodID(cppClass, "setDebugMode", "(Z)V");
    if (debugMID) {
        env->CallStaticVoidMethod(cppClass, debugMID, (jboolean)false);
    }
    ....
}
...
In the constructor of XsltProcessor we see that, once we have loaded the library and initialized the JET run-time, we can make calls through the environment that has been created, to get class definitions and create instance(s) of the class in the Java world. This happens before we make any method calls on the object.
PHP Extension in C/C++
After successfully getting XSLT transformations to work within C/C++, the next step was to try to develop a PHP extension which would operate like libxslt. There is a lot of material on the web and in books regarding PHP extensions, and I found the following guide very useful: http://devzone.zend.com/1435/wrapping-c-classes-in-a-php-extension/. I followed it step by step, adding a few steps of my own once I had worked out what I was doing.
Testing
As a proof of concept I wrote a test harness in PHP which makes use of the PHP extension (see: xslt30TestSuite.php in the download library). This is a test driver designed to run the public W3C XSLT test suite at https://dvcs.w3.org/hg/xslt30-test/. The test driver in its current form requires Saxon-EE, which is not yet available in this alpha release; nevertheless, the program may serve as a useful example of how the API can be used. Note that it is written to use libXML to read the test catalog, but to use Saxon for running the tests and assessing the results.
Performance Testing
I now compare running Saxon-HE (on Java) with running Saxon-HE/C from C++ and from PHP on some preliminary tests. I also compare these times to libxslt (C/C++). An important aim is to get a good measure of the cost of crossing the Java/C++ boundary using JNI, and also to see what effect the PHP extension has.
I used Saxon-HE 9.5.1.3 as the baseline. The test machine was an Intel Core i5 430M laptop with 4GB of memory, a 2.26GHz CPU and 3MB of L3 cache, running Ubuntu 13.10 Linux. The server software was Apache2 with PHP version 5.5.3-1ubuntu2. The compiler was Sun/Oracle Java 1.6.0.43.
The experiments were based on the XMark benchmark. I used query q8, which was converted into the stylesheet below. I chose q8.xsl because its equijoin should expose performance bottlenecks across the implementations:
<result xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xsl:version="2.0">
  <!-- Q8. List the names of persons and the number of items they bought.
       (joins person, closed_auction) -->
  <xsl:for-each select="/site/people/person">
    <xsl:variable name="a" select="/site/closed_auctions/closed_auction[buyer/@person = current()/@id]"/>
    <item person="{name}"><xsl:value-of select="count($a)"/></item>
  </xsl:for-each>
</result>
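
To see why the join in q8 can be costly, here is a minimal Python sketch that contrasts a nested-loop evaluation of the predicate `buyer/@person = current()/@id` with a hash-based join. It is only an illustration: the sample data and function names are made up, and it does not claim to show how Saxon or libxslt actually evaluate the stylesheet.

```python
from collections import Counter

# Hypothetical miniature of the XMark data: person ids and, for each
# closed auction, the id of the buyer.
persons = ["person0", "person1", "person2"]
buyers = ["person1", "person0", "person1", "person1"]

def count_nested_loop(persons, buyers):
    """For every person, rescan all auctions: O(persons * auctions)."""
    return {p: sum(1 for b in buyers if b == p) for p in persons}

def count_hash_join(persons, buyers):
    """Index the auctions by buyer once, then look each person up."""
    index = Counter(buyers)
    return {p: index[p] for p in persons}

print(count_nested_loop(persons, buyers))
print(count_hash_join(persons, buyers))
```

Both functions return the same counts; the difference is that the first does work proportional to the product of the two input sizes, which is what makes a 64MB document a useful stress test.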
The running times for executing q8.xsl on xmark64.xml, a 64MB document, are as follows:
Saxon-HE (Java): 60.5 seconds
Saxon-HE/C (C++): 132 seconds
Saxon-HE/C (PHP): 137 seconds
libxslt (C/C++): 213 seconds
The times for Saxon-HE/C exclude the cost of JET initialisation and loading the library, which accounted for only 4 seconds. We observe that there is not a big overhead between C++ and the PHP extension. The biggest cost, as expected, is between Java and C++, where we see a slowdown of roughly 2.2x on the C++/PHP platform. We also observe that Saxon-HE/C outperforms libxslt on C/C++ by ~40% on q8.
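
The ratios quoted above can be checked with a few lines of Python. The times are copied from the list; the helper names are just for illustration.

```python
# Running times in seconds, copied from the list above.
times = {
    "Saxon-HE (Java)": 60.5,
    "Saxon-HE/C (C++)": 132.0,
    "Saxon-HE/C (PHP)": 137.0,
    "libxslt (C/C++)": 213.0,
}

def slowdown(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return times[slow] / times[fast]

def time_saved(fast, slow):
    """Fraction of the slower run's time that the faster run saves."""
    return 1.0 - times[fast] / times[slow]

print(round(slowdown("Saxon-HE/C (C++)", "Saxon-HE (Java)"), 2))   # 2.18
print(round(slowdown("Saxon-HE/C (PHP)", "Saxon-HE (Java)"), 2))   # 2.26
print(round(slowdown("Saxon-HE/C (PHP)", "Saxon-HE/C (C++)"), 2))  # 1.04
print(round(time_saved("Saxon-HE/C (C++)", "libxslt (C/C++)"), 2)) # 0.38
```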
See project page on Saxon/C.
At Saxonica, we have for a long time now used a tailor-made Java application to create and issue licenses for all the commercial products we develop. There is no real database at the back-end, just a local XML file with customer details and copies of the licenses created and issued. For a one-man company this poses no real problem, but as the company has expanded over the last two years it has inevitably become a major concern.
Early last year Mike Kay presented me with the task to create a new saxon-license application with the following requirements:
From the outset we thought that, for a tool which relies so heavily on XML and XSLT at its core, the requirements would be best met by developing it with XSLTForms and Servlex.
In this blog post I would like to share my own experiences in the development of the saxon-license webapp using Servlex and XSLTForms. In the discussion I include how we stitched on our existing back-end core Java tool and the challenges we faced with encoding. Specific details of the features and functions are not that important here; only the engineering process is of interest.
On the client-side we write XForms [1] documents, which are manipulated by XSLTForms [2] (created by Alain Couthures) to render in the browsers. XSLTForms is an open source client-side implementation, not a plug-in or install, that works with all major browsers.
On the server-side we integrate the core Saxon-license tool in a Servlex webapp [3] as Saxon extension functions called from within XSLT. Servlex is an open-source implementation of the EXPath webapp framework [4], based on Saxon and Calabash as its XSLT, XQuery and XProc processors. Servlex provides a way to write web applications directly in XSLT. It is developed as a Java EE application requiring Servlet technology, sitting on Tomcat for binding to HTTP on the server-side.
The server-side Servlex works as a dispatching and routing mechanism to components (implemented as XSLT stylesheets), applying a configuration-based mapping between the request URI and the component used to process that URI. The container communicates with the components by means of an XML representation of the HTTP request, and receives in turn XML data with HTML in the body, containing XForms content and XSLTForms references to render the page. This representation of the HTTP response is sent back to the client. There are buttons on the forms which, if pressed, trigger an HTTP PUT request made through the client-side XSLTForms. These requests are handled by Servlex.
There are five main XSLT functions, described below, which are mapped to URIs to generate the various XForms and to tunnel the instance data between them. These functions all make calls to the core Saxon-license tool written in Java, made available as Saxon extension function calls from the XSLT:
fnRunMainForm: A request to serve the main form is made with the following URI pattern:
http://192.168.0.2:8080/app/license-tool/main
License requests are usually made through the main Saxonica website, either for evaluation or as a paid order (see http://www.saxonica.com/download/download.xml and http://www.saxonica.com/purchase/purchase.xml, respectively). These orders are received as an email, the contents of which are then copied and pasted into the main form. This data is sent in the form of an XForms instance in a web request, picked up by Servlex.
fnManualEntry: The Manual Entry form, for manual creation of the customer details used to create a license. A request is made to Servlex with the following URI pattern:
http://192.168.0.2:8080/app/license-tool/manualEntry
fnFetchRecord: Retrieves an existing license so that it can be re-issued. A request is made to Servlex with the following URI pattern, where the parameter after the ? is the license number to fetch:
http://192.168.0.2:8080/app/license-tool/fetchRecord?Select=X002110
fnReport: This function generates an HTML page containing all licenses created, or a subset such as the last 20.
http://192.168.0.2:8080/app/license-tool/report
fnEditParseRequest: The Manual Entry form with the client data populated. The order request from the main form is parsed and returned as XForms instance data, which is used to generate the form on the server. A request is made to Servlex with the following URI pattern:
http://192.168.0.2:8080/app/license-tool/editParseRequest
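
The URI-to-component mapping described above can be pictured with a small Python sketch. This is not Servlex's configuration format or API: the `ROUTES` table, `CONTEXT_ROOT` and `dispatch` are hypothetical names used only to illustrate the dispatching idea, with the component names taken from the list above.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical routing table: path below the context root -> component name.
ROUTES = {
    "/main": "fnRunMainForm",
    "/manualEntry": "fnManualEntry",
    "/fetchRecord": "fnFetchRecord",
    "/report": "fnReport",
    "/editParseRequest": "fnEditParseRequest",
}
CONTEXT_ROOT = "/app/license-tool"

def dispatch(uri):
    """Return (component name, query parameters) for a request URI."""
    parts = urlsplit(uri)
    if not parts.path.startswith(CONTEXT_ROOT):
        raise LookupError("outside the webapp: " + parts.path)
    component = ROUTES.get(parts.path[len(CONTEXT_ROOT):])
    if component is None:
        raise LookupError("no component mapped to " + parts.path)
    return component, parse_qs(parts.query)

print(dispatch("http://192.168.0.2:8080/app/license-tool/fetchRecord?Select=X002110"))
```

For the fetchRecord URI this yields the component name together with the `Select` parameter carrying the license number.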
Securing access to the saxon-license webapp is achieved through Apache2 configuration.
A long-standing problem we faced in this application was the handling of non-ASCII characters. We raised this issue with Alain and Florent, the creators of XSLTForms and Servlex respectively, to get to the bottom of this problem.
Basically, if the user enters data on a form, we're sending it back to the server in a url-encoded POST message, and it's emerging from Servlex in the form of XML presented as a string, and if there are non-ASCII characters then they are miscoded. In the form we set the submission method attribute to 'xml-urlencoded-post' to guarantee that the next page will fully replace the current one: XMLHttpRequest is not used in this case.
We were seeing the typical pattern that you get when the input characters are encoded as a UTF-8 byte sequence and the byte sequence is then decoded to characters by someone who believes it to be 8859-1. We were not able to work out where the incorrect decoding was happening. We originally circumvented the problem by reversing the error: we converted the string back to bytes treating each char as a byte, and then decoded the bytes as UTF-8.
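
The corruption pattern and the workaround can both be reproduced in a few lines of Python. This is only a sketch of the technique described above, not the code we used in the application.

```python
original = "O'Neil\u20ac"          # the euro sign is U+20AC

# What went wrong: the UTF-8 byte sequence is read back as ISO 8859-1,
# so the three octets of the euro sign become three separate characters.
wire_bytes = original.encode("utf-8")
garbled = wire_bytes.decode("iso-8859-1")

# The workaround: treat each char as one byte, then decode as UTF-8.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(len(original), len(garbled))   # 7 9
print(repaired == original)          # True
```

The reversal works because ISO 8859-1 maps every byte value to exactly one character, so no information is lost in the incorrect decoding step.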
A feature of XSLTForms is the profiler (enabled by pressing F1 or by setting debug='yes' in the xsltforms-options processing instruction). The profiler allows the inspection of the instance data. Another mechanism is to inspect the requests sent by the browser with the network profiler of a debugger.
We established that on the client side, there is an HTML Form Element that gets built, and just before the submit() method gets called on this object, the data appears to be OK. But when we look at the Tomcat log of the POST request, it's wrong. Somewhere between the form.submit() on the client and the logging of the message on the server, it's getting corrupted. We can't actually see where the encoding and decoding is happening between these two points.
To tackle this problem Florent provided a development version of Servlex, which added logging of the octets as they are read from the binary stream (the logger org.expath.servlex must be set to trace, which should be the default in that version), in addition to logging the raw headers as they are read by Tomcat.
With this new version of Servlex in place I entered the following data on the main form. Note the euro symbol at the end of my first name 'O'Neil': it is a non-ASCII character which needs to be preserved:
First Name: O'Neil€
Last Name: Delpratt
Company: Saxonica
Country: United Kingdom
Email Address: oneil@saxonica.com
Phone:
Agree to Terms: checked
After submitting this data to the URI pattern .../app/license-tool/editParseRequest, we see below the log data reported by Tomcat. What is interesting is the line 'DEBUG [2013-03-04 18:06:34,281]: Request - header : content-type / application/x-www-form-urlencoded'. Also, at this stage the input to the receiving form has already been corrupted: the value should be 'O'Neil€', but the euro symbol no longer arrives as a single character:
DEBUG [2013-03-04 18:06:34,279]: Request - servlet : parseRequest
DEBUG [2013-03-04 18:06:34,280]: Request - path : /parseRequest
DEBUG [2013-03-04 18:06:34,280]: Request - method : POST
DEBUG [2013-03-04 18:06:34,280]: Request - uri : http://localhost:8080/app/license-tool/parseRequest
DEBUG [2013-03-04 18:06:34,280]: Request - authority: http://localhost:8080
DEBUG [2013-03-04 18:06:34,280]: Request - ctxt_root: /app/license-tool
DEBUG [2013-03-04 18:06:34,280]: Request - param : postdata / <Document><Data>First Name: O'Neil€ Last Name: Delpratt Company: Saxonica Country: United Kingdom Email Address: oneil@saxonica.com Phone: Agree to Terms: checked </Data><Options><Confirmed>false</Confirmed><Create>false</Create><Send>false</Send><Generate>false</Generate><Existing/></Options></Document>
DEBUG [2013-03-04 18:06:34,281]: Request - header : host / localhost:8080
DEBUG [2013-03-04 18:06:34,281]: Request - header : user-agent / Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:18.0) Gecko/20100101 Firefox/18.0
DEBUG [2013-03-04 18:06:34,281]: Request - header : accept / text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
DEBUG [2013-03-04 18:06:34,281]: Request - header : accept-language / en-gb,en;q=0.5
DEBUG [2013-03-04 18:06:34,281]: Request - header : accept-encoding / gzip, deflate
DEBUG [2013-03-04 18:06:34,281]: Request - header : referer / http://localhost:8080/app/license-tool/main
DEBUG [2013-03-04 18:06:34,281]: Request - header : connection / keep-alive
DEBUG [2013-03-04 18:06:34,281]: Request - header : content-type / application/x-www-form-urlencoded
DEBUG [2013-03-04 18:06:34,281]: Request - header : content-length / 482
DEBUG [2013-03-04 18:06:34,281]: Raw body content type: application/x-www-form-urlencoded
TRACE [2013-03-04 18:06:34,281]: TraceInputStream(org.apache.catalina.connector.CoyoteInputStream@771eeb)
TRACE [2013-03-04 18:06:34,282]: read([B@1a70476): -1
Florent made the following observations:
Alain provided the following example to test the assumptions made by Florent.
Encoding.xhtml:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xf="http://www.w3.org/2002/xforms" xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Encoding Test</title>
    <xf:model>
      <xf:instance>
        <data/>
      </xf:instance>
      <xf:submission id="s01" method="xml-urlencoded-post" replace="all" action="http://www.agencexml.com/xsltforms/dump.php">
        <xf:message level="modeless" ev:event="xforms-submit-error">Submit error.</xf:message>
      </xf:submission>
    </xf:model>
  </head>
  <body>
    <xf:input ref=".">
      <xf:label>Input:</xf:label>
    </xf:input>
    <xf:submit>
      <xf:label>Save</xf:label>
    </xf:submit>
  </body>
</html>
dump.php:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>HTTP XML POST Dump</title>
  </head>
  <body>
    <h1>HTTP XML POST Dump</h1>
    <h2>Raw Data :</h2>
    <?php
      $body = file_get_contents("php://input");
      echo strlen($body);
      echo " bytes: <br/>";
      echo "<pre>$body</pre>";
      if(substr($body,0,9) == "postdata=") {
        $body = urldecode(substr($body,strpos($body,"=")+1));
      }
      $xml = new DOMDocument();
      $xml->loadXML($body);
      $xslt = new XSLTProcessor();
      $xsl = new DOMDocument();
      $indent = "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\"><xsl:output method=\"xml\" indent=\"yes\" encoding=\"UTF-8\"/><xsl:template match=\"@*|node()\"><xsl:copy-of select=\".\"/></xsl:template></xsl:stylesheet>";
      $xsl->loadXML($indent);
      $xslt->importStylesheet($xsl);
      $result = $xslt->transformToXml($xml);
      $result = substr($result, strpos($result,"?>")+3);
      echo "<h2>Indented XML :</h2><pre>".htmlspecialchars($result, ENT_QUOTES)."</pre>";
    ?>
  </body>
</html>
When submitting '€', I get this:
HTTP XML POST Dump
Raw Data :
41 bytes:
postdata=%3Cdata%3E%E2%82%AC%3C%2Fdata%3E
Indented XML :
<data>€</data>
and with Firebug, I can see the following, which is correct:
Florent states:
What should be in the content of the HTTP request is %E2%82%AC to represent the Euro
symbol as URL-encoded (because that represents the 3 octets of the symbol in UTF-8).
Because of the "automatic" handling of that Content-Type by Java EE, I am afraid the
only way to know for sure what is on the wire is to actually look into it (using a
packet sniffer, like Wireshark for instance).
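
As a quick cross-check of the expected wire format, the following Python sketch percent-encodes the euro symbol and rebuilds the 41-byte body reported by dump.php. It uses only the standard `urllib.parse.quote` function and is independent of XSLTForms and Servlex.

```python
from urllib.parse import quote

euro = "\u20ac"

# The three UTF-8 octets of the euro symbol, and their percent-encoding.
print(euro.encode("utf-8").hex())   # e282ac
print(quote(euro))                  # %E2%82%AC

# Rebuild the request body for the test instance <data>€</data>.
body = "postdata=" + quote("<data>" + euro + "</data>", safe="")
print(body)
print(len(body))                    # 41
```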
At this stage it was important to check what packets are being sent. The following is a snippet of the reports from Wireshark, with the data format correct at this point.
HTTP 1207 POST /app/license-tool/parseRequest HTTP/1.1 (application/x-www-form-urlencoded) [truncated] postdata=%3CDocument+xmlns%3D%22%22%3E%3CData%3EFirst+Name%3A+O%27Neil %E2%82%AC%0D%0A%0D%0ALast+Name%3A+Delpratt%0D%0A%0D%0ACompany%3A+Saxonica %0D%0A%0D%0ACountry%3A+United+Kingdom%0D%0A%0D%0AEmail+Address%3A+oneil %40saxonica.c ...
Using Alain's test case, Florent discovered that it was actually Tomcat itself interpreting the %xx encoding as Latin-1! More information at:
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
In summary, the decoding was done using 8859-1, not UTF-8 as one would expect.
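
The effect of choosing the wrong charset when decoding the %xx escapes can be shown with Python's `urllib.parse.unquote`, which takes the charset as a parameter. This is a sketch of the general mechanism, not of Tomcat's code.

```python
from urllib.parse import unquote

encoded = "%E2%82%AC"

# Decoding the escapes as UTF-8 yields the single euro character.
as_utf8 = unquote(encoded, encoding="utf-8")

# Decoding the same escapes as ISO 8859-1 yields one character per octet.
as_latin1 = unquote(encoded, encoding="iso-8859-1")

print(as_utf8 == "\u20ac")                 # True
print([hex(ord(c)) for c in as_latin1])    # ['0xe2', '0x82', '0xac']
```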
To overcome the problem Florent created a new config property for Servlex, named org.expath.servlex.default.charset, the value of which can be set to "UTF-8" in Tomcat's conf/catalina.properties. If set, its value is used as the charset for requests without an explicit charset in Content-Type.
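
The fallback rule can be sketched in Python as follows. This is a hypothetical helper, not Servlex's implementation: `request_charset` and its `default_charset` argument are made-up names, and the final ISO-8859-1 fallback is an assumption that mirrors the behaviour observed above.

```python
from email.message import Message

def request_charset(content_type, default_charset=None):
    """Charset named in a Content-Type header, else the configured default.

    `default_charset` stands in for a setting such as
    org.expath.servlex.default.charset; ISO-8859-1 is assumed when
    neither the header nor the setting names a charset.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    return explicit or default_charset or "ISO-8859-1"

print(request_charset("application/x-www-form-urlencoded"))
print(request_charset("application/x-www-form-urlencoded", "UTF-8"))
print(request_charset("text/xml; charset=UTF-16", "UTF-8"))
```

The request in the log above carries no charset parameter, so it is the second case that applies once the property is set.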
Thanks to Florent, Alain and Mike the encoding problem has now been resolved. The overall lesson learnt is that tracking down encoding problems can still be very hard work.
References
[1] XForms. W3C. http://www.w3.org/MarkUp/Forms/
[2] XSLTForms. Alain Couthures. http://www.agencexml.com/xsltforms
[3] Servlex. Florent Georges. GitHub: https://github.com/fgeorges/servlex Google Project: http://code.google.com/p/servlex/
[4] EXPath Webapp. http://expath.org/wiki/Webapp
I would like to report on some Saxon performance measurements for a Word ladder solution implemented in XSLT.
Firstly, some background information on the Word ladder problem. From Wikipedia, the free encyclopedia:
A word ladder (also known as doublets, word-links, or Word golf) is a word game invented by Lewis Carroll. A word ladder puzzle begins with two given words, and to solve the puzzle one must find the shortest chain of other words to link the two given words, in which chain every two adjacent words (that is, words in successive steps) differ by exactly one letter.
Interest in solving this problem with XSLT was first raised (to the best of my knowledge) by Dimitre Novatchev on the Mulberry mailing list; in his blog he provides a 20-step guide to creating a stylesheet that solves the Word ladder problem (FindChainOfWordsHamming.xsl). Following the post on the list there has been some interest, and another solution to this problem was given by Wolfgang Laun (please see thread, file: FindChainOfWordsHamming2.xsl).
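
For readers unfamiliar with the problem, here is a minimal Python sketch of one standard way to find a shortest ladder: a breadth-first search in which two words are neighbours when their Hamming distance is 1. It is not a translation of either Dimitre's or Wolfgang's stylesheet, and the word list is made up for illustration.

```python
from collections import deque

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def shortest_ladder(start, goal, words):
    """Breadth-first search; returns one shortest ladder or None."""
    if start == goal:
        return [start]
    unvisited = {w for w in words if len(w) == len(start)} - {start}
    queue = deque([[start]])
    while queue:
        chain = queue.popleft()
        # Collect neighbours first so the set is not modified while scanned.
        neighbours = [w for w in sorted(unvisited) if hamming(chain[-1], w) == 1]
        for word in neighbours:
            if word == goal:
                return chain + [word]
            unvisited.discard(word)
            queue.append(chain + [word])
    return None

# Made-up dictionary for illustration.
words = ["cold", "cord", "card", "ward", "warm", "word", "worm"]
print(shortest_ladder("cold", "warm", words))
```

Because the search proceeds level by level, the first chain that reaches the goal is guaranteed to be among the shortest; finding all shortest ladders, as Wolfgang's stylesheet does, requires exploring the whole final level.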
Experimental Evaluation
Our interest lies in Saxon performance only. I was curious and surprised by the results reported by Dimitre. The question I had was why Dimitre's stylesheet was much slower than Wolfgang's stylesheet in Saxon and faster in another XSLT processor: there must be some optimization step we were not making. I was motivated to understand where the bottlenecks were and how we could improve the performance in Saxon.
Wolfgang wrote: "The XSLT program is three times faster on one XSLT implementation than on another one is strange, 'very' strange".
Mike Kay addressed Wolfgang's comment by writing in the thread: "No, it's extremely common. In fact, very much larger factors than this are possible. Sometimes Saxon-EE runs 1000 times faster than Saxon-HE. This effect is normal with declarative languages where powerful optimizations are deployed - SQL users will be very familiar with the effect."
The table below shows the execution times of the stylesheets in Saxon 9.XX (for some recent X). Times were reported by Dimitre.
Transformation | Times (secs) |
---|---|
Dimitre | 39 |
Wolfgang | 25 |
We observe that Wolfgang's transformation is 1.56 times faster. Please note that Wolfgang's stylesheet lists all solutions (i.e. ladders), whereas Dimitre's only finds one.
Saxon represents a stylesheet as a compiled abstract syntax tree (AST) which is processed in an interpreted manner. Since the release of Saxon 9.4 we have included the bytecode generation feature, which allows us at the compilation phase to directly generate the bytecode representation of the entire AST, or of sub-trees of it where performance benefits can be achieved. We make use of properties we know at compile time (see full paper).
Analysis of Dimitre's Stylesheet
Step one was to see how well Saxon does with the bytecode feature switched on. This proved inconclusive because we discovered a bug in the bytecode generated. This was already a useful exercise: we managed to fix the bug (see bug issue: #1653). The problem was that, in the function processQueue, the tail-recursive call was not being properly generated into bytecode.
The table below shows running times of the stylesheets under Saxon 9.4.0.6. We observe that Wolfgang's stylesheet was about 2.08 and 3.23 times faster in Saxon's interpreted and bytecode modes, respectively.
Transformation | Interpreted - Times (secs) | With bytecode generation - Times (secs) |
---|---|---|
Dimitre | 7.95 | 7.78 |
Wolfgang | 3.83 | 2.41 |
Analyzing Dimitre's stylesheet with the Saxon tracing profile (i.e. option -TP) proved useful. See the HTML output produced by Saxon below. We observe that there is a big hit on the processNode function, where most of the time is spent.
Total time: 9498.871 milliseconds
Time spent in each template or function:
The table below is ordered by the total net time spent in the template or function. Gross time means the time including called templates and functions; net time means time excluding time spent in called templates and functions.
file | line | instruction | count | avg time (gross) | total time (gross) | avg time (net) | total time (net) |
---|---|---|---|---|---|---|---|
"*rdsHamming.xsl" | 79 | function my:processNode | 2053 | 4.12 | 8470.67 | 3.729 | 7655.792 |
"*rdsHamming.xsl" | 21 | function my:chainOfWords | 1 | 9491.1 | 9491.12 | 993.34 | 993.34 |
"*rdsHamming.xsl" | 131 | function f:eq | 3993 | 0.06 | 230.02 | 0.058 | 230.26 |
"*rdsHamming.xsl" | 131 | function my:HammingDistance | 3993 | 0.20 | 807.38 | 0.049 | 194.77 |
"*func-apply.xsl" | 21 | function f:apply | 15972 | 0.01 | 290.01 | 0.011 | 175.00 |
"*-Operators.xsl" | 244 | template f:eq | 15972 | 0.01 | 115.01 | 0.004 | 68.23 |
"*-Operators.xsl" | 248 | function f:eq | 15972 | 0.003 | 46.77 | 0.003 | 46.77 |
"*nc-zipWith.xsl" | 21 | function f:zipWith | 19965 | 0.002 | 33.11 | 0.002 | 33.11 |
"*nc-zipWith.xsl" | 9 | function f:zipWith | 19965 | 0.003 | 57.67 | 0.001 | 24.56 |
"*func-apply.xsl" | 16 | function f:apply | 15972 | 0.019 | 309.52 | 0.001 | 19.52 |
"*rdsHamming.xsl" | 70 | function my:processQueue | 2053 | 0.009 | 18.35 | 0.009 | 18.35 |
"*hFunctions.xsl" | 498 | function f:string-to-codepoints | 3993 | 0.003 | 10.52 | 0.003 | 10.52 |
"*rdsHamming.xsl" | 120 | function my:HammingDistance | 3993 | 0.204 | 814.48 | 0.002 | 7.09 |
"*hFunctions.xsl" | 498 | function f:string-to-codepoints | 3993 | 0.001 | 4.88 | 0.001 | 4.88 |
"*rdsHamming.xsl" | 73 | function my:processNode | 2053 | 4.128 | 8475.2 | 0.002 | 4.57 |
"*rdsHamming.xsl" | 54 | function my:processQueue | 2053 | 0.011 | 22.20 | 0.002 | 3.85 |
"*rdsHamming.xsl" | 17 | template /* | 1 | 9491.87 | 9491.9 | 0.756 | 0.76 |
"*rdsHamming.xsl" | 40 | function my:chainOfWords | 1 | 0.344 | 0.34 | 0.344 | 0.34 |
"*rdsHamming.xsl" | 117 | function my:enumerate | 10 | 0.166 | 1.65 | 0.029 | 0.29 |
"*rdsHamming.xsl" | 111 | function my:enumerate | 10 | 0.176 | 1.76 | 0.010 | 0.10 |
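
The gross/net distinction can be made concrete with the figures for my:processNode (line 79) in the table above. The short Python sketch below just does the subtraction; the variable names are for illustration only.

```python
# Totals in milliseconds for my:processNode (line 79), from the table above.
gross_total = 8470.67
net_total = 7655.792
calls = 2053

# Gross time includes called functions, net time excludes them,
# so the difference is the time spent in callees.
time_in_callees = gross_total - net_total
print(round(time_in_callees, 3))       # 814.878

# Average net time per call, as in the "avg time (net)" column.
print(round(net_total / calls, 3))     # 3.729
```

The difference is close to the 814.48 ms gross total reported for my:HammingDistance at line 120, which suggests, without proving it, that this is the main callee.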
In addition to the Saxon tracing profile I ran the Java hprof profiling tool, which
showed that most time was spent in comparing strings. See the Java profile results
below. It was now obvious that the GeneralComparison expression was in question. Specifically,
we narrowed it down to the instruction: <xsl:for-each select="$vNeighbors[not(. = $pExcluded)]">.
For the interpreted code we were doing some unnecessary runtime type checking when
we know statically at compile time that we are comparing string values. More specifically,
we know at compile time that $vNeighbors is a sequence of untyped atomic values and
$pExcluded is a sequence of strings. We were unnecessarily checking at runtime that
untyped atomic and string values were comparable, and we were doing an unnecessary
conversion from untyped atomic to string.
CPU SAMPLES BEGIN (total = 1213) Thu Nov 29 14:42:47 2012
rank   self  accum   count trace method
   1 24.24% 24.24%     294 300547 java.lang.Integer.hashCode
   2 19.13% 43.36%     232 300581 net.sf.saxon.expr.GeneralComparison.compare
   3  7.75% 51.11%      94 300613 java.util.HashMap.getEntry
   4  2.14% 53.26%      26 300570 java.util.LinkedHashMap$Entry.recordAccess
   5  2.06% 55.32%      25 300234 java.lang.ClassLoader.defineClass1
   6  2.06% 57.38%      25 300616 com.saxonica.expr.ee.GeneralComparisonEE.effectiveBooleanValue
   7  1.98% 59.36%      24 300603 java.util.LinkedHashMap$Entry.recordAccess
   8  1.98% 61.34%      24 300609 net.sf.saxon.type.Converter.convert
....
See full hprof results: java.hprof-DN.txt
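
The semantics of the predicate in `$vNeighbors[not(. = $pExcluded)]` can be sketched in Python. The general comparison `=` is existential: it is true if any pair of items compares equal. The sketch below illustrates those semantics and shows that, once every value is known to be a string, a plain set-membership test gives the same result without per-item type checks. It is not a description of how Saxon evaluates the expression, and the sample words are made up.

```python
def general_eq(left, right):
    """XPath-style `=`: true if any pair of items compares equal."""
    return any(l == r for l in left for r in right)

def filter_pairwise(neighbours, excluded):
    """Direct reading of $vNeighbors[not(. = $pExcluded)]."""
    return [n for n in neighbours if not general_eq([n], excluded)]

def filter_with_set(neighbours, excluded):
    """Same result when all values are known to be strings."""
    seen = set(excluded)
    return [n for n in neighbours if n not in seen]

neighbours = ["cord", "card", "word"]
excluded = ["cold", "cord"]
print(filter_pairwise(neighbours, excluded))
print(filter_with_set(neighbours, excluded))
```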
Improvements in Bytecode generation
In the bytecode we discovered we were missing out on opportunities to capitalise on static properties we know at compile time. For example, during atomization we were doing an instanceof test to see whether each item was a node when we already knew from static analysis that this was the case. We were also able to avoid unnecessary string conversions and instanceof checks, and we found we could avoid repeated conversions by saving string values for reuse when appropriate.
We were able to apply the code improvements discussed above in Saxon-EE 9.5 (pre-release). The table below shows the running times of the stylesheets written by Dimitre and Wolfgang. We observe that in the interpreted code Wolfgang's XSL is about 2.14 times faster than Dimitre's (this is similar to the results above). With the bytecode generation feature switched on, Dimitre's stylesheet has dramatically improved in performance and is now about 1.12 times faster than Wolfgang's XSL.
Transformation | Interpreted - Times (secs) | With bytecode generation - Times (secs) |
---|---|---|
Dimitre | 7.373 | 1.938 |
Wolfgang | 3.450 | 2.17 |
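
The ratios between the entries of this table can be recomputed with a short Python sketch. The times are copied from the table; the `ratio` helper is just for illustration.

```python
# Running times in seconds under Saxon-EE 9.5 (pre-release), from the table.
times = {
    "Dimitre": {"interpreted": 7.373, "bytecode": 1.938},
    "Wolfgang": {"interpreted": 3.450, "bytecode": 2.17},
}

def ratio(slower, faster):
    """How many times faster `faster` is than `slower`."""
    return round(slower / faster, 2)

# Wolfgang vs Dimitre, interpreted.
print(ratio(times["Dimitre"]["interpreted"], times["Wolfgang"]["interpreted"]))  # 2.14
# Dimitre vs Wolfgang, with bytecode generation.
print(ratio(times["Wolfgang"]["bytecode"], times["Dimitre"]["bytecode"]))        # 1.12
# Gain from bytecode generation for each stylesheet.
print(ratio(times["Dimitre"]["interpreted"], times["Dimitre"]["bytecode"]))      # 3.8
print(ratio(times["Wolfgang"]["interpreted"], times["Wolfgang"]["bytecode"]))    # 1.59
```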
We have not yet done any similar analysis on Wolfgang's stylesheet; we will now attempt to do this.
To be continued....