.NET and XPath

So I'm working on an XPath presentation for my team at work. I was trying to hack up a sample using some of the more interesting XPath functions, like string-join. PHP's DOMXPath throws a fit when I use this function, so I cracked open MSDN and saw that XPathNavigator in the 2.0 framework claims to support "the XQuery 1.0 and XPath 2.0 Data Model[s]." Nifty, huh? Especially since string-join is defined in that family of specs. (Note that this table claims it is available in XPath 1.0. Apparently nobody bothered to check the XPath 1.0 specification, which does not mention it at all.)

PHP's implementation must be broken then. Off I go and code a Winforms project that I can use to run my example. Right? Yeah, right...

For the sake of simplicity, I coded a small CLI program that will run an XPath query against an empty document:

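using System;
using System.Xml;
using System.Xml.XPath;

public class XPathCLI {
    public static void Main(string[] args) {
        // Expect exactly one argument: the XPath expression to evaluate.
        if (args.Length != 1) {
            Console.Error.WriteLine("usage: XPathCLI.exe <xpath-expression>");
            return;
        }

        // An empty document is enough, since the test expressions need no input nodes.
        XmlDocument doc = new XmlDocument();

        // Evaluate returns a bool, double, string, or node iterator depending on the expression.
        XPathNavigator nav = doc.CreateNavigator();
        Console.WriteLine(nav.Evaluate(args[0]).ToString());
    }
}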

Now let's make sure it's working:

$ ./XPathCLI.exe 'concat("hello ", "world")'
hello world

Looks good. Now let's try the examples listed under string-join:

$ ./XPathCLI.exe "string-join({'Now', 'is', 'the', 'time', '...'}, \" \")"

Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
  at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
  at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]

$ ./XPathCLI.exe "string-join({abra, cadabra}, \"\")"

Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
  at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
  at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]

$ ./XPathCLI.exe 'string-join((), "separator")'

Unhandled Exception: System.Xml.XPath.XPathException: Error during parse of string-join((), "separator") ---> Mono.Xml.XPath.yyParser.yyException: irrecoverable syntax error
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000] --- End of inner exception stack trace ---

  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathExpression.Compile (System.String xpath, IXmlNamespaceResolver nsmgr, IStaticXsltContext ctx) [0x00000]
  at System.Xml.XPath.XPathExpression.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathNavigator.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathNavigator.Evaluate (System.String xpath) [0x00000]
  at XPathCLI.Main (System.String[] args) [0x00000]

Ok, that didn't go too well. Apparently Mono doesn't like some of the syntax. Let's use a node-selecting expression instead:

$ ./XPathCLI.exe 'string-join(//something, "separator")'

Unhandled Exception: System.Xml.XPath.XPathException: function string-join not found
  at System.Xml.XPath.ExprFunctionCall.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]
  at System.Xml.XPath.CompiledExpression.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]

Uh... ok. Let's start over on MS.NET. It must be a Mono bug, right?

>XPathCLI.exe "string-join({'Now', 'is' 'the', 'time', '...'}, \" \")"

Unhandled Exception: System.Xml.XPath.XPathException: 'string-join({'Now', 'is''the', 'time', '...'}, " ")' has an invalid token.
   at MS.Internal.Xml.XPath.XPathScanner.NextLex()
   at MS.Internal.Xml.XPath.XPathParser.ParseMethod(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParsePrimaryExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseFilterExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParsePathExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseUnionExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseUnaryExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseMultiplicativeExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseAdditiveExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseRelationalExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseEqualityExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseAndExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseOrExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseXPathExpresion(String xpathExpresion)
   at System.Xml.XPath.XPathExpression.Compile(String xpath, IXmlNamespaceResolver nsResolver)
   at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
   at XPathCLI.Main(String[] args)

Let's jump straight to the one that made it past Mono's parser to crash in the evaluator:

>XPathCLI.exe "string-join(//something, \"separator\")"

Unhandled Exception: System.Xml.XPath.XPathException: Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
   at MS.Internal.Xml.XPath.CompiledXpathExpr.get_QueryTree()
   at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context)
   at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
   at XPathCLI.Main(String[] args)
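
That last error message is at least a hint: hand the expression an XsltContext and MS.NET will ask it to resolve function names it doesn't know. Here is a minimal sketch of that idea, not a real XPath 2.0 string-join. The class names StringJoinFunction, StringJoinContext, and XPathCLIWithContext are made up for illustration, the sample XML is invented, and it only handles the node-set form of the first argument (the literal-sequence examples still won't parse). I'm also assuming an unprefixed name reaches ResolveFunction with an empty prefix, and I haven't checked whether Mono behaves the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical extension function: string-join(node-set, separator).
public class StringJoinFunction : IXsltContextFunction {
    public int Minargs { get { return 2; } }
    public int Maxargs { get { return 2; } }
    public XPathResultType ReturnType { get { return XPathResultType.String; } }
    public XPathResultType[] ArgTypes {
        get { return new XPathResultType[] { XPathResultType.NodeSet, XPathResultType.String }; }
    }

    public object Invoke(XsltContext xsltContext, object[] args, XPathNavigator docContext) {
        // A node-set argument arrives as an XPathNodeIterator.
        XPathNodeIterator nodes = (XPathNodeIterator)args[0];
        string separator = Convert.ToString(args[1]);
        List<string> values = new List<string>();
        while (nodes.MoveNext()) {
            values.Add(nodes.Current.Value);
        }
        return string.Join(separator, values.ToArray());
    }
}

// Hypothetical context that only knows about an unprefixed string-join.
public class StringJoinContext : XsltContext {
    public override bool Whitespace { get { return true; } }
    public override bool PreserveWhitespace(XPathNavigator node) { return true; }
    public override int CompareDocument(string baseUri, string nextbaseUri) { return 0; }

    public override IXsltContextFunction ResolveFunction(
            string prefix, string name, XPathResultType[] argTypes) {
        if (string.IsNullOrEmpty(prefix) && name == "string-join") {
            return new StringJoinFunction();
        }
        return null; // anything else stays unresolved
    }

    public override IXsltContextVariable ResolveVariable(string prefix, string name) {
        return null; // no variables in this sketch
    }
}

public class XPathCLIWithContext {
    // Compile the expression, attach the custom context, then evaluate it.
    public static object Evaluate(XPathNavigator nav, string xpath) {
        XPathExpression expr = XPathExpression.Compile(xpath);
        expr.SetContext(new StringJoinContext());
        return nav.Evaluate(expr);
    }

    public static void Main(string[] args) {
        XmlDocument doc = new XmlDocument();
        // Invented sample input so //something actually selects nodes.
        doc.LoadXml("<r><something>a</something><something>b</something></r>");
        string xpath = args.Length > 0 ? args[0] : "string-join(//something, ', ')";
        Console.WriteLine(Evaluate(doc.CreateNavigator(), xpath));
    }
}
```

This is still you implementing the function yourself, so it doesn't change the point about the documentation.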

From these experiments we can draw a few conclusions:

  • Mono, MS.NET, and PHP do not support XPath 2.0. I cannot find any PHP documentation that claims a specific version of XPath support, but, as noted in the intro paragraph, MSDN claims XPath 2.0 support and MS.NET does not deliver. (Mono may be following the MS.NET implementation instead of the spec, so whether this is a Mono bug or not is debatable.)
  • Mono, MS.NET, and PHP do not support the {...} construct, which is present in the XPath 2.0 "Precedence Order" section but not actually defined elsewhere. This construct is not present at all in the XPath 1.0 specification. Whether this is a specification defect or an implementation defect is left as an open question.
  • Mono, MS.NET, and PHP do not implement the string-join function defined in at least XPath 2.0.

And from those conclusions we can draw a few more.

  • Nobody gives a whip about following the XPath specification.
  • The XPath specification is broken. Or confusing. Or (more likely) both.

The real question, then, is whether people skip the XPath 2.0 specification because they don't want to implement it, or because parts of it make no sense. It seems odd to me that an implementation would support concat and not string-join, especially since they are defined right next to each other.

In any case, if you're not implementing all of it, don't claim that you do. Incorrect documentation is worse than no documentation.
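
If all you need is the joined string, the boring way out is to select the nodes with plain XPath 1.0 and do the join in C#. A minimal sketch, assuming a helper I've called XPathJoin.StringJoin (a made-up name, not part of any framework) and an invented sample document:

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class XPathJoin {
    // Hypothetical helper: select nodes with an XPath 1.0 expression,
    // then join their string values on the host side.
    public static string StringJoin(XPathNavigator nav, string nodeSetXPath, string separator) {
        List<string> values = new List<string>();
        XPathNodeIterator it = nav.Select(nodeSetXPath);
        while (it.MoveNext()) {
            values.Add(it.Current.Value);
        }
        return string.Join(separator, values.ToArray());
    }
}

public class XPathJoinDemo {
    public static void Main(string[] args) {
        XmlDocument doc = new XmlDocument();
        // Invented sample input for illustration.
        doc.LoadXml("<r><something>a</something><something>b</something></r>");
        // prints "a, b"
        Console.WriteLine(XPathJoin.StringJoin(doc.CreateNavigator(), "//something", ", "));
    }
}
```

It only covers the node-set case, and the join happens outside the XPath expression, so it is a workaround rather than an implementation of string-join.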
