How to prevent XPath injection

Parameterization and input validation are invaluable to application security. Which method is best for preventing XPath injection attacks? Chris Eng explains.

XPath injection attacks can wreak havoc on Web applications. Application security expert Chris Eng answers a reader's question on whether parameterization or input validation is best for preventing XPath injection.

I am trying to build a Web site application and am considering security issues with XPath injections. According to the literature available, there are two methods of countering these attacks. One is by sanitizing the input provided by the user, and the other method is to develop parameterized queries. My question is which one is better and why? I know input validation is tougher and will impose rules on the input the client can provide, but parameterization seems foolproof. Is this right?

If you had to pick one approach or the other, parameterizing your XPath queries would be more effective. Similar to using prepared statements to protect against SQL injection, precompiling your XPath query creates a distinction between the control plane and the data plane. Input provided by the end user is treated only as data and cannot influence the syntax of the query itself.

Consider the following XML document:

        <?xml version="1.0"?>

Here is an example of a parameterized XPath query implemented in Java:

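// Uses javax.xml.xpath.*, javax.xml.parsers.DocumentBuilderFactory,
// org.w3c.dom.Document and java.io.File
XPath xpath = XPathFactory.newInstance().newXPath();
// The resolver supplies $username, $password and $pin when the query is evaluated
xpath.setXPathVariableResolver(new LoginResolver(login));
XPathExpression xlogin =
    xpath.compile("//users/user[username/text()=$username and "
        + "password/text()=$password and pin=$pin]/ssn/text()");
Document d = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder().parse(new File("userdata.xml"));
String ssn = xlogin.evaluate(d);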

You would also need to define the LoginResolver class, which the XPath evaluator uses to populate the values of the bound parameters.
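The article does not show LoginResolver, so the following is only a minimal sketch of what such a class might look like. It assumes the resolver is built from three plain strings rather than the `login` object used above, and the class name LoginResolverSketch, the lookupSsn helper and the sample XML data are all invented for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.xml.sax.InputSource;

class LoginResolverSketch {

    // Hypothetical resolver: maps XPath variable names to user-supplied values.
    static class LoginResolver implements XPathVariableResolver {
        private final Map<String, Object> values = new HashMap<>();

        LoginResolver(String username, String password, String pin) {
            values.put("username", username);
            values.put("password", password);
            values.put("pin", pin);
        }

        @Override
        public Object resolveVariable(QName name) {
            // Values are handed to the evaluator as data, never parsed as XPath syntax
            return values.get(name.getLocalPart());
        }
    }

    // Illustrative data only; element names follow the query in the article.
    static final String USERS_XML =
        "<users><user><username>alice</username><password>s3cret</password>"
        + "<pin>123456</pin><ssn>000-00-0000</ssn></user></users>";

    static String lookupSsn(String username, String password, String pin) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The resolver must be set before the expression is compiled
        xpath.setXPathVariableResolver(new LoginResolver(username, password, pin));
        XPathExpression xlogin = xpath.compile(
            "//users/user[username/text()=$username and "
            + "password/text()=$password and pin=$pin]/ssn/text()");
        return xlogin.evaluate(new InputSource(new StringReader(USERS_XML)));
    }

    public static void main(String[] args) throws Exception {
        // Correct credentials return the stored value
        System.out.println(lookupSsn("alice", "s3cret", "123456"));
        // An injection attempt is compared as a literal string and matches nothing
        System.out.println(lookupSsn("alice", "s3cret", "1 or 1=1").isEmpty());
    }
}
```

Because the pin is bound as a string variable, the value "1 or 1=1" is compared against the text of the pin element instead of being parsed as part of the expression, so the lookup returns an empty result.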

This is not to say that you should avoid input validation altogether. Defense in depth is a crucial concept to keep in mind when designing secure applications, and as a general rule, user-supplied data should always be validated to ensure that it conforms to expected values or ranges. If a new technique for attacking parameterized XPath queries is ever discovered, strict input validation will help limit your application's exposure. Similarly, if a developer forgets to (or simply chooses not to) use a parameterized query, a properly designed input filter may be enough to prevent the vulnerability from being exploited.

Finally, when designing input validation filters, be wary of trying to find a one-size-fits-all approach. For example, I have read recommendations stating that SQL injection and XPath injection can be prevented by simply escaping single quotes, double quotes, backslashes, and semicolons from user-supplied data. But this doesn't make all dynamic queries safe, only those that enclose the user-supplied data in quotes. Consider a non-parameterized version of the same XPath query we looked at earlier:

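// Vulnerable: user input is concatenated directly into the query string
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression xlogin = xpath.compile("//users/user"
    + "[username/text()='" + login.getUsername() + "' and "
    + "password/text()='" + login.getPassword() + "' and "
    // The pin is numeric and therefore unquoted; getPin() is an assumed accessor name
    + "pin=" + login.getPin() + "]/ssn/text()");
Document d = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder().parse(new File("userdata.xml"));
String ssn = xlogin.evaluate(d);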

Using the previously mentioned blacklist-style approach to escape quotes would prevent the username and password fields from being manipulated. However, the pin field, which is numeric and therefore not enclosed in quotes, is still vulnerable. If the value "1 or 1=1" is supplied for the pin field, it will pass through the blacklist filter untouched, because it contains none of the escaped characters. Knowing that a user's pin should always consist of six numeric characters, a better mechanism would be to create a whitelist filter that allows only that specific data format and rejects everything else.
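As a rough sketch of such a whitelist filter, the example below accepts a pin only if it is exactly six ASCII digits. It also includes a deliberately vulnerable lookup, reduced to the pin predicate alone, to show what the unquoted "1 or 1=1" value does to a concatenated query. The class name PinWhitelistSketch, the method names and the sample XML data are assumptions made for illustration and are not from the original application.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PinWhitelistSketch {

    // Whitelist: exactly six ASCII digits and nothing else.
    private static final Pattern PIN_FORMAT = Pattern.compile("[0-9]{6}");

    static boolean isValidPin(String pin) {
        // matches() requires the whole string to fit the pattern
        return pin != null && PIN_FORMAT.matcher(pin).matches();
    }

    // Illustrative data only.
    static final String USERS_XML =
        "<users><user><username>alice</username><pin>123456</pin>"
        + "<ssn>000-00-0000</ssn></user></users>";

    // Deliberately vulnerable: the pin is concatenated, unquoted, into the query.
    static String unsafeLookupByPin(String pin) throws Exception {
        String query = "//users/user[pin=" + pin + "]/ssn/text()";
        return XPathFactory.newInstance().newXPath()
            .evaluate(query, new InputSource(new StringReader(USERS_XML)));
    }

    // Same lookup, but the input is rejected unless it fits the whitelist.
    static String filteredLookupByPin(String pin) throws Exception {
        if (!isValidPin(pin)) {
            throw new IllegalArgumentException("pin must be exactly six digits");
        }
        return unsafeLookupByPin(pin);
    }

    public static void main(String[] args) throws Exception {
        // The injected predicate is true for every user, so data leaks
        System.out.println(unsafeLookupByPin("1 or 1=1"));
        // The whitelist rejects the same value before it reaches the query
        System.out.println(isValidPin("1 or 1=1"));
    }
}
```

The filter describes the one format that is allowed instead of listing characters that are forbidden, which is why it catches an input containing no quotes, backslashes or semicolons. In keeping with the defense-in-depth point above, it works best alongside a parameterized query, not as a replacement for one.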

About the author: Chris Eng is director of security research at Veracode.
