Q

Understanding XPath injection

XPath injection is similar to SQL injection and other injection attacks, but this XML exploit has its own unique set of issues. Web services expert Rami Jaamour details how these exploits work -- and how you can avoid them.

Can you please explain what an XPath injection is? I'm assuming it's similar to SQL injection, but I don't know how I would prevent it.

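XPath stands for XML Path, a language for addressing nodes in an XML document. For example, if we have the XML snippet:

...
<customers>
    <account username="rjaamour" password="mypass">
        <ssn>123456789</ssn>
        <address>
            <street>100 First Ave.</street>
            <city>Los Angeles</city>
            <state>CA</state>
            <zip>91234</zip>
        </address>
    </account>
    <account username="jsmith" password="somevalue">
        <ssn>54321</ssn>
        ...
    </account>
    ...
</customers>

The XPath

/customers/account[1]/address/state/text()

returns the text node value "CA".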

XPaths can be used in XSLT documents for transformation purposes (for example, to transform a plain XML data document into a formatted HTML document). XPaths make it easy to retrieve node values from deep inside a document without traversing the document with SAX or DOM. This makes XPaths an attractive choice for writing code that accesses data from XML content. A single line retrieves the state value from the XML snippet above, compared to three levels of traversal if the DOM API were used.
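To make the comparison concrete, here is a minimal sketch that reads the state value both ways. It assumes the standard JAXP classes shipped with the JDK (javax.xml.xpath and org.w3c.dom); the class name StateLookup, its helper methods, and the trimmed copy of the example document are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class StateLookup {
    // Trimmed copy of the example document; only the nodes used here are kept.
    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\">"
        + "<ssn>123456789</ssn>"
        + "<address><state>CA</state></address>"
        + "</account>"
        + "</customers>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // One XPath expression addresses the text node directly.
    static String stateViaXPath(Document doc) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate("/customers/account[1]/address/state/text()", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    // The same lookup with DOM calls: one step per level below the root.
    static String stateViaDom(Document doc) {
        Element customers = doc.getDocumentElement();
        Element account = firstChildElement(customers, "account");
        Element address = firstChildElement(account, "address");
        Element state = firstChildElement(address, "state");
        return state.getTextContent();
    }

    // Returns the first child element with the given name, or null if there is none.
    static Element firstChildElement(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(stateViaXPath(doc));
        System.out.println(stateViaDom(doc));
    }
}
```

Both methods read the same node; the XPath version simply moves the traversal into the expression string, which is also what makes it tempting to build that string from user input.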

The XPath language supports expressions that can be more complex. For example, a node could have been retrieved from the above XML document based on the username/password rather than index:

/customers/account[attribute::username ='rjaamour'][attribute::password='mypass']/ssn


This expression returns the element:

<ssn>123456789</ssn>


Such expressions enable programmers to construct the XPath expressions dynamically based on variables or input from the user. For example, assume you have the following Java code:

...
public AccountInfo getAccountInfo(String username, String password) {

    // Vulnerable: user input is concatenated directly into the XPath expression.
    String xpath = "/customers/account[attribute::username ='" + username
            + "'][attribute::password='" + password + "']/ssn";

    Node ssn = XPathAPI.selectSingleNode(documentNode, xpath);

    if (ssn == null) {
        return null;
    }
    ...
}

This code is expected to return the SSN element for the username and password passed in as parameters, possibly from a Web form or a SOAP message parameter accessible from the outside world. At first glance, the code appears to return null unless a correct username/password pair is provided. However, if a malicious user passes ' or '1'='1 as the password parameter value, the XPath that the code executes becomes:

/customers/account[attribute::username ='rjaamour'][attribute::password='' or '1'='1']/ssn

This retrieves the SSN node for the username "rjaamour" without a valid password, because the injected or '1'='1' clause always makes the password predicate evaluate to true.
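The attack can be reproduced with a short, self-contained sketch. It swaps in the JDK's javax.xml.xpath API in place of XPathAPI so that it runs without extra libraries, and it uses a payload whose quotes balance once concatenated into the expression (' or '1'='1). The class InjectionDemo, its method names, and the trimmed two-account document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InjectionDemo {
    // Trimmed copy of the example document with both accounts.
    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\"><ssn>123456789</ssn></account>"
        + "<account username=\"jsmith\" password=\"somevalue\"><ssn>54321</ssn></account>"
        + "</customers>";

    // Vulnerable on purpose: user input is concatenated into the expression.
    static String buildXPath(String username, String password) {
        return "/customers/account[attribute::username='" + username
                + "'][attribute::password='" + password + "']/ssn";
    }

    // Returns the text of the matched ssn element, or null if nothing matches.
    static String lookupSsn(String username, String password) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node ssn = (Node) xpath.evaluate(buildXPath(username, password),
                    new InputSource(new StringReader(XML)), XPathConstants.NODE);
            return ssn == null ? null : ssn.getTextContent();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath expression", e);
        }
    }

    public static void main(String[] args) {
        // A wrong password matches nothing.
        System.out.println(lookupSsn("rjaamour", "wrong"));
        // The injected clause makes the password predicate always true.
        System.out.println(lookupSsn("rjaamour", "' or '1'='1"));
    }
}
```

The second call returns the SSN even though no valid password was supplied, because the attacker's quotes close the string literal and append a condition of their own.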

As you can see, the concept of XPath injection is very similar to SQL injection. XPath injections are just another embodiment of injection attacks in general.

To prevent XPath injection, use an approach similar to the one used against SQL injection: validate and sanitize user input before it is passed to an expression. In most cases, enforcing access control inside the XPath expression itself, as the example above does, should be avoided altogether, possibly by providing the code with XML content that has already been filtered according to authorization rules and only then querying it with XPath.
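One way to carry the SQL analogy further, which the answer above does not itself describe, is to bind user input as XPath variables, much as a prepared statement binds SQL parameters. The sketch below assumes the JDK's javax.xml.xpath API and its XPathVariableResolver interface; the class BoundXPathLookup, its constants, and the trimmed document are hypothetical.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class BoundXPathLookup {
    // Trimmed copy of the example document with both accounts.
    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\"><ssn>123456789</ssn></account>"
        + "<account username=\"jsmith\" password=\"somevalue\"><ssn>54321</ssn></account>"
        + "</customers>";

    // The expression is a constant; user input never becomes part of its syntax.
    static final String QUERY =
        "/customers/account[@username=$username][@password=$password]/ssn";

    // Returns the text of the matched ssn element, or null if nothing matches.
    static String lookupSsn(String username, String password) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("username", username);
        vars.put("password", password);
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Each $name in QUERY is resolved to a plain string value at evaluation time.
            xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));
            Node ssn = (Node) xpath.evaluate(QUERY,
                    new InputSource(new StringReader(XML)), XPathConstants.NODE);
            return ssn == null ? null : ssn.getTextContent();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupSsn("rjaamour", "mypass"));
        // The payload is compared as a literal string, so it matches nothing.
        System.out.println(lookupSsn("rjaamour", "' or '1'='1"));
    }
}
```

Because the payload is treated as data and compared to the password attribute as a whole string, the quotes in it have no effect on the structure of the query. This does not remove the design concern about checking credentials in XPath at all; it only closes the injection path.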

If, however, you cannot avoid executing risky XPaths, create a "whitelist" of allowable characters and values. For example, allow only alphanumeric characters, or allow special characters but exclude quotes. Reject any input that does not comply with the whitelist, or sanitize it properly (by encoding or escaping, for example) when dangerous characters must be allowed. Beyond compromises of confidentiality and access control, beware of malicious XPath parameters that could harm the system in other ways, such as a value that makes an XPath return a very large number of nodes and so causes a denial of service (DoS).
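A whitelist check of this kind can be sketched in a few lines. The policy shown here (1 to 64 ASCII letters and digits) is an assumption chosen for illustration, as are the class and method names; a real application would pick the character set and length limit that fit its own data.

```java
import java.util.regex.Pattern;

class XPathInputValidator {
    // Assumed policy: 1 to 64 ASCII letters and digits, nothing else.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9]{1,64}");

    // True only when the whole input consists of whitelisted characters.
    static boolean isAllowed(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    // Returns the input unchanged, or rejects it before it reaches any XPath.
    static String requireAllowed(String input) {
        if (!isAllowed(input)) {
            throw new IllegalArgumentException("Input contains characters outside the whitelist");
        }
        return input;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("mypass"));
        System.out.println(isAllowed("' or '1'='1"));
    }
}
```

Calling requireAllowed on both the username and the password before building the expression rejects quote-based payloads outright. The length cap is a small guard against oversized input, but it does not by itself address expressions that legitimately match a very large number of nodes.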

This was first published in October 2006