Ask the Expert

Understanding XPath injection

Can you please explain what an XPath injection is? I'm assuming it's similar to SQL injection, but I don't know how I would prevent it.

XPath (XML Path Language) is a language for addressing nodes in an XML document. For example, given the XML snippet:

...
<customers>
     <account username="rjaamour" password="mypass">
          <ssn>123456789</ssn>
          <address>
               <street>100 First Ave.</street>
               <city>Los Angeles</city>
               <state>CA</state>
               <zip>91234</zip>
          </address>
     </account>
     <account username="jsmith" password="somevalue">
          <ssn>54321</ssn>
           …
     </account>
     ...
</customers>

The XPath

     /customers/account[1]/address/state/text()

returns the text node value "CA".

XPath expressions can be used in XSLT documents for transformation purposes (for example, to transform a plain XML data document into a formatted HTML document). They make it easy to retrieve node values from deep inside a document without traversing it with SAX or DOM, which makes XPath an attractive choice for code that reads data from XML content. A single line retrieves the state value from the XML snippet above, compared with three levels of traversal if the DOM API were used.
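As a rough illustration of that single-line lookup, the sketch below evaluates the same expression with the standard javax.xml.xpath API. The class name StateLookup, the helper evaluate and the inlined one-account copy of the XML are assumptions made for the example; they are not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
final class StateLookup {

    // Trimmed copy of the customers snippet (first account only).
    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\">"
        + "<ssn>123456789</ssn>"
        + "<address><street>100 First Ave.</street><city>Los Angeles</city>"
        + "<state>CA</state><zip>91234</zip></address>"
        + "</account>"
        + "</customers>";

    // Evaluates an XPath expression against XML and returns its string value.
    static String evaluate(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // One expression replaces the customers -> account -> address -> state walk.
        System.out.println(evaluate("/customers/account[1]/address/state/text()")); // prints CA
    }
}
```

The equivalent DOM code would have to locate the first account element, then its address child, then the state child, and finally read its text content.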

XPath also supports more complex expressions. For example, an account node can be selected from the XML document above by username and password rather than by index:

/customers/account[attribute::username ='rjaamour'][attribute::password='mypass']/ssn


This returns the element

<ssn>123456789</ssn>.


Such expressions enable programmers to construct the XPath expressions dynamically based on variables or input from the user. For example, assume you have the following Java code:

...
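public AccountInfo getAccountInfo(String username, String password) {

    // VULNERABLE: user input is concatenated straight into the XPath expression.
    String xpath = "/customers/account[attribute::username='" + username
            + "'][attribute::password='" + password + "']/ssn";

    // XPathAPI is presumably Xalan's org.apache.xpath.XPathAPI.
    Node ssn = XPathAPI.selectSingleNode(documentNode, xpath);

    // No account matched the supplied username/password pair.
    if (ssn == null) {
        return null;
    }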
        ...
}

This code is expected to return the SSN element for the username and password passed in as parameters, possibly taken from a Web form or a SOAP message parameter accessible from the outside world. At first glance, the code appears to return null unless a correct username/password pair is provided. However, if a malicious user passes ' or '1'='1 as the password parameter value, the quote that the code appends closes the injected string, and the XPath that is executed becomes:

/customers/account[attribute::username='rjaamour'][attribute::password='' or '1'='1']/ssn

The SSN node for the username "rjaamour" is then retrieved without a valid password, because or '1'='1' makes the password predicate evaluate to true for every account.
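The attack can be reproduced with a small, self-contained sketch. It uses the standard javax.xml.xpath API rather than the XPathAPI class from the code above, and the names InjectionDemo, buildExpression and lookupSsn, as well as the one-account XML, are assumptions made for illustration. The payload is ' or '1'='1, so that the closing quote appended by the code completes the injected string literal.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: deliberately vulnerable, do not reuse.
final class InjectionDemo {

    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\"><ssn>123456789</ssn></account>"
        + "</customers>";

    // Vulnerable on purpose: both inputs are concatenated into the expression.
    static String buildExpression(String username, String password) {
        return "/customers/account[attribute::username='" + username
             + "'][attribute::password='" + password + "']/ssn";
    }

    // Returns the SSN text, or an empty string when no node matches.
    static String lookupSsn(String username, String password) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                    buildExpression(username, password),
                    new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + lookupSsn("rjaamour", "wrong") + "]");        // prints []
        System.out.println("[" + lookupSsn("rjaamour", "' or '1'='1") + "]"); // prints [123456789]
    }
}
```

The second call supplies no valid password, yet the SSN comes back because the injected text changes the structure of the predicate rather than the value being compared.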

As you can see, the concept of XPath injection is very similar to SQL injection. XPath injections are just another embodiment of injection attacks in general.

To prevent XPath injection, use an approach similar to the one used against SQL injection: validate and sanitize user input before it is passed to an expression. In most cases, enforcing access control inside an XPath expression, as in the example above, should be avoided altogether, for instance by giving the code XML content that has already been filtered according to authorization rules and only then querying it with XPath.
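One technique that the answer does not name, but that mirrors prepared statements in SQL, is to keep user input out of the expression text entirely and bind it through XPath variables. The sketch below does this with javax.xml.xpath.XPathVariableResolver from the standard Java API; the class ParameterizedLookup, the variable names $username and $password and the inlined XML are assumptions for the example, and Map.of requires Java 9 or later.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical class showing variable binding instead of string concatenation.
final class ParameterizedLookup {

    static final String XML =
          "<customers>"
        + "<account username=\"rjaamour\" password=\"mypass\"><ssn>123456789</ssn></account>"
        + "</customers>";

    // The expression text is constant; user input never becomes part of it.
    private static final String EXPRESSION =
        "/customers/account[@username=$username][@password=$password]/ssn";

    // Returns the SSN text, or an empty string when no account matches.
    static String lookupSsn(String username, String password) {
        Map<String, Object> values = Map.of("username", username, "password", password);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $username and $password to the caller-supplied strings.
        xpath.setXPathVariableResolver(name -> values.get(name.getLocalPart()));
        try {
            return xpath.evaluate(EXPRESSION, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + lookupSsn("rjaamour", "mypass") + "]");      // prints [123456789]
        System.out.println("[" + lookupSsn("rjaamour", "' or '1'='1") + "]"); // prints []
    }
}
```

Because the payload is compared as an ordinary string value, its quotes and the "or" keyword are never parsed as XPath syntax. This complements input validation and does not replace it.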

If, however, you cannot avoid executing risky XPaths, create a "whitelist" of allowable characters and values. For example, allow only alphanumeric characters, or allow special characters other than quotes. Reject any input that does not comply with the whitelist, or, when dangerous characters must be allowed, sanitize it properly (by encoding or escaping, for example). Beyond compromises of confidentiality and access control, beware of malicious XPath parameters that could harm the system in other ways, such as a value that makes an XPath return a very large number of nodes, possibly causing a denial of service (DoS).
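A whitelist check of this kind could look like the following minimal sketch. The class InputWhitelist, its method names and the specific rule (1 to 32 ASCII letters or digits) are assumptions chosen for illustration; a real application would pick the character set and length limit that fit its own data.

```java
import java.util.regex.Pattern;

// Hypothetical validator: accepts only input that matches the whitelist.
final class InputWhitelist {

    // Assumed rule: 1 to 32 ASCII letters or digits, nothing else.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9]{1,32}");

    // True only when the whole input consists of whitelisted characters.
    static boolean isAllowed(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    // Returns the input unchanged, or rejects it before it reaches any XPath.
    static String requireAllowed(String input) {
        if (!isAllowed(input)) {
            throw new IllegalArgumentException("Input contains characters outside the whitelist");
        }
        return input;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("rjaamour"));    // prints true
        System.out.println(isAllowed("' or '1'='1")); // prints false
    }
}
```

Calling requireAllowed on the username and password before building the expression rejects quote-based payloads outright, and the length limit keeps oversized values out of the query.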

This was first published in October 2006
