XSS vector: URLs

Injection is an attack that involves breaking out of a data context and switching into a code context through the use of special characters that are significant in the interpreter being used. - OWASP

Many PHP applications HTML-encode all untrusted data with htmlentities(), irrespective of the context in which it is output. This is a problem because htmlentities() does not mitigate certain XSS injections. One example is any data that is output as a URL:

[php]echo '<a href="', htmlentities( 'javascript:alert("xss");' ), '">XSS</a>';[/php]

In this instance htmlentities() is not sufficient protection, because the javascript: scheme contains no characters that it encodes. The above outputs:

[html]<a href="javascript:alert(&quot;xss&quot;);">XSS</a>[/html]

To prevent this injection, URLs should be validated on input and encoded with htmlentities() on output.

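[php]
$url = 'http://hostname/path?arg=value';

$parsedUrl = parse_url( $url );

// parse_url() returns false for seriously malformed URLs and omits 'scheme' for relative ones
$scheme = ( is_array( $parsedUrl ) && isset( $parsedUrl['scheme'] ) ) ? strtolower( $parsedUrl['scheme'] ) : '';

if( $scheme !== 'http' && $scheme !== 'https' ) {
    // reject URL
} else {
    // $mysqli is assumed to be an open mysqli connection
    $escapedUrl = mysqli_real_escape_string( $mysqli, $url );
    $sql = "INSERT INTO table (url) VALUES ('$escapedUrl')";
    // insert query
}

...

echo '<a href="', htmlentities( $url ), '">XSS</a>', "\n";
[/php]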
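The scheme check can also be wrapped in a small reusable function. The following is only a sketch: the name isSafeUrl() and its $allowedSchemes parameter are hypothetical and do not come from any library, and the type declarations assume PHP 7 or later. It uses an allowlist, so anything that is not explicitly http or https is rejected, including relative URLs.

```php
<?php
// Hypothetical helper (an illustration, not a library function):
// accept a URL only if its scheme is on an allowlist.
function isSafeUrl(string $url, array $allowedSchemes = ['http', 'https']): bool
{
    $parts = parse_url($url);

    // parse_url() returns false for seriously malformed URLs
    // and omits 'scheme' for relative or scheme-less input.
    if ($parts === false || !isset($parts['scheme'])) {
        return false;
    }

    // Schemes are case-insensitive, so normalise before comparing.
    return in_array(strtolower($parts['scheme']), $allowedSchemes, true);
}

var_dump(isSafeUrl('http://hostname/path?arg=value')); // bool(true)
var_dump(isSafeUrl('javascript:alert("xss");'));       // bool(false)
```

An allowlist is used rather than a blocklist of known-bad schemes because browsers support several script-capable schemes, and rejecting everything unexpected avoids having to enumerate them.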

The URL now stored in the database should still be output using htmlentities() to encode quote marks that could inject further code, such as in this example:

[html]http://www.test.com/" onClick="javascript:alert('xss');"[/html]
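To see why the output encoding still matters, here is a minimal sketch that passes the URL above through htmlentities(). The variable names are illustrative, and ENT_QUOTES with an explicit UTF-8 charset is an assumption added here so that single quotes are encoded as well as double quotes. Note that this URL has an http scheme, so a scheme check alone would accept it.

```php
<?php
// A URL with a valid http scheme that tries to break out of the href attribute.
$url = 'http://www.test.com/" onClick="javascript:alert(\'xss\');"';

// ENT_QUOTES encodes both double and single quotes.
$encoded = htmlentities($url, ENT_QUOTES, 'UTF-8');

echo '<a href="', $encoded, '">XSS</a>', "\n";
// <a href="http://www.test.com/&quot; onClick=&quot;javascript:alert(&#039;xss&#039;);&quot;">XSS</a>
```

Because the quote marks are now entities, the browser treats the whole string as the href value and no onClick attribute is created.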

For further information about XSS, the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet offers a list of XSS Prevention Rules, while RSnake's XSS (Cross Site Scripting) Cheat Sheet is the definitive list of XSS injection test strings.