Protecting your Site

Page 1 - Validation and Cross-site Scripting

Aug 6

For the majority of sites it is essential that they are secure, or at least reasonably safe from hacking. It is impossible to state that a site cannot be hacked, because given enough time any site could be broken. This makes it necessary to find a comfortable balance between the level of security and the resources spent on achieving it.

The first step in avoiding hacking attempts is to make sure all input is validated. Session variables and GET and POST variables should all be checked to make sure they contain the type of content that is expected of them. For example, a variable containing a telephone number should only contain digits.

Validation also helps to avoid "Cross-site Scripting" (referred to as XSS). Cross-site scripting attempts typically use input boxes, such as search fields, to send HTML or JavaScript so that when the POST or GET variable is printed back into a page, the browser runs it and the page behaves abnormally. Submitted PHP code is not executed simply by being printed; it only becomes a risk if the input reaches something like eval() or an include.

The cleverer hacker will not be deterred when their initial attempt fails, and may encode the same attack using unusual or malformed multi-byte sequences. If the escaping function and the browser disagree about the character encoding, htmlspecialchars() and/or strip_tags() on their own may not catch it. This is why the encoding should be normalised before htmlspecialchars() and strip_tags() are applied, which is done here with utf8_decode(). Note that utf8_decode() converts UTF-8 to ISO-8859-1, replaces any character that ISO-8859-1 cannot represent with a question mark, and is deprecated as of PHP 8.2, so it only suits sites that serve ISO-8859-1 pages. A typical method of sanitising the content would be as follows (where $content contains a returned field that is allowed to contain a string).

$content = htmlspecialchars(utf8_decode($content));
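For sites that serve UTF-8 pages, the same idea can be sketched without utf8_decode(). This is a minimal sketch, not the article's method: it assumes the mbstring extension is available, and the function name sanitise_for_html() is hypothetical. It rejects input that is not valid UTF-8 and then escapes with the charset stated explicitly.

```php
<?php
// Hypothetical helper (an assumption, not from the article).
// Returns HTML-safe text, or null when the input is not valid UTF-8.
function sanitise_for_html(string $content): ?string
{
    // Reject malformed byte sequences instead of converting them.
    if (!mb_check_encoding($content, 'UTF-8')) {
        return null;
    }

    // Escape & < > " and ', naming the charset so PHP and the page agree.
    return htmlspecialchars($content, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo sanitise_for_html('<script>alert("x")</script>'), "\n";
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Rejecting bad input outright is a design choice; a site could instead log the attempt or substitute a placeholder.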

If no HTML code or PHP code is allowed in the content, then htmlspecialchars() should be replaced with strip_tags(), which removes the tags rather than escaping them. strip_tags() can also be told which tags to leave in place through its second argument, and specific strings can be converted into replacements of your choosing with the strtr() function. In order to decide which of these techniques to use, it is important to first understand what each one does.

htmlspecialchars()
Encodes the characters & < > " and, when ENT_QUOTES is set (the default since PHP 8.1), '
htmlentities()
Converts every character that has a named HTML entity into that entity
strip_tags()
Removes all PHP and HTML tags from the string
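The differences between these functions are easiest to see side by side. The following is a small illustrative sketch; the sample input and variable names are made up for the example, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Sample input, invented for illustration.
$input = '<b>Café</b> & "tea"';

// Escapes only the characters that are special in HTML.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
// Also converts characters such as é that have a named entity.
$entities = htmlentities($input, ENT_QUOTES, 'UTF-8');
// Drops the tags and keeps the text between them.
$stripped = strip_tags($input);
// The second argument lists tags that are allowed to stay.
$keepBold = strip_tags('<p><b>Hi</b></p>', '<b>');
// strtr() swaps exact strings for replacements.
$swapped = strtr('<b>Hi</b>', ['<b>' => '<strong>', '</b>' => '</strong>']);

echo $escaped, "\n";   // &lt;b&gt;Café&lt;/b&gt; &amp; &quot;tea&quot;
echo $entities, "\n";  // &lt;b&gt;Caf&eacute;&lt;/b&gt; &amp; &quot;tea&quot;
echo $stripped, "\n";  // Café & "tea"
echo $keepBold, "\n";  // <b>Hi</b>
echo $swapped, "\n";   // <strong>Hi</strong>
```

Note that strip_tags() leaves the attributes of any allowed tag untouched, so allowing tags is only safe for content you would otherwise trust.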

One further check should be made on data that has been passed in through a superglobal: ensure that any serialised data has not been tampered with. This can be done by keeping a checksum of the data in the session. Although this cannot guarantee its validity, it does offer one more step to protect the data.
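One possible way to build such a checksum is sketched below. It uses a keyed hash (HMAC) rather than a plain checksum, which is a substitution made for this example and not something the article specifies. The constant CHECKSUM_KEY and both function names are hypothetical.

```php
<?php
// Hypothetical secret; in practice it would come from private configuration.
const CHECKSUM_KEY = 'replace-with-a-long-random-secret';

// Computes a keyed SHA-256 hash of the serialised data.
function checksum_for(string $serialised): string
{
    return hash_hmac('sha256', $serialised, CHECKSUM_KEY);
}

// Returns true only if the data still matches the stored checksum.
function is_untampered(string $serialised, string $expected): bool
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals($expected, checksum_for($serialised));
}

$data = serialize(['user' => 42, 'role' => 'editor']);
// The article suggests keeping this value in the session,
// for example in $_SESSION['checksum'].
$stored = checksum_for($data);

$tampered = str_replace('editor', 'admin', $data);

var_dump(is_untampered($data, $stored));      // bool(true)
var_dump(is_untampered($tampered, $stored));  // bool(false)
```

The check should run before the data is passed to unserialize(), so that altered data is never unpacked.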

Further validation of the content type of variables can be done in PHP using the character type (ctype) functions. Each one takes a string and returns true only if every character in it matches.

ctype_alnum
Check for alphanumeric character(s)
ctype_alpha
Check for alphabetic character(s)
ctype_cntrl
Check for control character(s)
ctype_digit
Check for numeric character(s)
ctype_graph
Check for any printable character(s) except space
ctype_lower
Check for lowercase character(s)
ctype_print
Check for printable character(s)
ctype_punct
Check for any printable character which is not whitespace or an alphanumeric character
ctype_space
Check for whitespace character(s)
ctype_upper
Check for uppercase character(s)
ctype_xdigit
Check for character(s) representing a hexadecimal digit
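As an example of these functions in use, the telephone number check mentioned earlier could be sketched as follows. The helper name is_valid_phone() is hypothetical, and the sketch assumes the ctype extension is enabled, which it is by default.

```php
<?php
// Hypothetical helper: accepts only a non-empty string made up of digits.
function is_valid_phone($value): bool
{
    // The ctype functions expect strings, so check the type first.
    return is_string($value) && ctype_digit($value);
}

var_dump(is_valid_phone('01234567890'));  // bool(true)
var_dump(is_valid_phone('01234 567890')); // bool(false), contains a space
var_dump(is_valid_phone(''));             // bool(false), empty string
var_dump(ctype_alnum('abc123'));          // bool(true)
var_dump(ctype_xdigit('ff00cc'));         // bool(true)
```

A real telephone field would probably need to allow spaces or a leading plus sign as well, in which case those characters could be removed before the check.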