What is input validation?
Input validation is the act of checking that data entering your software is in a valid format.
We also use input validation simply to ensure the data matches what the business wants it to be. If we want a phone number then it should look like one!
Input validation is good at limiting risk because it reduces the types of characters available to an attacker, although it has to be used with care to ensure it isn’t restricting genuine input.
Good examples of input validation are specific and strictly limit data to what is reasonable, for example:
- A phone number can contain only numbers, hyphens and brackets and should be no longer than 20 characters
- A postcode cannot be more than 12 characters long and can contain numbers, letters and spaces
- An ID must be formatted as a Guid (e.g. “160702a4-7eb1-44e9-ba6e-4c6886b67ca8”)
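The three rules above could be sketched in Python roughly as follows. The function and constant names are made up for illustration, and the patterns take the rules literally (so a phone number containing spaces is rejected, because spaces aren't in the rule). Treat it as a starting point under those assumptions rather than a finished validator.

```python
import re

# Digits, hyphens and brackets only, between 1 and 20 characters.
PHONE_RE = re.compile(r"[0-9()\-]{1,20}")
# Letters, digits and spaces only, between 1 and 12 characters.
POSTCODE_RE = re.compile(r"[A-Za-z0-9 ]{1,12}")
# Hyphenated 8-4-4-4-12 hexadecimal layout, as in the example GUID.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_phone(value: str) -> bool:
    # fullmatch() requires the whole string to match, not just a prefix.
    return PHONE_RE.fullmatch(value) is not None


def is_valid_postcode(value: str) -> bool:
    return POSTCODE_RE.fullmatch(value) is not None


def is_valid_guid(value: str) -> bool:
    return GUID_RE.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` or `search` matters here: a pattern that only has to match part of the input would happily accept a valid phone number followed by an attack string.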
Bad examples of input validation:
- The person’s name must be a string – there are lots of characters that wouldn’t reasonably be in a person’s name that could be used as an attack
- The person’s name must contain only the characters A-Z – this is a whitelist, but one so narrow that it hurts usability. People with non-English names may have accented characters (Renée), and anyone with a hyphen (Johnson-Lynn) or apostrophe (O'Neill) in their name will incorrectly fail validation.
- A person’s address must not contain the characters “£$%^&*-” – this is a blacklist. A whitelist is typically better because you’re being more specific about what you’ll accept.
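A name check that sits between "any string" and "A-Z only" might look something like the sketch below. It accepts any Unicode letter plus a small set of punctuation. The 100 character limit, the exact punctuation set and the function name are my own assumptions for illustration, so adjust them to what is reasonable for your users.

```python
import unicodedata

# Assumed limit for illustration; pick whatever is reasonable for your data.
MAX_NAME_LENGTH = 100
# Hyphen, straight apostrophe, space and typographic apostrophe.
ALLOWED_NAME_PUNCTUATION = frozenset("-' \u2019")


def is_valid_name(value: str) -> bool:
    # Normalise first so "e" + combining accent becomes a single letter.
    name = unicodedata.normalize("NFC", value)
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    # Require at least one letter so "---" is not accepted as a name.
    if not any(ch.isalpha() for ch in name):
        return False
    # Every character must be a letter (in any script) or allowed punctuation.
    return all(ch.isalpha() or ch in ALLOWED_NAME_PUNCTUATION for ch in name)
```

With this rule Renée, Johnson-Lynn and O'Neill all pass, while input containing characters such as `<`, `;` or `)` is rejected.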
Regular expressions (regex) are often used for input validation. Some good regex samples can be found at OWASP.
Input validation can be used in web pages to stop people entering values you don’t want. This provides a good user experience by quickly showing the user their data is wrong. On its own, though, it doesn’t provide any protection to your web service, because an attacker can skip the page and send requests straight to the server. The same validation must also be applied when the data is accepted by the server.
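As a rough, framework-agnostic sketch of that server-side check, the code below re-validates a submitted form regardless of what the web page already did. The field names, the error messages and the decision to reject unexpected fields are assumptions made for the example, not something any particular web framework prescribes.

```python
import re
from typing import Callable, Dict, Mapping


def _matches(pattern: str) -> Callable[[str], bool]:
    """Build a checker that requires the whole value to match the pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


# Hypothetical form fields, each with its own rule.
SERVER_SIDE_RULES: Dict[str, Callable[[str], bool]] = {
    "phone": _matches(r"[0-9()\-]{1,20}"),
    "postcode": _matches(r"[A-Za-z0-9 ]{1,12}"),
}


def validate_submission(form: Mapping[str, object]) -> Dict[str, str]:
    """Return a dict of field name to error message; empty means valid."""
    errors: Dict[str, str] = {}
    for field, is_valid in SERVER_SIDE_RULES.items():
        value = form.get(field)
        # Missing fields and non-string values fail just like bad strings.
        if not isinstance(value, str) or not is_valid(value):
            errors[field] = "invalid or missing value"
    for field in form:
        # Anything the server did not ask for is treated as suspicious.
        if field not in SERVER_SIDE_RULES:
            errors[field] = "unexpected field"
    return errors
```

The server would call `validate_submission` on every request and refuse to process the data unless the result is empty, whether or not the browser ran the same checks first.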
What can go wrong if we don’t validate input?
What could go wrong inside our service:
- SQL injection
- Command injection
- Lots of other things that we haven’t even thought of yet
What else could go wrong?
Input validation can provide protection at many levels. It can protect against known attacks such as SQL injection, but it also has the potential to protect against attacks that are as yet unknown. Attacks often rely on non-standard characters (like <&~|;), so if you only allow letters, numbers and a few basic pieces of punctuation then you’re limiting the likelihood of some attacks.
Input validation is an important part of defence in depth: it covers known vulnerabilities and potentially helps protect against as yet unknown attacks. The best part is that it’s incredibly simple to implement*. You apply a regular expression to all input fields and feel all warm and content that you’ve got a little more depth to your security.
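To make the "letters, numbers and a few basic pieces of punctuation" idea concrete, here is a small sketch that reports which characters in a value fall outside an allowed set. The particular set (ASCII letters, digits, space, full stop, comma, apostrophe and hyphen) and the function name are assumptions for illustration. An ASCII-only set like this would suit something like a reference code, not a person's name.

```python
import string
from typing import List

# Assumed allow-list: ASCII letters, digits and a little punctuation.
ALLOWED_CHARACTERS = frozenset(string.ascii_letters + string.digits + " .,'-")


def disallowed_characters(value: str) -> List[str]:
    """Return the distinct characters in value that are not allowed, sorted."""
    return sorted({ch for ch in value if ch not in ALLOWED_CHARACTERS})


def is_allowed(value: str) -> bool:
    # A value passes only if nothing in it falls outside the allow-list.
    return not disallowed_characters(value)
```

Every character in `<&~|;` is reported by `disallowed_characters`, so input relying on any of them is rejected without the code needing to know which attack it belonged to. Note that the apostrophe is in the allowed set here, so this check alone would not stop every SQL injection attempt. It is one layer, not the whole defence.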
*OK, regular expressions aren’t always simple.