Saturday, June 25, 2016

Information Security: Output Encoding and Error Messaging

Continuing from my last post on the importance of input validation in information security, I will discuss output encoding and error messaging with you, from the point of view of system development. Again, for ease of reading, this is structured as a set of questions and answers that follow a conversation:
 
  • What is output encoding?
    Output encoding is the process of converting information or instructions emitted by a component or a service into a particular format. The underlying idea is that the output, once emitted, may have to go through multiple intermediate hops before reaching its final destination, and having it properly encoded leaves little scope for attacks.

  • What happens due to improper output encoding?
    Improper or missing output encoding is the root of many implementation flaws, since data that is safe in one context may not have been encoded for safety in another context. It is important to remember that most interpreted languages and structured document formats like HTML or XML carry information in a specified structure, and that structure can break.

  • What happens if the structure is not adhered to?
    When the structure is not adhered to, injection flaws can occur at the point where data is passed from one interpreted structure to another system. This opens the door to vulnerabilities like XSS and other injection-based threat vectors.

  • How to ensure that output is properly encoded?
    For proper encoding, it is important to study the structure of your output and identify the troublesome characters that can be used to break out from data into structure. Delimiting characters in strings, and the escape characters used to encode dangerous or incompatible characters, are high on the list. Any other delimiters introduced by your systems and contracts should also be checked for proper encoding.

  • How should one proceed with encoding?
    Encoding of troublesome characters should always be done using a standard library. It is a best practice to encode anything that may be dangerous in nature, i.e., anything that is not alphanumeric.
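
As a minimal sketch of what "use a standard library" can look like, the Python snippet below encodes the same untrusted string for three different output contexts. The function names are hypothetical and chosen only for illustration; the calls themselves (`html.escape`, `urllib.parse.quote`, `json.dumps`) come from the Python standard library.

```python
import html
import json
from urllib.parse import quote


def encode_for_html(text: str) -> str:
    # HTML context: <, >, & and both quote characters become character references.
    return html.escape(text, quote=True)


def encode_for_url(text: str) -> str:
    # URL component context: percent-encode everything except letters,
    # digits and the characters "_.-~".
    return quote(text, safe="")


def encode_for_json(text: str) -> str:
    # JSON string context: the serializer escapes quotes, backslashes
    # and control characters.
    return json.dumps(text)


untrusted = '<script>alert("hi")</script> & more'
print(encode_for_html(untrusted))
print(encode_for_url(untrusted))
print(encode_for_json(untrusted))
```

The point of the sketch is that the right encoder depends on where the data is going, so one value may need a different encoding for each destination.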

  • Are there any testing strategies for proper output encoding?
    Yes, definitely. You should thoroughly test your output encoding by unit testing your validation and encoding functions. Every input field, hidden or visible, needs to be covered by the test cases. It is recommended that these tests be run on a regular basis, since downstream and upstream systems may change regularly.
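
One possible shape for such a test, assuming a hypothetical `render_comment` helper that places user text inside an HTML paragraph, is sketched below with Python's `unittest`. The payload list is illustrative and far from exhaustive.

```python
import html
import unittest


def render_comment(comment: str) -> str:
    """Hypothetical view helper: wraps user text in a paragraph tag."""
    return "<p>" + html.escape(comment, quote=True) + "</p>"


class RenderCommentEncodingTest(unittest.TestCase):
    # Payloads that try to break out of data into structure.
    PAYLOADS = [
        "<script>alert(1)</script>",
        '"><img src=x onerror=alert(1)>',
        "' onmouseover='alert(1)",
        "Tom & Jerry",
    ]

    def test_payloads_cannot_introduce_markup(self):
        for payload in self.PAYLOADS:
            with self.subTest(payload=payload):
                rendered = render_comment(payload)
                inner = rendered[len("<p>"):-len("</p>")]
                # No raw structural character survives inside the data.
                for ch in "<>\"'":
                    self.assertNotIn(ch, inner)
                # Encoding is lossless: decoding returns the original text.
                self.assertEqual(html.unescape(inner), payload)
```

Saved in a test module, this can be run with `python -m unittest` as part of a regular build, which is one way to notice when an upstream or downstream change breaks the encoding.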

  • What is meant by Error Messaging?
    Typically, attackers gain additional insight from the kind of error messages the system provides them. It is therefore important to provide only the relevant information, keeping the messaging generic enough to not impart anything extra to an attacker. For example, consider the use case of recovering an account by email. Here, the email ID that was entered may or may not have an account. A system could display the messages "Instructions successfully sent to email ID" and "Account does not exist" respectively. However, a generic message along the lines of "If an account exists, instructions will be sent to the email ID" works better, as it does not give any additional ammo to the attacker.
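
The account recovery example can be sketched as follows. Everything here is hypothetical: `known_accounts` is a plain set and `outbox` a plain list standing in for a real user store and mail queue.

```python
GENERIC_REPLY = "If an account exists, instructions will be sent to that email ID."


def request_password_reset(email, known_accounts, outbox):
    """Hypothetical handler: same reply whether or not the account exists."""
    normalized = email.strip().lower()
    if normalized in known_accounts:
        # Stand-in for queueing the real recovery email.
        outbox.append(normalized)
    # The caller cannot tell from the reply which branch was taken.
    return GENERIC_REPLY


accounts = {"alice@example.com"}
outbox = []
print(request_password_reset("Alice@example.com ", accounts, outbox))
print(request_password_reset("mallory@example.com", accounts, outbox))
```

Both calls print the same sentence, while only the first one places a message in the outbox.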

  • How do we repel attackers effectively?
    To repel attackers, give them no additional ammunition, as in the example described above. Attackers focus on finding exceptional conditions within the system that they can exploit, and the text of error messages helps users, developers, and attackers alike. In practice, this translates into gracefully handling all exceptions and making error messages generic.
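
A rough sketch of this idea, assuming a hypothetical `handle_request` wrapper around request handlers: the full exception goes to a server-side log, and the client only sees a generic message plus an opaque incident ID that support staff can look up.

```python
import logging
import uuid

logger = logging.getLogger("app")


def handle_request(handler, *args):
    """Run a hypothetical handler; never let exception text reach the client."""
    try:
        return {"status": "ok", "result": handler(*args)}
    except Exception:
        incident_id = uuid.uuid4().hex
        # Stack trace and exception message stay in the server-side log.
        logger.exception("Unhandled error, incident %s", incident_id)
        return {
            "status": "error",
            "message": "Something went wrong. Please try again later.",
            "incident_id": incident_id,
        }
```

The incident ID is a design choice rather than a requirement: it lets developers correlate a user report with the detailed log entry without exposing those details in the response.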

  • So what is the right time to do proper Error Messaging?
    Having the right error messaging is a design-time problem. It is important to define the error codes or exceptions that your systems or services can emit early on, during the design of your components. It is equally important to define conventions on when and how errors are to be caught, and to ensure through extensive code reviews that these conventions are always followed. Another best practice is to ensure errors are always handled in a consistent fashion.
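
One way to capture such a design-time decision in code is a small catalog of error codes with a single conversion point, sketched below. The codes, messages, and names (`ErrorCode`, `ServiceError`, `to_public_response`) are all made up for illustration.

```python
from enum import Enum


class ErrorCode(Enum):
    """Hypothetical catalog agreed on at design time: (code, public message)."""
    INVALID_INPUT = ("E1001", "The request could not be processed.")
    NOT_AUTHORIZED = ("E2001", "You are not allowed to perform this action.")
    INTERNAL = ("E9000", "Something went wrong. Please try again later.")

    @property
    def code(self):
        return self.value[0]

    @property
    def public_message(self):
        return self.value[1]


class ServiceError(Exception):
    """Exception type that services raise; detail is for internal logs only."""

    def __init__(self, error_code, detail=""):
        super().__init__(detail)
        self.error_code = error_code
        self.detail = detail


def to_public_response(exc):
    """Single place where any exception becomes a client-facing payload."""
    if isinstance(exc, ServiceError):
        error_code = exc.error_code
    else:
        # Anything unexpected maps to the generic internal error.
        error_code = ErrorCode.INTERNAL
    return {"code": error_code.code, "message": error_code.public_message}
```

Because every error passes through one function, reviewers have a single convention to check, and the internal `detail` never reaches the response.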

So far, we have covered how to make sure that a) our systems get validated inputs, b) they emit encoded outputs and c) they don't provide additional information to attackers by having the right error messaging. In the next few posts, we will look at threat models.

Sunday, June 12, 2016

Information Security: Input validation

Working at startups, I've realized that security is one of the last things on a developer's mind. As long as the system has not been hacked, it does not seem to matter. So recently, while researching more on security, I came across a certification called CISA, or Certified Information Systems Auditor. It is an industry-standard certification issued by ISACA for the people in charge of ensuring that an organization's IT and business systems are monitored, managed and protected.

Reading up more on the examination, I came across certain information security concepts that I believe are applicable to any system design in general. So I will discuss them in detail in this series of posts on Information Security.

Let us begin with input validation - it is the first line of defense that you have against any threat actor targeting your systems and services. To make the post more readable for anyone new to the topic, I've structured it as a list of questions and answers that follow a conversation:
  • The first question that comes to mind: what is input validation?
    Input validation is the process of ensuring that data that has been passed is both correct and useful for the purpose for which it is being collected.

  • Why is input validation important?
    Input validation is important because, when not done right, it leaves applications vulnerable. Exploits like buffer overflow, directory traversal, cross-site scripting and SQL injection are just a few of the attacks that can result from improper data validation.

  • Where should we validate input?
    People are often confused about where to validate inputs: on the client side or on the server side. It is important to remember that JavaScript can be disabled on the client side, so client-side checks alone cannot be trusted. It is best to validate your inputs on both the client and the server.

  • What data should be validated?
    It is important to validate all data received from a user. While the average user may not be malicious, remember that they may be accessing your products and services from a compromised system or network. This means all form data, hidden fields, cookie data, HTTP headers, and anything else of importance within the HTTP request should be validated.

  • What all should we validate from the input?
    It is important to remember that input has meaning only when it is in an interpretable format, since it may have to be transferred over the wire in a custom format. Therefore, both the syntax and the semantics of the input must be verified.

  • What all should be done while performing syntactic validation?
    For syntactic validation, it is important to:
    • Identify and validate the structure of the input - what goes into it and what does not
      • The structure of any special symbols needs to be enforced
      • The input needs to follow the syntax defined for it
    • Standardize the encoding - it could be base64, or any custom implementation based on the data being sent

  • What should happen to other inputs?
    Anything that does not pass strict syntactic validation should be rejected. Common validations include checking that values are within bounds, that numbers, text and text lengths are in acceptable ranges, and that dates and other data follow the specified format.
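
The checks listed above might look like the following sketch for a hypothetical sign-up form. The field names, the username pattern, and the age range are assumptions made for the example, and the form is assumed to be a dictionary of strings.

```python
import re
from datetime import datetime

# Allow-list pattern: only letters, digits and underscore, 3 to 20 characters.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_signup(form):
    """Return the names of fields that fail syntactic validation."""
    errors = []

    # Structure and length: the whole value must match the allow-list.
    if not USERNAME_RE.fullmatch(form.get("username", "")):
        errors.append("username")

    # Numeric bounds: ASCII digits only, short enough to parse, within range.
    age = form.get("age", "")
    if not (age.isascii() and age.isdigit() and len(age) <= 3
            and 13 <= int(age) <= 120):
        errors.append("age")

    # Date format: must parse as year-month-day and be a real calendar date.
    try:
        datetime.strptime(form.get("birth_date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("birth_date")

    return errors
```

A request would be rejected whenever the returned list is non-empty; for instance, a username of `a; DROP TABLE` fails the allow-list and `2016-02-30` fails the date check.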

  • What should be done during semantic validation?
    Semantics relates to meaning in language or logic. As such, one needs to check not only the structure of the data, but also its meaning. For example, if an API accepts dictionaries, it is important to validate that the right kind of dictionaries are being passed around, and not just dictionaries with any data fields.
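
To illustrate the dictionary example, here is a sketch of semantic checks on a hypothetical booking dictionary. The field names and business rules (check-out after check-in, 1 to 4 guests) are invented for the example.

```python
from datetime import date

EXPECTED_FIELDS = {"room_id", "check_in", "check_out", "guests"}


def validate_booking(booking):
    """Semantic checks for a hypothetical booking dictionary."""
    # The right kind of dictionary: exactly the agreed fields, nothing extra.
    if set(booking) != EXPECTED_FIELDS:
        return ["unexpected or missing fields"]
    if not (isinstance(booking["check_in"], date)
            and isinstance(booking["check_out"], date)):
        return ["dates have the wrong type"]

    errors = []
    # Each date may be valid on its own, yet the pair must also make sense.
    if booking["check_out"] <= booking["check_in"]:
        errors.append("check_out must be after check_in")
    guests = booking["guests"]
    if type(guests) is not int or not 1 <= guests <= 4:
        errors.append("guests out of range")
    return errors
```

Note that a booking with a check-out date before its check-in date is syntactically fine, since both values are valid dates, and is only caught at this semantic step.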

So, if you have done your input validation right, you are already protected from a large number of the attacks that come from accepting incorrect inputs.