Saturday, December 31, 2016

Information Security: Data Flow Diagrams

Continuing from my last post on STRIDE and threat models, I will discuss Data Flow Diagrams (DFDs) in this post. Data Flow Diagrams are useful tools in threat modeling that capture system detail and data flow information as common elements in a standard format. At a high level, a DFD can be built through a GUI by the end user, with the threat model saved to the file system.

The following are the four major components of a DFD:
  • Process Elements
    • Any element that is an active processor of data
    • These elements are the focus of DFD
    • There must be a sensible breakdown of system processes
    • Represented by a single circle
  • External Interaction Elements
    • These refer to the original points of interaction, and also the final destination
    • For example: valid end users, authenticated third-party systems, etc.
    • These interactors can input, read and update data (CRUD operations)
    • While these are similar to Process Elements, they are beyond the control and scope of the system developers
    • It is important that the threat model contains a summary of the interactions with them
    • A list of security promises made by the elements should also be documented
    • The role that each interactor plays must also be captured in the threat model
    • Represented by rectangles
  • Data Store Elements: Passive Storage
    • These elements represent that data is being stored, however, no computation is being performed on it, and hence, no processing happens here
    • Represented by horizontal parallel lines (imagine a rectangle with its vertical sides missing)
  • Data Flow Elements
    • These elements represent inputs and outputs between the other elements
    • Represented by single-directional arrows
Representation of different elements
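As a rough illustration of the four element types above, here is a minimal Python sketch. The class names (`ElementKind`, `Element`, `DataFlow`) and the web application being modeled are my own assumptions for the example, not part of any standard tool.

```python
from dataclasses import dataclass
from enum import Enum


class ElementKind(Enum):
    """The four DFD element types, with the shape used to draw each."""
    PROCESS = "single circle"
    EXTERNAL_INTERACTOR = "rectangle"
    DATA_STORE = "horizontal parallel lines"
    DATA_FLOW = "single-directional arrow"


@dataclass(frozen=True)
class Element:
    name: str
    kind: ElementKind


@dataclass(frozen=True)
class DataFlow:
    """A one-way flow of data between two elements."""
    source: Element
    target: Element
    label: str
    kind: ElementKind = ElementKind.DATA_FLOW


# Hypothetical web application, used only for illustration.
user = Element("End user", ElementKind.EXTERNAL_INTERACTOR)
web_app = Element("Web app", ElementKind.PROCESS)
database = Element("Orders DB", ElementKind.DATA_STORE)

# A two-way exchange is modeled as two separate single-directional flows.
flows = [
    DataFlow(user, web_app, "order request"),
    DataFlow(web_app, database, "insert order"),
    DataFlow(database, web_app, "order record"),
    DataFlow(web_app, user, "order confirmation"),
]


def describe(flow: DataFlow) -> str:
    """Render a flow as 'source -> target: label'."""
    return f"{flow.source.name} -> {flow.target.name}: {flow.label}"


for flow in flows:
    print(describe(flow))
```

Note that the data store never appears as an active party: every flow touching it starts or ends at a process, which matches the idea that no computation happens inside a data store.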


Apart from the above, the following representations are also used, depending on the complexity at hand:
  • Boundaries
    • The boundaries represent privilege differentials between different elements where applicable
    • A boundary may be of multiple types:
      • Process boundary: cross-process data flows on the same machine
      • Machine boundary: cross-process data flows across a network
      • Trust boundary: emphasizes a privilege differential in the general case
      • Other boundary: generic
    • Represented by a dotted red line
  • Multiple Process Element
    • Sometimes, in the interest of space and ease of representation, multiple processes may be abstracted at the higher level using this element
    • Each multiple process element has a separate DFD
    • Within that DFD, sub-components can in turn be represented using multiple process elements
    • Represented by concentric circles
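To make the idea of boundaries concrete, the sketch below finds which boundaries a given data flow crosses. It assumes a simplification of my own: a boundary is just a named set of elements, and a flow crosses it when exactly one of its two ends lies inside. The names `Boundary`, `Flow` and `crossed_boundaries` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class BoundaryKind(Enum):
    PROCESS = "process"
    MACHINE = "machine"
    TRUST = "trust"
    OTHER = "other"


@dataclass(frozen=True)
class Boundary:
    name: str
    kind: BoundaryKind
    members: frozenset  # names of the elements inside this boundary


@dataclass(frozen=True)
class Flow:
    source: str
    target: str


def crossed_boundaries(flow, boundaries):
    """Return the boundaries that contain exactly one end of the flow."""
    return [
        b for b in boundaries
        if (flow.source in b.members) != (flow.target in b.members)
    ]


# Hypothetical deployment, used only for illustration.
boundaries = [
    Boundary("App server", BoundaryKind.MACHINE, frozenset({"Web app", "Cache"})),
    Boundary("Internal network", BoundaryKind.TRUST,
             frozenset({"Web app", "Cache", "Orders DB"})),
]
flows = [
    Flow("End user", "Web app"),
    Flow("Web app", "Cache"),
    Flow("Web app", "Orders DB"),
]

for flow in flows:
    names = [b.name for b in crossed_boundaries(flow, boundaries)]
    print(f"{flow.source} -> {flow.target}: crosses {names or 'no boundary'}")
```

In this toy model, the flow from the end user crosses both boundaries, which is a hint that it probably deserves the closest look during threat modeling.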

Any system of sufficient complexity will require multi-level DFDs, and as a result the DFD can become very complex and hard to read. In such cases, context diagrams are used to present the information in an easy-to-understand manner.

Context diagrams are useful for dividing systems based on physical realities. For example, the context diagram may be based on major components (rather than on a high-level logical diagram), so that the pieces are functionally comprehensible. Where similar realities allow multiple ways to divide the system, grouping into logical divisions is helpful.

Thus, overall, a major system can follow this DFD organization:
Context diagram -> L1 diagram -> L2 diagram and so on at each element
Ideally, the engineers should continue to decompose the system until the relevant level of security detail is reached; only then can the DFD be considered complete. For example, for a web app, it could mean not caring about how the data is stored within MySQL's internals, as long as it reaches the database.
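The Context -> L1 -> L2 organization is essentially a tree, and can be sketched as one. This is only an illustration: `Diagram`, `outline` and the online store being decomposed are hypothetical names I made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Diagram:
    """One DFD; children are the lower-level DFDs of its multiple process elements."""
    name: str
    children: list = field(default_factory=list)


def outline(diagram, level=0):
    """Yield (label, name) pairs, walking from the context diagram downwards."""
    label = "Context" if level == 0 else f"L{level}"
    yield label, diagram.name
    for child in diagram.children:
        yield from outline(child, level + 1)


# Hypothetical system, decomposed only as far as security detail requires.
context = Diagram("Online store", [
    Diagram("Web app", [Diagram("Checkout"), Diagram("Login")]),
    Diagram("Reporting"),
])

for label, name in outline(context):
    print(f"{label}: {name}")
```

Here "Reporting" stops at L1 while "Web app" goes down to L2, reflecting that decomposition continues only where more security detail is needed.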

Sunday, December 25, 2016

Information Security: Threat Models and STRIDE

I began writing about Information Security concepts a few months back, and this post follows the last post I had on the topic - Information Security: Output Encoding and Error Messaging. In this post, I'm going to talk about threat models, what they are, and how to build one.

Now, when we talk of threat models, it is important to take the perspective of an attacker and then visualize the software. In most projects, software engineers are assigned work based on functional groupings. Systems are usually partitioned at each level of complexity for easier assignment to developers. It thus becomes important to communicate assumptions and manage complexity through contracts and specifications.

Many methods, such as UML (Unified Modeling Language) diagrams, exist for communicating designs. The key to building large-scale systems is abstraction of every component and principle involved, and in order to build secure apps, one needs to make sure that all members are aware of any assumptions involved.

Moving on to threat models, it is important to be aware of the key terms involved, and how they differ from each other:
  • Assets – An asset can be a user, a system, information, or anything else that can be assigned a value. An asset represents what needs to be protected.
  • Vulnerability – A vulnerability is simply a design or implementation flaw that can be exploited by an attacker to gain unauthorized access to an asset. A vulnerability represents the weakness in our protection efforts.
  • Threat – Anything (person, program, bot, etc.) that may exploit a vulnerability, intentionally or accidentally. The attacker's aim could be to obtain, damage, or destroy asset(s). A threat represents what needs to be protected against.
  • Risk – The potential for loss, damage or destruction of an asset which results from a threat exploiting a vulnerability. Risk is the intersection of assets, threats, and vulnerabilities.

In layman's terms, the four entities above are related by the equation
A (Assets) + T (Threat) + V (Vulnerability) = R (Risk)
Since attackers want systems and assets, they usually target the weakest links in the chain. Your overall system is only as secure as its weakest link, after all, and therefore we need to understand the ways in which we can be targeted. It is necessary to capture the important threats and to document the mitigated ones. One also needs to create negative use cases: for example, what happens when something that is supposed to happen doesn't, and when something that isn't supposed to happen does.
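The relationship between assets, threats, vulnerabilities and risk can be sketched in a few lines of Python. This is a deliberately naive model of my own (the names `Vulnerability`, `Threat` and `risks` are hypothetical): a risk is recorded only where an asset, a vulnerability exposing it, and a threat able to exploit that vulnerability all line up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    name: str
    asset: str  # the asset this weakness exposes


@dataclass(frozen=True)
class Threat:
    name: str
    exploits: str  # name of the vulnerability this threat can exploit


def risks(assets, vulnerabilities, threats):
    """Return (asset, vulnerability, threat) triples where all three line up."""
    found = []
    for vuln in vulnerabilities:
        if vuln.asset not in assets:
            continue  # nothing of value is exposed by this weakness
        for threat in threats:
            if threat.exploits == vuln.name:
                found.append((vuln.asset, vuln.name, threat.name))
    return found


# Hypothetical inventory, used only for illustration.
assets = {"customer records", "admin console"}
vulnerabilities = [
    Vulnerability("SQL injection in search", "customer records"),
    Vulnerability("default admin password", "admin console"),
]
threats = [Threat("automated scanner bot", "SQL injection in search")]

for asset, vuln, threat in risks(assets, vulnerabilities, threats):
    print(f"RISK: {threat} may exploit '{vuln}' to reach {asset}")
```

In this toy inventory, the default admin password produces no entry only because no threat has been listed against it yet, which is a good reminder that the output is only as complete as the threats one has thought of.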

And so we arrive at threat modeling, a process by which potential threats, such as structural vulnerabilities, can be identified, enumerated, and prioritized. STRIDE is a threat classification model developed at Microsoft for thinking about computer software vulnerabilities:
  • Spoofing: spoofing an identity in order to impersonate someone or something
  • Tampering: tampering with data in transit or at rest
  • Repudiation (denial of the truth or validity of something): repudiating an action, typically one that is harmful to the flow
  • Information disclosure: disclosure of secret data; the flip side of tampering
  • Denial of service: a malicious actor restricting usage by legitimate users
  • Elevation of privilege: the end goal of most attackers; changing the interpretation of data

To further help engineers understand the mitigation strategies required, Microsoft also came up with the Elevation of Privilege game. The idea is to go through each of the STRIDE categories one by one and check whether there are any vulnerabilities that can be targeted.
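Going through STRIDE one category at a time can be sketched as a simple checklist generator. The pairing of each category with the security property it violates is the commonly cited one and is not spelled out in this post; the names `Stride` and `checklist` and the "Web app" element are assumptions for the example.

```python
from enum import Enum


class Stride(Enum):
    """STRIDE categories, each paired with the security property it violates."""
    SPOOFING = ("Spoofing", "authentication")
    TAMPERING = ("Tampering", "integrity")
    REPUDIATION = ("Repudiation", "non-repudiation")
    INFORMATION_DISCLOSURE = ("Information disclosure", "confidentiality")
    DENIAL_OF_SERVICE = ("Denial of service", "availability")
    ELEVATION_OF_PRIVILEGE = ("Elevation of privilege", "authorization")

    def __init__(self, label, violated_property):
        self.label = label
        self.violated_property = violated_property


def checklist(element_name):
    """Return one question per STRIDE category for the given element."""
    return [
        f"{threat.label}: how could {element_name} lose {threat.violated_property}?"
        for threat in Stride
    ]


# Hypothetical element, used only for illustration.
for question in checklist("Web app"):
    print(question)
```

Running the same checklist over every element of a system gives a crude but systematic starting list of questions to answer.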

In the next posts, I'm going to talk about Data Flow Diagrams, DREAD, and mitigation of threat models.