Embedding Threat Modeling in the DevOps Lifecycle (Part 2: Risk Management)
In part 1 of this series, we identified the first common challenge in integrating threat modeling in the DevOps lifecycle: backlog management. In this second part, we’ll look at the often problematic relationship between Risk Management and Threat modelling practices in organisations. These practices were both born to reduce risk and manage threats, so their relationship should be straightforward, yet they are often at significant odds with each other.
Nothing exists in isolation in organisations, and threat modeling is no exception. The constraints of finite resources, conflicting goals and continuous change are inescapable. Beyond its more visible aspect of eliciting threats and mitigations, designing threat modelling therefore carries another responsibility: making sure the practice is both informed by, and informative to, the other organisational structures that need such information to make informed decisions between conflicting goals.
We’ll have a look at some common patterns that you may find in organisations and things you can do to experiment and improve on them.
Disconnected practices for different audiences
I’ve often seen relatively mature risk management practices co-exist with relatively mature threat modeling practices in the same organisation, with their outputs operating completely independently of each other. The only link is that somewhere on a risk treatment plan someone writes “we should threat model”, and then apparently someone does. That’s both a shame and a missed opportunity, as the two practices have much in common and can be mutually supportive and reinforcing.
This also often leads to inefficiencies: effort is duplicated across practices that are all supposed to be finding vulnerabilities or identifying risks, and then treating them. The missed opportunity is that bringing the two together would make a great boundary practice between the GRC and Engineering functions, and could accelerate reaching a state where both functions are mutually supportive (they’re very often at odds). My preference, particularly when an organisation already has a consistent threat modelling practice, is to hold periodic dedicated sessions where threat modelling becomes the practice that determines risk treatment plans, as opposed to the Risk and Compliance teams defining risk treatment activities in isolation, without the benefit of the local knowledge that teams can provide. The result is risk-informed threat modelling, where threat modelling can in turn help inform the types of risks being tracked and discussed, ensuring consistency of information across different layers of the organisation.
Aggregation of vulnerabilities and risks misaligned with organisational structure
A related challenge I’ve seen in many different organisations is how risks and vulnerabilities are aggregated, accepted and managed. The most common symptom is teams being unable to articulate why a particular risk should be an Engineering Risk vs a Product Risk. The split is rarely an explicit decision made by weighing the pros and cons of different models.
When I see this happening, there are usually two related challenges going on.
- Product Management feel they don’t have the knowledge to understand what they should be doing about the things that were threat modelled, or how important those are in their “grand scheme” of things relating to product development
- Engineering feel like they just get asked to accept things that they have no role in prioritising, as it’s Product Management defining the priorities
Being explicit about the risk model and how the threat modelling activities relate to it is step one in making something sustainable and repeatable.
The default and implicit model I see is that risks (outside of market-fit and feature prioritisation) aren’t really managed within the management activities of Product Management itself. Instead, “left-field” things keep being thrown at it from other areas, which then results in re-prioritising work. I tend to call this a “hierarchically-aligned” risk model, as it’s a bunch of areas competing for Product Management resources to deal with something they consider a risk to their domain. The drawback of this model is that it tends to end up being a battle between Product and everyone competing for their attention. The way Product Management’s own success is measured usually incentivises them to mostly care about features and Product, and not the other things which are relevant to it, security being one of them.
My preferred model for thinking about threat modelling results and aggregating them into Product risks is what I’ve tended to call a “Product-oriented” risk model.
In this model, Engineering and Product security risks and related metrics are aggregated, reported and managed at the Product level, and as such any associated risk acceptance needs to be a Product Management acceptance, not an Engineering or Security one. In the reliability space, this has been achieved through the concept and mechanics of “Error budgets” and SLOs. The security industry largely hasn’t leveraged these (and I still think we should), as they help answer “how much security should I be doing?” in the same way they’ve already answered the equivalent question for reliability.
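To make the error budget idea a bit more concrete, here is a minimal sketch of how a security-flavoured budget could be computed. Everything in it is an assumption for illustration: the `SecuritySLO` class, the `error_budget_remaining` function, the example objective (“high-severity findings mitigated within 30 days”) and the 90% target are all hypothetical, and real objectives would need to be agreed with Product Management.

```python
from dataclasses import dataclass


@dataclass
class SecuritySLO:
    """Hypothetical security objective, modelled on a reliability SLO."""
    name: str
    target: float  # fraction of events that must be "good", e.g. 0.90


def error_budget_remaining(slo: SecuritySLO, good: int, total: int) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the objective has been missed.
    """
    if total == 0:
        return 1.0  # nothing measured yet, so nothing spent
    allowed_bad = (1.0 - slo.target) * total
    actual_bad = total - good
    if allowed_bad == 0:
        # A 100% target leaves no budget: any miss overspends it.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Illustrative numbers only: 100 high-severity findings, 95 mitigated in time.
    slo = SecuritySLO("high-severity findings mitigated within 30 days", target=0.90)
    remaining = error_budget_remaining(slo, good=95, total=100)
    print(f"{slo.name}: {remaining:.0%} of the error budget left")
```

The point of a number like this is the conversation it enables: while budget remains, Product can keep prioritising features, and once it is spent, the acceptance of further security risk becomes an explicit Product Management decision.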
However, independent of the way we manage the risk management component, how we’ll identify, aggregate and categorise the outputs of the Threat modelling sessions (at both the individual Product and overall Product Management level) should be something on your mind. Remember that with each hierarchical layer, stakeholders will be further detached from the technical aspects of our systems. If our threat modelling outputs are “too technical”, without a clear way to categorise them and put them in context, we may get pushback in the form of mitigations not being implemented.
My recommendation would be to agree on a categorisation system that can help associate findings with best practices (such as OWASP ASVS) and/or the objectives of the ISMS (Information Security Management System), for example one aligned with ISO 27001. If you do this consistently, it becomes really easy to aggregate the types of issues being found in threat modelling across the organisation, and it even puts you in a position to tell Product-level issues apart from more systemic ones. For instance, you may find that a lot of teams are struggling particularly with “cryptography” or “backups”, and that can then serve as input for strategic infosec initiatives that give Product teams better capability to address shortcomings more systematically.
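As a rough sketch of what that aggregation could look like once findings carry a consistent category, the snippet below groups findings by category and flags the ones that show up across several teams. The team names, category labels, sample data and the `min_teams` threshold are all made up for illustration; in practice the categories would come from whichever scheme you agreed on.

```python
from collections import defaultdict

# Hypothetical threat modelling findings as (team, category) pairs.
# The category labels are illustrative, not an official taxonomy.
findings = [
    ("payments", "cryptography"),
    ("payments", "access control"),
    ("search", "cryptography"),
    ("mobile", "cryptography"),
    ("mobile", "backups"),
    ("search", "backups"),
]


def teams_per_category(findings):
    """Map each category to the set of teams that reported it."""
    teams = defaultdict(set)
    for team, category in findings:
        teams[category].add(team)
    return teams


def systemic_categories(findings, min_teams=2):
    """Return categories reported by at least `min_teams` distinct teams."""
    grouped = teams_per_category(findings)
    return sorted(c for c, teams in grouped.items() if len(teams) >= min_teams)


if __name__ == "__main__":
    for category in systemic_categories(findings):
        print(f"Possible systemic issue: {category}")
```

Categories that clear the threshold are candidates for an organisation-wide initiative, while the rest can stay as Product-level items.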
Over-prescribed risk treatment plans
Another missed opportunity I’ve often found is that the Risk Management function puts too much effort into determining risk treatment plans without necessarily discussing them with those affected or impacted by their resolution. Remember, everything in Engineering is about constraints and trade-offs. Risk teams are often further away from the “sharp end” of operations, and as such tend to have a very limited understanding of the constraints and trade-offs that affect the viability of the risk treatment plans they define. This is, again, where threat modelling can come to the rescue. Particularly after you’ve started to successfully implement threat modelling across the organisation, a great opportunity for collaboration would be for risk teams to stop determining risk treatment plans entirely, and instead use the threat modelling facilities and process to explore mitigations to identified risks. You can see below an example of one way you can think about connecting these different timelines on what’s important across stakeholders, ensuring that each of them can perform the practice that is closest to their needs, while keeping the practices connected in ways that allow us to reason about, and ensure, alignment across the organisation.
Have you seen these challenges at your own organisations? Did you approach their resolution differently? Would love to hear others’ thoughts.