Infrastructure-as-code and GitOps extend source control (git) and code (well, manifest files) into a new field. This radically changed how we create infrastructure in the cloud, making the process more robust, less error prone, and easier for developers. Can we do the same for threat modeling? How can threat-modeling-as-code change and improve the way we do threat modeling today?
Let’s start with a really short introduction to threat modeling. Threat modeling is a practice that helps us take a system design and look for possible security issues by asking these four questions:
- What are we building?
- What can go wrong?
- What are we doing about it?
- Are we doing a good job?
Conducting a threat model helps find issues sooner – and, in most cases, detect issues that are hard to find using other practices. This is why conducting a threat model is a critical part of building secure software. If you’re not familiar with this practice, I highly recommend this post by Adam Shostack, one of the authorities in the field. The OWASP Threat Modeling project (and channel) is also an excellent learning resource.
I like working with Git, and especially the GitHub PR flow. The idea is simple: a single master branch, which is kept “green” (i.e. all tests are passing, meaning the code is working as expected). When I want to work on a feature, I create a branch, write my code and push the branch. Then I can open a pull request (PR) – a request to pull the changes from my branch into the master branch. What happens on a PR?
- Tests run to assess the quality of my changes. Based on the test results, the PR can be approved or denied.
- Code review. Someone can take a look at my code and decide whether my PR is approved or not.
This flow is so common that it is now used for more than just code. For example, with OctoDNS, all DNS changes are made through git. This allows every developer to apply DNS changes while keeping the process secure and safe. Another example is Terraform, which lets you manage cloud infrastructure as code, using manifest files to describe the required resources. This lets you create databases, servers or anything else through a git workflow. All of these are examples of GitOps – operations using git. GitOps is popular with developers because it is implemented with tools they already know.
How is all that related to threat modeling? Just imagine we could have the same for threat modeling. Any time someone wants to conduct a threat model, she would open a PR with her changes to a repository. On the PR we can:
- Run tests: From tests that validate the threat model, to more sophisticated tests that can detect issues automatically (“static analysis”). Imagine, for example, that we can add a clear-text connection threat to all network connections. Maybe we can even fail the test on specific threats? Say, if there is even one clear-text connection, the PR is blocked? So many possibilities here.
- Perform a review: The developer can ask many people to review the threat model and add their insights. This can happen either asynchronously or as part of a threat modeling meeting. The PR becomes the place where the threats are documented. Later, we can link the issues we decided to fix to this PR.
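To make the “run tests” idea a bit more concrete, here is a minimal Python sketch of a check that blocks a PR when the threat model contains a clear-text connection. The `Connection` class, its field names, and the list of clear-text protocols are all assumptions made for this illustration; they do not come from any existing tool.

```python
from dataclasses import dataclass


# Hypothetical model of one connection in a threat model; the field
# names are assumptions made for this sketch.
@dataclass(frozen=True)
class Connection:
    source: str
    target: str
    protocol: str


# Assumed set of protocols treated as clear-text in this example.
CLEAR_TEXT_PROTOCOLS = {"http", "ftp", "telnet"}


def clear_text_threats(connections):
    """Return one threat description per clear-text connection."""
    return [
        f"Clear-text connection: {c.source} -> {c.target} ({c.protocol})"
        for c in connections
        if c.protocol.lower() in CLEAR_TEXT_PROTOCOLS
    ]


def check_pull_request(connections):
    """Return (passed, threats); the check passes only with zero threats."""
    threats = clear_text_threats(connections)
    return (not threats, threats)


# Example model with one encrypted and one clear-text connection.
model = [
    Connection("browser", "api", "https"),
    Connection("api", "legacy-billing", "http"),
]
passed, threats = check_pull_request(model)
print("PASS" if passed else "BLOCKED")
for threat in threats:
    print(" -", threat)
```

In a CI job, the boolean result could be turned into the exit code of the script, which is what would actually block the PR.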
Moving to threat-modeling-as-code opens up new possibilities. The main motivation, at least in my opinion, is moving to tools that are used, loved and known by developers. How? Let me introduce two tools you can start using today for threat-modeling-as-code.
PlantUML – drawing-as-code
The first tool is PlantUML, a language that lets you create a drawing using a specific DSL (domain-specific language). Take for example the following “code” (this is the official example from here):
    @startuml
    Alice -> Bob: Authentication Request
    Bob --> Alice: Authentication Response
    Alice -> Bob: Another authentication Request
    Alice <-- Bob: Another authentication Response
    @enduml
This very simple code is translated to the following diagram:
PlantUML allows us to create a drawing using code (and it has many features for creating complex diagrams, including numbering, participant types and more). This frees us from messing with the actual drawing of the diagram. And, because this is normal code, we can have review comments on it (see an example PR here):
What about testing? The PlantUML language is simple, so writing a SAST tool for it should be a fairly easy task. There are existing solutions already – for example, OWASP Threat Dragon has a rules engine that can generate threats and mitigations automatically. Microsoft Threat Modeling Tool can do the same. All we would need is a way to import PlantUML files into one of these tools, and we could have testing.
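As a rough sketch of how small such a tool could start, the Python below parses the sequence messages from the PlantUML example above and emits a “verify encryption” finding for every distinct sender and receiver pair. The regular expression only covers the four arrow forms used in that example, and the rule and function names are assumptions made for illustration, not part of PlantUML or any existing scanner.

```python
import re

# Matches simple PlantUML sequence messages such as "Alice -> Bob: Hello".
# Only the arrow forms ->, -->, <- and <-- are handled in this sketch.
MESSAGE = re.compile(
    r"^\s*(?P<left>\w+)\s*(?P<arrow><-{1,2}|-{1,2}>)\s*(?P<right>\w+)"
    r"\s*:\s*(?P<label>.+?)\s*$"
)

DIAGRAM = """\
@startuml
Alice -> Bob: Authentication Request
Bob --> Alice: Authentication Response
Alice -> Bob: Another authentication Request
Alice <-- Bob: Another authentication Response
@enduml
"""


def parse_messages(plantuml_text):
    """Return (sender, receiver, label) tuples, one per message line."""
    messages = []
    for line in plantuml_text.splitlines():
        match = MESSAGE.match(line)
        if not match:
            continue  # skips @startuml, @enduml and anything unrecognized
        sender, receiver = match["left"], match["right"]
        if match["arrow"].startswith("<"):
            # A left-pointing arrow means the right-hand side is the sender.
            sender, receiver = receiver, sender
        messages.append((sender, receiver, match["label"]))
    return messages


def clear_text_candidates(messages):
    """Hypothetical rule: one finding per distinct sender/receiver pair."""
    findings, seen = [], set()
    for sender, receiver, _label in messages:
        if (sender, receiver) not in seen:
            seen.add((sender, receiver))
            findings.append(
                f"Verify that {sender} -> {receiver} is encrypted in transit"
            )
    return findings


for finding in clear_text_candidates(parse_messages(DIAGRAM)):
    print(finding)
```

A real tool would need to cover much more of the PlantUML syntax (participants, groups, notes), but the shape stays the same: parse the text, then run rules over the parsed model.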
And what about templating? At their core, the things we build are usually similar – a web API that handles HTTP requests, a worker that reads from a queue, a mobile application that interacts with an API, etc. Using PlantUML we can create ready-made templates, with the generic flow and potential threats. Developers just need to copy the template, modify it for their specific use case and review the threats. PlantUML opens up so many possibilities, just by enabling diagram-as-code.
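One possible way to keep such templates is sketched below in Python, using the standard library’s `string.Template` to fill in a generic web API flow. The template content, the two listed threats, and the `render_web_api` helper are illustrative assumptions; the threats are written as PlantUML comments (lines starting with a single quote) so they travel with the diagram.

```python
from string import Template

# Hypothetical template for a web API backed by a data store.
# Lines starting with a single quote are PlantUML comments.
WEB_API_TEMPLATE = Template("""\
@startuml
' Template: web API backed by a data store
' Threat: clear-text HTTP between client and API
' Threat: injection through request parameters
actor "$client" as client
participant "$api" as api
database "$store" as store
client -> api: HTTP request
api -> store: Query
store --> api: Result
api --> client: HTTP response
@enduml
""")


def render_web_api(client, api, store):
    """Fill the template with the names used in a specific system."""
    return WEB_API_TEMPLATE.substitute(client=client, api=api, store=store)


diagram = render_web_api("Mobile app", "Orders API", "Orders DB")
print(diagram)
```

The developer would then commit the rendered file, adjust the flow, and go over the listed threats during the PR review.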
Gherkin – threats-as-code
Drawing a diagram is only the first part of conducting a threat model. After drawing it, we need to look for potential threats, prioritize them and discuss potential mitigations. Gherkin is a language for writing user stories and scenarios, and it is commonly used for Behavior Driven Development. Threats and controls are just user stories, so we can use Gherkin to document them. A great example of using Gherkin for this purpose is the OWASP Cloud Security Project (thank you, Fraser Scott, for the project, and for the inspiration!). The project’s goal is to document threats and controls that are relevant to applications running in a cloud environment. Let’s take a look at one example threat:
    Scenario: Getting the security credentials
      When the attacker injects a request to http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE_NAME
      Then the temporary security credentials for ROLE_NAME are returned
And this is the matching control, to mitigate this threat:
    Scenario: Application is protected against Server Side Request Forgery
      Given an EC2 instance with access to the metadata service
      And an application running on the instance
      When we inject a request to the http://169.254.169.254 metadata service URL
      Then the application must not call the provided metadata service URL
      And the application must not return any results of a call to the metadata service
Using the Gherkin language (and, more generally, user stories) to document threats and controls makes it a lot easier to understand and prioritize them. As with PlantUML, adding a review is simple – this is just a text file.
What about testing? Gherkin is commonly used for [Behavior Driven Development](https://en.wikipedia.org/wiki/Behavior-driven_development): writing a scenario in Gherkin, and having code that generates tests automatically from this scenario. For example, for the threat above, the code should generate a test for an SSRF vulnerability in the API. There is some development in this area (Threat Playbook is a great example of automatic security tests), and I hope we see more tools like that.
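To illustrate the idea of turning a scenario into a test, here is a small standard-library Python sketch. It reads a shortened copy of the SSRF control scenario, pulls the URL out of the `When` step, and checks it against an application-side URL filter. BDD frameworks such as Cucumber or behave do this with proper step definitions; everything here (`parse_steps`, `run_scenario`, and the `is_url_allowed` stand-in for the application under test) is a hypothetical name made up for this example.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Shortened copy of the control scenario, so the block is self-contained.
CONTROL = """\
Scenario: Application is protected against Server Side Request Forgery
  Given an EC2 instance with access to the metadata service
  And an application running on the instance
  When we inject a request to the http://169.254.169.254 metadata service URL
  Then the application must not call the provided metadata service URL
"""

STEP = re.compile(r"^\s*(Given|When|Then|And|But)\s+(.*\S)\s*$")


def parse_steps(scenario_text):
    """Return (keyword, text) pairs; And/But inherit the previous keyword."""
    steps, previous = [], None
    for line in scenario_text.splitlines():
        match = STEP.match(line)
        if not match:
            continue  # the "Scenario:" line and blank lines are skipped
        keyword, text = match.groups()
        if keyword in ("And", "But"):
            keyword = previous
        steps.append((keyword, text))
        previous = keyword
    return steps


def is_url_allowed(url):
    """Stand-in for the application: refuse link-local IP targets."""
    host = urlparse(url).hostname
    try:
        return not ipaddress.ip_address(host).is_link_local
    except ValueError:
        return True  # plain hostnames are not resolved in this sketch


def run_scenario(scenario_text, is_allowed):
    """Return True when the application refuses the injected URL."""
    url = None
    for keyword, text in parse_steps(scenario_text):
        if keyword == "When":
            url = re.search(r"https?://\S+", text).group(0)
        elif keyword == "Then" and "must not call" in text:
            return not is_allowed(url)
    raise ValueError("scenario has no supported Then step")


print("control holds:", run_scenario(CONTROL, is_url_allowed))
```

A real test would send the request to a running instance of the application instead of calling a filter function directly, but the mapping from steps to checks would look similar.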
Using Gherkin we can also easily create threat libraries (like the Cloud Security Project, which I mentioned earlier). Developers can refer to the library when conducting a threat model, and choose the relevant threats for their use case. The library should also contain common controls to mitigate these threats – so all that is left is to choose the right control and implement it.
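A threat library can be as simple as a list of threat and control pairs with tags to select from. The Python sketch below shows one possible shape. The first entry reuses the scenario names quoted above, while the second entry, the tags, and the `relevant_entries` helper are invented for illustration only.

```python
from dataclasses import dataclass


# Hypothetical library entry: a threat scenario, its matching control
# scenario, and tags describing where the threat applies.
@dataclass(frozen=True)
class LibraryEntry:
    threat: str
    control: str
    tags: frozenset


LIBRARY = [
    LibraryEntry(
        threat="Getting the security credentials",
        control="Application is protected against Server Side Request Forgery",
        tags=frozenset({"ec2", "http-api"}),
    ),
    # Made-up entry, included only to show filtering.
    LibraryEntry(
        threat="Clear-text connection between services",
        control="All service-to-service traffic uses TLS",
        tags=frozenset({"network"}),
    ),
]


def relevant_entries(library, component_tags):
    """Return the entries that share at least one tag with the component."""
    wanted = set(component_tags)
    return [entry for entry in library if entry.tags & wanted]


for entry in relevant_entries(LIBRARY, ["ec2"]):
    print(f"Threat: {entry.threat}\n  Control: {entry.control}")
```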
When I worked on releasing Kamus, our secret management solution for Kubernetes, I looked for the best way to publicly release its threat model, in order to share the threats and mitigations that were discussed. Combining the power of PlantUML and Gherkin was a great choice for this use case (I’m still looking for a good solution to generate a web site). This is another benefit of threat-modeling-as-code, and I hope to see more open-source projects following this path in the future.
Using PlantUML and Gherkin is only the first step toward threat-modeling-as-code. From my experience, it does make things a bit easier for devs, but the tools are still not mature enough. Having some sort of automated tests could improve the experience – and hopefully make threat modeling easier for everyone. Threat modeling is a critical part of building secure software; making it easier (and maybe even fun?) could encourage more developers to adopt it.
I just found out that Abhay Bhargav gave a talk on the same subject at AppSec USA. Thanks, Josh Grossman, for sharing this talk!
1 thought on “Threat Modeling as Code”
Gherkin looks like a good option for describing attack scenarios. Such scenarios live in a separate file, which makes them a bit disconnected from the code. If one updates the code, one should also check whether the threat model is up to date. I think it’s something that is easy to forget. It would be nice to have a way to describe threats, sinks, etc. in the code itself. For example, threatspec tries to implement this approach.