Threat modeling, faster — with an LLM in the loop
A practical pattern for using an LLM to bootstrap STRIDE without giving up the parts that need a human.
I've been threat-modeling features for over a decade. The thing I keep noticing is that most of the time in a session goes to the parts an LLM is good at: enumerating threat categories, surfacing the obvious-but-easy-to-miss patterns, and producing the first draft of the writeup.
The part that actually requires a human — judgment about what matters in this system, with this trust boundary, with this deadline — is maybe 20% of the session.
So why am I still doing the other 80% by hand?
The pattern
Here's the loop that's been working for me (there's a code sketch of it right after the list):
- Feed the LLM the feature spec, the data flow, and the trust boundaries. Force it to restate them. If it can't, the spec isn't good enough yet — and that's a finding.
- Ask for STRIDE-by-element, with concrete threats specific to this feature. Reject generic ones.
- Demand a likelihood/impact rating with reasoning, not just a label. The reasoning is where the LLM either earns trust or exposes a gap.
- For each threat, have it propose the smallest mitigation that meaningfully reduces risk. The word "smallest" matters — it forces specificity.
- Then I read it, critically. The output is a draft, not a decision.
```

The reason for one accumulating `history` rather than four independent calls: step 4 can only propose "smallest" mitigations because the model has already committed to specific threats and ratings in steps 2 and 3.
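A minimal sketch of that loop in Python, assuming the official `openai` client and an OpenAI-compatible endpoint. The model name, the prompts, and the `ask`/`first_pass` helpers are all illustrative, not my exact setup:

```python
from openai import OpenAI

client = OpenAI()  # or point base_url at a vetted internal endpoint (see the guardrails note)
MODEL = "gpt-4o"   # placeholder; use whatever model your team has approved

def ask(history: list[dict], prompt: str) -> str:
    """Append a user turn, get the reply, and keep the transcript so each step builds on the last."""
    history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model=MODEL, messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text

def first_pass(spec: str, data_flow: str, boundaries: str) -> list[dict]:
    history = [{"role": "system",
                "content": "You are assisting with a STRIDE threat model of one feature."}]
    # 1. Force a restatement. If this comes back wrong, stop: the spec
    #    isn't good enough yet, and that is itself a finding.
    restated = ask(history,
                   "Restate the feature, data flow, and trust boundaries in your own words.\n\n"
                   f"SPEC:\n{spec}\n\nDATA FLOW:\n{data_flow}\n\nTRUST BOUNDARIES:\n{boundaries}")
    print("CHECK THIS RESTATEMENT BEFORE CONTINUING:\n", restated)
    # 2. STRIDE-by-element, concrete threats only.
    ask(history,
        "Apply STRIDE to each element of the data flow. Every threat must name a specific "
        "element and trust boundary in this feature; generic threats are rejected.")
    # 3. Ratings with reasoning, not just labels.
    ask(history,
        "Rate each threat's likelihood and impact (low/medium/high) and give two or three "
        "sentences of reasoning per rating.")
    # 4. Smallest mitigation that meaningfully reduces each risk.
    ask(history,
        "For each threat, propose the smallest mitigation that meaningfully reduces the risk, "
        "and say where in the design it lands.")
    return history  # the full transcript is the draft you review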
The first pass usually takes me ~10 minutes instead of 90. The second pass — the human review — is where the real work happens, and now I get to do it on a dense, structured starting point instead of a blank page.
The prompt
The version I use lives in the prompts library — Threat-model a feature. It enforces a specific output shape, forbids generic threats, and ends with "open questions for product" — because half the value of threat modeling is exposing the questions nobody asked yet.
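I won't reproduce the library version here, but the skeleton of the contract it enforces looks roughly like this. The column names and wording are illustrative:

```python
# Illustrative skeleton of the output contract; the real prompt in the
# library is longer and stricter.
OUTPUT_CONTRACT = """
For each element in the data flow, produce:

## <Element>
| STRIDE category | Specific scenario | Likelihood + reasoning | Impact + reasoning | Smallest mitigation |
|---|---|---|---|---|

Rules:
- Every scenario must name this feature's components and trust boundaries.
  Generic entries like "an attacker could tamper with data" are rejected.
- A rating without reasoning is rejected.

End with a section titled "Open questions for product".
"""
```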
What still requires a human
Three things, in order of importance:
- Knowing what not to model. The LLM will happily generate threats for everything. Half of them aren't worth the cost of mitigation. Triage is human work.
- Trust boundaries. Where the boundary actually is — not where the diagram says it is — needs someone who's seen the system in production.
- Politics. "This is a P0 risk, ship date is Friday" is a conversation, not a prompt.
The LLM accelerates the parts that benefit from acceleration. It doesn't — and shouldn't — replace the judgment.
A note on guardrails
If you're sending real specs to an external model, treat them like the sensitive documents they are. Your threat-modeling prompt should never be the place a competitor learns about your unreleased feature. For most teams, this means a vetted endpoint, not a public chat UI.
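Concretely, that can be as small as pointing the same client at an internal gateway instead of a public provider. A minimal sketch, assuming your security team runs an OpenAI-compatible gateway; the URL and environment variable names are hypothetical:

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["LLM_GATEWAY_URL"],  # e.g. an internal, vetted /v1 endpoint
    api_key=os.environ["LLM_GATEWAY_KEY"],   # issued by your gateway, not a consumer account
)
# Specs now transit infrastructure you control: logging, retention, and
# data-handling terms are your team's, not a public chat product's.
```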