Made blog post stronger and more direct
@@ -33,6 +33,8 @@ Most LLM-based security audits fail in predictable ways. The failure modes are r
 Our goal was to build something that produces audits a security engineer would actually want to read and act on — and that meant fixing the approach, not just tuning the prompt.

+The uncomfortable truth: most LLM-based audit tools aren't doing security analysis. They're doing pattern matching with authority.
+
 ## The Key Insight: From "Find Scary Things" to "Decide What Matters"

 The single most important design decision was this: **a dangerous sink is not a vulnerability.**
@@ -41,11 +43,7 @@ Most tools stop at the sink. Real analysis starts at the boundary.
 This sounds obvious, but it's the root of most false positives. The model sees `os.system(user_input)` and flags it. But the question isn't whether the sink is dangerous — it's whether the attacker *crosses a meaningful trust boundary* to reach it, and whether they *gain capability they didn't already have*.

-A desktop application executing commands from its own config file, which only the user can edit, running as that same user? That's not a vulnerability. That's the application working as designed. The user already has all the authority the "exploit" would give them.
+A desktop app executing commands from its own config, editable only by the same user it runs as? Not a vulnerability — the user already has that authority. A web server executing commands from an HTTP request body? Completely different trust model, and it *is* a vulnerability.

-A web server executing commands from an HTTP request body? That's a completely different trust model, and it *is* a vulnerability.
-
-The skill has to reason about trust boundaries and privilege deltas, not just dangerous function calls.
-
 ## The Exploit Value Test
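The trust-boundary reasoning in the hunk above (the desktop app versus the web server) can be sketched as a small check. This is only an illustration, not the skill's actual logic: `SinkContext`, `privilege_delta`, `is_candidate_vulnerability`, and the capability labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkContext:
    """Hypothetical description of how input reaches a dangerous sink."""
    input_controller: str                # who can write the data reaching the sink
    executes_as: str                     # whose authority the sink runs with
    controller_capabilities: frozenset   # what the controller can already do
    sink_capabilities: frozenset         # what reaching the sink would allow


def privilege_delta(ctx: SinkContext) -> frozenset:
    # Capabilities the input controller would gain by reaching the sink.
    return ctx.sink_capabilities - ctx.controller_capabilities


def is_candidate_vulnerability(ctx: SinkContext) -> bool:
    # Both conditions from the post must hold: a trust boundary is crossed
    # AND the controller gains something it did not already have.
    crosses_boundary = ctx.input_controller != ctx.executes_as
    return crosses_boundary and bool(privilege_delta(ctx))


# Desktop app: the user edits their own config and already has shell access.
desktop = SinkContext(
    input_controller="local_user",
    executes_as="local_user",
    controller_capabilities=frozenset({"run_shell_commands"}),
    sink_capabilities=frozenset({"run_shell_commands"}),
)

# Web server: a remote client's request body reaches a shell running as the server.
web = SinkContext(
    input_controller="remote_client",
    executes_as="server_account",
    controller_capabilities=frozenset({"send_http_requests"}),
    sink_capabilities=frozenset({"run_shell_commands"}),
)
```

Under these assumptions, `is_candidate_vulnerability(desktop)` is false and `is_candidate_vulnerability(web)` is true, even though both contexts end in the same dangerous sink.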
@@ -67,7 +65,7 @@ This is now the gating test before any severity assignment. If Exploit Value ≤
 ## The Classification Taxonomy

-Most audit frameworks have findings and... that's it. We found this forced the model to either inflate borderline issues into "findings" or drop them entirely. Neither is correct.
+Most audit frameworks have findings and... that's it. That binary — flag it or ignore it — is itself a design failure. It forces the model to either inflate borderline issues into "findings" or drop them entirely. Neither is correct.

 The skill now distinguishes five categories:
@@ -153,6 +151,8 @@ We tested the skill against the Openbox source code — a Linux/BSD desktop envi
 A naive audit flags all of these as Critical. Our early versions flagged most of them as Medium+. The final version correctly classified the majority as Design Properties, with specific reasoning about why trust holds (or doesn't) in each case.

+For example: Openbox's `autostart.sh` mechanism executes arbitrary shell commands from a config file. A generated audit calls this RCE. Our audit classifies it as a Design Property — the config is owned by the same user who runs the window manager, the user already has shell access, and no trust boundary is crossed. But if that config were loaded from a remote sync or a shared NFS mount, the classification would flip to Confirmed Finding, because now an untrusted party can influence trusted execution.
+
 The real vulnerabilities — the subtle authorization gaps, the parser edge cases, the incomplete fixes — actually became *more* visible once the noise was gone.

 ## The Full Skill
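The `autostart.sh` reasoning added in this hunk turns on who can write the config. A rough way to approximate that on a POSIX system is to inspect file ownership and write bits. The sketch below is an assumption-laden heuristic, not how the skill works: `classify_exec_config` is a hypothetical helper, and treating group- or world-writable files as "an untrusted party can influence execution" is a simplification that does not detect remote sync or NFS mounts.

```python
import os
import stat
import tempfile


def classify_exec_config(path: str, running_uid: int) -> str:
    """Hypothetical heuristic: label a config file whose contents get executed."""
    st = os.stat(path)
    owned_by_runner = st.st_uid == running_uid
    # Assumption: group/world write access stands in for "someone else can edit it".
    writable_by_others = bool(st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    if owned_by_runner and not writable_by_others:
        # Same user owns the file and runs the program: no trust boundary crossed.
        return "Design Property"
    # Another party may influence what the trusted program executes.
    return "Confirmed Finding"


# Usage on a throwaway file standing in for an autostart script (POSIX only).
with tempfile.NamedTemporaryFile() as f:
    os.chmod(f.name, 0o600)   # editable only by the owner
    private_label = classify_exec_config(f.name, os.getuid())
    os.chmod(f.name, 0o666)   # editable by anyone on the machine
    shared_label = classify_exec_config(f.name, os.getuid())
```

The point of the sketch is that the sink never changes between the two calls; only the answer to "who can write this file" does, and that alone moves the label.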