88 points | by sigmar13 hours ago
> We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software. Earlier this year at the DARPA AIxCC event, Team Atlanta discovered a null-pointer dereference in SQLite, which inspired us to use it for our testing to see if we could find a more serious vulnerability.
Every word in "public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software" applies to the Team Atlanta finding that they are citing too, and the Team Atlanta finding was against an actual release version of SQLite instead of a prerelease. If either team has provided the first example of this convoluted sentence, it is Team Atlanta, not Google.
It is possible that they're arguing that the Team Atlanta finding wasn't "exploitable", but this is very debatable. We use CVSS to rate vulnerability impact, and CVSS defines Availability (crashing) as an equal member of the [Confidentiality, Integrity, Availability] triad. Being able to crash a system constitutes an exploitable vulnerability in that system. This is surely especially true for SQLite, which is one of the most mission-critical production software systems in the entire world.
But if we're going to act like we're being very precise about what "exploitable" means, we should conclude that neither of these are exploitable vulnerabilities. To exploit them, you have to provide malicious SQL queries to SQLite. Who does that? Attackers don't provide SQL queries to SQLite systems -- the developers do. If an attacker could provide arbitrary SQL queries, they can probably already exploit that SQLite system, through something like an arbitrary file content write to an arbitrary local filename into RCE. I don't think either group found an exploitable vulnerability.
Yet SQL injection is still a thing. Any vuln that can promote an SQL injection to an RCE is very bad.
> However, the generate_series extension is only enabled by default in the shell binary and not the library itself, so the impact of the issue is limited.
What came to mind was the DARPA Cyber Grand Challenge. The winner was a product used in the real world, too.
https://en.m.wikipedia.org/wiki/2016_Cyber_Grand_Challenge#:....
The whole thing has the scent of "we were told to get an outcome, eventually got it, and wrote it up!" --- I've never ever read a Project Zero blog post like this,* and I believe they should be ashamed of putting glorified marketing on it.
* large # of contributors with unclear contributions (they're probably building the agent Google is supposed to sell someday), ~0 explication of the bug, then gives up altogether on writing and splats in selected LLM responses.
* disclaimer: ex-Googler, only reason it matters here is I tend to jump to 'and this is a corruption of Google' because it feels to me like it was a different place when I joined, but either A) it wasn't or B) we should all be afraid of drift over time in organizations > 1000 people
It basically just takes the diff from the PR and sends it to GPT-4o for analysis, returning a severity (low/medium/high) and a description.
PRs are auto-blocked for high severity, but can be merged with medium or low.
In practice it’s mostly right, but definitely errs on the side of medium too often (which is reasonable without the additional context of the rest of the codebase).
With that said, it’s been pretty useful at uncovering simple mistakes before another dev has had a chance to review.
[0] https://magicloops.dev/loop/3f3781f3-f987-4672-8500-bacbeefc...
> We also feel that this variant-analysis task is a better fit for current LLMs than the more general open-ended vulnerability research problem. By providing a starting point – such as the details of a previously fixed vulnerability – we remove a lot of ambiguity from vulnerability research, and start from a concrete, well-founded theory: "This was a previous bug; there is probably another similar one somewhere".
LLMs are great at pattern matching, so it turns out feeding in a pattern describing a prior vulnerability is a great way to identify potential new ones.
My concept was running a bunch of open-source, static analyzers with the LLM’s essentially blocking false positives. They can do it analytically or by generating the test cases to prove the bug. It might also be easier to fine-tune open models for this since the job is narrower.
https://maxdunhill.medium.com/how-effective-are-transformers...
And created a Hugging Face repo of known vulnerabilities if anyone else wants to work on a similar project (link in blog post).
My project was a lay person’s not especially successful attempt to fine-tune a BERT-based classifier to detect vulnerable code.
Having said this, a main takeaway echoes simonw’s comment:
“LLMs are great at pattern matching, so it turns out feeding in a pattern describing a prior vulnerability is a great way to identify potential new ones.”
Given majority of vulnerabilities stem from memory misallocation, it seems that an LLM would most consistently find misallocated memory. Useful, though not the most complex vulnerabilities to weaponise.
It seems the next frontier would be for an LLM to not only identify previously unidentified vulnerabilities, but also describe how to successfully daisy chain them into an effective vulnerability exploitation.
Said differently, giving an LLM a goal like jail breaking the iOS sandbox and seeing how it might approach solving the task.