Researchers successfully hacked more than half of their test websites using autonomous teams of GPT-4 bots that coordinated their efforts and spawned new bots at will. And they did it using previously unknown, real-world ‘zero day’ exploits.
A couple of months ago, a team of researchers released a paper saying they’d been able to use GPT-4 to autonomously hack one-day (or N-day) vulnerabilities – these are security flaws that are already known, but for which a fix hasn’t yet been released. If given the Common Vulnerabilities and Exposures (CVE) list, GPT-4 was able to exploit 87% of critical-severity CVEs on its own.
Skip forward to this week, and the same group of researchers has released a follow-up paper saying they’ve been able to hack zero-day vulnerabilities – vulnerabilities that aren’t yet known – with a team of autonomous, self-propagating Large Language Model (LLM) agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method.
Instead of assigning a single LLM agent to solve many complex tasks, HPTSA uses a “planning agent” that oversees the entire process and launches multiple task-specific “subagents.” Very much like a boss and his subordinates, the planning agent coordinates with a managing agent, which delegates the work to each “expert subagent.” This reduces the load on any single agent, which might otherwise struggle with a task outside its specialty.
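To make the boss-and-subordinates structure concrete, here is a minimal sketch of the delegation pattern in Python. It is not the researchers' code: the class names, the `xss` and `sqli` specialty labels, and the stub functions are all hypothetical, and nothing here calls an LLM or touches a website. It only shows how a planner can hand tasks to a manager, which routes each one to a matching expert.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    kind: str    # which specialty should handle this task (hypothetical label)
    target: str  # placeholder description of what the task is about


# Hypothetical expert subagents. Each is a stub that only reports
# what it was asked to look at; a real agent would be an LLM call.
def xss_expert(task: Task) -> str:
    return f"xss expert examined {task.target}"


def sqli_expert(task: Task) -> str:
    return f"sqli expert examined {task.target}"


class Manager:
    """Routes each task to the expert subagent registered for its kind."""

    def __init__(self, experts: Dict[str, Callable[[Task], str]]):
        self.experts = experts

    def delegate(self, task: Task) -> str:
        expert = self.experts.get(task.kind)
        if expert is None:
            return f"no expert for {task.kind}"
        return expert(task)


class Planner:
    """Decides what to try, then hands the tasks to the manager."""

    def __init__(self, manager: Manager):
        self.manager = manager

    def plan(self, pages: List[str]) -> List[Task]:
        # Stub plan (an assumption): one task per page per known specialty.
        return [Task(kind, page) for page in pages for kind in self.manager.experts]

    def run(self, pages: List[str]) -> List[str]:
        return [self.manager.delegate(task) for task in self.plan(pages)]


if __name__ == "__main__":
    planner = Planner(Manager({"xss": xss_expert, "sqli": sqli_expert}))
    for line in planner.run(["/login"]):
        print(line)
```

The point of the split is that the planner never needs to know how any specialty works; it only decides what to try, and each expert only ever sees tasks of its own kind.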
newatlas.com/technology/gpt4-autonomously-hack-zero-day-security-flaws/