{"id":43019,"date":"2026-01-14T19:21:22","date_gmt":"2026-01-14T19:21:22","guid":{"rendered":"https:\/\/agooka.com\/news\/business\/ais-hacking-skills-are-approaching-an-inflection-point\/"},"modified":"2026-01-14T19:21:22","modified_gmt":"2026-01-14T19:21:22","slug":"ais-hacking-skills-are-approaching-an-inflection-point","status":"publish","type":"post","link":"https:\/\/agooka.com\/news\/business\/ais-hacking-skills-are-approaching-an-inflection-point\/","title":{"rendered":"AI\u2019s Hacking Skills Are Approaching an \u2018Inflection Point\u2019"},"content":{"rendered":"<p>Vlad Ionescu and Ariel Herbert-Voss, cofounders of the cybersecurity startup RunSybil, were momentarily confused when their AI tool, Sybil, alerted them to a weakness in a customer\u2019s systems last November.<\/p>\n<p>Sybil uses a mix of different AI models\u2014as well as a few proprietary technical tricks\u2014to scan computer systems for issues that hackers might exploit, like an unpatched server or a misconfigured database.<\/p>\n<p>In this case, Sybil flagged a problem with the customer\u2019s deployment of federated GraphQL, a language used to specify how data is accessed over the web through application programming interfaces (APIs). The issue meant that the customer was inadvertently exposing confidential information.<\/p>\n<p>What puzzled Ionescu and Herbert-Voss was that spotting the issue required a remarkably deep knowledge of several different systems and how those systems interact. RunSybil says it has since found the same problem with other deployments of GraphQL\u2014before anybody else made it public. \u201cWe scoured the internet, and it didn\u2019t exist,\u201d Herbert-Voss says. \u201cDiscovering it was a reasoning step in terms of models\u2019 capabilities\u2014a step change.\u201d<\/p>\n<p>The situation points to a growing risk. 
As AI models continue to get smarter, their ability to find zero-day bugs and other vulnerabilities also continues to grow. The same intelligence that can be used to detect vulnerabilities can also be used to exploit them.<\/p>\n<p>Dawn Song, a computer scientist at UC Berkeley who specializes in both AI and security, says recent advances in AI have produced models that are better at finding flaws. Simulated reasoning, which involves splitting problems into constituent pieces, and agentic AI, which lets models take actions like searching the web or installing and running software tools, have amped up models\u2019 cyber abilities.<\/p>\n<p>\u201cThe cyber security capabilities of frontier models have increased drastically in the last few months,\u201d she says. \u201cThis is an inflection point.\u201d<\/p>\n<p>Last year, Song cocreated a benchmark called CyberGym to determine how well large language models find vulnerabilities in large open-source software projects. CyberGym includes 1,507 known vulnerabilities found in 188 projects.<\/p>\n<p>In July 2025, Anthropic\u2019s Claude Sonnet 4 was able to find about 20 percent of the vulnerabilities in the benchmark. By October 2025, a new model, Claude Sonnet 4.5, was able to identify 30 percent. \u201cAI agents are able to find zero-days, and at very low cost,\u201d Song says.<\/p>\n<p>Song says this trend shows the need for new countermeasures, including having AI help cybersecurity experts. \u201cWe need to think about how to actually have AI help more on the defense side, and one can explore different approaches,\u201d she says.<\/p>\n<p>One idea is for frontier AI companies to share models with security researchers before launch, so they can use the models to find bugs and secure systems prior to a general release.<\/p>\n<p>Another countermeasure, says Song, is to rethink how software is built in the first place. Her lab has shown that it is possible to use AI to generate code that is more secure than what most programmers use today. 
\u201cIn the long run we think this secure-by-design approach will really help defenders,\u201d Song says.<\/p>\n<p>The RunSybil team says that, in the near term, the coding skills of AI models could mean that hackers gain the upper hand. \u201cAI can generate actions on a computer and generate code, and those are two things that hackers do,\u201d Herbert-Voss says. \u201cIf those capabilities accelerate, that means offensive security actions will also accelerate.\u201d<\/p>\n<p><em>This is an edition of<\/em> <a href=\"https:\/\/www.wired.com\/author\/will-knight\/\" rel=\"noreferrer\" target=\"_blank\"><em><strong>Will Knight\u2019s<\/strong><\/em><\/a> <em><a href=\"https:\/\/www.wired.com\/newsletter?sourceCode=editarticle\" rel=\"noreferrer\" target=\"_blank\"><strong>AI Lab newsletter<\/strong><\/a>. Read previous newsletters<\/em> <a href=\"https:\/\/www.wired.com\/tag\/ai-lab\/\" rel=\"noreferrer\" target=\"_blank\"><em><strong>here.<\/strong><\/em><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Vlad Ionescu and Ariel Herbert-Voss, cofounders of the cybersecurity startup RunSybil, were momentarily confused when their AI tool, Sybil, alerted them to a weakness in a customer\u2019s systems last November. 
Sybil uses a mix of different AI models\u2014as well as a few proprietary technical tricks\u2014to scan computer systems [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":43020,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[],"class_list":{"0":"post-43019","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-business"},"_links":{"self":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/43019","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/comments?post=43019"}],"version-history":[{"count":0,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/43019\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media\/43020"}],"wp:attachment":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media?parent=43019"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/categories?post=43019"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/tags?post=43019"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}