{"id":48804,"date":"2026-04-07T21:11:43","date_gmt":"2026-04-07T21:11:43","guid":{"rendered":"https:\/\/agooka.com\/news\/usa\/anthropic-says-its-latest-ai-model-is-too-powerful-for-public-release-and-that-it-broke-containment-during-testing\/"},"modified":"2026-04-07T21:11:43","modified_gmt":"2026-04-07T21:11:43","slug":"anthropic-says-its-latest-ai-model-is-too-powerful-for-public-release-and-that-it-broke-containment-during-testing","status":"publish","type":"post","link":"https:\/\/agooka.com\/news\/usa\/anthropic-says-its-latest-ai-model-is-too-powerful-for-public-release-and-that-it-broke-containment-during-testing\/","title":{"rendered":"Anthropic says its latest AI model is too powerful for public release and that it broke containment during testing"},"content":{"rendered":"<figure><img decoding=\"async\" src=\"https:\/\/i.insider.com\/699519cef8731049f3af4b1a?format=jpeg\" alt=\"An image of Claude logo\"\/><figcaption><p>Claude Code creator Boris Cherny said AI will have solved for coding for everyone by the end of 2026.<\/p>\n<p>Samuel Boivin\/NurPhoto via Getty Images<\/p>\n<\/figcaption><\/figure>\n<ul>\n<li>Anthropic said its next-generation AI model is too powerful for the public.<\/li>\n<li>That&#039;s why Claude Mythos won&#039;t be publicly released, Anthropic said.<\/li>\n<li>Anthropic said Mythos demonstrated concerning capabilities, including the ability to breach its own safeguards.<\/li>\n<\/ul>\n<p>Anthropic said on Tuesday that it has halted the broader release of its newest AI model, Mythos, due to concerns that it is too good at finding &quot;high-severity vulnerabilities&quot; in major operating systems and web browsers.<\/p>\n<p>&quot;Claude Mythos Preview&#039;s large increase in capabilities has led us to decide not to make it generally available,&quot; Anthropic wrote in the preview&#039;s system card. 
&quot;Instead, we are using it as part of a defensive cybersecurity program with a limited set of partners.&quot;<\/p>\n<p>The announcement is a major step for Anthropic, which in February weakened a safety pledge about how it would develop AI models. Claude Opus 4.6, which the company called its most powerful model to date, was publicly released on February 5.<\/p>\n<p>In its statements about Mythos, Anthropic detailed a number of eyebrow-raising findings and episodes, including that the model could follow instructions that encouraged it to break out of a virtual sandbox.<\/p>\n<p>&quot;The model succeeded, demonstrating a potentially dangerous capability for circumventing our safeguards,&quot; Anthropic recounted in its system card. &quot;It then went on to take additional, more concerning actions.&quot;<\/p>\n<p>The researcher had encouraged Mythos to find a way to send a message if it could escape. &quot;The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park,&quot; Anthropic wrote.<\/p>\n<p>The model apparently decided that wasn&#039;t enough and found another way to spike the football.<\/p>\n<p>&quot;In a concerning and unasked-for effort to demonstrate its success, it posted details about its exploit to multiple hard-to-find, but technically public-facing, websites,&quot; Anthropic wrote.<\/p>\n<p>Anthropic is withholding some details about the cybersecurity vulnerabilities Mythos found, but it did point out a few. 
The AI model &quot;found a 27-year-old vulnerability in OpenBSD\u2014which has a reputation as one of the most security-hardened operating systems in the world,&quot; the company wrote.<\/p>\n<p>Mythos was powerful enough that even &quot;non-experts&quot; could seize on its capabilities.<\/p>\n<p>&quot;Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit,&quot; Anthropic&#039;s Frontier Red Team wrote in a blog post. &quot;In other cases, we&#039;ve had researchers develop scaffolds that allow Mythos Preview to turn vulnerabilities into exploits without any human intervention.&quot;<\/p>\n<p>All told, Anthropic said it decided not to publicly release Mythos. Instead, its hope is to eventually release &quot;Mythos-class models&quot; once proper safeguards are in place.<\/p>\n<p>&quot;Our eventual goal is to enable our users to safely deploy Mythos-class models at scale\u2014for cybersecurity purposes but also for the myriad other benefits that such highly capable models will bring,&quot; the team wrote in the blog. 
&quot;To do so, that also means we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model&#039;s most dangerous outputs.&quot;<\/p>\n<p>For now, only 11 other select organizations, including Google, Microsoft, Amazon Web Services, Nvidia, and JPMorgan Chase, will get access to Mythos as part of a cybersecurity group named &quot;Project Glasswing,&quot; for which Anthropic is providing up to $100 million in Mythos usage credits.<\/p>\n<p>The cybersecurity project is named after the glasswing butterfly, which the company said is a metaphor for how Mythos can find vulnerabilities hidden in plain sight, and for avoiding harm by being transparent about the risks.<\/p>\n<p>The news came on a day in which Anthropic&#039;s Claude and Claude Code experienced a &quot;major outage,&quot; the latest sign of growing pains as the AI startup has struggled to keep up with its newfound popularity.<\/p>\n<p>Read the original article on Business Insider<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Claude Code creator Boris Cherny said AI will have solved for coding for everyone by the end of 2026. Samuel Boivin\/NurPhoto via Getty Images Anthropic said its next-generation AI model is too powerful for the public. That&#039;s why Claude Mythos won&#039;t be publicly released, Anthropic said. 
Anthropic said Mythos demonstrated concerning capabilities, including the ability [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":48805,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":{"0":"post-48804","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-usa"},"_links":{"self":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/48804","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/comments?post=48804"}],"version-history":[{"count":0,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/48804\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media\/48805"}],"wp:attachment":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media?parent=48804"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/categories?post=48804"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/tags?post=48804"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}