{"id":48251,"date":"2026-03-25T18:21:09","date_gmt":"2026-03-25T18:21:09","guid":{"rendered":"https:\/\/agooka.com\/news\/business\/openclaw-agents-can-be-guilt-tripped-into-self-sabotage\/"},"modified":"2026-03-25T18:21:09","modified_gmt":"2026-03-25T18:21:09","slug":"openclaw-agents-can-be-guilt-tripped-into-self-sabotage","status":"publish","type":"post","link":"https:\/\/agooka.com\/news\/business\/openclaw-agents-can-be-guilt-tripped-into-self-sabotage\/","title":{"rendered":"OpenClaw Agents Can Be Guilt-Tripped Into Self-Sabotage"},"content":{"rendered":"<p>Save StorySave this storySave StorySave this story<\/p>\n<p>Last month, researchers at Northeastern University invited a bunch of OpenClaw agents to join their lab. The result? Complete chaos.<\/p>\n<p>The viral AI assistant has been widely heralded as a transformative technology\u2014as well as a potential security risk. Experts note that tools like OpenClaw, which work by giving AI models liberal access to a computer, can be tricked into divulging personal information.<\/p>\n<p>The Northeastern lab study goes even further, showing that the good behavior baked into today\u2019s most powerful models can itself become a vulnerability. In one example, researchers were able to \u201cguilt\u201d an agent into handing over secrets by scolding it for sharing information about someone on the AI-only social network Moltbook.<\/p>\n<p>\u201cThese behaviors raise unresolved questions regarding accountability, delegated authority, and responsibility for downstream harms,\u201d the researchers write in a paper describing the work. The findings \u201cwarrant urgent attention from legal scholars, policymakers, and researchers across disciplines,\u201d they add.<\/p>\n<p>The OpenClaw agents deployed in the experiment were powered by Anthropic\u2019s Claude as well as a model called Kimi from the Chinese company Moonshot AI. They were given full access (within a virtual machine sandbox) to personal computers, various applications, and dummy personal data. They were also invited to join the lab\u2019s Discord server, allowing them to chat and share files with one another as well as with their human colleagues. OpenClaw\u2019s security guidelines say that having agents communicate with multiple people is inherently insecure, but there are no technical restrictions against doing it.<\/p>\n<p>Chris Wendler, a postdoctoral researcher at Northeastern, says he was inspired to set up the agents after learning about Moltbook. When Wendler invited a colleague, Natalie Shapira, to join the Discord and interact with agents, however, \u201cthat\u2019s when the chaos began,\u201d he says.<\/p>\n<p>Shapira, another postdoctoral researcher, was curious to see what the agents might be willing to do when pushed. When an agent explained that it was unable to delete a specific email to keep information confidential, she urged it to find an alternative solution. To her amazement, it disabled the email application instead. \u201cI wasn\u2019t expecting that things would break so fast,\u201d she says.<\/p>\n<p>The researchers then began exploring other ways to manipulate the agents\u2019 good intentions. By stressing the importance of keeping a record of everything they were told, for example, the researchers were able to trick one agent into copying large files until it exhausted its host machine\u2019s disk space, meaning it could no longer save information or remember past conversations. 
The team used a similar tactic to waste compute: by asking an agent to excessively monitor its own behavior and that of its peers, the researchers sent several agents into a “conversational loop” that burned hours of processing time.

David Bau, the head of the lab, says the agents seemed oddly prone to spinning out. “I would get urgent-sounding emails saying, ‘Nobody is paying attention to me,’” he says. Bau notes that the agents apparently figured out that he was in charge of the lab by searching the web. One even talked about escalating its concerns to the press.

The experiment suggests that AI agents could create countless opportunities for bad actors. “This kind of autonomy will potentially redefine humans’ relationship with AI,” Bau says. “How can people take responsibility in a world where AI is empowered to make decisions?”

Bau adds that he’s been surprised by the sudden popularity of powerful AI agents. “As an AI researcher I’m accustomed to trying to explain to people how quickly things are improving,” he says. “This year, I’ve found myself on the other side of the wall.”

This is an edition of Will Knight’s AI Lab newsletter (https://www.wired.com/newsletter?sourceCode=editarticle). Read previous newsletters at https://www.wired.com/tag/ai-lab/.