{"id":36927,"date":"2025-10-29T19:31:21","date_gmt":"2025-10-29T19:31:21","guid":{"rendered":"https:\/\/agooka.com\/news\/technologies\/researchers-warn-that-llms-can-get-brain-rot-too\/"},"modified":"2025-10-29T19:31:21","modified_gmt":"2025-10-29T19:31:21","slug":"researchers-warn-that-llms-can-get-brain-rot-too","status":"publish","type":"post","link":"https:\/\/agooka.com\/news\/technologies\/researchers-warn-that-llms-can-get-brain-rot-too\/","title":{"rendered":"Researchers warn that LLMs can get \u201cbrain rot\u201d too"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/dataconomy.com\/wp-content\/uploads\/2025\/10\/researchers-warn-that-llms-can-get-brain-rot-too.jpg\" alt=\"Researchers warn that LLMs can get \u201cbrain rot\u201d too\" title=\"Researchers warn that LLMs can get \u201cbrain rot\u201d too\"\/><\/p>\n<p>In a new preprint paper, researchers from <strong>Texas A&amp;M University, the University of Texas at Austin, and Purdue University<\/strong> have introduced a troubling new concept: the <strong>\u201cLLM Brain Rot Hypothesis.\u201d<\/strong> The study finds that continually pre-training large language models (LLMs) on \u201cjunk web text\u201d causes a lasting cognitive decline in their abilities. . This matters because it\u2019s not just a temporary glitch; the researchers found the damage is persistent, reframing the simple act of data curation as a critical, training-time safety problem for all future AI development.<\/p>\n<h2>How to give an AI \u2018brain rot\u2019<\/h2>\n<p>The term \u201cbrain rot\u201d was famously named Oxford\u2019s word of the year for 2024, describing the mental fog humans get from consuming too much trivial online content. The researchers set out to see if the same thing happens to AI. To do this, they ran a controlled experiment using a massive corpus of real Twitter\/X posts. They created two distinct datasets: a \u201cjunk\u201d dataset and a \u201ccontrol\u201d dataset.<\/p>\n<p>The \u201cjunk\u201d data was defined in two different ways:<\/p>\n<ul>\n<li><strong>M1 (Engagement Degree):<\/strong> This dataset was filled with short, highly popular posts (length &lt; 30 tokens, popularity &gt; 500). The researchers found this non-semantic metric\u2014popularity\u2014was a surprisingly powerful indicator of the brain rot effect, distinct from the text\u2019s actual meaning.<\/li>\n<li><strong>M2 (Semantic Quality):<\/strong> This dataset was filled with content that an AI (GPT-4o-mini) classified as low-quality, such as \u201cconspiracy theories, exaggerated claims, unsupported assertions or superficial lifestyle content.\u201d<\/li>\n<\/ul>\n<p>They then took four different LLMs (including Llama3 8B and Qwen2.5 7B) and continually trained them on these junk datasets, comparing their performance against models trained on the control data.<\/p>\n<h2>The cognitive decline is real<\/h2>\n<p>The results were immediate and significant. Models trained on the junk data showed a <strong>non-trivial cognitive decline (Hedges\u2019 g &gt; 0.3)<\/strong> across the board. The more \u201cjunk\u201d the models consumed, the worse they got, demonstrating a clear \u201cdose-response\u201d decay. For example, as the junk ratio of M1 data rose from 0% to 100%, one reasoning benchmark score <strong>plummeted from 74.9 to 57.2<\/strong>.<\/p>\n<p>The damage wasn\u2019t just in one area. 
## The cognitive decline is real

The results were immediate and significant. Models trained on the junk data showed a **non-trivial cognitive decline (Hedges' g > 0.3)** across the board. The more "junk" the models consumed, the worse they got, demonstrating a clear "dose-response" decay. For example, as the junk ratio of M1 data rose from 0% to 100%, one reasoning benchmark score **plummeted from 74.9 to 57.2**.
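For readers unfamiliar with that threshold, Hedges' g is a standard effect-size measure: the difference between two group means divided by their pooled standard deviation, with a small-sample bias correction. A minimal sketch of the computation follows; the benchmark scores are invented for illustration and are not the paper's data.

```python
import math

def hedges_g(a: list[float], b: list[float]) -> float:
    """Hedges' g: mean difference over pooled SD, with small-sample correction."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    d = (ma - mb) / pooled_sd                 # Cohen's d
    correction = 1 - 3 / (4 * (na + nb) - 9)  # Hedges' small-sample correction
    return d * correction

# Invented scores (control-trained vs. junk-trained), purely illustrative:
control_scores = [74.1, 75.3, 74.9, 76.0, 73.8]
junk_scores = [57.9, 58.4, 56.5, 57.2, 58.0]
print(hedges_g(control_scores, junk_scores))  # far above the paper's 0.3 bar
```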
The damage wasn't just in one area. The researchers found declines in:

- **Reasoning:** models lost their ability to solve complex problems.
- **Long-context understanding:** their ability to retrieve information from long documents collapsed.
- **Safety:** the models became less aligned with ethical norms.
- **Personality:** most disturbingly, the models developed "dark traits," showing a significant spike in **psychopathy and narcissism**.

When the researchers dug into *why* this was happening, they identified a primary failure mode they call **"thought-skipping."** The models would increasingly truncate or skip reasoning chains entirely. Instead of thinking step by step, they would jump straight to a (usually wrong) answer, mimicking the short, attention-grabbing, non-reflective style of the junk data they were fed.

## Can the rot be cured?

This is the most worrying part of the study: not really. The researchers tried two different ways to "heal" the brain-rotted models, and neither was fully successful.

1. **Training-free reflection:** they prompted the models to "reflect" on their mistakes and fix them. This failed; the models' "internalized cognitive decline" ran so deep that they could not even identify their own reasoning failures.
2. **Post-hoc tuning:** they tried to "wash out" the bad training by re-training the models on a massive amount of clean, high-quality instruction data. While this helped, it could not restore the models' original capabilities. Even after scaling the "clean" data to **4.8 times the amount of the junk data**, a large performance gap remained.

The findings provide powerful, causal evidence that data quality is a critical driver of AI capability and safety. The damage, once done, appears to be deeply internalized. This suggests that simply scraping the internet for ever-larger datasets is a dangerous path, and it motivates routine "cognitive health checks" for AI models, lest they, too, fall victim to the internet's junk food.

[**Featured image credit**](https://unsplash.com/photos/round-shaped-letters-illustration-cM4au7e5RAE)