Seriously? Using DARE scare tactics?
This study focuses on finding out the cognitive cost of using an LLM in the educational context of writing an essay.
We assigned participants to three groups: LLM group, Search Engine group, Brain-only group, where each participant used a designated tool … to write an essay. We conducted 3 sessions with the same group assignment for each participant. In the 4th session we asked LLM group participants to use no tools (we refer to them as LLM-to-Brain), and the Brain-only group participants were asked to use LLM (Brain-to-LLM). We recruited a total of 54 participants for Sessions 1, 2, 3, and 18 participants among them completed session 4. We used electroencephalography (EEG) to record participants' brain activity in order to assess their cognitive engagement and cognitive load, and to gain a deeper understanding of neural activations during the essay writing task. We performed NLP analysis, and we interviewed each participant after each session. We performed scoring with the help from the human teachers and an AI judge (a specially built AI agent).
We discovered a consistent homogeneity across the Named Entities Recognition (NERs), n-grams, ontology of topics within each group. EEG analysis presented robust evidence that LLM, Search Engine and Brain-only groups had significantly different neural connectivity patterns, reflecting divergent cognitive strategies. Brain connectivity systematically scaled down with the amount of external support: the Brain‑only group exhibited the strongest, widest‑ranging networks, Search Engine group showed intermediate engagement, and LLM assistance elicited the weakest overall coupling. In session 4, LLM-to-Brain participants showed weaker neural connectivity and under-engagement of alpha and beta networks; and the Brain-to-LLM participants demonstrated higher memory recall, and re‑engagement of widespread occipito-parietal and prefrontal nodes, likely supporting the visual processing, similar to the one frequently perceived in the Search Engine group. The reported ownership of LLM group's essays in the interviews was low. The Search Engine group had strong ownership, but lesser than the Brain-only group. The LLM group also fell behind in their ability to quote from the essays they wrote just minutes prior.
As the educational impact of LLM use only begins to settle with the general population, in this study we demonstrate the pressing matter of a likely decrease in learning skills based on the results of our study. The use of LLM had a measurable impact on participants, and while the benefits were initially apparent, as we demonstrated over the course of 4 months, the LLM group's participants performed worse than their counterparts in the Brain-only group at all levels: neural, linguistic, scoring.
This study has made the obligatory rounds through anti-AI spaces and echo chambers, with headlines smugly crowing, or fearmongering, or both, about the idea that ChatGPT "damages your brain," causes "brain rot," or one of the half a million variations thereof. This is what the paper's own FAQ has to say about that framing:
Is it safe to say that LLMs are, in essence, making us "dumber"?
No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", "passivity", "trimming" and so on. It does a huge disservice to this work, as we did not use this vocabulary in the paper, especially if you are a journalist reporting on it.
[…]
Additional vocabulary to avoid using when talking about the paper
In addition to the vocabulary from Question 1 in this FAQ - please avoid using "brain scans", "LLMs make you stop thinking", "impact negatively", "brain damage", "terrifying findings".
And this isn't just because they find these terms distasteful for reasons of academic decorum: there are real reasons why this framing, while tempting, is fundamentally wrong. What this study shows is not some kind of generalized, permanent cognitive decline from merely touching LLMs at all, but a subject-specific decrease in effort, ownership, and memory if you use LLMs to do all your thinking for you, and the possibility of skill atrophy if you overuse AI to execute that skill for you; in other words, rather obvious and non-novel conclusions that we've known about for other technologies forever. If you copy from Wikipedia, you'll know less about the topic, be able to quote from your essay less, and feel less ownership over it; if you use search engines, your ideas may be less unique and you'll be worse at recalling facts off the top of your head; if you use calculators or GPS, you'll get worse at mental math or navigation. Even Socrates complained about this, to use a cliché example.
So why is this study getting treated like this?
I think part of the reason is that people keep misreading it. The paper's language is ambiguous and sometimes unfortunate, from a layperson's perspective, in the way it phrases a lot of its findings. It's written for other scientists, who know what "neural connectivity" means in the context of an EEG — namely, that it's about activity, not physical neurological connections — and so it feels free to say things like "a decrease in neural connectivity"; likewise, it assumes the reader will understand that when it talks about remembering less, it means, just due to the scope of the experiment and the nature of the methodology, remembering less about the specific essay that was already written, not a general pattern of worse memory or memory loss. But for motivated laypeople and journalists looking for a story, this language is ambiguous and ripe for pulling out of context.
Let's skim through the paper and take a look at what it actually says, though, to counteract this narrative, and hopefully derive some practical guidelines on how to maintain one's cognitive skills while benefitting from AI, instead of just fearmongering that's closer to a DARE speech than anything useful.
Sessions 1-3
For sessions 1-3, all this study shows is that if you don't actually write something, or even come up with the ontology, terminology, arguments, and research for it (from diverse sources) yourself, but instead have ChatGPT generate it wholesale, the output is going to be more homogeneous, and you're going to remember less of it and have less of a sense of ownership over it. That's literally it:
After the participants finished the study, we used a local LLM to classify the interactions participants had with the LLM, most common requests were writing an essay, see the distribution of the classes in Figure 31 below. [If you look at Figure 31, it shows that 97 requests were for writing the essay, whereas the closest second-place request type, guidance and clarification, only showed up 43 times. The third highest, inquiry and discussion, was at 39. This very clearly shows that the way people ended up using ChatGPT was primarily in writing the essay for them, and much less in other capacities.]
Quoting accuracy was significantly different across experimental conditions (Figure 6). In the LLM‑assisted group, 83.3 % of participants (15/18) failed to provide a correct quotation, whereas only 11.1 % (2/18) in both the Search‑Engine and Brain‑Only groups encountered the same difficulty.
The response to this question was nuanced: LLM group either indicated full ownership of the essay for half of the participants (9/18), or no ownership at all (3/18), or "partial ownership of 90%" for 1/18, "50/50" for 1/18, and "70/30" for 1/18 participants.
The implication of this is not that you should never use large language models, but that you should be strategic in your usage of them: only use them for things you're okay with offloading, consequences included, such as less unique ideas, poorer memory of what you did and why, less learning, and less sense of ownership and agency. That's pretty obvious, and true of almost all forms of automation. It's also important to note that this was specifically the result of the participants primarily using the LLM to write sections of the essay for them, as well as using it with search citations turned off for ideation and information retrieval, not something like critique or copy editing, or as a web search copilot. Also note that with the ChatGPT interface, you don't get the affordances that you get with e.g. agentic coding, where you can collaboratively build an essay change by change with the AI, interrupting it, critiquing, iterating, engaging with each small decision it makes, going back and forth and collaborating on a document, which lets you maintain ownership over the project. Instead, it's much more of a batch process: you tell the AI what you want, and it spits out a huge output, which you copy-paste; and if you want to change something, it has to regenerate the entire output, which can cause all sorts of problems and so is emergently discouraged. (It should also be noted that originality and such isn't really the goal of coding, so a lot of these metrics don't really… apply to coding.)
This is an interesting quote:
In the LLM group, topic selection was mainly motivated by perceived engagement and personal resonance: four participants chose prompts they considered “the most fun to write about” (P1), while five selected questions they had “thought about a lot in the past” (P11). Two additional participants explicitly reported that they “want to challenge this prompt” or “disagree with this prompt”. Search Engine group balanced engagement (5/18) with relatability and familiarity (8/18), citing reasons such as “can relate the most”, “talked to many people about it and [am] familiar with this topic”, and “heard facts from a friend, which seemed interesting to write about”. By contrast, the Brain-only group predominantly emphasized prior experience alongside engagement, relatability, and familiarity, noting that the chosen prompt was “similar to an essay I wrote before”, “worked on a project with a similar topic”, or was related to a “participant I had the most experience with”. Experience emerged as the most frequently cited criteria for Brain-only group in Session 2, most likely reflecting their awareness that external reference materials were unavailable.
Despite the concerns the paper expresses later on about the enforced subject homogeneity and echo chambers introduced by search engines and LLMs, it seems to me that, on the basis of this quote, the Brain-only group faced the same issue; it's just that the biases and homogeneity were within-subject rather than between-subject. So the question becomes: do we want people to stay within individualized bubbles of ideas, or to engage with ideas outside their own heads in a way that might homogenize them toward "general society" but might also diversify their own ideas and subjects of interest?
The following quote is also really interesting:
Across all sessions, participants articulated convergent themes of efficiency, creativity, and ethics while revealing group‑specific trajectories in tool use. The LLM group initially employed ChatGPT for ancillary tasks, e.g. having it “summarize each prompt to help with choosing which one to do” (P48, Group 1), but grew increasingly skeptical: after three uses, one participant concluded that “ChatGPT is not worth it” for the assignment (P49), and another preferred “the Internet over ChatGPT to find sources and evidence as it is not reliable” (P13). Several users noted the effort required to “prompt ChatGPT”, with one imposing a word limit “so that it would be easier to control and handle” (P18); others acknowledged the system “helped refine my grammar, but it didn't add much to my creativity”, was “fine for structure… [yet] not worth using for generating ideas”, and “couldn't help me articulate my ideas the way I wanted” (Session 3). Time pressure occasionally drove continued use, “I went back to using ChatGPT because I didn't have enough time, but I feel guilty about it”, yet ethical discomfort persisted: P1 admitted it “feels like cheating”, a judgment echoed by P9, while three participants limited ChatGPT to translation, underscoring its ancillary role. In contrast, Group 2's pragmatic reliance on web search framed Google as “a good balance” for research and grammar, and participants highlighted integrating personal stories, “I tried to tie [the essay] with personal stories” (P12). Group 3, unaided by digital tools, emphasized autonomy and authenticity, noting that the essay “felt very personal because it was about my own experiences” (P50)
This indicates that LLM use or overuse is not inevitable, nor are people being fooled. It seems as though most of the LLM group participants quickly learned its limitations and figured out that it wasn't good for the task the study's structure forced them to use it for, and only resorted to it over their own thoughts and words (which they clearly would've preferred) because of artificial time limits! Again, this is far from the "ChatGPT is a cognitohazard that addicts you and rots your brain, that you will inevitably desire to use if presented with the option" framing that fearmongers and smug haters like to push on the back of this paper, which they clearly didn't read.
Session 4
Meanwhile, for session 4, participants were asked, crucially, not to write on brand new topics, but to write on previous topics, or even to iterate on previous essays:
Additionally, instead of offering a new set of three essay prompts for session 4, we offered participants a set of personalized prompts made out of the topics EACH participant already wrote about in sessions 1, 2, 3. […] This personalization took place for EACH participant who came for session 4.
Across all groups, participants strongly preferred continuity with their previous work when selecting essay topics. […] Overall, familiarity remained the principal motivation of topic choice.
Here we report how brain connectivity evolved over four sessions of an essay writing task in Sessions 1, 2, 3 for the Brain-only group and Session 4 for the LLM-to-Brain group. The results revealed clear frequency-specific patterns of change: lower-frequency bands (delta, theta, alpha) all showed a dramatic increase in connectivity from the first to second session, followed by either a plateau or decline in subsequent sessions, whereas the beta band showed a more linear increase peaking at the third session. These patterns likely reflect the cognitive adaptation and learning that occurred with repeated writing in our study.
[…]
The critical point of this discussion is Session 4, where participants wrote without any AI assistance after having previously used an LLM. Our findings show that Session 4's brain connectivity did not simply reset to a novice (Session 1) pattern, but it also did not reach the levels of a fully practiced Session 3 in most aspects. Instead, Session 4 tended to mirror somewhat of an intermediate state of network engagement.
One plausible explanation is that the LLM had previously provided suggestions and content, thereby reducing the cognitive load on the participants during those assisted sessions. When those same individuals wrote without AI (Session 4), they may have leaned on whatever they learned or retained from the AI, but because prior sessions did not require the significant engagement of executive control and language‑production networks, engagement we observed in Brain-only group (see Section “EEG Results: LLM Group vs Brain-only Group” for more details), the subsequent writing task elicited a reduced neural recruitment for content planning and generation.
This is crucial: yes, the neural activity of LLM-to-Brain subjects in session 4 was lower than the fully in-the-flow, fully activated neural activity of Brain-only participants in session 3 — but session 3 participants were all writing brand new essays, whereas session 4 participants were rewriting essays they'd already written. This isn't a comparable situation, and we'd expect even people who'd been Brain-only the entire time, doing session 4 Brain-only, to have lower neural connectivity than the in-the-flow participants of session 3 writing brand new essays. That's just what novelty does, at a neural level. Likewise, if the LLM-to-Brain participants had lower neural activation than the Brain-to-LLM group during session 4, that could well be because the Brain-to-LLM group was suddenly being knocked out of their flow and forced to use a brand new tool, and thus an essentially entirely different task, and couldn't rely on their past familiarity with the prompt, because now the LLM was doing the writing part that would have benefitted from that past knowledge.
It should also be noted that the neural activity of the LLM-to-Brain participants "didn't reset back to session 1 level", which indicates that you do, in fact, gain some experience over a complete novice just from using AI to do something — even if you later try to do it on your own. Just not as much as if you'd done it all yourself.
This quote, which nobody ever seems to mention, significantly complicates the picture of the LLM-to-Brain group as experiencing pure decline even within the specific prompts involved, and in fact may indicate that participants even benefitted to some degree from that past AI interaction:
On the other hand, Session 4's connectivity was not universally down, in certain bands, it remained relatively high and even comparable to Session 3. Notably, theta band connectivity in Session 4, while lower in total than Session 3, showed several specific connections where Session 4 was equal or exceeded Session 3 (e.g. many connections followed S2 > S4 > S3 > S1 pattern). Theta is often linked to semantic retrieval and creative ideation; the maintained theta interactions in Session 4 may reflect that these participants were still actively retrieving knowledge or ideas, possibly recalling content that AI had provided earlier. […] In a sense, the AI could have served as a learning aid, providing new information that the participants internalized and later accessed. The data hints at this: one major theta hub in all sessions was the frontocentral area FC5 (near premotor/cingulate regions), involved in language and executive function, which continued to receive strong inputs in Session 4. Therefore, even after AI exposure, participants engaged brain circuits for memory and planning. Similarly, the delta band in Session 4 remained as active as in Session 3, indicating that sustained attention and effort were present. This finding is somewhat encouraging: it suggests that having used AI did not make the participants completely disengaged or inattentive when they later wrote on their own. They were still concentrating, delta connectivity at Session 4 was ~45% higher than Session 1's and matched Session 3's level.
Of course, this is not to say that if you offload your thinking in every single subject to AI, there won't be a general effect — I definitely agree with the study that "over-reliance on AI can erode critical thinking and problem-solving skills: users might become good at using the tool but not at performing the task independently to the same standard. Our neurophysiological data provides the initial support for this process, showing concrete changes in brain connectivity that mirror that shift." But this is less like some kind of special "brain rot" syndrome, and more like how literally any tool, from Google Maps, to Google Search, to a calculator, functions — the key is not to belittle the tool as some kind of ultimate evil and avoid it at all costs, but to engage with it carefully and consciously.
Since I use writing to think, and use exploring my notes to synthesize my thoughts and the information I've collected, I never use LLMs to do either; likewise, since I treat quoting and summarizing the things I post to my mirrors page as an opportunity to re-engage with the material, to help me remember it and decide what's important and noteworthy about it, I don't use LLMs to do that either.
Likewise, as the study says:
Our results also caution that certain neural processes require active exercise. The under-engagement of alpha and beta networks in post-AI writing might imply that if a participant skips developing their own organizational strategies (because an AI provided them), those brain circuits might not strengthen as much. Thus, when the participant faces a task alone, they may underperform in those aspects. In line with this, recent research has emphasized the need to balance AI use with activities that build one's own cognitive abilities [3]. From a neuropsychological perspective, our findings underscore a similar message: the brain adapts to how we train it. If AI essentially performs the high-level planning, the brain will allocate less resources to those functions, as seen in the moderated alpha/beta connectivity in Session 4.
Which is why, when I do agentic coding, I make sure to do all the debugging myself, as well as all the design: the program's state machine, logic, architecture, data flow, UI/UX, and feature selection. I also choose what technologies to use and how and when to apply them. This way, all of the general skills that make me a good programmer stay strong, even if my knowledge of the "intellectual empty calories" of some specific command or framework isn't reinforced constantly; moreover, even for specific commands or frameworks, if I do want them to be part of my core competency, I make sure to practice them often instead of having the AI write them for me.
Another interesting thing here is that the Brain-to-LLM group, who first wrote their essays by themselves and then used LLMs to improve and iterate on their writing, actually showed much more unique perspectives, along with higher engagement, ownership, brain connectivity, and critical engagement with the LLM outputs:
- Better integration of content compared to previous Brain sessions (Brain-to-LLM). More information seeking prompts. Scored mostly above average across all groups. Split ownership.
- High memory recall. Low strategic integration. Higher directed connectivity across all frequency bands for Brain-to-LLM participants, compared to LLM-only Sessions 1, 2, 3.
This indicates that, if you're going to use LLMs to write for you, you should treat them as a copyeditor brought in only before a final draft, or as something to structure and clarify thoughts you came up with in the first place, instead of starting and finishing with LLM outputs and putting very little effort in between. Not that LLM usage should be totally verboten. As the study itself says:
Going forward, a balanced approach is advisable, one that might leverage AI for routine assistance but still challenges individuals to perform core cognitive operations themselves. […] It would be important to explore hybrid strategies in which AI handles routine aspects of writing composition, while core cognitive processes, idea generation, organization, and critical revision, remain user‑driven. During the early learning phases, full neural engagement seems to be essential for developing robust writing networks; by contrast, in later practice phases, selective AI support could reduce extraneous cognitive load and thereby enhance efficiency without undermining those established networks.
Across all frequency bands, Session 4 (Brain-to-LLM group) showed higher directed connectivity than LLM Group's sessions 1, 2, 3. This suggests that rewriting an essay using AI tools (after prior AI-free writing) engaged more extensive brain network interactions.
This correlation between neural connectivity and behavioral quoting failure in LLM group's participants offers evidence that:
- Early AI reliance may result in shallow encoding. LLM group's poor recall and incorrect quoting is a possible indicator that their earlier essays were not internally integrated, likely due to outsourced cognitive processing to the LLM.
- Withholding LLM tools during early stages might support memory formation. Brain-only group's stronger behavioral recall, supported by more robust EEG connectivity, suggests that initial unaided effort promoted durable memory traces, enabling more effective reactivation even when LLM tools were introduced later.
- Metacognitive engagement is higher in the Brain-to-LLM group. Brain-only group might have mentally compared their past unaided efforts with tool-generated suggestions (as supported by their comments during the interviews), engaging in self-reflection and elaborative rehearsal, a process linked to executive control and semantic integration, as seen in their EEG profile.
The significant gap in quoting accuracy between reassigned LLM and Brain-only groups was not merely a behavioral artifact; it is mirrored in the structure and strength of their neural connectivity. The LLM-to-Brain group's early dependence on LLM tools appeared to have impaired long-term semantic retention and contextual memory, limiting their ability to reconstruct content without assistance. In contrast, Brain-to-LLM participants could leverage tools more strategically, resulting in stronger performance and more cohesive neural signatures.
In conclusion, AI is not going to magically "rot your brain" independent of how and when and why you use it. What this study says is actually much more useful than that: if you use AI to perform a skill for you, you'll eventually lose practice with that skill, because the brain ruthlessly prunes those sorts of things; and if you use AI to do something for you in such a way that you're not intimately involved with the process, then you'll know and remember less about what you did. But the answer to that is to make sure you're always exercising your core cognitive competencies — whatever those are — yourself, instead of letting the machines think for you. Make sure it's always you doing the critical thinking, directing the shape of the process, and originating the ideas and expressions, and make sure you keep intimate track of what the AI is doing, treating it at best as a pair programmer, and most of the time as something like a copyeditor or information retrieval system.
Oh, and regarding information retrieval: one of the concerns this study raises is that, since ChatGPT gives a single, univocal answer to any given question, an answer that inherits the biases of its creators and trainers as well as the model's own emergent stochastic "opinions", this can lead to biased and less diverse information intake compared to the irreducibly multivocal experience of information retrieval with a search engine. In my opinion, this is where things like Perplexity come in: they let the AI actually fetch multiple opinions and then summarize all of them, giving you a high-level overview of all the sides in any given debate or argument. That gets you at multivocal information more reliably than a search results page, where, since you have to click on and read each view individually, you might be tempted to stop with the first one. More importantly, something like Perplexity automates the process of making many internet searches, presenting you with a long list of original sources to click on and explore, and citing them in its answer, so that its answer only serves as a high-level jumping-off point. I think this resolves that problem well enough.
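To make that workflow concrete, here's a rough Python sketch of the shape such a tool takes. To be clear, this isn't Perplexity's actual code; search_web and summarize are hypothetical stand-ins for whatever real search API and model you'd wire in. The point is the design: gather sources across several angles first, then summarize with citations, and hand the full source list back so you can still go read the originals.

```python
# A minimal sketch (not any real product's implementation) of the
# "multivocal retrieval" workflow: fan a question out into several
# searches, keep every source, and only then produce a cited overview.
# search_web() and summarize() are hypothetical stand-ins.

from dataclasses import dataclass


@dataclass
class Source:
    title: str
    url: str
    snippet: str


def search_web(query: str) -> list[Source]:
    # Stand-in: a real implementation would call a search API here.
    return [Source(title=f"Result for {query!r}", url="https://example.com", snippet="...")]


def summarize(question: str, sources: list[Source]) -> str:
    # Stand-in: a real implementation would prompt an LLM with the snippets
    # and ask for an overview that cites each source by number.
    citations = "\n".join(f"[{i}] {s.title} ({s.url})" for i, s in enumerate(sources, 1))
    return f"High-level overview of {question!r}, drawing on:\n{citations}"


def multivocal_answer(question: str, angles: list[str]) -> tuple[str, list[Source]]:
    """Run one search per angle, then summarize the pooled sources with citations,
    returning the sources themselves so the answer stays a jumping-off point."""
    sources: list[Source] = []
    for angle in angles:
        sources.extend(search_web(f"{question} {angle}"))
    return summarize(question, sources), sources


if __name__ == "__main__":
    answer, sources = multivocal_answer(
        "should homework be graded",
        ["arguments for", "arguments against", "empirical studies"],
    )
    print(answer)          # the cited overview
    print(len(sources))    # the original sources you can still click through
```

The property that matters is that the summary is generated from, and cites, the pooled sources gathered across deliberately different angles, rather than being the model's single univocal take.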