MIT Report on Generative AI Deployments Criticized for Lack of Evidence

The "MIT report" which made a big splash this week, finding few generative AI deployments generate any returns, is deeply problematic. The incident is, frankly, a great example of confirmation bias. AI skeptics were looking for evidence to support their suspicions, and they thought they had found it. That's not what I see. I went to the MIT Project NANDA website to find the report, and instead of the file, there was a form for researchers to fill out their information. I did so, and received nothing. This is odd for academic research. I eventually found the attached document in another LinkedIn post. Unless this is a hoax, the report does not provide credible support for its conclusions. It was produced "in collaboration with Project NANDA out of MIT" (Acknowledgements, Section 8.1). The lead author, a research contributor to NANDA, is a scientist at Microsoft who teaches an online course for Stanford on generative AI. There's nothing wrong with academic centers bringing together outside experts, but it's unclear how involved MIT faculty and full-time researchers were with this work. The bigger issue is the content. The report states flatly, "Despite $30–40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return." It claims that is based on "interviews with representatives from 52 organizations, and survey responses from 153 senior leaders collected across four major industry conferences." Yet no information is provided on these organizations or representatives, or details of the survey. And then... there appears to be no further support for the 95% claim. I've read through the document multiple times, and I still can't understand where it comes from. There is a 5% number in Section 3.2, for "custom enterprise AI tools" being "successfully implemented." But that's much narrower. And successful deployment is defined as "causing a marked and sustained productivity and/or P&L impact." In other words, "unsuccessful" explicitly does not mean "zero returns." Most damning is the statement on p. 6: "Research Limitations: These figures are directionally accurate based on individual interviews rather than official company reporting." This makes no sense. 95% getting zero returns is a specific factual claim, not "directional." There is an interview question listed about return on investment from generative AI, but if 95% of respondents said their organization's ROI was zero, one would expect the report to state that. That still wouldn't be proof, but it would be a data point. The fact that no information is provided on the answers that question makes me deeply suspicious. My guess is those writing the report were doing their best, without fully appreciating academic research standards and how their conclusions would be taken. If MIT Project NANDA stands beheind the claims, it should release the full supporting data. If not, it should retract the report.

The fact that this report fails to demonstrate that most generative AI deployments fail does not, of course, mean they are successful. There are good reasons to wonder whether generative AI is creating returns sufficient to justify the massive level of investment. But it's not going to be a black-and-white matter. I suspect this report captured press and investor interest because it seemed to give such a definitive answer: 95% had zero returns. Reality is inevitably messier, especially given how recently generative AI was widely introduced in enterprises.

Thank you for doing the work to verify facts, Kevin. With so much disinformation and bias running rampant, the digging is appreciated. We’re living in the Wild West of AI info right now. What that means in the long run remains to be seen.

Shoot first, ask questions later… seems to be the new mantra, even in academic circles?

Agreed. I just finished the 26-page report to form my own opinion, and I share your skepticism. While the report is an interesting read, its conclusions are questionable due to a lack of methodological rigor and a clear bias toward promoting agentic AI (vs. general-purpose LLMs) without sufficient evidence. Specifically, I'm puzzled by the lack of precision in the methodology in Section 5 about "how the best builders succeed." The author repeatedly presents "agentic AI" as the key differentiator for successful GenAI projects, insisting it supplies critical missing capabilities like "memory" and "adaptability." The author's definition of "agentic AI" relies on technologies like MCP, A2A, and notably, the author's own solution, NANDA. This raises a real question about the report's neutrality, especially since the entire argument is backed by little more than a single quote from one interviewee. In reality, the technical distinction between LLMs and "agentic AI" is not a clear one, as models like ChatGPT, Gemini, and Claude are already agentic. This leaves me with some open questions about potential biases stemming from the author's own business interests. I'd be keen to hear other perspectives on this and discuss it further.

As an AI optimist, I don't find this surprising - adoption is very hard. Harder than most people expect or are prepared for, in part because it's unlike prior forms of technological adoption. It requires a more complete view of a business, processes, data, and people. It's unlikely to generate meaningful returns if you're just implementing it on the margins.

Kevin Werbach, thanks - same here. And also crickets back from NANDA.

The “95% GenAI failure” headline is déjà vu. Could not agree with Kevin Werbach more! The MIT NANDA “GenAI Divide” report makes waves by claiming 95% of deployments deliver zero returns. But dig deeper and the story unravels:
- Zero return = no P&L impact, ignoring documented productivity gains.
- A sample of 52 orgs + 153 execs at conferences is too thin for sweeping billion-dollar conclusions.
- The same report admits 90% of employees are already using shadow AI daily. That’s not “failure,” that’s grassroots adoption.

We’ve heard this before:
- ERP was “dead money” (20 yrs ago).
- Data Warehousing was “costly hype” (15 yrs ago).
- Big Data was “over” (10 yrs ago).
- Digital Transformation was “fatigue” (5 yrs ago).

Every time, the obituary was premature. Value emerged later — after organizations learned how to scale.

💡 My take: this isn’t failure, it’s the messy middle of adoption. The real divide is between those who adapt workflows for AI, and those who expect plug-and-play miracles. So let’s stop calling it “zero return” and start asking: who’s building systems that learn, adapt, and integrate — and who’s just chasing headlines?

Many thanks for sharing the actual report Kevin Werbach — I ran into the same issue trying to track it down myself!

Your analysis perfectly captures how confirmation bias can masquerade as research when we start with conclusions and work backward to find supporting evidence. Kevin Werbach, what if instead of asking whether AI generates returns, we focused on building systematic frameworks to measure and optimize the specific conditions under which AI actually delivers value?
