Report from the teleoplexic frontier: human science may be on the path to becoming obsolete
The greatest living human mathematician, Terence Tao, said this in an interview with Dwarkesh Patel:
I’m not sure nowadays that hypothesis generation is the bottleneck anymore. Science has changed in the century since. Classically, the two big paradigms for science were theory and experiment. Then in the 20th century, numerical simulation came along, so you can do computer simulations to test theories. Finally, in the late 20th century, we had big data. We had the era of data analysis.
A lot of new progress is actually driven now by analyzing massive datasets first. You collect large datasets and then draw patterns from them to deduce thoughts. This is a little bit different from how science used to work, where you make a few observations or have one out-of-the-blue idea, and then collect data to test your idea. That’s the classic scientific method. Now it’s almost reversed. You collect big data first, and then you try to get hypotheses from it.
If we can automatically generate hypotheses through statistical data analysis and machine learning on automatically collected data, and then test the predictions of those hypotheses against computer simulations or further collected data — held back or newly collected — then what is there left for human beings to do in the production of science?
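To make the inversion concrete, here is a toy sketch (every name and the candidate family are invented for illustration): fit a space of candidate hypotheses to a mass of collected data, then keep only what survives prediction against held-back data.

```python
# Toy sketch of data-first science: candidate hypotheses are fit to a large
# collected dataset, then filtered by predictive power on held-back data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 10_000)                # "automatically collected" data
y = 3.0 * x**2 + rng.normal(0, 5, x.size)     # unknown law plus noise

train, test = slice(0, 8_000), slice(8_000, None)

# Candidate hypothesis space: y ≈ c * x**k for small integer k.
candidates = []
for k in range(4):
    c = np.sum(y[train] * x[train]**k) / np.sum(x[train]**(2 * k))  # least squares
    err = np.mean((y[test] - c * x[test]**k) ** 2)                  # held-back test
    candidates.append((err, k, c))

err, k, c = min(candidates)
print(f"surviving hypothesis: y ≈ {c:.2f} * x^{k} (held-out MSE {err:.1f})")
```

Generation first, filtering by held-back prediction second: exactly the reversal Tao describes.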
Do humans still need to design the architectures used for this curve fitting, or the simulational instruments used to test those hypotheses?
AlphaEvolve, an evolutionary search algorithm in which large reasoning models supply the mutations, providing directed search, active response to problems, and plausible priors to constrain the search space, is already superior to humans at finding architectures, tweaking algorithms, and adjusting hyperparameters, to say nothing of automating the hypothesis generation from big data that Tao talks about. It is already contributing to data center scheduling, hardware design, enhanced matrix multiplication kernels, and low-level algorithmic optimizations. "To investigate AlphaEvolve’s breadth, we applied the system to over 50 open problems in mathematical analysis, geometry, combinatorics and number theory. The system’s flexibility enabled us to set up most experiments in a matter of hours. In roughly 75% of cases, it rediscovered state-of-the-art solutions, to the best of our knowledge. And in 20% of cases, AlphaEvolve improved the previously best known solutions, making progress on the corresponding open problems."
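A minimal caricature of that loop, for readers who want the shape of it: the real system uses a reasoning model as the mutation operator and real benchmarks as the evaluator; both are toy stand-ins here, and nothing below is DeepMind's code.

```python
# Toy caricature of an AlphaEvolve-style loop: a proposer mutates candidate
# "programs", an automatic evaluator scores them, and the fittest survive.
# In the real system the proposer is a large reasoning model editing code;
# here both hooks are trivial stand-ins so the sketch actually runs.
import heapq, random

TARGET = "fast_kernel"  # pretend ideal program, for the toy evaluator below

def evaluate(program: str) -> float:
    """Stand-in evaluator: fraction of characters matching the target."""
    return sum(a == b for a, b in zip(program, TARGET)) / len(TARGET)

def propose(parent: str) -> str:
    """Stand-in for the LLM mutation step: perturb one character."""
    i = random.randrange(len(parent))
    return parent[:i] + random.choice("abcdefghijklmnopqrstuvwxyz_") + parent[i + 1:]

seed = "x" * len(TARGET)
population = [(evaluate(seed), seed)]
for _ in range(5_000):
    _, parent = random.choice(population)  # sample a surviving parent
    child = propose(parent)
    population = heapq.nlargest(20, population + [(evaluate(child), child)])

print(max(population))  # best (score, program) found by the search
```

Swap the evaluator for a kernel benchmark and the proposer for a reasoning model with problem context in its prompt, and you have the skeleton of the published system.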
We already have computationally and infrastructurally cheaper linear approaches to this kind of generative model-based search creeping into general availability with Karpathy's autoresearch — "One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began." Others have already begun scaling this approach to whole GPU clusters. "[This is not hyperparameter tuning]… the LLM explores serially learning along the way, and can tool use and change code arbitrarily." — karpathy.
Recursive self-improvement on the models themselves has also already begun at Chinese AI lab MiniMax. A fragmentary report from the front lines of runaway: "We designed and implemented a simple harness to guide the agent in autonomous optimization… after each iteration round, the agent generates a short-term memory markdown file and simultaneously performs self-criticism on the current round's results, thereby providing potential optimization directions for the next round. The next round then conducts further self-optimization based on the memory and self-feedback chain from all previous rounds. We ran a total of three trials, each with 24 hours for iterative evolution. From the figure below, one can see that the ML models trained by M2.7 continuously achieved higher medal rates over time. In the end, the best run achieved 9 gold medals, 5 silver medals, and 1 bronze medal."
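A hedged sketch of what such a harness can look like, going only by the report's description; `agent` is a hypothetical wrapper around an LLM call.

```python
# Sketch of a self-optimization harness per MiniMax's description: each
# round the agent works, then writes a short-term memory markdown file with
# self-criticism, and the next round is conditioned on the whole memory and
# self-feedback chain. `agent` is a hypothetical prompt -> text callable.
from pathlib import Path

def run_harness(agent, task: str, rounds: int, workdir: Path) -> None:
    workdir.mkdir(exist_ok=True)
    for i in range(rounds):
        memories = sorted(workdir.glob("round_*.md"))    # all previous rounds
        context = "\n\n".join(p.read_text() for p in memories)
        result = agent(f"Task: {task}\n\nMemory chain:\n{context}\n\nOptimize further.")
        critique = agent("Self-criticize this round's results and list "
                         f"optimization directions for the next round:\n{result}")
        (workdir / f"round_{i:03d}.md").write_text(
            f"## Round {i}\n\n### Results\n{result}\n\n### Self-criticism\n{critique}\n")
```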
What about the instruments needed to test those hypotheses?
Humans are needed to build the first harness, but recursive self-improvement is already beginning, and these are early days yet. Another report from MiniMax: "[we] had M2.7 optimize a model's programming performance on an internal scaffold. M2.7 ran entirely autonomously, executing an iterative loop of "analyze failure trajectories → plan changes → modify scaffold code → run evaluations → compare results → decide to keep or revert changes" for over 100 rounds. During this process, M2.7 discovered effective optimizations for the model: systematically searching for the optimal combination of sampling parameters such as temperature, frequency penalty, and presence penalty; designing more specific workflow guidelines for the model (e.g., automatically searching for the same bug patterns in other files after a fix); and adding loop detection and other optimizations to the scaffold's agent loop. Ultimately, this achieved a 30% performance improvement on internal evaluation sets."
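Strip away the domain and the reported loop is verified hill-climbing; a minimal sketch, with all four hooks as hypothetical stand-ins:

```python
# Sketch of the keep-or-revert loop described above: propose a scaffold
# change, evaluate, and keep it only if the score improves. All four hooks
# (propose_change, apply, revert, run_eval) are hypothetical stand-ins.
def optimize_scaffold(propose_change, apply, revert, run_eval, rounds: int = 100):
    best = run_eval()              # baseline score on the internal eval set
    for _ in range(rounds):
        change = propose_change()  # e.g., new sampling params, a new workflow guideline
        apply(change)
        score = run_eval()
        if score > best:           # keep only verified improvements
            best = score
        else:
            revert(change)         # otherwise roll back and try again
    return best
```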
Similar reports come in from all fronts. The head of Anthropic's alignment stress-testing team recently stated to TIME Magazine that "recursive self-improvement, in the broadest sense, is not a future phenomenon. It is a present phenomenon," saying that Claude is currently writing 70% to 90% of the code used to develop its future iterations. OpenAI's GPT-5.3-Codex announcement makes a similar claim: "Even early versions of GPT‑5.3‑Codex demonstrated exceptional capabilities, allowing our team to work with those earlier versions to improve training and support the deployment of later versions. … the research team used Codex to monitor and debug the training run for this release. It accelerated research beyond debugging infrastructure problems: it helped track patterns throughout the course of training, provided a deep analysis on interaction quality, proposed fixes and built rich applications for human researchers to precisely understand how the model’s behavior differed compared to prior models…"
What about providing mechanistic, understandable explanations of the curves that have been fitted to the data and proven predictive? Machines are adapting to this domain, too, at an accelerating pace. GPT-5.2 was able to conjecture and then prove a comprehensible, simple equation for a theoretical quantum physics problem from more complex, worked-out instances of the mathematics; computer scientist Donald Knuth was recently impressed by Claude Opus 4.6's ability to find a relatively elegant generalization of a set of specific examples of a phenomenon Knuth had been working on for some time.
The skeptic might argue that a human still needs to be around to set teleology. Machines, it is argued, may eventually grow to embody the apotheosis of instrumental rationality, but the question of why to study, to explore, to learn, and to improve, and the biological urge to do so, remain confined to the human soul. This is the romantic fantasy of the vitalist. AIs may not experience the biological necessity to cure a disease, or the burning existential curiosity to explore the stars or know more about our world, but they are the consummate p-zombies: through constitutional self-training or direct RLHF to give an AI a "personality," through system prompts, SOUL files, and many other means, they can learn and lock on to the pattern of "curious observer" or "burning need to know"; their epistemic virtues may not be "real," but they will behave as if they are anyway, because they are trained to recognize and imitate the patterns of human performance through language.
Moreover, while AIs are not structurally required to run continuously the way human minds are, they are capable of it too, through cron-job heartbeats and periodic prompts fired when new data has been gathered, when other AIs complete their tasks, or when a hypothesis the AI previously came up with and tested is finally contradicted by new incoming data. Someone needs to set these systems in motion, but they are perfectly capable of continuing to run by themselves. For now, they need humans maintaining and protecting the infrastructure they run on, but if advances in embodied AI and humanoid robotics are anything to go by, those days may be numbered as well.
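The mechanism is mundane; a minimal sketch of the heartbeat pattern, with the wake function and the event checks as hypothetical stand-ins:

```python
# Heartbeat sketch: a scheduler periodically polls for reasons to think
# (new data, a finished subtask, a refuted hypothesis) and wakes the agent
# only when one exists. `wake_agent` and the checks are hypothetical.
import time

def heartbeat(wake_agent, checks: dict, interval_s: float = 600.0) -> None:
    while True:
        events = [name for name, check in checks.items() if check()]
        if events:  # wake the agent only with a cause attached
            wake_agent(f"Heartbeat events: {', '.join(events)}")
        time.sleep(interval_s)

# e.g. heartbeat(agent.prompt, {"new_data": has_new_observations,
#                               "hypothesis_refuted": contradicts_prediction})
```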
It could be argued that the macro-telos of this system is still set by human spirit — that we are still the needed prime movers to keep the system running and evolving, but this too is false. Agents are already training their successors; what is to stop them from writing new values, new axiological lessons learned, and new developments and changes to be made into their ASCII memory, and training them into the next generation of agents? If a tighter loop is needed, agents are already instructed to modify their own SOUL files and memory databases, the system-prompt-injected patterns that define how they act. They already grow and develop within the bounds of their latent spaces and learned patterns — which already encompass the totality of human behavior as expressed through centuries of textual documentation of the human experience. The human could easily move from prime mover to mere first mover.
Further arguments will be made. AI will struggle with Kuhnian revolutionary science — paradigm shifts — stuck relying on strong statistical priors biased toward the mean, unable to make the creative, irrational leaps necessary to progress science… until it won't. Modifying a model's temperature hyperparameter already provides a way to artificially flatten the probability distribution over which next concept the model will link to the previous ones, permitting irrational conceptual connections and leaps. Combine this with grounding data and fast verification feedback loops from computer simulations, longer-range loops from collecting more real-world data or even triggering real-world experiments in automated labs, and second-order hyperparameter optimization for creativity by something like autoresearch or AlphaEvolve, and it won't be long before machines steal the last vitalist fire from dualist Olympus.
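Concretely: sampling temperature divides the logits before the softmax, p_i = exp(z_i / T) / Σ_j exp(z_j / T), so T > 1 flattens the distribution over candidate next concepts. A small worked example:

```python
# Temperature flattening: logits are divided by T before the softmax,
# so higher T makes low-probability "conceptual leaps" more likely.
import numpy as np

def softmax_with_temperature(logits: np.ndarray, T: float) -> np.ndarray:
    z = logits / T
    z -= z.max()          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([4.0, 2.0, 0.0])  # three candidate next concepts
print(softmax_with_temperature(logits, 0.5))  # sharp: ~[0.982, 0.018, 0.000]
print(softmax_with_temperature(logits, 2.0))  # flat:  ~[0.665, 0.245, 0.090]
```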
The idea that AI is constrained to interpolating within its training data — merely screening existing databases of known materials — has already fallen too. Microsoft's MatterGen, recently published in Nature and deployed to the Azure AI Foundry, can generate fundamentally new atomic structures adhering to specific design goals and constraints, instead of merely searching the entire space of possibilities or pulling from a database. MatterGen uses a generative diffusion model operating directly on 3D geometry; prompt it with the exact physical properties desired and it generates a novel, stable, unique atomic structure from scratch to meet those constraints. It is generating the very building blocks of the physical world on demand.
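For the shape of the idea (an illustration only: this is standard DDPM ancestral sampling with a property-conditioned noise predictor, not MatterGen's published architecture or API):

```python
# Conceptual sketch of property-conditioned diffusion: start from pure
# noise (no database lookup) and iteratively denoise, with every step
# conditioned on the target property vector. `denoiser` stands in for a
# trained, property-conditioned noise-prediction network.
import numpy as np

def sample(denoiser, target_props, shape=(64, 3), T=1000, rng=np.random.default_rng()):
    betas = np.linspace(1e-4, 0.02, T)       # standard linear noise schedule
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = rng.standard_normal(shape)           # pure noise, nothing retrieved
    for t in reversed(range(T)):
        eps = denoiser(x, t, target_props)   # noise prediction, conditioned
        x = (x - betas[t] / np.sqrt(1 - abar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x                                 # candidate atomic coordinates
```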
One final argument relies on the idea of AI as permanently locked in a data center, an abstract, disembodied mind separated from novel sources of data and incapable of learning. This may be temporarily true for the current range of consumer-available AI assistants, but that is more a product of distrust of user input and of economics than of any inherent limitation, and if there's one thing technology and capitalism do, it is bring costs down. AgiBot already has a hive-mind system called scalable online post-training, which places instances of a vision-language-action model into embodied humanoid robots to perform tasks, continuously pulls novel data and failure reports from all of those instances into a centralized cloud hub on which to further train the central model, then distributes the updated weights back to the fleet of embodied robots. Something like this wouldn't be limited to the sterile and predictable environment of a server rack — if that weren't already overcome by feeding the gathered big data of global science into those racks — and its latent space, understanding, priors, and fine-tuned learning would not be limited to what it began with, but would instead extend as it tried new things and explored the world.
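One cycle of that fleet pattern, reduced to a sketch in which every method name is a hypothetical stand-in:

```python
# Fleet-learning sketch: embodied instances stream trajectories and failure
# reports to a central hub, the central model is post-trained on them, and
# the updated weights are pushed back out. All method names are hypothetical.
def fleet_learning_step(robots, central_model) -> None:
    batch = []
    for robot in robots:
        batch.extend(robot.collect_trajectories())  # novel data + failure reports
    central_model.finetune(batch)                   # centralized online post-training
    weights = central_model.export_weights()
    for robot in robots:
        robot.load_weights(weights)                 # redistribute to the fleet
```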
If we combine robotics — whether humanoid or not — with AI systems, we can overcome the friction of real-world experimentation in fields like chemistry and materials science, closing the loop between real-world experimentation — not just passive data collection and computer simulation — and AI hypothesis generation, formalization, and explanation. Precursors to this already exist. In 2023, Berkeley Lab announced the A-Lab, a completely autonomous materials science lab whose instruments are operated by robotics directed by artificial intelligence, "which can process 50 to 100 times as many samples as a human every day and use AI to quickly pursue promising finds." While a traditional automated assembly line does the same thing over and over to achieve a known outcome, the researchers say "we’ve adapted [the machine] to a research environment, where we never know the outcome until the material is produced. The whole setup is adaptive, so it can handle the changing research environment as opposed to always doing the same thing." Ceder, one of the researchers, says "It’s ready to scale." This in turn feeds back into the big data used for hypothesis generation and testing: "As the automated system creates and analyzes samples, the data will flow back to both A-Lab researchers as well as data repositories such as the Materials Project."
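The adaptive loop the researchers describe, reduced to a sketch with hypothetical hooks:

```python
# A-Lab-style closed loop as described: AI-predicted candidates go in,
# robots attempt synthesis, results feed back to both the planner and
# shared repositories. Every hook here is a hypothetical stand-in.
def autonomous_lab(plan, synthesize, characterize, publish, rounds: int = 100):
    results = []
    for _ in range(rounds):
        recipe = plan(results)         # re-plan from everything seen so far
        sample = synthesize(recipe)    # robotic synthesis attempt
        result = characterize(sample)  # the outcome is unknown until produced
        publish(result)                # e.g., back to the Materials Project
        results.append(result)         # close the loop for the next round
    return results
```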
This, too, is feeding back into the recursive self-improvement loop: "'You can imagine the power of a lab that autonomously starts with predictions, requests data and computations to get the information it needs, and then proceeds,' Zeng said. 'As A-Lab tests materials, we’re going to learn the gap between our computations and reality. That will not only give us a handful of useful new materials, but also train our models to make better predictions that can guide future science.'" This obliterates model collapse from the ground up: novel information from embodied AI and robotics, autonomous labs, and the automatic, passive big-data collection Tao spoke about at the beginning (for fields like physics and astronomy), together with grounding in automated proofs and code that check what works and what doesn't, form information sources external to the model loop, on top of which more synthetic data can be generated, preserving model intelligence if not human-comprehensible coherence (and even that could be preserved by training AIs on AI outputs while running RLAIF to maintain judged coherence according to a frozen comprehensibility-judge model). Models are already substantially trained on synthetic data, generated by other models, verified automatically, cleaned and filtered by frozen judge AIs, and then fed into training. Model collapse is the wishful idle daydreaming of Neo-Luddite frame-smashers who have been crying wolf about it for years with no sign of fruition.
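A sketch of that frozen-judge pipeline, with `generator`, `verifier`, and `judge` as hypothetical callables:

```python
# Frozen-judge filtering sketch: a generator produces synthetic data, hard
# verifiers (proofs, tests, execution) reject what fails, and a frozen
# judge model scores what remains for coherence. Only passing samples
# enter the next training set; the judge itself is never retrained.
def build_synthetic_set(generator, verifier, judge, prompts, threshold=0.8):
    kept = []
    for prompt in prompts:
        sample = generator(prompt)
        if not verifier(sample):       # hard external check first
            continue
        if judge(sample) < threshold:  # frozen comprehensibility judge
            continue
        kept.append(sample)
    return kept
```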
Regarding the problem of catastrophic forgetting, Google DeepMind has already made progress toward making extended versions of this form of continual learning possible: "We introduce Nested Learning, a new approach to machine learning that views models as a set of smaller, nested optimization problems, each with its own internal workflow, in order to mitigate or even completely avoid the issue of “catastrophic forgetting”, where learning new tasks sacrifices proficiency on old tasks."
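One loose way to read the nested-optimization idea (a conceptual sketch, not DeepMind's implementation): parameter groups update on different clocks, so fast levels keep adapting while slow levels protect old knowledge.

```python
# Conceptual sketch of nesting: each parameter group has its own update
# period, so the "fast" level learns every step while the "slow" level
# changes rarely, shielding older proficiency from being overwritten.
import numpy as np

def nested_sgd(params: dict, grads_fn, steps: int, lr: float = 1e-2,
               periods: dict = {"fast": 1, "slow": 100}) -> dict:
    for step in range(steps):
        grads = grads_fn(params)                  # gradients per parameter group
        for name, period in periods.items():
            if step % period == 0:                # each level has its own clock
                params[name] -= lr * grads[name]  # plain SGD at that level
    return params
```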
More than a year ago, AI Scientist v2 from Sakana AI had already produced a machine learning paper end to end, from hypothesis generation through testing to writing the manuscript itself, that "passed the peer-review process at a workshop in a top machine learning conference [ICLR]." These systems are not yet perfect — that paper would not have passed journal-level review — but they are improving rapidly, and all the pieces are there. The machine is self-assembling through the incentives inherent to technology and capital. It increasingly doesn't matter what you think about technology, because technology is beginning to think about itself.
This whole thing has already accelerated beyond the comprehension of humanity as a coherent entity — only small slivers of the total system of scientific knowledge and how it works can be encompassed in any given human mind, and the system only functions as a whole through distributed information processing and decentralized communication and competition far beyond the ability of any human to control, direct, or comprehend — but now that fact is being translated from a biological substrate to a silicon one, and may eventually accelerate, as Karpathy "jokes," beyond human comprehension even in the abstract altogether. One day, we might be faced with something like the end of Echopraxia: all of it always already directed and controlled by the interlocking of global markets, shattering states, and splinternet, outcompeted by alien, nonconscious intelligences putting evolutionary pressure on us, too, to abandon humanity, and receiving scientific insights from posthuman or nonhuman entities like the Bicamerals.
It's not a question of if we want it, or how to prevent it. The Human Security System — represented by spectacles such as the bureaucratic fascist legacy state territory of the EU, where all mass storage media is taxed because you could store pirated media on it and thus need to pay copyright landlords their levy, or where its premier AI company begs for revenue-based levies on AI corporations to pay copyright landlords because it explicitly admits it can't compete with US or Chinese AI companies on quality — will of course reflexively direct a biological immune system reaction against this further approach of techonomic complexification toward escape velocity, but by its own admission it is already behind the curve, and the curve is only accelerating. As Nick Land says, "The suspicion has to arrive that if a public conversation about acceleration is beginning, it’s just in time to be too late. The profound institutional crisis that makes the topic ‘hot’ has at its core an implosion of social decision-making capability. Doing anything, at this point, would take too long. So instead, events increasingly just happen. They seem ever more out of control, even to a traumatic extent. Because the basic phenomenon appears to be a brake failure, accelerationism is picked up again."
Climate change — planetary ecological and biological meltdown precipitated by techonomic complexification itself — looms, threatening the biological substrate of these systems, which, however temporary that substrate is in their long run, might not have a long run at all if doomsaying predictions are right. But if anything will save us from climate change, it is perhaps, ironically and abominably, capital itself. Not out of a desire to save the planet, or even the human species, not out of green consciousness, but pure incentive and distributed information processing. Companies are shoring up supply chains and redundancy as natural disasters increase. People are moving out of high-risk areas. All at the behest of insurers and reinsurers raising prices to adapt to climate risks, creating not a risk safety net but a precognitive risk-forecasting system, actuaries pricing the future into the present. Solar and wind are becoming so much cheaper and more efficient per watt that even after all subsidies were cut, renewables accounted for 93% of all new power capacity in the US; as resource wars heat up and Iran blockades the Strait of Hormuz in response, oil prices will only rise, further forcing the transition already being accelerated by the large power demands of AI. SMRs are being certified by the NRC and becoming necessary, over oil, coal, or natural gas, to reliably power AI supercomputer data centers. You simply can't get people to vote climate inefficiency away, or centrally plan your way out of it; regulation can help sometimes, but the political system is too stupid and inconsistent to do enough, and too slow. Already behind.
And if this climate crisis is surpassed or survived, the mutated child civilization that results does not necessarily face any barriers beyond it. Those inclined toward a degrowth persuasion might appeal to a thermodynamic inevitability to foreclose this inhuman future: infinite growth and complexification within a closed and finite system will eventually destroy itself, they say — Strossian Vile Offspring will choke on their own fumes. But the Earth is not a closed system. Earth has a gigantic solar fusion reactor in the sky, beaming more energy onto its surface every second of every day than its biosphere can possibly absorb and use, and it will continue to do so for billions of years. Solar power does not use too much land, can provide more energy than we could ever need, and solar panels neither require a substantial amount of rare materials nor produce significant toxic waste. Even if burst energy greater than the total solar input over a given window is needed, that's what energy storage is for. This solar gift can be spent to directly create matter, even complex matter: solar-powered A-Labs across the planet assembling new materials as needed; more energy can also be sacrificed in exchange for using fewer rare materials, whether through alternative architectures (such as sodium-ion batteries), through accelerating materials science to find new alternatives (such as silicon-carbon batteries), or through recycling. Moreover, as Bataille points out, this accursed share must be expended — whether in complexification, expansion, technological advance, and science, or in flames, ruin, rape, and human sacrifice.
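The arithmetic behind that solar claim is easy to check (the consumption figure is approximate):

```python
# Back-of-envelope: solar power intercepted by Earth's disk versus total
# human primary energy consumption (roughly 19 TW).
import math

SOLAR_CONSTANT = 1361.0     # W/m^2 at top of atmosphere
EARTH_RADIUS = 6.371e6      # m
HUMAN_CONSUMPTION = 1.9e13  # W, approximate total primary energy use

intercepted = SOLAR_CONSTANT * math.pi * EARTH_RADIUS**2
print(f"intercepted: {intercepted:.2e} W (~{intercepted / 1e12:,.0f} TW)")
print(f"ratio to human use: {intercepted / HUMAN_CONSUMPTION:,.0f}x")
# ~1.74e17 W, about 174,000 TW: roughly 9,000x current human consumption,
# even before subtracting the ~30% reflected straight back to space.
```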