Agentic AI outside coding
The level of competence, adaptiveness, and general capability that even qwen-code (known to be a pretty bad agent framework/app compared to the best ones) displays is genuinely insane. I can just give it a task, turn on a sandbox and YOLO mode, and it will run exploratory commands (find, grep, curl, ls, which, and so on) to figure out the structure of its environment and the data I'm giving it, and then, if necessary, write and execute shell scripts (or Python, installing dependencies as needed, and using curl to call REST APIs and jq to parse the JSON responses) to do whatever I ask. It basically completely automates my everyday terminal use for longer or more involved tasks.
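To make that concrete, here's a sketch of the kind of exploratory pass such an agent runs before touching anything. The demo/ directory and its contents are hypothetical, created up front here just so the commands have something to look at:

```shell
# Set up a stand-in for an unfamiliar working directory (hypothetical).
mkdir -p demo/sub
printf '{"status": "ok", "items": [1, 2, 3]}\n' > demo/data.json
printf 'print("hello")\n' > demo/sub/script.py

# The exploratory pass: what's here, where's the data, what tools exist?
ls demo                                # list the directory contents
find demo -maxdepth 2 -name '*.json'   # locate structured data files
grep -rn "print" demo                  # quick scan of the source files
command -v python3                     # check which tools are available
```

Nothing fancy on its own; the point is that the agent chains dozens of these read-only probes before it ever writes anything.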
And yes, AI hallucinates, it gets things wrong, it forgets things, it runs into things that are out of its training distribution. But that's where the fact that agentic AI is already neuro-symbolic AI comes in: it automatically self-corrects when it hits errors, basically never ends up permanently stuck, and is really good at checking itself and its assumptions by catting its output files and running LSPs, typecheckers, linters, compilers, and tests where available, then fixing any errors those reveal (especially if your tools emit error messages that say, in natural language, where and what the error is, and suggest fixes, which is good for humans anyway). It can also agentically consult documentation if you set that up, or read documentation you provide, so something being outside the training distribution isn't a problem as long as you can provide documentation for it to search, with examples for it to find patterns in and synthesize from, essentially turning a zero-shot problem into a few-shot one. This isn't just good for writing code for a project. This can just do things.
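The self-checking loop is simple to sketch. Here the "fix" is hard-coded for illustration, and script.py is a made-up example; in the real loop the model reads the error text itself and decides what edit to make:

```shell
# Write a deliberately broken script (stand-in for a model mistake).
printf 'print(1 +)\n' > script.py

# Run it; on failure, capture the error and apply a "fix".
if ! python3 script.py 2> err.txt; then
  cat err.txt                          # a SyntaxError with a line number
  printf 'print(1 + 1)\n' > script.py  # the "model" corrects the file
fi

# Re-run to verify the fix actually worked.
python3 script.py                      # prints 2
```

The error message is the whole feedback channel here, which is why tools with clear, localized, natural-language errors make agents (and humans) so much more effective.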
The key here really is that this is already neuro-symbolic, verified AI: the transformer model generates code that then procedurally, symbolically executes the details of the task and leaves behind reliable, deterministic artifacts for repeating it later, so you don't have to rely on the model's own stochastic execution; and the loop procedurally, reliably, symbolically verifies whether the task succeeded and feeds that result back into the model…
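A toy illustration of the deterministic-artifact point (the script name and log file are made up): once the model has written the script, rerunning it is pure symbolic execution, with no model and no stochasticity involved:

```shell
# The "artifact" the model leaves behind: a plain, rerunnable script.
cat > count_errors.sh <<'EOF'
#!/bin/sh
# Count lines starting with ERROR in the given log file.
grep -c '^ERROR' "$1"
EOF
chmod +x count_errors.sh

# Deterministic from here on: same input, same answer, every run.
printf 'ERROR one\nINFO two\nERROR three\n' > app.log
./count_errors.sh app.log    # prints 2
```

The stochastic part of the system happens once, at authoring time; everything downstream is ordinary, auditable software.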
At this point it um… does kind of feel like we have a little bit of AGI, given how adept these tools are at using bash and python, plus access to the internet, which is basically all you ever need to do anything that can be done on a computer. I'm just using it instead of the terminal sometimes at this point. I went from mocking people who did this to really seeing their point.
See also: Why does AI feel so different?, Power to the people: how LLMs flip the script on technology diffusion, and Claude Code is my computer.