AI · June 26, 2025

Reacting to Andrej Karpathy’s talk, “Software Is Changing (Again)”

Andrej Karpathy’s appeared on June 17, 2025 at AI Startup School in San Francisco and delivered a very explanatory talk about three eras of software and he explained the nature of LLMs and their implications for the industry.

For those who doesn’t know him, he actually coined the term “vibe coding” and you can read about my thought about the vibe coding at the current state on this post.

Here’s a breakdown of the three different software era [01:42]:

Software 1.0: Traditional code written directly by humans, like C++.

Software 2.0: Neural networks, where the “code” is the trained weights of the network, and development involves tuning datasets and running optimizers. Karpathy also notes that at Tesla, Software 2.0 (neural networks) “ate through” the Software 1.0 (C++ code).

Software 3.0: Introduced with LLMs, where prompts in natural language (like English) become the programs or the code itself. This new paradigm means anyone who speaks a natural language can be a programmer.

Is “era” the right wording here?

I know that people like to talk about different “eras”.

..new era in science, new era in football, new era this and new era that..

The term is popular because it evokes a sense of significant change, a clear turning point, and often, a dramatic shift in how things are done. However, what truly constitutes an “era”?

An “era” often suggests a period defined by a dominant characteristic, a prevailing paradigm, or a foundational technology that fundamentally alters the landscape. Did AI altered the landscape? Hell yea!

The Bronze Age, the Industrial Revolution, the Information Age..

These were all eras marked by profound and irreversible transformations in human civilization.

I guess what I am trying to say that the distinction lies in whether the previous “era” becomes entirely obsolete. While Software 2.0 “ate through” parts of Software 1.0, traditional coding hasn’t vanished. People still code C++ and Java. And it’s unlikely Software 3.0 will entirely replace Software 1.0 or 2.0. Instead, it seems more probable that these different approaches will coexist and integrate, forming a more complex and maybe more layered software ecosystem. AI era is clearly a new age in human history, however currently perceived Software 3.0 may not be an “era” depending on whether LLMs have already reached their knee point.

“LLMs are like New Operating Systems”

Karpathy’s strongest analogy is that LLMs are like new operating systems, specifically comparing their current state to the 1960s era of computing where compute was expensive and centralized. They are complex software ecosystems, with closed-source providers (like Windows/Mac OS) and open-source alternatives (like the LLaMA ecosystem akin to Linux).

I understand the analogy and I liked it. I mean obviously, LLMs, at their core, are predictive text generators; they don’t directly control hardware, manage file systems, or enforce security policies in the way a kernel does. It relies on a traditional operating system to execute a command. He is not trying to say that this is a new OS, he just compares how OSs appeared in 60s.

I liked the Karpathy’s way of thinking on this one. Can we expect a similar course of development? Well, there is some positive development on native on device LLMs. Seems like he might be right after all.

Psychology of LLMs

  • LLMs are described as “stochastic simulations of people,” trained on vast amounts of text, giving them human-like psychology [14:49].
  • They possess encyclopedic knowledge and memory, superpowers [15:30].
  • They exhibit hallucination, jagged intelligence (superhuman in some areas, basic errors in others), and anterograde amnesia, meaning they don’t natively consolidate knowledge over time like humans. [16:07]
  • Context windows are likened to working memory that needs direct programming. LLMs are also gullible and susceptible to prompt injection risks [17:44].

All points are very valid and well informing. The reliance on probabilistic completion, rather than a genuine grasp of reality, means LLM “intelligence” is inherently unreliable in contexts requiring nuanced judgement, logical coherence, or real-world grounding.

A robot talking to a psychologist

The current reality is that people tend to bypass content they perceive as AI-generated. On platforms like YouTube and Instagram, I find myself instantly skipping AI-generated shorts or those with AI narrators, almost instinctively.

I’m not opposed to AI-generated content all the images on this blog (aside from the logo’s frog) were AI-created. However there’s just something inherently repulsive about fully AI-generated content.

Opportunities and Challenges in LLM Applications

  • Partial Autonomy Apps: He highlights features like context management, orchestration of multiple LLM calls, application-specific GUIs, and an “autonomy slider” that allows users to control the level of AI intervention [19:08]. Cursor and Perplexity are told as examples of successful LLM apps.
  • Speeding up Verification: It’s crucial to make the human-AI cooperation loop as fast as possible [22:19].
  • Keeping AI on a Leash: Karpathy stresses the importance of controlling the AI to avoid overwhelming users with large, unmanageable outputs (e.g., a 10,000-line code diff)
  • The “Vibe Coding” Phenomenon: The ability to program with natural language means that more people can now “vibe code” and create software without extensive traditional programming knowledge [29:34].
  • Building for Agents: He argues that software needs to be designed for interaction with AI agents, not just humans [33:40]. This includes making documentation LLM-friendly (e.g., Markdown instead of human-centric text, replacing “click” with API commands)

These are all valid points. I use Cursor every day, and I’m just starting with Perplexity. We should remember that these tools are very new. Karpathy himself says they are good for trying out ideas, but they aren’t ready for building production ready coding tools.

I’ve heard many people say it’s quicker to explain a problem to AI tools like ChatGPT or Groq and just copy the code from there, rather than using a tool like Cursor. I try both ways, and I hope these tools get much better. I’m just trying to learn how to make software this new way.

Keeping AI on a Leash

Keeping AI on a Leash is another great way of looking at this. You can’t leave the steering wheel yet otherwise AI is lost in the woods.

The concept of “building for agents” resonates strongly in our team discussions at work. Imagine if we could redesign the world’s road system for autonomous cars, self-driving might become a reality overnight. The challenge lies in our historical approach: we’ve treated LLMs as mere APIs for our human systems. Yet, for true agentic functionality, the paradigm must shift!

We might need to become the APIs for them.

This raises a silly question. If we anatomically crafted tools to enable monkeys to complete tasks typically performed by humans, would that make the monkeys truly intelligent?

As I argued in my previous post, AI is, and will remain, a game-changer. However, if we narrowly focus on the current generation of LLMs for every task, we will undoubtedly face frustration. Consider the Stockfish chess engine: it is a perfect tool for its specific purpose, nicely designed for optimal chess play. Yet, I observe people attempting to play chess with LLMs, despite their obvious unsuitability for that specific task.

This tells us something important: you can’t use one tool for every problem. It’s our job to pick the right tool for the right job. Don’t try to force a tool to do something it’s not good at. To truly master this new way of building software, we need to understand the strengths and weaknesses of these new tools.

Conclusion

Overall, I really liked Karpathy’s talk. But honestly, it feels like every AI talk these days is still trying to sell us something. Come on, we’re already bought in! We don’t need more hype. Tell us where we really are. What are the current limits of LLMs? What are the big problems and challenges we’re facing now? Have we hit a wall, or are there still big leaps to come? What new AI ideas are emerging beyond just LLMs? Let’s talk about the real path forward, not just keep the hype going.