If you follow AI even loosely, you’ve probably bumped into the name Andrej Karpathy. He’s the researcher who helped make deep learning feel less like a secret club and more like a craft you can learn. He was born in Bratislava and moved to Toronto as a teenager, did his undergraduate work at the University of Toronto and a master’s at UBC, then earned a PhD at Stanford under Fei-Fei Li. His thesis focused on connecting images and language, a theme that aged well as vision-language models took off.
That academic chapter wasn’t just papers and posters. He co-created and taught CS231n, the Stanford course that became a rite of passage for a generation of engineers who later built today’s AI stack. The course notes, videos, and problem sets still circulate like a well-thumbed playbook. If you’ve ever implemented a softmax from scratch at 2 a.m., there’s a decent chance CS231n was in your browser tabs.
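If you want to feel that 2 a.m. energy without the all-nighter, the core of a from-scratch softmax is only a few lines. A minimal NumPy sketch in the spirit of those assignments (not the course’s official solution):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability: exp() overflows fast,
    # and shifting the logits leaves the final probabilities unchanged.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```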
From lab notes to OpenAI v1.0
Right after Stanford, Karpathy joined OpenAI in 2015 as part of its founding research team. That first wave was a small crew with an outsized footprint. He worked on deep learning across vision and language while the organization was still figuring out what “OpenAI” would become.
His writing from around that period helped the broader community make sense of fast-moving ideas. “The Unreasonable Effectiveness of Recurrent Neural Networks” wasn’t just a viral read; it shaped how many people learned about sequence modeling before transformers stole the stage. And yes, it spelled out the thrill and the limitations of character-level models in plain English. That accessibility became a pattern.
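To see both the thrill and the limitation in miniature, you don’t even need a neural net. Here’s a toy character-level bigram sampler, my illustration rather than anything from the post, that predicts each character purely from the one before it:

```python
from collections import Counter, defaultdict
import random

text = "hello world, hello neural networks"

# Count how often each character follows each other character.
# Wrapping around to the first character ensures every character
# seen in the text has at least one possible successor.
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:] + text[:1]):
    counts[prev][nxt] += 1

def sample_next(ch):
    # Sample the next character in proportion to observed counts.
    chars, weights = zip(*counts[ch].items())
    return random.choices(chars, weights=weights)[0]

ch, out = "h", "h"
for _ in range(25):
    ch = sample_next(ch)
    out += ch
print(out)  # locally plausible, globally incoherent: the limitation
```

Swap the bigram table for an RNN and you have the post’s premise: the samples get eerily better, and the failure modes get subtler.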
The Tesla years: cameras, code, and a big bet
In 2017, Karpathy left OpenAI to lead Autopilot vision at Tesla. That put him at the center of one of the most contentious tech debates: can camera-only systems, powered by giant neural nets, eventually drive us safely? Publicly, he explained Tesla’s approach at CVPR and at AI Day, showing how the company turned fleets of cars into data machines and how the networks mapped pixels to a top-down view of the road. Even critics admitted the engineering depth on display.
Tesla later removed radar and ultrasonic sensors from some vehicles and leaned harder on vision. Karpathy discussed the rationale in talks and interviews: a simpler, camera-centric stack that learns from real-world data. The choice drew loud praise from some corners and sharp criticism from others, and that’s putting it mildly. Still, whether you cheer or jeer that call, it marked a defining moment in how AI systems interface with messy physical reality.
In 2022, after a sabbatical, he announced he was leaving Tesla. The farewell note was gracious and short on drama, as you’d expect. Newsrooms read tea leaves; the internet did its internet thing. Either way, his run at Tesla shaped how many practitioners think about large-scale data engines, labeling, and the long grind from demos to dependable behavior.
Here’s the thing: Karpathy doesn’t just build; he teaches. He kept posting materials that make tough concepts feel handleable: Software 2.0 on Medium, the highly bookmarked “Recipe for Training Neural Networks,” and later a hands-on YouTube series where he codes through tiny engines like micrograd, projects that grow step by step into a GPT (makemore, minGPT, nanoGPT), and a bare-metal llm.c. If you’ve ever pulled one of those repos to “see what’s actually going on,” you’re in good company.
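Part of why micrograd clicks for people is that the whole trick, a scalar that remembers its history and replays the chain rule backwards, fits on one screen. A condensed sketch of the idea (not the repo’s actual code):

```python
class Value:
    """A scalar that remembers how it was computed, micrograd-style."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # propagates self.grad into children

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad          # d(a+b)/da = 1
            other.grad += out.grad         # d(a+b)/db = 1
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad   # d(ab)/da = b
            other.grad += self.data * out.grad   # d(ab)/db = a
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    visit(c)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

a, b = Value(2.0), Value(3.0)
loss = a * b + a        # loss = ab + a
loss.backward()
print(a.grad, b.grad)   # 4.0 (= b + 1) and 2.0 (= a)
```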
What I appreciate about those projects—speaking as a fellow explainer—is the tone. They don’t pretend the path is smooth. They show the knobs, the dead ends, the “okay that blew up, let’s fix it” attitude. It’s human. And it’s rare.
Back to OpenAI… and out again
In early 2023, Karpathy said he was rejoining OpenAI. It fit the moment. Large language models were making startling leaps in capability, and the lab was shipping at a pace that kept the rest of us refreshing timelines. A year later, he announced he’d left again, adding that there was “no drama.” That resonated with anyone who has stepped away from a great gig because a different itch needed scratching.
Around that time, TIME included him in the TIME100 AI list, noting both his research and his influence as a teacher. Say what you will about lists, but it captured something real: a lot of people learned deep learning through his materials, then took those skills to startups and labs worldwide.
Eureka Labs: an AI-native school, not just a course site
If you were waiting for the “what’s next,” it came in mid-2024. Karpathy launched Eureka Labs, an education company that weaves AI tutors into the learning process. The first course, LLM101n, aims to help students train their own small models. It’s not hand-wavy content marketing; it’s exercises, code, and a “you can build this” vibe. Think of it as CS231n for the LLM era, but hosted on a platform that treats AI as part of the classroom.
People will debate digital tutors—privacy concerns, loss of human connection, all fair points. But as a direction, it lines up with his long streak of “open the hood” teaching. And if you step back, it’s also a bet on how the next million developers will learn: with a console, a notebook, and an AI helper that never sleeps.
The “vibe coding” spark
Early 2025 brought a phrase that made engineers roll their eyes and nod at the same time: “vibe coding.” Karpathy used it to describe building software by talking to AI agents, steering them toward intent rather than typing every line yourself. The term took on a life of its own—defenders see it as a new workflow, skeptics as a squishy label for prompt-glued spaghetti. Even Andrew Ng weighed in, calling the name unfortunate but the work real and, frankly, exhausting. That’s honest.
The discussion spilled into the mainstream. Business leaders debated how far this style can go and what it means for tools we rely on every day. Regardless of the label, there’s a shift here: we’re moving from programming every branch ourselves to specifying behavior and auditing what models assembled. It’s a different mental model of software.
Karpathy being Karpathy, he didn’t stop at talk. He shipped nanochat, a compact “ChatGPT-style” training pipeline, and folded it into Eureka Labs as a capstone. That’s a pattern worth noticing: coin a concept, then give people a repo they can run and critique.
The teacher’s toolkit keeps expanding
If you’re trying to learn or teach AI, his ecosystem is unusually complete.
- Videos: “Let’s build GPT from scratch” and “The spelled-out intro to neural networks” are approachable without being cutesy. You can follow along line by line. (YouTube)
- Repos: micrograd for backprop from first principles, makemore for character models up to a transformer, minGPT/nanoGPT for transformer training in clean PyTorch, and llm.c when you want to feel close to the metal. Use them as reference, or as scaffolding for your own experiments. (GitHub)
This isn’t just content. It’s a philosophy: learn by building and by reading code that fits on your screen.
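In that spirit, here’s roughly the kind of thing you meet inside minGPT or nanoGPT: a single causal self-attention head in plain PyTorch. A simplified sketch for orientation, not code lifted from either repo:

```python
import torch
import torch.nn.functional as F

def attention_head(x, w_q, w_k, w_v):
    """One causal self-attention head. x: (T, C) token embeddings."""
    T, _ = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # queries, keys, values
    scores = q @ k.T / k.shape[-1] ** 0.5         # scaled dot-product affinities
    mask = torch.tril(torch.ones(T, T)).bool()    # causal: no peeking ahead
    scores = scores.masked_fill(~mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)           # each row sums to 1
    return weights @ v                            # weighted mix of values

T, C, H = 8, 32, 16                               # tokens, embed dim, head dim
x = torch.randn(T, C)
w_q, w_k, w_v = (torch.randn(C, H) * 0.02 for _ in range(3))
print(attention_head(x, w_q, w_k, w_v).shape)     # torch.Size([8, 16])
```

Everything else in those repos, the batching, the multiple heads, the MLP, the optimizer loop, is scaffolding around this one move.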
Before the PhD, before OpenAI, there was badmephisto—Karpathy’s YouTube channel with Rubik’s Cube tutorials that helped a generation of speedcubers. If that seems random, look again. The videos are procedural, patient, and strangely soothing. They teach hard things with simple steps and a friendly voice. The through-line from cubes to convnets is clearer than it first appears.
Where he is now—and why it matters
As of this year, his energy is pointed at Eureka Labs, at courseware like LLM101n, and at the “vibe coding” discourse that has everyone from founders to junior devs asking new questions about how we build. He also keeps nudging caution in public talks: LLMs are powerful and also weird, and you should keep them “on the leash.” That tension—ambition paired with restraint—feels right for the moment.
So what’s the big picture? Karpathy’s story tracks the arc of modern AI. Academic breakthroughs. A new lab with big aspirations. A bold product bet at Tesla. A wave back to research. An education startup aimed at scaling skill, not just hype. It’s the same curiosity moving through different rooms.
You don’t need to agree with every call he’s made to learn from the method. Start small, make the code readable, measure everything, and explain your work as if your future teammate is you, six months from now. When the conversation turns to “vibes,” pair them with tests. If your model writes code, make sure your CI has a spine.
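Concretely, that can mean nothing fancier than tests the agent’s output has to survive before merge. A minimal pytest sketch, where parse_duration and mymodule are hypothetical stand-ins for whatever the model wrote:

```python
import pytest

# Hypothetical: an AI agent generated parse_duration("1h30m") -> seconds.
# The tests, not the vibes, decide whether it ships.
from mymodule import parse_duration  # hypothetical module name

@pytest.mark.parametrize("text,expected", [
    ("90s", 90),
    ("1h30m", 5400),
    ("0s", 0),
])
def test_parses_common_formats(text, expected):
    assert parse_duration(text) == expected

def test_rejects_garbage():
    with pytest.raises(ValueError):
        parse_duration("soon")
```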
And yes, let yourself be a beginner at something again. That may be the quiet thesis under all of Karpathy’s work: ship things people can learn from. The rest tends to follow.