Karpathy remains one of the most grounded voices in the room amidst all the current AI hype. One of his biggest technical critiques is aimed at reinforcement learning, which he described as "sucking supervision through a straw": you do a long, complex task, and at the end you get a single bit of feedback, right or wrong, which you then use to upweight or downweight the entire trajectory. That's incredibly noisy and inefficient, and it suggests we need a paradigm shift toward something like process supervision. A human would never learn that way; we'd review our work, figure out which parts were good and which were bad, and learn in a much more nuanced way. We're starting to see papers try to address this, but it's a hard problem.
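To make the credit-assignment problem concrete, here's a toy sketch (my own illustration, not anything from the interview) contrasting the two regimes. The trajectory, step names, and quality flags are all invented for the example:

```python
# Toy trajectory: steps with per-step "quality" flags that outcome-based
# RL never gets to see -- only the final result is observable.
trajectory = [("step1", True), ("step2", False), ("step3", True)]

# Outcome supervision: one bit of reward for the whole trajectory.
# Every step receives identical credit, including the good ones.
outcome_reward = 1.0 if all(good for _, good in trajectory) else -1.0
outcome_updates = {step: outcome_reward for step, _ in trajectory}

# Process supervision: each step is graded individually,
# so credit assignment happens per step instead of per trajectory.
process_updates = {step: (1.0 if good else -1.0) for step, good in trajectory}

print(outcome_updates)  # all three steps downweighted because one step failed
print(process_updates)  # only the bad step is downweighted
```

The point of the sketch: under outcome supervision, the two correct steps get punished along with the wrong one, which is exactly the noise Karpathy is complaining about.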
He also pushed back on the idea that we're recreating evolution or building digital animals. Animals come from evolution, which bakes a huge amount of hardware and instinct directly into their DNA; a zebra can run minutes after it's born. Because we train on the static artifacts of human thought, in the form of internet text, rather than on biological survival imperatives, Karpathy argues we aren't building organisms. We're building something more akin to ghosts: fully digital, born from imitating the vast corpus of human data on the internet. It's a different kind of intelligence, starting from a different point in the space of possible minds.
This leads into his fairly conservative timeline on agents. In his view, the wild predictions about imminent AGI are largely fundraising hype. The path to capable AI agents is tractable, but it's going to take about a decade, not a single year. The agents we have today are still missing too much: they lack true continual learning, robust multimodality, and the general cognitive depth you'd need to hire one as a reliable intern. They just don't work well enough yet.
Drawing from his time leading Autopilot at Tesla, he views coding agents through the lens of the "march of nines." Just like self-driving, getting the demo to work is easy, but grinding out reliability to 99.9999% takes ten years. Right now, agents are basically just interns that lack the cognitive maturity to be left alone.
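The "march of nines" framing can be made concrete with a bit of arithmetic (my illustration of the idea, not his numbers): each additional nine of reliability is a 10x reduction in failure rate, so getting from a working demo to production-grade reliability is several orders of magnitude of grind, not a final polish step.

```python
# Each "nine" of reliability cuts the failure rate by 10x.
# Going from ~90% (a demo that mostly works) to 99.9999% is six
# orders of magnitude fewer failures -- roughly constant effort per
# nine, in Karpathy's framing.
for nines in range(1, 7):
    reliability = 1 - 10 ** -nines
    failures_per_million = 10 ** (6 - nines)
    print(f"{reliability:.6f} reliability -> {failures_per_million} failures per million runs")
```

Run it and the asymmetry is obvious: one nine is a demo, six nines is a product, and every row in between is roughly the same amount of work again.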
Finally, he offered some interesting thoughts on architecture and the future. He wants to move away from massive models that memorize the internet via lossy compression, advocating instead for a small, roughly 1-billion-parameter cognitive core that focuses purely on reasoning and looks up facts as needed. He sees AI as a continuation of the automation curve we've been on for centuries.
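One way to picture the "cognitive core plus lookup" idea is the sketch below. This is purely hypothetical: `FACT_STORE`, `lookup_fact`, and `cognitive_core` are names I've invented, and a real system would use a search index or retrieval model where this uses a dict.

```python
# Hypothetical sketch: a small "cognitive core" that reasons over facts
# it fetches on demand, instead of memorizing them in its weights.
FACT_STORE = {  # stands in for an external index / database
    "boiling point of water (C)": 100,
    "speed of light (m/s)": 299_792_458,
}

def lookup_fact(query: str):
    """Knowledge lives outside the core; the core just queries it."""
    return FACT_STORE.get(query)

def cognitive_core(question: str) -> str:
    # The core's job is reasoning and orchestration, not recall.
    fact = lookup_fact("boiling point of water (C)")
    if "boil" in question and fact is not None:
        return f"Water boils at {fact} C at sea level."
    return "I need to look that up."

print(cognitive_core("When does water boil?"))
```

The design point is the separation: a small model spends its parameters on reasoning, while facts stay in cheap external storage that can be updated without retraining.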
Karpathy definitely knows what he's talking about; his GPT-from-scratch video is pretty good. His explanation here really resonates with how current models can be so good at many difficult tasks while still struggling a lot with basic things. I think there need to be a few more breakthroughs before we can even discuss whether actual AGI is reachable.
Yup, and I think chasing the whole AGI thing is a bit misguided. What's going to happen in practice is that we'll just see a gradual increase in capabilities, where these tools can do more and more useful things.