Ilya Sutskever (Co-founder and Chief Scientist) 00:00.060
I think there will definitely be diminishing returns, because you want people who think differently rather than the same. If they were literal copies of me, I'm not sure how much more incremental value you'd get.
Ilya Sutskever (Co-founder and Chief Scientist) 00:21.700
But people who think differently, that's what you want.
Dwarkesh Patel (Host) 00:22.060
Why is it that, if you look at different models, even ones released by totally different companies and trained on potentially non-overlapping data sets, it's actually crazy how similar LLMs are to each other?
Ilya Sutskever (Co-founder and Chief Scientist) 00:35.100
Maybe the data sets are not as non-overlapping as they seem.
Dwarkesh Patel (Host) 00:38.380
But there's some sense that, even if an individual human might be less productive than a future AI, maybe there's something to the fact that human teams have more diversity than teams of AIs would. How do we elicit meaningful diversity among AIs?
Dwarkesh Patel (Host) 00:52.260
I think just raising the temperature results in gibberish. You want something more like the way different scientists have different prejudices or different ideas. How do you get that kind of diversity among AI agents?
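To make the temperature point concrete: in softmax sampling, logits are divided by a temperature T before normalizing, and as T grows the distribution flattens toward uniform, so high temperature produces noise rather than the structured disagreement of different scientists. A minimal sketch (illustrative only; the toy logits are invented):

```python
import numpy as np

def sample_with_temperature(logits: np.ndarray, temperature: float,
                            rng: np.random.Generator | None = None) -> int:
    """Sample a token index from logits after temperature scaling."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature           # T > 1 flattens, T < 1 sharpens
    probs = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Toy logits: the model strongly prefers token 0.
logits = np.array([5.0, 2.0, 0.5, 0.1])
for T in (0.5, 1.0, 10.0):
    scaled = logits / T
    p = np.exp(scaled - scaled.max())
    p /= p.sum()
    print(f"T={T:>4}: {np.round(p, 3)}")
# At T=10 the distribution is nearly uniform: "diversity" from raising
# temperature is just noise, i.e. gibberish at the level of whole samples.
```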
Ilya Sutskever (Co-founder and Chief Scientist) 01:04.340
So the reason there has been no diversity, I believe, is pre-training. All the pre-trained models are pretty much the same, because they're pre-trained on the same data. Now, RL and post-training is where some differentiation starts to emerge, because different people come up with
Ilya Sutskever (Co-founder and Chief Scientist) 01:22.220
different RL training. Yeah.
Dwarkesh Patel (Host) 01:24.820
And I've heard you hint in the past about self-play as a way to either get data or match agents against other agents of equivalent intelligence to kick off learning. How should we think about why there are no public proposals for this kind of approach working with LLMs?
Ilya Sutskever (Co-founder and Chief Scientist) 01:46.060
There are two things to say. The reason I thought self-play was interesting is that it offered a way to create models using compute only, without data. Right? And if you think that data is the ultimate bottleneck, then using compute only is very
Ilya Sutskever (Co-founder and Chief Scientist) 02:03.980
interesting. That's what makes it interesting. Now, the thing is that self-play, at least the way it was done in the past, where you have agents that somehow compete with each other, is only good for developing a certain set of skills. It is too narrow. It's only
Ilya Sutskever (Co-founder and Chief Scientist) 02:25.060
good for things like negotiation, conflict, certain social skills, strategizing, that kind of stuff. So if you care about those skills, then self-play will be useful. Now, I actually think that self-play did find a home, but in a different form. Things
Ilya Sutskever (Co-founder and Chief Scientist) 02:47.460
like debate, or prover-verifier: you have some kind of LLM as a judge which is also incentivized to find mistakes in your work. You could say this is not exactly self-play, but it is a related adversarial setup that people are doing, I believe. And really,
Ilya Sutskever (Co-founder and Chief Scientist) 03:03.860
self-play is a special case of more general competition between agents. Right? The natural response to competition is to try to be different. So if you were to put multiple agents together and tell them, you all need to
Ilya Sutskever (Co-founder and Chief Scientist) 03:19.700
work on some problem, and you are an agent inspecting what everyone else is working on, you're going to say, "Well, if they're already taking this approach, it's not clear I should pursue it. I should pursue something different." And so I think that something like this
Ilya Sutskever (Co-founder and Chief Scientist) 03:34.060
could also create an incentive for a diversity of approaches.
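A toy sketch of that incentive (entirely invented for illustration; the approach names and scoring function are hypothetical, not anything described in the conversation): agents that can inspect what others are already working on, and are penalized for crowding, spread across approaches even when they start from identical preferences, much like identical pre-trained models differentiated by their incentives.

```python
# Hypothetical toy model: agents with identical priors diversify when they can
# see each other's choices and competition penalizes crowded approaches.
from collections import Counter

APPROACHES = ["scale_pretraining", "rl_post_training", "debate", "prover_verifier"]

def pick_approach(prior: dict[str, float], others: list[str]) -> str:
    """Choose the approach maximizing (own preference - crowding penalty)."""
    taken = Counter(others)
    return max(APPROACHES, key=lambda a: prior[a] - 1.0 * taken[a])

# Every agent starts with the *same* preferences, like identical base models.
prior = {"scale_pretraining": 1.0, "rl_post_training": 0.8,
         "debate": 0.6, "prover_verifier": 0.5}

choices: list[str] = []
for _ in range(4):
    choices.append(pick_approach(prior, choices))

print(choices)
# ['scale_pretraining', 'rl_post_training', 'debate', 'prover_verifier']
# Remove the crowding penalty and all four agents pick 'scale_pretraining'.
```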
Dwarkesh Patel (Host) 03:38.100
Yeah. Final question: what is research taste? You're obviously considered the person in the world with the best taste in AI research. You were the co-author on many of the biggest things that have happened in the history of deep
Dwarkesh Patel (Host) 04:01.540
learning, from AlexNet to GPT-3 and so on. How do you characterize how you come up with these ideas?
Ilya Sutskever (Co-founder and Chief Scientist) 04:10.100
I can comment on this for myself; I think different people do it differently. But one thing that guides me personally is an aesthetic of how AI should be, arrived at by thinking about how people are. But thinking about it correctly. It's very easy to think about how people
Ilya Sutskever (Co-founder and Chief Scientist) 04:34.140
are incorrectly. So what does it mean to think about people correctly? I'll give you some examples. The idea of the artificial neuron is directly inspired by the brain. And it's a great idea. Why? Because you say, "Sure, the brain has all these different organs, it has the
Ilya Sutskever (Co-founder and Chief Scientist) 04:50.880
folds, but the folds probably don't matter." Why do we think the neurons matter? Because there are many of them. That kind of feels right, so you want the neuron. Then you want some kind of local learning
Ilya Sutskever (Co-founder and Chief Scientist) 05:04.720
rule that will change the connections between the neurons. Right? It feels plausible that the brain does it. Or the idea of the distributed representation. Or the idea that the brain responds to experience, so a neural network should learn from experience:
Ilya Sutskever (Co-founder and Chief Scientist) 05:20.720
the brain learns from experience, so the neural network learns from experience. And you ask yourself, is something fundamental or not fundamental? Is it how things should be? I think that's been guiding me a fair bit: thinking from multiple angles and looking for
Ilya Sutskever (Co-founder and Chief Scientist) 05:37.400
beauty. Beauty, simplicity, elegance, correct inspiration from the brain; there's no room for ugliness. All of those things need to be present at the same time, and the more they are present, the more confident you can be in a
Ilya Sutskever (Co-founder and Chief Scientist) 05:53.400
top-down belief. And a top-down belief is the thing that sustains you when the experiments contradict you. Because if you just trust the data all the time, well, sometimes you can be doing the correct thing, but there's a bug. And you don't know that there is a bug. How can you
Ilya Sutskever (Co-founder and Chief Scientist) 06:07.920
tell that there is a bug? How do you know if you should keep debugging or conclude it's the wrong direction? Well, it's the top-down belief: you can say things have to be this way, something like this has to work, therefore we've got to keep going. That's the
Ilya Sutskever (Co-founder and Chief Scientist) 06:22.080
top-down belief. And it's based on this multifaceted beauty and inspiration from the brain.
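The conversation doesn't name a specific local learning rule; the classic textbook example is a Hebbian update, where each connection changes using only the activity of the two neurons it joins, with no global error signal. A minimal sketch of that idea:

```python
import numpy as np

def hebbian_step(w: np.ndarray, pre: np.ndarray, post: np.ndarray,
                 lr: float = 0.01, decay: float = 0.001) -> np.ndarray:
    """One Hebbian update: w[i, j] changes based only on the activity of the
    pre-synaptic neuron i and post-synaptic neuron j it connects -- that
    locality is what makes the rule plausible for a brain to implement."""
    return w + lr * np.outer(pre, post) - decay * w  # decay keeps weights bounded

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(4, 3))   # 4 pre-synaptic -> 3 post-synaptic

for _ in range(100):
    pre = rng.random(4)                  # pre-synaptic activity
    post = pre @ w                       # post-synaptic activity
    w = hebbian_step(w, pre, post)

print(np.round(w, 3))  # connections between co-active neurons have strengthened
```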