Ilya Sutskever – We're moving from the age of scaling to the age of research

Ilya Sutskever (Co-founder and Chief Scientist) 00:00.280

You know it's crazy,

Ilya Sutskever (Co-founder and Chief Scientist) 00:01.800

that all of this is real.

Dwarkesh Patel (Host) 00:04.200

Yeah, meaning what?

Ilya Sutskever (Co-founder and Chief Scientist) 00:05.120

Don't you think so?

Dwarkesh Patel (Host) 00:06.120

Meaning what?

Ilya Sutskever (Co-founder and Chief Scientist) 00:06.880

Like all the AI stuff and all the Bay Area, yeah, that it's happened. Like, isn't it straight out of science fiction?

Dwarkesh Patel (Host) 00:13.760

Yeah. Another thing that's crazy is how normal the slow takeoff feels. The idea that we'd be investing 1% of GDP in AI, I feel like it would have felt like a bigger deal, you know? Where right now it just feels like

Ilya Sutskever (Co-founder and Chief Scientist) 00:27.240

We get used to things pretty fast, it turns out, yeah. But also it's kind of abstract. Like, what does it mean? What it means is that you see it in the news

Dwarkesh Patel (Host) 00:35.760

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 00:36.360

that such and such company announced such and such dollar amount.

Dwarkesh Patel (Host) 00:39.040

Right.

Ilya Sutskever (Co-founder and Chief Scientist) 00:40.360

That's all you see.

Dwarkesh Patel (Host) 00:41.560

Right.

Ilya Sutskever (Co-founder and Chief Scientist) 00:42.880

It's not really felt in any other way so far.

Dwarkesh Patel (Host) 00:45.360

No. Should we actually begin here? I think this is an interesting discussion.

Ilya Sutskever (Co-founder and Chief Scientist) 00:47.960

Sure.

Dwarkesh Patel (Host) 00:48.760

I think your point about, well, from the average person's point of view, nothing is that different, will continue being true even into the singularity.

Ilya Sutskever (Co-founder and Chief Scientist) 00:57.280

No, I don't think so.

Dwarkesh Patel (Host) 00:58.760

Okay, interesting.

Ilya Sutskever (Co-founder and Chief Scientist) 01:00.000

So, the thing which I was referring to as not feeling different is, okay, such and such company announced some difficult-to-comprehend dollar amount of investment.

Dwarkesh Patel (Host) 01:12.120

Right.

Ilya Sutskever (Co-founder and Chief Scientist) 01:12.640

I don't think anyone knows what to do with that.

Dwarkesh Patel (Host) 01:15.000

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 01:15.840

But I think that the impact of AI is going to be felt. AI is going to be diffused through the economy. There are very strong economic forces for this. And I think the impact is going to be felt very strongly.

Dwarkesh Patel (Host) 01:30.640

When do you expect that impact? I think the models seem smarter than their economic impact would imply.

Ilya Sutskever (Co-founder and Chief Scientist) 01:38.320

Yeah, this is one of the very confusing things about the models right now. How to reconcile the fact that they are doing so well on evals? And you look at the evals and you go, those are pretty hard evals.

Dwarkesh Patel (Host) 01:56.280

Right.

Ilya Sutskever (Co-founder and Chief Scientist) 01:57.080

They're doing so well. But the economic impact seems to be dramatically behind. It's very difficult to make sense of how the model can, on the one hand, do these amazing things and then, on the other hand, repeat itself twice in some situation.

Ilya Sutskever (Co-founder and Chief Scientist) 02:19.320

An example would be, let's say you use vibe coding to do something, and you go to some place and then you get a bug. And then you tell the model, can you please fix the bug?

Dwarkesh Patel (Host) 02:29.520

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 02:30.320

And the model says, "Oh my god, you're so right. I have a bug. Let me go fix that." And it introduces a second bug.

Dwarkesh Patel (Host) 02:35.640

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 02:36.720

And then you tell it, you have this new second bug. And it tells you, "Oh my god, how could I have done it? You're so right again." And brings back the first bug.

Ilya Sutskever (Co-founder and Chief Scientist) 02:44.320

Yeah. And you can alternate between those.

Dwarkesh Patel (Host) 02:45.680

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 02:46.160

And it's like, how is that possible?

Dwarkesh Patel (Host) 02:48.120

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 02:48.680

It's like, I'm not sure. But it does suggest that something strange is going on. I have two possible explanations. The more whimsical explanation is that maybe our RL training makes the models a little bit too single-minded and narrowly focused, a

Ilya Sutskever (Co-founder and Chief Scientist) 03:08.640

little bit too, I don't know, unaware, even though it also makes them aware in some other ways. And because of this, they can't do basic things. But there is another explanation, which is that back when people were doing pre-training, the question of what data to train on was answered.

Ilya Sutskever (Co-founder and Chief Scientist) 03:32.840

Because that answer was everything.

Dwarkesh Patel (Host) 03:35.080

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 03:36.080

When you do pre-training, you need all the data. So you don't have to think about whether it's going to be this data or that data.

Dwarkesh Patel (Host) 03:44.320

Yeah.

Ilya Sutskever (Co-founder and Chief Scientist) 03:45.080

But when people do RL training, they do need to think. They say, okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing. And from what I hear, all the companies have teams that just produce new RL environments and just add them to the

Ilya Sutskever (Co-founder and Chief Scientist) 04:01.120

training mix. And the question is, what are those? There are so many degrees of freedom. There is such a huge variety of RL environments you could produce. And one thing you could do, and I think it's something that is done inadvertently, is that people take

Ilya Sutskever (Co-founder and Chief Scientist) 04:19.320

inspiration from the evals. You say, "Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that would help on this task?" I think that is something that happens, and I think it could explain a lot of

Ilya Sutskever (Co-founder and Chief Scientist) 04:37.520

what's going on. If you combine this with generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance, which is something that we don't today

Ilya Sutskever (Co-founder and Chief Scientist) 04:56.920

even exactly understand, what we mean by that.