Ilya Sutskever – We're moving from the age of scaling to the age of research - part 16/17
2025-11-25_17-29 • 1h 36m 3s
Ilya Sutskever (Co-founder and Chief Scientist)
00:00.060
…know. I think there will definitely be diminishing returns, because you want people who think differently rather than the same. If they were literal copies of me, I'm not sure how much more incremental value you'd get. But people who think differently, that's what you want.
Dwarkesh Patel (Host)
00:22.060
Why is it that, if you look at different models, even ones released by totally different companies and trained on potentially non-overlapping data sets, it's actually crazy how similar LLMs are to each other?
Ilya Sutskever (Co-founder and Chief Scientist)
00:35.100
Maybe the data sets are not as non-overlapping as it seems.
Dwarkesh Patel (Host)
00:38.380
But there's some sense that, even if an individual human might be less productive than a future AI, maybe there's something to the fact that human teams have more diversity than teams of AIs might have. How do we elicit meaningful diversity among AIs? I think just raising the temperature results in gibberish. You want something more like different scientists having different prejudices or different ideas. How do you get that kind of diversity among AI agents?
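[Editor's sketch, not from the conversation: why raising the sampling temperature yields gibberish rather than meaningful diversity. Temperature divides the logits before the softmax, so a high value flattens the distribution toward uniform noise. All names and numbers here are illustrative.]

```python
import numpy as np

# Toy sketch of temperature sampling over a model's next-token logits.
# Near T = 1 the model's preferences are preserved; at high T the softmax
# flattens toward uniform, which is why output degrades into gibberish.
def sample_with_temperature(logits: np.ndarray, temperature: float, rng) -> int:
    scaled = logits / temperature             # divide logits by T
    scaled -= scaled.max()                    # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(logits), p=probs)

rng = np.random.default_rng(0)
logits = np.array([4.0, 2.0, 0.0, -2.0])      # model strongly prefers token 0
for T in (0.5, 1.0, 10.0):
    draws = [sample_with_temperature(logits, T, rng) for _ in range(1000)]
    print(T, np.bincount(draws, minlength=4) / 1000)
# At T = 10 the four tokens come out nearly uniform: noise, not diversity.
```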
Ilya Sutskever (Co-founder and Chief Scientist)
01:04.340
So, the reason there has been no diversity, I believe, is because of pre-training. All the pre-trained models are pretty much the same, because they're pre-trained on the same data. Now, RL and post-training is where some differentiation starts to emerge, because different people come up with different RL training. Yeah.
Dwarkesh Patel (Host)
01:24.820
And then I've heard you hint in the past about self-play as a way to either get data or match agents to other agents of equivalent intelligence to kick off learning. How should we think about why there have been no public proposals of this kind of thing working with LLMs?
Ilya Sutskever (Co-founder and Chief Scientist)
01:46.060
I would say there are two things. The reason I thought self-play was interesting is because it offered a way to create models using compute only, without data. Right? And if you think that data is the ultimate bottleneck, then using compute only is very interesting. So that's what makes it interesting.

Now, the thing is that self-play, at least the way it was done in the past, where you have agents that somehow compete with each other, is only good for developing a certain set of skills. It is too narrow. It's only good for things like negotiation, conflict, certain social skills, strategizing, that kind of stuff. And so if you care about those skills, then self-play will be useful.

Now, actually, I think that self-play did find a home, but in a different form. Things like debate, prover-verifier: you have some kind of LLM as a judge which is also incentivized to find mistakes in your work. You could say this is not exactly self-play, but it is a related adversarial setup that people are doing, I believe.

And really, self-play is a special case of a more general competition between agents. The natural response to competition is to try to be different. And so if you were to put multiple agents together and tell them, "You all need to work on some problem," and you are an agent inspecting what everyone else is working on, you're going to say, "Well, if they're already taking this approach, it's not clear I should pursue it. I should pursue something different." So I think something like this could also create an incentive for a diversity of approaches.
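[Editor's sketch of the prover-verifier idea mentioned above, with toy stand-in "models" rather than real LLMs; every function here is illustrative, not a description of any lab's actual setup. The point is the incentive structure: the prover is rewarded only for answers that survive scrutiny, and the verifier is rewarded for catching mistakes.]

```python
import random

random.seed(0)

# Toy "prover": answers a*b, but with a bug that sometimes perturbs the result,
# standing in for a model that produces plausible-looking wrong answers.
def prover(a: int, b: int) -> int:
    answer = a * b
    if random.random() < 0.3:          # buggy path: a confident-but-wrong answer
        answer += random.choice([-10, 10])
    return answer

# Toy "verifier": incentivized to find mistakes; here it simply recomputes.
def verifier(a: int, b: int, claimed: int) -> bool:
    return claimed == a * b

prover_reward = verifier_catches = 0
for _ in range(1000):
    a, b = random.randint(2, 99), random.randint(2, 99)
    claimed = prover(a, b)
    if verifier(a, b, claimed):
        prover_reward += 1             # prover scores only if it survives scrutiny
    else:
        verifier_catches += 1          # verifier scores by exposing the mistake
print(prover_reward, verifier_catches)
# The adversarial pressure is the point: the prover cannot be rewarded for
# wrong answers, because the verifier profits from finding them.
```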
Dwarkesh Patel (Host)
03:38.100
Yeah. Final question: what is research taste? You're obviously considered the person in the world with the best taste in doing AI research. You were the co-author on many of the biggest things that have happened in the history of deep learning, from AlexNet to GPT-3 and so on. How do you characterize how you come up with these ideas?
Ilya Sutskever (Co-founder and Chief Scientist)
04:10.100
I can comment on this for myself; I think different people do it differently. One thing that guides me personally is an aesthetic of how AI should be, formed by thinking about how people are. But thinking about it correctly. It's very easy to think about how people are incorrectly. So what does it mean to think about people correctly? I'll give you some examples.

The idea of the artificial neuron is directly inspired by the brain, and it's a great idea. Why? Because you say, "Sure, the brain has all these different organs, it has the folds, but the folds probably don't matter." Why do we think that the neurons matter? Because there are many of them. It kind of feels right, so you want the neuron. And you want some kind of local learning rule that will change the connections between the neurons. It feels plausible that the brain does it.
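[Editor's sketch of what a "local learning rule" means, using a plain Hebbian update as the textbook example; the specific rule and constants are illustrative, not something named in the conversation. The update to each weight depends only on the activity of the two neurons it connects, with no global error signal.]

```python
import numpy as np

rng = np.random.default_rng(0)

# A local learning rule: each weight w[i, j] changes using only the activity
# of pre-synaptic neuron j and post-synaptic neuron i (Hebbian: "neurons that
# fire together wire together"), with no global error signal.
def hebbian_step(w: np.ndarray, pre: np.ndarray, lr: float = 0.01) -> np.ndarray:
    post = np.tanh(w @ pre)                  # post-synaptic activity
    w += lr * np.outer(post, pre)            # strictly local update
    w *= 1.0 / max(1.0, np.linalg.norm(w))   # crude normalization to avoid blow-up
    return w

w = rng.normal(scale=0.1, size=(4, 8))       # 8 inputs feeding 4 neurons
pattern = rng.normal(size=8)
for _ in range(200):
    w = hebbian_step(w, pattern)
print(np.tanh(w @ pattern))                  # response to the repeated pattern strengthens
```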
The idea of the distributed representation. Or the idea that the brain doesn't just respond to experience, it learns from experience, and so the neural network should learn from experience as well. And you kind of ask yourself: is something fundamental or not fundamental to how things should be? I think that's been guiding me a fair bit: thinking from multiple angles and looking for almost beauty. Beauty, simplicity, elegance, correct inspiration from the brain; there's no room for ugliness. All of those things need to be present at the same time, and the more they are present, the more confident you can be in a top-down belief.

And then a top-down belief is the thing that sustains you when the experiments contradict you. Because if you just trust the data all the time, well, sometimes you can be doing the correct thing but there's a bug, and you don't know that there is a bug. How can you tell that there is a bug? How do you know if you should keep going or conclude it's the wrong direction? Well, it's the top-down. You can say, "Things have to be this way. Something like this has to work. Therefore, we've got to keep going." That's the top-down. And it's based on this multifaceted beauty and inspiration by the brain.