Do LLMs Understand? AI Pioneer Yann LeCun Spars with DeepMind’s Adam Brown. - part 4/15
2025-12-12_17-05 • 1h 15m 39s
Janna Levin (Professor of Physics and Astronomy)
00:00.200
So, I could play devil's advocate and say, "Well, how do I know that what a human being is doing is that much different, right? We're trained on lots of language. We get some dopamine hit or some reward system for having said the right word at the right time, and the right grammatical structure for the language that we're immersed in. And, um, and we backpropagate; we try to do a better job the next time." In some sense, how is that different than what a human being is doing? And you were saying maybe it's the sensory experience of being immersed in the world?
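[Editor's note: a minimal sketch of the training loop Janna is gesturing at: predict the next token, score the prediction, backpropagate. The toy corpus, bigram model, and hyperparameters are illustrative choices, not anyone's actual setup.]

```python
import torch

# Toy corpus and character vocabulary (illustrative only)
text = "the cat sat on the mat "
vocab = sorted(set(text))
stoi = {c: i for i, c in enumerate(vocab)}
data = torch.tensor([stoi[c] for c in text])

# A table of logits: row i scores P(next char | current char i)
logits_table = torch.zeros(len(vocab), len(vocab), requires_grad=True)
opt = torch.optim.SGD([logits_table], lr=1.0)

for step in range(200):
    xs, ys = data[:-1], data[1:]          # each char predicts its successor
    logits = logits_table[xs]             # look up one row of scores per input
    # the "reward system for having said the right word at the right time":
    # cross-entropy is low when the right next char gets high probability
    loss = torch.nn.functional.cross_entropy(logits, ys)
    opt.zero_grad()
    loss.backward()                       # "and we backpropagate"
    opt.step()                            # "do a better job the next time"

print(f"final loss: {loss.item():.3f}")
```

Pre-training a real LLM follows this same loop at vastly larger scale, with a deep transformer in place of the lookup table.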
Yann LeCun (Chief AI Scientist)
00:35.940
Okay. Um, a typical LLM, as I mentioned, is trained on tens of trillions of words, typically.
Janna Levin (Professor of Physics and Astronomy)
00:47.220
There's only a few hundred thousand words, though. You're just saying sentences. It's combinations.
Yann LeCun (Chief AI Scientist)
00:51.140
No, it's 30 trillion. 30 trillion words is a typical size for the pre-training set of an LLM. Uh, a word is actually represented as a sequence of tokens, but it doesn't really matter. And a token is about three bytes. So, the total is about 10 to the 14 bytes, right? One with 14 zeros, of training data to train those LLMs.

And that corresponds to basically all the text that is publicly available on the internet, plus some other stuff, and it would take any of us something like half a million years to read through that material, right? So, it's an enormous amount of textual data.

Now, compare this with what a child perceives during the first few years of life. Psychologists tell us that a 4-year-old has been awake a total of 16,000 hours. And there's about one byte per second going through every single fiber of the optic nerve, and we have 2 million of them. So, it's about 2 megabytes per second getting to the visual cortex. Over 16,000 hours, do the math and it's about 10 to the 14 bytes. A 4-year-old has seen as much visual data as the biggest LLM trained on the entire text ever produced.
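[Editor's note: the arithmetic checks out. A quick sketch in Python: the 30-trillion-token, 3-bytes-per-token, 16,000-hour, and 2 MB/s figures are Yann's; the reading-speed assumption (250 words per minute, 8 hours a day) is mine and only sets the order of magnitude.]

```python
# LLM side: ~30 trillion tokens at ~3 bytes per token
llm_tokens = 30e12
llm_bytes = llm_tokens * 3
print(f"LLM pre-training data: {llm_bytes:.1e} bytes")            # ~9e13, i.e. ~10^14

# How long a human would need to read it (assumed 250 wpm, 8 h/day)
words_per_year = 250 * 60 * 8 * 365
print(f"Reading time: {llm_tokens / words_per_year:,.0f} years")  # ~700,000 years

# Child side: 16,000 waking hours at ~2 MB/s through the optic nerves
child_bytes = 16_000 * 3600 * 2e6
print(f"4-year-old's visual data: {child_bytes:.1e} bytes")       # ~1.2e14, same order
```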
Yann LeCun (Chief AI Scientist)
02:17.820
And so what it tells you is that there is way more information in the real world, but it's also much more complicated. It's noisy, it's high-dimensional, it's continuous, and basically the methods that are employed to train LLMs do not work in the real world.

That explains why we have LLMs that can pass the bar exam or solve equations or compute integrals like college students and solve math problems, but we still don't have a domestic robot that can, you know, do the chores in the house. We don't even have level-five self-driving cars. I mean, we have them, but we cheat. So, um, I mean, we certainly don't have self-driving cars that can learn to drive in 20 hours of practice like any teenager, right? So, obviously, we're missing something very big to get machines to the level of human or even animal intelligence. Well, let's not talk about language. Let's talk about how a cat is intelligent, or a dog. Um, we're not even at that level with AI systems.
Janna Levin (Professor of Physics and Astronomy)
03:20.460
Adam, you think there's more comprehension on the part of the LLMs at this point already?
Adam Brown (Research Scientist)
03:31.220
Uh, I think that's right. So, I mean, Yann is making sort of excellent points, that the LLMs are much less, for example, sample-efficient than humans. Humans, or indeed your cat (or just a cat, I don't know if it was your cat; it's a very smart cat in your example), um, are able to learn from many fewer examples than a large language model can learn from.
It takes way more data to teach it to the same level of proficiency. Um, and that's true, and that is a thing that is better about, uh, the, you know, architecture of animal minds compared to these artificial minds that we're building. Um, on the other hand, sample efficiency isn't everything.

Um, we see this frequently, in fact, when we try and, you know, before large language models, when we tried to make artificial minds to do other tasks. Even the famous chess bots that we built, uh, built on types of large language models, the way they were trained (sort of AlphaZero and various other ones): they would play themselves at chess a huge number of times. And to begin with, it would just be making random moves, and then, uh, every time it won or lost a game, when it was playing itself, it would sort of, you know, reward that neural pathway or punish that neural pathway. And they would play themselves at chess again and again, and when they had played as many games as a human grandmaster has played, they were still making essentially random moves.
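[Editor's note: a minimal sketch of the outcome-based self-play Adam describes. This is not AlphaZero itself, which adds deep networks and Monte Carlo tree search; it is a toy tabular policy for single-pile Nim, and the game choice, learning rate, and game count are illustrative assumptions. Moves start out random; every move in a winning game is rewarded and every move in a losing game is punished.]

```python
import math
import random
from collections import defaultdict

N = 15                                              # starting pile size (illustrative)
weights = defaultdict(lambda: defaultdict(float))   # state -> action -> preference

def pick(stones):
    """Sample how many stones to take (1-3) from a softmax over preferences."""
    actions = list(range(1, min(3, stones) + 1))
    w = [math.exp(weights[stones][a]) for a in actions]
    return random.choices(actions, weights=w)[0]

def play_one_game():
    """Self-play one game of Nim; taking the last stone wins."""
    stones, player = N, 0
    history = {0: [], 1: []}                        # each side's (state, move) list
    while True:
        move = pick(stones)
        history[player].append((stones, move))
        stones -= move
        if stones == 0:
            return history, player                  # the current player just won
        player = 1 - player

def train(games=20_000, lr=0.1):
    for _ in range(games):
        history, winner = play_one_game()
        for state, move in history[winner]:         # reward the winning pathway
            weights[state][move] += lr
        for state, move in history[1 - winner]:     # punish the losing pathway
            weights[state][move] -= lr

train()
# Under perfect play, leaving the opponent a multiple of 4 stones wins,
# so the learned policy should come to prefer exactly those moves.
print({s: max(weights[s], key=weights[s].get) for s in sorted(weights)})
```

Unlike the chess case Adam describes, this game is small enough that random self-play finds the winning signal quickly; the point of the sketch is only the reward-or-punish-the-pathway loop, which scales poorly as the game tree grows.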