EP20: Yann LeCun - part 3/11
December 15, 2025 • 1h 50m 6s
? (?)
00:00.110
Those are the same, because people have been using that term, Siamese networks, even recently.
Yann LeCun (Chief AI Scientist)
00:05.510
That's right. I mean, the concept is still, you know, up to date, right? So you have an X and a Y, and think of the X as some sort of degraded, transformed, or corrupted version of Y. OK, you run both X and Y through two encoders, and you tell the system: look, X and Y really are two views of
Yann LeCun (Chief AI Scientist)
00:24.150
the same thing, so the representations you compute should be the same, right? So if you just train a neural net, you know, two neural nets with shared weights, to produce the same representation for slightly different versions of the same object, view, whatever it is, it collapses. It doesn't
Yann LeCun (Chief AI Scientist)
00:44.670
produce anything useful. So you have to find a way to make sure that the system, you know, extracts as much information from the input as possible. And the original idea that we had, you know, it was in a paper from 1993 with the Siamese net, was to have a contrastive term.
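A minimal sketch of that joint-embedding setup (PyTorch, with a made-up toy encoder; the dimensions and architecture are illustrative assumptions, not the actual models discussed): the same shared-weight encoder is applied to the two views, and their representations are pulled together. With only this attracting term, the collapsed solution remains available, which is the failure mode described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy encoder (hypothetical stand-in for whatever network is actually used).
encoder = nn.Sequential(
    nn.Linear(32, 64),
    nn.ReLU(),
    nn.Linear(64, 16),
)

def joint_embedding_loss(x, y):
    # x is a corrupted/transformed view of y; both go through the SAME
    # encoder (shared weights) and their representations are pulled together.
    zx = encoder(x)
    zy = encoder(y)
    return F.mse_loss(zx, zy)

y = torch.randn(8, 32)              # "original" inputs
x = y + 0.1 * torch.randn(8, 32)    # toy corrupted versions of the same inputs
loss = joint_embedding_loss(x, y)
loss.backward()
# Caveat: with only this attracting term, an encoder that outputs a constant
# drives the loss to zero -- the collapse described above.
```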
Yann LeCun (Chief AI Scientist)
01:00.950
Right, so you have other pairs of samples that you know are different, and you train the system to produce different representations. So you have a cost function that attracts the two representations when you show it two examples that are identical or similar, and repels them when
Yann LeCun (Chief AI Scientist)
01:16.230
you show it two examples that are dissimilar. And we came up with this idea because someone came to us and said, like, can you encode signatures, of someone, you know, drawing a signature on a tablet, can you encode this in less than eighty bytes? Because if you can encode it in less
Yann LeCun (Chief AI Scientist)
01:33.630
than eighty bytes, we can write it on the magnetic stripe of a credit card, so we can do signature verification for credit cards. Right, and so we came up with this idea, I came up with this idea, of training a neural net to produce eighty variables that we would quantize to one byte each, and then
Yann LeCun (Chief AI Scientist)
01:54.080
training it to kind of do this thing.
Ravid Shwartz-Ziv (Assistant Professor)
01:57.110
And did they use it? So it worked?
Yann LeCun (Chief AI Scientist)
01:59.110
Really well, and they showed it to their, you know, business people, who said, oh, we're just going to ask people to type PIN codes. There's a lesson there, like, about how you can integrate the technology, right? And, you know, I knew this thing was kind of fishy in the first place,
Yann LeCun (Chief AI Scientist)
02:17.440
because, like, you know, there were countries in Europe that were using smart cards, right, and it was much better, but they just didn't want to use smart cards for some reason. Anyway, so we had this technology. In the mid-2000s I worked with two of my students to revise
Yann LeCun (Chief AI Scientist)
02:35.150
this idea. We came up with new objective functions to train those. So these are what people now call contrastive methods, or rather a special case of contrastive methods: you have, like, positive examples and negative examples, and, you know, for positive examples you train the system
Yann LeCun (Chief AI Scientist)
02:48.430
to have low energy, and for negative samples you train them to have higher energy, where the energy is the distance between the representations.
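As a hedged illustration of that attract/repel cost, in the spirit of the margin-based contrastive losses of that era rather than the exact formulation from any one paper: the energy is the distance between the two representations, pushed down for similar pairs and pushed up to at least a margin for dissimilar pairs.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, similar, margin=1.0):
    """z1, z2: batches of representations from the shared-weight encoders.
    similar: 1.0 for pairs known to be the same/similar, 0.0 for dissimilar.
    The 'energy' is the Euclidean distance between the representations."""
    energy = F.pairwise_distance(z1, z2)                     # shape: (batch,)
    attract = similar * energy.pow(2)                        # pull similar pairs together
    repel = (1 - similar) * F.relu(margin - energy).pow(2)   # push dissimilar pairs apart
    return 0.5 * (attract + repel).mean()

z1, z2 = torch.randn(8, 16), torch.randn(8, 16)
labels = torch.randint(0, 2, (8,)).float()                   # toy similar/dissimilar labels
loss = contrastive_loss(z1, z2, labels)
```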
So we had two papers at CVPR in 2005 and 2006, with Raia Hadsell, who is now the head of the foundations group at DeepMind, the sort of
Yann LeCun (Chief AI Scientist)
03:07.270
FAIR-like division of DeepMind, if you want, and Sumit Chopra, who is actually a faculty member here at NYU now, working on medical imaging. And so this gathered a bit of interest in the community and sort of revived a little bit of work on those ideas, but it still wasn't working very
Yann LeCun (Chief AI Scientist)
03:27.150
well. Those contrastive methods really were producing representations of images, for example, that were kind of relatively low dimensional, if we measured, like, the eigenvalue spectrum of the covariance matrix of the representations that came out of those things.
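A quick sketch of that kind of measurement (NumPy, with a hypothetical helper and an arbitrary threshold): compute the covariance matrix of a batch of embeddings and inspect its eigenvalue spectrum to see how many dimensions the representation actually occupies.

```python
import numpy as np

def effective_dimensions(embeddings, threshold=1e-3):
    """embeddings: (num_samples, dim) array of representations.
    Returns the descending eigenvalue spectrum of their covariance matrix
    and the number of eigenvalues above a small relative threshold, i.e.
    how many dimensions the representation actually fills."""
    cov = np.cov(embeddings, rowvar=False)        # (dim, dim) covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]       # eigenvalue spectrum, descending
    n_dims = int((eigvals > threshold * eigvals[0]).sum())
    return eigvals, n_dims

z = np.random.randn(10_000, 512)                  # stand-in embeddings
spectrum, n_dims = effective_dimensions(z)
print(n_dims)  # a collapsed representation keeps this far below 512
```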
Yann LeCun (Chief AI Scientist)
03:41.670
It would fill up maybe two hundred dimensions, never more, like, even training on ImageNet and things like that, even with data augmentation. And so that was kind of disappointing. It did work OK, there were a bunch of papers on this and it worked OK, there was a paper from DeepMind, I think, that
Yann LeCun (Chief AI Scientist)
03:58.110
clearly demonstrated you could get decent performance with contrastive training applied to Siamese nets. But then about five years ago one of my postdocs, Stéphane Deny, at Meta, tried an idea that at first I didn't think would work, which was to essentially have some
Yann LeCun (Chief AI Scientist)
04:23.150
measure of the quantity of information that comes out of the encoder and then try to maximize that, OK. And the reason I didn't think it would work is because I'd seen a lot of experiments along those lines that Geoff Hinton was doing in the 1980s, trying to maximize
Yann LeCun (Chief AI Scientist)
04:41.270
information. And you can never maximize information, because you never have appropriate measures of information content, that is, a lower bound. If you want to maximize something, you want to either be able to compute it, or to have a lower bound on it, so you can push it up, right? And
Yann LeCun (Chief AI Scientist)
04:59.570
for information content we only have upper bounds. So I always thought this was completely hopeless, and then, you know, Stéphane came up with a technique which was called Barlow Twins. Barlow is a famous theoretical neuroscientist who came up with the idea of information maximization.
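A compact sketch of the Barlow Twins-style objective as commonly described (a paraphrase, not code from the paper): the cross-correlation matrix between the normalized embeddings of two views is pushed toward the identity, diagonal toward 1 for invariance, off-diagonals toward 0 for redundancy reduction.

```python
import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    """z1, z2: (batch, dim) embeddings of two views of the same images."""
    n, d = z1.shape
    # Standardize each dimension across the batch.
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.T @ z2) / n                                   # (dim, dim) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()        # invariance: diagonal toward 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy: off-diagonal toward 0
    return on_diag + lambd * off_diag

loss = barlow_twins_loss(torch.randn(256, 128), torch.randn(256, 128))
```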
Yann LeCun (Chief AI Scientist)
05:20.790
And it kind of worked. It was, wow. So there I said, like, we have to push this, right? So we came up with another method with a student of mine, Adrien Bardes, co-advised with Jean Ponce, who's affiliated with NYU, a technique called VICReg: variance-invariance-covariance
Yann LeCun (Chief AI Scientist)
05:42.240
regularization. And that turned out to be simpler and work even better.
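A hedged sketch of the three VICReg terms, variance, invariance, and covariance; the weights and epsilons here are illustrative rather than exact published hyperparameters.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z1, z2, sim_w=25.0, var_w=25.0, cov_w=1.0):
    """z1, z2: (batch, dim) embeddings of two views of the same samples."""
    n, d = z1.shape
    invariance = F.mse_loss(z1, z2)                  # pull the two views together

    def variance_term(z):                            # keep each dimension's std above 1
        std = torch.sqrt(z.var(dim=0) + 1e-4)
        return F.relu(1.0 - std).mean()

    def covariance_term(z):                          # decorrelate the embedding dimensions
        zc = z - z.mean(dim=0)
        cov = (zc.T @ zc) / (n - 1)
        off_diag = cov - torch.diag(torch.diagonal(cov))
        return off_diag.pow(2).sum() / d

    variance = variance_term(z1) + variance_term(z2)
    covariance = covariance_term(z1) + covariance_term(z2)
    return sim_w * invariance + var_w * variance + cov_w * covariance

loss = vicreg_loss(torch.randn(256, 128), torch.randn(256, 128))
```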
And since then we've made progress, and recently, you know, I discussed an idea with Randall Balestriero that could be pushed and made practical. It's called SIGReg, and the whole system is called, he's responsible for the name,
Yann LeCun (Chief AI Scientist)
06:00.470
LeJEPA, latent-Euclidean JEPA, right? Yeah. And SIGReg has to do with sort of making sure that the distribution of vectors that come out of the encoder is an isotropic Gaussian, that's the I and the G. So, I mean, there are a lot of things happening in this domain which are really cool.
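The exact SIGReg procedure isn't spelled out here, but the stated goal, shaping the distribution of embedding vectors toward an isotropic Gaussian, can be illustrated with a naive moment-matching penalty (this is only an illustration of the target distribution, not the actual SIGReg algorithm).

```python
import torch

def isotropic_gaussian_penalty(z):
    """Naive illustration only: penalize deviation of the embedding batch's
    mean from 0 and of its covariance from the identity, pushing the
    embedding distribution toward an isotropic (standard) Gaussian.
    z: (batch, dim) embeddings."""
    n, d = z.shape
    mean = z.mean(dim=0)
    zc = z - mean
    cov = (zc.T @ zc) / (n - 1)
    eye = torch.eye(d, device=z.device)
    return mean.pow(2).sum() + (cov - eye).pow(2).sum()

penalty = isotropic_gaussian_penalty(torch.randn(512, 64))
```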
Yann LeCun (Chief AI Scientist)
06:29.590
I think there's going to be some more progress over the next year or two, and we'll get a lot of experience with this. And I think that's kind of a really good, promising set of techniques to train models that learn abstract representations, which I think is key.