Ravid Shwartz-Ziv (Assistant Professor) 00:00.190
and what do you think are the missing parts? like, do you think more compute will help, or do we need better algorithms? it's kind of, do you believe in the bitter lesson, right? like, do you think
? (?) 00:13.310
well, and furthermore, what do you think about, you know, the data-quality problems with the internet post-2022, right? i've heard people compare it to low-background steel, referring to all the data from before LLMs came out as low-background tokens. i mean, OK
Yann LeCun (Chief AI Scientist) 00:27.270
yeah, i think i'm totally escaping that problem. OK, here is the thing, and i've been, you know, using this argument publicly over the last couple of years: training an LLM, if you want any kind of, you know, decent performance, requires training on basically all the
Yann LeCun (Chief AI Scientist) 00:43.790
available, freely available text on the internet, plus some, you know, synthetic data, plus licensed data, et cetera. so a typical LLM, like, you know, Llama 3, going back a year or two, is trained on thirty trillion tokens. a token is typically three bytes, so that's 10^14
Yann LeCun (Chief AI Scientist) 01:01.510
bytes for pre-training, OK? we're not talking about fine-tuning: 10^14 bytes. and for the LLMs to be able to really exploit this, they need to have a lot of memory storage, because basically those are isolated facts. there is a little bit of
Yann LeCun (Chief AI Scientist) 01:22.100
redundancy in text, but a lot of it is just isolated facts, right? and so you need very big networks, because you need a lot of memory to store all those facts, if you want them to know all of it. OK, now compare this with video: 10^14 bytes, if you count two megabytes per
Yann LeCun (Chief AI Scientist) 01:44.960
second for video, for, you know, relatively compressed video, not highly compressed but a bit, would represent fifteen thousand hours of video. 10^14 bytes is fifteen thousand hours of video: you have the same amount of data as the entirety of all the text available
Yann LeCun (Chief AI Scientist) 02:03.710
on the internet. now, fifteen thousand hours of video is absolutely nothing: it's thirty minutes of youtube uploads, OK? it's the amount of visual information that a four-year-old has seen in his or her life, the entire life; waking time is about sixteen thousand hours in four years, so that's a lot of information.
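A quick back-of-envelope check of those figures, in Python (all numbers are the rough ones quoted above):

```python
# rough figures as quoted: 30T tokens, ~3 bytes/token, ~2 MB/s video
tokens = 30e12
bytes_per_token = 3
text_bytes = tokens * bytes_per_token
print(f"text corpus: {text_bytes:.1e} bytes")       # ~0.9e14, call it 10^14

video_bytes_per_second = 2e6
hours_of_video = text_bytes / (video_bytes_per_second * 3600)
print(f"same byte budget as video: {hours_of_video:,.0f} hours")  # ~12,500, call it 15,000

waking_hours_by_age_four = 16_000   # the speaker's figure for total waking time
print(f"a four-year-old's waking hours: {waking_hours_by_age_four:,}")
```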
Yann LeCun (Chief AI Scientist) 02:24.910
we have video models now, V-JEPA 2 actually, which just came out last summer, that was trained on the equivalent of a century of video data, and it's all public data, OK? much more data, but actually much less than the biggest LLMs, because even though
Yann LeCun (Chief AI Scientist) 02:46.590
it's more bytes, it's more redundant, so to speak. OK, it's more redundant, so is it less useful? actually, when you use self-supervised learning, you do need redundancy. you cannot learn anything, with self-supervised learning or with anything else by the way, if the data is completely random. redundancy is what you can learn.
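The redundancy point can be made concrete with a compressor: pure noise has no structure to exploit, while repetitive data compresses dramatically. A minimal sketch:

```python
import os
import zlib

noise = os.urandom(100_000)                       # no redundancy: nothing to learn
patterned = b"the cat sat on the mat. " * 4_000   # highly redundant

print(len(zlib.compress(noise)) / len(noise))          # ~1.0, incompressible
print(len(zlib.compress(patterned)) / len(patterned))  # << 1, mostly structure
```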
Yann LeCun (Chief AI Scientist) 03:02.390
and so there is just much richer structure in, you know, real-world data like video than there is in text, which kind of led me to claim that we're absolutely never, ever going to get to human-level AI by just training on text. it's just never going to happen, right? it's a big
Yann LeCun (Chief AI Scientist) 03:23.870
debate in philosophy, whether AI should be grounded in reality or whether it could just live, you know, in the realm of symbolic manipulation and things like this
? (?) 03:33.870
when we talk about world models and grounding, i think, you know, there are still a lot of people who don't even understand what the idealized world model is, in a sense, right? so for example, i'm influenced by having watched star trek, which i would hope you've seen a little bit of, and
? (?) 03:48.670
thinking of the holodeck, right? i always thought that the holodeck was like an idealized, perfect world model, right? even with so many episodes of it going too far, right, and people walking out of it. but, you know, it even simulates things like smell and physical touch. so do you
? (?) 04:04.110
think that something like that is the idealized world model, or do you think a different model, or a different way of defining it, would be better?
Yann LeCun (Chief AI Scientist) 04:12.110
OK, this is an excellent question, and it is excellent because it goes to the core of really what, you know, i think we should be doing, which is what i'm doing, and how wrong the video-generation approach is, OK? so people think that a world model is something that reproduces
Yann LeCun (Chief AI Scientist) 04:33.790
all the details of what the world does. they think of it as a simulator, right? and of course, because, you know, deep learning is the thing, you're going to use some deep learning system as a simulator. a lot of people also are focused on video generation, which is kind of a cool thing,
Yann LeCun (Chief AI Scientist) 04:48.070
right? you produce those cool videos and, wow, you know, people are really impressed by them. now, there's no guarantee whatsoever that when you train a video generation system it actually has an accurate model of the underlying dynamics of the world, or that it's
Yann LeCun (Chief AI Scientist) 05:03.070
learned anything, you know, particularly abstract about it. and so the idea that somehow a world model needs to reproduce every detail of reality is wrong and harmful, and i'm going to tell you why, OK? a good example of simulation is CFD, computational fluid dynamics. it's used all
Yann LeCun (Chief AI Scientist) 05:29.550
the time; people use supercomputers for it, right? so you want to simulate the flow of air around an airplane: you cut up the space into little cubes, and within each cube you have a small vector that represents the state of that cube, which is, you know, velocity, density or mass, and
Yann LeCun (Chief AI Scientist) 05:51.670
temperature, and maybe a couple of other things, right? and then you solve the navier-stokes equations, which are a partial differential equation, and you can simulate the flow of air. now, the thing is, this does not actually necessarily solve the equations very
Yann LeCun (Chief AI Scientist) 06:10.590
accurately: if you have chaotic behavior, like turbulence and stuff like that, the simulation is only, you know, approximately correct. but in fact that's already an abstract representation of the underlying phenomenon.
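As a toy illustration of that kind of grid simulation (one scalar per cell and a simple diffusion update, nowhere near real navier-stokes):

```python
import numpy as np

n = 64
field = np.zeros(n)
field[n // 2] = 1.0          # a blob of "density" in the middle cell
alpha, dt = 0.1, 1.0         # diffusion coefficient and time step

for _ in range(100):
    # discrete laplacian over the 1-D grid of cells
    lap = np.roll(field, 1) + np.roll(field, -1) - 2 * field
    field += alpha * dt * lap
print(field.max())           # the blob spreads out and flattens
```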
Yann LeCun (Chief AI Scientist) 06:26.590
the underlying phenomenon is molecules of air that bump into each other and bump on the wing of the airplane, right? nobody ever goes to that level to do the simulation; that would be crazy, right? it would require an amount of computation that's just insane, and it would depend on the initial conditions. i mean, there's all kinds of reasons we don't do
Yann LeCun (Chief AI Scientist) 06:43.990
this. and maybe it's not molecules; maybe, at a lower level, we should simulate particles and, like, you know, do the feynman diagrams and simulate all the different paths that those particles are exploring, because they don't take one path, right? it's not
Yann LeCun (Chief AI Scientist) 06:57.190
classical, it's quantum. so at the bottom it's, like, quantum field theory, and probably even that is already an abstract representation. so, you know, everything that takes place between us at this moment can, in principle, be described through quantum field theory, OK? we just have to
Yann LeCun (Chief AI Scientist) 07:17.200
measure the wave function of the universe in a cube that contains all of us. and even that would not be sufficient, because there are entangled particles on the other side of the universe that, you know, we don't have access to, so it wouldn't be sufficient. but let's imagine it, OK, for the sake of
Yann LeCun (Chief AI Scientist) 07:33.790
the argument. first of all, we would not be able to measure this wave function, and second of all, the amount of computation we would need to devote to this is absolutely gigantic; it would require a gigantic quantum computer that, you know, is the size of europe or something. so there's no way
Yann LeCun (Chief AI Scientist) 07:54.150
we can describe anything at that level, and it's very likely that our simulation would be accurate for maybe a few nanoseconds; you know, beyond that it would diverge from reality. so what do we do? we invent abstractions. we invent abstractions like particles, atoms, molecules; in the
Yann LeCun (Chief AI Scientist) 08:12.550
living world it's proteins, organelles, cells, organs, organisms, societies, ecosystems, et cetera, right? and basically every level in this hierarchy ignores a lot of details about the level below, and what that allows us to do is make more reliable, longer-term predictions. OK, so we
Yann LeCun (Chief AI Scientist) 08:38.830
can describe the dynamics between us now in terms of the underlying science and in terms of psychology, OK? that's a much, much higher level of abstraction than particle physics, right? and in fact, you know, every level in the hierarchy i just mentioned is a different field of
Yann LeCun (Chief AI Scientist) 08:55.990
science. a field of science is essentially defined by the level of abstraction at which you start making predictions, right? the level that you allow yourself to use to make predictions. in fact, physicists have this down to an art, in the sense that, you know, if i give you a box full of gas, you
Yann LeCun (Chief AI Scientist) 09:18.190
could in principle simulate all the molecules of the gas, but nobody ever does this. at a very abstract level we can say, you know, PV equals nRT, right? pressure times volume equals the number of moles times the temperature, times a constant, blah blah blah. and so, you know, at that
Yann LeCun (Chief AI Scientist) 09:39.870
global, emergent, phenomenological level, if you increase the pressure, the temperature will go up, or if you increase the temperature, the pressure will go up, right? or if you let some particles out, then the pressure will go down, and blah blah blah, right?
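That phenomenological model fits in a few lines, with no molecules anywhere in sight:

```python
R = 8.314  # J/(mol*K), the gas constant

def pressure(n_moles, temperature_k, volume_m3):
    """ideal gas law, PV = nRT, solved for P"""
    return n_moles * R * temperature_k / volume_m3

p_cold = pressure(1.0, 300.0, 0.0224)   # one mole in ~22.4 liters
p_hot = pressure(1.0, 600.0, 0.0224)    # double the temperature...
print(p_hot / p_cold)                   # ...and the pressure doubles: 2.0
```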
Yann LeCun (Chief AI Scientist) 09:57.710
so all the time we build phenomenological models of something complicated by ignoring all kinds of details, which is what physicists call entropy. but it's really systematic; that's the way we understand the world. we do not memorize every detail of what we perceive, and we certainly do not reconstruct it. so
Yann LeCun (Chief AI Scientist) 10:20.120
world models don't have to be simulators at all. well, they are simulators, but in an abstract representation space, and what they simulate is only the relevant part of reality, OK? if i ask you where jupiter is going to be one hundred years from now, i mean, we have an enormous amount of
Yann LeCun (Chief AI Scientist) 10:37.310
information about jupiter, right? but within this whole body of information that we have about jupiter, to be able to make that prediction of where jupiter is going to be one hundred years from now, you need exactly six numbers: three positions and three velocities. and the rest doesn't matter.
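To make the six-numbers point concrete, here's a minimal sketch that propagates just those six numbers under the sun's gravity (sun fixed at the origin, other planets ignored; the constants are rough):

```python
import numpy as np

GM_SUN = 1.327e20                       # m^3/s^2, the sun's gravitational parameter
pos = np.array([7.78e11, 0.0, 0.0])     # three positions (m): roughly jupiter's orbit
vel = np.array([0.0, 1.306e4, 0.0])     # three velocities (m/s): circular-orbit speed

dt = 86_400.0                           # one-day time step
for _ in range(36_500):                 # about one hundred years
    vel += -GM_SUN * pos / np.linalg.norm(pos) ** 3 * dt   # gravity kick
    pos += vel * dt                                        # drift
print(pos)                              # a century-out prediction from six numbers
```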
? (?) 10:52.350
so you don't believe in synthetic datasets?
Yann LeCun (Chief AI Scientist) 10:52.350
i do, no, it's useful, you know, data from games. i mean, there are certainly a lot of things that you can learn from synthetic data, from, you know, games and things like that. i mean, you know, children learn a huge amount from play, which
Yann LeCun (Chief AI Scientist) 11:10.430
is basically a kind of simulation of, you know, the world, a little bit, right? but in conditions where they can't kill themselves.
? (?) 11:10.430
but i worry, at least for video games, that, for example, what the green-screen actors doing the animations are doing is designed to look
Yann LeCun (Chief AI Scientist) 11:27.840
good, you know, like, kind of badass, i guess, for an action game, but it often doesn't correspond very well to reality. and so i worry that, like, a physical system that's, you know, been trained through, or with the assistance of, world models might pick up similar quirks, at least
? (?) 11:46.030
in the very short term. is this something that worries you?
Yann LeCun (Chief AI Scientist) 11:46.030
no, it depends on what level you train them at. so, for example, i mean, sure, if you use a very accurate robotic simulator, right, it's going to accurately simulate the dynamics of an arm: you know, when you apply torques
Yann LeCun (Chief AI Scientist) 12:00.550
to it, it's going to move in a particular way. dynamics, no problem. now, simulating the friction that happens, you know, when you grab an object and manipulate it, that's super hard to do accurately; friction is very hard to simulate, OK? and so those simulators are not particularly
Yann LeCun (Chief AI Scientist) 12:15.950
accurate for manipulation. they're good enough that, you know, you can train a system to do it, and then you can do, you know, sim-to-real with a little bit of adaptation, so that can work.
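One standard way to live with the badly-modeled friction is domain randomization during sim training; a sketch, where `sim`, `policy`, and the settable `friction` attribute are hypothetical stand-ins rather than any real simulator API:

```python
import random

def train_episode(sim, policy):
    # resample the poorly-simulated quantity every episode, so the policy
    # has to work across the whole range instead of exploiting one wrong value
    sim.friction = random.uniform(0.2, 1.2)   # hypothetical parameter
    obs = sim.reset()
    done = False
    while not done:
        obs, reward, done = sim.step(policy(obs))
```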
Yann LeCun (Chief AI Scientist) 12:32.240
but the point, i mean, is much more important than that: there's a lot of completely basic things about the world that we completely take for granted, which we can learn at a very abstract level, but which are not language-related, OK? so, for example, and i've used this example before and people made fun of me for it, but it's really true, OK: i have
Yann LeCun (Chief AI Scientist) 12:47.270
those objects on the table, and the fact that when i push the table the objects move with it, this is something we learned; it's not something that you're born with, OK. the fact that most objects will fall when you let them go, right, with gravity, maybe we learn this around a certain
Yann LeCun (Chief AI Scientist) 13:02.870
age. and the reason people made fun of me with this is because i said, you know, LLMs don't understand this kind of stuff, right? and they absolutely do not, even today. but you can train them to give the right answer when you ask them the question: you know, if i put an
Yann LeCun (Chief AI Scientist) 13:20.990
object on the table and then i push the table, what will happen to the object? it will answer that the object moves with it, but that's because it's been fine-tuned to do that, OK? so it's more regurgitation than sort of real understanding of the underlying dynamics.
? (?) 13:34.630
but if you look at, i don't know, sora, or, like, nano banana, they have a good physics of the world, right?
Yann LeCun (Chief AI Scientist) 13:34.630
they are not perfect; they have some physics.
? (?) 13:54.630
yeah, they have some physics, right. so do you think we can't push that, or do you think it's one way to actually learn physics?
Yann LeCun (Chief AI Scientist) 13:54.630
actually, they make predictions in a representation space. they use diffusion transformers, and that prediction, the computation of the video snippet at an abstract level, is done in representation space, OK? not always auto-regressively, by the way; sometimes it's just in parallel. and then there's a second
Yann LeCun (Chief AI Scientist) 14:14.510
diffusion model that turns those abstract representations into a nice-looking video. and that might be mode collapse, we don't know, right? because we can't really measure the coverage of such systems against reality.
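Schematically, the two-stage design being described might look like this; the module sizes and choices below are illustrative assumptions, not any particular system's architecture:

```python
import torch
import torch.nn as nn

class LatentPredictor(nn.Module):
    """stage 1: predict a chunk of video in an abstract latent space"""
    def __init__(self, dim=512):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.core = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, z_past):              # (batch, time, dim) latents
        return self.core(z_past)            # predicted latents, computed in parallel

predictor = LatentPredictor()
z_past = torch.randn(1, 16, 512)            # stand-in for encoded past frames
z_next = predictor(z_past)                  # the abstract prediction
# stage 2 (not shown): a separate diffusion decoder renders z_next into pixels
```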
Yann LeCun (Chief AI Scientist) 14:37.070
but, you know, to the previous point: here is another completely obvious concept to us, one we don't even imagine that we learn, but we do learn it: a person cannot be in two places at the same time, OK? we learn this because very early on we learn object permanence, the fact that when an object disappears it still
Yann LeCun (Chief AI Scientist) 14:56.990
exists, OK, and when it reappears it's the same object that you saw before. how can we train an AI system to learn this concept? so, object permanence: you just show it a lot of videos where objects, you know, go behind a screen and then reappear on the other side, or where they go
Yann LeCun (Chief AI Scientist) 15:13.640
behind the screen and the screen goes away and the object is still there. and when you show four-month-old babies scenarios where things like this are violated, their eyes open, like, super big and they're super surprised, because reality just, you know, violated their internal
Yann LeCun (Chief AI Scientist) 15:29.790
model. the same thing when you show a scenario of, like, a little car on a platform: you push it off the platform and it appears to float in the air. nine- or ten-month-old babies look at it really surprised; six-month-old babies barely pay
Yann LeCun (Chief AI Scientist) 15:45.870
attention, because they haven't learned gravity yet, so they haven't been able to, you know, incorporate the notion that every object is supposed to fall. so this kind of learning is really what's important, and you can learn it from very abstract things.
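The same violation-of-expectation test can be run on a model: compare its prediction error on a plausible clip against an "impossible" one. In this sketch, `model`, its `encode`/`predict` methods, and the two clips are all hypothetical stand-ins:

```python
import torch

def surprise(model, video):
    """mean error of the model's one-step predictions over a clip"""
    z = model.encode(video)          # abstract per-frame representations (hypothetical API)
    z_pred = model.predict(z[:-1])   # predicted next representations (hypothetical API)
    return torch.mean((z_pred - z[1:]) ** 2).item()

# a model that has learned object permanence should be far more "surprised"
# by the clip where the occluded object silently vanishes
print(surprise(model, plausible_clip))     # low
print(surprise(model, impossible_clip))    # high
```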
Yann LeCun (Chief AI Scientist) 16:01.470
you know, the same way babies learn about, like, social interactions by, you know, being told stories with simple pictures: it's a simulation, an abstract simulation of the world, but it sort of teaches them, you know, particular behaviors. so you could imagine training a system
Yann LeCun (Chief AI Scientist) 16:18.830
from let's say an adventure