Models are Predictive Simulations

About the meaning of the word “model”

4544 words

v1

Originally published on eighttrigrams.substack.com on June 21st, 2025

Model Architecture Athens [commons].

I Introduction

The term model has become rather ubiquitous in our modern world. Recently the term has gained traction in public discourse as part of the large language model (LLM) revolution, which started with the release of ChatGPT in late 2022. That artificial intelligences are based on models is by now common knowledge among laypeople.

Before that, the public was aware of mathematical and statistical models, weather and climate models, etc. And before that, and still in use, are all our other usages of the term model. As in: miniature models, role models, fashion models, and so on.

Given that I mentioned[1] that one of this blog’s topics is modelling, and modelling reality in particular, it is time to take a closer look at a) what models are and b) what they help us accomplish.

I want to give an account of what models are that says something useful about all types of models, aiming for the essence of the term.

Let’s start by noting that models are said to be models of something, or - in short - model something. Models are either representations or imitations of other things, or something to be imitated, as is the case where a role model sets an example for other people. A re-presentation is an image or a copy of one thing in another medium - anything really, from paintings to clay sculptures to mental imagery.

The thing modelled and the model share some properties and are similar to one another in some ways. But the model is always simpler than the original. Only a selection of properties is kept in the model compared to the original thing. So-called scale models mimic things first and foremost in their geometric appearance - see for example architectural models or toy cars. The reduction of detail - the abstraction - is obvious here. You wouldn’t find functioning mini-elevators inside a miniature skyscraper, as its purpose is to present an architectural vision. A miniature toy car lacks a functioning motor or electric window lifters, things not essential to ‘playing traffic’ with a bunch of them.

A fashion model’s purpose lies in demonstrating how good clothes could possibly look on a person. In that function her personality is regarded as negligible; instead, she embodies perfection of form. The people eventually wearing the manufactured clothes are all more particular in their body shapes, while she alone combines all the qualities we’d consider beautiful. Similarly, a Renaissance statue like Michelangelo’s David demonstrates what the perfect body would look like. Particularities of personality and body shape are abstracted away in models and statues.

What enables us to make the transfer between actual and ideal things is our mental capacity to draw analogies between two or more different things and recognise them as being of the same type. Every time we point at a thing and name it ‘chair,’ we recognise a particular object as being one of a group of similar objects. Since we do this not only across space - as when we see a couple of chairs next to each other - but across time, we must hold intermediate representations in our heads which permit future categorisations of things as chairs. These intermediate representations are by their nature idealisations - or ideas - of the things they represent.

But far from simply being still-images of things - which is to say memories of things’ shapes - such ideas of things are dynamic representations which form part of simulations which play before our mind’s eye. I will explain this in the following.

II Prediction

The perennial question of every organism with a nervous system is and always has been: “if I find myself in this or that situation, what should I do?” which is to say “what should I do in order to ensure my survival?” Regarding that, the first things of interest to any organism are other organisms, for life necessitates ingestion of organic material. Since organisms represent food to each other, other organisms, rather than the lifeless features of the landscape like, say, rock formations (or chairs!), draw all their attention.

That means an organism will from time to time find itself in the situation where it is under threat by another organism, and has to react to that other organism’s actions in order to save its skin. Or hunger forces it to act. That is, it will try to eat another organism to ensure its own survival. In any case, both organisms can act and react to each other’s actions and adjust their actions in any instant based on what the other one does.

Of crucial importance here, though, is that different types of animals display different behaviour patterns. Understanding those behaviour patterns thus presents a huge advantage to any organism.

As a human, to be able to manoeuvre successfully in an encounter with, say, a lion, before anything else requires me to correctly identify that particular living thing in front of me as a lion. For that, I must have seen similar ones before and must have gotten the idea that this particular type of animal is capable of killing men (and does so at least once in a while).

But, having said that, I may also, while walking through the savannah, encounter a huge lion asleep on the ground, next to a fresh carcass. I retreat carefully, of course, but I am pretty sure the lion won’t be interested in having me as a dessert right now, sleepy and satisfied as it looks.

That is why I said earlier that our mental models are not simple still images. Lions don’t always react in the same way. Lions react in certain ways under certain circumstances and in other ways in other circumstances. All aspects taken together form the idea of a lion, the possession of which enables me to make predictions about its behaviour under various circumstances.

What I end up doing in a particular situation of encountering a lion requires me to not only get the idea of the lion right, but also my idea of the overall situation, the concrete set of circumstances I encounter it in. Certain aspects of the situation, as we have seen with the carcass lying next to the lion, influence which behaviour the lion will actually display in all likelihood.

The features of the landscape are important, too (and the necessity of a general embodied understanding of physics is taken for granted here). Wading through water slows us down, as does slipping and falling to the ground. A smaller animal might escape us by disappearing into a hole too small for us. We might save ourselves from some attacks of at least some animals by climbing a nearby tree.

In general, before I can even look at different courses of action I can take, I need to understand which aspects of which objects (living and non-living) in a given situation are relevant and how they - taken together - influence how the situation will likely ‘play out’ without me doing anything.

We are capable of making these kinds of predictions by virtue of running simulations of situations in our heads - while they are happening. These simulations present themselves effortlessly to us, and when they are in accordance with what actually happens, it can be said we’ve made an accurate prediction.

These simulations before our mind’s eye, finally, are the models which I take to be prototypical for all models - dynamic (mental) models which make predictions which ‘carry over’ into reality.

The unique ability of nervous systems to “make models” is called intelligence, according[2] to cognitive scientist Joscha Bach. Making local predictions in a basically infinitely complex universe is afforded by two things. Firstly by being able to compress aspects of it, given the severely constrained means available to a nervous system (or another substrate of comparable computational capacity); secondly by being able to recover relevant information later on, whereby what is relevant depends on our goals in a particular situation (a point we will get back to at the end of this article).

A thermostat, as Bach notes[3], doesn’t have a model, for it can act on its disequilibrium instantaneously, translating its measurement of the temperature directly into regulating it up or down. To account for the complexity and uncertainty of the real world, predictive simulation models are required. I will consider all other types of models as derivative of those, and as latching on to some of our mental faculties needed to make those prototypical ones possible.
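The contrast Bach draws can be sketched in a few lines of code. This is only an illustration of the distinction - not of any real control system or cognitive architecture - and all names and numbers are made up:

```python
def thermostat(temperature, target=21.0):
    """No model: the measurement translates directly into an action."""
    if temperature < target:
        return "heat"
    if temperature > target:
        return "cool"
    return "off"

def predicted_landing_time(height, dt=0.001):
    """A minimal predictive simulation: instead of reacting to an
    object's current position, step a tiny model of the world forward
    and answer a question about the future ('when does the falling
    object hit the ground?') before it happens."""
    g = 9.81  # the 'physics' the model assumes about the world
    position, velocity, t = height, 0.0, 0.0
    while position > 0.0:
        velocity += g * dt       # simulate, don't measure
        position -= velocity * dt
        t += dt
    return t

print(thermostat(18.0))                        # reacts to the present
print(round(predicted_landing_time(5.0), 2))   # predicts the future
```

The thermostat collapses measurement and action into one step; the second function has to carry an internal stand-in for the world and run it forward - a crude analogue of the predictive simulations discussed above.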

III Technology

It is surely easier to see us as being the food in an encounter with a lion than the other way around, naked humans that we are. However, a quick glance at ancient cave paintings should serve as a reminder that our use and mastery of weapons - technology, really - has changed that equation forever.

Our capacity to factor features of the non-living environment into our ‘list of options’ - for example in encounters with animals - is the one by which we distinguish ourselves from all other animals. Even before we look at the crafting of proper weapons - a multi-step process - we can see that grabbing a nearby stone and smashing it over the head of another animal constitutes an act which no animal other than us, with the possible exception of chimpanzees, is capable of.

To a dog, a chair is […] a part of the house’s immovable fixtures as is a bookcase. He will jump on to it if it affords him a route to something he wants, but to move it to a more favourable position for such use is beyond his capacity, although well within that of a chimpanzee, whose idea of the thing is correspondingly more complex. Lowly animals lack any capacity to integrate aspects into an idea of an object, and so cannot respond to an object unless it is presented to them in the one particular aspect which they can recognize.[4]

What sets us humans apart from “lowly” animals, as the quote above shows, is the capacity to not only consider relevant features of the landscape in fight or flight, but isolate and exploit usable aspects of things - their properties or qualities - on purpose.

As we have discussed further above, predictive mental models first of all let us play through the unfolding of a situation in our heads. This happens in a mostly involuntary fashion and starts with a prediction which doesn’t factor in any action on our side. However, we are able to mentally calculate different outcomes of the situation which are likely to happen given any of the different possible actions we could take. Usually we’ll come up with a best course of action on the spot.

But we are also capable of reasoning our way through the different options and comparing them against one another in our heads - ahead of time. That is, we are able to run mental simulations outside of the situations which trigger them. Non-living things especially are suitable elements of such mental manipulations, as their behaviour can be understood relatively easily compared to that of living beings with intentions and complex behaviour patterns. In the end, we might just sit there and ponder and recombine different aspects of different things and come up with new ideas of how to manipulate matter with matter, in order to make our lives easier.

One prototypical quality of various different things which is of the utmost usefulness to us is that of being able to cut other things.

On different occasions we may use the same object for different purposes, and so exploit different qualities, but what we call the object’s qualities are the ways in which it can be used. As our needs and methods grow more complicated, we recognize ever more qualities in things. Since it often happens that many quite different things can be used for one and the same purpose, we say that they have a quality in common, and in our thoughts we separate this quality, as an abstraction, and work out our plans in terms of it, without burdening our minds with all the miscellaneous and variable concomitants which accompany it in the real world. Sharpness, for instance, is that quality which makes a tool useful for cutting […][5]

We call the production of artifacts whose properties we deliberately leverage to cause certain outcomes technology.

Technology might be as ‘simple’ as using sticks to reach fruits, or devising cutting tools. Cutting tools combined with the idea of throwing stones give us spears, which allow us to kill animals from a safe distance. With animal traps, we anticipate that an animal will act in a certain way - one of the many ways we have seen it act. The trap will force the animal to act in one particular way. This is achieved by arranging the environment to have certain properties or contain certain objects or arrangements of objects.[6]

Technology is made possible by our faculty of abstracting properties from real world entities into our dynamic predictive models, by our ability to deliberately and slowly run alternative paths in these simulations, and then by flipping the relationship between model and reality on its head - by creating artifacts which are projections of these simulations into the real world.

By creating ever more of these artifacts, we learn to control our environment; we bend reality to our will.

The crucial fact about technology in the context of the current discussion is that we do not merely predict a certain outcome given a particular situation as encountered, plus our intervention. What we do instead is construct realities which are narrowly constrained and tightly follow the abstract nature of our models.


Truck on ramp of USS Newport (LST-1179) 1969 [commons].

From the general observation of things sliding or rolling down chaotically shaped, but overall sloped planes, we abstract the model of the two-dimensional inclined plane, on the basis of which we then build things like ramps, to offload cargo from a vessel, or trams which transport people up a hill.

IV Science

What we said so far about dwelling in mental simulations to figure out ways to ‘make our lives easier’ (with technology) is true, and we should add that doing this more often than not follows the pattern of first encountering a problem, a frustration of some sort; a recognition that something could be made more predictable - if only the right way could be found. That is to say that there is always a spark of creativity involved - which cannot be forced, but is certainly driven by some instinct to control things. The raw material for creating solutions consists in our experience with things and their observed behaviour in the world.


Lisbon hill ascent. One Night In Lisbon [commons].

At some point, however, humans took the next step - making efforts to methodically develop models before we even know to which problems they might present solutions later on. This has become known as the Scientific Revolution, which started during the Renaissance.

Galileo’s experiments[7] with falling bodies and inclined planes[8] can be seen as its kickoff point. Inclined planes of course are just ramps, and humans have used ramps for a very long time. However, they had never been studied methodically until then.

The inclined plane as an experimental setup involves a ramp whose inclination can be measured, and an object whose path is restricted to the ramp - it can move only up or down along it - and whose weight, material and other properties can be measured or otherwise precisely specified.


Box on an Inclined Plane [commons].

In this type of setup we also call measurable properties variables, because they can take on different values; for example, we might exchange the object for one double its weight. We do this in order to measure what effect that has on other measurable properties; for example, how much time it takes for the object to reach the bottom, compared to the original object of half the weight. The variables we directly control are called input variables; the changing values on the other end are called output variables.

We conduct this experiment repeatedly, with multiple different values for the input variables, and measure the effects on the various output variables. Importantly, we focus on one pair of input and output variables at a time and record series of measurements for them. It is done this way in order to understand the input variable’s contribution to the effect on a particular output variable, as that property can be influenced by many input variables at the same time.

The purpose of this exercise is to develop a model which we can ‘ask questions’ in terms of its variables; we can specify values for all its input variables, and then ask how long it takes for the object to reach the bottom, for example. That means that at some point - having developed a good model - we would get our answer before we conduct the experiment another time; we can make predictions which ‘carry over’ into reality.
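The experimental loop just described can be sketched in code. In the sketch below, a step-by-step simulation stands in for the physical measurement runs, and the idealised frictionless relation t = √(2L / (g·sin θ)) plays the role of the ‘developed model’; all numbers are illustrative. Notice that the object’s weight appears nowhere - echoing Galileo’s finding that, absent friction and air resistance, heavier objects don’t slide down faster:

```python
import math

G = 9.81        # gravitational acceleration, in m/s^2
LENGTH = 2.0    # length of the ramp, in metres

def run_experiment(angle_deg, dt=1e-5):
    """Stand-in for one measurement run: let the object slide down
    the ramp step by step and record how long it takes."""
    a = G * math.sin(math.radians(angle_deg))  # acceleration along the ramp
    s, v, t = 0.0, 0.0, 0.0
    while s < LENGTH:
        v += a * dt
        s += v * dt
        t += dt
    return t

def ask_model(angle_deg):
    """The developed model: answers the same question
    without running the experiment again."""
    return math.sqrt(2 * LENGTH / (G * math.sin(math.radians(angle_deg))))

# Vary one input variable (the inclination) and compare the output
# variable (time to reach the bottom) with the model's prediction:
for angle in (10, 20, 30, 45):
    print(angle, round(run_experiment(angle), 3), round(ask_model(angle), 3))
```

Once measurement and model agree across the swept values of the input variable, the experiment no longer needs to be run - the model’s predictions ‘carry over’ into reality.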

The buildup of models one relationship or aspect at a time recalls how we earlier described the development of ideas of animals in our heads. We do this by experiencing them in different situations. Ideas of non-living things and how they behave in the physical world are created in a similar fashion, by repeated exposure and interaction. We don’t need to be scientists for that. It is quite the other way around: in their almost methodical way of playing with toys, small children remind us of little scientists;[9] which is to say that this faculty is largely inborn.

Somewhere in our human minds the ability to understand and manipulate the world of things is deeply rooted. And while our original models, which we perhaps share with many other animals, allow us to predict situations on the spot (which is what we typically call heuristics), somewhere along the way in our evolutionary and historical past we learned to reason in more structured ways by separating things out, factor by factor. Our models grew ever more abstract and symbolic, which allowed us to recombine factors in novel ways, which helped us develop increasingly sophisticated technologies. Computers and software, for example.

V Artificial Intelligence

Large language models operate based on artificial neural networks. On another level, they relate ‘tokens,’ which are computer representations of word-parts, to one another. The combination of these two techniques yields something that has proven - one is tempted to say unreasonably - effective. Things happened so quickly that, by blinking in the wrong moment, one might have missed the fact that we passed the Turing test, the commonly accepted goal of artificial intelligence since its inception as a field.
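What ‘relating tokens to one another’ starts from can be made concrete with a toy example. Real tokenisers (byte-pair encoding and the like) learn their vocabulary of word-parts from data; the hand-made vocabulary and greedy matching below are purely illustrative of the principle of mapping text to sequences of integer ids:

```python
# A hypothetical, hand-made vocabulary of word-parts mapped to ids.
VOCAB = {"model": 0, "ing": 1, "s": 2, "un": 3, "predict": 4, "able": 5, " ": 6}

def tokenize(text):
    """Greedily split text into the longest known word-parts."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in VOCAB:
                ids.append(VOCAB[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

print(tokenize("unpredictable models"))  # word-parts, not whole words
```

Everything an LLM ‘knows’ about the world it must express as relations over such id sequences - which is part of what makes the question of whether this amounts to a model of the world so interesting.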

Now, three years into the LLM revolution, I wasn’t too surprised when, midway through writing this article, I came across a YouTube video featuring Yann LeCun, in which he speaks about new approaches[10] to achieving artificial general intelligence (AGI) that differ markedly from language models. These newer types of models, in contrast to the now common LLMs, aim for a general understanding of the physical world with its spatial, temporal and causal features - all of which should remind us very much of the mental simulations we spoke about extensively.

The organisation of this essay loosely followed the evolution of living organisms, particularly humans. We first spoke about the primal experiences of organisms of eating and being eaten. Later on we spoke about human technology, and, finally, the slow, careful, and deliberate reasoning required in scientific settings.

A natural conclusion then is that the pinnacle of intelligence lies in formal reasoning. It was by this logic that early research in “Good Old-Fashioned” Artificial Intelligence (GOFAI) focused on symbolic reasoning. Artificial intelligence as a field has from its inception been inextricably linked to other fields like linguistics, philosophy, psychology and computer science,[11] and early research in the field can be said to have broadly coincided with an analytic philosophy and linguistics heavily focused on symbols and grammar. As soon as the idea of computational AI was hatched, the mind itself was imagined as some sort of computer, a symbol manipulation machine.[12]

GOFAI’s prototypical success case has been ‘solving’ chess - which we eventually did, once the computational resources had become available to handle the combinatorial explosion. Later on, using the newer techniques of neural networks, we solved the game of Go. Later still, as we said, we passed the Turing test. We haven’t solved self-driving cars satisfactorily yet, though. And we haven’t solved general and genuinely creative problem-solving.

Frustration with unsolved problems during different historical phases (between the so-called AI winters) has always led to some sort of ‘correction,’ that is, reconsideration, of approaches. And so we find ourselves factoring ever more fundamental - and previously considered simple - features of living nervous systems into our thinking about intelligence. The step from symbolic architectures to neural networks marked such a transition.

As we see, in the quest to create AGI and to understand intelligence, we find ourselves on a trajectory which is quite the opposite of the one laid out in my article, which followed our evolutionary path, as I said. Instead, we begrudgingly make our way from the airy-fairy world of Platonic ideas into the wetlands of the warm, fuzzy and chaotic world of very lowly living beings.

Which brings us back to the current new ideas introduced by LeCun and others. Mental simulations are back on the menu, it seems. The idea of bringing physics simulations to artificial agents is not new. In 2017, for example, Ullman, Spelke, Battaglia & Tenenbaum proposed such an architecture based on game engines.[13] My knowledge of game engines tells me that these are computer programs in themselves, which, I imagine, have simply been made accessible to other parts of an agent architecture.

What should, in my estimation, be much more interesting about the new approaches is that generalisations about the causal-dynamic structure of the world (i.e. its physics) are now somehow derived by the agents’ neural nets themselves. That sounds to me like a very promising next step, as it removes one layer of explicit modelling and perhaps enables them to make interesting generalisations about the world all by themselves. The new approaches build on the insights won from the current generation of neural networks and LLMs.

What is not completely clear, though, with regard to higher intelligence itself (not only artificial intelligence), is why we would ‘mirror’ the outside world in an explicit manner at all. Why are spatio-temporal-causal simulations, playing before our mind’s eye and representing the outside world as we experience it, necessary for the type of agents that we are? Answering that question is beyond the scope of this article. But it should be noted that in the same way that we might say an LLM really exists only as its weights, we could imagine that our mental models produce their predictions simply based on the interconnections of neurons in our brains. I.e. why not just black boxes? From simple nervous systems to ones having complex brains there surely had to be a mostly gradual evolutionary path, and simple nervous systems couldn’t have had such - let’s call them four-dimensional - reality-mirroring simulations from the get-go.

This is not to doubt their usefulness by any means; it is merely something we want to focus on in future stages of AI research, in order to find out something important about ourselves.

VI Conclusion

In exploring what is meant by the term model I came to the view that models are primarily of a nature which allows simulation of the world we inhabit. But having laid much emphasis on their dynamic nature, the important takeaway here is not movement per se, but rather the implied property of representing cause and effect. This becomes more visible the slower and more deliberate our thinking is. When we rotate an object in our minds, it is because that object might be useful in a rotated position to achieve a purpose it otherwise couldn’t in its original position. When we throw a spear at an animal, it is because - and this will sound awfully technical - of an intended state transition; we want that object to change from living to non-living.

Desired effects are problems to which models present solutions, and we humans have become master problem-solvers, not least because we have developed ways to project our models outwards and thereby share them with other humans. Examples are when we make a clay model, or an architectural model which is built in order to allow others to critique the thing we actually want to build (a building).

But tinkering with physical things has other good effects. It allows cognitively less expensive but otherwise more powerful faculties, like manipulation by touch, to come into play. Since this all takes place in the imperfect physical world, it also invites happy accidents. Ultimately, the model can be transposed back into our heads,[14] onto a blueprint, and outward again, to eventually become the building we never could have envisioned otherwise!

But not only can models cross boundaries, represent things outside themselves, and be shared with others. They can also be applied across domains, where causal relationships understood in one domain are (tentatively) projected into a less understood domain. In technology and science we see, for example, how the economy was once modelled as a system of water pipes,[15] the mind was likened to a computer (as we’ve seen above), and the atom was likened to the solar system. And let’s not forget to mention the billiard-ball model of particle physics.

We mentioned at the beginning that model-making depends on the recognition of similarities between things. And while similarities are maybe intuitively seen as similarities between properties like shape or colour, the more important similarities in things, for us, are what we can do with them, as we have seen when we mentioned the property of ‘cutting,’ which is shared by various objects. Animals too, we have seen, are to a large extent remembered as the collection of their behaviours (i.e. the things they do).

The faculty of our minds to compress the information available ‘out there’ into something usable for prediction depends on abstracting away irrelevant detail. What is and what isn’t relevant detail, however - and therefore which particulars do or don’t get retained in the process of abstraction - can only be seen with respect to the intended effects one wants to achieve.

Predictive models by their nature don’t capture every aspect of reality, for that would mean: no compression. That means that the “world model“ of any particular organism is in some way tied to its way of being, its behaviour pattern, its mode of interaction with the world (or its survival strategy, if you want). Which always leads back to questions of whether bodies[16] of some sorts are necessary to achieve intelligence, and whether a neutral world model, which is independent of purpose, i.e. intended ways to manipulate the world, can even exist.

Footnotes

  1. In my last essay titled Best Way to Live, see here.

  2. He says that here, here, here and here.

  3. See here.

  4. p.39 in G.A. Wells: What’s in a Name? - Reflections on Language, Magic, and Religion; Open Court Publishing, 1993.

  5. p.42-43, ibid.

  6. Basically, we trick its model.

  7. Maybe interesting to note here that physics experiments like these have made Physics the model for all other types of science in the sense of a role model. It seems the preciseness of physical experiments and theories cannot be matched elsewhere. Every field has its own methods which get justified by idiosyncrasies of their respective subjects, but one can hardly escape the impression that there exists often a little envy towards Physics (see “Physics Envy” at wikipedia.org).

  8. Some successful science communication on this can be found here: “Galileo’s Measure Of Gravity Explained By Jim Al-Khalili | The Amazing World Of Gravity” on YouTube.

  9. Note there is a book exploring that very thesis: The Scientist in the Crib: What Early Learning Tells Us About the Mind, 2000 (my edition; the first edition from 1999 carried the subtitle Minds, Brains, And How Children Learn). By Gopnik, Meltzoff, Kuhl.

  10. Here. Although that is not a recommendation. It is probably best to do an internet search for the JEPA architecture.

  11. I am not sure where to place cognitive science in that timeline, but neuroscience has surely been a more recent addition.

  12. Compare Chinese Room Argument, which is a famous thought experiment. See plato.stanford.edu.

  13. Mind Games: Game Engines as an Architecture for Intuitive Physics; research paper, sciencedirect.com. By Ullman, Spelke, Battaglia, Tenenbaum.

  14. There exists actually the idea that what we call the mind doesn’t end at one’s skull or skin, but rather extends into the human artifacts we build and are involved with. This is known as Enactivist cognition or Material Engagement Theory. See for example How Things Shape the Mind: A Theory of Material Engagement; MIT press, 2013. By Lambros Malafouris.

  15. See Phillips Machine, wikipedia.org.

  16. I’m alluding here to core tenets of Embodied cognition (see plato.stanford.edu). But maybe having a body can be taken to mean that an agent has a boundary which separates its internal states from external states, the latter of which it has only access to through interfaces of some sort.
