I want to discuss an academic question with you. I’m going to tell you a rough idea I’ve been thinking about, and I’d like you to listen carefully and then help me analyze it, give me suggestions, and maybe help me develop it over time.

What I’ve been thinking about is this: people still do not really understand what machine learning and AI actually are, how they really work, or how they relate to the kind of intelligence we intuitively recognize in humans and animals. Another way to put it is that I want a theoretical framework for understanding current machine learning practice: what these methods are doing, why something like intelligence seems to emerge from them, and how that relates to the broader and more general concept of intelligence that we associate with humans and animals. In my mind, this is really about building a theory of intelligence and contemporary AI. Or maybe, to use a phrase I find very compelling, a kind of “physics of AI.”

That phrase makes a lot of sense to me because AI systems are becoming more and more capable, and at the same time AI safety is becoming a more and more serious concern. We do not really understand how these systems work, and more fundamentally, we do not really understand how intelligence itself works. If we do not understand the general principles behind intelligence, then it is very hard to do safety work in a deep way. So for me, two of the most important things we need to understand are, first, what intelligence actually is, and second, what current AI systems are really doing. Both of those seem to require a theory of intelligence.

Then I started asking myself: what does it even mean to have “a theory” of something? My current intuition is that a good theory should produce real information gain, and the best kind of information gain often comes from introducing concepts that connect the thing you are studying to other established disciplines. In other words, a strong theory does not just redescribe the same phenomenon; it reveals its relationship to other reliable bodies of knowledge. In a way, this has a kind of category-theoretic flavor for me: what matters is not just the object itself, but the structure of its relationships. In the ideal case, a deep theory of intelligence would connect this still somewhat fuzzy notion of “intelligence” to the hardest and most reliable frameworks we already have: mathematics, physics, systems theory, information theory, and so on. If we can make those relationships precise, that would be a very high-information-gain theory.

One angle I’ve been thinking about is whether intelligence, or at least current machine learning systems, can be understood from the perspective of dynamical systems or systems theory. What is such a system really doing? Why does this phenomenon that we call learning or intelligence appear? To start from a simple case, take supervised learning, or more generally a neural network that receives input and produces output. The input-output pairs we train on are not just abstract symbols floating in space. They come from the physical world, or at least from some environment with its own structure and dynamics. The inputs and outputs each occupy some state, and those states are shaped by the environment around them and ultimately by the larger physical world. So instead of immediately talking about distributions, one can first think in terms of states and channels that may even vary over time.
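To make that last point concrete for myself, here is a toy sketch in Python. Everything in it is invented purely for illustration (the specific dynamics, the observation functions, the numbers mean nothing); the only point is that what looks like an i.i.d. dataset of (x, y) pairs can really be a trajectory of an environment read out through two channels.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "environment": a hidden state that evolves over time.
# The particular dynamics are arbitrary, chosen only for illustration.
def step(state):
    return 0.9 * state + 0.1 * np.sin(state) + 0.05 * rng.normal(size=state.shape)

# The "input" and "output" we observe are both readouts of that state,
# i.e. two channels looking at the same underlying dynamics.
def observe_input(state):
    return state + 0.01 * rng.normal(size=state.shape)

def observe_output(state):
    return np.tanh(state) + 0.01 * rng.normal(size=state.shape)

# Build a "dataset": what looks like supervised (x, y) pairs is actually
# a single trajectory of the environment, sampled through two channels.
state = rng.normal(size=(3,))
dataset = []
for t in range(1000):
    dataset.append((observe_input(state), observe_output(state)))
    state = step(state)

# Standard statistical learning treats `dataset` as i.i.d. draws from a
# fixed joint distribution p(x, y); the temporal and causal structure of
# the loop above is erased by that abstraction.
```

The standard picture keeps only the joint distribution of the pairs and throws away the process behind them, and that is exactly the abstraction I want to question.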
From that perspective, the observed input-output data define, or are induced by, some stochastic process, and that process is generated by the mapping or dynamics of the surrounding world. What machine learning then does is sample from that process and learn from it. In the idealized setting of statistical learning theory, we abstract away most of the outside complexity and assume a stationary distribution, and under that assumption the model learns a mapping from input to output. That is the standard story.

But intuitively, I feel that this is not the whole story. My rough intuition is that what the model learns is something like an abstract nonlinear transfer function, or maybe an effective dynamical relation. Then when you deploy the model somewhere else, for example when you put a large language model on my computer and connect it to prompts, tools, or other systems, you are not just evaluating a function anymore. In some sense, you are taking a piece of effective behavior or dynamics that was learned from one part of the world and plugging it into another causal chain in the physical world. You are inserting a learned transfer function into a new place, and once you do that, you are changing the state-transition dynamics of the larger system. (I try to sketch a toy version of this further down.)

This is why I feel that a neural network should not be treated only as an abstract algorithmic object. It should also be treated as a physical entity running in the world. It has a material substrate, it consumes energy, it processes information, and it interacts with the environment through input and output channels. If that is true, then learning and deployment must obey physical constraints, not just in the trivial sense that they cannot violate physics, but in the stronger sense that the emergence of learning or adaptation should depend on certain physical and informational conditions: memory, feedback, timescales, dissipation, noise, bandwidth, and so on.

That leads me to another idea I keep coming back to: maybe intelligent systems generally require some kind of multiscale information feedback loop. I’m thinking here of ideas like Michael Levin’s multiscale competency perspective. I keep noticing that many systems that seem intelligent or adaptive share this feature: they do not just react at one scale, but across multiple scales, with different levels of organization and different timescales interacting. So I wonder whether multiscale feedback is not just common, but actually necessary for learning or adaptation in a deeper sense.

I also have a more vague but persistent intuition about the relationship between internal dynamics and external dynamics. If a system is adaptive, perhaps the dynamics from input to output inside the system and the dynamics of the outside world cannot be arbitrarily unrelated. Maybe there has to be some kind of structural relationship between them. Earlier I had thoughts in terms of identity or inverse transfer functions, although I know that is still not the right formal language. But I strongly suspect there is some mathematical structure here worth making precise. And if that structure could be described, maybe it would tell us something indirect but powerful about which learning algorithms or architectures can really work.

This is also why I feel that describing intelligence purely at the algorithmic or functional level is not enough. I think ideas like world models are very useful, but they are still mostly descriptions at the functional level.
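Going back to the deployment point, here is the toy sketch I mentioned. Again, every concrete choice in it (the linear model, the host dynamics, the numbers) is invented for illustration. It is only meant to show that a map fitted open-loop on data from one process becomes, once its output is fed back into another environment, part of that environment’s state-transition map, with closed-loop behavior that was never part of training.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Training": fit a linear map x -> x @ W on pairs produced by some world
# process. Open loop: nothing the model outputs affects the data it sees.
X = rng.normal(size=(500, 2))
true_map = np.array([[0.5, -0.3], [0.2, 0.8]])
Y = X @ true_map + 0.01 * rng.normal(size=(500, 2))
W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # the learned "transfer function"

# "Deployment": the same W is wired into a different environment, and its
# output now feeds back into that environment's next state.
def environment_step(state, action):
    # hypothetical host dynamics, unrelated to the training process
    return 0.7 * state + 0.3 * action

state = rng.normal(size=2)
trajectory = [state]
for _ in range(50):
    action = state @ W                        # evaluating the learned map ...
    state = environment_step(state, action)   # ... now alters the world's next state
    trajectory.append(state)

# The closed loop is state -> state @ (0.7 * I + 0.3 * W): a new dynamical
# system whose stability and fixed points were never part of training.
```

Of course, this toy is itself still a purely functional description; it says nothing about substrate, energy, timescales, or the conditions under which such a loop supports learning at all, which is exactly where I feel the limitation.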
Different physical implementations could realize similar functions while depending on very different underlying dynamics, information flows, and resource constraints. So algorithmic description is probably one aspect of intelligence, but not the whole story. To really understand intelligence, and to understand the relationship between current AI systems and more general biological intelligence, I think we need to connect the algorithmic, system-level, informational, and physical perspectives.

So this is my rough line of thought. It is definitely still incomplete and in places still fuzzy, but I feel there may be something real here. I’d like you to help me think it through, criticize it, organize it, and see whether there are genuinely insightful directions hidden inside it. In particular, I’d like help identifying where this idea is strongest, where it is confused, what existing theories it connects to, and what concrete research questions could grow out of it.