Neural Networks are Wave Propagation Systems

A Physical Theory of Learning from First Principles

📖 Research Context
This work derives learning algorithms from physical first principles, showing that backpropagation—historically viewed as biologically implausible—emerges naturally when neural networks are understood as wave propagation systems. We reconcile Hebbian learning, backpropagation, and biological constraints through a unified wave framework.

Authors: Macheng Shen + Claude (Opus 4.6) | Date: March 2026

Contents

  1. Motivation: The Biological Plausibility Problem
  2. From First Principles: Information Requires Waves
  3. Marr's Three Levels: Resolving Confusion
  4. Reconciling Hinton 2020: Solving Implausibilities
  5. Why It Looks Like Hebbian Learning
  6. Testable Predictions
  7. Mathematical Formalization

Part 0: Motivation

The Central Puzzle

The brain clearly learns—from infancy to adulthood, we acquire skills, knowledge, and behaviors through experience. But how does learning happen at the neural level?

Two competing frameworks exist:

Framework 1: Hebbian Learning (1949)

"Neurons that fire together, wire together."

Framework 2: Backpropagation (1986)

"Propagate error gradients backward to adjust weights."

Hinton's 2020 Critique

Lillicrap & Hinton (Nature Reviews Neuroscience, 2020):
"Backpropagation and the Brain"

Three major biological implausibilities identified:

  1. Weight transport problem: Backward pass requires symmetric weights (\(W^T\)), but biological synapses are unidirectional
  2. Phase separation: Forward and backward passes must be temporally separated, but neurons fire continuously
  3. Non-local credit assignment: Neurons need to know downstream derivatives, but biological learning is local

Our Question

Can we find a physical framework that:

Answer: Yes — by viewing neural networks as wave propagation systems.


Part I: From First Principles

Step 1: Neural Networks Must Transmit Information

Observation: A feedforward neural network transforms input \(\mathbf{x}_0\) into output \(\mathbf{x}_L\):

$$\mathbf{x}_0 \xrightarrow{\text{Layer 1}} \mathbf{x}_1 \xrightarrow{\text{Layer 2}} \cdots \xrightarrow{\text{Layer L}} \mathbf{x}_L$$

Question: What is the physical nature of this transformation?

Not a lookup table:

Therefore: Neural networks must perform dynamic transformation — each layer actively processes information, not just retrieves pre-stored answers.

Step 2: Information Transmission Requires Physical Carriers

Physical constraint: In the physical world, information cannot "teleport" from point A to point B instantaneously.

💡 Fundamental principle: Information must be carried by a physical medium.

Candidates in neural systems:

Common property: All are propagating disturbances in a medium.

Step 3: Physical Carriers Obey Wave Equations

Universal principle: Any disturbance propagating through a medium satisfies a wave equation.

General wave equation: $$\frac{\partial^2 \psi}{\partial t^2} = v^2 \nabla^2 \psi + F(\psi)$$

Where:

Examples across scales:

System Medium Wave Type Velocity
Sound Air Pressure wave ~343 m/s
Light Electromagnetic field EM wave ~3×10⁸ m/s
Action potential Axon membrane Voltage wave ~100 m/s
Dendritic signal Dendrite cable Current wave ~10 m/s

Step 4: Static Mapping vs Dynamic Wave

Conceptual difference:

Static Mapping

$$y = f(x)$$

Dynamic Wave

$$\psi(x, t)$$

Neural networks operate on millisecond timescales:

Conclusion: Neural computation is not instantaneous — it must involve wave propagation.

Step 5: Neural Networks as Pattern-Forming Media

Analogy: Generative models

Standard electricity (random noise)Neural networkStructured output (patterns)

This is analogous to:

Physical interpretation: Neural networks are pattern-forming media where waves naturally organize into structures through:

Conclusion of Part I

Fundamental insight:

Neural networks are wave propagation systems, not static function approximators. Information flows as physical waves through tunable media (synapses), and learning emerges from wave interference and impedance matching.


Part II: Marr's Three Levels

The Framework

David Marr (1982) proposed that any information-processing system can be understood at three levels:

Level Question Example (Vision)
1. Computational What is being computed? (Goal/objective) Extract depth from stereo images
2. Algorithmic How is it computed? (Procedure/steps) Match corresponding features, compute disparity
3. Implementation What physical substrate? (Hardware) V1 neurons, synaptic connections

Applying to Neural Network Learning

Level Traditional View Wave Theory View
Computational Minimize loss \(L(y, y^*)\) ✓ Same
Algorithmic Backpropagation (non-local gradients) Wave interference (local)
Implementation ❌ Unclear (biological problem!) ✓ Wave reflection + interference

Key Insight: Level Confusion

💡 Source of confusion:

Previous debates conflated levels:

Resolution:

Analogy:

Matrix multiplication can be algorithmically described as nested loops (non-parallel), but implemented with parallel hardware (GPU). The algorithm and implementation are different, but compute the same result.

Part III: Reconciling Hinton 2020

Now we can directly address Lillicrap & Hinton's three critiques, showing how wave theory resolves each biological implausibility.

Problem 1: Weight Transport

Hinton's critique:
"Backpropagation requires symmetric backward weights (\(W^T\)), but biological synapses are unidirectional."

Traditional backprop:

$$\delta_{l-1} = W_l^T \cdot \delta_l \odot \sigma'(z_{l-1})$$

Requires knowing \(W_l^T\) (transpose of forward weights)

Wave theory solution:

Reflected waves carry error automatically

Physical mechanism: When a wave encounters an impedance mismatch (at layer boundaries), it automatically reflects.

Reflection coefficient: $$R = \frac{Z_2 - Z_1}{Z_2 + Z_1}$$

Where \(Z_i\) = impedance of layer \(i\)

Key insight: Impedance is a local property:

$$Z_l \propto |\mathbf{x}_l - \mathbf{x}_l^*|^2$$

Biological analogue: Backpropagating action potentials (BAPs)

Problem 2: Phase Separation

Hinton's critique:
"Forward and backward passes must be temporally separated, but biological neurons fire continuously."

Traditional backprop:

  1. Forward pass: Compute activations layer-by-layer
  2. Wait until output is reached
  3. Backward pass: Propagate gradients back

Wave theory solution:

Forward and backward waves coexist

Physical principle: In wave systems, incident and reflected waves overlap.

Example: Sound echoes

Total wave at any point: $$\psi_{\text{total}}(x,t) = \psi_{\text{forward}}(x,t) + \psi_{\text{reflected}}(x,t)$$

Biological implication:

Problem 3: Non-Local Credit Assignment

Hinton's critique:
"Neurons need to know downstream derivatives, but biological learning is local."

Traditional backprop:

$$\frac{\partial L}{\partial w_{ij}} = \delta_j \cdot x_i$$ Where \(\delta_j\) depends on all downstream weights

Wave theory solution:

Credit assignment via local interference

Observable: Hebbian rule

$$\Delta w_{ij} \propto x_i \cdot x_j$$

Appears purely local: just correlation between pre- and post-synaptic activity.

Hidden mechanism:

Post-synaptic activity contains both components: $$x_j = x_j^{\text{forward}} + x_j^{\text{reflected}}$$

Therefore:

$$\begin{align} \Delta w_{ij} &\propto x_i \cdot (x_j^{\text{forward}} + x_j^{\text{reflected}}) \\ &= \underbrace{x_i \cdot x_j^{\text{forward}}}_{\text{Hebbian term}} + \underbrace{x_i \cdot x_j^{\text{reflected}}}_{\text{Backprop term!}} \end{align}$$

Key insight: What looks like "Hebbian" locally actually contains gradient information hidden in the reflected wave component!

Summary: All Three Problems Resolved

Problem Traditional Issue Wave Solution
Weight transport Need \(W^T\) Automatic reflection (no explicit weights)
Phase separation Temporal separation Waves coexist (no separation needed)
Non-locality Need downstream info Local interference encodes gradients

Part IV: Why It Looks Like Hebbian Learning

The Hypothesis

Central claim:

The brain is doing backpropagation, but it looks like Hebbian learning at the synaptic level because synaptic plasticity measures wave interference, not separate components.

Why Previous Experiments Seemed to Support Hebbian

Experimental observation (consistent across decades):

Synapses strengthen when pre- and post-synaptic neurons are simultaneously active.

Why this was interpreted as "purely Hebbian":

What was missing:

Detailed Mechanism

Step 1: Forward wave propagates

Input → Layer 1 → Layer 2 → ... → Output

Creates forward activity: \(x_j^{\text{forward}}\)

Step 2: Error signal reflects back

Output error → Reflect → Layer L-1 → ... → Layer 1

Creates reflected activity: \(x_j^{\text{reflected}}\) (proportional to gradient)

Step 3: Waves interfere at synapse

$$x_j^{\text{total}} = x_j^{\text{forward}} + x_j^{\text{reflected}}$$

Synaptic plasticity: Responds to total activity

$$\begin{align} \Delta w_{ij} &\propto x_i \cdot x_j^{\text{total}} \\ &= x_i \cdot x_j^{\text{forward}} + x_i \cdot x_j^{\text{reflected}} \end{align}$$

Two Components of "Hebbian" Plasticity

Component Origin Function Learning Type
\(x_i \cdot x_j^{\text{forward}}\) Correlation Discover input patterns Unsupervised
\(x_i \cdot x_j^{\text{reflected}}\) Error signal Task-specific optimization Supervised

Implication: Purely "Hebbian" experiments (no task, no error) only capture the first term. The second term (backprop) requires a loss function and output target.

Connection to Spike-Timing-Dependent Plasticity (STDP)

STDP observation (Markram et al. 1997):

Wave interpretation:

Timing = Phase relationship

STDP is measuring wave phase coherence, not just timing!


Part V: Testable Predictions

If this theory is correct, we should observe:

Prediction 1: Bidirectional Information Flow

Prediction

Both forward and backward waves should be detectable in neural tissue during learning.

How to test

Expected result

Existing evidence

Partially confirmed: Backpropagating action potentials (BAPs) observed in pyramidal neurons (Stuart & Sakmann 1994)

Prediction 2: Phase-Dependent Plasticity

Prediction

Synaptic plasticity should depend on phase relationship between forward and reflected waves, not just correlation.

How to test

Expected result

$$\Delta w \propto \cos(\phi)$$ Where \(\phi\) = phase difference between forward and reflected waves

Existing evidence

Consistent with STDP: Timing-dependent plasticity shows similar phase-dependence

Prediction 3: Impedance Matching in Trained Networks

Prediction

Well-trained neural networks should exhibit low impedance across layers (minimal reflection), while untrained networks have high impedance.

How to test

Expected result

Network Frequency Response Impedance
Trained Flat (all frequencies pass) Low
Untrained Frequency-dependent (filtering) High

Existing evidence

Not yet tested directly (novel prediction)

Prediction 4: Wave Interference Explains Feedback Alignment

Background

Lillicrap et al. (2016) showed that neural networks can learn with random feedback weights (not \(W^T\)), contradicting standard backprop.

Prediction

Feedback Alignment works because wave interference doesn't require exact weight symmetry — only that reflected waves carry some error information.

How to test

Existing evidence

Consistent: Feedback Alignment success supports wave interference mechanism


Part VI: Mathematical Formalization

Now that we've established physical motivation, we can formalize the mathematics.

Neural Network as Wave System

Standard formulation:

$$\mathbf{x}_l = \sigma(\mathbf{W}_l \mathbf{x}_{l-1})$$

Wave formulation:

$$\frac{\partial^2 \mathbf{x}_l}{\partial t^2} + \gamma \frac{\partial \mathbf{x}_l}{\partial t} = \mathbf{W}_l \frac{\partial \mathbf{x}_{l-1}}{\partial t} - \nabla_{\mathbf{x}_l} V(\mathbf{x}_l)$$

Where:

Impedance Definition

Layer impedance: $$Z_l = \|\mathbf{x}_l - \mathbf{x}_l^*\|^2$$

Where \(\mathbf{x}_l^*\) = ideal (target) activation

Physical meaning: Resistance to information flow

Backpropagation as Wave Reflection

Forward wave (incident): $$\mathbf{x}_l^{\text{forward}} = \sigma(\mathbf{W}_l \mathbf{x}_{l-1})$$ Boundary condition at output: $$\mathbf{x}_L \neq \mathbf{y}^* \quad \Rightarrow \quad \text{Impedance mismatch}$$ Reflected wave (error signal): $$\boldsymbol{\delta}_L = \frac{\partial L}{\partial \mathbf{x}_L} = 2(\mathbf{x}_L - \mathbf{y}^*)$$ Backward propagation: $$\boldsymbol{\delta}_{l-1} = \mathbf{W}_l^T \boldsymbol{\delta}_l \odot \sigma'(\mathbf{z}_{l-1})$$

Key insight: This is identical to standard backprop, but derived from wave reflection!

Gradient Descent = Impedance Minimization

Training objective: $$\min_{\mathbf{W}} \sum_l Z_l = \min_{\mathbf{W}} \sum_l \|\mathbf{x}_l - \mathbf{x}_l^*\|^2$$ Gradient descent: $$\mathbf{W}_{l} \leftarrow \mathbf{W}_{l} - \eta \frac{\partial Z_{\text{total}}}{\partial \mathbf{W}_{l}}$$

Physical interpretation: Training adjusts medium properties (weights) to minimize wave reflection (error).

Connection to Hebbian Learning

Observed synaptic plasticity: $$\Delta w_{ij} = \eta \cdot x_i \cdot x_j$$ Wave decomposition: $$x_j = x_j^{\text{forward}} + x_j^{\text{reflected}}$$ Therefore: $$\begin{align} \Delta w_{ij} &= \eta \cdot x_i \cdot (x_j^{\text{forward}} + x_j^{\text{reflected}}) \\ &= \underbrace{\eta \cdot x_i \cdot x_j^{\text{forward}}}_{\text{Hebbian (correlation)}} + \underbrace{\eta \cdot x_i \cdot x_j^{\text{reflected}}}_{\text{Backprop (gradient)}} \end{align}$$

Why This Unifies Everything

Framework Wave Interpretation
Hebbian learning Correlation term in wave interference
Backpropagation Reflected wave carries gradient
Oja's rule Wave saturation (finite medium capacity)
STDP Phase-dependent interference
Feedback Alignment Approximate wave reflection (random paths still convey error)

Conclusion

We have shown that viewing neural networks as wave propagation systems resolves the longstanding conflict between Hebbian learning and backpropagation:

Implications:

References

Primary Sources

About This Work

This research is part of a broader wave dynamics framework. All work is open-access and available at:

🔗 https://machengshen.github.io/research/

Contact: macshen93@gmail.com | Collaboration: Macheng Shen + Claude (Anthropic Opus 4.6)


© 2026 Macheng Shen. Research conducted with Claude (Opus 4.6).
Research Home | GitHub | Twitter