Nima Dehghani
Nov 5, 2024

Systematizing Cellular Complexity: Toward a Hilbertian Program for Biology

biological-computation · biophysics · systems-biology · complexity

Companion post to:
Systematizing Cellular Complexity: Toward a Hilbertian Program for Biology
Nima Dehghani. PLOS Complex Systems 1(3): e0000013. DOI: https://doi.org/10.1371/journal.pcsy.0000013



Modern biology has become extraordinarily good at identifying parts. We can measure genes, proteins, metabolites, ion channels, signaling molecules, cell states, and intracellular structures with a resolution that would have been difficult to imagine only a generation ago. Yet this success has created a second problem: biological knowledge is increasingly rich in components but often poor in system-level integration.

A signaling diagram, no matter how detailed, is still a static abstraction. A pathway map can show interactions, but it rarely captures the full causal, temporal, stochastic, and multiscale structure through which a living cell maintains itself, adapts, computes, and survives. The central question is therefore not simply how many molecular interactions we can catalog. It is how those interactions become a functional, robust, adaptive system.

In my recent paper, "Systematizing cellular complexity: A Hilbertian approach to biological problems", I argue that biology needs a more problem-oriented theoretical program. The inspiration is David Hilbert’s famous list of mathematical problems: not because biology can be axiomatized in the same way as mathematics, but because carefully formulated problems can organize a field, expose missing concepts, and guide the development of appropriate methods.

The proposal is to begin not with a favored method, model class, or data modality, but with the problems that cells must solve. Once those problems are made explicit, we can ask what combination of biophysics, dynamical systems theory, causal inference, control theory, information theory, stochastic processes, multiscale modeling, and machine learning is actually appropriate.

In the paper, I develop three exemplar problems.


Problem 1: Where are the control switches?

Cells constantly face environmental and internal perturbations. A bacterium, a neuron, a cardiomyocyte, or a developing cell cannot simply sample all possible configurations of its molecular machinery by brute force. The space of possible regulatory states is too large, the measurements are too noisy, and many incorrect configurations would be damaging or lethal.

So the first problem is:

\[\textit{How does a cell identify and manipulate the relevant control variables at the right time?}\]

This problem is more subtle than finding a molecular “switch” in isolation. The relevant switch may be a gene, a protein state, a channel conformation, a regulatory motif, a feedback loop, or a higher-order network structure. In many cases, control is not purely hierarchical. It is distributed, heterarchical, and embedded in layers of feedback and modulation.

This is why static pathway diagrams are insufficient. They may tell us that component A interacts with component B, but they do not tell us whether perturbing A changes B under a specific context, time scale, cell state, or environmental condition. They do not by themselves distinguish correlation, causation, modulation, compensation, redundancy, or control.
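A toy simulation can make the gap between association and control concrete. The sketch below is not taken from the paper. It assumes a hypothetical three-variable linear model in which a hidden factor C drives both A and B, while A has no effect on B. The function names, the `do_a` argument, and all coefficients are illustrative assumptions.

```python
import random

def simulate(n=20000, do_a=None, seed=0):
    """Toy structural model: hidden C drives both A and B; A does not affect B.

    If do_a is given, A is clamped to that value (an intervention), which
    cuts its dependence on C. Returns lists of A and B samples.
    """
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)                      # hidden common cause
        a = (c + rng.gauss(0.0, 0.5)) if do_a is None else do_a
        b = 2.0 * c + rng.gauss(0.0, 0.5)            # B depends on C only
        a_vals.append(a)
        b_vals.append(b)
    return a_vals, b_vals

def mean(xs):
    return sum(xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Observational data: A and B move together because both follow C.
a_obs, b_obs = simulate()
observed_corr = correlation(a_obs, b_obs)

# Interventional data: clamp A low, then high, and compare the mean of B.
_, b_low = simulate(do_a=-1.0, seed=1)
_, b_high = simulate(do_a=1.0, seed=2)
interventional_effect = mean(b_high) - mean(b_low)

print(f"observed correlation(A, B): {observed_corr:.2f}")
print(f"change in mean(B) when A is clamped from -1 to +1: {interventional_effect:.3f}")
```

Under these assumptions, A and B are strongly correlated in observational data (about 0.87 in expectation), yet clamping A leaves the mean of B essentially unchanged. An edge inferred from the correlation alone would mark A as a control point when it has no causal leverage over B.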

A Hilbertian framing of this problem pushes us toward multiscale causal inference. The goal is not merely to infer edges in a gene-regulatory or protein-interaction network. The goal is to infer which variables act as effective control points under specific constraints.

This requires combining several ideas:

  • causal models and counterfactual reasoning;
  • time-series and perturbational data;
  • network motifs and higher-order interactions;
  • multiscale mappings between molecular and cellular observables;
  • formal tools capable of handling concurrency, delay, and partial ordering.

Biological regulation is not a simple circuit board. But the analogy to in-circuit testing is useful: one does not understand a system merely by listing its connections. One must probe how components function within the active system.

For systems biology and computational neuroscience, this reframes the task. We should not only ask which variables are statistically associated. We should ask which variables have causal leverage over the state trajectory of the biological system.


Problem 2: How does a cell reconfigure itself?

The second problem concerns homeostasis, adaptation, and reconfiguration.

Cells do not merely preserve a fixed internal state. Under sufficiently strong or sustained perturbations, they may need to move into a different functional regime. In this sense, biological regulation is not only about returning to a set point. It is also about changing the relevant operating range, modifying internal constraints, and sometimes entering a new basin of attraction.

The problem can be stated as:

\[\textit{How does a cell maintain stability while changing the space of possible responses?}\]

Classical homeostasis is often described through negative feedback. But real biological systems must deal with delays, nonlinearities, intrinsic noise, extrinsic perturbations, and changing environmental statistics. This brings the discussion naturally into contact with control theory.

In the paper, I discuss several forms of regulation, including negative feedback, integral control, PID-like biological control, antithetic integral feedback, allostasis, heterostasis, and adaptive homeostasis. These concepts are not merely metaphors. They provide formal ways of thinking about how a biological system can reduce error, adapt to persistent deviations, and maintain function under noisy conditions.
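The difference between plain negative feedback and integral control can be seen in a minimal simulation. The sketch below is an illustration rather than a model from the paper. It assumes a hypothetical one-variable system, dx/dt = -x + u + d, with a constant disturbance d and a set point of 1. The gains `kp` and `ki`, the disturbance value, and the function name are all assumptions chosen for readability.

```python
def simulate(controller, disturbance=-0.5, setpoint=1.0, dt=0.01, t_end=40.0):
    """Euler integration of dx/dt = -x + u + disturbance.

    controller: "proportional" uses u = kp * error;
                "integral" uses u = ki * z, where dz/dt = error.
    Returns the final value of x.
    """
    kp, ki = 4.0, 1.0            # illustrative gains (assumed)
    x, z = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        error = setpoint - x
        if controller == "proportional":
            u = kp * error
        else:
            z += error * dt      # accumulate the error over time
            u = ki * z
        x += (-x + u + disturbance) * dt
    return x

x_p = simulate("proportional")
x_i = simulate("integral")

print(f"proportional feedback settles at x = {x_p:.3f}")
print(f"integral feedback settles at     x = {x_i:.3f}")
```

With these assumed parameters, proportional feedback settles at x = 0.7, because a nonzero error is needed to sustain a nonzero control signal. Integral feedback settles at the set point x = 1.0: the accumulated error keeps changing until the error itself is zero, regardless of the size of the constant disturbance. This is the sense in which integral action lets a system adapt to persistent deviations.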

Ion channels provide a concrete biophysical example. Their expression, localization, gating, and modulation define the excitability of cells such as neurons and cardiomyocytes. Yet ion-channel regulation is not reducible to the behavior of a single channel type. A cell may maintain macroscopic electrophysiological function despite variability in individual conductances. In neurons, for example, different combinations of ion-channel expression can preserve excitability, while spatial differences between soma and dendrites create additional layers of functional regulation.

This creates a natural connection to computational neuroscience. Neural function depends on a regulated, high-dimensional landscape of conductances, membrane properties, synaptic inputs, and intracellular signaling pathways. The stability of this landscape is not passive. It is actively maintained through homeostatic and adaptive mechanisms.

The bacterial potassium channel KcsA is one illustrative example. Its gating depends on pH, conformational state, membrane conditions, and prior channel history. This makes it a small but concrete case of a broader principle: biological function is often realized through nested, history-dependent, multiscale control.

For biophysics, this suggests that molecular mechanism and system-level regulation should not be separated. For biological computation, it suggests that cellular “computation” is not simply input-output transformation. It is the regulated evolution of a physical system through a constrained state space.


Problem 3: How does a cell harness noise?

Noise is often treated as something to be removed. In biology, that view is incomplete.

Cells are noisy at almost every relevant scale: molecular binding, receptor activation, ion-channel opening, gene expression, intracellular transport, membrane excitability, and cell-cell communication. Yet living systems do not merely tolerate this noise. In some regimes, they exploit it.

The third problem is therefore:

\[\textit{How do biological systems use stochasticity as part of their functional organization?}\]

One of the central examples in the paper is stochastic resonance. In nonlinear systems, noise can improve the detection of weak signals by helping subthreshold inputs cross an effective threshold. This phenomenon has been studied in sensory systems, neurons, mechanoreceptors, and ion channels.
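A threshold detector driven by a subthreshold sine wave gives a minimal picture of this effect. The sketch below is not the paper's model. It assumes a hypothetical detector that outputs 1 whenever signal plus Gaussian noise exceeds a fixed threshold, and it uses the correlation between output and signal as a simple stand-in for detection quality. The amplitude, threshold, noise levels, and function names are illustrative assumptions.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0               # a constant output tracks nothing
    return cov / math.sqrt(vx * vy)

def threshold_response_correlation(noise_sd, n=20000, amplitude=0.8,
                                   threshold=1.0, period=100, seed=0):
    """Correlate a subthreshold sine wave with a noisy threshold detector's output."""
    rng = random.Random(seed)
    signal, output = [], []
    for t in range(n):
        s = amplitude * math.sin(2.0 * math.pi * t / period)
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        signal.append(s)
        output.append(1.0 if s + noise > threshold else 0.0)
    return correlation(signal, output)

# No noise, moderate noise, and noise that swamps the signal.
results = {sd: threshold_response_correlation(sd) for sd in (0.0, 0.4, 5.0)}
for sd, corr in results.items():
    print(f"noise sd = {sd:>3}: correlation(signal, output) = {corr:.3f}")
```

Under these assumptions the signal never crosses the threshold on its own, so the noiseless output is constant and carries no information. Moderate noise lets crossings occur preferentially near the signal peaks, which makes the output track the signal. Very large noise produces crossings almost independently of the signal, and the correlation falls again.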

At the level of ion channels, gating can be treated as a stochastic process shaped by thermal fluctuations and voltage-dependent energy barriers. In small channel clusters, noise affects the probability of individual channel transitions. In larger assemblies, collective effects can generate system-size resonance and temporal coherence. Thus, the functional role of noise depends on scale.

This has an important computational interpretation. In thresholded nonlinear systems, stochastic resonance can resemble dithering in signal processing. In digital audio or image processing, adding noise before quantization can reduce structured quantization error. Similarly, biological noise can effectively increase the number of usable response levels in a thresholded sensing system.
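The dithering analogy can be checked with a one-bit quantizer. The sketch below is a textbook-style illustration, not code from the paper. It assumes a hypothetical sensor that reports 1 when its input reaches 0.5 and 0 otherwise, and it compares the averaged readout with and without uniform dither noise in [-0.5, 0.5]. The input values and function names are assumptions.

```python
import random

def quantize(x):
    """One-bit quantizer: reports 1 if the input reaches 0.5, otherwise 0."""
    return 1 if x >= 0.5 else 0

def mean_readout(x, dither, n=20000, seed=0):
    """Average of n quantized readings of a constant input x."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        d = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(x + d)
    return total / n

inputs = [0.1, 0.3, 0.45]
plain = [mean_readout(x, dither=False) for x in inputs]
dithered = [mean_readout(x, dither=True) for x in inputs]

for x, p, d in zip(inputs, plain, dithered):
    print(f"input {x:.2f}: plain readout {p:.3f}, dithered readout {d:.3f}")
```

Without dither, all three inputs fall below the threshold and produce the same readout of 0, so the sensor cannot distinguish them. With dither, the probability of crossing the threshold equals the input value, so the averaged readout approximates the input. A two-level device then behaves, on average, like a graded one, which is the sense in which noise increases the number of usable response levels.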

This gives us a useful way to think about ligand receptors, ion channels, and excitable membranes: they are not perfect analog sensors, nor are they simple digital switches. They are stochastic, biophysical sampling devices. Their noise properties, densities, thresholds, and collective organization shape the effective quantization of environmental and physiological signals.

For biological computation, this is a key point. Computation in cells is not noise-free symbol manipulation. It is physical, stochastic, and embodied. Noise is not always a failure mode. Under the right constraints, it becomes part of the computational substrate.


Why this matters for biological computation

The broader motivation of the paper is to move from descriptive complexity to problem-centered theory.

Biology is full of complicated systems, but complication is not the same as complexity. A static pathway diagram with hundreds of nodes may be complicated without capturing the essential causal and dynamical structure of the living system. Conversely, a well-posed theoretical problem can reveal the organizing principles behind a seemingly overwhelming set of observations.

A Hilbertian approach to biology would ask questions such as:

  • What are the control variables of a cell?
  • Which scales carry causal leverage?
  • When can macroscopic variables outperform microscopic descriptions?
  • How do biological systems regulate their own state spaces?
  • How is noise transformed into function?
  • What kinds of models are falsifiable rather than merely descriptive?
  • How should machine learning be constrained by biophysics?
  • What does it mean for a cell to compute?

These questions matter not only for cell biology, but also for computational neuroscience, systems biology, synthetic biology, and bio-inspired AI. They suggest that biological computation should be studied as the evolution of organized physical systems under constraints, rather than as an abstract information-processing metaphor detached from material implementation.


Machine learning is useful, but not sufficient

The paper also discusses the role of machine learning in biology. Modern machine learning is powerful, especially for extracting structure from high-dimensional data. But a black-box model that predicts a biological dataset is not automatically a theory of the biological system.

For biology, prediction is not enough. A useful model should also expose mechanisms, generate interventions, respect biophysical constraints, and make falsifiable claims.

This is where physics-informed machine learning, universal differential equations, symbolic regression, causal representation learning, and heterogeneous multiscale modeling become important. These approaches can help connect data-driven inference with mechanistic structure. But the method should follow the biological problem, not the other way around.

A problem-oriented framework helps prevent a common failure mode: choosing a fashionable method first and then forcing the biological system into that formalism. Instead, we should ask what the cell is trying to regulate, what constraints it faces, what observables are available, and what interventions would distinguish competing models.


Toward a problem-centered theory of living systems

The central claim of the paper is not that biology should imitate mathematics, nor that living systems can be reduced to a small set of elegant equations. Rather, the claim is that biological theory needs better problem formulation.

Hilbert’s list helped organize mathematics by identifying problems whose solutions would reshape the field. Biology may benefit from an analogous discipline of problem-setting: carefully identifying the fundamental challenges that living systems must solve and then building mathematical, computational, and experimental frameworks around those challenges.

The three problems discussed in the paper — control switches, adaptive reconfiguration, and noise harnessing — are only examples. The broader program is open-ended. Other problems may concern growth, morphology, memory, regeneration, collective decision-making, development, multiscale causality, or the emergence of biological individuality.

For those working in biological computation, biophysics, systems biology, and computational neuroscience, the message is simple: we should not be satisfied with larger maps alone. We need theories that explain how biological systems control themselves, reconfigure themselves, and compute with noisy physical substrates.

That requires a shift from cataloging components to formalizing problems.

And that, in essence, is the Hilbertian approach proposed in the paper.


The room this opens