Researchers trained lab-grown mini-brains to solve an engineering problem.

Gareth H. Whitfield • April 14, 2026 03:08

If these mini-brains can clear this test, can we already call it intelligence? No - but considering that some humans would not manage it either, it is still an impressive achievement.

The cartpole (also known as the inverted pendulum) task is a long-standing staple of mathematics, robotics and machine learning. It is a foundational engineering challenge: keeping an upside-down pendulum upright on a moving cart by continuously correcting its motion - an inherently unstable system, because even a tiny deviation can cause it to topple.

Researchers at the University of California, Santa Cruz set out to see whether miniature brain organoids derived from mouse stem cells could be trained to solve this problem. These are three-dimensional, multicellular structures grown in vitro that reproduce, in simplified form, the anatomy and a handful of basic functions of a brain. On 24 February, they published their findings in Cell Reports: the experiment broadly worked.

Brain organoids put through the cartpole (inverted pendulum) challenge

Although the cartpole is a dynamic-control benchmark used to train algorithms that end up inside cutting-edge technologies, most of us have effectively tried something similar without realising it. When you balance a ruler or a pen on your fingertip, you are dealing with almost the same principle: you fight the object’s oscillations, and gravity will naturally pull it down unless you counteract its movement.

In the lab, the team recreated the situation in a virtual simulation. The cart (the equivalent of your hand in real life) could move only left or right along a horizontal line. For the organoids, the pendulum’s tilt was converted into patterns of electrical stimulation delivered to the neuronal tissue: the further the bar leaned to one side, the more the signal changed. In return, the organoid’s own electrical activity was decoded as a control command - moving the cart left or right in an attempt to restore balance.
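To make the two-way translation concrete, here is a minimal Python sketch of one way such an encoding and decoding step could look. It is an illustration only, not the team's implementation: the tilt range, the frequency range, the two electrode sites and the spike-count comparison are all assumptions introduced for the example.

```python
import math

# All constants below are illustrative assumptions, not values from the study.
MAX_TILT_RAD = math.radians(12)   # tilt at which the encoded signal saturates
MIN_HZ, MAX_HZ = 1.0, 20.0        # hypothetical stimulation frequency range


def encode_tilt(theta: float) -> tuple[str, float]:
    """Map the pole's tilt (radians) to a hypothetical (site, frequency) pair.

    The side of the lean picks the stimulation site; the size of the lean
    sets the frequency, so a larger tilt produces a stronger signal.
    """
    clipped = max(-MAX_TILT_RAD, min(MAX_TILT_RAD, theta))
    site = "left_site" if clipped < 0 else "right_site"
    fraction = abs(clipped) / MAX_TILT_RAD
    return site, MIN_HZ + fraction * (MAX_HZ - MIN_HZ)


def decode_action(left_spikes: int, right_spikes: int) -> int:
    """Turn spike counts from two hypothetical read-out groups into a command.

    Returns -1 (move cart left) or +1 (move cart right); ties go right.
    """
    return -1 if left_spikes > right_spikes else 1
```

In this sketch an upright pole yields the weakest signal, and whichever read-out group fires more during the decoding window decides the direction of the next push.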

One point needs to be made crystal clear: these mini-brains were not conscious, and they did not “understand” the task in any meaningful sense. The scientific question was narrower: can a laboratory-grown neuronal network learn to make fewer mistakes when it is stimulated in the right way?

Each attempt ended once the pendulum exceeded a preset angle. If it drifted too far, the researchers stopped the run and started again. Performance was tracked across many consecutive trials to see whether the organoids improved - but improvement requires something crucial: the system must be able to use its errors to adjust its future behaviour.
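For readers who want to see what one such attempt looks like in simulation, the sketch below runs a single cartpole trial and stops it when the pole passes a cut-off angle. It uses the textbook cartpole equations with conventional benchmark parameters (masses, pole length, push force, 12-degree limit); these values are assumptions chosen for illustration and are not taken from the study.

```python
import math
import random

# Conventional benchmark values, assumed here for illustration only.
GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
HALF_LENGTH, FORCE, DT = 0.5, 10.0, 0.02
ANGLE_LIMIT = math.radians(12)  # assumed cut-off angle that ends a trial


def step(state, action):
    """Advance the cart and pole by one Euler step; action is -1 or +1."""
    x, x_dot, theta, theta_dot = state
    total = CART_MASS + POLE_MASS
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (FORCE * action + POLE_MASS * HALF_LENGTH * theta_dot**2 * sin_t) / total
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t**2 / total)
    )
    x_acc = temp - POLE_MASS * HALF_LENGTH * theta_acc * cos_t / total
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)


def run_trial(policy, rng, max_steps=500):
    """Return how long (in seconds) the pole stays within the angle limit."""
    state = (0.0, 0.0, rng.uniform(-0.05, 0.05), 0.0)  # small random initial tilt
    for t in range(max_steps):
        if abs(state[2]) > ANGLE_LIMIT:
            return t * DT  # trial ends as soon as the pole leans too far
        state = step(state, policy(state))
    return max_steps * DT


if __name__ == "__main__":
    rng = random.Random(0)
    coin_flip = lambda state: rng.choice((-1, 1))  # no learning, no logic
    print(f"balanced for {run_trial(coin_flip, rng):.2f} s")
```

Any function that maps the current state to -1 or +1 can be passed as `policy`, which is where a decoded organoid command would plug in.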

When you balance a pen and it begins to fall, your brain instantly receives sensory input and you move your hand accordingly. This constant back-and-forth between action and correction is known as a feedback loop. To push the test to its limit, the researchers recreated that loop artificially.

Three stimulation protocols and an artificial feedback loop

The team compared three different protocols:

No feedback: the organoid received signals encoding the pendulum’s tilt, but nothing else influenced its activity based on how well it was doing.
Random stimulation: additional stimulation was delivered to some neurons, but it was not tied to success or failure.
Adaptive feedback: the most involved setup. The researchers checked whether, over several successive attempts, an organoid was keeping the pendulum balanced for less time than the average of its previous runs; if so, they delivered a brief electrical burst to a defined group of neurons.

They then watched what happened on subsequent trials: if the pendulum remained upright for longer, the likelihood of stimulating that same neuronal group increased; if performance worsened, they changed which neurons were targeted. Only when this performance-linked adaptive feedback was introduced did the organoids show clear improvement, keeping the pendulum upright on the cart for longer.
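The selection rule described above can be sketched in a few lines of Python. This is a simplified reading, not the published protocol: the group names, the five-trial window, the weight multiplier and the choice to lower a group's weight (rather than switch targets outright) when performance worsens are all assumptions made for the example.

```python
import random
from statistics import mean


class AdaptiveFeedback:
    """Hypothetical performance-linked chooser of which neuron group to stimulate."""

    def __init__(self, groups, rng, boost=1.5, window=5):
        self.weights = {g: 1.0 for g in groups}  # relative selection likelihoods
        self.rng = rng
        self.boost = boost      # assumed multiplier for reward or penalty
        self.window = window    # assumed number of recent trials to average
        self.history = []       # balance time of every trial so far
        self.pending = None     # (group, recent average when it was stimulated)

    def record(self, balance_time):
        """Log one trial; return the group to stimulate next, or None."""
        self.history.append(balance_time)
        if len(self.history) <= self.window:
            return None  # not enough earlier trials to compare against
        recent = mean(self.history[-self.window:])
        earlier = mean(self.history[:-self.window])

        # Judge the last stimulated group by what happened afterwards.
        if self.pending is not None:
            group, before = self.pending
            if recent > before:
                self.weights[group] *= self.boost   # more likely next time
            else:
                self.weights[group] /= self.boost   # other groups now favoured
            self.pending = None

        # Stimulate only when recent performance is below the earlier average.
        if recent < earlier:
            groups = list(self.weights)
            chosen = self.rng.choices(groups, weights=[self.weights[g] for g in groups])[0]
            self.pending = (chosen, recent)
            return chosen
        return None
```

Calling `record()` after every trial returns either `None` or the name of the group to pulse, so the loop rewards targets that were followed by longer balance times and drifts away from those that were not.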

Stronger performance - but no long-term memory

From the start of the task, the researchers grouped attempts into evaluation cycles of five trials each. To decide whether the organoids were genuinely improving, the team needed a baseline comparator: chance. The reason is simple - even a system that learns nothing can sometimes, by random fluctuation, keep the pendulum up slightly longer on a given run. The team therefore had to estimate what purely random behaviour would produce on average.

To do this, they simulated a controller that moved the cart left or right without any adaptive logic. They measured how long this random approach typically kept the pendulum balanced, and used that duration as a reference point.

Each five-trial cycle was compared with that benchmark. If the mean balance time across the five runs exceeded what chance could achieve, the cycle was counted as better than random behaviour.
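As a rough illustration of this scoring step, the Python sketch below splits a list of balance times into five-trial cycles and reports the share whose mean beats a chance benchmark. The function name, the example numbers and the treatment of the benchmark as a single duration are assumptions; the article does not give the team's exact procedure.

```python
from statistics import mean


def fraction_above_chance(balance_times, chance_benchmark, cycle_len=5):
    """Share of complete cycles whose mean balance time beats the benchmark.

    balance_times: per-trial durations in seconds, in the order they were run.
    chance_benchmark: typical duration achieved by a random controller
    (assumed here to be summarised as one number).
    """
    last_start = len(balance_times) - cycle_len
    cycles = [balance_times[i:i + cycle_len]
              for i in range(0, last_start + 1, cycle_len)]  # drop incomplete tail
    if not cycles:
        return 0.0
    wins = sum(mean(cycle) > chance_benchmark for cycle in cycles)
    return wins / len(cycles)


if __name__ == "__main__":
    # Made-up durations for three cycles, compared with a 2-second benchmark.
    times = [1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 2, 2, 2, 2, 6]
    print(fraction_above_chance(times, chance_benchmark=2.0))
```

Averaging over five trials before comparing is what keeps a single lucky run from being counted as learning.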

With no feedback, only 2.3% of cycles cleared the threshold.
With random stimulation, performance improved to 4.4% - nearly double.
With the adaptive feedback loop, the figure jumped to 46%.

In every condition the pendulum eventually toppled, but under adaptive feedback, in almost one cycle out of two, the organoids kept it upright for longer than the chance benchmark.

However, the gains did not last - or lasted only marginally. After 45 minutes of rest, performance returned to the starting level. Without ongoing electrical stimulation, the researchers did not observe consolidation of learning. In other words, this was fragile plasticity: a short-lived improvement that depended on continued stimulation and that the mini-brains could not retain.

That naturally raises the question of what the point of such an experiment is if there is no lasting memory trace. For David Haussler, a bioinformatician and co-author of the study, the aim is not to bring about any form of rudimentary intelligence in brain organoids, but to support therapeutic research. As he put it, their goal is to advance brain science and treatments for neurological disease - not to replace robotic controllers or computers with animal brain tissue grown in the lab. The value is primarily methodological: using a measurable read-out of plasticity to explore how and why neural connections fail in conditions such as Alzheimer’s disease or Parkinson’s disease. More broadly, the framework provides a simplified, repeatable model for investigating effects that, in an intact brain, would be far harder to isolate and demonstrate.

What this kind of brain-organoid benchmark can (and cannot) tell us

It is also important to interpret the results with care. The cartpole task offers a tidy numerical way to track changes in network behaviour, but it does not imply comprehension, intention or awareness. What it can provide is a controlled window into how living neuronal tissue responds to structured input - and how quickly that responsiveness fades when stimulation stops.

Looking ahead, this approach could be paired with disease-model organoids (or organoids exposed to specific chemical or genetic interventions) to see whether plasticity signatures shift in predictable ways. In that sense, the cartpole/inverted pendulum benchmark may become a practical tool for comparing experimental conditions - not as a step towards conscious machines, but as a way to quantify how “trainable” a biological network is under different constraints.