Artificial Intelligence in Biomedicine: Unlocking the Secrets of Millions of Cells
AI Business Strategy · 4 min read

Sam Carter

AI Strategy Consultant

November 1, 2025

Our bodies are made up of roughly 37 trillion cells, each carrying out specific functions that define health or disease. But how exactly do one healthy person’s cells differ from those of someone with lung cancer, a COVID-19 infection, or even the effects of smoking? To uncover such differences, researchers must analyze massive datasets at the single-cell level. That’s where artificial intelligence (AI), and machine learning in particular, is proving to be a game-changer.

Tackling Big Data in Cell Research:

Recent breakthroughs in single-cell technology now allow scientists to investigate tissues one cell at a time, mapping out their diverse functions. This type of analysis provides vital comparisons, showing how diseases or environmental factors reshape cellular structures. But as the data piles up, traditional computational tools struggle to keep pace. Researchers at the Technical University of Munich (TUM) are taking a fresh approach: leveraging self-supervised learning, a branch of AI capable of making sense of 20 million or more cells without the need for pre-labeled training data.

Self-Supervised Learning Explained:

The study, led by Fabian Theis, Chair of Mathematical Modelling of Biological Systems at TUM, was published in Nature Machine Intelligence. Unlike classical methods that require pre-classified data, self-supervised learning works directly with unlabeled datasets, which are both abundant and well-suited for large-scale biological studies.

This approach uses two core strategies:

Masked learning: hiding parts of the input data so the model learns to reconstruct missing elements.

Contrastive learning: teaching the model to recognize similarities and differences between cells.
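
To make these two strategies concrete, here is a minimal PyTorch sketch of both ideas applied to a gene-expression matrix in which each row is one cell. The network sizes, masking fraction, and noise-based "views" are illustrative assumptions for this post, not the architecture used in the study.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes: each cell is a vector of expression values for n_genes genes.
n_genes, embed_dim = 2000, 128

encoder = nn.Sequential(nn.Linear(n_genes, 512), nn.ReLU(), nn.Linear(512, embed_dim))
decoder = nn.Sequential(nn.Linear(embed_dim, 512), nn.ReLU(), nn.Linear(512, n_genes))

def masked_learning_loss(x, mask_frac=0.15):
    """Masked learning: hide a random subset of genes and score the
    model only on how well it reconstructs the hidden values."""
    mask = torch.rand_like(x) < mask_frac        # True marks a hidden gene
    recon = decoder(encoder(x.masked_fill(mask, 0.0)))
    return F.mse_loss(recon[mask], x[mask])

def contrastive_loss(x, temperature=0.5):
    """Contrastive learning: embed two noisy 'views' of each cell, then
    pull matching views together and push different cells apart."""
    z1 = F.normalize(encoder(x + 0.01 * torch.randn_like(x)), dim=1)
    z2 = F.normalize(encoder(x + 0.01 * torch.randn_like(x)), dim=1)
    logits = z1 @ z2.T / temperature             # cell-to-cell similarity scores
    return F.cross_entropy(logits, torch.arange(x.size(0)))

x = torch.rand(32, n_genes)                      # stand-in expression matrix
loss = masked_learning_loss(x) + contrastive_loss(x)
```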

Applying both techniques, the team trained and evaluated models on over 20 million cells and compared the outcomes against traditional AI methods, measuring performance on tasks such as predicting cell types and reconstructing gene expression.
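
As a rough picture of how such a benchmark can work, one common protocol is a linear probe: freeze the pretrained encoder, fit a simple classifier on its embeddings, and score cell-type accuracy on held-out cells. The sketch below reuses the encoder from the example above; the data and labels are synthetic stand-ins, not the study's benchmark.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-ins for a labeled benchmark: expression matrices plus
# known cell-type labels (0-4 here, representing five hypothetical types).
x_train, x_test = torch.rand(200, n_genes), torch.rand(50, n_genes)
y_train, y_test = np.random.randint(0, 5, 200), np.random.randint(0, 5, 50)

with torch.no_grad():                  # frozen encoder: no gradient updates
    z_train = encoder(x_train).numpy()
    z_test = encoder(x_test).numpy()

probe = LogisticRegression(max_iter=1000).fit(z_train, y_train)
print("cell-type accuracy:", accuracy_score(y_test, probe.predict(z_test)))
```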

From Better Predictions to Virtual Cells:

Results showed that self-supervised learning excels at transfer tasks, where knowledge gained on large datasets improves performance on smaller ones. Even zero-shot predictions, in which the pretrained model is applied to new data without any additional task-specific training, delivered promising results. Importantly, masked learning proved especially effective for huge single-cell datasets.

Looking ahead, the team envisions virtual cells: digital models that capture the diversity of real cells across different datasets. Such models could transform the study of disease progression by simulating cellular changes in conditions like lung cancer or viral infections.
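
To picture what a transfer task looks like in code, the hedged sketch below continues the examples above: it reuses the pretrained encoder and fine-tunes it, together with a small classification head, on a much smaller labeled dataset. The head size, learning rate, and data are illustrative stand-ins.

```python
# Transfer sketch: reuse the pretrained encoder and fine-tune it on a small task.
head = nn.Linear(embed_dim, 5)         # classification head for 5 cell types
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-4
)

x_small = torch.rand(64, n_genes)      # tiny downstream dataset (synthetic here)
y_small = torch.randint(0, 5, (64,))

for step in range(100):                # brief fine-tuning loop
    optimizer.zero_grad()
    loss = F.cross_entropy(head(encoder(x_small)), y_small)
    loss.backward()
    optimizer.step()
```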

Why It Matters:

By improving efficiency in training and scaling models, self-supervised learning paves the way for more accurate, transferable, and accessible biomedical research tools. These insights could eventually help doctors predict disease at the cellular level and personalize treatments far earlier than current methods allow.
