Digital Pathology: Slides, AI, & Challenges
Pathology has progressed from glass slides to AI-driven digital diagnostics, promising to transform disease detection while still facing adoption and market challenges.
Bo Wang, Head of Biomedical AI at Xaira Therapeutics, puts it simply: “Pathology is the cornerstone of diagnosis, especially when it comes to cancer.” Yet although the field has been developing since ancient times, by the late 2000s the traditional workflow of glass slides, physical archives, and in-person consultations could no longer keep pace with modern demands. It slowed diagnoses, made collaboration cumbersome, and limited access. High-resolution images were massive—often gigabytes each—and required costly whole-slide imaging (WSI) systems, which for years lacked FDA clearance.
By the mid-2010s, that changed. Advances in WSI, falling storage costs, and FDA approval sparked the rise of digital pathology—scanning slides for easier sharing, annotation, and remote consultation. Computationally minded researchers quickly recognized the value of these datasets, and startups emerged, often spun out of collaborations with hospitals and labs holding vast archives of digitized slides.
In this article: How it started — How it’s going — Time for Foundation Models — The Market Players — Not Only Cancer — Current State of the Industry
How it started
Pathology’s history begins with the careful observations of ancient physicians, who recognized the importance of tracking how diseases developed even without understanding their causes. The Edwin Smith Papyrus, written in hieratic script in the 17th century BC, described cases like skin ulcerations but could not explain their origins.
Centuries later, Hippocrates proposed the influential Humoral Theory, suggesting illness arose from imbalances in four bodily fluids—a framework that guided medicine until the 17th century. In the Middle Ages and Renaissance, figures like Antonio Benivieni advanced the field by systematically recording autopsy findings, marking a shift toward pathology as a distinct discipline.
The microscope, invented in the late 16th century and later refined by Robert Hooke, took on new significance in the 19th century, when Rudolf Virchow used it to establish that diseases originate at the cellular level. Improvements in tissue preparation (fixation, embedding, staining) helped shape modern histopathology. The 20th century brought integration with other sciences, as advances in immunology, chemistry, and molecular biology deepened understanding.
Antibody discovery enabled immunohistochemistry, allowing precise detection of proteins within tissues and more accurate diagnoses. The invention of PCR in 1983 pushed diagnostics forward by enabling genetic material to be amplified from very small samples. By the 21st century, pathology had gained the ability to examine single cells, offering unprecedented precision in detecting disease and guiding prognosis.

How it’s going
Over time, researchers began integrating digital solutions into pathology. In 1965, Judith Prewitt and Mortimer Mendelsohn of the University of Pennsylvania performed one of the first computerized analyses of microscopy images of cells and chromosomes. Over three decades later, 1999 marked the introduction of whole-slide imaging (WSI): glass microscope slides are scanned to produce a high-resolution digital image, which a pathologist then reviews to determine the diagnosis. Fast forward to 2017, and Philips received a milestone FDA approval for its IntelliSite, the first WSI system cleared for primary diagnosis in surgical pathology.
Essentially, WSI analysis can be framed as an image-recognition problem, which made digital pathology fertile ground for machine learning algorithms.
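To make that framing concrete, here is a minimal sketch of a patch-based pipeline: the gigapixel slide is tiled, each tile is scored by an ordinary image classifier, and the tile scores are aggregated into a slide-level call. The file name, tile size, background filter, and max-score aggregation rule are illustrative assumptions rather than a published pipeline, and the classifier here is untrained.

```python
# Minimal sketch of WSI-as-image-recognition: tile the slide, classify
# each tile, then aggregate tile scores into a slide-level result.
# Assumes openslide-python and torchvision; the classifier head is
# untrained here (in practice it would be fine-tuned on labeled patches).
import numpy as np
import openslide
import torch
import torchvision.transforms as T
from torchvision.models import resnet50

slide = openslide.OpenSlide("example.svs")   # hypothetical scanned slide
tile = 256                                   # patch edge length at full resolution
prep = T.Compose([T.ToTensor(), T.Resize((224, 224))])
model = resnet50(num_classes=2).eval()       # e.g., tumor vs. normal

scores = []
width, height = slide.dimensions
with torch.no_grad():
    for y in range(0, height - tile, tile):
        for x in range(0, width - tile, tile):
            patch = slide.read_region((x, y), 0, (tile, tile)).convert("RGB")
            if np.asarray(patch).mean() > 230:   # skip mostly-white background
                continue
            probs = model(prep(patch).unsqueeze(0)).softmax(-1)
            scores.append(probs[0, 1].item())    # probability of "tumor"

# One simple slide-level rule: flag the slide if any patch looks suspicious.
print("slide-level tumor score:", max(scores) if scores else 0.0)
```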
AI’s arrival in WSI diagnostics
The use of AI for a wide range of diagnostic tasks involving whole-slide images (WSIs) has grown in recent years. A comprehensive meta-analysis by McGenity et al., published last year in npj Digital Medicine, provides plenty of insights in this area.
According to McGenity et al., while AI has shown promise across multiple disease types, its most notable successes have been in cancer. A landmark early study by Bejnordi et al. (2017) evaluated 32 AI models developed for detecting breast cancer metastases in lymph nodes as part of the CAMELYON16 grand challenge, a contest on computer-aided diagnosis in histopathology using WSIs. The best-performing model achieved an area under the curve (AUC) of 0.994, reaching near-human performance in this controlled setting.
More recently, Lu et al. (2021) trained an AI model to predict the tumour site of origin in cases of cancer of unknown primary, achieving a top-1 accuracy of 0.80 and a top-3 accuracy of 0.93 on an external test set. AI has also been applied to predictive tasks, such as estimating 5-year survival in colorectal cancer patients and determining mutation status across multiple tumour types. Other reviews have investigated AI applications in liver, skin, and kidney pathology, with certain models demonstrating strong diagnostic performance.
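For readers unfamiliar with these metrics, the sketch below computes top-1 and top-3 accuracy for a hypothetical site-of-origin classifier: a case counts as correct at top-3 if the true site is among the model’s three highest-scoring candidates. The number of candidate sites and all scores are synthetic, chosen only for illustration.

```python
# Illustrative top-k accuracy on synthetic data. A prediction is correct
# at top-k if the true class is among the k highest-probability classes.
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_sites = 1000, 18                 # assumed number of candidate origin sites
probs = rng.dirichlet(np.ones(n_sites), size=n_cases)  # stand-in model outputs
labels = rng.integers(0, n_sites, size=n_cases)        # stand-in ground truth

def top_k_accuracy(probs: np.ndarray, labels: np.ndarray, k: int) -> float:
    # indices of the k highest-probability classes for each case
    top_k = np.argsort(probs, axis=1)[:, -k:]
    return float(np.mean([labels[i] in top_k[i] for i in range(len(labels))]))

print("top-1:", top_k_accuracy(probs, labels, 1))
print("top-3:", top_k_accuracy(probs, labels, 3))
```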
Time for Foundation Models
Interest in LLM-style foundation models (like ChatGPT, DeepSeek, and Grok) exists for a practical reason: one pretrained system can be adapted to many tasks at low marginal cost. For digital pathology, a workable path is to use them as the language-and-logic layer atop whole-slide image models.
In March 2024, Faisal Mahmood, a computer scientist at Harvard Medical School, released UNI—a self-supervised vision transformer (ViT-L) for computational pathology, pretrained with Meta’s DINOv2 (2023) on Mass-100K, a dataset of 100M+ patches from 100k+ diagnostic H&E WSIs spanning 20 tissue types. UNI addresses the limits of prior encoders trained on smaller, less diverse datasets such as TCGA. It was evaluated on 34 varied computational pathology tasks—region-of-interest- and slide-level classification, segmentation, retrieval, and few-shot learning—covering challenges from cancer grading to rare-disease subtyping.
On the OT-43 and OT-108 benchmarks (43 and 108 cancer types), UNI showed strong scaling with data and model size and outperformed state-of-the-art encoders (ResNet-50, CTransPath, REMEDIS), especially on rare cancers. The model was recently updated to UNI 2 with an expanded training set of more than 200 million images and 350,000 slides. To build on the model, in June 2024 Mahmood’s team launched PathChat, an AI copilot that combines the UNI encoder with an LLM and is fine-tuned on nearly one million medical Q&A pairs. Licensed to Modella AI in Boston, it can analyze pathology images, generate reports, and has received FDA breakthrough-device designation.
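In practice, encoders like UNI are used as frozen feature extractors. The sketch below follows the timm-based loading pattern from the UNI documentation at the time of writing (the Hugging Face hub ID and flags are taken from it and may change upstream, and the weights are gated behind an access request); the patch file is a hypothetical stand-in.

```python
# Sketch: UNI as a frozen patch encoder for downstream tasks. Hub ID and
# flags follow the UNI docs; treat them as assumptions subject to change.
import timm
import torch
from PIL import Image
from timm.data import create_transform, resolve_data_config

encoder = timm.create_model(
    "hf-hub:MahmoodLab/uni", pretrained=True,
    init_values=1e-5, dynamic_img_size=True,
).eval()
prep = create_transform(**resolve_data_config({}, model=encoder))

patch = Image.open("patch.png").convert("RGB")   # hypothetical H&E patch
with torch.no_grad():
    emb = encoder(prep(patch).unsqueeze(0))      # ViT-L feature, shape (1, 1024)

# The frozen embedding can then feed a lightweight task head, e.g. logistic
# regression for subtyping or nearest-neighbor retrieval of similar regions.
print(emb.shape)
```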
Similar to UNI, Mahmood’s model CONCH (Contrastive Learning from Captions for Histopathology) demonstrated superior performance to other models in classification tasks such as cancer subtyping, according to the researchers. For instance, it could identify cancer subtypes carrying BRCA gene mutations with over 90% accuracy, whereas competing models generally performed no better than chance. Trained on over 1.17 million image-caption pairs, CONCH could also classify and caption images, generating visual representations of patterns found in particular cancers, though its accuracy in these multimodal tasks was lower than in classification. In direct comparisons, CONCH consistently surpassed baseline methods, even when only a small number of data points were available for downstream training.
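The “contrastive learning from captions” idea is essentially CLIP-style training: matched image-caption pairs are pulled together in a shared embedding space while mismatched pairs are pushed apart. Below is a minimal sketch of that symmetric contrastive objective with stubbed-out embeddings; CONCH itself additionally trains a captioning decoder, which is what enables the image-description abilities mentioned above.

```python
# Minimal sketch of a CLIP-style image-caption contrastive objective.
# Encoders are stubbed out; real models would embed patches and captions.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature       # pairwise similarities
    targets = torch.arange(len(img))           # i-th image matches i-th caption
    # symmetric cross-entropy over image->text and text->image directions
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Stand-in embeddings for a batch of 8 patch/caption pairs, dimension 512.
loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```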
Several research teams have created their own foundation models for pathology. Microsoft’s Prov-GigaPath, for example, was trained on 1.38 billion image tiles from more than 171,000 slides collected at 28 cancer centers across the United States to perform tasks like cancer subtyping and pathomics. Using real-world data and a two-stage self-supervised learning process, it achieves state-of-the-art performance on 25 of 26 benchmark tasks spanning cancer subtyping and mutation prediction. For instance, in EGFR mutation prediction on the TCGA dataset, Prov-GigaPath demonstrated +23.5% AUROC and +66.4% AUPRC versus the next-best model, REMEDIS.
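The two-stage design is worth unpacking: stage one embeds each tile independently, and stage two aggregates the resulting bag of tile embeddings into one slide-level representation. The sketch below uses attention-based pooling, an ABMIL-style stand-in chosen for brevity rather than Prov-GigaPath’s actual LongNet slide encoder, over random stand-in tile embeddings.

```python
# Generic tile-then-aggregate sketch: stage 1 (not shown) produces one
# embedding per tile; stage 2 pools the bag into a slide-level prediction.
import torch
import torch.nn as nn

class SlideAggregator(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        # attention scores decide how much each tile contributes
        self.attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.head = nn.Linear(dim, 2)  # e.g., mutation present / absent

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:
        # tiles: (n_tiles, dim) embeddings from a frozen stage-1 encoder
        weights = torch.softmax(self.attn(tiles), dim=0)   # (n_tiles, 1)
        slide_emb = (weights * tiles).sum(dim=0)           # attention-weighted mean
        return self.head(slide_emb)

logits = SlideAggregator()(torch.randn(5000, 768))  # 5,000 stand-in tile embeddings
print(logits)
```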
Another tool, mSTAR (Multimodal Self-taught Pretraining), released in July 2024 by computer scientist Hao Chen and his team at the Hong Kong University of Science and Technology, is the first pathology foundation model to integrate three modalities—pathology slides, pathology reports, and gene expression data. It leverages more than 26,000 slide-level multimodal pairs from over 10,000 patients across 32 cancer types (over 116 million pathological patches) to identify metastases, subtype cancers, and perform other tasks. Following the example of PathChat, Chen’s group developed its own AI chatbot, SmartPath, now in hospital trials in China, where it is being tested against pathologists in diagnosing breast, lung, and colon cancers.
All that being said, France-based Bioptimus decided to think bigger: in May 2025 it released H-Optimus-1, the largest open-source pathology foundation model. It is a 1.1-billion-parameter Vision Transformer trained with self-supervision on more than 500,000 whole slides spanning 50 organs and roughly 800,000 patients from 4,000 clinics. Compared with its predecessor H-Optimus-0, it raises average AUROC—a score that tells how well a classifier separates true positives from false positives (1.0 = perfect, 0.5 = chance)—on nine mutation-and-biomarker tasks from 0.835 to 0.856 and nudges the HEST gene-expression correlation from 0.413 to 0.422. The model’s embeddings support tumour detection, metastasis screening, mutation prediction (e.g., KRAS, BRAF, MSI), and survival modelling, and can be fine-tuned for laboratory-specific workflows.
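Since these comparisons come down to AUROC, here is how the score is computed in practice with scikit-learn; the labels and model scores below are synthetic stand-ins for a mutation-prediction benchmark.

```python
# AUROC in practice: scikit-learn's roc_auc_score over per-slide
# probabilities. Labels and scores here are synthetic illustrations.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=500)   # e.g., mutant (1) vs. wild type (0)
# stand-in model scores: mutants tend to score higher, with noise
scores = np.clip(labels * 0.3 + rng.normal(0.5, 0.25, size=500), 0, 1)

print("AUROC:", round(roc_auc_score(labels, scores), 3))  # 1.0 perfect, 0.5 chance
```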