Generative Diffusion in Molecular Design
A quick field guide to diffusion-based generators in molecular design—how they work, where they complement transformers, and who is deploying them today
Last week, California-based drug discovery startup Terray Therapeutics introduced EMMI, an experimentation-driven machine intelligence platform. The platform combines the company’s proprietary ultra-dense microarray technology with an AI stack built around its COATI foundation model, which maps chemical representations to their corresponding molecular properties to support scientific understanding. EMMI is designed to guide R&D reasoning and propose molecular candidates for subsequent refinement and validation. Terray couples a 13-billion-measurement binding dataset with COATI-based diffusion and reinforcement learning (RL) generators and an uncertainty-aware selection layer, forming a closed-loop system that decides not only what to propose but also which molecules are worth the cost of actually making and testing. The company released its first latent diffusion-based molecular generator in 2024.
Terray’s work on diffusion methods prompted a broader reflection on generative AI in biology. Today, most conversations and publications center on Transformer-based systems, especially large language models (LLMs) and other foundation models (FMs). LLMs make up a major subset of FMs: they are trained primarily on sequential, text-like data such as natural language, code, or biological sequences, whereas foundation models more broadly extend the paradigm to additional modalities, including images, audio, video, and multimodal combinations.
Recent meta-reviews in biomedical NLP collectively catalog nearly 300 LLM instances across hundreds of studies. Foundation models are also proliferating, with over 200 tools developed since 2022 in drug discovery alone. In contrast, the literature on diffusion models for biological and chemical applications remains comparatively modest: so far, only a handful of reviews have covered diffusion generators. Yet despite their lower profile, diffusion architectures are carving out a meaningful and distinctive role in biotech research and industry.
Before diving deeper into their role in biomedicine, let’s briefly review how diffusion models work in general.
In this article: Diffusion Models 101 — With or against Transformers? — Diffusion Models in Biomedicine — Dispersed Players — Diffusion Online Stations — An Afternote

