Discussion about this post

Mauricio Cruz Loya

I broadly agree with this. An important limitation, especially in biology, is that many problems can't be reduced to clean evaluation loops. When data is sparse and heterogeneous and hypotheses aren’t tied to a single metric, it can be unclear what the agent is even optimizing.

This fundamentally constrains AI-only agentic workflows. Even with multiple agents, you still get shared priors and failure modes, as you point out.

To me, the more promising direction seems hybrid: workflows that incorporate AI alongside human judgment and mechanistic modeling. This allows not just faster pattern recognition, but also the representation and testing of causal structure.

I wrote a short essay expanding on this if anyone’s interested.
