
Neklyudov brings set-operations math to protein generation

TL;DR

  • Kirill Neklyudov of Université de Montréal and Mila presented 'Steering Generative Models: From Mathematics to Biomolecular Design' at ChemAI NYC 2026.
  • The talk frames protein design as probabilistic set operations, using Feynman-Kac correctors in both continuous and discrete forms to enforce targets and off-targets.
  • The group introduces SelectBench, a tool for evaluating selectivity, and the deck concedes on the record that 'the problem is far from being solved.'

A talk from ChemAI NYC 2026 caught my eye because it takes something the biomolecule-design world actually needs, proteins that bind target A but not the closely related off-target B, and reframes it as clean probabilistic set arithmetic rather than another bespoke fine-tune.

The deck is by Kirill Neklyudov of Université de Montréal and Mila, titled "Steering Generative Models: From Mathematics to Biomolecular Design." The framing is simple in the way good research talks are. The space of possible proteins is a set. Proteins binding a specific target are a subset. Intersections, complements, and differences of those subsets are what a designer really wants to sample from. What his group is proposing is a mathematical toolkit, Feynman-Kac correctors in both continuous and discrete versions, that steers a generative process to respect those set constraints as it denoises.
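
To make the set arithmetic concrete, here is a minimal Python sketch on a toy four-item "protein space." It is not the group's method. The candidate names, the binding scores, and the one-shot importance resampling are all assumptions made for illustration, whereas the correctors described in the talk act as the model denoises. The sketch treats "binds A" and "binds B" as soft membership scores, builds the intersection and the difference as reweighted distributions, and resamples draws from the prior toward "A but not B."

```python
import random

# Hypothetical toy "protein space": four labelled candidates.
# Every number here is made up for illustration.
PRIOR = {"seq1": 0.25, "seq2": 0.25, "seq3": 0.25, "seq4": 0.25}
BINDS_A = {"seq1": 0.9, "seq2": 0.8, "seq3": 0.1, "seq4": 0.0}  # soft membership in "binds target A"
BINDS_B = {"seq1": 0.9, "seq2": 0.1, "seq3": 0.8, "seq4": 0.0}  # soft membership in "binds off-target B"


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("no probability mass left after the set operation")
    return {x: w / total for x, w in weights.items()}


def intersection(prior, member_a, member_b):
    """'Binds A and B': weight the prior by both membership scores."""
    return normalize({x: prior[x] * member_a[x] * member_b[x] for x in prior})


def difference(prior, member_a, member_b):
    """'Binds A but not B': weight by A's score and the complement of B's."""
    return normalize({x: prior[x] * member_a[x] * (1.0 - member_b[x]) for x in prior})


def resample(prior, target, n, rng):
    """One-shot importance resampling: propose from the prior, weight by target/prior."""
    items = list(prior)
    proposals = rng.choices(items, weights=[prior[x] for x in items], k=n)
    weights = [target[x] / prior[x] for x in proposals]
    return rng.choices(proposals, weights=weights, k=n)


both = intersection(PRIOR, BINDS_A, BINDS_B)
selective = difference(PRIOR, BINDS_A, BINDS_B)
samples = resample(PRIOR, selective, n=1000, rng=random.Random(0))
```

With these made-up scores, seq2 (strong on A, weak on B) ends up holding most of the mass in `selective`, while seq1, which binds both, is pushed down and instead dominates `both`.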

Why should anyone outside this specific corner of ML-for-chemistry care? Specificity is the hard problem in binder design. Getting a model to produce a candidate that binds the target you asked for is now doable. Getting one that binds the target and not a closely related off-target is what separates a paper from a drug candidate. Treating "on-target, not off-target" as a first-class mathematical operation, rather than as a re-weighting hack layered on top, is at least an honest attempt at that gap. The group also introduces SelectBench, a tool for evaluating selectivity, which suggests they know the field cannot just eyeball this.

The honest caveat is one Neklyudov himself puts on a slide: the problem is far from being solved. The deck credits Marta Skreta, a postdoctoral researcher, as the "MVP of these works," and the framing is very much research-in-progress. What the reporting does not give you is a head-to-head against production binder-design pipelines, any wet-lab validation, or a sense of how the discrete Feynman-Kac corrector holds up outside toy protein settings.

The piece worth watching is whether the set-operations vocabulary catches on with other groups building diffusion and flow-based generators for molecules and proteins. If it does, benchmarks like SelectBench become the shared yardstick, and drug-discovery teams get a cleaner way to specify "hit this, miss that" without retraining from scratch.
