Anthropic Tests Claude on NMR Chemistry, Matches Specialist Tools
TL;DR
- Opus 4.7 predicted hydrogen NMR peak positions with a mean error of ±0.079 ppm, which the research describes as well under half the tolerance window.
- All Claude models predicted sub-peak spacing within 0.5 hertz about 80% of the time, versus 26–35% for classical tools ChemDraw and MestReNova.
- Opus 4.7 correctly recovered all 8 simpler molecular structures via NMR elucidation on every attempt in the study.
NMR spectroscopy, the analytical technique chemists use to identify molecular structures from signals emitted by atomic nuclei in magnetic fields, has long depended on specialized software like MestReNova and ChemDraw. Anthropic's new research, published June 5, 2026, argues that Claude can now handle a meaningful slice of that work without any domain-specific training.
The study tested three Claude models (Opus 4.7, Opus 4.6, and Sonnet 4.6) on two tasks: predicting NMR spectra from a molecular structure (forward prediction) and recovering a molecular structure from NMR data (structure elucidation). For forward prediction, the team sourced 20 compounds from ChemRxiv preprints published after the models' training cutoff, specifically to avoid data contamination. Opus 4.7 predicted hydrogen peak positions with a mean error of ±0.079 ppm, described in the research as "well under half the tolerance window," and carbon peak positions with a mean error of ±1.37 ppm, competitive with MestReNova's ±1.48 ppm. Across all Claude models, sub-peak spacing landed within 0.5 hertz roughly 80% of the time, compared to 26–35% for the classical tools.
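To make the two forward-prediction figures concrete, here is a minimal sketch of how a mean peak-position error and a within-tolerance rate could be computed. It assumes predicted and observed peaks have already been paired one-to-one; the article does not say how the study matched peaks or defined its error, so the function names and all numbers below are hypothetical.

```python
from statistics import mean


def mean_abs_error(predicted, observed):
    """Mean absolute difference between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def fraction_within(predicted, observed, tolerance):
    """Share of paired values whose absolute difference is <= tolerance."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance)
    return hits / len(predicted)


# Hypothetical, illustrative values only (not data from the study).
pred_shifts_ppm = [7.25, 3.50, 1.25]    # predicted 1H peak positions, ppm
obs_shifts_ppm = [7.50, 3.50, 1.00]     # observed 1H peak positions, ppm
pred_spacing_hz = [7.0, 8.0, 2.0, 12.0]   # predicted sub-peak spacings, Hz
obs_spacing_hz = [7.25, 9.0, 2.5, 12.0]   # observed sub-peak spacings, Hz

shift_error = mean_abs_error(pred_shifts_ppm, obs_shifts_ppm)
spacing_rate = fraction_within(pred_spacing_hz, obs_spacing_hz, tolerance=0.5)

print(f"mean peak-position error: {shift_error:.3f} ppm")
print(f"spacings within 0.5 Hz: {spacing_rate:.0%}")
```

The two metrics answer different questions: the mean error summarizes how far off predictions are on average, while the within-tolerance rate counts how often a prediction is close enough to be usable, which is why the article reports them separately.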
On structure elucidation, Opus 4.7 was tested on 15 problems, split into 8 simpler targets (single-ring or two-fragment molecules) and 7 harder ones involving fused rings and spirocycles. It recovered all 8 simpler structures correctly on every attempt. For the harder targets, it produced the correct structure on all 3 runs for 4 compounds, and on 2 of 3 runs for the remainder.
The honest caveat is that the researchers themselves call the evaluation "small": 20 compounds across four scaffolds, with solvent coverage limited to DMSO-d6, CDCl3, and D2O, no 2D NMR experiments, and no stereochemistry assessment. A test set that narrow makes it genuinely hard to know how the results generalize to the full range of chemistry a working lab encounters. What the research doesn't address is how Claude handles organometallics or unusual heterocycles, or whether it can accept raw spectrometer output rather than pre-processed data.
The forward-looking case, if results hold across broader compound classes, is one of access: labs without expensive software licenses and students working without institutional tools would gain a capable assistant for routine NMR interpretation. Anthropic noted plans to extend the work into reaction reasoning, mechanism explanation, and chemical literature understanding, areas that would push Claude further into the everyday workflows of practising chemists.
Originally reported by anthropic.com
Original headline: Making Claude a chemist