
EO-WM Cuts Satellite Vegetation Decline Forecast Error 5.63%

generative ai computer vision world-models climate-ai satellite-imagery

TL;DR

  • EO-WM reduces NDVI decline amplitude error by 5.63% and improves directional hit rate by 7.80% over generic video baselines.
  • The model decomposes weather forcing into climatological baseline, weather anomalies, and cumulative heat and drought stress signals.
  • Two new benchmarks test extreme-event severity calibration and directional weather-to-vegetation response fidelity, not just pixel accuracy.

Predicting how vegetation will respond to a heatwave is a harder problem than predicting what a landscape will look like on an ordinary summer day, and most satellite forecasting benchmarks do not distinguish between the two. Researchers from the University of Hong Kong and Wuhan University addressed that gap directly with EO-WM, a video diffusion transformer built for multispectral Earth Observation forecasting under changing meteorological conditions.

The model's core idea is a physically informed decomposition of weather forcing into three signals: a climatological baseline capturing what a location normally experiences, a weather anomaly term measuring how far current conditions deviate from that baseline, and a cumulative stress signal that accumulates heat and water deficit over multi-day windows. That last component matters most for extreme events. A single hot day is noise; weeks of compounding heat and drought are the kind of sustained forcing that actually degrades vegetation, and the model is designed to distinguish between the two.
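To make the three-signal decomposition concrete, here is a minimal sketch for a single daily temperature series. It is an illustration, not EO-WM's implementation: the function name `decompose_forcing`, the trailing-window sum, and the `window` and `heat_threshold` parameters are all assumptions, and the source does not describe how the model actually computes or encodes these signals.

```python
def decompose_forcing(temps, climatology, window=3, heat_threshold=0.0):
    """Split a daily series into baseline, anomaly, and cumulative stress.

    temps:          observed daily values (e.g. temperature in deg C)
    climatology:    long-term mean for the same calendar days (hypothetical input)
    window:         trailing days over which stress accumulates (assumed)
    heat_threshold: anomaly level above which a day counts as stressful (assumed)
    """
    if len(temps) != len(climatology):
        raise ValueError("temps and climatology must have the same length")

    # Anomaly: deviation of each day from what the location normally experiences.
    anomaly = [t - c for t, c in zip(temps, climatology)]

    # Only the part of the anomaly above the threshold contributes to stress.
    excess = [max(a - heat_threshold, 0.0) for a in anomaly]

    # Cumulative stress: trailing-window sum of the daily excess.
    stress = []
    for i in range(len(excess)):
        start = max(0, i - window + 1)
        stress.append(sum(excess[start:i + 1]))

    return list(climatology), anomaly, stress


# A sustained warm spell: the daily anomaly stays flat at +4 while the
# accumulated stress keeps growing until the window is full.
baseline, anomaly, stress = decompose_forcing(
    temps=[20, 24, 24, 24, 24],
    climatology=[20, 20, 20, 20, 20],
)
```

In this toy series the anomaly is `[0, 4, 4, 4, 4]` while the stress signal climbs to `[0.0, 4.0, 8.0, 12.0, 12.0]`. The duration of the event shows up only in the third signal, which is why the anomaly on its own cannot say how long the forcing has lasted.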

Evaluated against generic video generation baselines, EO-WM reduces decline amplitude error in NDVI (Normalized Difference Vegetation Index) by 5.63% and improves directional hit rate by 7.80%. The team introduced two new benchmarks alongside the model: an Extreme Summer benchmark drawing 1,440 verified prediction windows from the 2018 European summer heat event across four Sentinel-2 tiles in France and Germany, and a Seasonal Matched-Pair benchmark with 422 pairs from 380 locations, which tests whether a model correctly reproduces vegetation responses when weather forcing differs between years at the same site. Both benchmarks target a real gap: standard reconstruction metrics like EarthNetScore measure pixel accuracy but say nothing about whether a model responds to weather in the right direction.
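The difference between pixel accuracy and response fidelity is easier to see with the metrics written out. The sketch below uses the standard NDVI formula, (NIR - Red) / (NIR + Red). The two metric functions are assumed definitions chosen for illustration: the source does not give the paper's exact formulas for decline amplitude error or directional hit rate, and the function names and sample values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def _sign(x):
    # Returns -1, 0, or 1.
    return (x > 0) - (x < 0)


def directional_hit_rate(pred_changes, obs_changes):
    """Fraction of windows where predicted and observed NDVI change share a sign.

    Assumed definition, for illustration only.
    """
    if not obs_changes:
        raise ValueError("need at least one window")
    hits = sum(_sign(p) == _sign(o) for p, o in zip(pred_changes, obs_changes))
    return hits / len(obs_changes)


def decline_amplitude_error(pred_changes, obs_changes):
    """Mean absolute error of NDVI change, over windows where NDVI declined.

    Assumed definition, for illustration only.
    """
    pairs = [(p, o) for p, o in zip(pred_changes, obs_changes) if o < 0]
    if not pairs:
        return 0.0
    return sum(abs(p - o) for p, o in pairs) / len(pairs)


# Hypothetical NDVI changes over four prediction windows.
observed = [-0.20, -0.10, 0.05, -0.30]
predicted = [-0.15, 0.02, 0.04, -0.25]

hit_rate = directional_hit_rate(predicted, observed)
amp_error = decline_amplitude_error(predicted, observed)
```

The second window is the case these benchmarks are built to catch. The prediction is off by only 0.12 NDVI, but it says the vegetation greened when it actually declined, so it counts as a directional miss and the hit rate here is 3 of 4.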

The paper is candid about limits: the model is currently constrained to seasonal forecasting windows, error accumulates at longer horizons, and several land-surface states remain unobserved, including soil moisture, irrigation, and vegetation type. Code and benchmarks are planned for open release at github.com/Luo-Z13/EO-WM. The clearest near-term beneficiaries are crop-yield and climate-risk applications where knowing whether a model tracks the direction of vegetation stress under anomalous weather matters more than whether its pixels look realistic.