Probabilistic Super-Resolution for Urban Micrometeorology via a Schrödinger Bridge
TL;DR Summary
This study develops a Schrödinger bridge model for super-resolution of urban micrometeorology, achieving accuracy comparable to diffusion models at one-fifth the computational cost while improving uncertainty quantification, which makes real-time ensemble forecasting feasible.
Abstract
This study employs a neural network that represents the solution to a Schrödinger bridge problem to perform super-resolution of 2-m temperature in an urban area. Schrödinger bridges generally describe transformations between two data distributions based on diffusion processes. We use a specific Schrödinger-bridge model (SM) that directly transforms low-resolution data into high-resolution data, unlike denoising diffusion probabilistic models (simply, diffusion models; DMs) that generate high-resolution data from Gaussian noise. Low-resolution and high-resolution data were obtained from separate numerical simulations with a physics-based model under common initial and boundary conditions. Compared with a DM, the SM attains comparable accuracy at one-fifth the computational cost, requiring 50 neural-network evaluations per datum for the DM and only 10 for the SM. Furthermore, high-resolution samples generated by the SM exhibit larger variance, implying superior uncertainty quantification relative to the DM. Owing to the reduced computational cost of the SM, our results suggest the feasibility of real-time ensemble micrometeorological prediction using SM-based super-resolution.
In-depth Reading
1. Bibliographic Information
- Title: Probabilistic Super-Resolution for Urban Micrometeorology via a Schrödinger Bridge
- Authors:
- Yuki Yasuda: Affiliated with the Research Institute for Value-Added-Information Generation, Japan Agency for Marine-Earth Science and Technology (JAMSTEC).
- Ryo Onishi: Affiliated with the Supercomputing Research Center, Institute of Integrated Research, Institute of Science Tokyo.
- Journal/Conference: This paper is a preprint available on arXiv. It has not yet undergone formal peer review for publication in a journal or conference. arXiv is a widely respected repository for researchers to share their findings early and openly.
- Publication Year: The arXiv identifier suggests 2025.
- Abstract: The paper presents a novel application of a Schrödinger-bridge model (SM) for the super-resolution of 2-meter temperature data in an urban environment. Unlike standard diffusion models (DMs) that generate high-resolution data from noise, this SM directly transforms low-resolution data into high-resolution data. The study demonstrates that the SM achieves accuracy comparable to a DM but with only one-fifth of the computational cost (10 neural network evaluations vs. 50 for the DM). Additionally, the SM generates ensembles with larger variance, indicating superior uncertainty quantification. The authors conclude that the SM's efficiency makes real-time ensemble micrometeorological prediction a feasible prospect.
- Original Source Link:
  - Official Source: https://arxiv.org/abs/2510.12148
  - PDF Link: https://arxiv.org/pdf/2510.12148v1.pdf
  - Publication Status: Preprint on arXiv (identifier 2510.12148, corresponding to an October 2025 submission); it has not yet undergone formal peer review.
2. Executive Summary
- Background & Motivation (Why):
- Core Problem: High-resolution numerical weather prediction, especially for complex urban environments (micrometeorology), is extremely computationally expensive. This makes real-time forecasting and ensemble (multi-possibility) predictions challenging.
- Existing Gaps: While deep learning-based super-resolution (SR) offers a way to accelerate these predictions, recent state-of-the-art methods like Denoising Diffusion Probabilistic Models (DMs) are themselves computationally intensive. DMs typically generate high-resolution outputs by starting from random noise, which is an inefficient process requiring many iterative steps. Furthermore, while DMs can quantify uncertainty, their practical application in meteorology is hindered by this high computational cost.
- Innovation: This paper introduces a Schrödinger-bridge model (SM) as a more efficient alternative. The key innovation is that the SM is designed to learn a direct transformation from low-resolution (LR) data to high-resolution (HR) data, rather than from noise to HR data. This is a more direct and intuitive approach to super-resolution. The study also investigates a critical but often overlooked aspect of SB models in this context: their ability to quantify uncertainty.
- Main Contributions / Findings (What):
  - Novel Application of a Schrödinger-Bridge Model (SM): The paper is one of the first to apply an SM for probabilistic super-resolution in a meteorological context, specifically for urban 2-m temperature.
  - Superior Computational Efficiency: The SM achieves accuracy comparable to a baseline DM while requiring only one-fifth of the computational cost. The SM needed only 10 neural network evaluations per sample, compared to 50 for the DM, to achieve similar performance.
  - Improved Uncertainty Quantification: The ensemble of HR samples generated by the SM exhibits a larger and more appropriate spread (variance) than the DM. This suggests the SM provides a better-calibrated representation of prediction uncertainty, which is critical for reliable forecasting.
  - Demonstration of Real-Time Feasibility: The study shows that by combining a fast LR physics-based simulation with the efficient SM-based super-resolution, it is possible to generate high-resolution ensemble forecasts in near real-time, marking a significant step towards practical operational use.
3. Prerequisite Knowledge & Related Work
- Foundational Concepts:
- Super-Resolution (SR): A technique to increase the spatial resolution of data. In this context, it involves using a model to generate a high-resolution (e.g., 5-meter) temperature map from a low-resolution (e.g., 20-meter) input.
- Urban Micrometeorology: The study of weather phenomena on a very small scale (meters to a few kilometers) within cities, which is heavily influenced by buildings, roads, and other urban structures.
- Probabilistic Generative Models: Models that learn the underlying probability distribution of a dataset. Instead of producing a single deterministic output, they can generate multiple diverse samples, which is useful for tasks like Uncertainty Quantification (UQ)—estimating the range of possible outcomes.
- Denoising Diffusion Probabilistic Models (DMs): A class of generative models that work in two stages. A "forward process" gradually adds Gaussian noise to a data sample until it becomes pure noise. A neural network then learns the "reverse process" to denoise the sample step-by-step, starting from noise to generate a new data sample.
- Schrödinger Bridge (SB): A mathematical framework that finds the most likely "path" or transformation between two different probability distributions. While a DM typically finds a path from a simple noise distribution to a complex data distribution, an SB can find a path between any two distributions, for example, from an LR data distribution to an HR data distribution. This makes it conceptually more direct for super-resolution.
- Stochastic Differential Equations (SDEs): Mathematical equations used to model systems that evolve over time with a random component. Both DMs and SBs are formulated as SDEs, where a neural network learns the "drift" term that guides the transformation over a "diffusion time."
- Previous Works & Technological Evolution:
- The paper situates itself within the trend of using deep learning for accelerating numerical weather prediction. Early works focused on deterministic SR models.
  - More recently, DMs became popular in meteorology (e.g., Hess et al. 2025; Mardani et al. 2025) because they are powerful generative models that can also provide uncertainty estimates through ensemble generation.
  - However, the authors note that DMs have a fundamental inefficiency: they start from random noise, which has no structural similarity to the target HR data. The transformation from noise to structured data requires many small steps, leading to high computational cost.
  - The Schrödinger Bridge framework offers a solution. It generalizes diffusion processes and can model a direct transformation from LR data to HR data. This idea has been explored in computer vision (Liu et al. 2023), but its application to meteorology and its UQ capabilities remained unexplored.
- Differentiation:
  - DM vs. SM: The core difference is the starting point of the generation process. As illustrated in Figure 1, the DM starts with Gaussian noise and uses the LR data as a condition to guide the denoising process. The SM starts directly with the LR data itself and transforms it into an HR sample. This makes the SM's transformation path potentially shorter and more efficient, as the LR data already contains significant structural information about the final HR output.
  - Figure 1 is a schematic of data transformation via stochastic differential equations (SDEs): the upper half shows the diffusion model (DM), whose reverse process converts Gaussian noise into high-resolution data; the lower half shows the Schrödinger bridge model (SM), which transforms low-resolution data directly into high-resolution data. The left side depicts the input data, and the right side shows the high-resolution output with its corresponding probability density function.
4. Methodology (Core Technology & Implementation)
The paper's methodology revolves around comparing two probabilistic super-resolution models: a custom Schrödinger-bridge model (SM) and a baseline Diffusion Model (DM).
The general SR task is to infer high-resolution (HR) data given low-resolution (LR) data and auxiliary information (e.g., topography). The goal is to learn the conditional probability distribution of the HR data given the LR data and the auxiliary inputs, as sketched below.
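In illustrative notation (these symbols are assumptions for this summary, not necessarily the paper's own): writing $x_{\mathrm{HR}}$ for the HR field, $x_{\mathrm{LR}}$ for the LR field, and $y$ for auxiliary static inputs such as building height and land use, both models below are trained to draw samples from

$$p\left(x_{\mathrm{HR}} \mid x_{\mathrm{LR}},\, y\right).$$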
Schrödinger-Bridge Model (SM)
The SM is based on the work of Chen et al. (2024) and is designed to directly transform the LR data into samples from the HR distribution.
- Principle: The SM solves a specific Schrödinger bridge problem that finds the most efficient transformation from a point mass located at the LR data to the target conditional distribution of the HR data.
- SDE for Generation: The transformation is governed by an SDE that is integrated over diffusion time, starting from the LR data and ending at a generated HR sample. Its components are:
  - The state of the data at the current diffusion time, which starts at the LR data and ends at the generated HR sample.
  - A drift function approximated by a U-Net, representing the velocity guiding the transformation.
  - A score function, which is computed from the learned quantities rather than modeled separately.
  - Pre-defined noise schedules.
  - A Wiener process (infinitesimal random noise).
- Training: The U-Net that approximates the drift is trained by minimizing the mean squared error between its prediction and the true "velocity" of a stochastic interpolant. The quantities entering this loss are:
  - The neural network's approximation of the drift term.
  - The stochastic interpolant, a noisy mixture of the paired LR and HR data at the current diffusion time.
  - The time derivatives of the interpolation schedules, which define the interpolant's "instantaneous velocity"; the network learns to predict this velocity. (A generic sketch of the interpolant and the loss follows below.)
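Since the paper's exact equations are not reproduced above, the following is a minimal sketch in generic stochastic-interpolant notation; the schedules $\alpha_t, \beta_t, \gamma_t$, the drift network $b_\theta$, and the conditioning variable $y$ are illustrative assumptions rather than the authors' symbols:

$$
I_t = \alpha_t\, x_{\mathrm{LR}} + \beta_t\, x_{\mathrm{HR}} + \gamma_t\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I), \qquad \alpha_0 = \beta_1 = 1,\quad \alpha_1 = \beta_0 = \gamma_0 = \gamma_1 = 0,
$$

$$
\mathcal{L}(\theta) = \mathbb{E}\,\big\| b_\theta(I_t, t \mid y) - \big(\dot{\alpha}_t\, x_{\mathrm{LR}} + \dot{\beta}_t\, x_{\mathrm{HR}} + \dot{\gamma}_t\, \varepsilon\big) \big\|^2 .
$$

Generation then follows a drift-plus-noise SDE of the schematic form $\mathrm{d}X_t = b_\theta(X_t, t \mid y)\,\mathrm{d}t + \sigma_t\,\mathrm{d}W_t$, integrated from $X_0 = x_{\mathrm{LR}}$ to an HR sample at $t = 1$; the paper additionally uses a score term computed from these learned quantities.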
Diffusion Model (DM)
The DM baseline is a standard formulation (Ho et al. 2020) adapted for conditional generation.
- Principle: The DM learns to reverse a process that gradually turns data into Gaussian noise. To generate a sample, it starts from pure noise and iteratively "denoises" it, conditioned on the LR data.
- SDEs for Generation: The process is described by a pair of SDEs:
  - Forward SDE (Noise Addition): describes how the data residual (the difference between the HR data and the LR input) is gradually transformed into Gaussian noise.
  - Reverse SDE (Denoising/Generation): is integrated backward in diffusion time, starting from a sample of Gaussian noise; the neural network learns the score function that guides this denoising.
  - The final HR sample is obtained by adding the generated residual to the LR input.
- Training: The score function is learned via denoising score matching. The network is trained to predict the noise that was added to a clean sample to create its noisy version, using a mean-squared-error loss (a standard form of this objective is sketched below).
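For reference, this is the standard DDPM objective of Ho et al. (2020), which the baseline follows; the explicit conditioning on the LR field and auxiliary inputs is written here as an assumption about how the conditional adaptation is done:

$$
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I),
$$

$$
\mathcal{L}_{\mathrm{simple}}(\theta) = \mathbb{E}_{t,\, x_0,\, \varepsilon} \big\| \varepsilon_\theta\big(x_t, t \mid x_{\mathrm{LR}}, y\big) - \varepsilon \big\|^2 ,
$$

where $x_0$ is the clean target (here, the residual between the HR data and the LR input), $\bar{\alpha}_t$ is the cumulative noise schedule, and $\varepsilon_\theta$ is the U-Net noise predictor.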
Neural Network Architecture
Both the SM and DM use the same U-Net architecture to ensure a fair comparison. This U-Net includes:
- Four downsampling and four upsampling blocks.
- Convolutional layers that progressively increase/decrease the number of channels (32, 64, 128, 256).
- A multi-head self-attention block at the lowest resolution to capture global spatial dependencies.
- Sinusoidal embeddings that encode the diffusion time, passed to each block through FiLM layers (Perez et al. 2018); a minimal sketch of this conditioning follows the list.
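To make the FiLM-based time conditioning concrete, here is a minimal PyTorch-style sketch; the module and variable names are hypothetical, and this is not the authors' implementation, only an illustration of sinusoidal time embeddings modulating convolutional features:

```python
import math
import torch
import torch.nn as nn


def sinusoidal_embedding(t: torch.Tensor, dim: int = 128) -> torch.Tensor:
    """Encode diffusion times t (shape [B]) as [B, dim] sinusoidal embeddings."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half, dtype=torch.float32) / half)
    angles = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)


class FiLMConvBlock(nn.Module):
    """Convolutional block whose features are scaled and shifted by the time
    embedding, in the spirit of FiLM conditioning (Perez et al. 2018)."""

    def __init__(self, channels: int, emb_dim: int = 128):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.norm = nn.GroupNorm(8, channels)
        # Linear map from the time embedding to per-channel scale and shift.
        self.to_scale_shift = nn.Linear(emb_dim, 2 * channels)

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        h = self.norm(self.conv(x))
        scale, shift = self.to_scale_shift(t_emb).chunk(2, dim=-1)
        h = scale[:, :, None, None] * h + shift[:, :, None, None]
        return torch.relu(h)


if __name__ == "__main__":
    feats = torch.randn(4, 32, 64, 64)       # a batch of feature maps
    times = torch.rand(4)                     # diffusion times in [0, 1]
    block = FiLMConvBlock(channels=32)
    out = block(feats, sinusoidal_embedding(times))
    print(out.shape)                          # torch.Size([4, 32, 64, 64])
```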
5. Experimental Setup
- Datasets:
  - Source: The data was generated using a physics-based model, the Multi-Scale Simulator for the Geoenvironment (MSSG).
  - Content: The dataset consists of 2-m temperature data over a 1.6 km square area centered on Tokyo Station (see Figure 2), simulated for extremely hot days between 2013 and 2020.
  - Resolution: LR data is at 20-m grid spacing and HR data is at 5-m grid spacing.
  - Inputs: The models receive LR 2-m temperature, LR temperature and velocity at multiple vertical levels, HR building height, and HR land-use index. The HR static fields provide important geographical context.
  - Splits: Data from 2013-2018 is used for training, 2019 for validation, and 2020 for testing.
  - Figure 2 of the paper shows the building-height distribution within the computational domain of the urban meteorological simulation: the left panel at high resolution (5 m) and the right panel at low resolution (20 m), with a black box marking the region shown in Figure 3.
- Evaluation Metrics:
  - Root Mean Square Error (RMSE): Measures the average pixel-wise difference between the generated HR sample and the ground-truth HR data.
    - Conceptual Definition: It quantifies the magnitude of error. A lower RMSE indicates higher accuracy.
    - Formula: $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{T}_i - T_i\right)^2}$
    - Symbol Explanation: $N$ is the total number of grid points, $\hat{T}_i$ is the predicted temperature at grid point $i$, and $T_i$ is the ground-truth temperature.
  - Structural Similarity Index Measure (SSIM) Loss:
    - Conceptual Definition: SSIM assesses the perceptual similarity between two images, considering structure, contrast, and luminance. It is generally a better measure of pattern similarity than RMSE. SSIM Loss is defined as 1 − SSIM, so lower is better.
    - Formula: $\mathrm{SSIM}(x, y) = \dfrac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$
    - Symbol Explanation: $\mu_x, \mu_y$ are the means of images $x$ and $y$; $\sigma_x^2, \sigma_y^2$ are their variances; $\sigma_{xy}$ is their covariance; and $c_1, c_2$ are small constants to prevent instability.
  - Spread-Skill Ratio (Spread/RMSE):
    - Conceptual Definition: A metric to evaluate the calibration of an ensemble forecast. It is the ratio of the ensemble spread (uncertainty) to the ensemble mean's error (RMSE). A ratio close to 1 is ideal; a ratio < 1 indicates the ensemble is underdispersive (too confident).
    - Formula: $\text{Spread-Skill Ratio} = \dfrac{\mathrm{Spread}}{\mathrm{RMSE}}$
    - Symbol Explanation: Spread is the standard deviation of the ensemble members, averaged over all grid points; RMSE is the root mean square error of the ensemble mean prediction.
  - Rank Histogram & Jensen-Shannon (JS) Distance:
    - Conceptual Definition: The rank histogram plots the frequency of the rank of the true observation when placed within the sorted ensemble members. For a reliable ensemble, all ranks are equally likely, resulting in a flat histogram; a U-shaped histogram indicates underdispersion. The JS Distance measures the dissimilarity between the observed rank histogram and a perfectly uniform distribution. A lower JS distance is better.
    - Formula (JS Distance): $D_{\mathrm{JS}}(P, U) = \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(P \,\|\, M) + \tfrac{1}{2} D_{\mathrm{KL}}(U \,\|\, M)}$, where $M = \tfrac{1}{2}(P + U)$.
    - Symbol Explanation: $P$ is the observed rank-histogram distribution, $U$ is the uniform distribution, and $D_{\mathrm{KL}}$ is the Kullback-Leibler divergence, a measure of how one probability distribution diverges from another. (A small computational sketch of these ensemble metrics appears at the end of this section.)
- Baselines:
  - The primary baseline is a DM based on the Palette model (Saharia et al. 2022), a U-Net-based diffusion model designed for image-to-image tasks.
  - The DM also uses residual learning, predicting the difference between the HR data and the LR input, a technique known to improve performance.
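As a complement to the metric definitions above, here is a minimal NumPy sketch of the ensemble diagnostics (spread-skill ratio, rank histogram, and JS distance); the function names and synthetic data are illustrative assumptions, not the authors' code:

```python
import numpy as np


def spread_skill(ensemble: np.ndarray, truth: np.ndarray):
    """Ensemble spread, RMSE of the ensemble mean, and their ratio.

    ensemble: [M, N] array (M members, N grid points); truth: [N] array.
    """
    mean = ensemble.mean(axis=0)
    rmse = np.sqrt(np.mean((mean - truth) ** 2))
    spread = np.sqrt(np.mean(ensemble.var(axis=0, ddof=1)))  # member std, grid-averaged
    return spread, rmse, spread / rmse


def rank_histogram(ensemble: np.ndarray, truth: np.ndarray) -> np.ndarray:
    """Normalized histogram of the truth's rank within the sorted ensemble (M + 1 bins)."""
    n_members = ensemble.shape[0]
    ranks = (ensemble < truth[None, :]).sum(axis=0)           # rank in 0..M at each point
    hist = np.bincount(ranks, minlength=n_members + 1).astype(float)
    return hist / hist.sum()


def js_distance(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """Jensen-Shannon distance between two discrete distributions."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return float(np.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=2000)                              # synthetic "HR truth"
    members = truth[None, :] + 0.5 * rng.normal(size=(64, 2000))
    spread, rmse, ratio = spread_skill(members, truth)
    hist = rank_histogram(members, truth)
    uniform = np.full_like(hist, 1.0 / hist.size)
    print(f"spread-skill ratio = {ratio:.3f}, JS distance = {js_distance(hist, uniform):.3f}")
```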
6. Results & Analysis
Core Results
- Qualitative Accuracy (Figure 3): Visual inspection of a sample case shows that both the DM (with 50 steps) and the SM (with only 10 steps) successfully generate fine-scale temperature patterns that are absent in the LR input and closely resemble the HR ground truth. This visually confirms that the SM can achieve high-quality results with significantly fewer computational steps. The error and spread maps show that uncertainty is higher in regions with larger errors, which is a desirable property.
  - Figure 3 of the paper compares super-resolution results for 2-m temperature over a 500 m × 500 m area around Yaesu, Tokyo. The upper row shows diffusion model (DM) results and the lower row shows Schrödinger-bridge model (SM) results, each including a single sample, the ensemble mean, the ground-truth HR data, the absolute error, and the ensemble standard deviation; the bottom of the middle column (c) shows the input LR data. The DM and SM use 50 and 10 diffusion steps, respectively.
- Quantitative Accuracy and Efficiency (Figure 4 and Table 1):
  - Figure 4 shows that the DM's performance degrades significantly when the number of diffusion steps falls below 50. In contrast, the SM maintains low error even with far fewer steps. This highlights the SM's superior efficiency and robustness to reduced sampling steps.
  - Table 1 provides a direct comparison. The SM with 10 steps achieves a slightly lower RMSE (0.306 K vs. 0.319 K) and a comparable SSIM Loss (0.106 vs. 0.105) relative to the DM with 50 steps. This confirms that the SM delivers comparable accuracy at one-fifth the computational cost.
  - Figure 4 of the paper plots the 2-m temperature RMSE and SSIM Loss against the number of diffusion time steps for the DM and the SM; the SM shows lower error and loss at small step counts, and the errors level off as the number of steps increases.
- Table 1 Transcription: Below is a manual transcription of the data from Table 1.

| Model | RMSE [K] | SSIM Loss |
| --- | --- | --- |
| DM (1 Member, NT = 50) | 0.319 ± 0.004 | 0.105 ± 0.004 |
| SM (1 Member, NT = 10) | 0.306 ± 0.007 | 0.106 ± 0.003 |
- Uncertainty Quantification (Figure 5 and Table 2):
  - Both models exhibit underdispersion (spread-skill ratio < 1 and U-shaped rank histograms), a common issue in deep learning-based forecasting.
  - However, the SM is better calibrated. Figure 5 shows its scatter plot of spread vs. RMSE is closer to the ideal 1:1 diagonal, and its rank histogram is visibly flatter than the DM's.
  - Table 2 quantifies this improvement. The SM has a higher Spread/RMSE ratio (0.657 vs. 0.641) and a lower JS distance (0.189 vs. 0.234), both indicating more appropriate ensemble variance and milder underdispersion.
  - Figure 5 of the paper shows scatter plots of ensemble spread versus RMSE and rank histograms for the DM and the SM. The numbers in the left panels give the mean ratio of spread to RMSE; the right panels report the Jensen-Shannon distance, where values closer to 0 indicate a more reliable rank distribution.
- Table 2 Transcription: Below is a manual transcription of the data from Table 2.

| Model | Spread-Skill Ratio (Spread/RMSE) | Jensen-Shannon Distance |
| --- | --- | --- |
| DM (64 Members, NT = 50) | 0.641 ± 0.024 | 0.234 ± 0.025 |
| SM (64 Members, NT = 10) | 0.657 ± 0.013 | 0.189 ± 0.015 |
- Total Inference Time and Real-Time Feasibility:
  - An HR physics-based simulation of a 60-minute forecast period took 206 minutes of wall-clock time.
  - The hybrid approach (LR simulation + SM inference) took only 6.31 minutes (6.19 min for the LR simulation + 7.29 s for SR). This represents a 32.7-fold speedup (a quick consistency check follows this list).
  - For a 64-member ensemble, the SM inference can be run in parallel with the LR simulation, leading to a total prediction time of about 8 minutes. This suggests that real-time, high-resolution ensemble forecasting is feasible with this method.
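As a quick arithmetic check of the reported speedup, using only the numbers quoted above:

$$
6.19\ \text{min} + 7.29\ \text{s} \approx 6.31\ \text{min}, \qquad \frac{206\ \text{min}}{6.31\ \text{min}} \approx 32.7 .
$$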
7. Conclusion & Reflections
- Conclusion Summary: The paper successfully demonstrates that a Schrödinger-bridge model (SM) is a highly effective and efficient tool for probabilistic super-resolution of urban micrometeorological data. The SM achieves accuracy comparable to a standard diffusion model (DM) with only 20% of the computational cost. Crucially, the SM also produces better-calibrated ensembles with more appropriate uncertainty quantification. These findings strongly motivate the exploration of SMs and related diffusion frameworks as practical, real-time solutions for accelerating weather and climate predictions.
- Limitations & Future Work:
  - Acknowledged Limitations: The authors acknowledge that both models still suffer from underdispersion, even though the SM performs better. This remains a common challenge for deep learning models in meteorology.
- Implicit Future Work: The paper concludes by suggesting that the success of the SM should encourage researchers to explore other types of diffusion processes beyond the standard DM for various meteorological applications.
- Personal Insights & Critique:
- Significance: This paper is significant for bridging the gap between cutting-edge generative AI research (Schrödinger Bridges) and a practical, high-impact application domain (real-time weather forecasting). The dual focus on computational efficiency and uncertainty quantification is particularly valuable, as both are critical for operational use.
- Transferability: The approach seems highly transferable to other variables (e.g., wind speed, humidity) and other SR tasks in Earth sciences. The use of static high-resolution inputs like building height is a key feature that likely contributes to the high performance and could be replicated in other domains with static geographical data.
  - Potential Improvements & Open Questions:
    - The comparison is based on a specific DM and SM formulation. The landscape of generative models is evolving rapidly, with many new acceleration techniques (e.g., DDIM, consistency models). A broader comparison incorporating these methods could provide a more complete picture of the efficiency trade-offs.
    - While the SM is more efficient, the underlying U-Net is still a large model. Exploring more lightweight network architectures could further reduce computational demands.
    - The study focuses on a single urban area. Testing the model's robustness and generalizability to different cities with varying climates and urban morphologies would be a critical next step for validating its real-world applicability.