Focus 1: Foundational Research in Trustworthy Artificial Intelligence / Machine Learning


Broad Goals

1. Develop explainable AI (XAI) methods aligned with environmental science domain perspectives and priorities

  • Develop XAI and interpretable methods for ES data (including regression-based predictions, data with high spatiotemporal autocorrelations, and fielded data)
  • Develop XAI and interpretable methods that integrate physics into the explanations 
  • Develop XAI methods to explain AI model failures 
  • Develop XAI methods that facilitate knowledge and hypothesis discovery 
  • Develop XAI approaches that effectively communicate estimated uncertainty to the end user (measured through RC research) and tailor these approaches to the end user's needs

Figure: Real-time storm morphology prediction with self-supervised convolutional neural networks (CNNs). A real-time visualization identifying supercells and squall-line segments for forecasters was built from a CNN trained on a proxy task. (NCAR Realtime Forecasts)

Figure: Convolutional neural network analysis of severe hailstorms. Saliency maps identify different types of storms that produce severe hail and enable generation of storm-type distributions. (Interpretable Deep Learning for Spatial Analysis of Severe Hailstorms, Gagne et al. 2019)
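
As an illustration of the gradient-based saliency maps referenced above, the following minimal sketch computes one for a single storm patch. It assumes a trained Keras CNN (here called "model") that maps a gridded radar patch to a severe-hail probability; the names and shapes are placeholders, not the configuration used in Gagne et al. (2019).

    # Minimal sketch of a gradient-based saliency map. Assumptions: a trained
    # Keras CNN named "model" mapping a gridded radar patch to a hail probability;
    # names and shapes are illustrative only.
    import numpy as np
    import tensorflow as tf

    def saliency_map(model, storm_patch):
        """Return |d(hail probability)/d(input)| for one (rows, cols, channels) patch."""
        x = tf.convert_to_tensor(storm_patch[np.newaxis, ...], dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(x)
            prob = model(x, training=False)[:, 0]   # predicted hail probability
        grad = tape.gradient(prob, x)               # sensitivity of the output to each pixel
        return np.abs(grad.numpy()[0])              # saliency with the same shape as the patch

    # Example use: overlay saliency_map(model, radar_patch) on the reflectivity
    # field (e.g., with matplotlib) to see which pixels drive the prediction.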

Progress:

Year 2
  • Developed a second benchmark to evaluate XAI methods for ES applications, this time for CNNs (paper submitted, figure above).
  • Published two papers on abstention networks for regression and classification tasks (Barnes and Barnes, 2021a,b).
  • Paper on interpretable AI using a “this looks like that” prototype network, with applications to climate prediction (Barnes et al., under review).
  • Presented a poster at an ICLR workshop on interpreting a novel cascade network structure for modeling climate change (Anderson and Stock, 2022).
  • Developed a tutorial on uncertainty quantification (UQ) in AI methods for ES applications (TAIES summer school 2022).
Year 1
  • Developed benchmark to evaluate XAI methods for ES applications (paper under review, figure above).
  • Paper: Analyzing Severe Storms in a Future Climate with XAI
  • Created PyMint XAI Python package
  • The NCAR AI group received ASOS and mPING precipitation-type data from Kim Elmore and is matching the data with RAP soundings.
  • NCAR AI research thrusts: temporal consistency and uncertainty in ML methods, and uncertainty in XAI outputs by combining XAI and UQ ML methods. We have assembled three benchmark datasets and plan to evaluate the sensitivity of XAI results to data and model perturbations.

Leaders: McGovern (OU), Gagne (NCAR), Ebert-Uphoff (CSU), Barnes (CSU)

Members: Bassill, Kurbanovas (Albany). Anderson, Bansal, Gordillo, Haynes, Lee, McGraw, Musgrave, Stock (CSU). Lagerquist (CSU/NOAA). Becker, Demuth, Gantos, Molina, Schreck (NCAR). Potvin, Stewart (NOAA). Hall (NVIDIA). Diochnos, Fagg, Homeyer (OU). Kamangir, King, Krell, Tissot, Vicens Miquel (TAMUCC). Bostrom (UW).

2. Develop physics-based AI techniques for environmental science domains

Physical constraints and physics-based AI/ML give us:

  • Robust feature creation
  • Physical, semantic, and redundancy constraints on generated features
  • Physics-constrained loss functions
  • Conditional hybrids of physical model and AI system predictions
  • Novel architectures for XAI

Overall Goal: Improved AI predictions and understanding of physically-based ES phenomena

  • Ensure AI methods always produce physically plausible and consistent solutions (a minimal sketch of a physics-constrained loss follows this goals list)
  • Develop physics-guided approaches to autonomous feature discovery
  • Develop hybrid models that incorporate physics-based AI
  • Develop geometric AI models that can learn features across different spatial and temporal resolutions
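
As a concrete illustration of the physics-constrained loss functions listed above, the sketch below adds a soft penalty for physically implausible (negative) precipitation to a standard mean squared error. It is a minimal example under assumed names and weights, not the institute's actual formulation; harder constraints (e.g., conservation budgets) would enter as additional penalty terms.

    # Minimal sketch of a physics-constrained loss for a Keras model that predicts
    # a precipitation field. The penalty weight "alpha" and the chosen constraint
    # (non-negative rain rates) are illustrative assumptions only.
    import tensorflow as tf

    def physics_constrained_loss(alpha=0.1):
        def loss(y_true, y_pred):
            mse = tf.reduce_mean(tf.square(y_true - y_pred))   # data-fit term
            negativity = tf.reduce_mean(tf.nn.relu(-y_pred))    # zero when all predicted rates are >= 0
            return mse + alpha * negativity                     # penalty discourages unphysical output
        return loss

    # model.compile(optimizer="adam", loss=physics_constrained_loss(alpha=0.1))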

Progress:

Year 2
  • Explored the use of topological data analysis (TDA) to extract physical properties from meteorological imagery (paper submitted to AIES journal, May 2022)
  • Explored the use of meteorologically meaningful loss functions for neural networks:
    • Prior year: Developed a practical guide for meteorologists, posted on arXiv in June 2021.
    • This year: Explored the use of Fourier and wavelet transforms in loss functions (paper accepted by the journal AIES pending major revisions, May 2022); a spectral-loss sketch follows this Progress list.
  • Evaluating the effect of changing the input vertical coordinate system on estimating precipitation type
Year 1
  • Developed a neural network algorithm with physics-based components for estimating radiative transfer (figure above), published July 2021
    (https://journals.ametsoc.org/view/journals/atot/aop/JTECH-D-21-0007.1/JTECH-D-21-0007.1.xml).
  • Developed a tutorial for implementing meteorologically meaningful loss functions for neural networks
    (https://arxiv.org/abs/2106.09757).
  • Evaluated temporal consistency of ML predictions and explanations
  • The CSU team conducted experiments with spatially enhanced loss functions for neural networks that use neighborhood filters or spectral filters (Fourier or wavelet transforms). Lagerquist and Ebert-Uphoff are writing up the results for a journal paper.
  • Gagne and Ebert-Uphoff are coordinating a visit by a CSU Mathematics PhD student to investigate the use of geometric deep learning on AI2ES use cases.
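
To make the spectral-filter idea above concrete, the sketch below adds a Fourier-amplitude term to a pixelwise mean squared error. This is only one plausible reading of the approach, under assumed tensor shapes and an arbitrary weighting; the formulation in the AIES paper may differ.

    # Sketch of a loss with a Fourier-space term: pixelwise MSE plus the MSE of
    # the 2-D FFT amplitude spectra. Assumes (batch, rows, cols, channels) tensors;
    # the weight of 0.5 is a placeholder.
    import tensorflow as tf

    def spectral_mse(y_true, y_pred, weight=0.5):
        pixel_term = tf.reduce_mean(tf.square(y_true - y_pred))
        # Move channels ahead of the spatial dimensions so fft2d acts on rows and columns.
        ft = tf.signal.fft2d(tf.cast(tf.transpose(y_true, [0, 3, 1, 2]), tf.complex64))
        fp = tf.signal.fft2d(tf.cast(tf.transpose(y_pred, [0, 3, 1, 2]), tf.complex64))
        spectral_term = tf.reduce_mean(tf.square(tf.abs(ft) - tf.abs(fp)))
        return pixel_term + weight * spectral_term

    # model.compile(optimizer="adam", loss=spectral_mse)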

Leaders: McGovern (OU), Hickey (Google), Gagne (NCAR), Ebert-Uphoff (CSU)

Members: Kurbanovas, Sulia (Albany). Anderson, Barnes, Gordillo, Lee, McGraw, Musgrave, Stock, Ver Hoef (CSU). Lagerquist (CSU/NOAA). Becker, Gantos, Schreck (NCAR). Potvin, Stewart (NOAA). Chase, Snook (OU). Kamangir (TAMUCC).

3. Develop robust AI prediction techniques and empirically and theoretically validate their performance on adversarial data (e.g., missing data or intentionally wrong data).

Some ES datasets are limited by data-collection challenges and the rarity of extreme events. AI2ES will address robust AI (theoretically and empirically) through the following approaches:

  • Class imbalance: characterize the theoretical sample size needed to meet important verification metrics
  • Transfer learning: train ML on common phenomena and transfer to rare events; train within simulations and transfer to observations
  • Self-supervised learning: identify proxy supervised-learning tasks that most aid prediction when reliable labels are expensive or unavailable
  • Adversarial data / classifier robustness: provide theoretical guarantees for the robustness of ES ML models to intentionally or accidentally adversarial data
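
The following toy sketch illustrates the kind of corruption-robustness evaluation this thrust targets: fit a classifier, then measure how its accuracy degrades as a growing fraction of test-set feature values is randomly corrupted. Synthetic, imbalanced data stands in for the real wind-prediction dataset, and the model and corruption scheme are assumptions, not the published experimental setup.

    # Toy corruption-robustness check: how does accuracy degrade as more test-set
    # entries are replaced by random noise? Data, model, and corruption levels are
    # illustrative placeholders.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                               random_state=0)               # imbalanced classes
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

    for frac in (0.0, 0.1, 0.3, 0.5):
        X_bad = X_te.copy()
        mask = rng.random(X_bad.shape) < frac                 # corrupt this fraction of entries
        X_bad[mask] = rng.normal(size=mask.sum())             # replace with random noise
        print(f"corrupted fraction {frac:.1f}: accuracy {clf.score(X_bad, y_te):.3f}")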

Goals:

  • Develop robust semi-supervised and unsupervised learning algorithms for situations where reliable labels are not available
  • Develop theoretical and practical bounds on the robustness of AI methods given class imbalance, a lack of reliable labels, and adversarial situations (e.g., data may be missing or corrupted based on weather conditions)
  • Ensure AI methods are robust to noise (e.g., missing data, (un)intentionally wrong data, or different regimes)

Figure: El Reno Mesonet station after recording a 151 mph wind gust on May 24, 2011.
Figure: Map of Oklahoma Mesonet highest wind gusts by county, May 24, 2011.

Progress:

Year 2
  • Algorithms: Studied the effect of regularization methods as defense mechanisms in situations with randomly corrupted datasets (datasets of reduced size). Improved the semi-supervised storm-mode algorithm to be competitive with fully supervised approaches.
  • Data: Conducted an experimental evaluation using a high-dimensional wind-prediction dataset from McGovern et al. (2021).
  • Publications: Published results in the AAAI student abstract track (Flansburg and Diochnos, 2022).
  • Publications: Barnes and Barnes (2021a,b) published in JAMES.
  • White papers: Posted a paper on arXiv summarizing how to add uncertainty estimates to neural network regression tasks while accounting for heteroscedastic, asymmetric uncertainties; a minimal sketch follows this Progress list.
Year 1
  • Developed and refined methods for neural networks to say “I don't know” on uncertain samples and learn the more confident samples better, termed controlled abstention networks (CANs)
  • Submitted two papers on CANs (Barnes and Barnes, 2021a,b) to JAMES for earth science applications
  • Subdivision algorithm for computing a lower bound on the rate of the minority class
  • Reduction of a learning problem requiring high recall and high precision to a learning problem with an improved risk bound, within the Probably Approximately Correct (PAC) learning framework
  • Paper published at the SIAM International Conference on Data Mining (SDM 2021)
  • “Wind Prediction under Random Data Corruption (Student Abstract)” paper accepted in the Student Abstract track of the AAAI Conference on Artificial Intelligence (AAAI), 2022. This is the outcome of the REU work of Conner Flansburg with Dimitris Diochnos for the summer of 2021.
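
As a minimal sketch of the heteroscedastic-uncertainty idea from the white paper noted under Year 2, the network below outputs a mean and a log-variance and is trained with the Gaussian negative log-likelihood. The layer sizes are hypothetical, and the asymmetric case described in the white paper would require a more flexible predictive distribution than the symmetric Gaussian used here.

    # Minimal heteroscedastic-regression sketch: the network predicts a mean and a
    # log-variance per sample and is trained with the Gaussian negative
    # log-likelihood. Layer sizes and input dimension are illustrative assumptions.
    import tensorflow as tf

    def gaussian_nll(y_true, y_pred):
        """y_pred[..., 0] is the predicted mean, y_pred[..., 1] the predicted log-variance."""
        mu, log_var = y_pred[..., 0], y_pred[..., 1]
        return tf.reduce_mean(0.5 * (log_var + tf.square(y_true[..., 0] - mu) * tf.exp(-log_var)))

    inputs = tf.keras.Input(shape=(10,))
    hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(2)(hidden)      # [mean, log-variance]
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss=gaussian_nll)
    # model.fit(x_train, y_train, ...)  # y_train shaped (n_samples, 1)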

Leaders: McGovern (OU), Gagne (NCAR), Diochnos (OU)

Members: Kurbanovas, Thorncroft (Albany). Anderson, Barnes, Ebert-Uphoff, Haynes, McGraw, Musgrave, Stock (CSU). Lagerquist (CSU/NOAA). Williams (IBM). Becker, Molina (NCAR). Potvin, Stewart (NOAA). Hall (NVIDIA). Fagg, Rothenberger, Wilson (OU). Medrano, Vicens Miquel (TAMUCC).