Overview Schedule

All times are in PST (Vancouver time)

Start | End | Session | Session Chair | Keynotes / Panelists
06:45 | 06:55 | Welcome | S. Karthik Mukkavilli | ---
06:55 | 08:55 | Sensors and Sampling | Johanna Hansen | Yogesh Girdhar, Hannah Kerner, Renaud Detry, & Greg Dudek
08:55 | 10:55 | Ecology | Natasha Dudek | Dan Morris & Giulio De Leo
10:55 | 12:45 | Water | S. Karthik Mukkavilli | Pierre Gentine
12:45 | 13:25 | Keynote: Milind Tambe | S. Karthik Mukkavilli | Milind Tambe
13:25 | 15:25 | Atmosphere | Tom Beucler | Michael Pritchard & Elizabeth Barnes
15:25 | 17:20 | ML Theory | Karthik Kashinath | Stephan Mandt & Rose Yu
17:20 | 18:00 | People-Earth | Mayur Mudigonda | Dan Kammen, Milind Tambe, & Giulio De Leo
18:00 | 19:00 | Solid-Earth | Kelly Kochanski | ---
19:00 | 20:55 | Datasets | Karthik Kashinath | Stephan Rasp
20:55 | 21:00 | Closing Remarks | Organizers | ---

Join our slack for live Q&A

Live stream on neurips.cc


# Start Time Video Title Author(s) Details
0 6:45 Introduction Opening Remarks Karthik Mukkavilli (UC Irvine) Short introduction to the session

Jump to: Overview - Sensors - Ecology - Water - Keynote - Atmosphere - Theory - People-Earth - Solid-Earth - Datasets


# Start Time Video Title Author(s) Details
0 6:55 Introduction Introduction Johanna Hansen (McGill, Mila) Short introduction to the session
1 6:58 Session Keynote Enabling Vision Guided Interactive Exploration in Bandwidth Limited Environments Yogesh Girdhar (WHOI) WARPLab's research focuses on both the science and systems of exploration robots in extreme, communication starved environments such as the deep sea. It aims to develop robotics and machine learning-based techniques to enable search, discovery, and mapping of natural phenomena that are difficult to observe and study due to various physical and information-theoretic challenges.

WARPLab is headed by Yogesh Girdhar, and is part of the Deep Submergence Laboratory (DSL), and the Applied Ocean Physics & Engineering (AOPE) department at Woods Hole Oceanographic Institution.

2 7:22 Invited Talk Eyes in the sky without boots on the ground: Using satellites and machine learning to monitor agriculture and food security during COVID-19 Hannah Kerner (University of Maryland College Park) Hannah Kerner is an Assistant Research Professor at the University of Maryland, College Park. Her research focuses on developing machine learning solutions for remote sensing applications in agricultural monitoring, food security, and Earth/planetary science. She is the Machine Learning Lead and U.S. Domestic Co-Lead for NASA Harvest, NASA’s food security initiative run out of the University of Maryland.
3 7:35 Invited Talk Autonomous robot manipulation for planetary science: Mars Sample Return & Climbing Lava Tubes Renaud Detry (Jet Propulsion Lab, UCLouvain) This talk will highlight work at NASA on robotic missions from a machine vision perspective. The discussion will focus on the science questions that NASA hopes to answer through returned samples from Mars and the challenges imposed on robotic systems used for scientific data collection.

Renaud Detry is the group leader for the Perception Systems group at NASA's Jet Propulsion Laboratory (JPL). Detry earned his Master's and Ph.D. degrees in computer engineering and robot learning from ULiege in 2006 and 2010. He served as a postdoc at KTH and ULiege between 2011 and 2015, before joining the Robotics and Mobility Section at JPL in 2016. His research interests are perception and learning for manipulation, robot grasping, and mobility, for terrestrial and planetary applications. At JPL, Detry leads the machine-vision team of the Mars Sample Return surface mission, and he leads and contributes to a variety of research projects related to industrial robot manipulation, orbital image understanding, in-space assembly, and autonomous wheeled or legged mobility for Mars, Europa, and Enceladus.

4 7:58 Invited Paper DeepFish: A realistic fish‑habitat dataset to evaluate algorithms for underwater visual analysis Alzayat Saleh (James Cook University), Issam Laradji, Dmitry Konovalov, Michael Bradley, David Vazquez, Marcus Sheaves Visual analysis of complex fish habitats is an important step towards sustainable fisheries for human consumption and environmental protection. Deep Learning methods have shown great promise for scene analysis when trained on large-scale datasets. However, current datasets for fish analysis tend to focus on the classification task within constrained, plain environments which do not capture the complexity of underwater fish habitats. To address this limitation, we present DeepFish as a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks. The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia.
5 8:06 Invited Paper Automatic 3D Mapping for Tree Diameter Measurements in Inventory Operations Jean-Francois Tremblay (McGill), Martin Beland (U Laval), Francois Pomerleau (U Laval), Richard Gagnon, Philippe Giguere (U Laval) Forestry is a major industry in many parts of the world, yet this potential application domain has been overlooked by the robotics community. For instance, forest inventory, a cornerstone of efficient and sustainable forestry, is still traditionally performed manually by qualified professionals. The lack of automation in this particular task, consisting chiefly of measuring tree attributes, limits its speed, and, therefore, the area that can be economically covered. To this effect, we propose to use recent advancements in three‐dimensional mapping approaches in forests to automatically measure tree diameters from mobile robot observations. While previous studies showed the potential for such technology, they lacked a rigorous analysis of diameter estimation methods in challenging and large‐scale forest environments. Here, we validated multiple diameter estimation methods, including two novel ones, in a new publicly‐available dataset which includes four different forest sites, 11 trajectories, totaling 1458 tree observations, and 14,000 m².
7 8:20 Discussion Q/A and Discussion Johanna Hansen (McGill, Mila) Post your questions to slack to hear from our authors in live Q&A. Discussion panelists include: Greg Dudek (McGill, SamsungAI)
8 --- On-Demand Spectral Unmixing With Multinomial Mixture Kernel and Wasserstein Generative Adversarial Loss Savas Ozkan (METU); Gozde Akar (METU) This study proposes a novel framework for spectral unmixing by using 1D convolution kernels and spectral uncertainty. High-level representations are computed from data, and they are further modeled with the Multinomial Mixture Model to estimate fractions under severe spectral uncertainty. Furthermore, a new trainable uncertainty term based on a nonlinear neural network model is introduced in the reconstruction step. All uncertainty models are optimized by Wasserstein Generative Adversarial Network (WGAN) to improve stability and capture uncertainty. Experiments are performed on both real and synthetic datasets. The results validate that the proposed method obtains state-of-the-art performance, especially for the real datasets compared to the baselines.
9 --- On-Demand Interpretability in Convolutional Neural Networks for Building Damage Classification in Satellite Imagery Thomas Y Chen (The Academy for Mathematics, Science, and Engineering) Natural disasters ravage the world's cities, valleys, and shores on a monthly basis. Having precise and efficient mechanisms for assessing infrastructure damage is essential to channel resources and minimize the loss of life. Using a dataset that includes labeled pre- and post- disaster satellite imagery, we train multiple convolutional neural networks to assess building damage on a per-building basis. In order to investigate how to best classify building damage, we present a highly interpretable deep-learning methodology that seeks to explicitly convey the most useful information required to train an accurate classification model. We also delve into which loss functions best optimize these models. Our findings include that ordinal-cross entropy loss is the most optimal loss function to use and that including the type of disaster that caused the damage in combination with a pre- and post-disaster image best predicts the level of damage caused. Our research seeks to computationally contribute to aiding in this ongoing and growing humanitarian crisis, heightened by climate change.
10 --- On-Demand Domain Adaptive Shake-shake Residual Network for Corn Disease Recognition Yuan Fang (University of Waterloo); Linlin Xu (University of Waterloo); Yuhao Chen (University of Waterloo); Alexander Wong (University of Waterloo) Although advanced machine learning models are critical for hand-held camera-based corn disease detection, they are challenged by several key difficulties, e.g., subtle crop disease signatures and limited training samples from heterogeneous multi-source datasets. The paper presents a novel Domain Adaptive shake-shake Residual neural Network approach (DARNet) to simultaneously address these challenges. The proposed DARNet approach has the following characteristics. First, to efficiently capture the weak disease signature information from insufficient training samples, we build into DARNet a novel deep multi-branched residual neural network architecture with shake-shake regularization, where the multi-branched residual architecture is designed for efficient learning of the subtle disease signature, and the shake-shake regularization is designed to overcome the overfitting problem that usually happens in deep feature learning from insufficient dataset. Second, to efficiently learn disease signature information from both indoor and outdoor crop datasets, DARNet is designed to involve some rule-based domain mapping functions and transfer learning procedures to minimize the gaps between the two datasets and to enable transfer learning among two domains. The proposed DARNet is evaluated on two heterogeneous datasets produced from an indoor and an in-field environment respectively. DARNet is trained on both indoor and in-field corn disease datasets and tested on the in-field dataset. 
DARNet achieves a high test accuracy of 89.25%, demonstrating that the proposed machine learning approach can efficiently capture the subtle disease signature information from heterogeneous datasets with limited training samples and therefore is promising for developing hand-held camera-based crop disease recognition techniques.
11 --- On-Demand Towards Automated Satellite Conjunction Management with Bayesian Deep Learning Francesco Pinto (University of Oxford); Giacomo Acciarini (University of Strathclyde); Sascha Metz (Technische Universität Darmstadt); Sarah Boufelja (IBM); Sylvester Kaczmarek (Imperial College London); Klaus Merz (European Space Agency); Jose Antonio Martinez Heras (European Space Agency); Francesca Letizia (European Space Agency); Christopher Bridges (University of Surrey); Atilim Gunes Baydin (University of Oxford) After decades of space travel, low Earth orbit is a junkyard of discarded rocket bodies, dead satellites, and millions of pieces of debris from collisions and explosions. Objects in high enough altitudes do not re-enter and burn up in the atmosphere, but stay in orbit around Earth for a long time. With a speed of 28,000 km/h, collisions in these orbits can generate fragments and potentially trigger a cascade of more collisions known as the Kessler syndrome. This could pose a planetary challenge, because the phenomenon could escalate to the point of hindering future space operations and damaging satellite infrastructure critical for space and Earth science applications. As commercial entities place mega-constellations of satellites in orbit, the burden on operators conducting collision avoidance manoeuvres will increase. For this reason, development of automated tools that predict potential collision events (conjunctions) is critical. We introduce a Bayesian deep learning approach to this problem, and develop recurrent neural network architectures (LSTMs) that work with time series of conjunction data messages (CDMs), a standard data format used by the space community. We show that our method can be used to model all CDM features simultaneously, including the time of arrival of future CDMs, providing predictions of conjunction event evolution with associated uncertainties.



# Start Time Video Title Author(s) Details
0 8:55 Introduction Introduction Natasha Dudek (McGill, Mila) Short introduction to the session
1 9:00 Keynote Taxonomizing “Impact” in AI for Sustainability Dan Morris (Microsoft) Dan Morris will present an introduction to Microsoft’s AI for Earth initiative. The talk will be a tour through a few examples where AI is having a positive impact on environmental sustainability.  More importantly, we’ll put those examples into several categories that represent broad “flavors” of impact, not *only* because I like making lists (although I do like making lists), but to show the ML community the variety of ways in which ML advances can have positive environmental impact, and hopefully encourage NeurIPS folks to think about the shortest path between your own expertise and environmental sustainability, incrementally advancing my secret mission of getting *everyone* in the ML community working on conservation issues.  Finally, we’ll discuss the generally-favorable-but-lacking-in-some-areas state of publicly-available training data for sustainability and conservation problems.
2 9:25 Session Keynote ML and control of parasitic diseases of poverty in tropical and subtropical countries, with a special focus on schistosomiasis Giulio De Leo (Stanford) ML and control of parasitic diseases of poverty in tropical and subtropical countries, with a special focus on schistosomiasis. Giulio De Leo is a Professor at Stanford University and a Senior Fellow at Stanford Woods Institute for the Environment.
His lab works on theoretical ecology with a focus on disease ecology, marine conservation, and public health.
3 9:55 Regular Talk Graph Learning for Inverse Landscape Genetics Prathamesh Dharangutte (New York University)*; Christopher Musco (New York University) Inferring unknown edges from data at a graph's nodes is a common problem across statistics and machine learning. We study a version that arises in the field of landscape genetics, where genetic similarity between organisms living in a heterogeneous landscape is explained by a graph that encodes the ease of dispersal through that landscape.
Our main contribution is an efficient algorithm for inverse landscape genetics, the task of inferring edges in this graph based on the similarity of genetic data from populations at different nodes. This problem is important in discovering impediments to dispersal that threaten biodiversity and species survival. Drawing on influential work that models dispersal using graph effective resistances (McRae 2006), we reduce the inverse landscape genetics problem to that of inferring graph edges from noisy measurements of these resistances. Then, building on edge-inference techniques for social networks (Hoskins et al. 2018), we develop an efficient first-order optimization method for solving the problem, which significantly outperforms existing techniques in experiments on synthetic and real genetic data.
4 10:05 Regular Talk Segmentation of Soil Degradation Sites in Swiss Alpine Grasslands with Deep Learning Maxim Samarin (University of Basel)*; Lauren Zweifel (University of Basel); Christine Alewell (University of Basel); Volker Roth (University of Basel) Soil degradation is an important environmental problem which affects the Alpine ecosystem and agriculture. Research results suggest that soil degradation in Swiss Alpine grasslands has increased in recent years and it is expected to increase further due to climate and land-use change.
However, reliably quantifying the increase in spatial extent of soil degradation is a challenging task. Although methods like Object-based Image Analysis (OBIA) can provide precise detection of erosion sites, an efficient large scale investigation is not feasible due to the labour intensive nature and lack of transferability of the method. In this study, we overcome these limitations by adapting the fully convolutional neural network U-Net trained on high-quality training data provided by OBIA to enable efficient segmentation of erosion sites in high-resolution aerial images. We find that segmentation results of both methods, OBIA and U-Net, are generally in good agreement, but display method-specific differences, with an overall precision of 73% and recall of 84%. Importantly, both methods indicate an increase in soil degradation for a case study region over a 16-year period of 167% and 201% for OBIA and U-Net, respectively. Furthermore, we show that the U-Net approach transfers well to new regions (within our study region) and data from subsequent years, even when trained on a comparably small training dataset. Thus the proposed approach enables large scale analysis in Swiss Alpine grasslands and provides a tool for reliable assessment of temporal changes in soil degradation.
5 10:15 Lightning Talk Novel application of Convolutional Neural Networks for the meta-modeling of large-scale spatial data Kiri A Stern (Université de Montréal)*; Timothée Poisot (Université de Montréal) Species connectivity models play an important role in ecological research and biodiversity assessment. Unfortunately, simulations of connectivity models are typically slow, therefore preventing the rapid iteration and updates of models when evaluating different scenarios.
In this pilot study, we present a proof of concept of utilizing Deep Learning methodologies as a novel approach in ecology for significantly reducing the prediction time of species connectivity models.
6 10:20 Lightning Talk Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality Miguel M Morata Dolz (University of Valencia)*; Gustau Camps-Valls (Universitat de València) Global warming is leading to unprecedented changes in our planet, with great societal, economical and environmental implications, especially with the growing demand of biofuels and food. Assessing the impact of climate on vegetation is of pressing need.
We approached the attribution problem with a novel nonlinear Granger causal (GC) methodology and used a large data archive of remote sensing satellite products, environmental and climatic variables spatio-temporally gridded over more than 30 years. We generalize kernel Granger causality by considering the variables' cross-relations explicitly in Hilbert spaces, and use the covariance in Gaussian processes. The method generalizes the linear and kernel GC methods, and comes with tighter bounds of performance based on Rademacher complexity. Spatially-explicit global Granger footprints of precipitation and soil moisture on vegetation greenness are identified more sharply than previous GC methods.
7 10:25 Lightning Talk Interpreting the Impact of Weather on Crop Yield Using Attention Tryambak Gangopadhyay (Iowa State University)*; Johnathon Shook (Iowa State University); Asheesh K. Singh (Iowa State University); Soumik Sarkar (Iowa State University) Accurate prediction of crop yield supported by scientific and domain-relevant interpretations can improve agricultural breeding by providing monitoring across diverse climatic conditions. The use of this information in plant breeding can help provide protection against weather challenges to crop production, including erratic rainfall and temperature variations.
In addition to isolating the important time-steps, researchers are interested in understanding the effect of different weather variables on crop yield. In this paper, we propose a novel attention-based model that can learn the most significant variables across different weeks in the crop growing season and highlight the most important time-steps (weeks) to predict the annual crop yield. We demonstrate our model's performance on a dataset based on historical performance records from Uniform Soybean Tests (UST) in North America. The interpretations provided by our model can help in understanding the impact of weather variability on agricultural production in the presence of climate change and formulating breeding strategies to circumvent these climatic challenges.
8 10:30 Discussion Q/A and Discussion Natasha Dudek (McGill, Mila) Post your questions to slack to hear from our authors in live Q&A.



# Start Time Video Title Author(s) Details
0 10:55 Introduction Introduction Karthik Mukkavilli (UC Irvine) Short introduction to the session
1 11:00 Session Keynote Hybrid Hydrological Modeling Pierre Gentine (Columbia) Pierre Gentine is an Associate Professor in the department of Earth and Environmental Engineering in the School of Engineering and Applied Sciences. He is an Investigator in the Columbia Water Center and a director of the Graduate Program in Earth and Environmental Engineering.
Dr. Gentine investigates the continental hydrologic cycle through land-atmosphere interaction, boundary layer turbulence, convection, ecohydrology and remote sensing.
2 11:25 Spotlight Talk A Machine Learner's Guide to Streamflow Prediction Martin Gauch (University of Waterloo)*; Daniel Klotz (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Frederik Kratzert (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Grey Nearing (Department of Geological Sciences, University of Alabama, Tuscaloosa, AL United States); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Jimmy Lin (University of Waterloo) Although often subconsciously, many people deal with water-related issues on a daily basis. For instance, many regions rely on hydropower plants to produce their electricity, and, at the extreme, floods and droughts pose one of the big environmental threats of climate change.
At the same time, many machine learning researchers have started to look beyond their field and wish to contribute to environmental issues of our time. The modeling of streamflow—the amount of water that flows through a river cross-section at a given time—is a natural starting point to such contributions: It encompasses a variety of tasks that will be familiar to machine learning researchers, but it is also a vital component of flood and drought prediction (among other applications). Moreover, researchers can draw upon large open datasets, sensory networks, and remote sensing data to train their models. As a getting-started resource, this guide provides a brief introduction to streamflow modeling for machine learning researchers and highlights a number of possible research directions where machine learning could advance the domain.
3 11:40 Spotlight Talk A Deep Learning Architecture for Conservative Dynamical Systems: Application to Rainfall-Runoff Modeling Grey S Nearing (Google Research)*; Frederik Kratzert (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Daniel Klotz (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Hoshin Gupta (University of Arizona); Sella Nevo (Google Research); Yossi Matias The most accurate and generalizable rainfall-runoff models produced by the hydrological sciences community to-date are based on deep learning, and in particular, on Long Short Term Memory networks (LSTMs). Although LSTMs have an explicit state space and gates that mimic input-state-output relationships, these models are not based on physical principles.
We propose a deep learning architecture that is based on the LSTM and obeys conservation principles. The model is benchmarked on the mass-conservation problem of simulating streamflow.
4 11:55 Spotlight Talk Dynamic Hydrology Maps from Satellite-LiDAR Fusion Gonzalo Mateo-Garcia (University of Valencia)*; Dolores Garcia (Imdea Networks); Hannes Bernhardt (School of Earth and Space Exploration, Arizona State University); Ron Hagensieker (Osir.io); Ignacio Lopez-Francos (NASA); Johnatan Stock (USGS); Guy Schumann (RSS-Hydro); Kevin Dobbs (Frontier Development Lab); Alfredo Kalaitzis (University of Oxford) Where are the Earth's streams flowing right now? Inland surface waters expand with floods and contract with droughts. There is no one map of our streams.
Current satellite approaches are limited to monthly observations that map only the widest streams. These are fed by smaller tributaries that make up much of the dendritic surface network but whose flow is unobserved. A complete map of our daily waters can give us an early warning for where droughts are born: the receding tips of the flowing network. Mapping them over years can give us a map of impermanence of our waters, showing where to expect water, and where not to. To that end, we feed the latest high-res sensor data to multiple deep learning models in order to map these flowing networks every day, stacking the time series maps over many years. Specifically, i) we enhance water segmentation to 50 cm/pixel resolution, a 60x improvement over previous state-of-the-art results. Our U-Net trained on 30-40cm WorldView3 images can detect streams as narrow as 1-3m (30-60x over SOTA). Our multi-sensor, multi-res variant, WasserNetz, fuses a multi-day window of 3m PlanetScope imagery with 1m LiDAR data, to detect streams 5-7m wide. Both U-Nets produce a water probability map at the pixel-level. ii) We integrate this water map over a DEM-derived synthetic valley network map, to produce a snapshot of flow at the stream level. iii) We apply this pipeline to a 2-year daily PlanetScope time-series of three watersheds in the US to produce the first high-fidelity dynamic map of stream flow frequency. The end result is a new map that, if applied at the national scale, could fundamentally improve how we manage our water resources around the world.
5 12:10 Regular Talk Efficient Reservoir Management through Deep Reinforcement Learning Xinrun Wang (Nanyang Technological University)*; Tarun Nair (Ashoka Trust for Research in Ecology and the Environment); Haoyang Li (Nanyang Technological University); Yuh Sheng Reuben Wong (Nanyang Technological University); Nachiket Kelkar (Ashoka Trust for Research in Ecology and the Environment); Srinvias Vaidyanathan (FERAL); Rajat Nayak (FERAL); Bo An (Nanyang Technological University); Jagdish Krishnaswamy (ATREE); Milind Tambe (Google Research, India) Dams impact downstream river dynamics through flow regulation and disruption of upstream-downstream linkages. However, current dam operation is far from satisfactory due to the inability to respond to the complicated and uncertain dynamics of the upstream-downstream system and various usages of the reservoir.
Furthermore, unsuitable dam operation can cause floods in downstream areas. Therefore, we leverage reinforcement learning (RL) methods to compute efficient dam operation guidelines in this work. Specifically, we build offline simulators with real data and different mathematical models for the upstream inflow, i.e., generalized least squares (GLS) and dynamic linear model (DLM), then use the simulator to train the state-of-the-art RL algorithms, including DDPG, TD3 and SAC. Experiments show that the simulator with DLM can efficiently model the inflow dynamics in the upstream and the dam operation policies trained by RL algorithms significantly outperform the human-generated policy.
6 12:20 Discussion Q/A and Discussion Karthik Mukkavilli (UC Irvine) Post your questions to slack to hear from our authors in live Q&A.
7 --- On-Demand Predicting Streamflow By Using BiLSTM with Attention from heterogeneous spatiotemporal remote sensing products Anshuman Yadav (IIT Gandhinagar); Pravin Bhasme (IIT Gandhinagar); Udit Bhatia (IIT Gandhinagar)*; Nipun Batra (IIT Gandhinagar) Assessment of water resources is essential for demand management in changing climate scenarios. Current modeling approaches for assessing hydrological extremes and flood hazards often rely on either physics-based parameterized models or data science methods.
Physics-based models tend to implicitly assume a structure resulting in unrealistic parameter values and often run into issues of equifinality. In contrast, the wide availability of remote sensing datasets encouraged and attracted data scientists for modeling different physical processes. The capacity of data science approaches to understand complex dependencies in hydrometeorological datasets promotes its application in the prediction of different variables. We believe ours is the first study that proposes a Bidirectional LSTM with Attention for image-based covariates in hydrology for knowledge discovery from heterogeneous data products and demonstrates its applicability to predict daily streamflow for the Susquehanna river basin. Model performance is compared against state-of-the-art deep learning architectures and baseline approaches, including ANNs, LSTMs, and Elastic Net. Results show that the proposed methodology outperforms the baselines at the river basin scale. This study highlights the applicability of advanced machine learning approaches in the prediction of streamflow.
8 --- On-Demand Inductive Predictions of Extreme Hydrologic Events in The Wabash River Watershed Nicholas H Majeske (Indiana University)*; Ariful Azad (Intelligent System Engineering, Indiana University, Bloomington, IN); Lei Gong (Indiana University); Chen Zhu (Indiana University); Bidisha Faruque Abesh (Indiana University) We present a machine learning method to predict extreme hydrologic events from spatially and temporally varying hydrological and meteorological data. We used a timestep reduction technique to reduce the computational and memory requirements and trained a bidirectional LSTM network to predict soil water and stream flow from time series data observed and simulated over eighty years in the Wabash River Watershed.
We show that our simple model can be trained much faster than complex attention networks such as GeoMAN without sacrificing accuracy. Based on the predicted values of soil water and stream flow, we predict the occurrence and severity of extreme hydrologic events such as drought with high accuracy. We also demonstrate that extreme events can be predicted in one geographical location even if the model is trained with data from a different location. This spatially-inductive setting can be very useful to predict extreme events in other areas in the US using our model trained with the Wabash Basin data.
9 --- On-Demand A Comparison of Data-Driven Models for Predicting Stream Water Temperature Helen Weierbach (Lawrence Berkeley National Lab)*; Charuleka Varadharajan (Lawrence Berkeley National Lab); Aranildo Lima (Aquatic Informatics); Boris Faybishenko (Lawrence Berkeley National Lab); Val Hendrix (Lawrence Berkeley National Lab); Danielle Christianson (Lawrence Berkeley National Lab) Changes to the Earth's climate are expected to negatively impact water resources in the future. It is important to have accurate modelling of river flow and water quality to make optimal decisions for water management.
Machine learning and deep learning models have become promising methods for making such hydrological predictions. Using these models, however, requires careful consideration both of data constraints and of model complexity for a given problem. Here, we use machine learning models to predict monthly water temperature records at three monitoring locations in the Northwestern United States with long-term datasets, using meteorological data as predictors. We fit four models for comparison: a Multiple Linear Regression, a Random Forest Regression, a Support Vector Regression and a baseline persistence model. We show that all models are reasonably able to predict mean monthly stream temperatures with root mean-squared errors (RMSE) ranging from 0.63-0.91 degrees Celsius. Of the four models, the Support Vector Regression performs the best with an error of 0.63 degrees Celsius. However, all models perform poorly on extreme values of water temperature. We identify the need for machine learning approaches to predicting extreme values for variables such as water temperature, since it has significant implications for stream ecosystems and biota.


# Start Time Video Title Author(s) Details
0 12:45 Keynote AI for Conservation and Public Health: Learning and Planning in the Data to Deployment Pipeline Milind Tambe (Harvard, Google) Milind Tambe is Gordon McKay Professor of Computer Science and Director of the Center for Research on Computation and Society at Harvard University; he is also Director of "AI for Social Good" at Google Research India.
1 13:15 Discussion Q/A and Discussion Karthik Mukkavilli (UC Irvine) Post your questions to slack to hear from our authors in live Q&A.


# Start Time Video Title Author(s) Details
0 13:25 Introduction Introduction Tom Beucler (UCI, Columbia) Short introduction to the session
1 13:30 Session Keynote Towards Robust Neural Network Parameterizations of Convection Michael Pritchard (University of California, Irvine) Michael Pritchard’s expertise is in next generation climate simulation, using new algorithms (cloud superparameterization) and new computing techniques (machine learning) to study clouds and their interaction with climate and weather, in high fidelity
2 13:55 Session Keynote Identifying Opportunities for Skillful Weather Prediction with Interpretable Neural Networks Elizabeth A Barnes (Colorado State University)*; Kirsten Mayer (Colorado State University); Benjamin Toms (Intersphere, Inc.); Zane Martin (Colorado State University); Emily Gordon (Colorado State University) The atmosphere is chaotic. This fundamental property of the climate system makes forecasting weather incredibly challenging: it's impossible to expect weather models to ever provide perfect predictions of the Earth system beyond timescales of approximately 2 weeks.
Instead, atmospheric scientists look for specific states of the climate system that lead to more predictable behaviour than others. Here, we demonstrate how neural networks can be used, not only to leverage these states to make skillful predictions, but moreover to identify the climatic conditions that lead to enhanced predictability. Furthermore, we employ a neural network interpretability method called "layer-wise relevance propagation" to create heatmaps of the regions in the input most relevant for a network's output. For Earth scientists, these relevant regions for the neural network's prediction are by far the most important product of our study: they provide scientific insight into the physical mechanisms that lead to enhanced weather predictability. While we demonstrate our approach for the atmospheric science domain, this methodology is applicable to a large range of geoscientific problems.
3 14:20 Spotlight Talk Spatio-temporal segmentation and tracking of weather patterns with light-weight Neural Networks Lukas Kapp-Schwoerer (ETH)*; Andre Graubner (ETH); Sol Kim (Lawrence Berkeley National Laboratory); Karthik Kashinath (Lawrence Berkeley National Laboratory) The reliable detection and tracking of weather patterns is a necessary first step towards characterizing extreme weather events in a warming world. Recent work [Prabhat et al.(2020)] has shown that weather pattern recognition by deep neural networks can work remarkably better than feature engineering, such as hand-crafted heuristics, used traditionally in climate science.
As an extension of this work, we perform deep learning-based semantic segmentation of atmospheric rivers and tropical cyclones on the expert-annotated ClimateNet data set, and track individual events using a spatio-temporal overlapping approach. Our approach is fast and scalable to more data modalities and event types, motivating expansion of the ClimateNet dataset and development of novel deep learning architectures. Furthermore, we show that the spatio-temporal tracking capability enables investigating a host of important climate science research questions pertaining to the behavior of extreme weather events in a warming world.
4 14:35 Spotlight Talk Leveraging Lightning with Convolutional Recurrent AutoEncoder and ROCKET for Severe Weather Detection Nadia Ahmed (University of California, Irvine)*; Maria J Molina (National Center for Atmospheric Research); Marek Slipski (NASA Jet Propulsion Laboratory); Iván Venzor (Universidad Autónoma de Nuevo León); Gregory Senay (xBrain); Clem Tillier (Lockheed Martin); Samantha Edgington (Lockheed Martin); Mark Cheung (Lockheed Martin) Previous studies have shown that increases in flash rates detected in ground-based lightning data can be a precursor to severe weather hazards. Lightning data from the Geostationary Lightning Mapper (GLM) aboard the GOES-R satellite is not part of an operational model used by forecasters and is underutilized in severe storm research.
The Advanced Baseline Imager's (ABI) visible imagery also shows cloud features, such as overshooting tops and above-anvil cirrus plumes, which have been associated with severe weather hazards. We introduce a generative video frame prediction methodology using a convolutional recurrent autoencoder, to leverage these spatio-temporal patterns in GLM and ABI, along with ground-based severe weather data. An initial case study is presented and contrasted with a time series classification of GLM data. Through this study, we seek to highlight the value of GLM data to assist meteorologists in time-constrained nowcasting (15-30 minute lead time) of severe hazards.
5 14:50 Lightning Talk Towards Data-Driven Physics-Informed Global Precipitation Forecasting from Satellite Imagery Valentina Zantedeschi (GE Global Research)*; Daniele De Martini (University of Oxford); Catherine Tong (University of Oxford); Christian A Schroeder de Witt (University of Oxford); Piotr Bilinski (University of Warsaw / University of Oxford); Alfredo Kalaitzis (University of Oxford); Matthew Chantry (University of Oxford); Duncan Watson-Parris (University of Oxford) Under the effects of global warming, extreme events such as floods and droughts are increasing in frequency and intensity. This trend directly affects communities and makes widening access to accurate precipitation forecasting systems for disaster preparedness all the more urgent.
Nowadays, weather forecasting relies on numerical models necessitating massive computing resources that most developing countries cannot afford. Machine learning approaches are still in their infancy but already show promise for democratizing weather predictions, by leveraging any data source and requiring less compute. In this work, we propose a methodology for data-driven and physics-aware global precipitation forecasting from satellite imagery. To fully take advantage of the available data, we design the system as three elements: 1. The atmospheric state is estimated from recent satellite data. 2. The atmospheric state is propagated forward in time. 3. The atmospheric state is used to derive the precipitation intensity within a nearby time interval. In particular, our use of stochastic methods for forecasting the atmospheric state represents a novel application in this domain.
7 14:55 Discussion Q/A and Discussion Tom Beucler (UCI, Columbia) Post your questions to slack to hear from our authors in live Q&A.
8 --- On-Demand Bias correction of global climate model using machine learning algorithms to determine meteorological variables in different tropical climates of Indonesia Juan Nathaniel (School of Information System, Singapore Management University)*; Campbell Watson (IBM) Accurate and localized forecasting of climate variables is important, especially in the face of uncertainty imposed by climate change. However, the data used for prediction are either incomplete at the local level or inaccurate because the simulation models do not explicitly consider local contexts and extreme events.
This paper, therefore, attempts to bridge this gap by applying tree-based machine learning algorithms to correct biases inherent in simulated, reanalysed climate model output against local climate observations in differing tropical climate subsystems of Indonesia. The new observation datasets were compiled from various weather stations and agencies across the country. Our results show that regions of tropical savanna experience the greatest bias corrections, followed by the tropical monsoon and tropical forest. Finally, to account for extreme events, we embed regional large-scale climate events into these models. In particular, we incorporate ENSO to account for the residual error of extreme rainfall observations, and have achieved an improved bias-correction of 36.67%.
9 --- On-Demand Temporally Weighting Machine Learning Models for High-Impact Severe Hail Prediction Amanda L Burke (University of Oklahoma)*; Amy McGovern (University of Oklahoma); Nathan Snook (University of Oklahoma); David J Gagne (National Center for Atmospheric Research) We explore a new method to improve machine-learning (ML) based severe hail predictions. A temporal weighting scheme allows the random forest models to increase importance of relevant feature data while maintaining general information about the problem domain from other feature data.
We show that the weighting scheme improves forecast skill and forecaster trust. With a flexible design, this method can produce localized forecasts under multiple different scenarios without increasing computational expense.
10 --- On-Demand Integrating data assimilation with structurally equivariant spatial transformers: Physically consistent data-driven models for weather forecasting Ashesh K Chattopadhyay (Rice University)*; Mustafa Mustafa (Lawrence Berkeley National Laboratory); Pedram Hassanzadeh (Rice University); Karthik Kashinath (Lawrence Berkeley National Laboratory) While recent years have seen an increase in interest to build data-driven models using deep learning techniques for seamless weather forecasting, the prediction horizon for skillful forecast remains inferior to numerical weather prediction (NWP) models. This can be attributed to the insufficient physics that is used as inputs to the deep learning models, inability of the models to perform high-quality spatio-temporal forecasts on physical fields, and the general challenges that exist due to the interacting scales of motion in turbulent flow.
While in practice, operational weather models have data assimilation to correct the trajectory of the forecasts every 6 hours, the forward forecasting model, typically a high-resolution NWP, makes the framework computationally expensive. In this paper, we show that a carefully chosen data assimilation scheme can be coupled to a deep learning based forecast model that allows the framework to maintain correct trajectories for several weeks. We ensure that the deep learning model is structurally equivariant (thus constraining key rotational features in the physics of the flow) and can perform skillful forecasts with only partial input (without the knowledge of the full physics or the equations of motion used in an NWP) while the data assimilation schemes incorporate noisy observations every 24 hours. The framework shows skillful predictions for multiple weeks starting from a noisy initial condition without losing trajectory of the forecasts on the Z500 field of the ECMWF Reanalysis 5 (ERA5) dataset. We conclude that such hybrid deep learning and data assimilation frameworks can enable better forecast performance for data-driven weather prediction and can be extended to operational-quality multi-ensemble probabilistic weather forecasts at a fraction of current computational cost.
11 --- On-Demand Unsupervised Regionalization of Particle-resolved Aerosol Mixing State Indices on the Global Scale Zhonghua Zheng (University of Illinois at Urbana-Champaign)*; Joseph Ching (Japan Meteorological Agency); Jeffrey Curtis (University of Illinois at Urbana-Champaign); Yu Yao (University of Illinois at Urbana-Champaign); Peng Xu (Southern University of Science and Technology); Matthew West (University of Illinois at Urbana-Champaign); Nicole Riemer (University of Illinois at Urbana-Champaign) The aerosol mixing state significantly affects the climate and health impacts of atmospheric aerosol particles. Simplified aerosol mixing state assumptions, common in Earth System models, can introduce errors in the prediction of these aerosol impacts.
The aerosol mixing state index, a metric to quantify aerosol mixing state, is a convenient measure for quantifying these errors. Global estimates of aerosol mixing state indices have recently become available via supervised learning models, but require regionalization to ease spatiotemporal analysis. Here we developed a simple but effective unsupervised learning approach to regionalize predictions of global aerosol mixing state indices. We used the monthly average of the global distribution of aerosol mixing state indices as the input data. Grid cells were then clustered into regions by the k-means algorithm without explicit spatial information as input. This approach resulted in eleven regions over the globe with specific spatial aggregation patterns. Each region exhibited a unique distribution of mixing state indices and aerosol compositions, showing the effectiveness of the unsupervised regionalization approach. This study defines “aerosol mixing state zones” that could be useful for atmospheric science research.
15 --- On-Demand Optimising Placement of Pollution Sensors in Windy Environments Sigrid Passano Hellan (University of Edinburgh)*; Christopher Lucas (University of Edinburgh); Nigel Goddard (University of Edinburgh) Air pollution is one of the most important causes of mortality in the world. Monitoring air pollution is useful to learn more about the link between health and pollutants, and to identify areas for intervention.
Such monitoring is expensive, so it is important to place sensors as efficiently as possible. Bayesian optimisation has proven useful in choosing sensor locations, but typically relies on kernel functions that neglect the statistical structure of air pollution, such as the tendency of pollution to propagate in the prevailing wind direction. We describe two new wind-informed kernels and investigate their advantage for the task of actively learning locations of maximum pollution using Bayesian optimisation.
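The abstract above does not define its two wind-informed kernels, so the following is only a hypothetical sketch of the general idea: a squared-exponential kernel whose length scale is longer along the prevailing wind than across it. The function name, parameters, and length scales are assumptions made for illustration and are not the authors' kernels.

```python
import math


def wind_aligned_kernel(x, y, wind_angle, along_scale=3.0, across_scale=1.0):
    """Squared-exponential kernel stretched along the wind direction.

    x, y: (east, north) coordinates of two candidate sensor locations.
    wind_angle: wind axis in radians, measured from the east axis.
    The two length scales are illustrative values only.
    """
    dx, dy = x[0] - y[0], x[1] - y[1]
    # Rotate the separation vector into along-wind and cross-wind components.
    along = dx * math.cos(wind_angle) + dy * math.sin(wind_angle)
    across = -dx * math.sin(wind_angle) + dy * math.cos(wind_angle)
    scaled_sq = (along / along_scale) ** 2 + (across / across_scale) ** 2
    return math.exp(-0.5 * scaled_sq)


origin = (0.0, 0.0)
along_wind_point = (2.0, 0.0)   # two units along the wind axis
cross_wind_point = (0.0, 2.0)   # two units perpendicular to the wind axis

k_along = wind_aligned_kernel(origin, along_wind_point, wind_angle=0.0)
k_across = wind_aligned_kernel(origin, cross_wind_point, wind_angle=0.0)
print(f"along-wind covariance:  {k_along:.3f}")
print(f"cross-wind covariance: {k_across:.3f}")
```

Points separated along the wind axis stay more strongly correlated than points the same distance apart across it. Because this kernel is stationary and symmetric, it treats upwind and downwind identically; capturing that asymmetry would need a different construction.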


# Start Time Video Title Author(s) Details
0 15:25 Introduction Introduction Karthik Kashinath (Berkeley Lab) Short introduction to the session
1 15:30 Session Keynote ML for Thermodynamics and Thermodynamic ML Stephan Mandt (University of California, Irvine) Stephan Mandt is an Assistant Professor of Computer Science at the University of California, Irvine. From 2016 until 2018, he was a Senior Researcher and Head of the statistical machine learning group at Disney Research, first in Pittsburgh and later in Los Angeles.
He held previous postdoctoral positions at Columbia University and Princeton University. Stephan holds a Ph.D. in Theoretical Physics from the University of Cologne. He is a Fellow of the German National Merit Foundation, a Kavli Fellow of the U.S. National Academy of Sciences, and was a visiting researcher at Google Brain. Stephan regularly serves as an Area Chair for NeurIPS, ICML, AAAI, and ICLR, and is a member of the Editorial Board of JMLR. His research is currently supported by NSF, DARPA, Intel, and Qualcomm.
3 15:55 Session Keynote Physics-Guided AI for Learning Spatiotemporal Dynamics Rose Yu (University of California, San Diego) Rose Yu is an assistant professor in the Department of Computer Science and Engineering at UC San Diego. She works on the theory and application of machine learning, especially for large-scale spatiotemporal data.
She is generally interested in optimization, deep learning, and spatiotemporal reasoning, and is particularly excited about the interplay between physics and machine learning. Her work has been applied to learning dynamical systems in sustainability, health and physical sciences.
4 16:20 Regular Talk Generating Synthetic Multispectral Satellite Imagery from Sentinel-2 Tharun Mohandoss (Radiant Earth Foundation); Aditya Kulkarni (Radiant Earth Foundation); Dan Northrup (Benson Hill); Ernest Mwebaze (Google); Hamed Alemohammad (Radiant Earth Foundation)* Multispectral satellite imagery provides valuable data at global scale for many environmental and socio-economic applications. Building supervised machine learning models based on this imagery, however, may require ground reference labels which are not available at global scale.
Here, we propose a generative model to produce multi-resolution multispectral imagery based on Sentinel-2 data. The resulting synthetic images are indistinguishable from real ones by humans. This technique paves the road for generating labeled synthetic imagery that can be used for data augmentation in data scarce regions and applications.
5 16:30 Regular Talk Multiresolution Tensor Learning for Efficient and Interpretable Spatiotemporal Analysis Raechel D Walker (University of California, San Diego)*; Rose Yu (UC San Diego) We introduce the Spatiotemporal Multiresolution Tensor Learning (ST-MRTL), an extension of MRTL to spatiotemporal data. ST-MRTL offers better time-efficiency and produces interpretable latent factors by considering both spatial and temporal data in different resolutions.
We apply ST-MRTL to sea salinity and temperature data. Our method is able to predict the precipitation's variation from the mean in the Midwest region of the United States while generating interpretable latent factors. Additionally, ST-MRTL converges 9-20x faster than the original MRTL solution and depicts historic causes of precipitation in the Midwest, so we can understand patterns in rainfall over time.
6 16:40 Regular Talk Climate-StyleGAN: Modeling Turbulent Climate Dynamics Using Style-GAN Rishabh Gupta (University of Tokyo)*; Mustafa Mustafa (Lawrence Berkeley National Laboratory); Karthik Kashinath (Lawrence Berkeley National Laboratory) In recent years, unsupervised learning with generative adversarial networks (GANs) has been tremendously successful in computer vision applications for natural image generation. Comparatively, unsupervised learning with GANs for emulating physical systems has received less attention.
Some success has been shown with physically constrained GANs but those are limited by their ability to compute constraints and to model higher resolution samples. In this work we leverage the success of StyleGANs for natural images to model complex turbulent climate data without any statistical or physical constraint. We demonstrate the use of a feature-matched and annealed LOGAN-based StyleGAN that outperforms state-of-the-art results on Rayleigh-Benard convection and successfully emulates updraft velocity fields of high-resolution climate simulations.
7 16:50 Lightning Talk Interpretable Deep Generative Spatio-Temporal Point Processes Shixiang Zhu (Georgia Institute of Technology)*; Shuang Li (Georgia Institute of Technology); Zhigang Peng (Georgia Institute of Technology); Yao Xie (Georgia Tech) We present a novel Neural Embedding Spatio-Temporal (NEST) point process model for spatio-temporal discrete event data and develop an efficient imitation learning (a type of reinforcement learning) based approach for model fitting. Despite the rapid development of one-dimensional temporal point processes for discrete event data, the study of spatial-temporal aspects of such data is relatively scarce.
Our model captures complex spatio-temporal dependence between discrete events by carefully designing a mixture of heterogeneous Gaussian diffusion kernels, whose parameters are parameterized by neural networks. This is the key to our model capturing intricate spatial dependence patterns while still leading to interpretable results, as we examine maps of Gaussian diffusion kernel parameters. Furthermore, the likelihood function under our model enjoys a tractable expression due to the Gaussian kernel parameterization. Experiments based on real data show our method's good performance relative to the state-of-the-art and the good interpretability of NEST's results.
8 16:55 Lightning Talk Completing physics-based model by learning hidden dynamics through data assimilation Arthur Filoche (Sorbonne Université, LIP6)*; Julien Brajard (Sorbonne Université, Nansen Environmental and Remote Sensing Center); Anastase Charantonis (Sorbonne Université, ENSIIE); Dominique Béréziat (Sorbonne Université) Data Assimilation remains the operational choice when it comes to forecasting and estimating Earth's dynamical systems. The analogy with Machine Learning has already been shown and is still being investigated to address the problem of improving physics-based models.
Even though both techniques learn from data, machine learning focuses on inferring a model while data assimilation concentrates on hidden system state estimation with the help of a dynamical model. In this work, we exploit the complementarity of these methods in a twin experiment where the system is partially observed and the known dynamics is incomplete. Finally, we succeed in partially retrieving the dynamics of a fully unobserved variable by training a hybrid model through variational data assimilation.
9 17:00 Discussion Q/A and Discussion Karthik Kashinath (Berkeley Lab) Post your questions to slack to hear from our authors in live Q&A.
11 --- On-Demand Semantic Segmentation of Medium-Resolution Satellite Imagery using Conditional Generative Adversarial Networks Aditya Kulkarni (Radiant Earth Foundation); Tharun Mohandoss (Radiant Earth Foundation); Dan Northrup (Benson Hill); Ernest Mwebaze (Google); Hamed Alemohammad (Radiant Earth Foundation)* Semantic segmentation of satellite imagery is a common approach to identify patterns and detect changes around the planet. Most of the state-of-the-art semantic segmentation models are trained in a fully supervised way using Convolutional Neural Network (CNN).
The generalization property of CNN is poor for satellite imagery because the data can be very diverse in terms of landscape types, image resolutions, and scarcity of labels for different geographies and seasons. Hence, the performance of CNN doesn't translate well to images from unseen regions or seasons. Inspired by Conditional Generative Adversarial Networks (CGAN) based approach of image-to-image translation for high-resolution satellite imagery, we propose a CGAN framework for land cover classification using medium-resolution Sentinel-2 imagery. We find that the CGAN model outperforms the CNN model of similar complexity by a significant margin on an unseen imbalanced test dataset.


# Start Time Video Title Author(s) Details
0 17:20 Introduction Introduction Mayur Mudigonda (Berkeley Lab) Short introduction to the session
1 17:25 Discussion Q/A and Discussion Mayur Mudigonda (Berkeley Lab) Discussion panelists include: Milind Tambe (Harvard, Google) and Dan Kammen (Berkeley) and Giulio De Leo (Stanford)


# Start Time Video Title Author(s) Details
0 18:00 Introduction Introduction Kelly Kochanski (Boulder) Short introduction to the session
1 18:05 Spotlight Talk Soft Attention Convolutional Neural Networks for Rare Event Detection in Sequences Mandar Kulkarni (Schlumberger)*; Aria Abubakar (Schlumberger); Purnaprajnya Mangsuli (Schlumberger) Automated event detection in the sequences is an important aspect of temporal data analytics. The events can be in the form of peaks, changes in data distribution, changes of spectral characteristics etc.
In this work, we propose a Soft-Attention Convolutional Neural Network (CNN) based approach for rare event detection in sequences. For the purpose of demonstration, we experiment with well logs where we aim to detect events depicting the changes in the geological layers (a.k.a. well tops/markers). Well logs (single or multivariate) are inputted to a soft attention CNN and a model is trained to locate the marker position. Attention mechanism enables the machine to relatively scale relevant log features for the task. Experimental results show that our approach is able to locate rare events with high precision.
2 18:20 Regular Talk An End-to-End Earthquake Monitoring Method for Joint Earthquake Detection and Association using Deep Learning Weiqiang Zhu (Stanford University)*; Kai Sheng Tai (Stanford University); S. Mostafa Mousavi (Stanford University); Peter D Bailis (Stanford University); Gregory Beroza (Stanford University) Earthquake monitoring through seismometer networks typically involves a pipeline consisting of detection, phase picking, association, and localization stages. We introduce an earthquake detection and localization method based on a novel end-to-end deep neural network architecture that maps collections of raw seismic waveforms to proposed event times and epicenter locations.
Unlike traditional approaches to this task, our method does not rely on hand-designed time series features or rules for combining predictions across multiple stations. We evaluate our proposed method on data from the 2019 Ridgecrest earthquake sequence, demonstrating its effectiveness when compared with four state-of-the-art earthquake catalogs.
3 18:30 Regular Talk Single-Station Earthquake Location Using Deep Neural Networks S. Mostafa Mousavi (Stanford University)* In seismology, earthquake location is commonly done based on observed arrival times at multiple stations using a velocity model for the region and through an iterative inversion process. This makes the location estimation of earthquakes that are sparsely recorded - either because they are small or because stations are widely separated - difficult.
Here, we present a fast approach based on deep neural networks to directly locate earthquakes using single-station observations. We use a multi-task temporal convolutional neural network in a Bayesian framework to learn epicentral distance and P travel time from 1-minute seismograms along with their epistemic and aleatory uncertainties. We design a separate multi-input network using standard convolutional layers to estimate the back-azimuth angle and its epistemic uncertainty. Using this information, we estimate the epicenter, origin time, and depth along with their confidence intervals.
4 18:40 Lightning Talk Framework for automatic globally optimal well log correlation Oleh Datskiv (SoftServe)*; Mariia Veselovska (SoftServe); Andrii Struk (SoftServe); Oleh Bondarenko (SoftServe); Mykola Maksymenko (SoftServe); Volodymyr Karpiv (SoftServe) Well correlation based on well-logging data is a reliable tool that geological scientists use to interpret and deduce underground sedimentary morphology. Traditional methods are not fully-automated and require additional inputs from the experts to perform well correlation, which complicates the whole process and makes it time-consuming.
Well-log data is often noisy and incomplete, which significantly reduces the performance of well correlation and the accuracy of geological interpretation. To address this issue, we present a framework for global pattern correlation that is fully automated and does not require additional inputs from the user. Our framework efficiently handles imperfect data with multi-log curve integration. Global optimality in the proposed framework is achieved by adapting the Hungarian algorithm to the assignment problem of well log correlation. Finally, we assess the performance of the framework on real-world datasets.
5 18:45 Discussion Q/A and Discussion Kelly Kochanski (Boulder) Post your questions to slack to hear from our authors in live Q&A.
6 --- On-Demand Nowcasting Solar Irradiance Over Oahu Kyle Hart (University of Hawaii at Manoa); Giuseppe Torri (University of Hawaii at Manoa); Peter Sadowski (University of Hawaii at Manoa)* We use satellite data from GOES-17 and deep learning to predict solar radiance with a 10-60 minute forecast horizon. Neural networks were trained on data covering the Hawaiian Islands from 2019, and tested on 2020 data.
Our 10-minute forecasts of solar radiance achieve an RMSE of 31 watt meter^-2 steradian^-1 micrometer^-1, a significant improvement over a simple persistence model benchmark (45 watt meter^-2 steradian^-1 micrometer^-1 on the same data). These results suggest that the approach could potentially be used by energy companies to more efficiently manage power-generators.
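To put the two RMSE figures quoted in the abstract above on a common scale, the improvement over persistence can be expressed as a relative reduction. The helper below is a small sketch written for this page; its name is an assumption, and only the values 31 and 45 come from the abstract.

```python
def relative_rmse_reduction(model_rmse, baseline_rmse):
    """Fractional RMSE reduction of a model relative to a baseline."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 1.0 - model_rmse / baseline_rmse


# RMSE values quoted in the abstract, in watt meter^-2 steradian^-1 micrometer^-1.
reduction = relative_rmse_reduction(model_rmse=31.0, baseline_rmse=45.0)
print(f"RMSE reduction vs. persistence: {reduction:.1%}")  # prints 31.1%
```

In other words, the reported 10-minute forecast error is roughly 31% lower than that of the persistence benchmark on the same data.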


# Start Time Video Title Author(s) Details
0 19:00 Introduction Introduction Karthik Kashinath (Berkeley Lab) Short introduction to the session
1 19:05 Session Keynote Why Benchmarks are Crucial for Progress in AI and How to Design Good Ones for Earth Science Stephan Rasp (ClimateAI) Senior Data Scientist at ClimateAI
2 19:30 Spotlight Talk RainBench: Enabling Data-Driven Precipitation Forecasting on a Global Scale Catherine Tong (University of Oxford)*; Christian A Schroeder de Witt (University of Oxford); Valentina Zantedeschi (GE Global Research); Daniele De Martini (University of Oxford); Alfredo Kalaitzis (University of Oxford); Matthew Chantry (University of Oxford); Duncan Watson-Parris (University of Oxford); Piotr Bilinski (University of Warsaw / University of Oxford) Climate change is expected to aggravate extreme precipitation events, directly impacting the livelihood of millions. Without a global precipitation forecasting system in place, many regions -- especially those constrained in resources to collect expensive groundstation data -- are left behind.
To mitigate such unequal reach of climate change, a solution is to alleviate the reliance on numerical models (and by extension groundstation data) by enabling machine-learning-based global forecasts from satellite imagery. Though prior works exist in regional precipitation nowcasting, work on global, medium-term precipitation forecasting is lacking. Importantly, a common, accessible baseline for meaningful comparison is absent. In this work, we present RainBench, a multi-modal benchmark dataset dedicated to advancing global precipitation forecasting. We establish baseline tasks and release PyRain, a data-handling pipeline to enable efficient processing of decades-worth of data by any modeling framework. Whilst our work serves as a basis for a new chapter on global precipitation forecast from satellite imagery, the greater promise lies in the community joining forces to use our released datasets and tools in developing machine learning approaches to tackle this important challenge.
3 19:45 Spotlight Talk WildfireDB: A Spatio-Temporal Dataset Combining Wildfire Occurrence with Relevant Covariates Samriddhi Singla (University of California, Riverside)*; Ahmed Eldawy (University of California, Riverside); Tianhui Diao (Stanford University); Ayan Mukhopadhyay (Stanford University); Ross Shachter (Stanford University); Mykel Kochenderfer (Stanford University) Modeling fire spread is critical in fire risk management. Creating data-driven models to forecast spread remains challenging due to the lack of comprehensive data sources that relate fires with relevant covariates.
We present the first comprehensive dataset that relates historical fire data with relevant covariates extracted from satellite imagery. This open-source dataset contains over 2 million data points. We discuss an algorithmic approach based on large-scale raster and vector analysis that can be used to create similar datasets.
4 20:00 Regular Talk LandCoverNet: A global benchmark land cover classification training dataset Hamed Alemohammad (Radiant Earth Foundation)*; Kevin Booth (Radiant Earth Foundation) Regularly updated and accurate land cover maps are essential for monitoring 14 of the 17 Sustainable Development Goals. Multispectral satellite imagery provides high-quality and valuable information at global scale that can be used to develop land cover classification models.
However, such a global application requires a geographically diverse training dataset. Here, we present LandCoverNet, a global training dataset for land cover classification based on Sentinel-2 observations at 10m spatial resolution. Land cover class labels are defined based on annual time-series of Sentinel-2, and verified by consensus among three human annotators.
5 20:10 Regular Talk Applying Machine Learning to Crowd-sourced Data from Earthquake Detective Omkar Ranadive (Northwestern University)*; Suzan van der Lee (Northwestern University) We present the Earthquake Detective dataset, a crowdsourced set of labels on potentially triggered (PT) earthquakes and tremors. These events are those which may have been triggered by large-magnitude and often distant earthquakes.
We apply machine learning to classify these PT seismic events and explore the challenges faced in segregating such low-amplitude signals. The dataset and code are available online.
6 20:20 Lightning Talk An Active Learning Pipeline to Detect Hurricane Washover in Post-Storm Aerial Images Evan Goldstein (University of North Carolina at Greensboro)*; Somya Mohanty (University of North Carolina at Greensboro); Shah Nafis Rafique (University of North Carolina at Greensboro); Jamison Valentine (University of North Carolina at Greensboro) We present an active learning pipeline to identify hurricane impacts to coastal landscapes. Previously unlabeled post-storm images are used in a three-component workflow: first, an online interface is used to crowd-source labels for imagery; second, we develop a deep learning model from these labeled images; and third, model predictions are displayed on an interactive map.
Both the labeler and interactive map allow coastal scientists to provide additional labels that will be used to develop a large labeled dataset and improve hurricane impact assessments.
7 20:25 Lightning Talk Developing High Quality Training Samples for Deep Learning Based Local Climate Classification in Korea Minho Kim (Seoul National University)*; Doyeong Jeong (Seoul National University); Hyoungwoo Choi (Seoul National University); Yongil Kim (Seoul National University) Two out of three people will be living in urban areas by 2050, as projected by the United Nations, emphasizing the need for sustainable urban development and monitoring. Common urban footprint data provide high-resolution city extents but lack essential information on the distribution, pattern, and characteristics.
The Local Climate Zone (LCZ) offers an efficient and standardized framework that can delineate the internal structure and characteristics of urban areas. Global-scale LCZ mapping has been explored, but is limited by low accuracy, variable labeling quality, or domain adaptation challenges. Instead, this study developed a custom LCZ dataset to map key Korean cities using a multi-scale convolutional neural network. Results demonstrated that using a novel, custom LCZ dataset with deep learning can generate more accurate LCZ maps than conventional community-based LCZ mapping with machine learning, as well as transfer learning from the global So2Sat dataset.
8 20:30 Discussion Q/A and Discussion Karthik Kashinath (Berkeley Lab) Post your questions to Slack to hear from our authors in the live Q&A.
9 --- On-Demand MonarchNet: Differentiating Monarch Butterflies from Those with Similar Appearances Thomas Y Chen (The Academy for Mathematics, Science, and Engineering)* In recent years, the monarch butterfly’s iconic migration patterns have come under threat from a number of factors, from climate change to pesticide use. To track trends in their populations, scientists as well as citizen scientists must identify individuals accurately.
This is key because there exist other species of butterfly, such as viceroy butterflies, that are "look-alikes," having similar phenotypes. To tackle this problem and to aid in more efficient identification, we present MonarchNet, the first comprehensive dataset consisting of butterfly imagery for monarchs and five look-alike species. We train a baseline deep-learning classification model to serve as a tool for differentiating monarch butterflies from their various look-alikes. We seek to contribute to the study of biodiversity and butterfly ecology by providing a novel method for computational classification of these particular butterfly species. The ultimate aim is to help scientists track monarch butterfly population and migration trends in the most precise manner possible.



# Start Time Video Title Author(s) Details
0 20:55 Closing Closing Remarks Karthik Mukkavilli (UC Irvine) Closing & Thanks
