The overarching goal of this project is to develop and demonstrate a new AI/ML technology for creating ground-level grids of ambient air quality. More specifically, the project focuses on developing a data translation process that increases the spatial and temporal detail of aerosol information retrieved from moderate- to coarse-resolution satellite remote sensing measurements. The general approach is to leverage existing Python-based machine learning libraries, e.g., TensorFlow, to train end-to-end convolutional neural networks and data classifiers.
It is expected that the proposed work offers a general framework for fusing local observations with satellite-derived information. While the project scope emphasizes downscaling of daytime column aerosol optical depth at 10km resolution, the technology and techniques developed could have broad applicability in downscaling other satellite-derived air quality observations.
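To make the grid geometry of downscaling concrete, the following minimal sketch shows a naive baseline: each coarse pixel is replicated onto a finer grid, and averaging the fine grid back recovers the coarse values. This is not the project's AI/ML method, only a reference point that a learned model would be expected to improve on. The function names and the refinement factor of 10 (for example, 10 km pixels to 1 km cells) are assumptions made for illustration; the source does not state a target resolution.

```python
import numpy as np


def replicate_to_fine(coarse, factor):
    """Copy every coarse pixel into a factor x factor block of fine cells."""
    return np.kron(coarse, np.ones((factor, factor)))


def aggregate_to_coarse(fine, factor):
    """Average factor x factor blocks of fine cells back to the coarse grid."""
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("fine grid dimensions must be multiples of factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


# Synthetic 2 x 2 coarse AOD field (values are made up for illustration).
coarse_aod = np.array([[0.1, 0.2],
                       [0.3, 0.4]])
fine_aod = replicate_to_fine(coarse_aod, factor=10)  # hypothetical factor
recovered = aggregate_to_coarse(fine_aod, factor=10)
```

A downscaling model replaces the replication step with predictions that vary inside each coarse pixel. The aggregation function can then serve as a consistency check that the fine-scale output still averages to the satellite value.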
Key Components:
Recent advancements in satellite retrieval algorithms, coupled with next-generation high spatial and spectral resolution spaceborne instruments, promise access to near-global air quality maps at spatial scales we have not seen before. However, current retrieval algorithms suffer from many potential sources of error, both random and systematic, which greatly limit the utility of these maps. This limitation is often labeled the “air quality downscaling” problem, a name taken from the remote sensing domain, where mapping ground monitor data to low-resolution satellite pixels is the initial mapping step. In the satellite retrieval domain, researchers have tried various algorithms to solve the retrieval problem, many of which use “ancillary data” to support the retrieval. The ancillary data may come from various sources and in various forms, each affecting a different layer of uncertainty in the eventual retrieval. Because satellite pixels typically span large areas and the ancillary data are sparse relative to the area spanned, potential errors are difficult to avoid.
We address the air quality downscaling problem from a new perspective, namely the use of machine learning (ML) and artificial intelligence (AI) tools in association with multiscale chemical transport modeling as a means of overcoming many of the weaknesses of current retrieval algorithms. A major advantage of this approach is that ML/AI-based models can learn the relationship between air quality and its many drivers at very high spatial and temporal resolution, in a setting cross-validated against real data. Our approach also accounts for retrieval errors in the training process, in contrast to algorithms that still rely on basic input-output statistical methods.
Satellite sensors provide cost-effective, high-frequency, spatially continuous information on air quality.
To fill the gaps in these satellite air quality products and provide more accurate estimates, data downscaling methods can be applied to establish a relationship between satellite-derived air quality and ground measurements. However, existing approaches to air quality downscaling are relatively complex, and satellite-based air quality downscaling using machine learning (ML) and artificial intelligence (AI) techniques remains underexplored. Therefore, more advanced AI models need to be developed for comprehensive air quality mapping to improve human life and health. To this end, the following focus areas are reviewed below:
- Air Quality: Understand the basics of satellite-based air quality products, including the ancillary data they require.
- Research Highlights: New advancements are proposed for satellite-based air quality data downscaling using AI/ML.
- Downscaling: Integrate predictors derived from machine learning (ML) and artificial intelligence (AI) techniques along with other datasets to downscale satellite-based air quality predictions.
- Data Integration: Combine satellite data with ground-based measurements to train the model.
- AI/ML Algorithms: Use machine learning techniques such as Random Forests or Convolutional Neural Networks (CNNs) for downscaling.
- Validation: Validate the model using independent datasets to ensure accuracy and reliability.
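The data integration and validation steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the regular latitude/longitude grid, and the synthetic inputs are hypothetical, and an ordinary least-squares fit stands in for the Random Forest or CNN named in the list so that the example depends only on NumPy.

```python
import numpy as np


def collocate(grid, lat_edges, lon_edges, mon_lat, mon_lon):
    """Return the satellite pixel value covering each ground monitor.

    lat_edges and lon_edges are ascending pixel boundaries, so a grid of
    shape (n, m) has n + 1 latitude edges and m + 1 longitude edges.
    """
    i = np.searchsorted(lat_edges, mon_lat, side="right") - 1
    j = np.searchsorted(lon_edges, mon_lon, side="right") - 1
    # Monitors on or beyond the outer boundary are assigned to the edge pixel.
    i = np.clip(i, 0, grid.shape[0] - 1)
    j = np.clip(j, 0, grid.shape[1] - 1)
    return grid[i, j]


def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for an ML regressor)."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([X, np.ones(X.shape[0])]) @ coef


def holdout_rmse(X, y, test_fraction=0.25, seed=0):
    """Train on a random subset of monitors and score on the held-out rest."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = max(1, int(round(test_fraction * len(y))))
    test, train = order[:n_test], order[n_test:]
    coef = fit_linear(X[train], y[train])
    residual = y[test] - predict_linear(coef, X[test])
    return float(np.sqrt(np.mean(residual ** 2)))


# Synthetic example: a 2 x 2 satellite AOD grid and 40 made-up monitors.
rng = np.random.default_rng(42)
aod_grid = np.array([[0.15, 0.30], [0.45, 0.60]])
edges = np.array([0.0, 1.0, 2.0])
mon_lat = rng.uniform(0.0, 2.0, 40)
mon_lon = rng.uniform(0.0, 2.0, 40)
aod_at_monitors = collocate(aod_grid, edges, edges, mon_lat, mon_lon)
# Made-up ground measurements: linear in AOD plus noise.
ground = 50.0 * aod_at_monitors + 5.0 + rng.normal(0.0, 1.0, 40)
score = holdout_rmse(aod_at_monitors[:, None], ground)
```

Holding out whole monitors, rather than individual observations from the same monitor, is one way to keep the validation set independent of the training data. In practice the feature matrix would also include the ancillary predictors discussed above.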
With recent advancements in remote sensing technologies, air quality mapping has become significantly easier in urban and regional areas. Free global satellite-based products, such as daily aerosol optical depth (AOD) maps from the Moderate Resolution Imaging Spectroradiometer (MODIS), have become preferred air quality datasets for researchers and policymakers because of their convenience and reliability, although they are available only at coarse spatial resolutions. Where conventional air quality monitoring networks have large data gaps, models used for satellite-based air quality predictions become limited, and the air quality maps required for detailed analysis cannot be obtained. Data downscaling methods can therefore be employed to fill these gaps and offer more precise predictions. Adapting machine learning (ML) and artificial intelligence (AI) techniques can help extend air quality mapping more comprehensively. With minimal information loss and straightforward modeling processes, AI/ML models can offer spatially valuable air quality mapping results for large areas. As a result, the benefits include more spatially continuous data, the ability to train a model with additional features, and thereby more specific air quality predictions. To carry out these processes effectively, AI/ML methods encompass the following categories.
Expected Outcome:
The objective is to develop and evaluate artificial intelligence (AI)/machine learning (ML) models trained to downscale, that is, to increase the location specificity of, the existing passive satellite sensor-based air quality products that NASA has developed over the last decade. This approach is important because, while these existing top-down air quality products are very accurate at capturing regional and statewide trends, errors in the air quality maps increase significantly when simulations are forced down to regional and local scales. This has made it harder for individual researchers, Department of Health agencies, Environmental Quality agencies, federal land management agencies, regulatory entities, and private sector testing groups to use the products in pollution-related public health studies, climate studies, and local regulatory activities. When ready for use by the public, the downscaling model will be placed on an interactive web application portal where users can upload a relatively small file containing a NASA NPP and/or MOPITT pollutant column history data sheet and download a data sheet containing the pollutant history with an expanded level of location specificity for use in studies, reports, and regulatory needs.