ARM Improves Merged Aerosol Size Distribution Machine Learning Product

Published: 11 December 2025

A 4x5 grid of data plots indicates predicted good, indeterminate, or bad data. Each plot shows aerodynamic particle sizer data, scanning mobility particle sizer data, and merged data from both instruments.
These sample plots from ARM’s EPCAPE field campaign on December 11, 2023, show quality classifications provided by the MERGEDSMPSAPSML value-added product (VAP), as well as the input and merged aerosol size distributions from the MERGEDSMPSAPS VAP. All the variables shown in the plots are available in MERGEDSMPSAPSML. Plots are from Maxwell Levin, Pacific Northwest National Laboratory.

New merged aerosol size distribution data are now available for the 2023–2024 Eastern Pacific Cloud Aerosol Precipitation Experiment (EPCAPE) from an updated version of a value-added product (VAP) that uses machine learning (ML) to perform additional quality checks on aerosol size distribution data merged from different instruments.

The Atmospheric Radiation Measurement (ARM) User Facility merges aerosol size distributions from the scanning mobility particle sizer (SMPS) and aerodynamic particle sizer (APS) with a VAP called MERGEDSMPSAPS. The merging process is sensitive to noise in the input data and sometimes results in merged size distributions that are non-physical yet difficult to algorithmically filter out.

ARM also produces the MERGEDSMPSAPSML VAP, which uses ML to provide a simple data quality evaluation for merged size distributions from MERGEDSMPSAPS. This allows scientists to easily differentiate between good and bad data. ARM staff recently updated MERGEDSMPSAPSML so it produces improved ML outputs.

The previous iteration of MERGEDSMPSAPSML used two small ML models: a random forest and a small dense neural network. To update the VAP, ARM staff switched to a larger dense neural network model and used feature engineering to give the ML model more information when classifying data quality.

The updated version now reports the model’s quality classification along with a probability score, which can be loosely interpreted as the model’s confidence in its chosen quality classification. ARM staff also improved the metadata for quality control variables, making it easier to filter out bad data.
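As an illustration of how a classification and a probability score might be used together, here is a minimal sketch that keeps only distributions classified as good with a score above a chosen threshold. The class codes, the argument names, and the 0.8 threshold are all assumptions made for this example; they are not taken from the MERGEDSMPSAPSML files, so check the variable metadata in the data before adapting it.

```python
import numpy as np

# Hypothetical class codes for illustration; the VAP's actual encoding may differ.
GOOD, INDETERMINATE, BAD = 0, 1, 2


def mask_low_quality(dist, quality_class, probability, min_probability=0.8):
    """Blank out merged size distributions that are not confidently good.

    dist          : (time, bin) array of merged size distributions
    quality_class : (time,) integer class assigned to each distribution
    probability   : (time,) model score for the assigned class
    Returns a masked copy of `dist` and the boolean `keep` vector.
    """
    dist = np.asarray(dist, dtype=float)
    quality_class = np.asarray(quality_class)
    probability = np.asarray(probability, dtype=float)

    # Keep a time step only if it is classified good AND the score is high enough.
    keep = (quality_class == GOOD) & (probability >= min_probability)

    masked = dist.copy()      # leave the caller's array untouched
    masked[~keep] = np.nan    # rejected rows become NaN across all size bins
    return masked, keep
```

Raising `min_probability` trades data coverage for stricter screening, so the right value depends on the analysis.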

MERGEDSMPSAPSML evaluation data are now available for the EPCAPE field campaign’s main site on the Ellen Browning Scripps Memorial Pier in La Jolla, California, from February 15, 2023, to February 14, 2024.

Scientists can use the merged aerosol size distribution data to calculate aerosol scattering and mass loading, estimate the impact of aerosols on clouds, and verify aerosol-related quantities in models. The data are useful for scientists who need a representation of the aerosol size distribution from approximately 10 to 20,000 nanometers in diameter.
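For readers unfamiliar with how mass loading follows from a size distribution, the sketch below integrates a single distribution over diameter. It rests on several assumptions that are not stated in this article: the distribution is expressed as dN/dlog10(Dp) in particles per cubic centimeter, the diameter axis is a geometric diameter in nanometers, the particles are spheres, and a single density (1500 kg/m^3 here, a placeholder value) applies to all sizes. The function name and arguments are hypothetical.

```python
import numpy as np


def mass_loading_ug_m3(diameter_nm, dn_dlogdp_cm3, density_kg_m3=1500.0):
    """Estimate aerosol mass concentration (ug/m^3) from one size distribution.

    diameter_nm   : (bin,) particle diameters in nanometers, increasing
    dn_dlogdp_cm3 : (bin,) dN/dlog10(Dp) in particles per cm^3
    density_kg_m3 : assumed uniform particle density (placeholder default)
    """
    d = np.asarray(diameter_nm, dtype=float)
    n = np.asarray(dn_dlogdp_cm3, dtype=float)

    # Mass per unit log-diameter for spheres: (pi/6) * Dp^3 * rho * dN/dlogDp.
    # Unit factor: nm^3 -> m^3 (1e-27), cm^-3 -> m^-3 (1e6), kg -> ug (1e9).
    dm_dlogdp = (np.pi / 6.0) * d**3 * density_kg_m3 * n * 1e-12

    # Trapezoidal integration over log10(diameter).
    log_d = np.log10(d)
    return float(np.sum(0.5 * (dm_dlogdp[1:] + dm_dlogdp[:-1]) * np.diff(log_d)))
```

Because mass scales with the cube of diameter, the largest size bins dominate the result, so noisy or non-physical values at the coarse end of a merged distribution can bias the estimate considerably.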

Once the updated VAP moves to production, ARM plans to reprocess the previously released MERGEDSMPSAPSML evaluation data as production data. In addition, production data are slated to be released for the Bankhead National Forest atmospheric observatory in Alabama and the 2024–2025 Coast-Urban-Rural Atmospheric Gradient Experiment (CoURAGE) in Maryland.

More information about this VAP is available on the MERGEDSMPSAPSML web page.

Access the data in the ARM Data Center. (To download the data, first create an ARM account.)

Please contact ARM translator John Shilling or VAP developer Maxwell Levin to ask questions, report data issues, or provide feedback to help improve this VAP before it moves to production.

Data can be referenced as doi:10.5439/1992332.

# # #

ARM is a DOE Office of Science user facility operated by nine DOE national laboratories.

Atmospheric Radiation Measurement (ARM) | Reviewed March 2025