ARM Improves Merged Aerosol Size Distribution Machine Learning Product
Published: 11 December 2025

New merged aerosol size distribution data are now available for the 2023–2024 Eastern Pacific Cloud Aerosol Precipitation Experiment (EPCAPE) from an updated version of a value-added product (VAP) that uses machine learning (ML) to perform additional quality checks on aerosol size distribution data merged from different instruments.
The Atmospheric Radiation Measurement (ARM) User Facility merges aerosol size distributions from the scanning mobility particle sizer (SMPS) and aerodynamic particle sizer (APS) with a VAP called MERGEDSMPSAPS. The merging process is sensitive to noise in the input data and sometimes results in merged size distributions that are non-physical yet difficult to algorithmically filter out.
ARM also produces the MERGEDSMPSAPSML VAP, which uses ML to provide a simple data quality evaluation for merged size distributions from MERGEDSMPSAPS. This allows scientists to easily differentiate between good and bad data. ARM staff recently updated MERGEDSMPSAPSML so it produces improved ML outputs.
The previous iteration of MERGEDSMPSAPSML used two small ML models: a random forest and a small dense neural network. To update the VAP, ARM staff switched to a larger dense neural network model and used feature engineering to give the ML model more information when classifying data quality.
The updated version now reports the model’s quality classification as well as a probability score, which can be loosely interpreted as the model’s confidence in its chosen classification. ARM staff also improved metadata for quality control variables, making it easier to filter out bad data.
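As a rough illustration of how a classification and a probability score might be combined when filtering, the following Python sketch keeps only size distributions that are classified as good with a confidence above a chosen threshold. The array names, the 0 = good / 1 = bad encoding, and the 0.9 threshold are all hypothetical and are not taken from the VAP's files; consult the product's metadata for the real variable names and flag values.

```python
import numpy as np

GOOD = 0  # assumed encoding (hypothetical): 0 = good, 1 = bad


def mask_trusted(dist, quality_class, quality_prob, min_prob=0.9):
    """Blank out distributions not classified as good with enough confidence.

    dist          : 2-D array, shape (time, size_bin)
    quality_class : 1-D array of class labels, one per time step
    quality_prob  : 1-D array of probability scores in [0, 1]
    Returns a NaN-masked copy of dist and the boolean keep mask.
    """
    dist = np.asarray(dist, dtype=float)
    keep = (np.asarray(quality_class) == GOOD) & (
        np.asarray(quality_prob) >= min_prob
    )
    out = dist.copy()
    out[~keep] = np.nan  # rows failing either test become NaN
    return out, keep


# Synthetic example data, not real ARM measurements.
dist = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
quality_class = np.array([0, 1, 0])
quality_prob = np.array([0.97, 0.99, 0.60])

filtered, keep = mask_trusted(dist, quality_class, quality_prob)
```

In this synthetic case only the first row survives: the second is classified bad, and the third is classified good but with a score below the threshold. Whether a probability cutoff is worthwhile, and at what value, depends on the application.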
MERGEDSMPSAPSML evaluation data are now available for the EPCAPE field campaign’s main site on the Ellen Browning Scripps Memorial Pier in La Jolla, California, from February 15, 2023, to February 14, 2024.
Scientists can use the merged aerosol size distribution data to calculate aerosol scattering and mass loading, estimate the impact of aerosols on clouds, and verify aerosol-related quantities in models. The data are useful for scientists who need a representation of the aerosol size distribution from approximately 10 to 20,000 nanometers in diameter.
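To make the mass loading use case concrete, here is a minimal Python sketch that integrates a number size distribution into a mass concentration. It assumes spherical particles of uniform density, a distribution expressed as dN/dlogDp in particles per cubic centimeter, and bin edges in nanometers. The function name, argument names, and the default density of 1.5 g/cm³ are illustrative assumptions, not part of the VAP.

```python
import numpy as np


def mass_loading_ug_m3(edges_nm, dn_dlogdp_cm3, density_g_cm3=1.5):
    """Estimate aerosol mass concentration (ug/m^3) from dN/dlogDp.

    edges_nm      : bin edges in nanometers, length n_bins + 1
    dn_dlogdp_cm3 : dN/dlog10(Dp) per bin, in particles per cm^3
    density_g_cm3 : assumed uniform particle density (hypothetical default)
    """
    edges = np.asarray(edges_nm, dtype=float)
    dn_dlogdp = np.asarray(dn_dlogdp_cm3, dtype=float)

    dlogdp = np.diff(np.log10(edges))          # bin widths in log10 space
    mid_nm = np.sqrt(edges[:-1] * edges[1:])   # geometric-mean bin diameter
    number_cm3 = dn_dlogdp * dlogdp            # particles per cm^3 in each bin

    # Sphere volume is (pi/6) * Dp^3. The factor 1e-9 converts
    # (g/cm^3) * (1/cm^3) * nm^3 into ug/m^3.
    mass_per_bin = np.pi / 6.0 * density_g_cm3 * number_cm3 * mid_nm**3 * 1e-9
    return float(np.sum(mass_per_bin))


# Sanity check with one synthetic bin: 1 particle/cm^3 at 1000 nm, 1 g/cm^3.
check = mass_loading_ug_m3([100.0, 10000.0], [0.5], density_g_cm3=1.0)
```

The single-bin check corresponds to one 1-micrometer sphere of unit density per cubic centimeter, which works out to π/6 (about 0.52) micrograms per cubic meter. Real analyses would need an appropriate density and may need to account for particle shape.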
Once the updated VAP moves to production, ARM plans to reprocess previously released MERGEDSMPSAPSML evaluation data as production data. In addition, production data are slated to be released for the Bankhead National Forest atmospheric observatory in Alabama and the 2024–2025 Coast-Urban-Rural Atmospheric Gradient Experiment (CoURAGE) in Maryland.
More information about this VAP is available on the MERGEDSMPSAPSML web page.
Access the data in the ARM Data Center. (To download the data, first create an ARM account.)
Please contact ARM translator John Shilling or VAP developer Maxwell Levin to ask questions, report data issues, or provide feedback to help improve this VAP before it moves to production.
Data can be referenced as doi:10.5439/1992332.