Machine Learning Reveals Key Drivers of Atmospheric New Particle Formation

Submitter

Wang, Yang — University of Miami
Mei, Fan — Pacific Northwest National Laboratory

Area of Research

Aerosol Properties

Journal Reference

Hao W, M Mehra, G Budhwani, T Chakraborty, F Mei, and Y Wang. 2026. "Employing Machine Learning for New Particle Formation Identification and Mechanistic Analysis: Insights from a Six‐Year Observational Study in the Southern Great Plains." Journal of Geophysical Research: Atmospheres, 131(1), e2024JD043116, 10.1029/2024JD043116.

Science

Aerosol particle size distribution over time during a new particle formation event, with data‑driven machine learning identifying the six most influential features. Image provided by the authors, from the publication "Employing Machine Learning for New Particle Formation Identification and Mechanistic Analysis: Insights from a Six‐Year Observational Study in the Southern Great Plains."

New particle formation (NPF) is a major source of atmospheric nanoparticles that affect aerosol populations, air quality, human health, and the atmosphere. The complex and nonlinear interactions among radiation, gases, and meteorology make it difficult to pinpoint what conditions trigger events that form new particles. In this study, researchers applied a machine learning technique (random forest) to long-term atmospheric measurements in a rural continental environment to classify NPF and non‑NPF days and to identify which environmental factors matter most. The approach captures the intricate relationships that traditional methods often miss and provides a quantitative ranking of the controlling variables.

Impact

Understanding when and why NPF occurs is critical for improving both air‑quality models and earth system models. The machine learning framework developed here provides a new, systematic way to disentangle the roles of solar radiation, pollutants, and meteorology in particle formation. The results show that strong solar radiation markedly increases the probability of NPF, consistent with its role in driving photochemical production of low‑volatility vapors such as sulfuric acid and extremely low‑volatility organic compounds. This highlights the importance of accurately representing photochemistry and related environmental drivers in models used for air‑quality management and prediction, and it demonstrates how data-driven tools can advance atmospheric process understanding.

Summary

Researchers used a random forest classification model to analyze NPF events at a rural continental site, drawing on long‑term atmospheric observations. The random forest model distinguished NPF days from non‑NPF days and produced a feature‑importance ranking for key atmospheric variables. Partial dependence plots were then used to visualize how each variable influences the predicted probability of NPF when the effects of the other variables are averaged out.
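The classification and ranking steps described above can be sketched in a few lines of Python. This is a minimal illustration, assuming scikit-learn is available; it is not the authors' pipeline. The data are synthetic, and the feature names, value ranges, and the rule used to label NPF days are all hypothetical placeholders chosen only so the example runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_days = 2000

# Synthetic daily predictors (illustrative only, not ARM measurements).
feature_names = ["solar_radiation", "relative_humidity", "so2", "temperature"]
X = np.column_stack([
    rng.uniform(0, 800, n_days),      # W m^-2
    rng.uniform(10, 100, n_days),     # percent
    rng.lognormal(0.0, 0.5, n_days),  # ppb
    rng.uniform(-5, 35, n_days),      # deg C
])

# Hypothetical labeling rule: NPF is more likely under strong sunlight
# and low humidity. The other two features carry no signal here.
logit = 0.01 * (X[:, 0] - 300) - 0.03 * (X[:, 1] - 50)
p_npf = 1.0 / (1.0 + np.exp(-logit))
y = rng.random(n_days) < p_npf  # True = NPF day, False = non-NPF day

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the random forest and score it on held-out days.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Rank features by the forest's impurity-based importance.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:20s} {importance:.3f}")
print(f"held-out accuracy: {accuracy:.2f}")
```

Impurity-based importances are only one way to rank features; the source does not state which importance measure the study used, so treat that choice as an assumption of this sketch.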

Solar radiation intensity emerged as one of the most important predictors. The partial dependence plots show a strong positive relationship between solar radiation intensity and NPF likelihood, with a steep increase in probability once solar radiation intensity exceeds about 200 W m⁻². This behavior is consistent with the physical role of sunlight in driving photochemical reactions that generate low‑volatility vapors (such as sulfuric acid and extremely low‑volatility organic compounds) that are essential for NPF. The analysis focuses on daytime periods, when about 95 percent of observed NPF events occur. Together, the random forest and partial dependence plot results show that machine learning can align closely with established physical understanding while offering new quantitative insight into the environmental controls for particle formation.
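To make the partial dependence idea concrete, the sketch below computes a one-dimensional partial dependence curve by hand: for each grid value of solar radiation, every sample has that feature overwritten and the model's predictions are averaged. The function `toy_npf_probability` is a hypothetical stand-in for a fitted classifier, and its coefficients are invented so that the curve rises steeply above roughly 200 W m⁻²; it does not reproduce the study's model or data.

```python
import math
import random


def toy_npf_probability(solar, humidity):
    """Hypothetical stand-in for a fitted classifier's predicted NPF probability."""
    # Assumed shape: a logistic ramp in solar radiation, damped by humidity.
    # Coefficients are illustrative only.
    z = 0.02 * (solar - 300.0) - 0.02 * (humidity - 50.0)
    return 1.0 / (1.0 + math.exp(-z))


def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite the feature of interest
            total += predict(*modified)
        curve.append(total / len(rows))
    return curve


random.seed(0)
# Synthetic (solar radiation in W m^-2, relative humidity in percent) samples.
rows = [(random.uniform(0, 800), random.uniform(10, 100)) for _ in range(500)]
grid = [0, 100, 200, 300, 400, 600, 800]

pd_solar = partial_dependence(toy_npf_probability, rows, 0, grid)
for value, prob in zip(grid, pd_solar):
    print(f"{value:4d} W m^-2 -> mean predicted NPF probability {prob:.2f}")
```

With a real fitted model, the `predict` argument would wrap the classifier's probability output, and the rows would be the observed feature vectors rather than random draws.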

Atmospheric Radiation Measurement (ARM) | Reviewed March 2025