Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy

Chongchong Qi, Min Zhou, Qiusong Chen, Tao Hu

Research output: Contribution to journal, Article, peer-review

Abstract

Purpose: Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management.

Materials and methods: Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model.

Results and discussions: The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds.

Conclusions: This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations.
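The abstract names two steps that are easy to make concrete: smoothing-plus-derivative preprocessing of a spectrum, and scoring a classifier by area under the ROC curve. The following is a minimal, dependency-free sketch of both, not the authors' pipeline. The abstract does not say which smoothing filter or derivative order was used, so the centered moving average, the forward-difference first derivative, the 0.5 nm step, and all function names here are assumptions made for illustration.

```python
from typing import List, Sequence


def moving_average(spectrum: Sequence[float], window: int = 5) -> List[float]:
    """Smooth a spectrum with a centered moving average (window must be odd).

    Assumption: the study's actual smoothing method is not stated in the abstract.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        # Shrink the window at the edges instead of padding the spectrum.
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(spectrum: Sequence[float], step_nm: float = 0.5) -> List[float]:
    """Approximate d(reflectance)/d(wavelength) with forward differences.

    Assumption: a uniform wavelength step; 0.5 nm is a placeholder value.
    """
    return [(b - a) / step_nm for a, b in zip(spectrum, spectrum[1:])]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    raw = [0.30, 0.32, 0.31, 0.35, 0.34, 0.38]
    features = first_derivative(moving_average(raw, window=3))
    print(len(features))  # one fewer value than the input spectrum
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice the preprocessed spectra would be fed to a tree-based classifier such as XGBoost, LightGBM, or random forest, and the classifier's predicted probabilities would take the place of the toy scores passed to `roc_auc`.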

Original language: English
Pages (from-to): 3668-3683
Number of pages: 16
Journal: Journal of Soils and Sediments
Volume: 24
Issue number: 11
Early online date: 14 Oct 2024
Publication status: Published - Nov 2024
