TY - JOUR
T1 - Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy
AU - Qi, Chongchong
AU - Zhou, Min
AU - Chen, Qiusong
AU - Hu, Tao
PY - 2024/11
Y1 - 2024/11
N2 - Purpose: Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods: Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions: The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions: This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations.
AB - Purpose: Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods: Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions: The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions: This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations.
KW - LightGBM
KW - Manganese
KW - Random forest
KW - Visible-near infrared spectroscopy
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85206392667&partnerID=8YFLogxK
U2 - 10.1007/s11368-024-03914-7
DO - 10.1007/s11368-024-03914-7
M3 - Article
AN - SCOPUS:85206392667
SN - 1439-0108
VL - 24
SP - 3668
EP - 3683
JO - Journal of Soils and Sediments
JF - Journal of Soils and Sediments
IS - 11
ER -