TY - JOUR
T1 - A Comprehensive Comparison of Machine Learning and Feature Selection Methods for Maize Biomass Estimation Using Sentinel-1 SAR, Sentinel-2 Vegetation Indices, and Biophysical Variables
AU - Xu, Chi
AU - Ding, Yanling
AU - Zheng, Xingming
AU - Wang, Yeqiao
AU - Zhang, Rui
AU - Zhang, Hongyan
AU - Dai, Zewen
AU - Xie, Qiaoyun
PY - 2022/8/20
Y1 - 2022/8/20
AB - Rapid and accurate estimation of maize biomass is critical for predicting crop productivity. The Sentinel-1 (S-1) synthetic aperture radar (SAR) and Sentinel-2 (S-2) missions offer a new opportunity to map biomass. The selection of appropriate predictor variables is crucial for improving the accuracy of biomass estimation. We developed models from SAR polarization indices, vegetation indices (VIs), and biophysical variables (BPVs) based on Gaussian process regression (GPR) and random forest (RF) with feature optimization to retrieve maize biomass in Changchun, Jilin Province, Northeastern China. Three new predictors, one from each type of remote sensing data, were proposed based on their correlations with biomass measured in June, July, and August 2018. The results showed that a predictor combining vertical-vertical polarization (VV), vertical-horizontal polarization (VH), and the difference between VH and VV (VH-VV), derived from S-1 images of June, July, and August, respectively, with GPR and RF, provided a more accurate estimation of biomass (R² = 0.81–0.83, RMSE = 0.40–0.41 kg/m²) than models based on single SAR polarization indices, their combinations, or optimized features (R² = 0.04–0.39, RMSE = 0.84–1.08 kg/m²). Among the S-2 VIs, the GPR model using a combination of the ratio vegetation index (RVI) of June, the normalized difference infrared index (NDII) of July, and the normalized difference vegetation index (NDVI) of August achieved R² = 0.83 and RMSE = 0.39 kg/m², much better than single VIs, their combinations, or optimized features (R² = 0.31–0.77, RMSE = 0.47–0.87 kg/m²). A BPV predictor combining leaf chlorophyll content (CAB) in June, canopy water content (CWC) in July, and fractional vegetation cover (FCOVER) in August, with RF, yielded the highest accuracy (R² = 0.85, RMSE = 0.38 kg/m²) compared with single BPVs, their combinations, or the optimized subset. Overall, the three combined predictors contributed significantly to improving the accuracy of biomass estimation with the GPR and RF methods. This study offers new insights into the application of S-1 and S-2 data to maize biomass modeling.
KW - biophysical variables
KW - feature optimization
KW - Gaussian process regression
KW - maize biomass
KW - polarization indices
KW - random forest
KW - Sentinel-1
KW - Sentinel-2
KW - vegetation indices
UR - http://www.scopus.com/inward/record.url?scp=85137817874&partnerID=8YFLogxK
U2 - 10.3390/rs14164083
DO - 10.3390/rs14164083
M3 - Article
SN - 2072-4292
VL - 14
JO - Remote Sensing
JF - Remote Sensing
IS - 16
M1 - 4083
ER -