Impact of ecological redundancy on the performance of machine learning classifiers in vegetation mapping

Paul D. Macintyre, Adriaan Van Niekerk, Mark P. Dobrowolski, James L. Tsakalos, Ladislav Mucina

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

Vegetation maps are models of the real vegetation patterns and are considered important tools in conservation and management planning. Maps created through traditional methods can be expensive and time-consuming; thus, new, more efficient approaches are needed. The prediction of vegetation patterns using machine learning shows promise, but many factors may affect its performance. One important factor is the nature of the vegetation–environment relationship assessed and ecological redundancy. We used two datasets with known ecological redundancy levels (strength of the vegetation–environment relationship) to evaluate the performance of four machine learning (ML) classifiers (classification trees, random forests, support vector machines, and nearest neighbor). These models used climatic and soil variables as environmental predictors with pretreatment of the datasets (principal component analysis and feature selection) and involved three spatial scales. We show that the ML classifiers produced more reliable results in regions where the vegetation–environment relationship is stronger as opposed to regions characterized by redundant vegetation patterns. The pretreatment of datasets and reduction in prediction scale had a substantial influence on the predictive performance of the classifiers. The use of ML classifiers to create potential vegetation maps shows promise as a more efficient way of vegetation modeling. The difference in performance between areas with poorly versus well-structured vegetation–environment relationships shows that some level of understanding of the ecology of the target region is required prior to their application. Even in areas with poorly structured vegetation–environment relationships, it is possible to improve classifier performance by either pretreating the dataset or reducing the spatial scale of the predictions.

Original language: English
Pages (from-to): 6728-6737
Number of pages: 10
Journal: Ecology and Evolution
Volume: 8
Issue number: 13
DOI: 10.1002/ece3.4176
Publication status: Published - 1 Jul 2018

Fingerprint

vegetation mapping
artificial intelligence
vegetation
prediction
pretreatment
machine learning
principal component analysis
planning
ecology
taxonomy
modeling
soil

Cite this

Macintyre, Paul D. ; Van Niekerk, Adriaan ; Dobrowolski, Mark P. ; Tsakalos, James L. ; Mucina, Ladislav. / Impact of ecological redundancy on the performance of machine learning classifiers in vegetation mapping. In: Ecology and Evolution. 2018 ; Vol. 8, No. 13. pp. 6728-6737.
@article{a65a98e4ca0b42e4b7721522919f41fc,
title = "Impact of ecological redundancy on the performance of machine learning classifiers in vegetation mapping",
abstract = "Vegetation maps are models of the real vegetation patterns and are considered important tools in conservation and management planning. Maps created through traditional methods can be expensive and time-consuming, thus, new more efficient approaches are needed. The prediction of vegetation patterns using machine learning shows promise, but many factors may impact on its performance. One important factor is the nature of the vegetation–environment relationship assessed and ecological redundancy. We used two datasets with known ecological redundancy levels (strength of the vegetation–environment relationship) to evaluate the performance of four machine learning (ML) classifiers (classification trees, random forests, support vector machines, and nearest neighbor). These models used climatic and soil variables as environmental predictors with pretreatment of the datasets (principal component analysis and feature selection) and involved three spatial scales. We show that the ML classifiers produced more reliable results in regions where the vegetation–environment relationship is stronger as opposed to regions characterized by redundant vegetation patterns. The pretreatment of datasets and reduction in prediction scale had a substantial influence on the predictive performance of the classifiers. The use of ML classifiers to create potential vegetation maps shows promise as a more efficient way of vegetation modeling. The difference in performance between areas with poorly versus well-structured vegetation–environment relationships shows that some level of understanding of the ecology of the target region is required prior to their application. Even in areas with poorly structured vegetation–environment relationships, it is possible to improve classifier performance by either pretreating the dataset or reducing the spatial scale of the predictions.",
keywords = "functional redundancy, machine learning, predictive modeling, predictive vegetation mapping, vegetation patterns, vegetation–environment relationship",
author = "Macintyre, {Paul D.} and {Van Niekerk}, Adriaan and Dobrowolski, {Mark P.} and Tsakalos, {James L.} and Mucina, Ladislav",
year = "2018",
month = "7",
day = "1",
doi = "10.1002/ece3.4176",
language = "English",
volume = "8",
pages = "6728--6737",
journal = "Ecology and Evolution",
issn = "2045-7758",
publisher = "John Wiley \& Sons",
number = "13",
}

Impact of ecological redundancy on the performance of machine learning classifiers in vegetation mapping. / Macintyre, Paul D.; Van Niekerk, Adriaan; Dobrowolski, Mark P.; Tsakalos, James L.; Mucina, Ladislav.

In: Ecology and Evolution, Vol. 8, No. 13, 01.07.2018, p. 6728-6737.

TY  - JOUR
T1  - Impact of ecological redundancy on the performance of machine learning classifiers in vegetation mapping
AU  - Macintyre, Paul D.
AU  - Van Niekerk, Adriaan
AU  - Dobrowolski, Mark P.
AU  - Tsakalos, James L.
AU  - Mucina, Ladislav
PY  - 2018/7/1
Y1  - 2018/7/1
AB  - Vegetation maps are models of the real vegetation patterns and are considered important tools in conservation and management planning. Maps created through traditional methods can be expensive and time-consuming, thus, new more efficient approaches are needed. The prediction of vegetation patterns using machine learning shows promise, but many factors may impact on its performance. One important factor is the nature of the vegetation–environment relationship assessed and ecological redundancy. We used two datasets with known ecological redundancy levels (strength of the vegetation–environment relationship) to evaluate the performance of four machine learning (ML) classifiers (classification trees, random forests, support vector machines, and nearest neighbor). These models used climatic and soil variables as environmental predictors with pretreatment of the datasets (principal component analysis and feature selection) and involved three spatial scales. We show that the ML classifiers produced more reliable results in regions where the vegetation–environment relationship is stronger as opposed to regions characterized by redundant vegetation patterns. The pretreatment of datasets and reduction in prediction scale had a substantial influence on the predictive performance of the classifiers. The use of ML classifiers to create potential vegetation maps shows promise as a more efficient way of vegetation modeling. The difference in performance between areas with poorly versus well-structured vegetation–environment relationships shows that some level of understanding of the ecology of the target region is required prior to their application. Even in areas with poorly structured vegetation–environment relationships, it is possible to improve classifier performance by either pretreating the dataset or reducing the spatial scale of the predictions.
KW  - functional redundancy
KW  - machine learning
KW  - predictive modeling
KW  - predictive vegetation mapping
KW  - vegetation patterns
KW  - vegetation–environment relationship
UR  - http://www.scopus.com/inward/record.url?scp=85050183231&partnerID=8YFLogxK
U2  - 10.1002/ece3.4176
DO  - 10.1002/ece3.4176
M3  - Article
VL  - 8
SP  - 6728
EP  - 6737
JO  - Ecology and Evolution
JF  - Ecology and Evolution
SN  - 2045-7758
IS  - 13
ER  -