The applied impact of 'naïve' statistical modelling of clustered observations of motion data in injury biomechanics research

Jonathan M D Staynor, Sean D Byrne, Jacqueline A Alderson, Cyril J Donnelly

Research output: Contribution to journal › Article

Abstract

OBJECTIVES: Appropriate statistical analysis of clustered data necessitates accounting for within-participant effects to ensure results are repeatable and translatable to real-world applications. This study aimed to compare statistical output and injury risk interpretation differences from two statistical regression models built from a clinical movement sidestepping database. A "naïve" regression model, which does not account for within-participant effects, was compared with an appropriately applied mixed effects model.

DESIGN: Comparative study.

METHODS: Three-dimensional unplanned sidestepping joint angle data (trunk, hip, and knee) from 35 males (112 observations) were used to model peak knee valgus moments and anterior cruciate ligament injury risk during the impact phase of stance. Both statistical models were cross-validated using a k-fold analysis.
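The abstract does not say how the k folds were formed. As a minimal sketch, assuming participant-wise folds (all of one participant's trials are held out together) and an ordinary least-squares fit, a cross-validated "predicted R2" could be computed as below. The function names, the value k=5, and the synthetic data are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def participant_folds(groups, k, seed=0):
    """Yield (train_idx, test_idx); each participant lands in exactly one test fold."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(groups))
    for fold_ids in np.array_split(ids, k):
        in_test = np.isin(groups, fold_ids)
        yield np.flatnonzero(~in_test), np.flatnonzero(in_test)

def fit_ols(X, y):
    """Least-squares coefficients with an intercept in position 0."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict(X, beta):
    return np.column_stack([np.ones(X.shape[0]), X]) @ beta

def in_sample_r2(X, y):
    resid = y - predict(X, fit_ols(X, y))
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

def predicted_r2(X, y, groups, k=5, seed=0):
    """1 - PRESS/SST, where PRESS sums squared errors on held-out participants."""
    press = 0.0
    for train, test in participant_folds(groups, k, seed):
        beta = fit_ols(X[train], y[train])
        press += float(np.sum((y[test] - predict(X[test], beta)) ** 2))
    return 1.0 - press / float(np.sum((y - y.mean()) ** 2))

# Synthetic stand-in: 35 participants, 112 observations, 3 predictors.
rng = np.random.default_rng(1)
groups = np.arange(112) % 35                      # 3 or 4 trials per participant
X = rng.normal(size=(112, 3))
participant_offset = rng.normal(size=35)[groups]  # shared within a participant
y = X @ np.array([0.8, -0.4, 0.2]) + participant_offset + rng.normal(scale=0.5, size=112)

print("in-sample R2:", round(in_sample_r2(X, y), 2))
print("predicted R2:", round(predicted_r2(X, y, groups), 2))
```

Holding out whole participants keeps trials from the same person out of both the training and test sets, which is one way to stop within-participant similarity from flattering the cross-validated fit.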

RESULTS: The naïve regression returned inflated goodness of fit statistics (R²=0.50), which was evident following cross-validation (predicted R²=0.43). Following cross-validation, the mixed effects model (predicted R²=0.40) explained a similar amount of variance, despite containing three fewer predictors. The naïve model produced inaccurate parameter estimates, overestimating the effects of certain kinematic parameters by as much as 79%.
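The abstract does not state how the 79 % overestimate was calculated. One plausible reading, sketched here as an assumption, is the relative difference between the naïve coefficient and the mixed effects coefficient for the same predictor. The coefficient values in the example are hypothetical; only the R2 values 0.50 and 0.43 come from the results above.

```python
def percent_overestimate(naive_coef, reference_coef):
    """Relative size difference of the naive coefficient versus the reference, in percent."""
    if reference_coef == 0:
        raise ValueError("reference coefficient must be non-zero")
    return 100.0 * (abs(naive_coef) - abs(reference_coef)) / abs(reference_coef)

def r2_shrinkage(in_sample_r2, predicted_r2):
    """Drop in R2 when moving from the fitted data to held-out data."""
    return in_sample_r2 - predicted_r2

# Hypothetical coefficients chosen so the naive estimate is 79 % too large.
print(percent_overestimate(1.79, 1.00))

# Reported naive-model values: R2 = 0.50 in sample, 0.43 after cross-validation.
print(round(r2_shrinkage(0.50, 0.43), 2))
```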

CONCLUSIONS: A regression model naïvely applied to clustered observations of sidestepping data resulted in erroneous parameter estimates and goodness of fit statistics which have the potential to mislead future research and real-world applications. It is important for sport and clinical scientists to use statistically appropriate mixed effects models when modelling clustered motion capture data for injury biomechanics research to protect the translatability of the findings.
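To illustrate why ignoring clustering can distort parameter estimates, the sketch below compares a pooled ("naïve") slope with a slope estimated after centring each participant's data on their own mean. Within-participant centring is a simpler stand-in for a mixed effects model, not the model used in the study, and the data are synthetic: each simulated participant has a personal offset that raises both the predictor and the outcome.

```python
import numpy as np

def pooled_slope(x, y):
    """OLS slope that treats every observation as independent (the naive fit)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def within_slope(x, y, groups):
    """Slope after centring x and y on each participant's own mean."""
    xc = x.astype(float).copy()
    yc = y.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        xc[mask] -= xc[mask].mean()
        yc[mask] -= yc[mask].mean()
    return float(xc @ yc / (xc @ xc))

# Synthetic data (not the study's): 35 participants, 4 trials each.
rng = np.random.default_rng(0)
n_participants, n_trials = 35, 4
groups = np.repeat(np.arange(n_participants), n_trials)
offset = rng.normal(0.0, 1.0, n_participants)   # participant-level effect
x = offset[groups] + rng.normal(0.0, 1.0, groups.size)
true_slope = 0.5
y = true_slope * x + offset[groups] + rng.normal(0.0, 0.3, groups.size)

print("true slope:  ", true_slope)
print("pooled slope:", round(pooled_slope(x, y), 2))
print("within slope:", round(within_slope(x, y, groups), 2))
```

In this simulation the pooled slope absorbs the between-participant offset and comes out larger than the true within-participant effect, which is the kind of overestimation the conclusions warn about.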

Original language: English
Pages (from-to): 420-424
Number of pages: 5
Journal: Journal of Science and Medicine in Sport
Volume: 22
Issue number: 4
Early online date: 26 Oct 2018
DOI: 10.1016/j.jsams.2018.10.006
Publication status: Published - Apr 2019

Fingerprint

Statistical Models
Biomechanical Phenomena
Knee
Statistical Data Interpretation
Wounds and Injuries
Research
Sports
Hip
Joints
Regression Analysis
Databases
Anterior Cruciate Ligament Injuries

Cite this

@article{9757473f421e49be8ff0ec5816d2cb49,
title = "The applied impact of 'na{\"i}ve' statistical modelling of clustered observations of motion data in injury biomechanics research",
abstract = "OBJECTIVES: Appropriate statistical analysis of clustered data necessitates accounting for within-participant effects to ensure results are repeatable and translatable to real-world applications. This study aimed to compare statistical output and injury risk interpretation differences from two statistical regression models built from a clinical movement sidestepping database. A {"}na{\"i}ve{"} regression model, which does not account for within-participant effects, was compared with an appropriately applied mixed effects model. DESIGN: Comparative study. METHODS: Three-dimensional unplanned sidestepping joint angle data (trunk, hip, and knee) from 35 males (112 observations) were used to model peak knee valgus moments and anterior cruciate ligament injury risk during the impact phase of stance. Both statistical models were cross-validated using a k-fold analysis. RESULTS: The na{\"i}ve regression returned inflated goodness of fit statistics ($R^{2}$=0.50), which was evident following cross-validation (predicted $R^{2}$=0.43). Following cross-validation, the mixed effects model (predicted $R^{2}$=0.40) explained a similar amount of variance, despite containing three fewer predictors. The na{\"i}ve model produced inaccurate parameter estimates, overestimating the effects of certain kinematic parameters by as much as 79{\%}. CONCLUSIONS: A regression model na{\"i}vely applied to clustered observations of sidestepping data resulted in erroneous parameter estimates and goodness of fit statistics which have the potential to mislead future research and real-world applications. It is important for sport and clinical scientists to use statistically appropriate mixed effects models when modelling clustered motion capture data for injury biomechanics research to protect the translatability of the findings.",
author = "Staynor, {Jonathan M D} and Byrne, {Sean D} and Alderson, {Jacqueline A} and Donnelly, {Cyril J}",
note = "Copyright {\textcopyright} 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.",
year = "2019",
month = "4",
doi = "10.1016/j.jsams.2018.10.006",
language = "English",
volume = "22",
pages = "420--424",
journal = "Journal of Science and Medicine in Sport",
issn = "1440-2440",
publisher = "Elsevier",
number = "4",

}

TY - JOUR

T1 - The applied impact of 'naïve' statistical modelling of clustered observations of motion data in injury biomechanics research

AU - Staynor, Jonathan M D

AU - Byrne, Sean D

AU - Alderson, Jacqueline A

AU - Donnelly, Cyril J

N1 - Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

PY - 2019/4

Y1 - 2019/4

AB - OBJECTIVES: Appropriate statistical analysis of clustered data necessitates accounting for within-participant effects to ensure results are repeatable and translatable to real-world applications. This study aimed to compare statistical output and injury risk interpretation differences from two statistical regression models built from a clinical movement sidestepping database. A "naïve" regression model, which does not account for within-participant effects, was compared with an appropriately applied mixed effects model. DESIGN: Comparative study. METHODS: Three-dimensional unplanned sidestepping joint angle data (trunk, hip, and knee) from 35 males (112 observations) were used to model peak knee valgus moments and anterior cruciate ligament injury risk during the impact phase of stance. Both statistical models were cross-validated using a k-fold analysis. RESULTS: The naïve regression returned inflated goodness of fit statistics (R²=0.50), which was evident following cross-validation (predicted R²=0.43). Following cross-validation, the mixed effects model (predicted R²=0.40) explained a similar amount of variance, despite containing three fewer predictors. The naïve model produced inaccurate parameter estimates, overestimating the effects of certain kinematic parameters by as much as 79%. CONCLUSIONS: A regression model naïvely applied to clustered observations of sidestepping data resulted in erroneous parameter estimates and goodness of fit statistics which have the potential to mislead future research and real-world applications. It is important for sport and clinical scientists to use statistically appropriate mixed effects models when modelling clustered motion capture data for injury biomechanics research to protect the translatability of the findings.

U2 - 10.1016/j.jsams.2018.10.006

DO - 10.1016/j.jsams.2018.10.006

M3 - Article

VL - 22

SP - 420

EP - 424

JO - Journal of Science and Medicine in Sport

JF - Journal of Science and Medicine in Sport

SN - 1440-2440

IS - 4

ER -