Predicting language difficulties in middle childhood from early developmental milestones: A comparison of traditional regression and machine learning techniques

Rebecca Armstrong, Martyn Symons, James G. Scott, Wendy L. Arnott, David A. Copland, Katie L. McMahon, Andrew J.O. Whitehouse

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Purpose: The current study aimed to compare traditional logistic regression models with machine learning algorithms to investigate the predictive ability of (a) communication performance at 3 years old on language outcomes at 10 years old and (b) broader developmental skills (motor, social, and adaptive) at 3 years old on language outcomes at 10 years old. Method: Participants (N = 1,322) were drawn from the Western Australian Pregnancy Cohort (Raine) Study (Straker et al., 2017). A general developmental screener, the Infant Monitoring Questionnaire (Squires, Bricker, & Potter, 1990), was completed by caregivers at the 3-year follow-up. Language ability at 10 years old was assessed using the Clinical Evaluation of Language Fundamentals–Third Edition (Semel, Wiig, & Secord, 1995). Logistic regression models and interpretable machine learning algorithms were used to assess predictive abilities of early developmental milestones for later language outcomes. Results: Overall, the findings showed that prediction accuracies were comparable between logistic regression and machine learning models using communication-only performance as well as performance on communication and broader developmental domains to predict language performance at 10 years old. Decision trees are incorporated to visually present these findings but must be interpreted with caution because of the poor accuracy of the models overall. Conclusions: The current study provides preliminary evidence that machine learning algorithms provide equivalent predictive accuracy to traditional methods. Furthermore, the inclusion of broader developmental skills did not improve predictive capability. Assessment of language at more than 1 time point is necessary to ensure children whose language delays emerge later are identified and supported.

Original language: English
Pages (from-to): 1926-1944
Number of pages: 19
Journal: Journal of Speech, Language, and Hearing Research
Volume: 61
Issue number: 8
DOI: 10.1044/2018_JSLHR-L-17-0210
Publication status: Published - 1 Aug 2018

Fingerprint

Language
Childhood
Regression
Logistic Models
Aptitude
Learning
Communication
Logistics
Performance
Ability
Language Development Disorders
Child Language
Decision Trees
Motor Skills
Machine Learning
Language Difficulties
Caregivers

Cite this

@article{cd396cfb98764777a5f48b1baf40fa5e,
title = "Predicting language difficulties in middle childhood from early developmental milestones: A comparison of traditional regression and machine learning techniques",
author = "Armstrong, Rebecca and Symons, Martyn and Scott, {James G.} and Arnott, {Wendy L.} and Copland, {David A.} and McMahon, {Katie L.} and Whitehouse, {Andrew J.O.}",
year = "2018",
month = "8",
day = "1",
doi = "10.1044/2018_JSLHR-L-17-0210",
language = "English",
volume = "61",
pages = "1926--1944",
journal = "Journal of Speech, Language, and Hearing Research",
issn = "1092-4388",
publisher = "American Speech-Language-Hearing Association",
number = "8",
}

Predicting language difficulties in middle childhood from early developmental milestones: A comparison of traditional regression and machine learning techniques. / Armstrong, Rebecca; Symons, Martyn; Scott, James G.; Arnott, Wendy L.; Copland, David A.; McMahon, Katie L.; Whitehouse, Andrew J.O.

In: Journal of Speech, Language, and Hearing Research, Vol. 61, No. 8, 01.08.2018, p. 1926-1944.

TY - JOUR

T1 - Predicting language difficulties in middle childhood from early developmental milestones

T2 - A comparison of traditional regression and machine learning techniques

AU - Armstrong, Rebecca

AU - Symons, Martyn

AU - Scott, James G.

AU - Arnott, Wendy L.

AU - Copland, David A.

AU - McMahon, Katie L.

AU - Whitehouse, Andrew J.O.

PY - 2018

Y1 - 2018/8/1

AB - Purpose: The current study aimed to compare traditional logistic regression models with machine learning algorithms to investigate the predictive ability of (a) communication performance at 3 years old on language outcomes at 10 years old and (b) broader developmental skills (motor, social, and adaptive) at 3 years old on language outcomes at 10 years old. Method: Participants (N = 1,322) were drawn from the Western Australian Pregnancy Cohort (Raine) Study (Straker et al., 2017). A general developmental screener, the Infant Monitoring Questionnaire (Squires, Bricker, & Potter, 1990), was completed by caregivers at the 3-year follow-up. Language ability at 10 years old was assessed using the Clinical Evaluation of Language Fundamentals–Third Edition (Semel, Wiig, & Secord, 1995). Logistic regression models and interpretable machine learning algorithms were used to assess predictive abilities of early developmental milestones for later language outcomes. Results: Overall, the findings showed that prediction accuracies were comparable between logistic regression and machine learning models using communication-only performance as well as performance on communication and broader developmental domains to predict language performance at 10 years old. Decision trees are incorporated to visually present these findings but must be interpreted with caution because of the poor accuracy of the models overall. Conclusions: The current study provides preliminary evidence that machine learning algorithms provide equivalent predictive accuracy to traditional methods. Furthermore, the inclusion of broader developmental skills did not improve predictive capability. Assessment of language at more than 1 time point is necessary to ensure children whose language delays emerge later are identified and supported.

UR - http://www.scopus.com/inward/record.url?scp=85051464790&partnerID=8YFLogxK

U2 - 10.1044/2018_JSLHR-L-17-0210

DO - 10.1044/2018_JSLHR-L-17-0210

M3 - Article

VL - 61

SP - 1926

EP - 1944

JO - Journal of Speech, Language, and Hearing Research

JF - Journal of Speech, Language, and Hearing Research

SN - 1092-4388

IS - 8

ER -