On the Misidentification of Species: Sampling Error in Primates and Other Mammals Using Geometric Morphometrics in More Than 4000 Individuals

Andrea Cardini, Sarah Elton, Kris Kovarovic, Una Strand Viđarsdóttir, P. David Polly

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)


An accurate classification is the basis for research in biology. Morphometrics and morphospecies play an important role in modern taxonomy, with geometric morphometrics increasingly applied as a favourite analytical tool. Yet, really large samples are seldom available for modern species and even less common in palaeontology, where morphospecies are often identified, described and compared using just one or a very few specimens. The impact of sampling error and how large a sample must be to mitigate the inaccuracy are important questions for morphometrics and taxonomy. Using more than 4000 crania of adult mammals and taxa representing each of the four placental superorders, we assess the impacts of sampling error on estimates of species means, variances and covariances in Procrustes shape data using resampling experiments. In each group of closely related species (mostly congeneric), we found that a species can be identified fairly accurately even when means are based on relatively small samples, although errors are frequent with fewer specimens and primates are more prone to inaccuracies. A precise reconstruction of similarity relationships, in contrast, sometimes requires very large samples (> 100), but this varies widely depending on the study group. Medium-sized samples are necessary to accurately estimate standard errors of mean shapes or intraspecific variance-covariance structure, but in this case minimum sample sizes are broadly similar across all groups (≈ 20–50 individuals). Overall, the minimum sample size required for a study varies across taxa and depends on what is being assessed, but about 25–40 specimens (for each sex, if a species is sexually dimorphic) may be on average an adequate and attainable minimum sample size for estimating the most commonly used shape parameters.
As expected, the best predictor of the effects of sampling error is the ratio of between- to within-species variation: the larger the ratio, the smaller the sample size needed to obtain the same level of accuracy. Even though ours is the largest study to date of the uncertainties in estimates of means, variances and covariances in geometric morphometrics, and despite its generally high congruence with previous analyses, we feel it would be premature to generalize. Clearly, there is no a priori answer for what minimum sample size is required for a particular study and no universal recipe to control for sampling error. Exploratory analyses using resampling experiments are thus desirable, easy to perform and yield powerful preliminary clues about the effect of sampling on parameter estimates in comparative studies of morphospecies, and in a variety of other morphometric applications in biology and medicine. Morphospecies descriptions are indeed a small piece of provisional evidence in a much more complex evolutionary puzzle. However, they are crucial in palaeontology, and provide important complementary evidence in modern integrative taxonomy. Thus, if taxonomy provides the bricks for accurate research in biology, understanding the robustness of these bricks is the first fundamental step to build scientific knowledge on sound, stable and long-lasting foundations.
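The kind of exploratory resampling experiment recommended above can be sketched in a few lines. The following is a minimal illustration only, not the authors' protocol: it assumes shape coordinates have already been Procrustes-aligned and flattened into one row per specimen, it uses plain Euclidean distance between mean shapes, and the function names (`simulate_species`, `misidentification_rate`) and the simulated data are hypothetical.

```python
import numpy as np


def simulate_species(means, n_spec, within_sd, rng):
    """Hypothetical toy data: one (specimens x variables) array per species,
    drawn from an isotropic normal around each species mean."""
    return {
        f"sp{i}": rng.normal(loc=m, scale=within_sd, size=(n_spec, len(m)))
        for i, m in enumerate(means)
    }


def misidentification_rate(groups, n, n_rep=500, seed=0):
    """Fraction of random subsamples of size n whose mean shape lies closest
    to the full-sample mean of a species other than their own.

    groups: dict mapping species name -> (specimens x variables) array of
            already-aligned shape coordinates (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    names = list(groups)
    # Full-sample means stand in for the "true" species means.
    full_means = np.stack([groups[k].mean(axis=0) for k in names])
    errors = 0
    total = 0
    for i, name in enumerate(names):
        data = groups[name]
        for _ in range(n_rep):
            # Draw n specimens without replacement and average them.
            idx = rng.choice(len(data), size=n, replace=False)
            sub_mean = data[idx].mean(axis=0)
            # Assign the subsample mean to the nearest full-sample mean.
            dists = np.linalg.norm(full_means - sub_mean, axis=1)
            errors += int(np.argmin(dists) != i)
            total += 1
    return errors / total


# Toy run: two species whose means differ slightly along a single variable.
demo_rng = np.random.default_rng(42)
demo_means = [np.zeros(10), np.r_[0.5, np.zeros(9)]]
demo = simulate_species(demo_means, n_spec=200, within_sd=1.0, rng=demo_rng)
for size in (1, 5, 25, 50):
    rate = misidentification_rate(demo, n=size, n_rep=200)
    print(f"n = {size:2d}: misidentification rate = {rate:.3f}")
```

Increasing the separation between the simulated means relative to `within_sd` mimics a larger ratio of between- to within-species variation, so smaller subsamples should then suffice for the same accuracy. Because each subsample also contributes to the full-sample mean it is compared against, the rates from this sketch are somewhat optimistic.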

Original language: English
Pages (from-to): 190-220
Number of pages: 31
Journal: Evolutionary Biology
Issue number: 2
Early online date: 26 Feb 2021
Publication status: Published - Jun 2021

