Introduction: Video otoscopy plays an important role in improving access to ear health services. This study investigated the clinician-rated quality of video otoscopy recordings and still images, and compared their suitability for asynchronous diagnosis of middle-ear disease. Methods: Two hundred and eighty video otoscopy image–recording pairs were collected from 150 children (aged six months to 15 years) by an ear, nose, and throat (ENT) specialist, audiologists, and trained research assistants, and were independently rated by an audiologist and an ENT surgeon. For both still images and recordings, the clinicians rated cerumen amount, field of view, quality, focus, and light on a five-point scale, gave an overall rating, and were asked whether they could make an accurate diagnosis. Results: More video otoscopy recordings than still images were rated as ‘good’ or ‘excellent’ across all domains. The mean difference between the ratings of the two otoscopic procedures was significant in all domains (p < 0.05) except ‘cerumen amount’. Suitability for making a diagnosis improved significantly when recordings were used (p < 0.05). Younger participant age had a significant negative impact on the ratings across all domains (p < 0.03). The role of the tester conducting video otoscopy did not have a significant impact on the ratings. Discussion: Compared with still images, video otoscopy recordings provided clearer views of the tympanic membrane and increased the ability to make a diagnosis, for both audiologists and ENT surgeons. Research assistants with limited practice were able to obtain video otoscopy images and recordings comparable to those obtained by clinicians.