Background: Malignant melanoma can most successfully be cured when diagnosed at an early stage in the natural history. However, there is controversy over screening programs and many advocate screening only for high-risk individuals.
Objectives: This study aimed to evaluate the accuracy of an artificial intelligence neural network (Deep Ensemble for Recognition of Melanoma [DERM]) to identify malignant melanoma from dermoscopic images of pigmented skin lesions and to show how this compared to doctors' performance assessed by meta-analysis.
Methods: DERM was trained and tested using 7,102 dermoscopic images of both histologically confirmed melanoma (24%) and benign pigmented lesions (76%). A meta-analysis was conducted of studies examining the accuracy of naked-eye examination, with or without dermoscopy, by specialist and general physicians whose clinical diagnosis was compared to histopathology. The meta-analysis was based on evaluation of 32,226 pigmented lesions including 3,277 histopathology-confirmed malignant melanoma cases. The receiver operating characteristic (ROC) curve was used to examine and compare the diagnostic accuracy.
Results: DERM achieved a ROC area under the curve (AUC) of 0.93 (95% confidence interval: 0.92-0.94), and sensitivity and specificity of 85.0% and 85.3%, respectively. Avoidance of false-negative results is essential, so different decision thresholds were examined. At 95% sensitivity DERM achieved a specificity of 64.1% and at 95% specificity the sensitivity was 67%. The meta-analysis showed primary care physicians (10 studies) achieve an AUC of 0.83 (95% confidence interval: 0.79-0.86), with sensitivity and specificity of 79.9% and 70.9%; and dermatologists (92 studies) 0.91 (0.88-0.93), 87.5%, and 81.4%, respectively.
Conclusions: DERM has the potential to be used as a decision support tool in primary care, by providing dermatologist-grade recommendation on the likelihood of malignant melanoma.