As an initial phase of the Endophthalmitis Population Study of Western Australia (EPSWA), this paper reports the results of an intensive comparative validation, against external sources, of all possible surgery-related endophthalmitis cases identified for the period from 1980 to June 1999 in the Hospital Morbidity Data System (HMDS) of the WA Record Linkage Project. The external sources were the microbiology and anaesthetic databases of Royal Perth Hospital (where most of the cases of endophthalmitis were treated) and the logbooks of two vitreoretinal surgeons who treated endophthalmitis in Perth over the study period. Because a large proportion of all cases coded with endophthalmitis were found not to have had any ocular surgery, the validation also included a sample of these cases. The purpose of validating them was to ensure that our count of postoperative endophthalmitis had not excluded any cases whose surgery might not have been recorded in the HMDS database. It was also intended to provide an estimate of all miscoded endophthalmitis cases as a first step towards future improvement of coding accuracy. Since we suspected that phaco-emulsification was under-coded, we also examined a sample of cataract procedures. Of all surgery-related endophthalmitis cases coded in the HMDS, only 50.9% (274 of 538) were found to be valid cases. External sources identified 83 cases of endophthalmitis. Of these, 49 did not have endophthalmitis codes but were in the HMDS file with an associated code. Of the remaining externally identified cases, 13 were missing altogether from the HMDS file (7 of these were correctly coded in the notes, while the other 6 were coded with associated codes), and 21 were diagnosed after the date the HMDS file was extracted.
The validation of a random sample of the non-surgery-related cases coded with endophthalmitis suggested that the vast majority of them were miscoded (88%, 139 of 158 sampled from 1474 cases). The systematic coding errors reported in this paper may be attributed to both the clinical and the coding departments of the hospital. In any case, coding inaccuracy is itself a serious concern for the data quality of any linked database system and for epidemiological researchers using such data. The increasing use of aggregated data in epidemiological research further underscores the importance of coding accuracy and thus of data validation. Case identification and case validation against external sources are two ways of ensuring data completeness and quality, and the validity of results.