GeoDocA – Fast analysis of geological content in mineral exploration reports: A text mining approach

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)


Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of helping geologists browse and search for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search through auto-generated keyword suggestions, which are based on keyword associations learnt by the system from all reports in the repository. While the text mining methods reported here are a foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.
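The abstract does not specify how keyword matching, report similarity, or keyword suggestions are computed, so the following is only a minimal sketch of one plausible reading. It assumes dictionary lookup on tokenised text, Jaccard overlap of keyword sets for "similar reports", and raw co-occurrence counts for "keyword associations". The keyword lists, the sample reports, and every function name are hypothetical and are not taken from GeoDocA.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical keyword dictionary. The four categories come from the
# abstract; the individual terms are illustrative assumptions.
KEYWORDS = {
    "mineral occurrence": ["gold", "copper", "nickel"],
    "rock type": ["granite", "basalt", "komatiite"],
    "alteration type": ["sericite", "chlorite"],
    "geological time": ["archaean", "proterozoic"],
}

# Hypothetical report texts, standing in for the real repository.
REPORTS = {
    "r1": "Gold mineralisation in sericite-altered granite of Archaean age.",
    "r2": "Gold hosted in granite with chlorite alteration.",
    "r3": "Nickel sulphides in komatiite.",
}


def extract_keywords(text):
    """Return the set of dictionary keywords that occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {kw for terms in KEYWORDS.values() for kw in terms if kw in tokens}


def jaccard(a, b):
    """Jaccard overlap of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_reports(target, report_keywords):
    """Rank all other reports by keyword overlap with the target report."""
    scores = [
        (other, jaccard(report_keywords[target], kws))
        for other, kws in report_keywords.items()
        if other != target
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


def cooccurrence(report_keywords):
    """Count how many reports contain each unordered keyword pair."""
    counts = Counter()
    for kws in report_keywords.values():
        for pair in combinations(sorted(kws), 2):
            counts[pair] += 1
    return counts


def suggest(keyword, counts, top_n=3):
    """Suggest the keywords that most often co-occur with the given one."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == keyword:
            related[b] += n
        elif b == keyword:
            related[a] += n
    return [kw for kw, _ in related.most_common(top_n)]


report_keywords = {rid: extract_keywords(t) for rid, t in REPORTS.items()}
pair_counts = cooccurrence(report_keywords)
```

With these toy inputs, `similar_reports("r1", report_keywords)` ranks `r2` ahead of `r3` because `r1` and `r2` share "gold" and "granite", and `suggest("gold", pair_counts, top_n=1)` returns "granite". A production system over tens of thousands of reports would likely need phrase matching, synonym handling, and a weighted association measure rather than raw counts.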

Original language: English
Article number: 102919
Journal: Ore Geology Reviews
Publication status: Published - 1 Aug 2019

