TY - JOUR
T1 - GeoDocA – Fast analysis of geological content in mineral exploration reports
T2 - A text mining approach
AU - Holden, Eun Jung
AU - Liu, Wei
AU - Horrocks, Tom
AU - Wang, Rui
AU - Wedge, Daniel
AU - Duuring, Paul
AU - Beardsmore, Trevor
PY - 2019/8/1
Y1 - 2019/8/1
N2 - Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions that are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.
AB - Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions that are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.
KW - Automated document analysis
KW - Geological text mining
KW - Mineral exploration reports
UR - http://www.scopus.com/inward/record.url?scp=85067297771&partnerID=8YFLogxK
U2 - 10.1016/j.oregeorev.2019.05.005
DO - 10.1016/j.oregeorev.2019.05.005
M3 - Article
AN - SCOPUS:85067297771
SN - 0169-1368
VL - 111
JO - Ore Geology Reviews
JF - Ore Geology Reviews
M1 - 102919
ER -