GeoDocA – Fast analysis of geological content in mineral exploration reports: A text mining approach

Research output: Contribution to journal › Article

Abstract

Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions, which are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting a fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.
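
The abstract names three keyword-driven functions: tagging reports against a keyword set, listing similar reports by shared keywords, and suggesting related keywords from learnt associations. The sketch below illustrates one simple way such functions could work. It is not the authors' implementation: the keyword terms, the sample report texts, the use of Jaccard overlap as the similarity measure, and the use of raw co-occurrence counts as the association measure are all assumptions made for illustration. Only the four category names come from the abstract.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword dictionary. The four categories follow the abstract;
# the individual terms are illustrative assumptions.
KEYWORDS = {
    "mineral occurrence": {"gold", "copper", "nickel"},
    "rock type": {"basalt", "granite", "shale"},
    "alteration type": {"sericite", "chlorite"},
    "geological time": {"archean", "proterozoic"},
}
ALL_TERMS = set().union(*KEYWORDS.values())


def extract_keywords(text):
    """Return the dictionary terms that occur as whole words in the text."""
    tokens = {t.strip(".,;:()").lower() for t in text.split()}
    return tokens & ALL_TERMS


def jaccard(a, b):
    """Fraction of keywords two reports share (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_reports(target, index, top_n=3):
    """Rank the other reports by keyword overlap with the target report."""
    scores = [(jaccard(index[target], kws), rid)
              for rid, kws in index.items() if rid != target]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))[:top_n]
    return [rid for score, rid in ranked if score > 0]


def cooccurrence(index):
    """Count how many reports mention each unordered pair of keywords."""
    counts = Counter()
    for kws in index.values():
        counts.update(combinations(sorted(kws), 2))
    return counts


def suggest(keyword, counts, top_n=3):
    """Suggest the keywords that most often co-occur with the given one."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == keyword:
            related[b] += n
        elif b == keyword:
            related[a] += n
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return [k for k, _ in ranked]


# Made-up report snippets standing in for exploration report text.
reports = {
    "R1": "Gold mineralisation hosted in Archean basalt with chlorite alteration.",
    "R2": "Archean basalt and granite; gold anomalies with sericite alteration.",
    "R3": "Proterozoic shale hosting copper occurrences.",
}
index = {rid: extract_keywords(text) for rid, text in reports.items()}

print(similar_reports("R1", index))
print(suggest("gold", cooccurrence(index)))
```

With these sample texts, R2 is the only report sharing keywords with R1, and the suggestions for "gold" are led by the terms it co-occurs with in both R1 and R2. A real system would need phrase matching, synonym handling, and a weighting scheme; none of those are specified in the abstract.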

Original language: English
Article number: 102919
Journal: Ore Geology Reviews
Volume: 111
DOI: 10.1016/j.oregeorev.2019.05.005
Publication status: Published - 1 Aug 2019

Cite this

@article{4ebb10c66ef44a5ba89a75304d79d368,
title = "GeoDocA – Fast analysis of geological content in mineral exploration reports: A text mining approach",
abstract = "Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions, which are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting a fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.",
keywords = "Automated document analysis, Geological text mining, Mineral exploration reports",
author = "Holden, {Eun Jung} and Wei Liu and Tom Horrocks and Rui Wang and Daniel Wedge and Paul Duuring and Trevor Beardsmore",
year = "2019",
month = "8",
day = "1",
doi = "10.1016/j.oregeorev.2019.05.005",
language = "English",
volume = "111",
journal = "Ore Geology Reviews",
issn = "0169-1368",
publisher = "Pergamon",

}

TY - JOUR

T1 - GeoDocA – Fast analysis of geological content in mineral exploration reports

T2 - A text mining approach

AU - Holden, Eun Jung

AU - Liu, Wei

AU - Horrocks, Tom

AU - Wang, Rui

AU - Wedge, Daniel

AU - Duuring, Paul

AU - Beardsmore, Trevor

PY - 2019/8/1

Y1 - 2019/8/1

N2 - Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions, which are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting a fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.

AB - Records of past exploration in open-file mineral exploration reports are an important source of information for mineral explorers. These reports document existing geological knowledge that may be relevant to modelling ore-forming processes in a particular area of interest. This paper presents the development of GeoDocA, a geological document analysis system that applies automated text analysis techniques with the specific aim of assisting geologists in browsing and searching for documents based on relevant geological content within a large repository of documents. GeoDocA analysed 25,419 exploration reports using a customised set of keywords pertaining to broad categories such as mineral occurrences, rock types, alteration types, and geological time. An interactive user interface was developed to facilitate visual analysis of exploration reports. For individual reports, it provides a summary of their content in graph form, a gallery of extracted figures and tables, and a list of similar reports based on shared geological keywords. In addition, it assists document search efforts through auto-generated keyword suggestions, which are based on associations of keywords learnt by the system from all reports in the repository. While the text mining methods reported here are the foundation for further development to incorporate semantic analysis towards geological knowledge extraction, the outcomes of this study demonstrate the effectiveness of automated text analysis in supporting a fast analysis of a large number of reports to identify the targeted mineral systems and their associated geological environments.

KW - Automated document analysis

KW - Geological text mining

KW - Mineral exploration reports

UR - http://www.scopus.com/inward/record.url?scp=85067297771&partnerID=8YFLogxK

U2 - 10.1016/j.oregeorev.2019.05.005

DO - 10.1016/j.oregeorev.2019.05.005

M3 - Article

VL - 111

JO - Ore Geology Reviews

JF - Ore Geology Reviews

SN - 0169-1368

M1 - 102919

ER -