A survey on extraction of causal relations from natural language text

Jie Yang, Soyeon Caren Han, Josiah Poon

Research output: Contribution to journal › Article › peer-review

68 Citations (Scopus)

Abstract

As an essential component of human cognition, cause–effect relations appear frequently in text, and curating cause–effect relations from text helps in building causal networks for predictive tasks. Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches. Each approach has its advantages and weaknesses. For example, knowledge-based methods are understandable but require extensive manual domain knowledge and have poor cross-domain applicability. Statistical machine learning methods are more automated because they rely on natural language processing (NLP) toolkits. However, feature engineering is labor-intensive, and toolkits may lead to error propagation. In the past few years, deep learning techniques have attracted substantial attention from NLP researchers because of their powerful representation learning ability and the rapid increase in computational resources. Their limitations include high computational costs and a lack of adequate annotated training data. In this paper, we conduct a comprehensive survey of causality extraction. We first introduce the primary forms of causality addressed in causality extraction: explicit intra-sentential causality, implicit causality, and inter-sentential causality. Next, we list benchmark datasets and model assessment methods for causal relation extraction. Then, we present a structured overview of the three techniques with their representative systems. Lastly, we highlight open challenges and their potential directions.
Original language: English
Pages (from-to): 1161-1186
Number of pages: 26
Journal: Knowledge & Information Systems
Volume: 64
Issue number: 5
Publication status: Published - May 2022
Externally published: Yes
