TY - JOUR
T1 - E2EET
T2 - from pipeline to end-to-end entity typing via transformer-based embeddings
AU - Stewart, Michael
AU - Liu, Wei
PY - 2022/1
Y1 - 2022/1
AB - Entity typing (ET) is the process of identifying the semantic types of every entity within a corpus. ET involves labelling each entity mention with one or more class labels. As a multi-class, multi-label task, it is considerably more challenging than named entity recognition. This means existing entity typing models require pre-identified mentions and cannot operate directly on plain text. Pipeline-based approaches are therefore used to join a mention extraction model and an entity typing model to process raw text. Another key limiting factor is that these mention-level ET models are trained on fixed context windows, which makes the entity typing results sensitive to window size selection. In light of these drawbacks, we propose an end-to-end entity typing model (E2EET) using a Bi-GRU to remove the dependency on window size. To demonstrate the effectiveness of our E2EET model, we created a stronger baseline mention-level model by incorporating the latest contextualised transformer-based embeddings (BERT). Extensive ablative studies demonstrate the competitiveness and simplicity of our end-to-end model for entity typing.
KW - Bidirectional GRU
KW - Bidirectional LSTM
KW - Entity typing
KW - Natural language processing
KW - Neural language models
KW - Sequence labelling
UR - https://www.scopus.com/pages/publications/85120321362
U2 - 10.1007/s10115-021-01626-9
DO - 10.1007/s10115-021-01626-9
M3 - Article
AN - SCOPUS:85120321362
SN - 0219-1377
VL - 64
SP - 95
EP - 113
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 1
ER -