Supervised domain-specific term extraction often suffers from two common problems: laborious manual feature selection and a lack of labelled data. In this paper, we introduce a weakly supervised bootstrapping approach using two deep learning classifiers. Each classifier learns the representations of terms separately by taking word embedding vectors as inputs, so no manually selected features are required. The two classifiers are first trained on a small set of labelled data, then independently make predictions on a subset of the unlabelled data. The most confident predictions are subsequently added to the training set to retrain the classifiers. This co-training process minimises the reliance on labelled data. Evaluations on two datasets demonstrate that the proposed co-training approach achieves competitive performance with limited training data compared to a standard supervised learning baseline.
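The co-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a small NumPy logistic regression (`TinyLogReg`, hypothetical) stands in for each of the two deep learning classifiers, the inputs are assumed to be fixed-length embedding vectors with binary term/non-term labels, and the names `co_train`, `pool_size`, and `top_k`, along with their default values, are invented for the example.

```python
import numpy as np


class TinyLogReg:
    """Stand-in classifier (assumption): the paper uses two deep networks."""

    def __init__(self, dim, lr=0.5, epochs=200):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr
        self.epochs = epochs

    def predict_proba(self, X):
        # Probability that each row is a positive example (a term).
        return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def fit(self, X, y):
        # Plain batch gradient descent on the logistic loss.
        for _ in range(self.epochs):
            grad = self.predict_proba(X) - y
            self.w -= self.lr * (X.T @ grad) / len(y)
            self.b -= self.lr * grad.mean()
        return self


def co_train(clf_a, clf_b, X_lab, y_lab, X_unlab,
             rounds=5, pool_size=50, top_k=5, seed=0):
    """Grow the training set with each classifier's most confident predictions."""
    rng = np.random.default_rng(seed)
    X_train = X_lab.copy()
    y_train = y_lab.astype(float)
    remaining = np.arange(len(X_unlab))

    for _ in range(rounds):
        # Train (or retrain) both classifiers on the current training set.
        clf_a.fit(X_train, y_train)
        clf_b.fit(X_train, y_train)
        if len(remaining) == 0:
            break

        # Draw a random subset of the unlabelled data to score this round.
        size = min(pool_size, len(remaining))
        pool = rng.choice(remaining, size=size, replace=False)

        chosen = {}  # unlabelled index -> predicted label
        for clf in (clf_a, clf_b):
            proba = clf.predict_proba(X_unlab[pool])
            confidence = np.abs(proba - 0.5)
            for i in np.argsort(-confidence)[:top_k]:
                # If both classifiers pick the same example, keep clf_a's label.
                chosen.setdefault(int(pool[i]), float(proba[i] >= 0.5))

        idx = np.array(list(chosen.keys()))
        labels = np.array(list(chosen.values()))
        X_train = np.vstack([X_train, X_unlab[idx]])
        y_train = np.concatenate([y_train, labels])
        remaining = np.setdiff1d(remaining, idx)

    # Final retraining so both classifiers see the last additions.
    clf_a.fit(X_train, y_train)
    clf_b.fit(X_train, y_train)
    return clf_a, clf_b, X_train, y_train
```

In this sketch the two classifiers see the same inputs, so they only diverge if they are configured differently (for example, different `lr` or `epochs`). In practice the two models would differ in architecture, which is what makes their confident predictions complementary.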
|Title of host publication||Proceedings of the Australasian Language Technology Association Workshop 2016|
|Place of Publication||Australia|
|Publisher||Australasian Language Technology Association|
|Publication status||Published - 2016|
|Event||Australasian Language Technology Association Workshop 2016 - Monash University, Melbourne, Australia|
Duration: 5 Dec 2016 → 7 Dec 2016
|Conference||Australasian Language Technology Association Workshop 2016|
|Abbreviated title||ALTA 2016|
|Period||5/12/16 → 7/12/16|
|Title||Featureless Domain-Specific Term Extraction with Minimal Labelled Data|