Enabling Precision/Recall Preferences for Semi-supervised SVM Training

Zeyi Wen, Rui Zhang, Kotagiri Ramamohanarao

Research output: Chapter in Book/Conference paper > Conference paper

5 Citations (Scopus)

Abstract

Semi-supervised learning is an essential approach to classification when the available labeled data is insufficient and we also need to make use of unlabeled data in the learning process. Numerous research efforts have focused on designing algorithms to improve the F1 score, but lack any mechanism to control precision or recall individually. However, many applications have precision/recall preferences. For instance, an email spam classifier requires a precision of 0.9 to mitigate the false dismissal of useful emails. In this paper, we propose a method that allows users to specify a precision/recall preference while maximising the F1 score. Our key idea is to divide the semi-supervised learning process into multiple rounds of supervised learning; the classifier learned at each round is calibrated using a subset of the labeled dataset before we apply it to the unlabeled dataset to enlarge the training dataset. Our idea is applicable to a number of learning models such as Support Vector Machines (SVMs), Bayesian networks and neural networks. We focus our research and the implementation of our idea on SVMs. We conduct extensive experiments to validate the effectiveness of our method. The experimental results show that our method can train classifiers with a precision/recall preference, while the popular semi-supervised SVM training algorithm (which we use as the baseline) cannot. When we specify the precision preference and the recall preference to be the same, which amounts to maximising only the F1 score as the baseline does, our method achieves F1 scores better than or similar to those of the baseline. An additional advantage of our method is that it converges much faster than the baseline.
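
The abstract does not say how the per-round calibration is carried out. As an illustration only, the sketch below assumes one simple possibility: shifting the decision threshold of a trained classifier so that, on a held-out labeled subset, a minimum precision and recall are met while F1 is as high as possible. The function names, the `min_precision`/`min_recall` parameters and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def precision_recall_f1(scores: List[float], labels: List[int],
                        threshold: float) -> Tuple[float, float, float]:
    """Precision, recall and F1 when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def calibrate_threshold(scores: List[float], labels: List[int],
                        min_precision: float = 0.0,
                        min_recall: float = 0.0) -> Optional[float]:
    """Assumed calibration step: among candidate thresholds that satisfy the
    precision/recall preference on the calibration subset, return the one
    with the highest F1, or None if no threshold satisfies the preference."""
    best_threshold, best_f1 = None, -1.0
    for t in sorted(set(scores)):  # each observed score is a candidate threshold
        p, r, f1 = precision_recall_f1(scores, labels, t)
        if p >= min_precision and r >= min_recall and f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold


# Toy calibration subset: classifier decision values and true labels (1 = positive).
scores = [-1.5, -0.4, 0.1, 0.3, 0.8, 1.2]
labels = [0, 0, 1, 0, 1, 1]

print(calibrate_threshold(scores, labels))                     # 0.1 (best F1, no preference)
print(calibrate_threshold(scores, labels, min_precision=0.9))  # 0.8 (precision preference)
```

In a multi-round scheme of the kind the abstract outlines, a threshold chosen this way would presumably be applied to the unlabeled data before the newly labeled examples are added to the training set for the next round; that wiring is left out here because the abstract gives no further detail.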
Original language: English
Title of host publication: The 23rd ACM Conference on Information and Knowledge Management (CIKM 2014)
Place of Publication: USA
Publisher: Association for Computing Machinery (ACM)
Pages: 421-430
Number of pages: 10
ISBN (Print): 978-1-4503-2598-1
Publication status: Published - 2014
Externally published: Yes
Event: 23rd ACM International Conference on Information and Knowledge Management - Shanghai, China
Duration: 3 Nov 2014 to 7 Nov 2014

Conference

Conference: 23rd ACM International Conference on Information and Knowledge Management
Abbreviated title: CIKM 2014
Country: China
City: Shanghai
Period: 3/11/14 to 7/11/14
