Kernel naive Bayes discrimination for high-dimensional pattern recognition

Inge Koch, Kanta Naito, Hiroaki Tanaka

Research output: Contribution to journal › Article

Abstract

Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high-dimension, low-sample-size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two-class problems we modify Mika's kernel Fisher discriminant function, which in general remains ill-posed even in a kernel setting (Mika et al., 1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first- and second-degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, the discriminant directions, and the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially of their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.
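As a rough, non-authoritative illustration of the idea in the abstract, the Python sketch below applies a diagonal-covariance (naive Bayes) discriminant in an explicit degree-2 polynomial feature space for a two-class HDLSS problem. The paper's actual estimators, their smoothed versions, and the asymptotic analysis are developed in the article itself; the function names (poly2_features, naive_bayes_direction), the toy data, and the regularisation constant eps here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: naive Bayes discriminant in a degree-2 polynomial
# feature space (illustrative only; not the estimator from the paper).
import numpy as np

def poly2_features(X):
    """Explicit degree-2 polynomial feature map: (x_1..x_d, x_i*x_j for i<=j)."""
    n, d = X.shape
    quad = X[:, :, None] * X[:, None, :]          # all pairwise products
    i, j = np.triu_indices(d)                     # keep each product once
    return np.hstack([X, quad[:, i, j]])

def naive_bayes_direction(Phi0, Phi1, eps=1e-8):
    """Diagonal-covariance discriminant direction in feature space.

    eps is an assumed small ridge added to the per-coordinate variances
    for numerical stability when the sample size is small.
    """
    mu0, mu1 = Phi0.mean(axis=0), Phi1.mean(axis=0)
    var = 0.5 * (Phi0.var(axis=0) + Phi1.var(axis=0)) + eps
    w = (mu1 - mu0) / var                         # naive Bayes weights
    b = -0.5 * w @ (mu0 + mu1)                    # midpoint threshold
    return w, b

def classify(Phi, w, b):
    """Assign class 1 where the discriminant function is positive."""
    return (Phi @ w + b > 0).astype(int)

# Toy HDLSS example: dimension d far exceeds the per-class sample size n.
rng = np.random.default_rng(0)
d, n = 200, 15
X0 = rng.normal(0.0, 1.0, size=(n, d))            # class 0
X1 = rng.normal(0.4, 1.0, size=(n, d))            # class 1, shifted mean
w, b = naive_bayes_direction(poly2_features(X0), poly2_features(X1))
X_test = rng.normal(0.4, 1.0, size=(5, d))        # drawn from class 1
print(classify(poly2_features(X_test), w, b))     # expect mostly 1s
```

The explicit feature map makes the naive Bayes assumption of independent feature-space coordinates concrete; for genuinely high-dimensional data one would instead work through the kernel matrix, as the kernel formulation in the paper does.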

Original language: English
Pages (from-to): 401-428
Number of pages: 28
Journal: Australian and New Zealand Journal of Statistics
Volume: 61
Issue number: 4
DOIs
Publication status: Published - 1 Dec 2019
