TY - JOUR

T1 - Quantum data compression by principal component analysis

AU - Yu, Chao Hua

AU - Gao, Fei

AU - Lin, Song

AU - Wang, Jingbo

PY - 2019/8/1

Y1 - 2019/8/1

AB - Data compression can be achieved by reducing the dimensionality of high-dimensional but approximately low-rank datasets, which may in fact be described by the variation of a much smaller number of parameters. It often serves as a preprocessing step to surmount the curse of dimensionality and to gain efficiency, and thus it plays an important role in machine learning and data mining. In this paper, we present a quantum algorithm that compresses an exponentially large high-dimensional but approximately low-rank dataset in quantum parallel, by dimensionality reduction (DR) based on principal component analysis (PCA), the most popular classical DR algorithm. We show that the proposed algorithm has a runtime polylogarithmic in the dataset’s size and dimensionality, which is exponentially faster than the classical PCA algorithm, when the original dataset is projected onto a polylogarithmically low-dimensional space. The compressed dataset can then be further processed to implement other tasks of interest, with significantly fewer quantum resources. As examples, we apply this algorithm to reduce data dimensionality for two important quantum machine learning algorithms, quantum support vector machine and quantum linear regression for prediction. This work demonstrates that quantum machine learning can be released from the curse of dimensionality to solve problems of practical importance.

KW - Curse of dimensionality

KW - Data compression

KW - Principal component analysis

KW - Quantum algorithm

KW - Quantum machine learning

UR - http://www.scopus.com/inward/record.url?scp=85068579196&partnerID=8YFLogxK

U2 - 10.1007/s11128-019-2364-9

DO - 10.1007/s11128-019-2364-9

M3 - Article

AN - SCOPUS:85068579196

VL - 18

JO - Quantum Information Processing

JF - Quantum Information Processing

SN - 1570-0755

IS - 8

M1 - 249

ER -