Computationally efficient multidimensional analysis of complex flow cytometry data using second order polynomial histograms

John Zaunders, Junmei Jing, Michael Leipold, Holden Maecker, Anthony D. Kelleher, Inge Koch

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)


Many methods have been described for automated clustering analysis of complex flow cytometry data, but so far the goal of efficiently estimating multivariate densities and their modes for a moderate number of dimensions and potentially millions of data points has not been attained. We have devised a novel approach to describing modes using second order polynomial histogram estimators (SOPHE). The method divides the data into multivariate bins and determines the shape of the data in each bin based on second order polynomials, which is an efficient computation. These calculations yield local maxima and allow adjacent bins to be joined to identify clusters. The use of second order polynomials also makes optimal use of wide bins, such that in most cases each parameter (dimension) need only be divided into 4-8 bins, further reducing computational load. We have validated this method using defined mixtures of up to 17 fluorescent beads in 16 dimensions, correctly identifying all populations in data files of 100,000 beads in <10 s on a standard laptop. The method also correctly clustered granulocytes, lymphocytes (including standard T, B, and NK cell subsets), and monocytes in 9-color stained peripheral blood within seconds. SOPHE successfully clustered up to 36 subsets of memory CD4 T cells using differentiation and trafficking markers in 14-color flow analysis, and up to 65 subpopulations of PBMC in 33-dimensional CyTOF data, showing its usefulness in discovery research. SOPHE has the potential to greatly increase the efficiency of analysing complex mixtures of cells in higher dimensions.
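The binning and bin-joining steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses raw bin counts only (a zeroth-order simplification) and omits the second order polynomial fit that SOPHE performs within each bin. The function names `bin_data`, `climb`, and `cluster`, the equal-width binning, and the restriction to axis-adjacent neighbours are all assumptions made for illustration.

```python
from collections import Counter


def bin_data(points, n_bins=6):
    """Map each point to a tuple of per-dimension bin indices (equal-width bins)."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    # Guard against constant dimensions, where the range would be zero.
    width = [((hi[d] - lo[d]) / n_bins) or 1.0 for d in dims]
    return [
        # The maximum value falls on the upper edge, so cap it at the last bin.
        tuple(min(int((p[d] - lo[d]) / width[d]), n_bins - 1) for d in dims)
        for p in points
    ]


def climb(start, counts):
    """Move to the fullest axis-adjacent bin until no neighbour holds more points."""
    current = start
    while True:
        best = current
        for d in range(len(current)):
            for step in (-1, 1):
                nb = current[:d] + (current[d] + step,) + current[d + 1:]
                if counts.get(nb, 0) > counts[best]:
                    best = nb
        if best == current:
            return current  # local maximum: treated here as a mode
        current = best


def cluster(points, n_bins=6):
    """Return one cluster label per point, plus the list of mode bins."""
    keys = bin_data(points, n_bins)
    counts = Counter(keys)  # only occupied bins are stored
    mode_of = {b: climb(b, counts) for b in counts}
    modes = sorted(set(mode_of.values()))
    label = {m: i for i, m in enumerate(modes)}
    return [label[mode_of[k]] for k in keys], modes
```

Storing only occupied bins matters at the dimensionality quoted above: with 16 dimensions and 4 bins per dimension there are 4^16 (about 4.3 billion) possible bins, but a file of 100,000 events can occupy at most 100,000 of them. Because this sketch relies on counts alone, sampling noise can produce spurious local maxima. It should therefore be read as an outline of the bin-and-join idea rather than a substitute for the polynomial estimator.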

Original language: English
Pages (from-to): 44-58
Number of pages: 15
Journal: Cytometry Part A
Issue number: 1
Publication status: Published - 1 Jan 2016
Externally published: Yes

