Using UMAP to process high-colour flow cytometry data

Introduction to UMAP

I find the UMAP algorithm very useful for separating the major phenotypic populations of leukocytes, but I have recently discovered, through trial and error, that the UMAP output can be skewed by small outlying populations.

To get the best out of UMAP, I have found that merging datasets before running them through the algorithm is key. This is especially useful when I can then separate the merged data back into its original files and compare them on the same plot. However, this in itself can be a challenge.
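As a rough illustration of the merge-then-separate idea, here is a minimal sketch in plain Python. It assumes each file has already been read into a list of events (one list of marker intensities per event). The function names, the `events_per_file` downsampling option and the toy data are all hypothetical, and this is not how CytoSwarm or VenturiOne handle merging internally.

```python
import random


def merge_files(files, events_per_file=None, seed=0):
    """Concatenate events from several files, remembering each event's source.

    files: dict mapping a file name to a list of events, where each event is
    a list of marker intensities (hypothetical layout, not an FCS reader).
    """
    rng = random.Random(seed)
    merged, source = [], []
    for name, events in files.items():
        # Optionally downsample so that no single large file dominates the merge.
        if events_per_file is not None and len(events) > events_per_file:
            events = rng.sample(events, events_per_file)
        merged.extend(events)
        source.extend([name] * len(events))
    return merged, source


def split_by_file(embedding, source):
    """Group embedded points by source file so files can be compared."""
    groups = {}
    for point, name in zip(embedding, source):
        groups.setdefault(name, []).append(point)
    return groups


# Toy data standing in for two files with three markers per event.
files = {
    "sample_A": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
    "sample_B": [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]],
}
merged, source = merge_files(files, events_per_file=2)

# In practice `merged` would be passed to a UMAP implementation at this point.
# The identity stand-in below only keeps the sketch free of external packages.
embedding = merged
per_file = split_by_file(embedding, source)
```

Because every event keeps a record of the file it came from, all files share one embedding and can still be plotted separately on the same axes.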

In my most recent tests I submitted varying numbers of files (out of the 49 in this dataset from the Flow Repository), each up to 50 MB in size, and obtained some very good results with UMAP.

Examples of UMAP previews in CytoSwarm (black and white) and in VenturiOne analysis software (coloured) are shown below:

Single files 





Merged files 

Figure 6: 10 files merged

Figure 7: 10 different files merged

Figure 8: 10 randomly picked files merged

Figure 9: 10 different randomly picked files merged

Figure 10: first 10 files merged

Figure 11: 10 randomly picked files merged

Gated and Merged 

Figure 12: all files gated on CD4 and merged

Figure 13: all files gated on CD8 and merged

Figure 14: all files gated on Tregs and merged

As you can see, certain plots have been skewed by outlying data. So what do I do with the affected plots? There are a few things to consider.

What is this outlying data? Is it credible data? Does it need to be run again, or can it simply be gated out?
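One possible way to find the outlying events before deciding what to do with them is sketched below. It flags values that sit far from the median of a single coordinate (a UMAP axis or a marker channel) using a modified z-score based on the median absolute deviation. This technique, the function name `flag_outliers` and the threshold of 3.5 are my own illustration, not a feature of UMAP or of the software mentioned above, and the toy values are made up.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return one boolean per value: True if it sits far from the median."""
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # All values (or most of them) are identical, so nothing is flagged.
        return [False] * len(values)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the bulk of the data is roughly normally distributed.
    return [abs(0.6745 * (v - median) / mad) > threshold for v in values]


# Toy coordinate along one axis: five clustered events and one far away.
axis_values = [1.0, 1.1, 0.9, 1.05, 0.95, 25.0]
flags = flag_outliers(axis_values)
outlying = [v for v, is_out in zip(axis_values, flags) if is_out]
```

Flagging the events rather than deleting them keeps the questions above open: the flagged events can be inspected first, and only gated out if they turn out not to be credible.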

I will discuss all of these considerations in my next blog post.

In the meantime

Please feel free to discuss your own considerations with me. I would love to hear about your own experiences with UMAP!
