Phenotyping Single Cell Data in CytoMAP

In this post I walk through how to phenotype cells using CytoMAP.

After preparing tissues, imaging them, and segmenting those images, figuring out what cell types you can define is the next step in the long journey of image analysis. There are many ways to do this and a plethora of software options out there. In many cases, if you have a few markers and a specific question about a specific population of cells, manually defining cells based on their mean fluorescence intensity (MFI) is the way to go. However, if you have a bunch of channels and are asking more general questions about what is in the tissue microenvironment, clustering-based cellular phenotyping can be very informative. In this post I am going to walk through how to do this in CytoMAP using the example data available here.

Step 1: Import your data into CytoMAP

This might sound like a trivial step, but if your data isn’t formatted correctly, or your starting data has junk in it, every step downstream will be much more difficult. To start, if you haven’t defined any cell types, you should have one .csv file per sample, and the name of each file should match the name of the sample, like this:

LN1_auLN_All.csv

LN2_auLN_All.csv

LN3_auLN_All.csv

LN4_bLN_All.csv

LN5_bLN_All.csv

Here we have five steady state mouse Lymph Nodes (LN) that are either auricular (auLN) or brachial (bLN). Inside the .csv files each row is a single cell and each column is a different channel:

CD64_PE, SIRPa_594, CD207_488, CD169_514, Lyvev1_490LS, CD11c_421, CD31_480, MHC2_395xl, Clec9a_633, CD301b_660, B220_700, CD3_APC-F750, CD301b_corrected, SIRPa_corrected, CD31_corrected, CD11c_corrected, CD169_corrected, MHC2_B220cleanMyeloid, PositionX, PositionY, PositionZ, chID, Sphericity, Volume, Classifier
1.354450.56272894.559118.926937.45490.6510353.166262.840445.7545717.31181.930571.3069415.96350.5018273.124240.61327611.70582.3227880.7795382.9778.69.2304855940.887945241.9331
7.2051428.954126.562314.063212.324820.161315.907819.688711.488228.552249.355919.06118.3719520.784810.909418.180710.39397.103267.0661739.4688.49.0674255950.648447334.4021
16.727252.90963.5977313.204734.069833.880631.018529.14759.8675913.141236.662525.77894.4262333.530520.175329.30187.9340111.105569.5939862.2715.810.5193555960.809646342.5351
12.073744.44572.281775.6114222.257846.755120.423620.195215.611410.664836.053426.11423.4548832.70359.6243143.60594.443835.637276.8895958.1777.311.3851555970.84242875.33651
26.461540.52753.4605525.3826186.98316.782620.34532.38629.052294.7018350.582617.40730.58899131.164214.84515.025719.840410.560.7404979.8784.19.7282555980.607925148.7461
19.432568.94272.731458.7799314.699821.748721.400920.82597.2997510.527444.879420.44141.9595353.06714.597418.95536.109194.6564167.3361066.2685.210.1091555990.798238340.6531
10.93462.87274.9002411.653633.168112.616730.611926.580514.252215.30465.198720.89242.02242.450925.17369.524746.239594.3417156.64891086.4733.410.7830556000.73209172.2781
7.7383227.60592.431466.713412.255541.037426.082620.0816.4283514.286647.78516.73522.7772616.414313.59536.91434.722743.5685453.96731095.9737.611.6616556010.68818684.82981
9.5277810.369330.781918.258210.10618.559716.661523.808615.9314.809746.940316.65433.266466.5061711.681116.551413.71195.5781960.44241147.4780.811.0933556020.807941133.6951
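If you want to sanity check a file outside of CytoMAP before loading it, here is a minimal Python sketch. CytoMAP itself is a MATLAB tool, so this is a side illustration and not CytoMAP code. It assumes file names follow the sample_tissue_All.csv pattern shown above and that every column is numeric. The helper names `parse_sample_name` and `read_cells`, and the tiny inline table, are made up for this example.

```python
import csv
import io


def parse_sample_name(filename):
    """Split a name like 'LN1_auLN_All.csv' into (sample, tissue)."""
    stem = filename.rsplit(".", 1)[0]  # drop the .csv extension
    sample, tissue, _suffix = stem.split("_", 2)
    return sample, tissue


def read_cells(handle):
    """Return (channel_names, rows); each row maps channel name -> float."""
    reader = csv.DictReader(handle)
    rows = [{name: float(value) for name, value in row.items()} for row in reader]
    return reader.fieldnames, rows


# Made-up stand-in for one sample file; the values are illustrative only.
example_csv = io.StringIO(
    "CD3,B220,PositionX,PositionY,PositionZ\n"
    "1.5,20.0,382.9,778.6,9.2\n"
    "7.2,3.1,739.4,688.4,9.1\n"
)

channels, cells = read_cells(example_csv)
sample, tissue = parse_sample_name("LN1_auLN_All.csv")
print(sample, tissue, channels, len(cells))
```

To check a real file, you would pass `open(path, newline="")` to `read_cells` instead of the inline table. A non-numeric entry will raise a `ValueError`, which is a quick way to spot junk before it reaches CytoMAP.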

For best results, avoid channel names that contain symbols or spaces, or that start with a number. CytoMAP is fairly robust to such names, but they sometimes still cause errors. If you want to rename the headers of a bunch of .csv files to make them consistent across samples, I have a MATLAB function to help here. If you name your positional columns PositionX, PositionY, and PositionZ, CytoMAP will automatically recognize them as the spatial position channels. If CytoMAP can’t find the position channels, it will ask for them when loading the data. Additionally, since everything in CytoMAP is set up to work with three-dimensional data, if your data only has X and Y, CytoMAP will add a Z channel that is all zeros. Now that your data is formatted correctly, simply run File > Import Multiple Samples.
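The naming advice above can be sketched in a few lines of Python. This is not the MATLAB helper mentioned in the post and not how CytoMAP handles names internally; it is a rough illustration. The function names and the `ch_` prefix are my own assumptions.

```python
import re

POSITION_NAMES = ("PositionX", "PositionY", "PositionZ")


def clean_channel_name(name):
    """Replace spaces and symbols with underscores; prefix names that start with a digit."""
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", name.strip())
    if cleaned and cleaned[0].isdigit():
        cleaned = "ch_" + cleaned  # hypothetical prefix, pick whatever you like
    return cleaned


def missing_position_columns(channel_names):
    """List which of PositionX/Y/Z are absent from the header."""
    return [name for name in POSITION_NAMES if name not in channel_names]


def add_missing_z(rows):
    """Give two-dimensional data an all-zero PositionZ column."""
    for row in rows:
        row.setdefault("PositionZ", 0.0)
    return rows


print(clean_channel_name("CD3 APC-F750"))
print(clean_channel_name("594 SIRPa"))
print(missing_position_columns(["CD3", "PositionX", "PositionY"]))
print(add_missing_z([{"PositionX": 1.0, "PositionY": 2.0}]))
```

Running the cleaner over every header in every sample before import is a cheap way to keep channel names consistent across files.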

Load Multiple Samples

After selecting the samples you want to load, a window will pop up where you can rename any samples or channels. These can also be changed later in CytoMAP, so for now we are just going to click Load.

Re-Name Channels

The last step is to make sure everything worked. To do this, make a New Figure, plot some of your samples, and color-code them by a few different channel markers. Below are the five LNs I loaded, color-coded by B220, Cell Density, Lyve1, CD169, and CD3. All of the channel intensities look similar to what I saw in the original images.

Check Loaded Cells

Step 1.5: Pre-gating

Sometimes it is useful to build surfaces on different objects in your image, then load all of your surfaces into one .csv. You can then add a “classifier” channel that keeps track of which object type each cell is. In the example data used here we have done just that. We built surfaces on Myeloid cells, then added spot objects. The spot objects give us locally averaged pixel values, which we can use to identify the T cell zone and B cell follicles in our image without explicitly segmenting on T cells and B cells. To simplify this example, we are only going to cluster the Myeloid cell population. To get to just the Myeloid cells, make a New Figure and plot Classifier on the Y axis and Sphericity on the X axis. Since all of the spot objects have Sphericity = 1, this channel makes it easy to tell which objects are spots and which are Myeloid cells.

To gate on these cells, click the new rectangular gate button at the top of the figure window (left red arrow). Name your new population, click OK, then draw your gate on the plot. Finally, click the save gate button (right red arrow).
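Conceptually, a rectangular gate just keeps the objects whose two plotted values fall inside a box. Here is a rough Python sketch of that idea. It is not CytoMAP's gating code, and `rect_gate`, the gate boundaries, and the Classifier value of 2 for spots are assumptions made up for illustration. The only fact taken from the example data is that spot objects have Sphericity = 1.

```python
def rect_gate(cells, x_channel, y_channel, x_range, y_range):
    """Keep cells whose (x, y) values fall inside the rectangle (edges included)."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    return [
        cell
        for cell in cells
        if x_lo <= cell[x_channel] <= x_hi and y_lo <= cell[y_channel] <= y_hi
    ]


# Made-up objects; the Classifier value of 2 for spots is an assumption.
objects = [
    {"Sphericity": 0.89, "Classifier": 1},
    {"Sphericity": 0.65, "Classifier": 1},
    {"Sphericity": 1.0, "Classifier": 2},
]

# Hypothetical gate boundaries, chosen by eye the same way you would draw them.
myeloid = rect_gate(objects, "Sphericity", "Classifier", (0.0, 0.99), (0.5, 1.5))
spots = rect_gate(objects, "Sphericity", "Classifier", (0.995, 1.005), (1.5, 2.5))
print(len(myeloid), len(spots))
```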

Gate On Cells

Next we can plot our two new populations of cells to see their spatial distribution below. For this lymph node the Myeloid cells aren’t located in the B cell follicles.

Cellular Spatial Distribution

Step 2: Run the clustering algorithm

Now that we have isolated just our Myeloid cells, we can cluster these cells into different phenotypes. To do this, click the Cluster Cells button in the main CytoMAP window. This will bring up a new window with a lot of options. For this example, we want to use all of our fluorescent channels, but we don’t want to count any channel twice. Thus, for the few channels where we have a corrected version (for example, after background subtraction), we want to be sure to use only the corrected version and not the raw version of that channel. In the bottom left of this window we selected standardized cellular MFI, and we did this calculation sample by sample rather than on the whole pooled dataset. We manually selected 30 clusters so that we over-cluster the data. Picking the number of clusters by hand is faster than letting the computer decide, and we can always combine clusters that are too similar later.
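To make the two preprocessing choices above concrete, here is a small Python sketch of (1) dropping a raw channel when a corrected version exists and (2) standardizing each channel within each sample. This is my own illustration, not what CytoMAP runs internally. It assumes raw channels are named marker_fluorophore and corrected ones marker_corrected, and it uses a z-score with the population standard deviation, which may differ from CytoMAP's exact standardization.

```python
from statistics import mean, pstdev

SUFFIX = "_corrected"


def drop_raw_duplicates(channels):
    """Drop raw channels whose marker also has a '<marker>_corrected' channel."""
    corrected = {c[: -len(SUFFIX)] for c in channels if c.endswith(SUFFIX)}
    keep = []
    for channel in channels:
        marker = channel.split("_", 1)[0]
        if not channel.endswith(SUFFIX) and marker in corrected:
            continue  # raw version of a channel that has a corrected copy
        keep.append(channel)
    return keep


def standardize_per_sample(samples, channels):
    """Z-score each channel within each sample (samples: name -> list of cell dicts)."""
    standardized = {}
    for name, cells in samples.items():
        stats = {}
        for channel in channels:
            values = [cell[channel] for cell in cells]
            spread = pstdev(values)
            stats[channel] = (mean(values), spread if spread > 0 else 1.0)
        standardized[name] = [
            {ch: (cell[ch] - stats[ch][0]) / stats[ch][1] for ch in channels}
            for cell in cells
        ]
    return standardized


print(drop_raw_duplicates(["CD64_PE", "SIRPa_594", "SIRPa_corrected"]))

# Made-up values: two samples with very different overall brightness.
samples = {
    "LN1": [{"CD64_PE": 1.0}, {"CD64_PE": 3.0}],
    "LN2": [{"CD64_PE": 10.0}, {"CD64_PE": 30.0}],
}
print(standardize_per_sample(samples, ["CD64_PE"]))
```

In the made-up example the two samples differ tenfold in raw intensity, yet their cells end up on the same scale after per-sample standardization. That is the reason to standardize sample by sample when staining or imaging intensity varies between tissues.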

Cluster Cells User Interface

After running the clustering algorithm we can now color-code our cells by which cluster number they belong to. This highlights the distinct spatial regions the individual cell types live in.

Cell Cluster Spatial Distribution

After checking the spatial distribution of the clusters, we can use the cell_heatmaps extension to look at the average channel intensity of the cells that belong to each of the clusters we just defined.
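The numbers behind a heatmap like this are just the mean of each channel over the cells in each cluster. Here is a minimal Python sketch of that calculation. It is an illustration only, not the cell_heatmaps extension itself, and the function name and example values are made up.

```python
from collections import defaultdict
from statistics import mean


def cluster_mean_intensity(cells, labels, channels):
    """Return {cluster: {channel: mean intensity}} for cells with matching labels."""
    grouped = defaultdict(list)
    for cell, label in zip(cells, labels):
        grouped[label].append(cell)
    return {
        label: {ch: mean(cell[ch] for cell in members) for ch in channels}
        for label, members in sorted(grouped.items())
    }


# Made-up cells and cluster labels.
cells = [
    {"CD64": 1.0, "CD169": 2.0},
    {"CD64": 3.0, "CD169": 4.0},
    {"CD64": 10.0, "CD169": 0.0},
]
labels = [1, 1, 2]
table = cluster_mean_intensity(cells, labels, ["CD64", "CD169"])
for cluster, row in table.items():
    print(cluster, row)
```

Each row of the resulting table corresponds to one row (or column) of the heatmap, so clusters with near-identical rows are natural candidates for merging.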

Cell MFI Heatmaps

Step 3: Annotate the cell clusters

The final step in defining cells with CytoMAP is to annotate the different clusters of cells with names that make sense. To do this, make plots of the spatial distribution of your cells, plots of the channel MFI per cluster, and plots of the different channel MFI per cell. Next, use the annotate cells function to name your populations.

Annotate Plots

By clicking on your plots of cells color-coded by cluster number, you can explore which cell type belongs to each cluster. In the example below, we see that cells from cluster number 24 have high CD64 and CD169 expression. In the Annotate Cells window we can name this cluster subcapsular sinus macrophages (SCS Macs).

Cell MFI Heatmaps

After we have given a name to all of the cell types we are interested in, we can plot these cells in space again, now color-coded by the annotated cell type.
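Under the hood, annotation amounts to a lookup from cluster number to cell type name, where several over-clustered groups can share one name. Here is a small Python sketch of that idea. It is not CytoMAP's Annotate Cells code. Only the cluster 24 to SCS Macs pairing comes from the example above; the other cluster numbers and names are hypothetical.

```python
from collections import Counter


def annotate_clusters(labels, cluster_names, default="Unannotated"):
    """Map each cell's cluster number to a cell type name."""
    return [cluster_names.get(label, default) for label in labels]


# Cluster 24 -> "SCS Macs" is from the post; the other entries are made up.
# Giving two clusters the same name merges them into one cell type.
cluster_names = {24: "SCS Macs", 11: "SCS Macs", 5: "Example type B"}

cell_clusters = [24, 5, 11, 24, 30]  # made-up cluster number per cell
cell_types = annotate_clusters(cell_clusters, cluster_names)
counts = Counter(cell_types)
print(cell_types)
print(counts)
```

Keeping a default name for clusters you have not annotated makes it easy to see how many cells are still unassigned.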

Cell MFI Heatmaps

We are now ready to use these cell types for downstream analysis such as cell-cell correlation, cell distance relationships, or neighborhood-based region analysis. For more information, see my CytoMAP Wiki page. If you get stuck or need some help, post a topic on the image.sc forum with the tag cytomap.

Best of luck exploring your data!