Basic output from SCPA

The output from SCPA has four or five columns, depending on the type of comparison: 1) the pathway name (Pathway), 2) the raw p-value (Pval), 3) the adjusted p-value (adjPval), 4) the qval, and, for a two-sample comparison, 5) the fold change (FC). Here is an example output:

head(scpa_out, 10)
#>                               Pathway          Pval      adjPval     qval        FC
#> 32            HALLMARK_MYC_TARGETS_V1 5.788383e-101 2.894192e-99 9.926655 -87.81108
#> 2        HALLMARK_ALLOGRAFT_REJECTION  7.532277e-93 3.766139e-91 9.509159 -21.07656
#> 31          HALLMARK_MTORC1_SIGNALING  7.532277e-93 3.766139e-91 9.509159 -45.82991
#> 36 HALLMARK_OXIDATIVE_PHOSPHORYLATION  1.061001e-89 5.305003e-88 9.342126 -47.98635
#> 23       HALLMARK_IL2_STAT5_SIGNALING  1.315549e-86 6.577745e-85 9.175071 -20.23701
#> 33            HALLMARK_MYC_TARGETS_V2  1.435838e-83 7.179192e-82 9.007992 -20.71347
#> 13               HALLMARK_E2F_TARGETS  4.522036e-82 2.261018e-80 8.924444 -31.36869
#> 27 HALLMARK_INTERFERON_GAMMA_RESPONSE  1.379483e-80 6.897414e-79 8.840889 -13.62249
#> 48            HALLMARK_UV_RESPONSE_UP  1.379483e-80 6.897414e-79 8.840889 -20.07974
#> 18            HALLMARK_G2M_CHECKPOINT  4.076176e-79 2.038088e-77 8.757327 -28.20888

Which output values should you use?

Instead of looking at pathway enrichment, SCPA assesses changes in the multivariate distribution of a pathway. However, pathways that show enrichment in a given population will also necessarily show large changes in multivariate distribution. This means that with SCPA you’re able to detect 1) enriched pathways and 2) non-enriched pathways that have transcriptional changes that are independent of enrichment. Both are reflected in the qval, and we recommend using it as your primary statistic: the larger the qval, the larger the change in pathway ‘activity’.

Because SCPA measures multivariate distributions, some pathways will show large qvals but no overall fold change/enrichment in a given population. Whilst these pathways are not enriched, the distribution changes identified by SCPA are still important for cellular behaviour. We show an example of this in Figure 4 of our paper, where arachidonic acid metabolism is not enriched but shows changes in multivariate distribution, and is shown to be critical for T cell activation.

We therefore propose that changes in the multivariate distribution of a pathway are a better overall reflection of pathway activity. Because of this, you should use the qval as your statistical measurement of pathway activity, and fold change only as a secondary informative value.
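To make the distinction between qval and fold change concrete, here is a minimal base R sketch that separates pathways with a large qval and a large fold change from those with a large qval but little fold change. Everything in it is an assumption made for illustration: the data frame scpa_toy, its pathway names and values are invented, and the two cutoffs are arbitrary rather than recommended thresholds.

```r
# Toy stand-in for an SCPA result; every value here is invented for illustration.
scpa_toy <- data.frame(
  Pathway = c("PATHWAY_A", "PATHWAY_B", "PATHWAY_C", "PATHWAY_D"),
  qval    = c(9.1, 7.4, 6.8, 0.9),
  FC      = c(-45.2, 0.8, 22.5, -0.3)
)

# Hypothetical cutoffs; choose values that suit your own data.
qval_cutoff <- 5
fc_cutoff   <- 5

# Label each pathway by whether its distribution change comes with enrichment.
scpa_toy$class <- ifelse(
  scpa_toy$qval < qval_cutoff,
  "little change",
  ifelse(abs(scpa_toy$FC) >= fc_cutoff,
         "distribution change with enrichment",
         "distribution change without enrichment")
)

# Order by qval, the primary statistic, so the largest changes come first.
scpa_toy <- scpa_toy[order(scpa_toy$qval, decreasing = TRUE), ]
print(scpa_toy)
```

In this toy table, PATHWAY_B is the kind of pathway a purely enrichment-based filter would miss: its qval is high even though its fold change is close to zero.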

Filtering and visualizing the SCPA output

In general, it’s most informative to visualize the whole output of the pathway analysis, so you can understand the global pattern of pathway changes occurring in your populations. So whilst you can use a typical statistical filter, e.g. adjusted pval < 0.01, we generally suggest using something like a ranking plot to visualize the distribution of qvals whilst still highlighting pathways of interest. We provide some basic functions for this, including plot_rank(), which visualizes the SCPA output as a ranking plot, and plot_heatmap(), which can visualize more than one comparison. See the Visualization tutorial for some examples.
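As a rough illustration of the two approaches, the base R sketch below applies the adjusted p-value filter and then builds a simple qval ranking with highlighted pathways. It is not the plot_rank() implementation and makes no claim about how that function works internally. The data frame holds the first six rows of the example output shown earlier; the object names significant and ranked, the "myc" search pattern and the plot colours are assumptions chosen for the example.

```r
# First six rows of the example output shown earlier (Pathway, adjPval, qval only).
scpa_out <- data.frame(
  Pathway = c("HALLMARK_MYC_TARGETS_V1", "HALLMARK_ALLOGRAFT_REJECTION",
              "HALLMARK_MTORC1_SIGNALING", "HALLMARK_OXIDATIVE_PHOSPHORYLATION",
              "HALLMARK_IL2_STAT5_SIGNALING", "HALLMARK_MYC_TARGETS_V2"),
  adjPval = c(2.894192e-99, 3.766139e-91, 3.766139e-91,
              5.305003e-88, 6.577745e-85, 7.179192e-82),
  qval    = c(9.926655, 9.509159, 9.509159, 9.342126, 9.175071, 9.007992)
)

# Option 1: a typical statistical filter on the adjusted p-value.
significant <- scpa_out[scpa_out$adjPval < 0.01, ]

# Option 2: keep every pathway and rank by qval (rank 1 = largest qval).
ranked <- scpa_out[order(scpa_out$qval, decreasing = TRUE), ]
ranked$rank <- seq_len(nrow(ranked))

# Flag pathways to highlight by a case-insensitive name match ("myc" is an example).
ranked$highlight <- grepl("myc", ranked$Pathway, ignore.case = TRUE)

# Draw a simple ranking plot, only when running in an interactive session.
if (interactive()) {
  plot(ranked$qval, ranked$rank,
       ylim = rev(range(ranked$rank)),  # put rank 1 at the top
       pch = 19,
       col = ifelse(ranked$highlight, "cornflowerblue", "gray70"),
       xlab = "qval", ylab = "Rank")
}

print(ranked[, c("rank", "Pathway", "qval", "highlight")])
```

With only the top rows of a result, the filter removes nothing, which is one reason to plot the full ranking: it shows where the highlighted pathways sit relative to everything else instead of reducing the output to a pass/fail list.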