vignettes/interpreting_scpa_output.Rmd
The output from SCPA is a data frame with four or five columns, depending on the type of comparison: 1) the pathway name (Pathway), 2) the raw p-value (Pval), 3) the adjusted p-value (adjPval), 4) the qval, and, for a two-sample comparison only, 5) the fold change (FC). Here is an example output:
head(scpa_out, 10)
#> Pathway Pval adjPval qval
#> 32 HALLMARK_MYC_TARGETS_V1 5.788383e-101 2.894192e-99 9.926655
#> 2 HALLMARK_ALLOGRAFT_REJECTION 7.532277e-93 3.766139e-91 9.509159
#> 31 HALLMARK_MTORC1_SIGNALING 7.532277e-93 3.766139e-91 9.509159
#> 36 HALLMARK_OXIDATIVE_PHOSPHORYLATION 1.061001e-89 5.305003e-88 9.342126
#> 23 HALLMARK_IL2_STAT5_SIGNALING 1.315549e-86 6.577745e-85 9.175071
#> 33 HALLMARK_MYC_TARGETS_V2 1.435838e-83 7.179192e-82 9.007992
#> 13 HALLMARK_E2F_TARGETS 4.522036e-82 2.261018e-80 8.924444
#> 27 HALLMARK_INTERFERON_GAMMA_RESPONSE 1.379483e-80 6.897414e-79 8.840889
#> 48 HALLMARK_UV_RESPONSE_UP 1.379483e-80 6.897414e-79 8.840889
#> 18 HALLMARK_G2M_CHECKPOINT 4.076176e-79 2.038088e-77 8.757327
#> FC
#> 32 -87.81108
#> 2 -21.07656
#> 31 -45.82991
#> 36 -47.98635
#> 23 -20.23701
#> 33 -20.71347
#> 13 -31.36869
#> 27 -13.62249
#> 48 -20.07974
#> 18 -28.20888
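As a minimal sketch of how these columns can be handled in base R, the block below rebuilds the first three rows of the output above as a stand-in data frame. It checks whether the FC column is present and orders the pathways by qval. It also shows that, for these rows, the printed qval values are numerically consistent with sqrt(-log10(adjPval)); this is an observation about the values shown here, not a statement of how the package defines qval. The object names `has_fc`, `ranked`, and `qval_check` are illustrative.

```r
# Stand-in for an SCPA result: three rows copied from the output above
scpa_out <- data.frame(
  Pathway = c("HALLMARK_MYC_TARGETS_V1",
              "HALLMARK_ALLOGRAFT_REJECTION",
              "HALLMARK_MTORC1_SIGNALING"),
  Pval    = c(5.788383e-101, 7.532277e-93, 7.532277e-93),
  adjPval = c(2.894192e-99, 3.766139e-91, 3.766139e-91),
  qval    = c(9.926655, 9.509159, 9.509159),
  FC      = c(-87.81108, -21.07656, -45.82991)
)

# The FC column is only expected for two-sample comparisons
has_fc <- "FC" %in% colnames(scpa_out)

# Order pathways by qval, largest first
ranked <- scpa_out[order(scpa_out$qval, decreasing = TRUE), ]
top_pathway <- ranked$Pathway[1]

# Observation only: compare the printed qval with sqrt(-log10(adjPval))
qval_check <- sqrt(-log10(scpa_out$adjPval))
max_abs_diff <- max(abs(qval_check - scpa_out$qval))
```

On a real result you would skip the `data.frame()` call and apply the same steps to the object returned by your comparison.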
Instead of looking at pathway enrichment, SCPA assesses changes in the multivariate distribution of a pathway. Pathways that are enriched in a given population will necessarily also show large changes in multivariate distribution, so with SCPA you’re able to detect 1) enriched pathways and 2) non-enriched pathways that have transcriptional changes independent of enrichment. Both are reflected in the qval, and we recommend using it as your primary statistic: the larger the qval, the larger the change in pathway ‘activity’.

Because SCPA measures multivariate distributions, some pathways will show large qvals but no overall fold change/enrichment in a given population. Whilst these pathways are not enriched, we know that the distribution changes identified by SCPA are still important for cellular behaviour. We show an example of this in Figure 4 of our paper, where arachidonic acid metabolism is not enriched but shows changes in multivariate distribution, and is shown to be critical for T cell activation.

We therefore propose that changes in the multivariate distribution of a pathway are a better overall reflection of pathway activity. For this reason, you should use the qval as your statistical measure of pathway activity, and fold change only as a secondary, informative value.
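To make the distinction between the two kinds of pathway concrete, here is a rough base R sketch that labels each pathway by combining qval and fold change. The data frame is invented for illustration, and the pathway names, `qval_cutoff`, and `fc_cutoff` are all assumptions rather than values recommended by SCPA.

```r
# Hypothetical toy result: all values are invented for illustration only
toy <- data.frame(
  Pathway = c("PATHWAY_A", "PATHWAY_B", "PATHWAY_C", "PATHWAY_D"),
  qval    = c(9.1, 8.4, 2.0, 0.5),
  FC      = c(-40.0, 1.5, -3.0, 0.8)
)

# Assumed cutoffs, chosen only to make the example readable
qval_cutoff <- 5
fc_cutoff   <- 5

# qval is the primary statistic; FC only refines the label
toy$class <- ifelse(
  toy$qval < qval_cutoff,
  "little distribution change",
  ifelse(abs(toy$FC) >= fc_cutoff,
         "distribution change with enrichment",
         "distribution change without enrichment")
)

# Pathways that an enrichment-only view would miss
not_enriched_but_changed <- toy$Pathway[
  toy$class == "distribution change without enrichment"
]
```

In this toy data, PATHWAY_B falls into the second group: a large qval with almost no fold change. The classification starts from qval and consults FC only afterwards, which mirrors the recommendation to treat fold change as secondary.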
In general, it’s most informative to visualize the whole output of
the pathway analysis so you can understand the global pattern of pathway
changes that are occurring in your populations. So whilst you can use a
typical statistical filter, e.g. adjusted pval < 0.01, we generally
propose using something like a ranking plot to visualize the
distribution of qvals, whilst still being able to highlight certain
pathways. We provide some basic functions including plot_rank() that can
visualize the SCPA output like this, and also provide the plot_heatmap()
function that can visualize more than one comparison. See the
Visualization tutorial for some examples of this.
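The idea behind a ranking plot can be sketched in base R without relying on the package's plotting functions. The block below does not call plot_rank(); it is an independent illustration that uses the ten pathways and qvals printed earlier, ranks them by qval, and highlights pathways whose name matches a pattern. The pattern "MYC", the colours, and the object name `scpa_top` are arbitrary choices for the example.

```r
# Pathway names and qvals copied from the example output earlier on this page
scpa_top <- data.frame(
  Pathway = c("HALLMARK_MYC_TARGETS_V1", "HALLMARK_ALLOGRAFT_REJECTION",
              "HALLMARK_MTORC1_SIGNALING", "HALLMARK_OXIDATIVE_PHOSPHORYLATION",
              "HALLMARK_IL2_STAT5_SIGNALING", "HALLMARK_MYC_TARGETS_V2",
              "HALLMARK_E2F_TARGETS", "HALLMARK_INTERFERON_GAMMA_RESPONSE",
              "HALLMARK_UV_RESPONSE_UP", "HALLMARK_G2M_CHECKPOINT"),
  qval = c(9.926655, 9.509159, 9.509159, 9.342126, 9.175071,
           9.007992, 8.924444, 8.840889, 8.840889, 8.757327)
)

# Rank 1 is the largest qval; tied qvals share the smallest rank
scpa_top$rank <- rank(-scpa_top$qval, ties.method = "min")

# Flag the pathways to highlight (pattern chosen arbitrarily)
scpa_top$highlight <- grepl("MYC", scpa_top$Pathway)

# Plot qval against rank, with highlighted pathways drawn larger and in colour
plot(
  x = scpa_top$qval,
  y = scpa_top$rank,
  pch = 21,
  bg = ifelse(scpa_top$highlight, "orange", "grey80"),
  cex = ifelse(scpa_top$highlight, 1.8, 1.2),
  xlab = "qval",
  ylab = "Rank (1 = largest qval)"
)
```

Plotting every pathway rather than only those passing a cutoff keeps the overall distribution of qvals visible, while the highlighting still draws attention to the pathways of interest.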