This function takes a Seurat object as input and compares gene sets across the specified conditions/populations.

compare_seurat(
  seurat_object,
  assay = "RNA",
  group1 = NULL,
  group1_population = NULL,
  group2 = NULL,
  group2_population = NULL,
  pathways,
  downsample = 500,
  min_genes = 15,
  max_genes = 500,
  parallel = FALSE,
  cores = NULL
)

Arguments

seurat_object

Seurat object with populations defined in the meta data

assay

Assay to pull expression data from

group1

First comparison group, given as a meta data column name in the Seurat object, e.g. cell_type

group1_population

Populations within group1 to compare e.g. c("t_cell", "b_cell")

group2

Second comparison group, given as a meta data column name in the Seurat object, e.g. hour

group2_population

Population within group2 to compare e.g. 24

pathways

Pathway gene sets with each pathway in a separate list. For formatting of gene lists, see documentation at https://jackbibby1.github.io/SCPA/articles/using_gene_sets.html

downsample

Number of cells to downsample to. Defaults to 500 cells per condition. If a condition has fewer cells than this value, all cells from that condition are used.

min_genes

Gene sets with fewer than this number of genes will be excluded

max_genes

Gene sets with more than this number of genes will be excluded

parallel

Should parallel processing be used?

cores

The number of cores used for parallel processing
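
The downsample, min_genes and max_genes arguments describe simple filtering rules. The base R sketch below illustrates those rules on toy data. The helpers filter_gene_sets and downsample_cells are hypothetical and are not part of SCPA, and gene sets are represented here as a plain named list of character vectors, which is an assumption made for illustration and not necessarily the format SCPA expects (see the link under pathways for that).

```r
# Hypothetical helper (not part of SCPA): keep only gene sets whose size
# lies within [min_genes, max_genes].
filter_gene_sets <- function(gene_sets, min_genes = 15, max_genes = 500) {
  sizes <- lengths(gene_sets)
  gene_sets[sizes >= min_genes & sizes <= max_genes]
}

# Hypothetical helper (not part of SCPA): randomly sample at most
# `downsample` cells; if fewer cells are available, return all of them.
downsample_cells <- function(cell_ids, downsample = 500) {
  if (length(cell_ids) <= downsample) {
    return(cell_ids)
  }
  sample(cell_ids, size = downsample)
}

# Toy gene sets with made-up gene names, for illustration only
toy_sets <- list(
  too_small = paste0("gene", 1:10),
  kept      = paste0("gene", 1:50),
  too_large = paste0("gene", 1:600)
)
kept_sets <- filter_gene_sets(toy_sets)
names(kept_sets)

# Toy cell barcodes for a small and a large population
small_population <- paste0("cell", 1:120)
large_population <- paste0("cell", 1:2000)
n_small <- length(downsample_cells(small_population))
n_large <- length(downsample_cells(large_population))
```

With the default settings, only the 50-gene set passes the size filter, the 120-cell population is used in full, and the 2000-cell population is reduced to 500 cells.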

Value

Statistical results from the SCPA analysis. The qval should be the primary metric used to interpret pathway differences, i.e. a higher qval indicates a larger pathway difference between conditions. If only two samples are provided, a fold change (FC) enrichment score is also calculated. The FC output is generated from a running sum of mean changes in gene expression across all genes of the pathway. It is calculated as average pathway expression in population1 minus population2, so a negative FC means the pathway is higher in population2.
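
To make the sign convention of the FC score concrete, the sketch below computes a simplified stand-in on toy matrices: per-gene mean expression in population 1 minus population 2, summed over the genes of one pathway. It only illustrates the direction of the score under that assumption. SCPA's actual calculation may differ in detail, and all names and values here are made up.

```r
# Toy expression matrices: rows = genes of one pathway, columns = cells.
# Values are made up purely for illustration.
genes <- c("geneA", "geneB", "geneC")

population1 <- matrix(
  c(1, 1, 1,   # cell1
    2, 2, 2),  # cell2
  nrow = 3,
  dimnames = list(genes, c("cell1", "cell2"))
)

population2 <- matrix(
  c(3, 3, 3,   # cell1
    5, 5, 5),  # cell2
  nrow = 3,
  dimnames = list(genes, c("cell1", "cell2"))
)

# Per-gene mean difference (population1 - population2), summed over genes
per_gene_change <- rowMeans(population1) - rowMeans(population2)
toy_fc <- sum(per_gene_change)

# A negative value means the pathway is higher in population2
toy_fc < 0
```

Here every gene is expressed more highly in population2, so the summed difference is negative, matching the convention described above.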

Examples

if (FALSE) {
scpa_out <- compare_seurat(
     seurat_object,
     group1 = "cell",
     group1_population = c("t_cell", "b_cell"),
     group2 = "hour",
     group2_population = c("24"),
     pathways = pathways)
}