Get summed proportions — get_summed

get_summed_proportions: Given a target gene set, randomly sample gene lists of the same length, obtain the specificity of the genes in each list, and then compute the mean specificity for each sampled list (and for the target list).
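The idea described above can be illustrated with a small base R sketch. This is not EWCE's internal code: the specificity matrix, gene names, cell type names, and the `hit_cells` object are hypothetical toy values, and the sketch sums specificity per cell type (matching the "summed proportion" wording used under Value) without any of the optional controls.

```r
# Toy sketch in base R; an illustration only, not the package implementation.
set.seed(1)

# Hypothetical specificity matrix: 50 genes x 3 cell types, each row sums to 1.
raw <- matrix(
  runif(50 * 3),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), c("ctA", "ctB", "ctC"))
)
specificity <- raw / rowSums(raw)

hits <- paste0("gene", 1:5)  # hypothetical target gene set
reps <- 100                  # number of random gene lists

# Summed specificity per cell type for the target list.
hit_cells <- colSums(specificity[hits, , drop = FALSE])

# One row per random gene list of the same length as `hits`.
bootstrap_data <- t(replicate(reps, {
  sampled <- sample(rownames(specificity), length(hits))
  colSums(specificity[sampled, , drop = FALSE])
}))

dim(bootstrap_data)  # reps rows, one column per cell type
```

Because every gene's specificity sums to 1 across cell types in this toy matrix, each row of `bootstrap_data` sums to the number of genes in the list.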

get_summed_proportions(
  hits,
  sct_data,
  annotLevel,
  reps,
  no_cores = 1,
  geneSizeControl,
  controlledCT = NULL,
  control_network = NULL,
  store_gene_data = TRUE,
  verbose = TRUE
)

Arguments

hits: list of gene names. The target gene set.
sct_data: List generated using generate_celltype_data.
annotLevel: An integer indicating which level of sct_data to analyse (Default: 1).
reps: Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results).
no_cores: Number of cores to parallelise bootstrapping reps over.
geneSizeControl: Whether you want to control for GC content and transcript length. Recommended if the gene list originates from genetic studies (Default: FALSE). If set to TRUE, then hits must be from humans.
controlledCT: [Optional] If not NULL, and instead is the name of a cell type, then the bootstrapping controls for expression within that cell type.
control_network: If geneSizeControl=TRUE, then must provide the control network.
store_gene_data: Store sampled gene data for every bootstrap iteration. When the number of bootstrap reps is very high (>=100k) and/or the number of genes in hits is very high, you may want to set store_gene_data=FALSE to avoid using excessive amounts of memory.
verbose: Print messages.

Value

A list containing up to four elements:

hit.cells: vector containing the summed proportion of expression in each cell type for the target list.
gene_data: data.table showing the number of times each gene appeared in the bootstrap samples.
bootstrap_data: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists.
controlledCT: the controlled cell type (if applicable)
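As a rough sketch of how the returned elements relate to one another, the snippet below compares the target list's value in each cell type with the bootstrap distribution for that cell type. The `res` object is a hand-built stand-in with made-up numbers, and the sketch assumes that `hit.cells` is a named vector whose order matches the columns of `bootstrap_data`. It shows one simple empirical summary, not necessarily the calculation performed by bootstrap_enrichment_test.

```r
# Hypothetical stand-in for the returned list; all values are invented.
set.seed(42)
cell_types <- c("ctA", "ctB", "ctC")
res <- list(
  hit.cells = c(ctA = 2.6, ctB = 1.4, ctC = 1.0),
  bootstrap_data = matrix(
    rnorm(1000 * 3, mean = 1.67, sd = 0.3),
    ncol = 3,
    dimnames = list(NULL, cell_types)
  )
)

# Fraction of random lists whose value is at least as large as the target's,
# computed separately for each cell type (assumes matching column order).
p_values <- vapply(
  seq_along(res$hit.cells),
  function(i) mean(res$bootstrap_data[, i] >= res$hit.cells[i]),
  numeric(1)
)
names(p_values) <- names(res$hit.cells)

p_values
```

A small fraction suggests that the target list is more specific to that cell type than most random lists of the same length.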

Details

See bootstrap_enrichment_test for examples.