merged_ewce
Combines enrichment results from multiple studies
targeting the same scientific problem.
merged_ewce(results, reps = 100)
results: A list of EWCE results generated using add_res_to_merging_list.
reps: Number of random gene lists to generate (default: 100; use >=10,000 for publication-quality results).
Returns a data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list.
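The three statistics in each row can be read as summaries of an observed score compared against a bootstrap null distribution. The sketch below illustrates that reading in base R. It is not EWCE's internal code: `observed`, `boot` and the simulated values are hypothetical, and the formulas are assumed for illustration. It also shows why `reps` limits the precision of the p-value: with `reps` random lists, an empirical p-value of this form can only change in steps of 1/reps.

```r
# Hypothetical illustration; not EWCE's internal implementation.
set.seed(1)
reps <- 100

# Simulated null scores, one per random gene list (made-up values)
boot <- rnorm(reps, mean = 1, sd = 0.2)

# Made-up observed score for the target gene list
observed <- 1.5

# Empirical p-value: share of random lists scoring at least as high
p <- sum(boot >= observed) / reps

# Fold change: observed score relative to the bootstrap mean
fold_change <- observed / mean(boot)

# Number of bootstrap standard deviations between observed score and bootstrap mean
sd_from_mean <- (observed - mean(boot)) / sd(boot)

# The p-value can only take values on a grid with this step size
p_resolution <- 1 / reps
```

Under these assumptions, raising `reps` from 100 to 10,000 shrinks the step from 0.01 to 0.0001, which is why small values of `reps` suit only quick tests.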
# Load the single cell data
ctd <- ewceData::ctd()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
# Use 3 bootstrap lists for speed; for publishable analysis use >=10,000
reps <- 3
# Use the top 5 up/down-regulated genes (thresh) for speed; the default is 250
thresh <- 5
# Load the differential expression tables for the two studies
tt_alzh_BA36 <- ewceData::tt_alzh_BA36()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
tt_alzh_BA44 <- ewceData::tt_alzh_BA44()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
# Run EWCE analysis
tt_results_36 <- EWCE::ewce_expression_data(
    sct_data = ctd,
    tt = tt_alzh_BA36,
    thresh = thresh,
    annotLevel = 1,
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse"
)
#> Warning: genelistSpecies not provided. Setting to 'human' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Preparing gene_df.
#> character format detected.
#> Converting to data.frame
#> Extracting genes from input_gene.
#> 15,259 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 13,416 genes extracted.
#> Extracting genes from ortholog_gene.
#> 13,416 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 46 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 56 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Returning gene_map as dictionary
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 2,016 / 15,259 (13%)
#> Total genes remaining after convert_orthologs :
#> 13,243 / 15,259 (87%)
#> Generating gene background for mouse x human ==> human
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Gene table with 21,207 rows retrieved.
#> Returning all 21,207 genes from mouse.
#> --
#> --
#> Preparing gene_df.
#> data.frame format detected.
#> Extracting genes from Gene.Symbol.
#> 21,207 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 17,355 genes extracted.
#> Extracting genes from ortholog_gene.
#> 17,355 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 131 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 498 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Adding input_gene col to gene_df.
#> Adding ortholog_gene col to gene_df.
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 4,725 / 21,207 (22%)
#> Total genes remaining after convert_orthologs :
#> 16,482 / 21,207 (78%)
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 16,482 / 21,207 (77.72%) target_species genes remain after ortholog conversion.
#> 16,482 / 19,129 (86.16%) reference_species genes remain after ortholog conversion.
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 19,129 / 19,129 (100%) target_species genes remain after ortholog conversion.
#> 19,129 / 19,129 (100%) reference_species genes remain after ortholog conversion.
#> 16,482 intersect background genes used.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Returning 19,129 unique genes from entire human genome.
#> Using intersect between background gene lists: 16,482 genes.
#> Standardising sct_data.
#> Using 1st column of tt as gene column: HGNC.symbol
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 6 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 2 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 pyramidal_CA1 1 0 1.411635 2.789677 0
#> 2 pyramidal_SS 1 0 1.192536 1.785915 0
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 5 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 3 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 microglia 1 0 2.841819 6.454695 0
#> 2 pyramidal_SS 1 0 1.455683 4.158702 0
#> 3 astrocytes_ependymal 1 0 1.530297 1.227513 0
tt_results_44 <- EWCE::ewce_expression_data(
    sct_data = ctd,
    tt = tt_alzh_BA44,
    thresh = thresh,
    annotLevel = 1,
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse"
)
#> Warning: genelistSpecies not provided. Setting to 'human' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Preparing gene_df.
#> character format detected.
#> Converting to data.frame
#> Extracting genes from input_gene.
#> 15,259 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 13,416 genes extracted.
#> Extracting genes from ortholog_gene.
#> 13,416 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 46 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 56 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Returning gene_map as dictionary
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 2,016 / 15,259 (13%)
#> Total genes remaining after convert_orthologs :
#> 13,243 / 15,259 (87%)
#> Generating gene background for mouse x human ==> human
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Gene table with 21,207 rows retrieved.
#> Returning all 21,207 genes from mouse.
#> --
#> --
#> Preparing gene_df.
#> data.frame format detected.
#> Extracting genes from Gene.Symbol.
#> 21,207 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 17,355 genes extracted.
#> Extracting genes from ortholog_gene.
#> 17,355 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 131 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 498 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Adding input_gene col to gene_df.
#> Adding ortholog_gene col to gene_df.
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 4,725 / 21,207 (22%)
#> Total genes remaining after convert_orthologs :
#> 16,482 / 21,207 (78%)
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 16,482 / 21,207 (77.72%) target_species genes remain after ortholog conversion.
#> 16,482 / 19,129 (86.16%) reference_species genes remain after ortholog conversion.
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 19,129 / 19,129 (100%) target_species genes remain after ortholog conversion.
#> 19,129 / 19,129 (100%) reference_species genes remain after ortholog conversion.
#> 16,482 intersect background genes used.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Returning 19,129 unique genes from entire human genome.
#> Using intersect between background gene lists: 16,482 genes.
#> Standardising sct_data.
#> Using 1st column of tt as gene column: HGNC.symbol
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 6 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 3 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 endothelial_mural 1 0 2.002147 5.407831 0
#> 2 oligodendrocytes 1 0 2.010357 4.419611 0
#> 3 astrocytes_ependymal 1 0 1.156367 3.094029 0
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 5 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 2 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 astrocytes_ependymal 1 0 1.803066 2.350458 0
#> 2 pyramidal_CA1 1 0 1.493630 1.528110 0
# Fill a list with the results
results <- EWCE::add_res_to_merging_list(tt_results_36)
results <- EWCE::add_res_to_merging_list(tt_results_44, results)
# Perform the merged analysis
# reps = 2 is for speed only; for publication use >=10,000
merged_res <- EWCE::merged_ewce(
    results = results,
    reps = 2
)
print(merged_res)
#> CellType p fc sd_from_mean
#> astrocytes_ependymal astrocytes_ependymal 0.66130 0.8866746 -0.5885042
#> endothelial_mural endothelial_mural 0.33110 1.1171572 0.8496895
#> interneurons interneurons 0.78115 0.7658062 -0.9465746
#> microglia microglia 0.66555 0.8459999 -0.6892282
#> oligodendrocytes oligodendrocytes 0.00000 1.7426915 2.4818484
#> pyramidal_CA1 pyramidal_CA1 0.44395 1.0142639 0.2497222
#> pyramidal_SS pyramidal_SS 1.00000 0.9174883 -1.4337207
#> astrocytes_ependymal1 astrocytes_ependymal 0.00000 1.6844598 3.1154348
#> endothelial_mural1 endothelial_mural 1.00000 0.5559731 -1.8899465
#> interneurons1 interneurons 0.77590 0.7063486 -0.8681162
#> microglia1 microglia 0.00000 1.4252970 1.4699207
#> oligodendrocytes1 oligodendrocytes 0.22070 1.1261607 0.5880097
#> pyramidal_CA11 pyramidal_CA1 0.11095 1.1953337 1.6270440
#> pyramidal_SS1 pyramidal_SS 0.00000 1.2607602 3.0064648
#> Direction
#> astrocytes_ependymal Up
#> endothelial_mural Up
#> interneurons Up
#> microglia Up
#> oligodendrocytes Up
#> pyramidal_CA1 Up
#> pyramidal_SS Up
#> astrocytes_ependymal1 Down
#> endothelial_mural1 Down
#> interneurons1 Down
#> microglia1 Down
#> oligodendrocytes1 Down
#> pyramidal_CA11 Down
#> pyramidal_SS1 Down
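The merged table holds one row per cell type and direction, so a natural follow-up is multiple-testing correction and filtering. The sketch below assumes the column names printed above. The `q` column and the 0.05 cutoff are additions made here for illustration, not something `merged_ewce` returns. A few rows are copied from the printed output so the block runs on its own. Those p-values come from a run with very few bootstrap replicates and should not be interpreted.

```r
# A few rows copied from the printed output above, so the sketch is self-contained
merged_res <- data.frame(
    CellType = c("oligodendrocytes", "pyramidal_SS", "microglia", "interneurons"),
    p = c(0.00000, 1.00000, 0.00000, 0.77590),
    fc = c(1.7426915, 0.9174883, 1.4252970, 0.7063486),
    sd_from_mean = c(2.4818484, -1.4337207, 1.4699207, -0.8681162),
    Direction = c("Up", "Up", "Down", "Down")
)

# Benjamini-Hochberg correction across all rows
# (q is a column added in this sketch, not by merged_ewce)
merged_res$q <- stats::p.adjust(merged_res$p, method = "BH")

# Keep rows passing an assumed 0.05 cutoff, largest sd_from_mean first
sig <- merged_res[merged_res$q < 0.05, ]
sig <- sig[order(-sig$sd_from_mean), ]
print(sig)
```

Correcting across both directions at once is one possible choice. Correcting within each `Direction` separately is an equally reasonable alternative, depending on how the results will be reported.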