merged_ewce combines enrichment results from multiple studies
targeting the same scientific problem.
merged_ewce(results, reps = 100)
results: A list of EWCE results generated using add_res_to_merging_list.
reps: Number of random gene lists to generate (default = 100, but should be >= 10,000 for publication-quality results).
A data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list.
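The recommendation to raise reps for publication follows from how resampling-based p-values behave in general: with N random gene lists, the smallest non-zero p-value that can be reported is 1/N. The sketch below is a generic illustration of that point, not EWCE's internal code; the helper `empirical_p` and the simulated null statistics are hypothetical.

```r
# Generic illustration (not EWCE internals): an empirical p-value is the
# fraction of random draws whose statistic is at least the observed one.
empirical_p <- function(observed, null_stats) {
  sum(null_stats >= observed) / length(null_stats)
}

set.seed(1)
observed <- 3                                   # hypothetical observed statistic
p_few  <- empirical_p(observed, rnorm(100))     # a multiple of 1/100
p_many <- empirical_p(observed, rnorm(10000))   # a multiple of 1/10000

# Smallest non-zero p-value each setting can resolve
smallest_nonzero <- c(reps_100 = 1 / 100, reps_10000 = 1 / 10000)
print(smallest_nonzero)
```

With only a handful of random lists, a reported p-value of 0 therefore means only that no random list matched the observed statistic, not that the true p-value is vanishingly small.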
# Load the single cell data
ctd <- ewceData::ctd()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
# Use 3 bootstrap lists for speed; for publishable analysis use >= 10,000
reps <- 3
# Use 5 up/down regulated genes (thresh) for speed; the default is 250
thresh <- 5
# Load the data
tt_alzh_BA36 <- ewceData::tt_alzh_BA36()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
tt_alzh_BA44 <- ewceData::tt_alzh_BA44()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
# Run EWCE analysis
tt_results_36 <- EWCE::ewce_expression_data(
    sct_data = ctd,
    tt = tt_alzh_BA36,
    thresh = thresh,
    annotLevel = 1,
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse"
)
#> Warning: genelistSpecies not provided. Setting to 'human' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Preparing gene_df.
#> character format detected.
#> Converting to data.frame
#> Extracting genes from input_gene.
#> 15,259 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 13,416 genes extracted.
#> Extracting genes from ortholog_gene.
#> 13,416 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 46 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 56 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Returning gene_map as dictionary
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 2,016 / 15,259 (13%)
#> Total genes remaining after convert_orthologs :
#> 13,243 / 15,259 (87%)
#> Generating gene background for mouse x human ==> human
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Gene table with 21,207 rows retrieved.
#> Returning all 21,207 genes from mouse.
#> --
#> --
#> Preparing gene_df.
#> data.frame format detected.
#> Extracting genes from Gene.Symbol.
#> 21,207 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 17,355 genes extracted.
#> Extracting genes from ortholog_gene.
#> 17,355 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 131 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 498 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Adding input_gene col to gene_df.
#> Adding ortholog_gene col to gene_df.
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 4,725 / 21,207 (22%)
#> Total genes remaining after convert_orthologs :
#> 16,482 / 21,207 (78%)
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 16,482 / 21,207 (77.72%) target_species genes remain after ortholog conversion.
#> 16,482 / 19,129 (86.16%) reference_species genes remain after ortholog conversion.
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 19,129 / 19,129 (100%) target_species genes remain after ortholog conversion.
#> 19,129 / 19,129 (100%) reference_species genes remain after ortholog conversion.
#> 16,482 intersect background genes used.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Returning 19,129 unique genes from entire human genome.
#> Using intersect between background gene lists: 16,482 genes.
#> Standardising sct_data.
#> Using 1st column of tt as gene column: HGNC.symbol
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 6 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 2 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 astrocytes_ependymal 1 0 1.593940 6.780700 0
#> 2 oligodendrocytes 1 0 1.634139 4.738468 0
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 5 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 2 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 oligodendrocytes 1 0 1.462798 8.531779 0
#> 2 microglia 1 0 2.686755 2.852975 0
tt_results_44 <- EWCE::ewce_expression_data(
    sct_data = ctd,
    tt = tt_alzh_BA44,
    thresh = thresh,
    annotLevel = 1,
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse"
)
#> Warning: genelistSpecies not provided. Setting to 'human' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Warning: sctSpecies_origin not provided. Setting to 'mouse' by default.
#> Preparing gene_df.
#> character format detected.
#> Converting to data.frame
#> Extracting genes from input_gene.
#> 15,259 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 13,416 genes extracted.
#> Extracting genes from ortholog_gene.
#> 13,416 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 46 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 56 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Returning gene_map as dictionary
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 2,016 / 15,259 (13%)
#> Total genes remaining after convert_orthologs :
#> 13,243 / 15,259 (87%)
#> Generating gene background for mouse x human ==> human
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Gene table with 21,207 rows retrieved.
#> Returning all 21,207 genes from mouse.
#> --
#> --
#> Preparing gene_df.
#> data.frame format detected.
#> Extracting genes from Gene.Symbol.
#> 21,207 genes extracted.
#> Converting mouse ==> human orthologs using: homologene
#> Retrieving all organisms available in homologene.
#> Mapping species name: mouse
#> Common name mapping found for mouse
#> 1 organism identified from search: 10090
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Checking for genes without orthologs in human.
#> Extracting genes from input_gene.
#> 17,355 genes extracted.
#> Extracting genes from ortholog_gene.
#> 17,355 genes extracted.
#> Checking for genes without 1:1 orthologs.
#> Dropping 131 genes that have multiple input_gene per ortholog_gene (many:1).
#> Dropping 498 genes that have multiple ortholog_gene per input_gene (1:many).
#> Filtering gene_df with gene_map
#> Adding input_gene col to gene_df.
#> Adding ortholog_gene col to gene_df.
#>
#> =========== REPORT SUMMARY ===========
#> Total genes dropped after convert_orthologs :
#> 4,725 / 21,207 (22%)
#> Total genes remaining after convert_orthologs :
#> 16,482 / 21,207 (78%)
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 16,482 / 21,207 (77.72%) target_species genes remain after ortholog conversion.
#> 16,482 / 19,129 (86.16%) reference_species genes remain after ortholog conversion.
#> Gathering ortholog reports.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> --
#>
#> =========== REPORT SUMMARY ===========
#> 19,129 / 19,129 (100%) target_species genes remain after ortholog conversion.
#> 19,129 / 19,129 (100%) reference_species genes remain after ortholog conversion.
#> 16,482 intersect background genes used.
#> Retrieving all genes using: homologene.
#> Retrieving all organisms available in homologene.
#> Mapping species name: human
#> Common name mapping found for human
#> 1 organism identified from search: 9606
#> Gene table with 19,129 rows retrieved.
#> Returning all 19,129 genes from human.
#> Returning 19,129 unique genes from entire human genome.
#> Using intersect between background gene lists: 16,482 genes.
#> Standardising sct_data.
#> Using 1st column of tt as gene column: HGNC.symbol
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 6 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 2 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 oligodendrocytes 1 0 1.828088 2.891376 0
#> 2 microglia 1 0 1.303466 2.027416 0
#> 1 core(s) assigned as workers (3 reserved).
#> Standardising CellTypeDataset
#> Checking gene list inputs.
#> Running without gene size control.
#> 5 hit gene(s) remain after filtering.
#> Computing gene scores.
#> Using previously sampled genes.
#> Computing gene counts.
#> Testing for enrichment in 7 cell types...
#> Sorting results by p-value.
#> Computing BH-corrected q-values.
#> 1 significant cell type enrichment results @ q<0.05 :
#> CellType annotLevel p fold_change sd_from_mean q
#> 1 astrocytes_ependymal 1 0 1.805527 1.328802 0
# Fill a list with the results
results <- EWCE::add_res_to_merging_list(tt_results_36)
results <- EWCE::add_res_to_merging_list(tt_results_44, results)
# Perform the merged analysis
# reps = 2 keeps the example fast; for publication, use reps >= 10,000
merged_res <- EWCE::merged_ewce(
    results = results,
    reps = 2
)
print(merged_res)
#> CellType p fc sd_from_mean
#> astrocytes_ependymal astrocytes_ependymal 0.00000 1.4687607 2.216106067
#> endothelial_mural endothelial_mural 0.88745 0.8875690 -0.897374689
#> interneurons interneurons 1.00000 0.6851364 -2.164713522
#> microglia microglia 0.22615 1.1189468 0.962195598
#> oligodendrocytes oligodendrocytes 0.00000 1.7224824 5.865831164
#> pyramidal_CA1 pyramidal_CA1 1.00000 0.7706657 -1.868383625
#> pyramidal_SS pyramidal_SS 0.77845 0.8386722 -1.470335230
#> astrocytes_ependymal1 astrocytes_ependymal 0.22155 1.2243773 0.688081191
#> endothelial_mural1 endothelial_mural 0.44215 1.0023543 0.008742742
#> interneurons1 interneurons 1.00000 0.6952096 -1.275238460
#> microglia1 microglia 0.22470 1.2819567 0.764834358
#> oligodendrocytes1 oligodendrocytes 0.33610 1.2106870 0.731059514
#> pyramidal_CA11 pyramidal_CA1 0.33075 1.0210804 0.139707333
#> pyramidal_SS1 pyramidal_SS 0.77530 0.7672755 -0.896820478
#> Direction
#> astrocytes_ependymal Up
#> endothelial_mural Up
#> interneurons Up
#> microglia Up
#> oligodendrocytes Up
#> pyramidal_CA1 Up
#> pyramidal_SS Up
#> astrocytes_ependymal1 Down
#> endothelial_mural1 Down
#> interneurons1 Down
#> microglia1 Down
#> oligodendrocytes1 Down
#> pyramidal_CA11 Down
#> pyramidal_SS1 Down
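
The printed merged_res holds one row per cell type and direction, with columns CellType, p, fc, sd_from_mean and Direction. The following is a minimal base R sketch of one way to post-process such a table, using the seven "Up" rows copied from the output above. The `q` column and the choice to apply Benjamini-Hochberg correction within a single direction are assumptions made for illustration; they are not something merged_ewce is documented to do here.

```r
# The seven "Up" rows copied from the printed merged_res above
merged_up <- data.frame(
  CellType = c("astrocytes_ependymal", "endothelial_mural", "interneurons",
               "microglia", "oligodendrocytes", "pyramidal_CA1", "pyramidal_SS"),
  p = c(0.00000, 0.88745, 1.00000, 0.22615, 0.00000, 1.00000, 0.77845),
  fc = c(1.4687607, 0.8875690, 0.6851364, 1.1189468,
         1.7224824, 0.7706657, 0.8386722),
  Direction = "Up",
  stringsAsFactors = FALSE
)

# Assumption: correct for the seven cell types tested using BH (base R)
merged_up$q <- p.adjust(merged_up$p, method = "BH")

# Keep cell types passing q < 0.05, largest fold change first
sig <- merged_up[merged_up$q < 0.05, ]
sig <- sig[order(-sig$fc), c("CellType", "fc", "q")]
print(sig)
```

Because this example ran with very few random gene lists, the p-values of 0 here are artifacts of low resolution, so the filtered table demonstrates the mechanics only.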