test_gene_rank() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns a `tbl` with the GSEA statistics.
test_gene_rank(
.data,
.entrez,
.arrange_desc,
species,
gene_sets = NULL,
gene_set = NULL
)
# S4 method for class 'SummarizedExperiment'
test_gene_rank(
.data,
.entrez,
.arrange_desc,
species,
gene_sets = NULL,
gene_set = NULL
)
# S4 method for class 'RangedSummarizedExperiment'
test_gene_rank(
.data,
.entrez,
.arrange_desc,
species,
gene_sets = NULL,
gene_set = NULL
)
.data: A `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment))
.entrez: The ENTREZ ID of the transcripts/genes
.arrange_desc: The name of the column to arrange in decreasing order
species: A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens")
gene_sets: A character vector or a list. It can take one or more of the following built-in collections as a character vector: c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "kegg_disease", "kegg_metabolism", "kegg_signaling"), to be used with EGSEA buildIdx. c1 is human specific. Alternatively, a list of user-supplied gene sets can be provided, to be used with EGSEA buildCustomIdx. In that case, each gene set is a character vector of Entrez IDs and the names of the list are the gene set names.
gene_set: DEPRECATED. Use gene_sets instead.
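As an illustration of the user-supplied form of gene_sets described above, a custom collection could be written as a named list of character vectors. This is only a sketch: the set names and Entrez IDs below are hypothetical placeholders, not a curated collection.

```r
# Hypothetical custom gene sets: list names are the gene set names,
# values are character vectors of Entrez IDs (placeholders for illustration).
my_gene_sets <- list(
  example_set_a = c("7124", "7132", "7133"),
  example_set_b = c("5156", "5159")
)

# Basic shape checks before passing the list as `gene_sets`
stopifnot(
  is.list(my_gene_sets),
  !is.null(names(my_gene_sets)),
  all(vapply(my_gene_sets, is.character, logical(1)))
)
```

Keeping the IDs as character (not numeric) matches the description of each gene set as a character vector of Entrez IDs.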
A consistent object (to the input)
A `SummarizedExperiment` object
A `RangedSummarizedExperiment` object
This wrapper executes gene enrichment analyses of the dataset using a list of transcripts and GSEA. It uses clusterProfiler (DOI: doi.org/10.1089/omi.2011.0118) on the back-end.
Underlying method:
my_gene_collection <- msigdbr::msigdbr(species = species)
my_gene_collection <- filter(my_gene_collection, gs_collection
# Execute calculation
nest(data = -gs_collection) |>
mutate(fit = map(
  data,
  ~ clusterProfiler::GSEA(
    my_entrez_rank,
    TERM2GENE = .x |> select(gs_name, ncbi_gene),
    pvalueCutoff = 1
  )
))
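The `my_entrez_rank` object used in the snippet above is not defined on this page. clusterProfiler::GSEA expects a named numeric vector sorted in decreasing order, so one plausible construction is sketched below in base R. This is an assumption for illustration, not the package's confirmed internals. The toy table and its values are hypothetical, and the column names entrez and logFC mirror the example call in the Examples section.

```r
# Toy differential-abundance results (hypothetical values)
toy <- data.frame(
  entrez = c("7124", "5156", "7132", "5159"),
  logFC  = c(1.2, -0.4, 2.5, 0.3),
  stringsAsFactors = FALSE
)

# Order by the ranking statistic, largest first
toy <- toy[order(toy$logFC, decreasing = TRUE), ]

# Named numeric vector: values are the statistic, names are Entrez IDs
my_entrez_rank <- setNames(toy$logFC, toy$entrez)

print(my_entrez_rank)
```

In this reading, `.arrange_desc` supplies the statistic (here logFC) and `.entrez` supplies the names of the ranked vector.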
Mangiola, S., Molania, R., Dong, R., Doyle, M. A., & Papenfuss, A. T. (2021). tidybulk: an R tidy framework for modular transcriptomic data analysis. Genome Biology, 22(1), 42. doi:10.1186/s13059-020-02233-7
Yu, G., Wang, L. G., Han, Y., & He, Q. Y. (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology, 16(5), 284-287. doi:10.1089/omi.2011.0118
## Load airway dataset for examples
data('airway', package = 'airway')
# Ensure a 'condition' column exists for examples expecting it
SummarizedExperiment::colData(airway)$condition <- SummarizedExperiment::colData(airway)$dex
print("Not run for build time.")
#> [1] "Not run for build time."
if (FALSE) { # \dontrun{
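# Start from the airway SummarizedExperiment
df_entrez = airway
# Flag a subset of features
df_entrez = mutate(df_entrez, do_test = .feature %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))
# Test differential abundance to obtain logFC
df_entrez = df_entrez |> test_differential_abundance(~ condition)
# Rank genes by logFC and run GSEA on the C2 collection
test_gene_rank(
  df_entrez,
  .entrez = entrez,
  .arrange_desc = logFC,
  species = "Homo sapiens",
  gene_sets = c("C2")
)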
} # }