`test_gene_overrepresentation()` takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with `library(tidySummarizedExperiment)`) and returns a `tbl` with the gene set over-representation statistics.

test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

# S4 method for spec_tbl_df
test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

# S4 method for tbl_df
test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

# S4 method for tidybulk
test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

# S4 method for SummarizedExperiment
test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

# S4 method for RangedSummarizedExperiment
test_gene_overrepresentation(
  .data,
  .entrez,
  .do_test,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

Arguments

.data

A `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with `library(tidySummarizedExperiment)`)

.entrez

The name of the column containing the ENTREZ IDs of the transcripts/genes

.do_test

The name (as a symbol) of a boolean column. It flags the transcripts to test for over-representation

species

A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens")

.sample

The name of the sample column

gene_sets

A character vector. The subset of MSigDB collections you want to test against (e.g. "C2"). If NULL, all gene sets are used (suggested). This argument was added to keep the examples from exceeding their time limit.

gene_set

DEPRECATED. Use gene_sets instead.

Value

An object of the same class as the input

A `spec_tbl_df` object

A `tbl_df` object

A `tidybulk` object

A `SummarizedExperiment` object

A `RangedSummarizedExperiment` object

Details

Lifecycle: maturing

This wrapper executes a gene over-representation analysis of the dataset: it tests the list of transcripts flagged by `.do_test` against the MSigDB gene sets. It uses clusterProfiler (DOI: doi.org/10.1089/omi.2011.0118) on the back end.

Underlying method:

msigdbr::msigdbr(species = species) %>%
  nest(data = -gs_cat) %>%
  mutate(test = map(
    data,
    ~ clusterProfiler::enricher(
      my_entrez_rank,
      TERM2GENE = .x,
      pvalueCutoff = 1
    )
  ))
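To make the statistic concrete, here is a minimal base R sketch of a single over-representation test, assuming the usual hypergeometric formulation that tools such as `clusterProfiler::enricher()` rely on. All counts below are made-up illustrative numbers, and the variable names are hypothetical; this is not the package's internal code.

```r
# Toy over-representation test in base R (illustrative numbers only).
universe_size <- 1000  # background genes (assumed)
gene_set_size <- 50    # genes annotated to one gene set (assumed)
n_flagged     <- 40    # genes with do_test == TRUE (assumed)
n_overlap     <- 8     # flagged genes that belong to the gene set (assumed)

# Probability of seeing at least n_overlap hits under the hypergeometric null
p_hyper <- phyper(
  q = n_overlap - 1,
  m = gene_set_size,
  n = universe_size - gene_set_size,
  k = n_flagged,
  lower.tail = FALSE
)

# The same test written as a one-sided Fisher's exact test on the 2x2 table
contingency <- matrix(
  c(n_overlap,                 n_flagged - n_overlap,
    gene_set_size - n_overlap, universe_size - gene_set_size - n_flagged + n_overlap),
  nrow = 2, byrow = TRUE
)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value
```

Both calls give the same p-value. A wrapper like this one repeats such a test for every gene set in the chosen MSigDB collections, which is why restricting `gene_sets` shortens the run time.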

Examples


#se_mini = aggregate_duplicates(tidybulk::se_mini, .transcript = entrez)
#df_entrez = mutate(df_entrez, do_test = feature %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))

if (FALSE) {
  test_gene_overrepresentation(
    df_entrez,
    .sample = sample,
    .entrez = entrez,
    .do_test = do_test,
    species = "Homo sapiens",
    gene_sets = c("C2")
  )
}