keep_abundant() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns an object of the same type as the input, keeping only the transcripts/genes identified as abundant.
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
# S4 method for class 'spec_tbl_df'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
# S4 method for class 'tbl_df'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
# S4 method for class 'tidybulk'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
# S4 method for class 'SummarizedExperiment'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
# S4 method for class 'RangedSummarizedExperiment'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
`.data`: A `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)).
`.sample`: The name of the sample column.
`.transcript`: The name of the transcript/gene column.
`.abundance`: The name of the transcript/gene abundance column.
`factor_of_interest`: The name of the column containing the factor of interest. It defines the sample groups used for filtering, which is performed with the filterByExpr function from edgeR.
`minimum_counts`: A positive real number. The minimum count threshold (passed as min.count to edgeR::filterByExpr, which applies it on the counts-per-million scale) used to filter out lowly abundant transcripts/genes.
`minimum_proportion`: A real number between 0 and 1. The minimum proportion of samples in which a transcript/gene must have a CPM above the threshold to be retained (passed as min.prop to edgeR::filterByExpr).
An object of the same class as the input, containing only the transcripts/genes identified as abundant.
A `SummarizedExperiment` object
Lifecycle: questioning
At the moment this function uses edgeR (DOI: 10.1093/bioinformatics/btp616)
Underlying method: edgeR::filterByExpr(data, min.count = minimum_counts, group = string_factor_of_interest, min.prop = minimum_proportion)
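To illustrate the underlying call, here is a minimal sketch that runs edgeR::filterByExpr directly on a plain count matrix. The simulated counts, the gene and sample names, and the two-level `group` factor are assumptions made up for this illustration; they are not part of tidybulk or of the data used elsewhere on this page.

```r
# Illustrative sketch only: the data below are simulated, not from tidybulk.
library(edgeR)

set.seed(1)
n_genes <- 200
n_samples <- 6

# Hypothetical counts: the first 100 genes are well expressed,
# the last 100 are close to zero in every sample.
counts <- rbind(
  matrix(rnbinom(100 * n_samples, mu = 500, size = 5), nrow = 100),
  matrix(rnbinom(100 * n_samples, mu = 0.5, size = 5), nrow = 100)
)
rownames(counts) <- paste0("gene", seq_len(n_genes))
colnames(counts) <- paste0("sample", seq_len(n_samples))

# Hypothetical grouping, playing the role of factor_of_interest
group <- factor(rep(c("control", "treated"), each = 3))

# Same arguments that keep_abundant() exposes as
# minimum_counts and minimum_proportion
keep <- filterByExpr(counts, group = group, min.count = 10, min.prop = 0.7)

# Logical vector with one entry per gene; subset the matrix with it
filtered <- counts[keep, , drop = FALSE]
table(keep)
```

The returned logical vector marks the rows to retain, so filtering is a plain row subset of the count matrix.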
keep_abundant(
tidybulk::se_mini
)
#> No group or design set. Assuming all samples belong to one group.
#> class: SummarizedExperiment
#> dim: 182 5
#> metadata(0):
#> assays(1): count
#> rownames(182): ACAP1 ACP5 ... ZNF286A ZNF324
#> rowData names(2): entrez .abundant
#> colnames(5): SRR1740034 SRR1740035 SRR1740043 SRR1740058 SRR1740067
#> colData names(5): Cell.type time condition days dead
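The call above uses the defaults, so all samples are treated as one group. A further sketch, assuming the installed tidybulk version matches the signature documented on this page, passes the `condition` column listed in the colData names above as the factor of interest and sets both thresholds explicitly. The object name `se_filtered` is chosen here for illustration only.

```r
library(tidybulk)

# Group samples by the `condition` column of colData when filtering.
# The threshold values shown are the documented defaults.
se_filtered <- keep_abundant(
  tidybulk::se_mini,
  factor_of_interest = condition,
  minimum_counts = 10,
  minimum_proportion = 0.7
)

# Compare the number of features before and after filtering
dim(tidybulk::se_mini)
dim(se_filtered)
```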