R/methods.R, R/methods_SE.R
impute_missing_abundance-methods.Rd
impute_missing_abundance() takes as input a `tbl` (with at least three columns for sample, feature, and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns an object consistent with the input, with additional sample-transcript pairs that carry imputed transcript abundance.
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
# S4 method for class 'spec_tbl_df'
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
# S4 method for class 'tbl_df'
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
# S4 method for class 'tidybulk'
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
# S4 method for class 'SummarizedExperiment'
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
# S4 method for class 'RangedSummarizedExperiment'
impute_missing_abundance(
.data,
.formula,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
suffix = "",
force_scaling = FALSE
)
.data: A `tbl` (with at least three columns for sample, feature, and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment))
.formula: A formula with no response variable, representing the desired linear model, where the first covariate is the factor of interest and the second covariate is the unwanted variation (of the kind ~ factor_of_interest + batch)
.sample: The name of the sample column
.transcript: The name of the transcript/gene column
.abundance: The name of the transcript/gene abundance column
suffix: A character string appended to the names of the imputed count columns. If empty, the count columns are overwritten
force_scaling: A boolean. If an abundance-containing column is not scaled (scaled columns carry the _scaled suffix), setting force_scaling = TRUE scales it by library size, to compensate for possible differences in sequencing depth.
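To make "scaling by library size" concrete, here is a minimal base-R sketch of one common way to do it: divide each count by its sample's total and rescale to the mean library size. This is an assumption for illustration only; the scaling that force_scaling = TRUE applies inside tidybulk may differ. The data frame `counts` and the column `count_scaled` are hypothetical names.

```r
# Toy counts (illustrative values): sample s2 was sequenced 10x deeper than s1.
counts <- data.frame(
  sample     = c("s1", "s1", "s2", "s2"),
  transcript = c("g1", "g2", "g1", "g2"),
  count      = c(10, 30, 100, 300)
)

# Library size: total count per sample, repeated on every row of that sample.
library_size <- ave(counts$count, counts$sample, FUN = sum)

# Common target depth: the mean library size across samples (an assumption).
target <- mean(tapply(counts$count, counts$sample, sum))

# Rescale each count so that all samples share the same total.
counts$count_scaled <- counts$count / library_size * target
```

After scaling, both samples have the same total, so a group median computed across them is no longer dominated by the deeper sample.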
An object consistent with the input, with non-sparse abundance
An object consistent with the input, with imputed abundance
A `SummarizedExperiment` object
`r lifecycle::badge("maturing")`
This function imputes the abundance of missing sample-transcript pairs using the median of the sample group defined by the formula.
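The idea of group-median imputation can be sketched in base R on a toy table. This is not tidybulk's implementation, only an illustration under stated assumptions: the data frame `counts`, its column names, and its values are hypothetical, and no scaling for sequencing depth is applied.

```r
# Toy long-format table (illustrative values). The pair (s3, g2) is missing.
counts <- data.frame(
  sample     = c("s1", "s2", "s3", "s4", "s1", "s2", "s4"),
  condition  = c("a",  "a",  "a",  "b",  "a",  "a",  "b"),
  transcript = c("g1", "g1", "g1", "g1", "g2", "g2", "g2"),
  count      = c(10,   20,   30,   5,    100,  300,  50)
)

# Build every sample-transcript pair (Cartesian product), keeping the
# condition that belongs to each sample.
samples <- unique(counts[, c("sample", "condition")])
grid <- merge(
  samples,
  data.frame(transcript = unique(counts$transcript)),
  by = NULL
)

# Left-join the observed counts; pairs that were absent get NA.
full <- merge(
  grid, counts,
  by = c("sample", "condition", "transcript"),
  all.x = TRUE
)

# Median of the observed counts for each transcript within each condition.
group_median <- ave(
  full$count, full$transcript, full$condition,
  FUN = function(x) median(x, na.rm = TRUE)
)

# Fill only the missing pairs; observed counts are left untouched.
missing <- is.na(full$count)
full$count[missing] <- group_median[missing]
```

Here the missing pair (s3, g2) belongs to condition a, so it receives the median of the g2 counts observed in that condition (100 and 300).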
res =
  impute_missing_abundance(
    tidybulk::se_mini,
    ~ condition
  )
#> tidybulk says: count appears not to be scaled for sequencing depth (missing _scaled suffix; if you think this column is idependent of sequencing depth ignore this message), therefore the imputation can produce non meaningful results if sequencing depth for samples are highly variable. If you use force_scaling = TRUE library size will be used for eliminatig some sequencig depth effect before imputation