adjust_abundance() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns an object of the same class as the input, with an additional adjusted abundance column. This method uses scaled counts if present.

adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

# S4 method for spec_tbl_df
adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

# S4 method for tbl_df
adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

# S4 method for tidybulk
adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

# S4 method for SummarizedExperiment
adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

# S4 method for RangedSummarizedExperiment
adjust_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  transform = log1p,
  inverse_transform = expm1,
  action = "add",
  ...,
  log_transform = NULL
)

Arguments

.data

A `tbl` (with at least three columns for sample, feature and transcript abundance) or `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment))

.formula

A formula with no response variable, representing the desired linear model where the first covariate is the factor of interest and the second covariate is the unwanted variation (of the kind ~ factor_of_interest + batch)
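
As a small illustration of the expected formula shape, the sketch below uses base R only to list the covariates of a response-free formula in order. The column names `condition` and `batch` are placeholders, and this is not necessarily how the package parses the formula internally.

```r
# A response-free formula: factor of interest first, unwanted variation second.
# `condition` and `batch` are illustrative column names (assumptions).
f <- ~ condition + batch

# terms() keeps the covariates in the order they were written.
covariates <- attr(terms(f), "term.labels")
factor_of_interest <- covariates[1]
unwanted_variation <- covariates[2]

# A one-sided formula has no response variable.
has_response <- attr(terms(f), "response") != 0

print(covariates)
print(has_response)
```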

.sample

The name of the sample column

.transcript

The name of the transcript/gene column

.abundance

The name of the transcript/gene abundance column

transform

A function that will transform the counts. The default is log1p, suited to RNA sequencing data; to avoid transformation, use identity.

inverse_transform

A function that is the inverse of transform (e.g. expm1 is the inverse of log1p). It is needed to transform the counts back after the analysis.
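
The pairing of transform and inverse_transform can be checked with a quick round trip. This base R sketch uses made-up count values and simply confirms that expm1 undoes log1p, and that identity is its own inverse when no transformation is wanted.

```r
# Made-up counts, including zero, which log1p handles without producing -Inf.
counts <- c(0, 1, 10, 250, 12000)

# Default pair: log1p forward, expm1 back.
transformed <- log1p(counts)
restored <- expm1(transformed)
round_trip_ok <- isTRUE(all.equal(restored, counts))

# No transformation: identity serves as both transform and inverse.
untouched <- identity(identity(counts))

print(round_trip_ok)
```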

action

A character string. Whether to join the new information to the input tbl (add), or just get the non-redundant tbl with the new information (get).

...

Further parameters passed to the function sva::ComBat

log_transform

DEPRECATED - A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data)

Value

A consistent object (to the input) with additional columns for the adjusted counts as `<COUNT COLUMN>_adjusted`

A `SummarizedExperiment` object

Details

Lifecycle: maturing

This function adjusts the abundance for (known) unwanted variation. Currently only one unwanted covariate can be adjusted for at a time, using ComBat (DOI: 10.1093/bioinformatics/bts034).

Underlying method: sva::ComBat(data, batch = my_batch, mod = design, prior.plots = FALSE, ...)
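
To make the underlying call more concrete, here is a minimal sketch on simulated data, assuming the sva package is installed. The simulated matrix, the sample annotation, and the choice to build the design matrix from the factor of interest only are assumptions for illustration; the package's internal steps may differ.

```r
library(sva)

set.seed(1)

# Simulated counts: 200 features x 8 samples (illustrative data only).
counts <- matrix(rpois(200 * 8, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))

# Sample annotation: condition is balanced within each batch.
annotation <- data.frame(
  condition = factor(rep(c("a", "b"), times = 4)),
  batch     = rep(c(1, 2), each = 4)
)

# Transform, adjust for batch while preserving condition, transform back.
my_batch <- annotation$batch
design   <- model.matrix(~ condition, data = annotation)
adjusted_log <- ComBat(log1p(counts), batch = my_batch, mod = design,
                       prior.plots = FALSE)
adjusted_counts <- expm1(adjusted_log)

dim(adjusted_counts)
```

The adjusted matrix has the same dimensions as the input, which is what allows it to be stored alongside the original counts as an additional column or assay.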

Examples




library(tidybulk)

# Label two samples as a second batch
cm = tidybulk::se_mini
cm$batch = 0
cm$batch[colnames(cm) %in% c("SRR1740035", "SRR1740043")] = 1

cm |>
  identify_abundant() |>
  adjust_abundance(~ condition + batch)
#> No group or design set. Assuming all samples belong to one group.
#> Found2batches
#> Adjusting for1covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding parametric adjustments
#> Adjusting the Data
#> class: SummarizedExperiment 
#> dim: 527 5 
#> metadata(0):
#> assays(2): count count_adjusted
#> rownames(527): ABCB4 ABCB9 ... ZNF324 ZNF442
#> rowData names(2): entrez .abundant
#> colnames(5): SRR1740034 SRR1740035 SRR1740043 SRR1740058 SRR1740067
#> colData names(6): Cell.type time ... dead batch