Adjust transcript abundance for unwanted variation — adjust

adjust_abundance() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns an object of the same class as the input, with an additional adjusted abundance column. This method uses scaled counts if present.

adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

# S4 method for class 'spec_tbl_df'
adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

# S4 method for class 'tbl_df'
adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

# S4 method for class 'tidybulk'
adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

# S4 method for class 'SummarizedExperiment'
adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

# S4 method for class 'RangedSummarizedExperiment'
adjust_abundance(
  .data,
  .formula = NULL,
  .factor_unwanted = NULL,
  .factor_of_interest = NULL,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  method = "combat_seq",
  action = "add",
  ...,
  log_transform = NULL,
  transform = NULL,
  inverse_transform = NULL
)

Arguments

.data: A `tbl` (with at least three columns for sample, feature and transcript abundance) or `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment))
.formula: DEPRECATED - A formula with no response variable, representing the desired linear model where the first covariate is the factor of interest and the second covariate is the unwanted variation (of the kind ~ factor_of_interest + batch)
.factor_unwanted: A tidy select, e.g. column names without double quotation marks, such as c(batch, country). These are the factors to adjust for, including unwanted batch effects and unwanted biological effects.
.factor_of_interest: A tidy select, e.g. column names without double quotation marks, such as c(treatment). These are the factors to preserve.
.sample: The name of the sample column
.transcript: The name of the transcript/gene column
.abundance: The name of the transcript/gene abundance column
method: A character string. Methods include combat_seq (default), combat and limma_remove_batch_effect.
action: A character string. Whether to join the new information to the input tbl (add), or just get the non-redundant tbl with the new information (get).
...: Further parameters passed to the function sva::ComBat
log_transform: DEPRECATED - A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data)
transform: DEPRECATED - A function that will transform the counts. By default it is log1p for RNA sequencing data; to avoid transformation, use identity.
inverse_transform: DEPRECATED - A function that is the inverse of transform (e.g. expm1 is the inverse of log1p). This is needed to transform the counts back after analysis.
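
The transform and inverse_transform arguments are described as a pair of mutually inverse functions. As a minimal base-R sketch (the `counts` vector is made up for illustration and is not part of the package), the following checks that expm1 undoes log1p:

```r
# Hypothetical count values, including zero (which log1p handles safely)
counts <- c(0, 1, 10, 250, 12000)

# Forward transform, then the inverse transform
transformed <- log1p(counts)
recovered <- expm1(transformed)

# TRUE when the round trip reproduces the counts (up to numerical tolerance)
all.equal(recovered, counts)

# identity leaves the values untouched, so it serves as its own inverse
identical(identity(counts), counts)
```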

Value

An object of the same class as the input, with an additional column for the adjusted counts named `<COUNT COLUMN>_adjusted`

A `SummarizedExperiment` object
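
For `SummarizedExperiment` input, the adjusted values are stored as an extra assay. The sketch below shows one way to retrieve it. It assumes the bundled `tidybulk::se_mini` data set with its `condition` column, an artificial `batch` label added for illustration, and that the sva package is installed. The assay name `count_adjusted` follows the `<COUNT COLUMN>_adjusted` pattern for an input assay named `count`.

```r
library(tidybulk)

# Example data with an artificial batch label (for illustration only)
cm <- tidybulk::se_mini
cm$batch <- 0
cm$batch[colnames(cm) %in% c("SRR1740035", "SRR1740043")] <- 1

cm_adjusted <- cm |>
  identify_abundant() |>
  adjust_abundance(
    .factor_unwanted = batch,
    .factor_of_interest = condition,
    method = "combat"
  )

# List the assays; the adjusted one is named <COUNT COLUMN>_adjusted
SummarizedExperiment::assayNames(cm_adjusted)

# Extract the adjusted abundance as a feature-by-sample matrix
adjusted <- SummarizedExperiment::assay(cm_adjusted, "count_adjusted")
dim(adjusted)
```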

Details

Lifecycle: maturing

This function adjusts the abundance for (known) unwanted variation. Currently, only one unwanted covariate can be adjusted for at a time, using ComBat (DOI: 10.1093/bioinformatics/bts034).

Underlying method: sva::ComBat(data, batch = my_batch, mod = design, prior.plots = FALSE, ...)
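
To make the underlying call more concrete, here is a rough sketch of invoking sva::ComBat directly on a simulated matrix. It is not the tidybulk implementation: the data, the object names (`log_abundance`, `batch`, `condition`, `design`) and the assumption that the input is already on a log scale are all illustrative. The call is guarded so the block still runs when sva is not installed.

```r
set.seed(1)

# Simulated design: six samples, two batches, condition balanced across batches
n_genes <- 200
batch <- c(1, 1, 1, 2, 2, 2)
condition <- factor(c("a", "b", "a", "b", "a", "b"))

# Simulated log-scale abundance with an additive shift in batch 2
log_abundance <- matrix(rnorm(n_genes * 6, mean = 5), nrow = n_genes)
log_abundance[, batch == 2] <- log_abundance[, batch == 2] + 1
rownames(log_abundance) <- paste0("gene", seq_len(n_genes))
colnames(log_abundance) <- paste0("sample", 1:6)

# Design matrix for the factor of interest, which ComBat should preserve
design <- model.matrix(~ condition)

if (requireNamespace("sva", quietly = TRUE)) {
  adjusted <- sva::ComBat(
    dat = log_abundance,
    batch = batch,
    mod = design,
    prior.plots = FALSE
  )
  # The adjusted matrix has the same dimensions as the input
  dim(adjusted)
}
```

Passing the factor of interest through `mod` is what lets the batch effect be removed without also removing the biological signal that the design encodes.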

Examples




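library(tidybulk)

# Add an artificial batch label to the example data
cm = tidybulk::se_mini
cm$batch = 0
cm$batch[colnames(cm) %in% c("SRR1740035", "SRR1740043")] = 1

# Flag abundant features, then adjust for batch while preserving condition
cm |>
  identify_abundant() |>
  adjust_abundance(
    .factor_unwanted = batch,
    .factor_of_interest = condition,
    method = "combat"
  )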
#> No group or design set. Assuming all samples belong to one group.
#> Found2batches
#> Adjusting for1covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding parametric adjustments
#> Adjusting the Data
#> class: SummarizedExperiment 
#> dim: 527 5 
#> metadata(0):
#> assays(2): count count_adjusted
#> rownames(527): ABCB4 ABCB9 ... ZNF324 ZNF442
#> rowData names(2): entrez .abundant
#> colnames(5): SRR1740034 SRR1740035 SRR1740043 SRR1740058 SRR1740067
#> colData names(6): Cell.type time ... dead batch