quantile_normalise_abundance() takes as input A `tbl` (with at least three columns for sample, feature and transcript abundance) or `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment)) and Scales transcript abundance compansating for sequencing depth (e.g., with TMM algorithm, Robinson and Oshlack doi.org/10.1186/gb-2010-11-3-r25).

quantile_normalise_abundance(
  .data,
  .abundance = NULL,
  method = "limma_normalize_quantiles",
  target_distribution = NULL
)

# S4 method for class 'SummarizedExperiment'
quantile_normalise_abundance(
  .data,
  .abundance = NULL,
  method = "limma_normalize_quantiles",
  target_distribution = NULL
)

# S4 method for class 'RangedSummarizedExperiment'
quantile_normalise_abundance(
  .data,
  .abundance = NULL,
  method = "limma_normalize_quantiles",
  target_distribution = NULL
)

Arguments

.data

A `tbl` (with at least three columns for sample, feature and transcript abundance) or `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment))

.abundance

The name of the transcript/gene abundance column

method

A character string. Either "limma_normalize_quantiles" for limma::normalizeQuantiles or "preprocesscore_normalize_quantiles_use_target" for preprocessCore::normalize.quantiles.use.target for large-scale datasets.

target_distribution

A numeric vector. If NULL the target distribution will be calculated by preprocessCore. This argument only affects the "preprocesscore_normalize_quantiles_use_target" method.

Value

A tbl object with additional columns with scaled data as `<NAME OF COUNT COLUMN>_scaled`

A `SummarizedExperiment` object

A `SummarizedExperiment` object

Details

`r lifecycle::badge("maturing")`

Tranform the feature abundance across samples so to have the same quantile distribution (using preprocessCore).

Underlying method

If `limma_normalize_quantiles` is chosen

.data |>limma::normalizeQuantiles()

If `preprocesscore_normalize_quantiles_use_target` is chosen

.data |> preprocessCore::normalize.quantiles.use.target( target = preprocessCore::normalize.quantiles.determine.target(.data) )

References

Mangiola, S., Molania, R., Dong, R., Doyle, M. A., & Papenfuss, A. T. (2021). tidybulk: an R tidy framework for modular transcriptomic data analysis. Genome Biology, 22(1), 42. doi:10.1186/s13059-020-02233-7

Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., & Smyth, G. K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7), e47. doi:10.1093/nar/gkv007

Examples

## Load airway dataset for examples

  data('airway', package = 'airway')
  # Ensure a 'condition' column exists for examples expecting it

    SummarizedExperiment::colData(airway)$condition <- SummarizedExperiment::colData(airway)$dex




 airway |>
   quantile_normalise_abundance()
#> # A SummarizedExperiment-tibble abstraction: 509,416 × 25
#> # Features=63677 | Samples=8 | Assays=counts, counts_scaled
#>    .feature      .sample counts counts_scaled SampleName cell  dex   albut Run  
#>    <chr>         <chr>    <int>         <dbl> <fct>      <fct> <fct> <fct> <fct>
#>  1 ENSG00000000… SRR103…    679         691.  GSM1275862 N613… untrt untrt SRR1…
#>  2 ENSG00000000… SRR103…      0           0   GSM1275862 N613… untrt untrt SRR1…
#>  3 ENSG00000000… SRR103…    467         469.  GSM1275862 N613… untrt untrt SRR1…
#>  4 ENSG00000000… SRR103…    260         258.  GSM1275862 N613… untrt untrt SRR1…
#>  5 ENSG00000000… SRR103…     60          58.6 GSM1275862 N613… untrt untrt SRR1…
#>  6 ENSG00000000… SRR103…      0           0   GSM1275862 N613… untrt untrt SRR1…
#>  7 ENSG00000000… SRR103…   3251        3439.  GSM1275862 N613… untrt untrt SRR1…
#>  8 ENSG00000001… SRR103…   1433        1474.  GSM1275862 N613… untrt untrt SRR1…
#>  9 ENSG00000001… SRR103…    519         527.  GSM1275862 N613… untrt untrt SRR1…
#> 10 ENSG00000001… SRR103…    394         395.  GSM1275862 N613… untrt untrt SRR1…
#> # ℹ 40 more rows
#> # ℹ 16 more variables: avgLength <int>, Experiment <fct>, Sample <fct>,
#> #   BioSample <fct>, condition <fct>, gene_id <chr>, gene_name <chr>,
#> #   entrezid <int>, gene_biotype <chr>, gene_seq_start <int>,
#> #   gene_seq_end <int>, seq_name <chr>, seq_strand <int>,
#> #   seq_coord_system <int>, symbol <chr>, GRangesList <list>