ensembl_to_symbol() takes a `tbl` (with at least three columns: sample, feature, and transcript abundance) or a `SummarizedExperiment` (more convenient when abstracted to a tibble with library(tidySummarizedExperiment)) and returns an object consistent with the input, with an additional transcript symbol column.

ensembl_to_symbol(.data, .ensembl, action = "add")

# S4 method for class 'spec_tbl_df'
ensembl_to_symbol(.data, .ensembl, action = "add")

# S4 method for class 'tbl_df'
ensembl_to_symbol(.data, .ensembl, action = "add")

# S4 method for class 'tidybulk'
ensembl_to_symbol(.data, .ensembl, action = "add")

Arguments

.data

A `tbl` (with at least three columns: sample, feature, and transcript abundance) or a `SummarizedExperiment` (more convenient when abstracted to a tibble with library(tidySummarizedExperiment)).

.ensembl

A character string. The column that contains the Ensembl gene IDs.

action

A character string. Whether to join the new information to the input tbl (add), or just get the non-redundant tbl with the new information (get).
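
The difference between the two `action` values can be sketched in base R. This is only an illustration of the join semantics described above, not the package's implementation: the `counts` and `mapping` data frames are hypothetical stand-ins, and the column names `transcript` and `ref_genome` are taken from the example output on this page.

```r
# Hypothetical input: one row per sample/gene pair, keyed by Ensembl gene ID.
counts <- data.frame(
  sample = c("s1", "s1", "s2", "s2"),
  ensembl_id = c("ENSG00000141510", "ENSG00000012048",
                 "ENSG00000141510", "ENSG00000012048"),
  count = c(120, 45, 98, 51),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for an Ensembl-ID-to-symbol lookup table.
mapping <- data.frame(
  ensembl_id = c("ENSG00000141510", "ENSG00000012048"),
  transcript = c("TP53", "BRCA1"),
  ref_genome = c("hg38", "hg38"),
  stringsAsFactors = FALSE
)

# action = "add": join the new information onto every row of the input.
added <- merge(counts, mapping, by = "ensembl_id", all.x = TRUE)

# action = "get": keep only the non-redundant table of new information,
# one row per distinct Ensembl ID, without the sample-level columns.
ids <- unique(counts["ensembl_id"])
got <- merge(ids, mapping, by = "ensembl_id", all.x = TRUE)
```

In this sketch `added` keeps one row per input row with the annotation columns appended, while `got` has one row per distinct ID.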

Value

An object consistent with the input, including additional columns for the transcript symbol.


Details

[Questioning]

This is useful because some resources use Ensembl IDs while others use gene symbols. At the moment this works for human (genes and transcripts) and mouse (genes) data.

Examples




# This function was designed for data.frame input.
# Converting from SummarizedExperiment is done only for this example. It is NOT recommended.

tidybulk::se_mini |> tidybulk() |> as_tibble() |> ensembl_to_symbol(.feature)
#> # A tibble: 2,635 × 11
#>    .feature .sample    count Cell.type time  condition  days  dead entrez
#>    <chr>    <chr>      <dbl> <chr>     <chr> <lgl>     <dbl> <dbl> <chr> 
#>  1 ABCB4    SRR1740034  1035 b_cell    0 d   TRUE          1     1 5244  
#>  2 ABCB9    SRR1740034    45 b_cell    0 d   TRUE          1     1 23457 
#>  3 ACAP1    SRR1740034  7151 b_cell    0 d   TRUE          1     1 9744  
#>  4 ACHE     SRR1740034     2 b_cell    0 d   TRUE          1     1 43    
#>  5 ACP5     SRR1740034  2278 b_cell    0 d   TRUE          1     1 54    
#>  6 ADAM28   SRR1740034 11156 b_cell    0 d   TRUE          1     1 10863 
#>  7 ADAMDEC1 SRR1740034    72 b_cell    0 d   TRUE          1     1 27299 
#>  8 ADAMTS3  SRR1740034     0 b_cell    0 d   TRUE          1     1 9508  
#>  9 ADRB2    SRR1740034   298 b_cell    0 d   TRUE          1     1 154   
#> 10 AIF1     SRR1740034     8 b_cell    0 d   TRUE          1     1 199   
#> # ℹ 2,625 more rows
#> # ℹ 2 more variables: transcript <chr>, ref_genome <chr>
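
A further sketch, assuming a tidybulk version that still provides the `tbl_df` method shown in the usage section: the hypothetical tibble `counts` below holds real human Ensembl gene IDs in a column named `ensembl_id` (a name chosen here for illustration), and the function is called with each `action` value. Output is not shown because it depends on the installed package version and its mapping table.

```r
library(tidybulk)

# Hypothetical input with the three expected columns:
# sample, feature (Ensembl gene ID), and abundance.
counts <- tibble::tibble(
  sample = c("s1", "s1"),
  ensembl_id = c("ENSG00000141510", "ENSG00000012048"), # TP53, BRCA1
  count = c(120, 45)
)

# Default action = "add": the input with the symbol information joined on.
added <- counts |> ensembl_to_symbol(ensembl_id)

# action = "get": only the non-redundant table with the new information.
got <- counts |> ensembl_to_symbol(ensembl_id, action = "get")
```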