Unnest expands a list-column containing data frames into rows and columns.
# S3 method for class 'tidySingleCellExperiment_nested'
unnest(
  data,
  cols,
  ...,
  keep_empty = FALSE,
  ptype = NULL,
  names_sep = NULL,
  names_repair = "check_unique",
  .drop,
  .id,
  .sep,
  .preserve
)
unnest_single_cell_experiment(
  data,
  cols,
  ...,
  keep_empty = FALSE,
  ptype = NULL,
  names_sep = NULL,
  names_repair = "check_unique",
  .drop,
  .id,
  .sep,
  .preserve
)
data: A data frame.
cols: <tidy-select> List-columns to unnest. When selecting multiple columns, values from the same row will be recycled to their common size.
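The recycling rule for multiple columns can be illustrated with a minimal sketch. It assumes a plain tibble rather than a SingleCellExperiment object, and the column names id, x and y are made up for the illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: in row 1, x has size 2 and y has size 1
df <- tibble(
  id = 1:2,
  x  = list(1:2, 3:4),
  y  = list("a", c("b", "c"))
)

# Both list-columns are unnested together; within each row the
# size-1 element of y is recycled to match the size-2 element of x
out <- unnest(df, c(x, y))
```

Here the single "a" in the first row is repeated so that it pairs with both values of x from that row.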
...: Deprecated. Previously you could write df %>% unnest(x, y, z). Convert to df %>% unnest(c(x, y, z)). If you previously created a new variable in unnest(), you'll now need to do it explicitly with mutate(). Convert df %>% unnest(y = fun(x, y, z)) to df %>% mutate(y = fun(x, y, z)) %>% unnest(y).
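The two conversions described above might look as follows on a plain tibble. This is a sketch under assumptions: the data are invented, and add_pairs() is a hypothetical stand-in for fun().

```r
library(tidyr)
library(dplyr)
library(tibble)

df <- tibble(
  g = c("a", "b"),
  x = list(1:2, 3L),
  y = list(c(10, 20), 30)
)

# Old: df %>% unnest(x, y). New: wrap the columns in c().
both <- df %>% unnest(c(x, y))

# Old: df %>% unnest(s = fun(x, y)). New: create the column with
# mutate() first, then unnest it.
# add_pairs() is a hypothetical helper returning one list element per row.
add_pairs <- function(x, y) Map(`+`, x, y)
summed <- df %>%
  mutate(s = add_pairs(x, y)) %>%
  unnest(s)
```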
keep_empty: By default, you get one row of output for each element of the list that you are unchopping/unnesting. This means that if there's a size-0 element (like NULL or an empty data frame or vector), then that entire row will be dropped from the output. If you want to preserve all rows, use keep_empty = TRUE to replace size-0 elements with a single row of missing values.
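A small sketch of the difference, assuming a plain tibble with invented columns id and x:

```r
library(tidyr)
library(tibble)

# Rows 2 and 3 hold size-0 elements (NULL and an empty integer vector)
df <- tibble(id = 1:3, x = list(1:2, NULL, integer()))

# Default: rows whose element has size 0 are dropped
dropped <- unnest(df, x)

# keep_empty = TRUE: each size-0 element becomes one row with a missing value
kept <- unnest(df, x, keep_empty = TRUE)
```

With the default, only the two rows coming from id 1 remain; with keep_empty = TRUE, ids 2 and 3 each contribute one row whose x is NA.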
ptype: Optionally, a named list of column name-prototype pairs to coerce cols to, overriding the default that will be guessed from combining the individual values. Alternatively, a single empty ptype can be supplied, which will be applied to all cols.
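As an illustration of the named-list form, the sketch below asks for a double column even though every element is an integer. It assumes a plain tibble and an invented column x.

```r
library(tidyr)
library(tibble)

df <- tibble(id = 1:2, x = list(1L, 2L))

# Without ptype, the combined type is guessed from the elements (integer here)
guessed <- unnest(df, x)

# With ptype, column x is coerced to the supplied prototype
coerced <- unnest(df, x, ptype = list(x = double()))
```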
names_sep: If NULL, the default, the outer names will come from the inner names. If a string, the outer names will be formed by pasting together the outer and the inner column names, separated by names_sep.
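The effect on the output column names can be seen in a minimal sketch, assuming a plain tibble whose list-column data holds small tibbles with columns a and b (all names invented):

```r
library(tidyr)
library(tibble)

df <- tibble(
  id   = 1:2,
  data = list(
    tibble(a = 1L,  b = "u"),
    tibble(a = 2:3, b = c("v", "w"))
  )
)

# names_sep = NULL: inner names are used as they are
plain <- unnest(df, data)

# names_sep = "_": outer and inner names are pasted together
prefixed <- unnest(df, data, names_sep = "_")
```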
names_repair: Used to check that output data frame has valid names. Must be one of the following options:
"minimal": no name repair or checks, beyond basic existence,
"unique": make sure names are unique and not empty,
"check_unique": (the default), no name repair, but check they are unique,
"universal": make the names unique and syntactic,
a function: apply custom name repair,
tidyr_legacy: use the name repair from tidyr 0.8,
a formula: a purrr-style anonymous function (see rlang::as_function()).
See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.
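A name clash between an outer column and an inner column shows how the options differ. The sketch assumes a plain tibble in which both the outer data frame and the nested tibbles have a column called id (an invented setup).

```r
library(tidyr)
library(tibble)

df <- tibble(
  id   = 1:2,
  data = list(tibble(id = 10L), tibble(id = 20L))
)

# Default "check_unique": the duplicated name is reported as an error
default_result <- tryCatch(
  unnest(df, data),
  error = function(e) e
)

# "unique": duplicated names are repaired by appending a position suffix
fixed <- suppressMessages(unnest(df, data, names_repair = "unique"))
```

An alternative that avoids repair altogether is to set names_sep so that the inner columns receive a prefix.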
.drop, .preserve: Deprecated. All list-columns are now preserved. If there are any that you don't want in the output, use select() to remove them prior to unnesting.
.id: Deprecated. Convert df %>% unnest(x, .id = "id") to df %>% mutate(id = names(x)) %>% unnest(x).
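The replacement for .id relies on the list-column carrying names. A minimal sketch, assuming a plain tibble with a named list-column x (invented data):

```r
library(tidyr)
library(dplyr)
library(tibble)

# The list-column x has element names "a" and "b"
df <- tibble(x = list(a = 1:2, b = 3L))

# Copy the element names into a regular column, then unnest
out <- df %>%
  mutate(id = names(x)) %>%
  unnest(x)
```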
`tidySingleCellExperiment`
tidyr 1.0.0 introduced a new syntax for nest() and unnest() that's designed to be more similar to other functions. Converting to the new syntax should be straightforward (guided by the message you'll receive), but if you just need to run an old analysis, you can easily revert to the previous behaviour using nest_legacy() and unnest_legacy().
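One way to revert, sketched here for plain data frames, is to mask the new functions with their legacy counterparts in the current session. Note the assumption: masking unnest() this way may bypass the method for nested tidySingleCellExperiment objects, so it is probably only appropriate for old tidyr-only analyses.

```r
library(tidyr)

# Mask the tidyr >= 1.0.0 functions with the pre-1.0.0 implementations
nest <- nest_legacy
unnest <- unnest_legacy
```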
Other rectangling: hoist(), unnest_longer(), unnest_wider()
data(pbmc_small)
pbmc_small |>
    nest(data = -groups) |>
    unnest(data)
#> # A SingleCellExperiment-tibble abstraction: 80 × 17
#> # Features=230 | Cells=80 | Assays=counts, logcounts
#> .cell orig.ident nCount_RNA nFeature_RNA RNA_snn_res.0.8 letter.idents
#> <chr> <fct> <dbl> <int> <fct> <fct>
#> 1 ATGCCAGAACG… SeuratPro… 70 47 0 A
#> 2 GAACCTGATGA… SeuratPro… 87 50 1 B
#> 3 TGACTGGATTC… SeuratPro… 127 56 0 A
#> 4 AGTCAGACTGC… SeuratPro… 173 53 0 A
#> 5 AGGTCATGAGT… SeuratPro… 62 31 0 A
#> 6 GGGTAACTCTA… SeuratPro… 101 41 0 A
#> 7 CATGAGACACG… SeuratPro… 51 26 0 A
#> 8 TACGCCACTCC… SeuratPro… 99 45 0 A
#> 9 GTAAGCACTCA… SeuratPro… 67 33 0 A
#> 10 TACATCACGCT… SeuratPro… 109 41 0 A
#> # ℹ 70 more rows
#> # ℹ 11 more variables: RNA_snn_res.1 <fct>, file <chr>, ident <fct>,
#> # groups <chr>, PC_1 <dbl>, PC_2 <dbl>, PC_3 <dbl>, PC_4 <dbl>, PC_5 <dbl>,
#> # tSNE_1 <dbl>, tSNE_2 <dbl>