`filter()` retains the rows where the conditions you provide are `TRUE`. Note
that, unlike base subsetting with `[`, rows where the condition evaluates
to `NA` are dropped.
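
The `NA` handling described above can be illustrated with a minimal sketch. It assumes dplyr is installed and uses a small, made-up data frame (`df`, `id` and `x` are illustrative names, not part of this package):

```r
library(dplyr)

# Hypothetical data: the second row has a missing value in x
df <- data.frame(id = 1:4, x = c(5, NA, 1, 8))

# Base subsetting keeps a row of NAs where the condition is NA
base_result <- df[df$x > 2, ]

# filter() silently drops rows where the condition is NA
dplyr_result <- filter(df, x > 2)

nrow(base_result)
nrow(dplyr_result)
```

If the `NA` rows are wanted, one option is to ask for them explicitly, for example `filter(df, x > 2 | is.na(x))`.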

`summarise()` creates a new data frame. It will have one (or more) rows for
each combination of grouping variables; if there are no grouping variables,
the output will have a single row summarising all observations in the input.
It will contain one column for each grouping variable and one column
for each of the summary statistics that you have specified.

`summarise()` and `summarize()` are synonyms.
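
As a rough illustration of the row and column rules above, the following sketch summarises a made-up data frame with and without grouping. The names `scores`, `team` and `points` are assumptions for the example only, and plain data frames are used rather than a SummarizedExperiment:

```r
library(dplyr)

# Hypothetical data: two teams with three observations in total
scores <- data.frame(team = c("a", "a", "b"), points = c(1, 3, 10))

# No grouping variables: a single row summarising all observations
overall <- summarise(scores, mean_points = mean(points))

# One grouping variable: one row per group, plus the grouping column
per_team <- scores %>%
  group_by(team) %>%
  summarise(mean_points = mean(points))

overall
per_team
```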

`mutate()` adds new variables and preserves existing ones;
`transmute()` adds new variables and drops existing ones.
New variables overwrite existing variables of the same name.
Variables can be removed by setting their value to `NULL`.
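
The following sketch shows these rules on a small, made-up data frame (`df`, `a`, `b` and `total` are illustrative names). It assumes dplyr is installed and works on a plain data frame:

```r
library(dplyr)

df <- data.frame(a = 1:3, b = 4:6)

# mutate() keeps a and b and appends the new column
added <- mutate(df, total = a + b)

# transmute() keeps only the columns that are created
only_new <- transmute(df, total = a + b)

# Reusing a name overwrites the column; NULL removes one
changed <- mutate(df, a = a * 10L, b = NULL)

names(added)
names(only_new)
names(changed)
```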

Rename individual variables using `new_name=old_name` syntax.

See this repository for alternative ways to perform row-wise operations.

`slice()` lets you index rows by their (integer) locations. It allows you
to select, remove, and duplicate rows. It is accompanied by a number of
helpers for common use cases:

- `slice_head()` and `slice_tail()` select the first or last rows.
- `slice_sample()` randomly selects rows.
- `slice_min()` and `slice_max()` select rows with the lowest or highest values of a variable.

If `.data` is a grouped_df, the operation will be performed on each group,
so that (e.g.) `slice_head(df, n=5)` will select the first five rows in
each group.
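
A minimal sketch of `slice()` and two of its helpers, on a made-up data frame with a grouping column (`g` and `v` are illustrative names). It assumes dplyr 1.0.0 or later, where the `slice_*()` helpers are available:

```r
library(dplyr)

df <- data.frame(g = rep(c("a", "b"), each = 3), v = c(3, 1, 2, 9, 7, 8))

# Select, remove, and duplicate rows by integer position
first_two <- slice(df, 1:2)
without_first <- slice(df, -1)
duplicated_row <- slice(df, c(1, 1))

# On a grouped data frame the helpers operate within each group
first_per_group <- df %>% group_by(g) %>% slice_head(n = 1)
min_per_group <- df %>% group_by(g) %>% slice_min(v, n = 1)

first_per_group
min_per_group
```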

Select (and optionally rename) variables in a data frame, using a concise
mini-language that makes it easy to refer to variables based on their name
(e.g. `a:f` selects all columns from `a` on the left to `f` on the
right). You can also use predicate functions like `is.numeric` to select
variables based on their properties.
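
The selection mini-language can be sketched as follows on a made-up data frame. The column names are illustrative, and the example assumes dplyr 1.0.0 or later, where predicate functions are wrapped in `where()`:

```r
library(dplyr)

df <- data.frame(a = 1:2, b = c("x", "y"), c = c(0.5, 1.5), d = c(TRUE, FALSE))

# A range of columns, from a on the left to c on the right
by_range <- select(df, a:c)

# Columns chosen by a property, here being numeric
by_type <- select(df, where(is.numeric))

# Selecting and renaming in one step with new_name = old_name
renamed <- select(df, id = a, c)

names(by_range)
names(by_type)
names(renamed)
```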

`sample_n()` and `sample_frac()` have been superseded in favour of
`slice_sample()`. While they will not be deprecated in the near future,
retirement means that we will only perform critical bug fixes, so we recommend
moving to the newer alternative.

These functions were superseded because we realised it was more convenient to
have two mutually exclusive arguments to one function, rather than two
separate functions. This also made it possible to clean up a few other smaller
design issues with `sample_n()`/`sample_frac()`:

- The connection to `slice()` was not obvious.
- The name of the first argument, `tbl`, is inconsistent with other single table verbs which use `.data`.
- The `size` argument uses tidy evaluation, which is surprising and undocumented.
- It was easier to remove the deprecated `.env` argument.
- `...` was in a suboptimal position.
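
As a sketch of the migration, the two mutually exclusive arguments of `slice_sample()` are `n` and `prop`. The example below uses a made-up plain data frame and assumes dplyr 1.0.0 or later:

```r
library(dplyr)

set.seed(1)  # only to make the random draw repeatable
df <- data.frame(id = 1:10)

# Counterpart of sample_n(df, 3)
by_count <- slice_sample(df, n = 3)

# Counterpart of sample_frac(df, 0.5)
by_fraction <- slice_sample(df, prop = 0.5)

nrow(by_count)
nrow(by_fraction)
```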

`pull()` is similar to `$`. It's mostly useful because it looks a little
nicer in pipes, it also works with remote data frames, and it can optionally
name the output.
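
A short sketch of `pull()` on a made-up data frame (`id` and `v` are illustrative column names), showing both the plain and the named form:

```r
library(dplyr)

df <- data.frame(id = c("x", "y", "z"), v = c(1, 2, 3))

# Same values as df$v, but reads naturally at the end of a pipe
plain <- df %>% pull(v)

# The name argument picks a column to use as names of the result
named <- df %>% pull(v, name = id)

plain
named
```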

```
bind_rows(..., .id = NULL, add.cell.ids = NULL)
bind_cols(..., .id = NULL)
# S3 method for SummarizedExperiment
filter(.data, ..., .preserve = FALSE)
```

- ...
For use by methods.

- .id
Data frame identifier.

When `.id` is supplied, a new column of identifiers is created to link each row to its original data frame. The labels are taken from the named arguments to `bind_rows()`. When a list of data frames is supplied, the labels are taken from the names of the list. If no names are found, a numeric sequence is used instead.

- add.cell.ids
from SummarizedExperiment 3.0 A character vector of length(x=c(x, y)). Appends the corresponding values to the start of each object's cell names.

- .data
A tidySummarizedExperiment object or any data frame

- .preserve
When `FALSE` (the default), the grouping structure is recalculated based on the resulting data; otherwise it is kept as is.

- .keep_all
If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values. (See dplyr)

- .drop
When `.drop=TRUE`, empty groups are dropped. See `group_by_drop_default()` for what the default value is for this argument.

- data
Input data frame.

- x
tbls to join. (See dplyr)

- y
tbls to join. (See dplyr)

- by
A character vector of variables to join by. (See dplyr)

- copy
If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. (See dplyr)

- suffix
If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2. (See dplyr)

- tbl
A data.frame.

- size
<`tidy-select`> For `sample_n()`, the number of rows to select. For `sample_frac()`, the fraction of rows to select. If `tbl` is grouped, `size` applies to each group.

- replace
Sample with or without replacement?

- weight
<`tidy-select`> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.

- .env
DEPRECATED.

- name
An optional parameter that specifies the column to be used as names for a named vector. Specified in a similar manner as `var`.
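
The labelling rules for the `.id` argument listed above can be sketched with plain data frames. This example uses the dplyr data frame method only; `first`, `second` and the column name `source` are made up, and behaviour on SummarizedExperiment inputs is not shown:

```r
library(dplyr)

first <- data.frame(x = 1:2)
second <- data.frame(x = 3L)

# Labels are taken from the names of the list
labelled <- bind_rows(list(control = first, treated = second), .id = "source")

# Without names, a numeric sequence is used instead
numbered <- bind_rows(list(first, second), .id = "source")

labelled
numbered
```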

A tidySummarizedExperiment object

An object of the same type as `.data`.

Rows are a subset of the input, but appear in the same order.

Columns are not modified.

The number of groups may be reduced (if `.preserve` is not `TRUE`).

Data frame attributes are preserved.

A grouped data frame, unless the combination of `...` and `add` yields an
empty set of grouping columns, in which case a regular (ungrouped) data frame
is returned.

An object *usually* of the same type as `.data`.

The rows come from the underlying `group_keys()`.

The columns are a combination of the grouping keys and the summary expressions that you provide.

If `x` is grouped by more than one variable, the output will be another grouped_df with the right-most group removed.

If `x` is grouped by one variable, or is not grouped, the output will be a tibble.

Data frame attributes are **not** preserved, because `summarise()` fundamentally creates a new data frame.

An object of the same type as `.data`.

For `mutate()`:

Rows are not affected.

Existing columns will be preserved unless explicitly modified.

New columns will be added to the right of existing columns.

Columns given value `NULL` will be removed.

Groups will be recomputed if a grouping variable is mutated.

Data frame attributes are preserved.

For `transmute()`:

Rows are not affected.

Apart from grouping variables, existing columns will be removed unless explicitly kept.

Column order matches order of expressions.

Groups will be recomputed if a grouping variable is mutated.

Data frame attributes are preserved.

An object of the same type as `.data`.

Rows are not affected.

Column names are changed; column order is preserved.

Data frame attributes are preserved.

Groups are updated to reflect new names.

A `tbl`

A tidySummarizedExperiment object

An object of the same type as `.data`. The output has the following
properties:

Each row may appear 0, 1, or many times in the output.

Columns are not modified.

Groups are not modified.

Data frame attributes are preserved.

An object of the same type as `.data`. The output has the following
properties:

Rows are not affected.

Output columns are a subset of input columns, potentially with a different order. Columns will be renamed if `new_name=old_name` form is used.

Data frame attributes are preserved.

Groups are maintained; you can't select off grouping variables.

A tidySummarizedExperiment object

A vector the same size as `.data`.

dplyr is not yet smart enough to optimise filtering
on grouped datasets that don't need grouped calculations. For this reason,
filtering is often considerably faster on `ungroup()`ed data.

`rowwise()` is used for the results of `do()` when you
create list-variables. It is also useful to support arbitrary
complex operations that need to be applied to each row.

Currently, rowwise grouping only works with data frames. Its
main impact is to allow you to work with list-variables in
`summarise()` and `mutate()` without having to
use `[[1]]`. This makes `summarise()` on a rowwise tbl
effectively equivalent to `plyr::ldply()`.
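
A minimal sketch of working with a list-variable under `rowwise()`. The tibble and its column names (`id`, `vals`, `total`) are made up for illustration, and the example assumes dplyr is installed:

```r
library(dplyr)

# Hypothetical tibble with a list-column holding vectors of different lengths
df <- tibble(id = 1:2, vals = list(c(1, 2, 3), c(10, 20)))

# Within rowwise(), vals refers to the element for the current row,
# so sum(vals) works without writing vals[[1]]
totals <- df %>%
  rowwise() %>%
  mutate(total = sum(vals)) %>%
  ungroup()

totals$total
```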

Slice does not work with relational databases because they have no
intrinsic notion of row order. If you want to perform the equivalent
operation, use `filter()` and `row_number()`.
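
The `filter()` and `row_number()` combination can be sketched on a local, made-up data frame, where it selects the same rows as `slice()`. On a database backend an explicit ordering would normally be needed as well, which this example does not cover:

```r
library(dplyr)

df <- data.frame(v = c(10, 20, 30, 40))

# Positional selection with slice()
by_slice <- slice(df, 2:3)

# The same rows expressed as a filter on the row number
by_filter <- filter(df, row_number() %in% 2:3)

by_slice$v
by_filter$v
```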

Because filtering expressions are computed within groups, they may yield different results on grouped tibbles. This will be the case as soon as an aggregating, lagging, or ranking function is involved. For example, filtering on `mass` being greater than its mean keeps rows with `mass`
greater than the global average on an ungrouped tibble, whereas on a tibble
grouped by gender it keeps rows with `mass` greater than the gender average.
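
A small sketch of this difference, using a made-up `people` data frame with `gender` and `mass` columns (the data are invented for the example):

```r
library(dplyr)

people <- data.frame(
  gender = c("f", "f", "m", "m"),
  mass = c(50, 60, 80, 100)
)

# Ungrouped: compared against the global mean (72.5)
above_global <- people %>%
  filter(mass > mean(mass))

# Grouped: compared against each gender's own mean (55 and 90)
above_group <- people %>%
  group_by(gender) %>%
  filter(mass > mean(mass))

above_global$mass
above_group$mass
```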

Because mutating expressions are computed within groups, they may yield different results on grouped tibbles. This will be the case as soon as an aggregating, lagging, or ranking function is involved, for example when a column is divided by its mean.
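
The effect of grouping on a mutating expression can be sketched with a made-up `people` data frame (the `gender`, `mass` and `mass_norm` names and the values are invented for the example):

```r
library(dplyr)

people <- data.frame(
  gender = c("f", "f", "m", "m"),
  mass = c(50, 60, 80, 100)
)

# Ungrouped: each mass is divided by the global mean (72.5)
global_norm <- people %>%
  mutate(mass_norm = mass / mean(mass))

# Grouped: each mass is divided by the mean of its own gender (55 or 90)
group_norm <- people %>%
  group_by(gender) %>%
  mutate(mass_norm = mass / mean(mass)) %>%
  ungroup()

global_norm$mass_norm
group_norm$mass_norm
```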

This function is a **generic**, which means that packages can provide
implementations (methods) for other classes. See the documentation of
individual methods for extra arguments and differences in behaviour.

The following methods are currently available in loaded packages:

These functions are **generic**s, which means that packages can provide
implementations (methods) for other classes. See the documentation of
individual methods for extra arguments and differences in behaviour.

Methods available in currently loaded packages:

- `slice()`: no methods found.
- `slice_head()`: no methods found.
- `slice_tail()`: no methods found.
- `slice_min()`: no methods found.
- `slice_max()`: no methods found.
- `slice_sample()`: no methods found.

The data frame backend supports creating a variable and using it in the
same summary. This means that previously created summary variables can be
further transformed or combined within the summary, as in `mutate()`.
However, it also means that summary variables with the same names as previous
variables overwrite them, making those variables unavailable to later summary
variables.

This behaviour may not be supported in other backends. To avoid unexpected results, consider using new names for your summary variables, especially when creating multiple summaries.
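
Both effects can be sketched on a made-up data frame with the data frame backend (the names `total`, `count`, `average`, `spread` and `mean_v` are illustrative):

```r
library(dplyr)

df <- data.frame(v = c(1, 2, 3, 4))

# A summary variable can be reused later in the same call
chained <- summarise(df, total = sum(v), count = n(), average = total / count)

# Reusing the name v overwrites the column, so sd() only sees the
# single summarised value and returns NA
overwritten <- summarise(df, v = mean(v), spread = sd(v))

# A new name keeps the original column available to later summaries
safer <- summarise(df, mean_v = mean(v), spread = sd(v))

chained
overwritten
safer
```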

Use the three scoped variants (`rename_all()`, `rename_if()`, `rename_at()`)
to rename a set of variables with a function.
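
A brief sketch of the three scoped variants on a made-up plain data frame, using `toupper()` as the renaming function. These variants are superseded in recent dplyr releases but are assumed to be still available:

```r
library(dplyr)

df <- data.frame(a = 1:2, b = c("x", "y"))

# Apply the function to every column name
all_upper <- rename_all(df, toupper)

# Only to columns that satisfy a predicate
numeric_upper <- rename_if(df, is.numeric, toupper)

# Only to an explicit selection of columns
b_upper <- rename_at(df, vars(b), toupper)

names(all_upper)
names(numeric_upper)
names(b_upper)
```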

`filter_all()`, `filter_if()` and `filter_at()`.

```
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
distinct(.sample)
#> tidySummarizedExperiment says: Key columns are missing. A data frame is returned for independent data analysis.
#> # A tibble: 7 × 1
#> .sample
#> <chr>
#> 1 untrt1
#> 2 untrt2
#> 3 untrt3
#> 4 untrt4
#> 5 trt1
#> 6 trt2
#> 7 trt3
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
filter(.sample == "untrt1")
#> # A SummarizedExperiment-tibble abstraction: 14,599 × 5
#> # Features=14599 | Samples=1 | Assays=counts
#> .feature .sample counts condition type
#> <chr> <chr> <int> <chr> <chr>
#> 1 FBgn0000003 untrt1 0 untreated single_end
#> 2 FBgn0000008 untrt1 92 untreated single_end
#> 3 FBgn0000014 untrt1 5 untreated single_end
#> 4 FBgn0000015 untrt1 0 untreated single_end
#> 5 FBgn0000017 untrt1 4664 untreated single_end
#> 6 FBgn0000018 untrt1 583 untreated single_end
#> 7 FBgn0000022 untrt1 0 untreated single_end
#> 8 FBgn0000024 untrt1 10 untreated single_end
#> 9 FBgn0000028 untrt1 0 untreated single_end
#> 10 FBgn0000032 untrt1 1446 untreated single_end
#> # … with 40 more rows
# Learn more in ?dplyr_tidy_eval
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
group_by(.sample)
#> tidySummarizedExperiment says: A data frame is returned for independent data analysis.
#> # A tibble: 102,193 × 5
#> # Groups: .sample [7]
#> .feature .sample counts condition type
#> <chr> <chr> <int> <chr> <chr>
#> 1 FBgn0000003 untrt1 0 untreated single_end
#> 2 FBgn0000008 untrt1 92 untreated single_end
#> 3 FBgn0000014 untrt1 5 untreated single_end
#> 4 FBgn0000015 untrt1 0 untreated single_end
#> 5 FBgn0000017 untrt1 4664 untreated single_end
#> 6 FBgn0000018 untrt1 583 untreated single_end
#> 7 FBgn0000022 untrt1 0 untreated single_end
#> 8 FBgn0000024 untrt1 10 untreated single_end
#> 9 FBgn0000028 untrt1 0 untreated single_end
#> 10 FBgn0000032 untrt1 1446 untreated single_end
#> # … with 102,183 more rows
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
summarise(mean(counts))
#> tidySummarizedExperiment says: A data frame is returned for independent data analysis.
#> # A tibble: 1 × 1
#> `mean(counts)`
#> <dbl>
#> 1 907.
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
mutate(logcounts=log2(counts))
#> # A SummarizedExperiment-tibble abstraction: 102,193 × 6
#> # Features=14599 | Samples=7 | Assays=counts, logcounts
#> .feature .sample counts logcounts condition type
#> <chr> <chr> <int> <dbl> <chr> <chr>
#> 1 FBgn0000003 untrt1 0 -Inf untreated single_end
#> 2 FBgn0000008 untrt1 92 6.52 untreated single_end
#> 3 FBgn0000014 untrt1 5 2.32 untreated single_end
#> 4 FBgn0000015 untrt1 0 -Inf untreated single_end
#> 5 FBgn0000017 untrt1 4664 12.2 untreated single_end
#> 6 FBgn0000018 untrt1 583 9.19 untreated single_end
#> 7 FBgn0000022 untrt1 0 -Inf untreated single_end
#> 8 FBgn0000024 untrt1 10 3.32 untreated single_end
#> 9 FBgn0000028 untrt1 0 -Inf untreated single_end
#> 10 FBgn0000032 untrt1 1446 10.5 untreated single_end
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
# tidySummarizedExperiment::pasilla %>%
#
# rename(cond=condition)
`%>%` <- magrittr::`%>%`
tt <- tidySummarizedExperiment::pasilla
tt %>% left_join(tt %>% distinct(condition) %>% mutate(new_column=1:2))
#> tidySummarizedExperiment says: Key columns are missing. A data frame is returned for independent data analysis.
#> Joining with `by = join_by(condition)`
#> # A SummarizedExperiment-tibble abstraction: 102,193 × 6
#> # Features=14599 | Samples=7 | Assays=counts
#> .feature .sample counts condition type new_column
#> <chr> <chr> <int> <chr> <chr> <int>
#> 1 FBgn0000003 untrt1 0 untreated single_end 1
#> 2 FBgn0000008 untrt1 92 untreated single_end 1
#> 3 FBgn0000014 untrt1 5 untreated single_end 1
#> 4 FBgn0000015 untrt1 0 untreated single_end 1
#> 5 FBgn0000017 untrt1 4664 untreated single_end 1
#> 6 FBgn0000018 untrt1 583 untreated single_end 1
#> 7 FBgn0000022 untrt1 0 untreated single_end 1
#> 8 FBgn0000024 untrt1 10 untreated single_end 1
#> 9 FBgn0000028 untrt1 0 untreated single_end 1
#> 10 FBgn0000032 untrt1 1446 untreated single_end 1
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
tt <- tidySummarizedExperiment::pasilla
tt %>% inner_join(tt %>% distinct(condition) %>% mutate(new_column=1:2) %>% slice(1))
#> tidySummarizedExperiment says: Key columns are missing. A data frame is returned for independent data analysis.
#> Joining with `by = join_by(condition)`
#> # A SummarizedExperiment-tibble abstraction: 58,396 × 6
#> # Features=14599 | Samples=4 | Assays=counts
#> .feature .sample counts condition type new_column
#> <chr> <chr> <int> <chr> <chr> <int>
#> 1 FBgn0000003 untrt1 0 untreated single_end 1
#> 2 FBgn0000008 untrt1 92 untreated single_end 1
#> 3 FBgn0000014 untrt1 5 untreated single_end 1
#> 4 FBgn0000015 untrt1 0 untreated single_end 1
#> 5 FBgn0000017 untrt1 4664 untreated single_end 1
#> 6 FBgn0000018 untrt1 583 untreated single_end 1
#> 7 FBgn0000022 untrt1 0 untreated single_end 1
#> 8 FBgn0000024 untrt1 10 untreated single_end 1
#> 9 FBgn0000028 untrt1 0 untreated single_end 1
#> 10 FBgn0000032 untrt1 1446 untreated single_end 1
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
tt <- tidySummarizedExperiment::pasilla
tt %>% right_join(tt %>% distinct(condition) %>% mutate(new_column=1:2) %>% slice(1))
#> tidySummarizedExperiment says: Key columns are missing. A data frame is returned for independent data analysis.
#> Joining with `by = join_by(condition)`
#> # A SummarizedExperiment-tibble abstraction: 58,396 × 6
#> # Features=14599 | Samples=4 | Assays=counts
#> .feature .sample counts condition type new_column
#> <chr> <chr> <int> <chr> <chr> <int>
#> 1 FBgn0000003 untrt1 0 untreated single_end 1
#> 2 FBgn0000008 untrt1 92 untreated single_end 1
#> 3 FBgn0000014 untrt1 5 untreated single_end 1
#> 4 FBgn0000015 untrt1 0 untreated single_end 1
#> 5 FBgn0000017 untrt1 4664 untreated single_end 1
#> 6 FBgn0000018 untrt1 583 untreated single_end 1
#> 7 FBgn0000022 untrt1 0 untreated single_end 1
#> 8 FBgn0000024 untrt1 10 untreated single_end 1
#> 9 FBgn0000028 untrt1 0 untreated single_end 1
#> 10 FBgn0000032 untrt1 1446 untreated single_end 1
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
tt <- tidySummarizedExperiment::pasilla
tt %>% full_join(tibble::tibble(condition="treated", dose=10))
#> Joining with `by = join_by(condition)`
#> # A SummarizedExperiment-tibble abstraction: 102,193 × 6
#> # Features=14599 | Samples=7 | Assays=counts
#> .feature .sample counts condition type dose
#> <chr> <chr> <int> <chr> <chr> <dbl>
#> 1 FBgn0000003 untrt1 0 untreated single_end NA
#> 2 FBgn0000008 untrt1 92 untreated single_end NA
#> 3 FBgn0000014 untrt1 5 untreated single_end NA
#> 4 FBgn0000015 untrt1 0 untreated single_end NA
#> 5 FBgn0000017 untrt1 4664 untreated single_end NA
#> 6 FBgn0000018 untrt1 583 untreated single_end NA
#> 7 FBgn0000022 untrt1 0 untreated single_end NA
#> 8 FBgn0000024 untrt1 10 untreated single_end NA
#> 9 FBgn0000028 untrt1 0 untreated single_end NA
#> 10 FBgn0000032 untrt1 1446 untreated single_end NA
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
slice(1)
#> # A SummarizedExperiment-tibble abstraction: 1 × 5
#> # Features=1 | Samples=1 | Assays=counts
#> .feature .sample counts condition type
#> <chr> <chr> <int> <chr> <chr>
#> 1 FBgn0000003 untrt1 0 untreated single_end
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
select(.sample, .feature, counts)
#> # A SummarizedExperiment-tibble abstraction: 102,193 × 3
#> # Features=14599 | Samples=7 | Assays=counts
#> .feature .sample counts
#> <chr> <chr> <int>
#> 1 FBgn0000003 untrt1 0
#> 2 FBgn0000008 untrt1 92
#> 3 FBgn0000014 untrt1 5
#> 4 FBgn0000015 untrt1 0
#> 5 FBgn0000017 untrt1 4664
#> 6 FBgn0000018 untrt1 583
#> 7 FBgn0000022 untrt1 0
#> 8 FBgn0000024 untrt1 10
#> 9 FBgn0000028 untrt1 0
#> 10 FBgn0000032 untrt1 1446
#> # … with 40 more rows
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
sample_n(50)
#> tidySummarizedExperiment says: A data frame is returned for independent data analysis.
#> # A tibble: 50 × 5
#> .feature .sample counts condition type
#> <chr> <chr> <int> <chr> <chr>
#> 1 FBgn0033624 untrt3 380 untreated paired_end
#> 2 FBgn0003250 untrt1 1 untreated single_end
#> 3 FBgn0031545 trt1 7 treated single_end
#> 4 FBgn0039338 trt1 4438 treated single_end
#> 5 FBgn0033453 untrt1 773 untreated single_end
#> 6 FBgn0259975 trt1 0 treated single_end
#> 7 FBgn0036578 trt1 481 treated single_end
#> 8 FBgn0015831 trt3 0 treated paired_end
#> 9 FBgn0037263 trt2 9 treated paired_end
#> 10 FBgn0037547 trt2 0 treated paired_end
#> # … with 40 more rows
tidySummarizedExperiment::pasilla %>%
sample_frac(0.1)
#> tidySummarizedExperiment says: A data frame is returned for independent data analysis.
#> # A tibble: 10,219 × 5
#> .feature .sample counts condition type
#> <chr> <chr> <int> <chr> <chr>
#> 1 FBgn0052407 trt3 49 treated paired_end
#> 2 FBgn0039099 trt2 24 treated paired_end
#> 3 FBgn0038110 trt1 766 treated single_end
#> 4 FBgn0259739 untrt2 43 untreated single_end
#> 5 FBgn0036793 trt2 0 treated paired_end
#> 6 FBgn0050486 untrt3 1 untreated paired_end
#> 7 FBgn0037174 untrt2 1 untreated single_end
#> 8 FBgn0031051 trt1 2612 treated single_end
#> 9 FBgn0052438 untrt2 1821 untreated single_end
#> 10 FBgn0051116 trt1 366 treated single_end
#> # … with 10,209 more rows
`%>%` <- magrittr::`%>%`
tidySummarizedExperiment::pasilla %>%
pull(feature)
#> Warning: tidySummarizedExperiment says: from version 1.3.1, the special columns including sample/feature id (colnames(se), rownames(se)) has changed to ".sample" and ".feature". This dataset is returned with the old-style vocabulary (feature and sample), however we suggest to update your workflow to reflect the new vocabulary (.feature, .sample)
```