`summarise()` creates a new data frame. It will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
`summarise()` and `summarize()` are synonyms.
A tbl. (See dplyr)
<[`tidy-eval`][dplyr_tidy_eval]> Name-value pairs of summary functions. The name will be the name of the variable in the result.
The value can be:
* A vector of length 1, e.g. `min(x)`, `n()`, or `sum(is.na(y))`.
* A vector of length `n`, e.g. `quantile()`.
* A data frame, to add multiple columns from a single expression.
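As a rough illustration of the value types listed above, the sketch below uses the built-in `mtcars` data. The result names (`by_cyl`, `ranges`, `n_cars`, `lo`, `hi`, and so on) are made up for this example. The length-`n` case is left out on purpose, since recent dplyr releases appear to steer that usage towards `reframe()`.

```r
library(dplyr)

# Length-1 values: each expression yields one value per group
by_cyl <- mtcars |>
  group_by(cyl) |>
  summarise(
    n_cars     = n(),
    min_mpg    = min(mpg),
    missing_hp = sum(is.na(hp))
  )

# A data frame value: a single unnamed expression adds several columns
ranges <- mtcars |>
  group_by(cyl) |>
  summarise(tibble(lo = min(mpg), hi = max(mpg)))

names(ranges)
```

Here `by_cyl` has one row per value of `cyl`, and `ranges` gains both `lo` and `hi` from the one `tibble()` expression.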
An object _usually_ of the same type as `.data`.
* The rows come from the underlying `group_keys()`.
* The columns are a combination of the grouping keys and the summary expressions that you provide.
* If `.data` is grouped by more than one variable, the output will be another [grouped_df] with the right-most group removed.
* If `.data` is grouped by one variable, or is not grouped, the output will be a [tibble].
* Data frame attributes are **not** preserved, because `summarise()` fundamentally creates a new data frame.
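The grouping rules above can be checked with a small sketch on `mtcars`. The names `two_keys` and `one_key` are chosen only for this illustration. The first call may also print an informational message about the grouping of the result.

```r
library(dplyr)

# Grouped by two variables: the result stays grouped, minus the right-most key
two_keys <- mtcars |>
  group_by(cyl, am) |>
  summarise(mean_mpg = mean(mpg))
group_vars(two_keys)
#> [1] "cyl"

# Grouped by one variable: the result is an ungrouped tibble
one_key <- mtcars |>
  group_by(cyl) |>
  summarise(mean_mpg = mean(mpg))
is_grouped_df(one_key)
#> [1] FALSE
```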
* Center: [mean()], [median()]
* Spread: [sd()], [IQR()], [mad()]
* Range: [min()], [max()], [quantile()]
* Position: [first()], [last()], [nth()]
* Count: [n()], [n_distinct()]
* Logical: [any()], [all()]
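One possible way to combine functions from several of these families in a single call is sketched below on `mtcars`. The column names on the left-hand side and the `qsec < 15` threshold are arbitrary choices for this example.

```r
library(dplyr)

stats <- mtcars |>
  group_by(cyl) |>
  summarise(
    centre   = median(mpg),      # Center
    spread   = IQR(mpg),         # Spread
    lowest   = min(mpg),         # Range
    first_hp = first(hp),        # Position
    cars     = n(),              # Count
    gears    = n_distinct(gear), # Count of distinct values
    any_fast = any(qsec < 15)    # Logical
  )

stats
```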
The data frame backend supports creating a variable and using it in the same summary. This means that previously created summary variables can be further transformed or combined within the summary, as in [mutate()]. However, it also means that summary variables with the same names as previous variables overwrite them, making those variables unavailable to later summary variables.
This behaviour may not be supported in other backends. To avoid unexpected results, consider using new names for your summary variables, especially when creating multiple summaries.
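The overwriting pitfall described above can be seen in a minimal sketch on `mtcars`, assuming the data frame backend. The names `clobbered`, `safe`, `mean_disp`, and `sd_disp` are invented for this example.

```r
library(dplyr)

# `disp` is replaced by its mean first, so sd() then sees a single value
clobbered <- mtcars |>
  summarise(disp = mean(disp), sd = sd(disp))
is.na(clobbered$sd)
#> [1] TRUE

# New names leave the original `disp` column available to later summaries
safe <- mtcars |>
  summarise(mean_disp = mean(disp), sd_disp = sd(disp))
is.na(safe$sd_disp)
#> [1] FALSE
```

The standard deviation of a single value is `NA`, which is why the first result is missing rather than wrong in a more obvious way.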
This function is a **generic**, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.
The following methods are currently available in loaded packages:
library(dplyr)
# A summary applied to an ungrouped tbl returns a single row
mtcars |>
  summarise(mean = mean(disp))
#>       mean
#> 1 230.7219