Source: R/methods.R, R/methods_SE.R
cluster_elements-methods.Rd
cluster_elements() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and identifies clusters in the data.
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
# S4 method for class 'spec_tbl_df'
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
# S4 method for class 'tbl_df'
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
# S4 method for class 'tidybulk'
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
# S4 method for class 'SummarizedExperiment'
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
# S4 method for class 'RangedSummarizedExperiment'
cluster_elements(
.data,
.element = NULL,
.feature = NULL,
.abundance = NULL,
method,
of_samples = TRUE,
transform = log1p,
action = "add",
...,
log_transform = NULL
)
.data: A `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)).
.element: The name of the element column (normally samples).
.feature: The name of the feature column (normally transcripts/genes).
.abundance: The name of the column containing the numerical value the clustering is based on (normally transcript abundance).
method: A character string. The clustering algorithm to use. Currently k-means and SNN clustering are supported.
of_samples: A boolean. If the input is a tidybulk object, it indicates whether the element column will be the sample column or the transcript column.
transform: A function that will transform the counts. The default is log1p, which suits RNA sequencing data; to avoid transformation, use identity.
action: A character string. Whether to join the new information to the input tbl ("add"), or to return only the non-redundant tbl with the new information ("get").
...: Further parameters passed to the function kmeans.
log_transform: DEPRECATED - A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data).
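The difference between the two `action` values can be illustrated with a small base-R sketch. It does not call tidybulk; the toy table, the hand-written cluster labels and the object names (`counts`, `labels`, `added`, `got`) are all hypothetical, and the exact columns tidybulk returns may differ.

```r
# Toy long-format table: one row per sample/feature pair (hypothetical data).
counts <- data.frame(
  sample  = rep(c("s1", "s2", "s3", "s4"), each = 2),
  feature = rep(c("g1", "g2"), times = 4),
  count   = c(0, 5, 1, 4, 200, 900, 250, 800)
)

# Hypothetical cluster labels, one per sample (a stand-in for a clustering result).
labels <- data.frame(
  sample         = c("s1", "s2", "s3", "s4"),
  cluster_kmeans = factor(c(1, 1, 2, 2))
)

# "add"-like result: the labels are joined back onto every row of the input.
added <- merge(counts, labels, by = "sample")

# "get"-like result: only the non-redundant, sample-level information is kept.
got <- unique(added[, c("sample", "cluster_kmeans")])
```

Here `added` keeps one row per sample/feature pair, while `got` has one row per sample.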
A tbl object with additional columns with cluster labels
A `SummarizedExperiment` object
Lifecycle: maturing
Identifies clusters in the data, normally of samples. This function returns a tibble with additional columns for the cluster annotation. At the moment only k-means (DOI: 10.2307/2346830) and SNN clustering (DOI: 10.1016/j.cell.2019.05.031) are supported; the plan is to introduce more clustering methods.
Underlying method for kmeans: kmeans(.data, iter.max = 1000, ...), invoked through do.call.
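As a rough illustration of the k-means step, the sketch below runs `stats::kmeans()` on a transformed toy matrix in base R. It is not the package's internal code: the count matrix is made up, the object names are hypothetical, and `nstart = 5` is an extra `kmeans()` argument of the kind that would be passed through `...`.

```r
set.seed(42)

# Toy count matrix: features in rows, samples in columns (hypothetical data).
counts <- matrix(
  c(0, 1, 200, 250,
    5, 4, 900, 800,
    2, 3, 500, 450),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3", "s4"))
)

# Apply the transformation (log1p is the documented default of `transform`).
transformed <- log1p(counts)

# kmeans() clusters rows, so samples are moved into rows before clustering.
# iter.max = 1000 mirrors the documented underlying call.
fit <- kmeans(t(transformed), centers = 2, iter.max = 1000, nstart = 5)

# One cluster label per sample.
cluster_kmeans <- factor(fit$cluster)
```

Clustering features instead of samples would, under the same assumptions, skip the transpose so that features stay in rows.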
Underlying method for SNN: .data is passed in sequence through Seurat::CreateSeuratObject(), Seurat::ScaleData(display.progress = TRUE, num.cores = 4, do.par = TRUE), Seurat::FindVariableFeatures(selection.method = "vst"), Seurat::RunPCA(npcs = 30), Seurat::FindNeighbors() and Seurat::FindClusters(method = "igraph", ...).
cluster_elements(tidybulk::se_mini, centers = 2, method="kmeans")
#> Warning: tidybulk says: highly abundant transcripts were not identified (i.e. identify_abundant()) or filtered (i.e., keep_abundant), therefore this operation will be performed on unfiltered data. In rare occasions this could be wanted. In standard whole-transcriptome workflows is generally unwanted.
#> class: SummarizedExperiment
#> dim: 527 5
#> metadata(0):
#> assays(1): count
#> rownames(527): ABCB4 ABCB9 ... ZNF324 ZNF442
#> rowData names(1): entrez
#> colnames(5): SRR1740034 SRR1740035 SRR1740043 SRR1740058 SRR1740067
#> colData names(6): Cell.type time ... dead cluster_kmeans