R/unharmonised.R
get_unharmonised_dataset.Rd
Some metadata fields are specific to individual datasets, so it does not make sense for them to live in the main (harmonised) metadata table. This utility function fetches that unharmonised metadata for one or more datasets when it is needed.
get_unharmonised_dataset(
dataset_id,
cells = NULL,
conn = dbConnect(drv = duckdb(), read_only = TRUE),
remote_url = UNHARMONISED_URL,
cache_directory = get_default_cache_dir()
)
dataset_id: A character vector, where each entry is a dataset ID obtained from the $file_id column of the table returned from get_metadata().
cells: An optional character vector of cell IDs. If provided, only metadata for those cells will be returned.
conn: An optional DuckDB connection object. If provided, the existing connection is reused instead of opening a new one.
remote_url: Optional character vector of length 1. An HTTP URL pointing to the root URL under which all the unharmonised dataset files are located.
cache_directory: Optional character vector of length 1. A file path on your local system to a directory (not a file) that will be used to store the unharmonised metadata files.
A named list, where each name is a dataset file ID and each value is a "lazy data frame", i.e. a tbl.
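The shape of the return value can be illustrated without downloading anything. The following is a minimal sketch, not the package's implementation: it builds a stand-in for the returned list using an in-memory DuckDB table, and assumes the dplyr, dbplyr, DBI and duckdb packages are installed. The dataset ID "toy-dataset-id" and the column donor_age are hypothetical; only cell_ appears in this page's own example.

```r
library(dplyr)

# Stand-in data: an in-memory DuckDB table (hypothetical contents).
con <- DBI::dbConnect(duckdb::duckdb())
DBI::dbWriteTable(
    con, "toy",
    data.frame(
        cell_ = c("c1", "c2", "c3"),
        donor_age = c("34", "51", "34")  # hypothetical unharmonised column
    )
)

# Mimic the documented return shape: a named list of lazy tables,
# keyed by dataset file ID (the ID here is made up).
result <- list("toy-dataset-id" = dplyr::tbl(con, "toy"))

# A lazy table is only a query description; filters are translated to SQL
# and nothing is read into R until collect() is called.
wanted <- c("c1", "c3")
subset_tbl <- result[["toy-dataset-id"]] |>
    dplyr::filter(cell_ %in% !!wanted) |>
    dplyr::arrange(cell_) |>
    dplyr::collect()

DBI::dbDisconnect(con, shutdown = TRUE)
```

Because each list element is lazy, it is usually cheapest to filter or select columns first and call collect() last, so that only the rows and columns you need are materialised in memory.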
# \donttest{
dataset <- "838ea006-2369-4e2c-b426-b2a744a2b02b"
harmonised_meta <- get_metadata() |>
    dplyr::filter(file_id == dataset) |>
    dplyr::collect()
unharmonised_meta <- get_unharmonised_dataset(dataset)
unharmonised_tbl <- dplyr::collect(unharmonised_meta[[dataset]])
dplyr::left_join(harmonised_meta, unharmonised_tbl, by = c("file_id", "cell_"))
# }