Map data records from row records (records that are exactly single rows) to block records (records that may be more than one row).

rowrecs_to_blocks(
  wideTable,
  controlTable,
  ...,
  checkNames = TRUE,
  checkKeys = FALSE,
  strict = FALSE,
  controlTableKeys = colnames(controlTable)[[1]],
  columnsToCopy = NULL,
  tmp_name_source = wrapr::mk_tmp_name_source("rrtbl"),
  temporary = TRUE,
  allow_rqdatatable = FALSE
)

# S3 method for default
rowrecs_to_blocks(
  wideTable,
  controlTable,
  ...,
  checkNames = TRUE,
  checkKeys = FALSE,
  strict = FALSE,
  controlTableKeys = colnames(controlTable)[[1]],
  columnsToCopy = NULL,
  tmp_name_source = wrapr::mk_tmp_name_source("rrtobd"),
  temporary = TRUE,
  allow_rqdatatable = FALSE
)

# S3 method for relop
rowrecs_to_blocks(
  wideTable,
  controlTable,
  ...,
  checkNames = TRUE,
  checkKeys = FALSE,
  strict = FALSE,
  controlTableKeys = colnames(controlTable)[[1]],
  columnsToCopy = NULL,
  tmp_name_source = wrapr::mk_tmp_name_source("rrtbl"),
  temporary = TRUE,
  allow_rqdatatable = FALSE
)

Arguments

wideTable

data.frame containing data to be mapped (in-memory data.frame).

controlTable

table specifying mapping (local data frame).

...

force later arguments to be by name.

checkNames

logical, if TRUE check names.

checkKeys

logical, if TRUE check columnsToCopy form row keys (not a requirement, unless you want to be able to invert the operation).

strict

logical, if TRUE check control table name forms.

controlTableKeys

character, which column names of the control table are considered to be keys.

columnsToCopy

character array of column names to copy.

tmp_name_source

a tempNameGenerator from wrapr::mk_tmp_name_source().

temporary

logical, if TRUE use temporary tables.

allow_rqdatatable

logical, if TRUE allow rqdatatable shortcutting on simple conversions.

Value

long table built by mapping wideTable to one row per group
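
As an illustration of the returned table, and of the checkKeys remark about inverting the operation, here is a minimal sketch that maps two row records to blocks and then maps them back with blocks_to_rowrecs(). The id column and the numeric values are assumptions made up for this example; they are not part of the package.

```r
library(cdata)

# Two row records; 'id' is a hypothetical key column added for illustration.
d <- data.frame(id = c(1, 2), AUC = c(0.6, 0.7), R2 = c(0.2, 0.3))

cT <- build_unpivot_control(
  nameForNewKeyColumn = 'meas',
  nameForNewValueColumn = 'val',
  columnsToTakeFrom = c('AUC', 'R2'))

# Each input row becomes a block of two rows (one per measurement).
# checkKeys = TRUE asks for a check that columnsToCopy keys the rows.
blocks <- rowrecs_to_blocks(d, cT, columnsToCopy = 'id', checkKeys = TRUE)
print(blocks)

# Invert the mapping: 'id' says which block rows belong to the same record.
back <- blocks_to_rowrecs(blocks, keyColumns = 'id', controlTable = cT)
back <- back[order(back$id), , drop = FALSE]
print(back)
```

Because id identifies each original row, the block form keeps enough information to rebuild the row records; without such a key column the inverse mapping would have no way to tell the blocks apart.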

Details

The controlTable defines the names of each data element in the two notations: the notation of the tall table (which is row oriented) and the notation of the wide table (which is column oriented). The values in controlTable[ , 1] (the group labels) crossed with colnames(controlTable) (the column labels) name the data cells in the long form. The entries of controlTable[ , 2:ncol(controlTable)] name the data cells in the wide form. To get behavior similar to tidyr::gather/spread, build the control table by running an appropriate query over the data.
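
To make the two notations concrete, the sketch below builds a control table by hand instead of using build_unpivot_control(). The data values and the column names (id, flower_part, Length, Width) are assumptions chosen for illustration. The first column of the control table holds the group labels, and the remaining cells hold the names of the wide-table columns that supply each long-form cell.

```r
library(cdata)

# One row record per id, with four measurement columns (illustrative values).
d <- data.frame(
  id = c(1, 2),
  Sepal.Length = c(5.1, 4.9),
  Sepal.Width = c(3.5, 3.0),
  Petal.Length = c(1.4, 1.5),
  Petal.Width = c(0.2, 0.3))

# Hand-built control table:
#   column 1      -> group labels in the long form
#   other columns -> names of the wide-table columns feeding each cell
controlTable <- data.frame(
  flower_part = c('Petal', 'Sepal'),
  Length = c('Petal.Length', 'Sepal.Length'),
  Width = c('Petal.Width', 'Sepal.Width'),
  stringsAsFactors = FALSE)

# Each input row becomes a two-row block (one row per flower_part).
res <- rowrecs_to_blocks(d, controlTable, columnsToCopy = 'id')
print(res)
```

In this sketch the cell addressed by flower_part == 'Petal' and column Length in the long form is filled from the wide column Petal.Length, which is exactly what the control table entry in that position says.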

Some discussion and examples can be found here: https://winvector.github.io/FluidData/FluidData.html and here https://github.com/WinVector/cdata.

rowrecs_to_blocks.default will change some factor columns to character, and there are issues with time columns with different time zones.
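
One defensive option, given the note above, is to convert factor columns to character before the call so the column types of the result do not depend on the conversion. This is only a sketch under assumptions: the data frame, its column names, and the choice to convert every factor column are illustrative, not a recommendation from the package authors.

```r
library(cdata)

# Illustrative data with factor-valued measurement columns.
d <- data.frame(
  id = c(1, 2),
  a = factor(c('x', 'y')),
  b = factor(c('y', 'z')))

# Convert all factor columns to character up front.
is_fac <- vapply(d, is.factor, logical(1))
d[is_fac] <- lapply(d[is_fac], as.character)

cT <- build_unpivot_control(
  nameForNewKeyColumn = 'key',
  nameForNewValueColumn = 'value',
  columnsToTakeFrom = c('a', 'b'))

res <- rowrecs_to_blocks(d, cT, columnsToCopy = 'id')

# Inspect the resulting column classes.
print(vapply(res, function(col) class(col)[[1]], character(1)))
```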

Examples

# un-pivot example
d <- data.frame(AUC = 0.6, R2 = 0.2)
cT <- build_unpivot_control(nameForNewKeyColumn= 'meas',
                            nameForNewValueColumn= 'val',
                            columnsToTakeFrom= c('AUC', 'R2'))
rowrecs_to_blocks(d, cT)
#>   meas val
#> 1  AUC 0.6
#> 2   R2 0.2

d <- data.frame(AUC = 0.6, R2 = 0.2)
cT <- build_unpivot_control(
  nameForNewKeyColumn= 'meas',
  nameForNewValueColumn= 'val',
  columnsToTakeFrom= c('AUC', 'R2'))

ops <- rquery::local_td(d) %.>%
  rowrecs_to_blocks(., cT)
cat(format(ops))
#> mk_td("d", c(
#>   "AUC",
#>   "R2")) %.>%
#>  non_sql_node(., CREATE TEMPORARY TABLE "OUT" AS SELECT b."meas", CASE WHEN CAST(b."meas" AS VARCHAR) = 'AUC' THEN a."AUC" WHEN CAST(b."meas" AS VARCHAR) = 'R2' THEN a."R2" ELSE NULL END AS "val" FROM "IN" a CROSS JOIN "rrtbl_23845669547197155044_0000000002" b )

if(requireNamespace("rqdatatable", quietly = TRUE)) {
  library("rqdatatable")
  d %.>%
    ops %.>%
    print(.)
}
#>   meas val
#> 1  AUC 0.6
#> 2   R2 0.2

if(requireNamespace("RSQLite", quietly = TRUE)) {
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(db,
                    'd',
                    d,
                    overwrite = TRUE,
                    temporary = TRUE)
  db %.>%
    ops %.>%
    print(.)
  DBI::dbDisconnect(db)
}
#>   meas val
#> 1  AUC 0.6
#> 2   R2 0.2