Please see http://www.win-vector.com/blog/2017/05/managing-spark-data-handles-in-r/ for details. Note: by default the keys column is populated with every column of the table, so one usually needs to edit it to name the actual key columns.

tableDescription(tableName, handle, ..., keyInspector = key_inspector_all_cols)

Arguments

tableName

name of table to add to join plan.

handle

table or table handle to add to join plan (can already be in the plan).

...

force later arguments to bind by name.

keyInspector

function that determines preferred primary key set for table.
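
As a rough illustration of the keyInspector argument, the sketch below defines a hypothetical inspector, key_inspector_id_first (not part of replyr). It assumes that an inspector is called with the table handle and returns a named character vector of key columns, as the default key_inspector_all_cols appears to do. That calling convention is an assumption here, so check it against the package source before relying on it. The sketch uses only base R and a local data frame.

```r
# Hypothetical key inspector (illustration only, not a replyr function).
# Assumption: the inspector receives the table handle and returns a named
# character vector whose values are the key column names.
key_inspector_id_first <- function(handle) {
  cols <- colnames(handle)
  # Prefer a column literally named "id"; otherwise fall back to all columns.
  keys <- if ("id" %in% cols) "id" else cols
  names(keys) <- keys
  keys
}

d_with_id <- data.frame(id = 1:3, y = NA)
d_without_id <- data.frame(x = 1:3, y = NA)

keys_with_id <- key_inspector_id_first(d_with_id)
keys_without_id <- key_inspector_id_first(d_without_id)
```

If the assumption holds, such a function could be passed as tableDescription('d', d_with_id, keyInspector = key_inspector_id_first), so that the keys column starts out with a narrower guess than "all columns".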

Value

table describing the data.

Details

Please see vignette('DependencySorting', package = 'replyr') and vignette('joinController', package = 'replyr') for more details.

Examples

d <- data.frame(x=1:3, y=NA)
tableDescription('d', d)
#> # A tibble: 1 x 8
#>   tableName handle columns keys colClass sourceClass isEmpty indicatorColumn
#>   <chr> <list> <list> <lis> <list> <chr> <lgl> <chr>
#> 1 d <df[,2]<chr [2<chr<chr [2… character FALSE table_d_present