Please see http://www.win-vector.com/blog/2017/05/managing-spark-data-handles-in-r/ for details. Note: with the default keyInspector the keys column is populated with all of the table's columns, so one usually needs to edit it down to the intended key set.
tableDescription(tableName, handle, ..., keyInspector = key_inspector_all_cols)
Argument | Description |
---|---|
tableName | name of table to add to join plan. |
handle | table or table handle to add to join plan (can already be in the plan). |
... | force later arguments to bind by name. |
keyInspector | function that determines preferred primary key set for table. |
table describing the data.
Please see vignette('DependencySorting', package = 'replyr')
and vignette('joinController', package= 'replyr')
for more details.
#> # A tibble: 1 x 8
#>   tableName handle   columns  keys  colClass sourceClass isEmpty indicatorColumn
#>   <chr>     <list>   <list>   <lis> <list>   <chr>       <lgl>   <chr>
#> 1 d         <df[,2]… <chr [2… <chr… <chr [2… character   FALSE   table_d_present
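The note about the keys column can be illustrated with a minimal sketch. The data frame `d` and the choice of `x` as the key are assumptions made for illustration, not taken from the package examples. The sketch also assumes that each entry of `keys` is a named character vector of column names, which is how the default `key_inspector_all_cols` is understood to fill it.

```r
library(replyr)

# Hypothetical two-column table, used only for illustration.
d <- data.frame(x = 1:3, y = NA)

# Describe the table; with the default keyInspector every column
# is listed as a key.
tDesc <- tableDescription("d", d)
print(tDesc$keys[[1]])

# Assumption: x alone identifies a row, so narrow the key set to x.
tDesc$keys[[1]] <- c(x = "x")
print(tDesc$keys[[1]])
```

Editing `keys` directly leaves the rest of the description untouched, so the same row can then be passed on to the join-planning functions described in the vignettes.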