union_all has known issues on Spark 2* (see https://github.com/WinVector/replyr/blob/master/issues/UnionIssue.md ), and the union_all semantics exposed to the user differ from one data-source back end to another. This function is an attempt to provide a join-based replacement.

replyr_union_all(
  tabA,
  tabB,
  ...,
  useDplyrLocal = TRUE,
  useSparkRbind = TRUE,
  tempNameGenerator = mk_tmp_name_source("replyr_union_all")
)

Arguments

tabA

not-NULL table with at least 1 row.

tabB

not-NULL table with at least 1 row, on the same data source as tabA and sharing common columns with tabA.

...

force later arguments to be bound by name.

useDplyrLocal

logical, if TRUE use dplyr::bind_rows for local data.

useSparkRbind

logical, if TRUE try to use rbind on sparklyr data.

tempNameGenerator

temp name generator produced by wrapr::mk_tmp_name_source, used to record dplyr::compute() effects.

Value

table with all rows of tabA and tabB (union_all).

Examples

d1 <- data.frame(x = c('a','b'), y = 1, stringsAsFactors= FALSE)
d2 <- data.frame(x = 'c', z = 1, stringsAsFactors= FALSE)
replyr_union_all(d1, d2, useDplyrLocal= FALSE)
#>    y  z x
#> 1  1 NA a
#> 2  1 NA b
#> 3 NA  1 c
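
For comparison, the sketch below calls dplyr::bind_rows directly on the same two local data frames, which is the function the useDplyrLocal argument is documented as using for local data. It is only an illustration of that local path, not a statement of how replyr_union_all is implemented. The variable name res is introduced here for the example. Columns missing from one input are filled with NA, and the column order can differ from the join-based result shown above.

```r
library(dplyr)

# Same inputs as the example above: the tables share column x only.
d1 <- data.frame(x = c('a', 'b'), y = 1, stringsAsFactors = FALSE)
d2 <- data.frame(x = 'c', z = 1, stringsAsFactors = FALSE)

# bind_rows stacks the rows and fills columns absent from an input with NA.
res <- dplyr::bind_rows(d1, d2)
print(res)
```

On non-local data sources (for example a database or Spark handle) bind_rows does not apply, which is the case the join-based replacement is aimed at.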