Methods to reliably use dplyr on remote data sources in R (SQL databases, Spark 2.0.0 and above) in a generic fashion.
replyr is going into maintenance mode. It has been hard to track shifting dplyr/dbplyr/rlang APIs and data structures post dplyr 0.5.
Most of what it does is now done better in one of the newer non-monolithic packages:
Programming and meta-programming tools: wrapr (https://CRAN.R-project.org/package=wrapr).
Adapting dplyr to standard evaluation interfaces: seplyr (https://CRAN.R-project.org/package=seplyr).
Big data manipulation: rquery (https://CRAN.R-project.org/package=rquery) and cdata (https://CRAN.R-project.org/package=cdata).
replyr helps with the following:
Summarizing remote data (via replyr_summarize).
Facilitating writing "source generic" code that works similarly on multiple 'dplyr' data sources.
Providing big data versions of functions for splitting data, binding rows, pivoting, adding row-ids, ranking, and completing experimental designs.
Packaging common data manipulation tasks into operators such as the gapply function.
Providing support code for common SparklyR tasks, such as tracking temporary handle IDs.
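As a rough illustration of what "source generic" code means, the sketch below runs the same dplyr pipeline against a local data frame and against a database table. It uses only plain dplyr/dbplyr verbs and is not replyr's own implementation. The helper name summarize_by_group, the column names g and x, and the in-memory SQLite connection are all assumptions made for this example, and it assumes the DBI, RSQLite, and dbplyr packages are installed.

```r
library(dplyr)

# Hypothetical helper: uses only verbs that dplyr can run locally
# and that dbplyr can translate to SQL for a remote table.
summarize_by_group <- function(d) {
  d %>%
    group_by(g) %>%
    summarize(n = n(), mean_x = mean(x, na.rm = TRUE)) %>%
    arrange(g) %>%
    collect()  # no-op for local data; pulls results into R for remote data
}

# Illustrative data (hypothetical columns g and x).
local_data <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# An in-memory SQLite database stands in for a remote source.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
remote_data <- copy_to(con, local_data, "example_table")

local_result <- summarize_by_group(local_data)
remote_result <- summarize_by_group(remote_data)

DBI::dbDisconnect(con)
```

The two results should agree in their values, although column types (for example, how counts are represented) can differ between back ends. Smoothing over that kind of difference is part of what the list above describes.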
replyr is in maintenance mode. Better versions of its functionality have been ported to the following packages: wrapr, cdata, rquery, and seplyr.
To learn more about replyr, please start with the vignette:
vignette('replyr','replyr')