Partitions a data frame by the values in a grouping column and returns a list of the pieces. Advised only for a moderate number of groups, and works better if the grouping column is an index. Together with lapply and replyr::replyr_bind_rows, this is powerful enough to implement "The Split-Apply-Combine Strategy for Data Analysis" https://www.jstatsoft.org/article/view/v040i01

replyr_split(
  df,
  gcolumn,
  ...,
  ocolumn = NULL,
  decreasing = FALSE,
  partitionMethod = "extract",
  maxgroups = 100,
  eagerCompute = FALSE
)

Arguments

df

remote dplyr data item

gcolumn

grouping column

...

force later values to be bound by name

ocolumn

ordering column (optional)

decreasing

if TRUE sort in decreasing order by ocolumn

partitionMethod

method used to partition the data: either 'split' (works only on local data frames) or 'extract'

maxgroups

maximum number of groups to work over

eagerCompute

if TRUE call compute on split results

Value

list of data items

Examples

d <- data.frame(group=c(1,1,2,2,2),
                order=c(.1,.2,.3,.4,.5),
                values=c(10,20,2,4,8))
dSplit <- replyr_split(d, 'group', partitionMethod='extract')
dApp <- lapply(dSplit, function(di) data.frame(as.list(colMeans(di))))
replyr_bind_rows(dApp)
#>   group order    values
#> 1     1  0.15 15.000000
#> 2     2  0.40  4.666667
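
For comparison, the same split-apply-combine steps can be sketched in base R for a local data frame. This is only an illustration of the pattern, not replyr's implementation: it assumes the data is local and uses split, lapply, and do.call(rbind, ...) in place of replyr_split and replyr_bind_rows. The names parts, means, and result are introduced here for the sketch.

```r
# Base R sketch of split-apply-combine on a local data frame.
# Assumption: this stands in for the replyr calls above; it does not
# work on remote dplyr sources.
d <- data.frame(
  group  = c(1, 1, 2, 2, 2),
  order  = c(.1, .2, .3, .4, .5),
  values = c(10, 20, 2, 4, 8)
)

# Split: one data frame per distinct value of the grouping column.
parts <- split(d, d$group)

# Apply: summarise each piece as a one-row data frame of column means.
means <- lapply(parts, function(di) data.frame(as.list(colMeans(di))))

# Combine: stack the per-group rows back into a single data frame.
result <- do.call(rbind, means)
rownames(result) <- NULL  # drop the group labels that rbind adds as row names
print(result)
```

The result has one row per group, matching the shape of the replyr output above.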