Partitions a data frame by the values in a grouping column and returns a list of the pieces. Only advised for a moderate number of groups, and works better if the grouping column is indexed. Combined with lapply and replyr_bind_rows, this is powerful enough to implement "The Split-Apply-Combine Strategy for Data Analysis" (https://www.jstatsoft.org/article/view/v040i01).
replyr_split( df, gcolumn, ..., ocolumn = NULL, decreasing = FALSE, partitionMethod = "extract", maxgroups = 100, eagerCompute = FALSE )
Argument | Description |
---|---|
df | remote dplyr data item |
gcolumn | grouping column |
... | not used; forces later arguments to be bound by name |
ocolumn | ordering column (optional) |
decreasing | if TRUE sort in decreasing order by ocolumn |
partitionMethod | method to partition the data, one of 'split' (only works over local data frames), or 'extract' |
maxgroups | maximum number of groups to work over |
eagerCompute | if TRUE call compute on split results |
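To make the arguments above concrete, here is a minimal sketch of what extract-style partitioning could look like on a local data frame: find the distinct group values, then filter once per value. The helper `split_by_extract` is hypothetical and is not replyr's implementation. In particular, raising an error when `maxgroups` is exceeded is an assumption; the table only says it is the maximum number of groups to work over.

```r
# Hypothetical helper (not part of replyr): extract-style partitioning
# of a local data frame, one filter per distinct group value.
split_by_extract <- function(df, gcolumn, ocolumn = NULL,
                             decreasing = FALSE, maxgroups = 100) {
  # Distinct group values, in sorted order (sort() drops NA).
  groups <- sort(unique(df[[gcolumn]]))
  # Assumption: refuse to proceed when there are too many groups.
  if (!is.null(maxgroups) && length(groups) > maxgroups) {
    stop("number of groups exceeds maxgroups")
  }
  res <- lapply(groups, function(g) {
    # Keep only the rows belonging to group g.
    di <- df[which(df[[gcolumn]] == g), , drop = FALSE]
    # Optionally order the rows within the group.
    if (!is.null(ocolumn)) {
      di <- di[order(di[[ocolumn]], decreasing = decreasing), , drop = FALSE]
    }
    di
  })
  names(res) <- as.character(groups)
  res
}

d <- data.frame(group = c(1, 1, 2, 2, 2),
                order = c(.1, .2, .3, .4, .5),
                values = c(10, 20, 2, 4, 8))
parts <- split_by_extract(d, "group", ocolumn = "order", decreasing = TRUE)
```

Filtering once per group means one pass over the data for every group, which is one plausible reason the function is only advised for a moderate number of groups.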
list of data items
d <- data.frame(group=c(1,1,2,2,2),
                order=c(.1,.2,.3,.4,.5),
                values=c(10,20,2,4,8))
dSplit <- replyr_split(d, 'group', partitionMethod='extract')
dApp <- lapply(dSplit, function(di) data.frame(as.list(colMeans(di))))
replyr_bind_rows(dApp)
#>   group order    values
#> 1     1  0.15 15.000000
#> 2     2  0.40  4.666667
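For comparison, the same split-apply-combine steps can be sketched for a purely local data frame using only base R. This is an illustrative equivalent, not how replyr works internally: `split`, `lapply`, and `do.call(rbind, ...)` stand in for `replyr_split`, the apply step, and `replyr_bind_rows`.

```r
d <- data.frame(group = c(1, 1, 2, 2, 2),
                order = c(.1, .2, .3, .4, .5),
                values = c(10, 20, 2, 4, 8))

# Split: partition the local data frame by the grouping column.
pieces <- split(d, d$group)

# Apply: compute column means for each piece as a one-row data frame.
means <- lapply(pieces, function(di) data.frame(as.list(colMeans(di))))

# Combine: stack the one-row data frames back into a single data frame.
result <- do.call(rbind, means)
```

The base R version only works on in-memory data, whereas the `df` argument of `replyr_split` is documented as a remote dplyr data item.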