Solve for a good set of right-exclusive x-cuts such that the overall graph of y~x is well-approximated by a piecewise linear function. The solution is ready for use with base::findInterval() and stats::approx() (demonstrated in the examples).

solve_for_partitionc(
  x,
  y,
  ...,
  w = NULL,
  penalty = 0,
  min_n_to_chunk = 1000,
  min_seg = 1,
  max_k = length(x)
)

Arguments

x

numeric, input variable (no NAs).

y

numeric, result variable (no NAs, same length as x).

...

not used; forces later arguments to be passed by name.

w

numeric, weights (no NAs, positive, same length as x).

penalty

per-segment cost penalty.

min_n_to_chunk

minimum n at which to subdivide the problem.

min_seg

positive integer, minimum segment size.

max_k

maximum segments to divide into.
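
To make the roles of penalty and min_seg concrete, here is a minimal base-R sketch of a penalized piecewise constant fit solved by dynamic programming. It is an illustration only, not the package's implementation: the function name sketch_partition_constant is hypothetical, the cost is assumed to be squared error plus penalty once per segment, and w, min_n_to_chunk, and max_k are not modeled.

```r
# Illustrative sketch only (hypothetical helper, not part of the package).
# Assumed objective: sum of squared errors around each segment mean,
# plus `penalty` once per segment.
sketch_partition_constant <- function(x, y, penalty = 0, min_seg = 1) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  n <- length(y)
  # prefix sums give the squared error of any segment in constant time
  s1 <- c(0, cumsum(y))
  s2 <- c(0, cumsum(y^2))
  seg_cost <- function(i, j) {
    # cost of fitting mean(y[i:j]) to y[i:j], plus the per-segment penalty
    m <- j - i + 1
    sum_y <- s1[j + 1] - s1[i]
    (s2[j + 1] - s2[i]) - sum_y^2 / m + penalty
  }
  best <- c(0, rep(Inf, n))  # best[j + 1]: lowest cost for the first j points
  prev <- integer(n)         # prev[j]: start index of the last segment ending at j
  for (j in seq_len(n)) {
    for (i in seq_len(j)) {
      if (j - i + 1 < min_seg) next  # segment i..j would be too short
      cand <- best[i] + seg_cost(i, j)
      if (cand < best[j + 1]) {
        best[j + 1] <- cand
        prev[j] <- i
      }
    }
  }
  # walk back from the last point to recover the segment start indices
  starts <- integer(0)
  j <- n
  while (j > 0) {
    starts <- c(prev[j], starts)
    j <- prev[j] - 1
  }
  x[starts]  # left (inclusive) x-cut of each segment
}

d <- data.frame(x = 1:8, y = c(-1, -1, -1, -1, 1, 1, 1, 1))
cuts_small <- sketch_partition_constant(d$x, d$y, penalty = 1)
cuts_large <- sketch_partition_constant(d$x, d$y, penalty = 10)
print(cuts_small)
print(cuts_large)
```

Under these assumptions, penalty = 1 yields cuts at x = 1 and x = 5 (two segments), while penalty = 10 makes a second segment cost more than the error it removes, so only the cut at x = 1 remains.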

Value

a data frame appropriate for stats::approx(), with columns x, pred, group, and what (see the examples).

Examples

# example data
d <- data.frame(
  x = 1:8,
  y = c(-1, -1, -1, -1, 1, 1, 1, 1))

# solve for break points
soln <- solve_for_partitionc(d$x, d$y)
# show solution
print(soln)
#>   x pred group  what
#> 1 1   -1     1  left
#> 2 2   -1     1 right
#> 3 3   -1     2  left
#> 4 4   -1     2 right
#> 5 5    1     3  left
#> 6 6    1     3 right
#> 7 7    1     4  left
#> 8 8    1     4 right

# label each point
d$group <- base::findInterval(
  d$x,
  soln$x[soln$what=='left'])
# apply piecewise approximation
d$estimate <- stats::approx(
  soln$x,
  soln$pred,
  xout = d$x,
  method = 'constant',
  rule = 2)$y
# show result
print(d)
#>   x  y group estimate
#> 1 1 -1     1       -1
#> 2 2 -1     1       -1
#> 3 3 -1     2       -1
#> 4 4 -1     2       -1
#> 5 5  1     3        1
#> 6 6  1     3        1
#> 7 7  1     4        1
#> 8 8  1     4        1
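
The same two calls can also score x values that were not in the training data. The sketch below uses only base R: soln is typed in by hand from the printed solution in the examples (so it runs without the package), and new_x is a hypothetical set of query points, including values outside the observed range.

```r
# solution data frame copied by hand from the printed example output
soln <- data.frame(
  x = 1:8,
  pred = c(-1, -1, -1, -1, 1, 1, 1, 1),
  group = c(1, 1, 2, 2, 3, 3, 4, 4),
  what = rep(c("left", "right"), times = 4))

# hypothetical query points: below the range, between cuts, above the range
new_x <- c(0, 2.5, 4.5, 9)

# segment label for each query point (0 means left of the first cut)
new_group <- base::findInterval(
  new_x,
  soln$x[soln$what == "left"])

# step-function estimate; rule = 2 extends the end values beyond the range
new_estimate <- stats::approx(
  soln$x,
  soln$pred,
  xout = new_x,
  method = "constant",
  rule = 2)$y

print(data.frame(x = new_x, group = new_group, estimate = new_estimate))
```

With this hand-copied solution the groups come out as 0, 1, 2, 4 and the estimates as -1, -1, -1, 1. The point at 4.5 still takes the value -1 because the step up happens at x = 5.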