The substitution modes

wrapr::let() now has three substitution implementations:

  • Language substitution (subsMethod='langsubs' the new default). In this mode user code is captured as an abstract syntax tree (or parse tree) and substitution is performed only on nodes known to be symbols or behaving in a symbol-role ("X" in d$"X" is one such example).
  • Substitute substitution (subsMethod='subsubs'). In this mode substitution is performed by R’s own base::substitute().
  • String substitution (subsMethod='stringsubs', the previous default now deprecated). In this mode user code is captured as text and then string replacement on word-boundaries is used to substitute in variable re-mappings.

The semantics of the three methods can be illustrated by showing the effects of substituting the variable name “y” for “X” and the function “sin” for “F” in the somewhat complicated block of statements:

  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  }

This block a lot of different examples and corner-cases.

Language substitution (subsMethod='langsubs')

library("wrapr")

let(
  c(X = 'y', F = 'sin'),
  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  },
  eval = FALSE, subsMethod = 'langsubs')
## {
##     d <- data.frame(y = "X", X2 = "XX", d = y * y, .X = X_)
##     y <- list(y = d$y, X2 = d$y, v1 = y, v2 = ` X`, sin(1:2))
##     y$a
##     "X"$a
##     y = function(y, ...) {
##         y + 1
##     }
## }

Notice the substitution replaced all symbol-like uses of “X”, and only these (including correctly working with some that were quoted!).

String substitution (subsMethod='stringsubs')

let(
  c(X = 'y', F = 'sin'),
  {
    d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
    X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
    X$a
    "X"$a
    X = function(X, ...) { X + 1 }
  },
  eval = FALSE, subsMethod = 'stringsubs')
## expression({
##     d <- data.frame(y = "y", X2 = "XX", d = y * y, .y = X_)
##     y <- list(y = d$y, X2 = d$y, v1 = y, v2 = ` y`, sin(1:2))
##     y$a
##     "y"$a
##     y = function(y, ...) {
##         y + 1
##     }
## })

Notice string substitution has a few flaws: it went after variable names that appeared to start with a word-boundary (the cases where the variable name started with a dot or a space). Substitution also occurred in some string constants (which as we have seen could be considered a good thing).

These situations are all avoidable as both the code inside the let-block and the substitution targets are chosen by the programmer, so they can be chosen to be simple and mutually consistent. We suggest “ALL_CAPS” style substitution targets as they jump out as being macro targets. But, of course, it is better to have stricter control on substitution.

Think of the language substitution implementation as a lower-bound on a perfect implementation (cautious, with a few corner cases to get coverage) and string substitution as an upper bound on a perfect implementation (aggressive, with a few over-reaches).

Substitute substitution (subsMethod='subsubs')

let(c(X = 'y', F = 'sin'),
    {
      d <- data.frame("X" = "X", X2 = "XX", d = X*X, .X = X_)
      X <- list(X = d$X, X2 = d$"X", v1 = `X`, v2 = ` X`, F(1:2))
      X$a
      "X"$a
      X = function(X, ...) { X + 1 }
    },
    eval = FALSE, subsMethod = 'subsubs')
## {
##     d <- data.frame(X = "X", X2 = "XX", d = y * y, .X = X_)
##     y <- list(X = d$y, X2 = d$X, v1 = y, v2 = ` X`, sin(1:2))
##     y$a
##     "X"$a
##     y = function(X, ...) {
##         y + 1
##     }
## }

Notice base::substitute() doesn’t re-write left-hand-sides of argument bindings. This is why I originally didn’t consider using this implementation. Re-writing left-hand-sides of assignments is critical in expressions such as dplyr::mutate( RESULTCOL = INPUTCOL + 1). Also base::substitute() doesn’t special case the d$"X" situation (but that really isn’t very important).

Conclusion

wrapr::let() when used prudently is a safe and powerful tool.