This is my (new)1 “New” also meaning we need the latest versions of wrapr and seplyr. favorite way to bulk rename columns in R, especially when using sparklyr to work with big data.
There are, of course, many other ways to do this.
Consider the following data:2 Source code for this note can be found here.
class(d)
## [1] "tbl_spark" "tbl_sql" "tbl_lazy" "tbl"
d %.>%
head(.) %.>%
knitr::kable(.)
| name | height | mass | hair_color | skin_color | eye_color | birth_year | gender | homeworld | species |
|---|---|---|---|---|---|---|---|---|---|
| Luke Skywalker | 172 | 77 | blond | fair | blue | 19.0 | male | Tatooine | Human |
| C-3PO | 167 | 75 | NA | gold | yellow | 112.0 | NA | Tatooine | Droid |
| R2-D2 | 96 | 32 | NA | white, blue | red | 33.0 | NA | Naboo | Droid |
| Darth Vader | 202 | 136 | none | white | yellow | 41.9 | male | Tatooine | Human |
| Leia Organa | 150 | 49 | brown | light | brown | 19.0 | female | Alderaan | Human |
| Owen Lars | 178 | 120 | brown, grey | light | blue | 52.0 | male | Tatooine | Human |
Notice this data is a remote Spark data handle (not an in-memory data.frame).
Suppose (for no reason at all) we want all columns with under-bars to be in upper case.
This is easy to achieve:3 People have asked, so I will discuss it (again). A lot of the other methods for renaming columns do not work with Spark data. For example, using the current CRAN versions of rlang (0.1.4), dplyr (0.7.4), dbplyr (1.1.0), sparklyr (0.6.4): both rlang::set_names(d, toupper), and dplyr::rename(d, HAIR_COLOR=hair_color, SKIN_COLOR=skin_color) throw when given Spark data handles. This is a topic of some discussion: 1, 2, 3, 4, and 5.
library("seplyr")
# requires version at least 1.0.1
packageVersion("wrapr")
## [1] '1.0.1'
# requires version at least 0.1.6
packageVersion("seplyr")
## [1] '0.1.6'
# find columns matching our condition
colsToRename <- grepdf(".*_.*", d)
print(colsToRename)
## [1] "hair_color" "skin_color" "eye_color" "birth_year"
# build the right to left (NEW := old) assignment
mapping <- toupper(colsToRename) := colsToRename
print(mapping)
## HAIR_COLOR SKIN_COLOR EYE_COLOR BIRTH_YEAR
## "hair_color" "skin_color" "eye_color" "birth_year"
# apply the transform
d %.>%
rename_se(., mapping) %.>%
head(.) %.>%
knitr::kable(.)
| name | height | mass | HAIR_COLOR | SKIN_COLOR | EYE_COLOR | BIRTH_YEAR | gender | homeworld | species |
|---|---|---|---|---|---|---|---|---|---|
| Luke Skywalker | 172 | 77 | blond | fair | blue | 19.0 | male | Tatooine | Human |
| C-3PO | 167 | 75 | NA | gold | yellow | 112.0 | NA | Tatooine | Droid |
| R2-D2 | 96 | 32 | NA | white, blue | red | 33.0 | NA | Naboo | Droid |
| Darth Vader | 202 | 136 | none | white | yellow | 41.9 | male | Tatooine | Human |
| Leia Organa | 150 | 49 | brown | light | brown | 19.0 | female | Alderaan | Human |
| Owen Lars | 178 | 120 | brown, grey | light | blue | 52.0 | male | Tatooine | Human |