Renaming Columns in R

John Mount, Win-Vector LLC

2017-11-17

This is my (new)1 “New” also meaning we need the latest versions of wrapr and seplyr. favorite way to bulk rename columns in R, especially when using sparklyr to work with big data.

There are, of course, many other ways to do this.

Consider the following data:2 Source code for this note can be found here.

class(d)
## [1] "tbl_spark" "tbl_sql"   "tbl_lazy"  "tbl"
d %.>%
  head(.) %.>%
  knitr::kable(.)
name height mass hair_color skin_color eye_color birth_year gender homeworld species
Luke Skywalker 172 77 blond fair blue 19.0 male Tatooine Human
C-3PO 167 75 NA gold yellow 112.0 NA Tatooine Droid
R2-D2 96 32 NA white, blue red 33.0 NA Naboo Droid
Darth Vader 202 136 none white yellow 41.9 male Tatooine Human
Leia Organa 150 49 brown light brown 19.0 female Alderaan Human
Owen Lars 178 120 brown, grey light blue 52.0 male Tatooine Human

Notice this data is a remote Spark data handle (not an in-memory data.frame).

Suppose (for no reason at all) we want all columns with under-bars to be in upper case.

This is easy to achieve:3 People have asked, so I will discuss it (again). A lot of the other methods for renaming columns do not work with Spark data. For example, using the current CRAN versions of rlang (0.1.4), dplyr (0.7.4), dbplyr (1.1.0), sparklyr (0.6.4): both rlang::set_names(d, toupper), and dplyr::rename(d, HAIR_COLOR=hair_color, SKIN_COLOR=skin_color) throw when given Spark data handles. This is a topic of some discussion: 1, 2, 3, 4, and 5.

library("seplyr")
# requires version at least 1.0.1
packageVersion("wrapr") 
## [1] '1.0.1'
# requires version at least 0.1.6
packageVersion("seplyr") 
## [1] '0.1.6'
# find columns matching our condition
colsToRename <- grepdf(".*_.*", d)
print(colsToRename)
## [1] "hair_color" "skin_color" "eye_color"  "birth_year"
# build the right to left (NEW := old) assignment
mapping <- toupper(colsToRename) := colsToRename
print(mapping)
##   HAIR_COLOR   SKIN_COLOR    EYE_COLOR   BIRTH_YEAR 
## "hair_color" "skin_color"  "eye_color" "birth_year"
# apply the transform
d %.>% 
  rename_se(., mapping) %.>%
  head(.) %.>%
  knitr::kable(.)
name height mass HAIR_COLOR SKIN_COLOR EYE_COLOR BIRTH_YEAR gender homeworld species
Luke Skywalker 172 77 blond fair blue 19.0 male Tatooine Human
C-3PO 167 75 NA gold yellow 112.0 NA Tatooine Droid
R2-D2 96 32 NA white, blue red 33.0 NA Naboo Droid
Darth Vader 202 136 none white yellow 41.9 male Tatooine Human
Leia Organa 150 49 brown light brown 19.0 female Alderaan Human
Owen Lars 178 120 brown, grey light blue 52.0 male Tatooine Human