Compute per-column summaries of a database table and return them as a data.frame. Warning: this can be an expensive operation.

rsummary(
  db,
  tableName,
  ...,
  countUniqueNum = FALSE,
  quartiles = FALSE,
  cols = NULL,
  qualifiers = NULL
)

Arguments

db

database connection.

tableName

name of table.

...

not used; forces later arguments to be bound by name.

countUniqueNum

logical, if TRUE include unique non-NA counts for numeric cols.

quartiles

logical, if TRUE add Q1 (25%), median (50%), Q3 (75%) quartiles.

cols

if not NULL, the set of columns to restrict the summary to.

qualifiers

optional named ordered vector of strings carrying additional db hierarchy terms, such as schema.

Value

data.frame summary of columns.

Details

For numeric columns, NaN values are included in the nna count (as is typical for R, where is.na(NaN) is TRUE).
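The convention above can be checked in base R without a database. The following is a small illustration only; the vector `x` and the name `n_missing` are made up for this sketch and are not part of the package.

```r
# Base R counts NaN as missing when is.na() is used.
x <- c(1, NA, NaN, 4)

is.na(x)    # FALSE  TRUE  TRUE FALSE  (both NA and NaN are flagged)
is.nan(x)   # FALSE FALSE  TRUE FALSE  (only NaN is flagged)

# A count built on is.na() therefore includes NaN entries.
n_missing <- sum(is.na(x))
print(n_missing)   # 2
```

A missing-value count based on is.na() thus counts both NA and NaN entries.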

Examples

if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  d <- data.frame(p = c(TRUE, FALSE, NA),
                  s = NA,
                  w = 1:3,
                  x = c(NA, 2, 3),
                  y = factor(c(3, 5, NA)),
                  z = c('a', NA, 'a'),
                  stringsAsFactors = FALSE)
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  rq_copy_to(db, "dRemote", d,
             overwrite = TRUE, temporary = TRUE)
  print(rsummary(db, "dRemote"))
  DBI::dbDisconnect(db)
}
#>   column index     class nrows nna nunique min max mean        sd lexmin lexmax
#> 1      p     1   integer     3   1      NA   0   1  0.5 0.7071068   <NA>   <NA>
#> 2      s     2   integer     3   3       0  NA  NA   NA        NA   <NA>   <NA>
#> 3      w     3   integer     3   0      NA   1   3  2.0 1.0000000   <NA>   <NA>
#> 4      x     4   numeric     3   1      NA   2   3  2.5 0.7071068   <NA>   <NA>
#> 5      y     5 character     3   1       2  NA  NA   NA        NA      3      5
#> 6      z     6 character     3   1       1  NA  NA   NA        NA      a      a
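
The optional arguments can be combined in the same way. The sketch below assumes that the rquery, DBI, and RSQLite packages are installed. It restricts the summary to two columns with cols and requests unique counts for numeric columns with countUniqueNum. The table contents and the variable name `res` are illustrative only, and no output is shown because it has not been verified here.

```r
# Sketch only: assumes rquery, DBI, and RSQLite are installed.
if (requireNamespace("rquery", quietly = TRUE) &&
    requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Small illustrative table (made-up data).
  d <- data.frame(w = 1:3,
                  x = c(NA, 2, 3),
                  z = c("a", NA, "a"),
                  stringsAsFactors = FALSE)
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  rquery::rq_copy_to(db, "dRemote", d,
                     overwrite = TRUE, temporary = TRUE)
  # Summarize only columns w and x, and ask for unique
  # non-NA counts on numeric columns as well.
  res <- rquery::rsummary(db, "dRemote",
                          cols = c("w", "x"),
                          countUniqueNum = TRUE)
  print(res)
  DBI::dbDisconnect(db)
}
```

Restricting cols is one way to limit the cost noted in the warning at the top of this page, since fewer columns need to be scanned.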