Compute per-column summaries and return as a data.frame. Warning: can be an expensive operation.
rsummary( db, tableName, ..., countUniqueNum = FALSE, quartiles = FALSE, cols = NULL, qualifiers = NULL )
Argument | Description |
---|---|
db | database connection. |
tableName | name of table. |
... | force additional arguments to be bound by name. |
countUniqueNum | logical, if TRUE include unique non-NA counts for numeric cols. |
quartiles | logical, if TRUE add Q1 (25%), median (50%), Q3 (75%) quartiles. |
cols | if not NULL, the set of columns to restrict the summary to. |
qualifiers | optional named ordered vector of strings carrying additional db hierarchy terms, such as schema. |
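
As a hedged illustration of the optional arguments, the sketch below restricts the summary to two columns with cols and requests unique non-NA counts for numeric columns with countUniqueNum. It assumes the rquery, DBI, and RSQLite packages are installed and uses an in-memory SQLite database; the table name "dExample" and the variable res are made up for this illustration, and no output is shown because it was not verified here.

```r
# Sketch only: assumes rquery, DBI, and RSQLite are installed.
# "dExample" and res are hypothetical names used for illustration.
res <- NULL
if (requireNamespace("rquery", quietly = TRUE) &&
    requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  d <- data.frame(w = 1:3, x = c(NA, 2, 3), z = c("a", NA, "a"),
                  stringsAsFactors = FALSE)
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  rquery::rq_copy_to(db, "dExample", d, overwrite = TRUE, temporary = TRUE)
  # Summarize only columns w and x, and count unique non-NA numeric values.
  res <- rquery::rsummary(db, "dExample",
                          countUniqueNum = TRUE,
                          cols = c("w", "x"))
  print(res)
  DBI::dbDisconnect(db)
}
```

Restricting cols is one way to limit the work done on wide tables, which matters given the warning that this can be an expensive operation.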
data.frame summary of columns.
For numeric columns, the nna count includes NaN values (as is typical for R, e.g., is.na(NaN) is TRUE).
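
The base R behavior behind the nna note can be checked locally. The following is a minimal sketch using only base R; the vector x and the variable nna_local are illustrative names, and the snippet shows R's own NA handling rather than how rsummary computes its counts in the database.

```r
# Base R treats NaN as missing for the purposes of is.na().
x <- c(1, NA, NaN, 4)

is.na(x)   # TRUE for both the NA and the NaN entries
is.nan(x)  # TRUE only for the NaN entry

# Counting missing values the usual R way therefore includes NaN.
nna_local <- sum(is.na(x))
print(nna_local)
```

Here nna_local is 2, because both NA and NaN are counted as missing.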
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  d <- data.frame(p = c(TRUE, FALSE, NA),
                  s = NA,
                  w = 1:3,
                  x = c(NA, 2, 3),
                  y = factor(c(3, 5, NA)),
                  z = c('a', NA, 'a'),
                  stringsAsFactors = FALSE)
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  rq_copy_to(db, "dRemote", d,
             overwrite = TRUE, temporary = TRUE)
  print(rsummary(db, "dRemote"))
  DBI::dbDisconnect(db)
}
#>   column index     class nrows nna nunique min max mean        sd lexmin lexmax
#> 1      p     1   integer     3   1      NA   0   1  0.5 0.7071068   <NA>   <NA>
#> 2      s     2   integer     3   3       0  NA  NA   NA        NA   <NA>   <NA>
#> 3      w     3   integer     3   0      NA   1   3  2.0 1.0000000   <NA>   <NA>
#> 4      x     4   numeric     3   1      NA   2   3  2.5 0.7071068   <NA>   <NA>
#> 5      y     5 character     3   1       2  NA  NA   NA        NA      3      5
#> 6      z     6 character     3   1       1  NA  NA   NA        NA      a      a