refer to NEWS.md for details of this update.
1 parent e0ef585
commit 191ecc8
Showing 18 changed files with 181 additions and 68 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,17 @@ | ||
Package: cleandata | ||
Type: Package | ||
Title: To Inspect, Impute, Encode, and Partition Data; and to Keep Track of This Process | ||
Version: 0.2.0 | ||
Title: To Inspect and Manipulate Data; and to Keep Track of This Process | ||
Version: 0.3.0 | ||
Author: Sherry Zhao | ||
Maintainer: Sherry Zhao <[email protected]> | ||
Description: Functions to work with data frames to prepare data for further analysis. | ||
The functions for imputation, encoding, and Partitioning can produce log files to keep track of data manipulation process. | ||
The functions for imputation, encoding, partitioning, and other manipulation can produce log files to keep track of process. | ||
BugReports: https://github.com/sherrisherry/cleandata/issues | ||
URL: https://github.com/sherrisherry/cleandata | ||
Depends: R (>= 3.0.0) | ||
Imports: stats | ||
Suggests: rmarkdown, knitr | ||
Suggests: R.rsp | ||
License: MIT + file LICENSE | ||
Encoding: UTF-8 | ||
VignetteBuilder: knitr | ||
VignetteBuilder: R.rsp | ||
LazyData: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
YEAR: 2018 | ||
YEAR: 2018 - 2019 | ||
COPYRIGHT HOLDER: Xiaoli (Sherry) Zhao |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
exportPattern("^inspect_.+|^encode_.+|^impute_.+|^partition_.+") | ||
exportPattern("^(.$|[^i].+|i[^n].*|in[^_].+)") |
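The new export pattern is easier to read as "export everything except names that start with `in_`". The sketch below checks this with `grepl`, assuming `exportPattern` matches object names the same way. The candidate list is illustrative: `in_log2`, `in_mode`, and `in_msg1` appear elsewhere in this diff, while `inspect_na` is only assumed to be one of the package's `inspect_` functions.

```r
# New pattern from NAMESPACE, copied verbatim from the diff
pattern <- "^(.$|[^i].+|i[^n].*|in[^_].+)"

# Illustrative object names: three public-looking, three internal helpers
candidates <- c("impute_mode", "inspect_na", "wh_dict",
                "in_log2", "in_mode", "in_msg1")

# Keep only the names the pattern would export
exported <- candidates[grepl(pattern, candidates)]
print(exported)
# [1] "impute_mode" "inspect_na"  "wh_dict"
```

One side effect of writing the exclusion this way: the two-character name `in`, and three-character names such as `inx`, match none of the alternatives and would also stay unexported.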
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,23 @@ | ||
# impute NAs in factorial columns by the mode of corresponding columns | ||
impute_mode<-function(x,cols=colnames(x),idx=row.names(x),log=FALSE){ | ||
if(is.null(dim(x)))stop('data frame degraded to vector, use df[ , , drop=FALSE]') | ||
for(i in 1:length(cols))x[is.na(x[,cols[i]]),cols[i]]<-names(which.max(table(x[idx,cols[i]]))) | ||
if(is.list(log))log_plan2(x = x, cols = cols, log = log, method = 'Mode') | ||
impute_mode<-function(x,cols=colnames(x),idx=row.names(x),log = eval.parent(in_log_default)){ | ||
if(is.null(dim(x)))stop(in_msg1) | ||
for(i in 1:length(cols))x[is.na(x[,cols[i]]),cols[i]]<-in_mode(x[idx,cols[i]]) | ||
if(is.list(log))in_log2(x = x, cols = cols, log = log, method = 'Mode') | ||
return(x) | ||
} | ||
|
||
# impute NAs in numerical columns by the median of corresponding columns | ||
impute_median<-function(x,cols=colnames(x),idx=row.names(x),log=FALSE){ | ||
if(is.null(dim(x)))stop('data frame degraded to vector, use df[ , , drop=FALSE]') | ||
impute_median<-function(x,cols=colnames(x),idx=row.names(x), log = eval.parent(in_log_default)){ | ||
if(is.null(dim(x)))stop(in_msg1) | ||
for(i in 1:length(cols))x[is.na(x[,cols[i]]),cols[i]]<-stats::median(x[idx,cols[i]],na.rm = TRUE) | ||
if(is.list(log))log_plan2(x = x, cols = cols, log = log, method = 'Median') | ||
if(is.list(log))in_log2(x = x, cols = cols, log = log, method = 'Median') | ||
return(x) | ||
} | ||
|
||
# impute NAs in numerical columns by the mean of corresponding columns | ||
impute_mean<-function(x,cols=colnames(x),idx=row.names(x),log=FALSE){ | ||
if(is.null(dim(x)))stop('data frame degraded to vector, use df[ , , drop=FALSE]') | ||
impute_mean<-function(x,cols=colnames(x),idx=row.names(x), log = eval.parent(in_log_default)){ | ||
if(is.null(dim(x)))stop(in_msg1) | ||
for(i in 1:length(cols))x[is.na(x[,cols[i]]),cols[i]]<-mean(x[idx,cols[i]],na.rm = TRUE) | ||
if(is.list(log))log_plan2(x = x, cols = cols, log = log, method = 'Mean') | ||
if(is.list(log))in_log2(x = x, cols = cols, log = log, method = 'Mean') | ||
return(x) | ||
} |
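A minimal sketch of the imputation logic above, run on a made-up data frame. The body of `in_mode` is not shown in this diff, so `mode_of` below is a hypothetical stand-in that reuses the expression the old code had inline (`names(which.max(table(...)))`). The last lines show why the functions check `is.null(dim(x))`: selecting a single column without `drop = FALSE` degrades the data frame to a vector.

```r
# Hypothetical stand-in for the package's internal in_mode helper
mode_of <- function(v) names(which.max(table(v)))

# Toy data (an assumption for illustration): one factor and one numeric column with NAs
df <- data.frame(color = factor(c("red", "blue", NA, "red")),
                 size  = c(1, NA, 3, 5))

# Mode imputation for the factor column, as in impute_mode
df[is.na(df[, "color"]), "color"] <- mode_of(df[, "color"])

# Median imputation for the numeric column, as in impute_median
df[is.na(df[, "size"]), "size"] <- stats::median(df[, "size"], na.rm = TRUE)

# Single-column selection: a vector without drop = FALSE, a data frame with it
degraded  <- is.null(dim(df[, "size"]))
preserved <- is.null(dim(df[, "size", drop = FALSE]))
```

After the two assignments, `df` has no missing values: the missing `color` becomes `"red"` and the missing `size` becomes `3`.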
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
|
||
wh_dict <- function(x, attr, value){ | ||
if(missing(attr) || missing(value))stop('Please supply attr and value') | ||
lv <- unique(x[, attr]) | ||
dictionary <- data.frame(lv) | ||
colnames(dictionary) <- attr | ||
dictionary$Keys <- NA | ||
for(i in 1:length(lv))dictionary[i, 'Keys'] <- as.character(x[x[,attr]==lv[i], value][1]) | ||
return(dictionary) | ||
} |
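A hedged usage sketch for `wh_dict`: the function body is reproduced from the diff so the block runs on its own, with one change, `seq_along(lv)` in place of `1:length(lv)`, so the loop is skipped for empty input. The `codes` data frame and its column names are made up for illustration.

```r
# wh_dict as shown in the diff, with seq_along() swapped in for 1:length()
wh_dict <- function(x, attr, value){
  if(missing(attr) || missing(value)) stop('Please supply attr and value')
  lv <- unique(x[, attr])            # distinct values of the attribute column
  dictionary <- data.frame(lv)
  colnames(dictionary) <- attr
  dictionary$Keys <- NA
  # for each distinct value, record the first matching entry of the value column
  for(i in seq_along(lv)) dictionary[i, 'Keys'] <- as.character(x[x[, attr] == lv[i], value][1])
  return(dictionary)
}

# Toy data (an assumption): each grade maps to one score
codes <- data.frame(grade = c("A", "B", "A", "C"),
                    score = c(4, 3, 4, 2),
                    stringsAsFactors = FALSE)

d <- wh_dict(codes, attr = "grade", value = "score")
print(d)
```

The result has one row per distinct `grade`, with `Keys` holding the first `score` seen for that grade, converted to character.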