Skip to content

Commit

Permalink
Merge pull request #311 from billdenney/convert_to_date
Browse files Browse the repository at this point in the history
Add convert_to_date (fix #310)
  • Loading branch information
billdenney authored Mar 11, 2020
2 parents 95cd8af + 0e1f5f7 commit c01e368
Show file tree
Hide file tree
Showing 10 changed files with 422 additions and 8 deletions.
7 changes: 4 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -23,12 +23,13 @@ Depends:
R (>= 3.1.2)
Imports:
dplyr (>= 0.7.0),
tidyr (>= 0.7.0),
snakecase (>= 0.9.2),
lubridate,
magrittr,
purrr,
rlang,
tidyselect (>= 1.0.0)
snakecase (>= 0.9.2),
tidyselect (>= 1.0.0),
tidyr (>= 0.7.0)
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 7.0.2
Expand Down
8 changes: 8 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ export(clean_names)
export(compare_df_cols)
export(compare_df_cols_same)
export(convert_to_NA)
export(convert_to_date)
export(convert_to_datetime)
export(crosstab)
export(describe_class)
export(excel_numeric_to_date)
Expand All @@ -47,5 +49,11 @@ export(tabyl)
export(top_levels)
export(untabyl)
export(use_first_valid_of)
importFrom(lubridate,ymd)
importFrom(lubridate,ymd_hms)
importFrom(magrittr,"%>%")
importFrom(rlang,dots_n)
importFrom(rlang,expr)
importFrom(rlang,syms)
importFrom(stats,na.omit)
importFrom(tidyselect,eval_select)
4 changes: 2 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,16 @@

* The variables considered by the function `get_dupes()` can be specified using the select helper functions from `tidyselect`. This includes `-column_name` to omit a variable as well as the matching functions `starts_with()`, `ends_with()`, `contains()`, and `matches()`. See `?tidyselect::select_helpers` for more (#326, thanks to **@jzadra** for suggesting and implementing).

* The new function `signif_half_up()` rounds a numeric vector to the specified number of significant digits with halves rounded up (#314, thanks to **@khueyama** for suggesting and implementing).

* The new function `signif_half_up()` rounds a numeric vector to the specified number of significant digits with halves rounded up (#314, thanks to **@khueyama** for suggesting and implementing).
* The new functions `convert_to_date()` and `convert_to_datetime()` generalize the work done by `excel_numeric_to_date()` allowing conversion to date or datetimes from many forms of input from numeric, to characters that look like numbers, to characters that look like dates or datetimes, to Dates, to date-times (POSIXt) (#310, thanks to **@billdenney* for implementing).

## Minor features

* A `quiet` argument was added to `remove_empty()` and `remove_constant()` providing more information (when `FALSE`) (#70, thanks to **@jbkunst** for suggesting and **@billdenney** for implementing).

* `row_to_names()` will now work on matrix input (#320, thanks to **@billdenney** for suggesting and implementing


## Bug fixes

* `adorn_ns()` doesn't append anything to character columns when called on a data.frame resulting from a call to `adorn_percentages()`. (#195).
Expand Down
173 changes: 173 additions & 0 deletions R/convert_to_date.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,173 @@
#' Convert many date and datetime formats as may be received from Microsoft
#' Excel
#'
#' @details Character conversion checks if it matches something that looks like
#' a Microsoft Excel numeric date, converts those to numeric, and then runs
#' convert_to_datetime_helper() on those numbers. Then, character to Date or
#' POSIXct conversion occurs via `character_fun(x, ...)` or
#' `character_fun(x, tz=tz, ...)`, respectively.
#'
#' @param x The object to convert
#' @param tz The timezone for POSIXct output, unless an object is POSIXt
#' already. Ignored for Date output.
#' @param ... Passed to further methods. Eventually may be passed to
#' `excel_numeric_to_date()`, `base::as.POXIXct()`, or `base::as.Date()`.
#' @param character_fun A function to convert non-numeric-looking, non-NA values
#' in `x` to POSIXct objects.
#' @param string_conversion_failure If a character value fails to parse into the
#' desired class and instead returns `NA`, should the function return the
#' result with a warning or throw an error?
#' @return POSIXct objects for `convert_to_datetime()` or Date objects for
#' `convert_to_date()`.
#' @examples
#' convert_to_date("2009-07-06")
#' convert_to_date(40000)
#' convert_to_date("40000.1")
#' # Mixed date source data can be provided.
#' convert_to_date(c("2020-02-29", "40000.1"))
#' @export
#' @family If your input data has a mix of Excel numeric dates and actual dates,
#' see the more powerful functions `convert_to_date` and
#' `convert_to_datetime`.
#' @importFrom lubridate ymd
convert_to_date <- function(x, ..., character_fun=lubridate::ymd, string_conversion_failure=c("error", "warning")) {
string_conversion_failure <- match.arg(string_conversion_failure)
convert_to_datetime_helper(
x, ...,
character_fun=character_fun,
string_conversion_failure=string_conversion_failure,
out_class="Date"
)
}

#' @describeIn convert_to_date Convert to a date-time (POSIXct)
#' @examples
#' convert_to_datetime(
#' c("2009-07-06", "40000.1", "40000", NA),
#' character_fun=lubridate::ymd_h, truncated=1, tz="UTC"
#' )
#' @export
#' @importFrom lubridate ymd_hms
convert_to_datetime <- function(x, ..., tz="UTC", character_fun=lubridate::ymd_hms, string_conversion_failure=c("error", "warning")) {
string_conversion_failure <- match.arg(string_conversion_failure)
convert_to_datetime_helper(
x, ...,
tz=tz,
character_fun=character_fun,
string_conversion_failure=string_conversion_failure,
out_class="POSIXct"
)
}

#' The general method to convert either to a datetime or a date.
#' @param x the object to convert
#' @param out_class The class expected for output.
#' @return An object of class `out_class`
#' @noRd
convert_to_datetime_helper <- function(x, ..., out_class=c("POSIXct", "Date"))
UseMethod("convert_to_datetime_helper")

convert_to_datetime_helper.numeric <- function(x, ...,
date_system="modern",
include_time=NULL,
round_seconds=TRUE,
tz="UTC",
out_class=c("POSIXct", "Date")) {
if (!is.null(include_time)) {
warning("`include_time` is ignored in favor of `out_class`.")
}
out_class <- match.arg(out_class)
excel_numeric_to_date(
date_num=x,
date_system="modern",
round_seconds=round_seconds,
tz=tz,
include_time=out_class %in% "POSIXct"
)
}

convert_to_datetime_helper.factor <- function(x, ..., out_class=c("POSIXct", "Date")) {
convert_to_datetime_helper.character(as.character(x), ..., out_class=out_class)
}

convert_to_datetime_helper.POSIXt <- function(x, ..., out_class=c("POSIXct", "Date")) {
out_class <- match.arg(out_class)
if (out_class %in% "POSIXct") {
# Ensure that POSIXlt gets converted to POSIXct
as.POSIXct(x, ...)
} else {
as.Date(x, ...)
}
}

convert_to_datetime_helper.Date <- function(x, ..., tz="UTC", out_class=c("POSIXct", "Date")) {
out_class <- match.arg(out_class)
if (out_class %in% "POSIXct") {
ret <- as.POSIXct(x, ...)
# as.POSIXct.Date ignores the time zone, so manually apply it.
attr(ret, "tzone") <- tz
} else {
ret <- x
}
ret
}

convert_to_datetime_helper.character <- function(x, ..., tz="UTC", character_fun=lubridate::ymd_hms, string_conversion_failure=c("error", "warning"), out_class=c("POSIXct", "Date")) {
string_conversion_failure <- match.arg(string_conversion_failure)
out_class <- match.arg(out_class)
mask_na <- is.na(x)
mask_excel_numeric <- !mask_na & grepl(pattern="^[0-9]{5}(?:\\.[0-9]*)?$", x=x)
mask_character <- !(mask_na | mask_excel_numeric)
if (out_class %in% "POSIXct") {
ret <- as.POSIXct(x=rep(NA, length(x)), tz="UTC")
} else {
ret <- as.Date(x=rep(NA, length(x)))
}
if (any(mask_excel_numeric)) {
ret[mask_excel_numeric] <- convert_to_datetime_helper(as.numeric(x[mask_excel_numeric]), ..., tz=tz)
}
if (any(mask_character)) {
characters_converted <-
if (out_class %in% "POSIXct") {
character_fun(x[mask_character], tz=tz, ...)
} else {
character_fun(x[mask_character], ...)
}
if (!(out_class %in% class(characters_converted))) {
stop(
"`character_fun(x)` must return class ", out_class,
"; the returned class was: ", paste(class(characters_converted), collapse=", ")
)
}
ret[mask_character] <- characters_converted
if (any(is.na(ret[mask_character]))) {
not_converted_values <- unique(x[mask_character & is.na(ret)])
# Don't provide too many error values
if (length(not_converted_values) > 10) {
not_converted_values <-
paste(
paste0('"', not_converted_values[1:9], '"', collapse=", "),
"... and", length(not_converted_values) - 9, "other values."
)
} else {
not_converted_values <-
paste0('"', not_converted_values, '"', collapse=", ")
}
not_converted_message <-
paste0(
"Not all character strings converted to class ", out_class,
". Values not converted were: ",
not_converted_values
)
if (string_conversion_failure %in% "error") {
stop(not_converted_message)
} else {
warning(not_converted_message)
}
}
}
if (out_class %in% "POSIXct") {
attr(ret, "tzone") <- tz
}
ret
}
3 changes: 3 additions & 0 deletions R/excel_dates.R
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,9 @@
#' excel_numeric_to_date(40000.521, include_time = TRUE) # Time is included
#' excel_numeric_to_date(40000.521, include_time = TRUE,
#' round_seconds = FALSE) # Time with fractional seconds is included
#' @family If your input data has a mix of Excel numeric dates and actual dates,
#' see the more powerful functions `convert_to_date` and
#' `convert_to_datetime`.

# Converts a numeric value like 42414 into a date "2016-02-14"

Expand Down
4 changes: 2 additions & 2 deletions R/get_dupes.R
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,8 @@
#'
#' # You can use tidyselect helpers to specify variables:
#' mtcars %>% get_dupes(weight = wt, starts_with("cy"))
#'

#' @importFrom tidyselect eval_select
#' @importFrom rlang expr dots_n syms
get_dupes <- function(dat, ...) {

expr <- rlang::expr(c(...))
Expand Down
79 changes: 79 additions & 0 deletions man/convert_to_date.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions man/excel_numeric_to_date.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion man/get_dupes.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit c01e368

Please sign in to comment.