Using dplyr::group_by() with tbl_json objects changes the type of ..JSON column
#135
Comments
Thanks for reporting!! That is surprising behavior indeed. I'll take a look and see if this can be improved. To be fair, I suspect the problem here is that the "grouped data frame" is no longer a …
Here's some more context:

#' @importFrom rlang .data
collect_samples <- function(tbl_json) {

  # Gather each sample array, dropping the index column added by gather_array().
  samples_variants <- tbl_json %>%
    tidyjson::enter_object('samples_variants') %>%
    tidyjson::gather_array(column.name = 'sample_id') %>%
    dplyr::select(-'sample_id')

  samples_training <- tbl_json %>%
    tidyjson::enter_object('samples_training') %>%
    tidyjson::gather_array(column.name = 'sample_id') %>%
    dplyr::select(-'sample_id')

  # Number the samples within each (..page, array.index) group.
  all_samples <-
    tidyjson::bind_rows(samples_variants, samples_training) %>%
    dplyr::group_by(.data$..page, .data$array.index) %>%
    dplyr::mutate(sample_id = seq_len(dplyr::n()), .after = 'array.index') %>%
    # Sort by the column itself; the string 'sample_id' would sort by a constant.
    dplyr::arrange(.data$sample_id, .by_group = TRUE) %>%
    dplyr::ungroup() %>%
    tidyjson::as.tbl_json(json.column = '..JSON') # Needed because of https://github.com/colearendt/tidyjson/issues/135.

  return(all_samples)
}

I found that using tidyjson::as.tbl_json(json.column = '..JSON') works as a workaround.
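For anyone landing here, the workaround can be reduced to a smaller sketch. The input JSON and the variable names below are made up for illustration, and it assumes a tidyjson version that keeps the parsed JSON in a `..JSON` column, as described in this issue. The `inherits()` guard is my own addition so the snippet does nothing when the class survives the grouped step.

```r
library(dplyr)
library(tidyjson)

# Hypothetical input: two documents, each with an array under "samples".
json <- c(
  '{"samples": [{"name": "a"}, {"name": "b"}]}',
  '{"samples": [{"name": "c"}]}'
)

samples <- as.tbl_json(json) %>%
  enter_object("samples") %>%
  gather_array()

# Grouped mutate: number the rows within each document.
grouped <- samples %>%
  group_by(document.id) %>%
  mutate(sample_id = seq_len(n())) %>%
  ungroup()

# Inspect the class; per this issue, "tbl_json" may no longer be in it.
class(grouped)

# Workaround from this thread: rebuild the tbl_json from the ..JSON column.
restored <- if (inherits(grouped, "tbl_json")) {
  grouped
} else {
  as.tbl_json(grouped, json.column = "..JSON")
}
```

After this, `restored` can be passed on to further tidyjson verbs such as `spread_all()`.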
That makes a lot of sense! Thank you for the context!
Grouped mutates are perfect. One of the concerns we have for supporting grouped tibbles is "summarize" operations, which will necessarily destroy the JSON data and make future tidyjson operations largely meaningless.
It'd be great if we could find a nice middle way that supports grouped tibbles for certain operations, or at least makes the confusing state here more clear / understandable.
…On Feb 6 2021, at 10:28 am, Ramiro Magno ***@***.***> wrote:
By the way, I should probably open a separate issue, but something I find myself doing too often is dropping a freshly created index column:

tidyjson::gather_array(column.name = 'sample_id') %>%
  dplyr::select(-'sample_id')

Do you think it would be possible to allow that argument …

Perhaps?

tidyjson::gather_array(column.name = NULL)
Yeah, I like that idea!
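Until something like `column.name = NULL` exists, the same effect can be had with a small wrapper. This is only a sketch: `gather_array_no_index()` is a hypothetical helper name, not part of tidyjson, and the input JSON is made up.

```r
library(dplyr)
library(tidyjson)

# Hypothetical helper (not a tidyjson function): gather an array and
# immediately drop the index column that gather_array() adds.
gather_array_no_index <- function(.x, column.name = "array.index") {
  .x %>%
    gather_array(column.name = column.name) %>%
    select(-all_of(column.name))
}

out <- as.tbl_json('{"samples": [{"name": "a"}, {"name": "b"}]}') %>%
  enter_object("samples") %>%
  gather_array_no_index()

names(out)
```

The wrapper keeps the `column.name` argument so a non-default index name can still be dropped if another column called `array.index` is already present.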
Here ..JSON is of type character:

Now it is of type list (scroll to the right):