dtplyr changes result of summarize of empty data.table #282
Comments
Could you please rework your reproducible example to use the reprex package? That makes it easier to see both the input and the output, formatted in such a way that I can easily re-run it in a local session.
Here's a reprex:
library(data.table)
library(dplyr, warn.conflicts = FALSE)
dt <- data.table(x = -1)
dt %>%
  filter(x > 0) %>%
  summarize(nrow = n(), xmean = mean(x)) %>%
  as.data.table()
#>    nrow xmean
#> 1:    0   NaN
library(dtplyr)
dtp_result <-
  dt %>%
  filter(x > 0) %>%
  summarize(nrow = n(), xmean = mean(x))
as.data.table(dtp_result)
#> Empty data.table (0 rows and 2 cols): nrow,xmean
show_query(dtp_result)
#> `_DT1`[x > 0, .(nrow = .N, xmean = mean(x))]
Created on 2021-08-30 by the reprex package (v2.0.1)
If you
library(data.table)
library(dplyr, warn.conflicts = FALSE)
dt <- data.table(x = -1)
dt %>%
  filter(x > 0) %>%
  compute() %>%
  summarize(nrow = n(), xmean = mean(x)) %>%
  as.data.table()
#>    nrow xmean
#> 1:    0   NaN
Created on 2021-08-31 by the reprex package (v2.0.1)
That's because if this is done as two separate
The first two seem pretty bad. I'm not sure if the third option is that useful since it doesn't take care of other cases where dplyr returns 1 row e.g. just
library(data.table)
library(dplyr, warn.conflicts = FALSE)
dt <- data.table(x = -1)
dt[x > 0, .(nrow = .N, xmean = mean(x))]
#> Empty data.table (0 rows and 2 cols): nrow,xmean
dt[x > 0][, .(nrow = .N, xmean = mean(x))]
#>    nrow xmean
#> 1:    0   NaN
## Same behavior in general
dt[x > 0, .(y = 1)]
#> Empty data.table (0 rows and 1 cols): y
dt[x > 0][, .(y = 1)]
#>    y
#> 1: 1
Created on 2021-08-30 by the reprex package (v2.0.1)
Hi,
When summarizing a data.table that has zero rows without dtplyr, I receive a table with one row. When dtplyr is loaded, the same code results in a table with zero rows: