some issues in chapter 5 #7

fakecv opened this issue Jan 16, 2017 · 0 comments

fakecv commented Jan 16, 2017

  1. In "mails_by_day_of_week.r":
inbox_count <- dates_count(dates=inbox_data['date'], element='%a')
sent_count <- dates_count(dates=sent_data['date'], element='%a')

days_of_week <- c("Mon","Tue","Wed","Thu","Fri","Sat","Sun")

I think you should use %u instead of %a. Otherwise the frequencies are sorted alphabetically by day name, as this test shows:

> test <- function(dates,element) {
+  dates <- as.Date(as.vector(as.matrix(dates)),"%Y-%m-%dT%H:%M:%S")
+ elements <- format(dates, element)
+ data.frame(table(elements))
+ }
> inbox_test <- test(dates=inbox_data['date'], element='%a')
> inbox_test
  elements Freq
1      Fri 1983
2      Mon 1568
3      Sat  142
4      Sun  360
5      Thu 1845
6      Tue 1776
7      Wed 1940

The rows are not in Monday-to-Sunday order, so they do not line up with the days_of_week vector.

So the surprising conclusion in the book does not hold: the email count reaches its low point at the weekend, not in the middle of the week. (Sat/Sun sit in the positions of Wed/Thu.)

The same thing happens in "mails_by_month.r" (use %m instead of %b).
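A minimal sketch of one way to keep the counts in calendar order. The sample timestamps are made up for illustration and the variable names are hypothetical; this is not the book's dates_count function. It uses %u (weekday as a number, Monday = 1) to index into days_of_week and a factor with explicit levels so that table() keeps Monday-to-Sunday order.

```r
# Hypothetical sample timestamps in the same format as the 'date' column.
sample_dates <- c("2001-05-14T09:30:00",  # Monday
                  "2001-05-15T10:00:00",  # Tuesday
                  "2001-05-15T16:45:00",  # Tuesday
                  "2001-05-18T08:15:00",  # Friday
                  "2001-05-20T12:00:00")  # Sunday

dates <- as.Date(sample_dates, "%Y-%m-%dT%H:%M:%S")

days_of_week <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# %u gives the weekday as "1".."7" with Monday = 1, independent of locale.
day_number <- as.integer(format(dates, "%u"))

# Fixing the factor levels makes table() report the days in calendar order,
# including days with zero mails.
day_factor <- factor(days_of_week[day_number], levels = days_of_week)
counts <- data.frame(table(elements = day_factor))
print(counts)
```

The same idea should carry over to months: format with %m and use a factor whose levels are the twelve month labels in calendar order (for example R's built-in month.abb).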

  2. When reading the .csv file, it is better to add the option quote='', because some email addresses contain a single quote (for example european.vp's(AT)enron.com and nicholas.o'day(AT)enron.com).
-inbox_data <- read.table("inbox_data_enron.csv", header=TRUE, sep=",")
+inbox_data <- read.table("inbox_data_enron.csv", header=TRUE, sep=",",comment.char='',quote='')
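A small sketch of the effect, using inline text instead of the real file. The column names and rows are invented for illustration. With quote='' the apostrophes are treated as ordinary characters, so each line stays one record. With the default quote setting an apostrophe can open a quoted string that runs on into later lines.

```r
# Hypothetical CSV content; the second and third lines contain apostrophes.
csv_text <- c("date,from",
              "2001-05-14T09:30:00,european.vp's(AT)enron.com",
              "2001-05-15T10:00:00,nicholas.o'day(AT)enron.com",
              "2001-05-16T11:00:00,someone(AT)enron.com")

# quote='' turns off quote handling; comment.char='' turns off comment handling.
inbox_sample <- read.table(text = csv_text, header = TRUE, sep = ",",
                           comment.char = '', quote = '',
                           stringsAsFactors = FALSE)

print(nrow(inbox_sample))   # one row per data line
print(inbox_sample$from)
```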

I like your book, and finding things like this makes it even more fun. Thanks.
