Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't mention iris dataset in documentation #752

Merged
merged 7 commits into from
Aug 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -50,4 +50,4 @@ Config/testthat/edition: 3
Encoding: UTF-8
Note: libxls v1.6.2 (patched) 45abe77
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
RoxygenNote: 7.3.2
8 changes: 6 additions & 2 deletions R/read_excel.R
Original file line number Diff line number Diff line change
Expand Up @@ -57,13 +57,17 @@ NULL
#' read_excel(datasets, "mtcars")
#'
#' # Skip rows and use default column names
#' read_excel(datasets, skip = 148, col_names = FALSE)
#' read_excel(datasets, skip = 10, col_names = FALSE)
#'
#' # Recycle a single column type
#' read_excel(datasets, col_types = "text")
#'
#' # Specify some col_types and guess others
#' read_excel(datasets, col_types = c("text", "guess", "numeric", "guess", "guess"))
#' read_excel(
#' readxl_example("deaths.xlsx"),
#' skip = 4, n_max = 10, col_names = TRUE,
#' col_types = c("text", "text", "guess", "guess", "guess", "guess")
#' )
#'
#' # Accomodate a column with disparate types via col_type = "list"
#' df <- read_excel(readxl_example("clippy.xlsx"), col_types = c("text", "list"))
Expand Down
6 changes: 3 additions & 3 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ Specify a worksheet by name or number.

```{r}
read_excel(xlsx_example, sheet = "chickwts")
read_excel(xls_example, sheet = 4)
read_excel(xls_example, sheet = 3)
```

There are various ways to control which cells are read. You can even specify the sheet here, if providing an Excel-style cell range.
Expand All @@ -109,7 +109,7 @@ read_excel(xlsx_example, range = "mtcars!B1:D5")
If `NA`s are represented by something other than blank cells, set the `na` argument.

```{r}
read_excel(xlsx_example, na = "setosa")
read_excel(xlsx_example, na = "0")
```

If you are new to the tidyverse conventions for data import, you may want to consult the [data import chapter](https://r4ds.had.co.nz/data-import.html) in R for Data Science. readxl will become increasingly consistent with other packages, such as [readr](https://readr.tidyverse.org/).
Expand Down Expand Up @@ -149,7 +149,7 @@ Here are some other packages with functionality that is complementary to readxl
__Writing Excel files__: The example files `datasets.xlsx` and `datasets.xls` were created with the help of [openxlsx](https://CRAN.R-project.org/package=openxlsx) (and Excel). openxlsx provides "a high level interface to writing, styling and editing worksheets".

```{r eval = FALSE}
l <- list(iris = iris, mtcars = mtcars, chickwts = chickwts, quakes = quakes)
l <- list(mtcars = mtcars, chickwts = chickwts, quakes = quakes)
openxlsx::write.xlsx(l, file = "inst/extdata/datasets.xlsx")
```

Expand Down
100 changes: 50 additions & 50 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ readxl_example()
#> [5] "deaths.xls" "deaths.xlsx" "geometry.xls" "geometry.xlsx"
#> [9] "type-me.xls" "type-me.xlsx"
readxl_example("clippy.xls")
#> [1] "/private/tmp/RtmpM1GkLC/temp_libpatha8e46f7f62bf/readxl/extdata/clippy.xls"
#> [1] "/Users/fontikar/Library/R/arm64/4.4/library/readxl/extdata/clippy.xls"
```

`read_excel()` reads both xls and xlsx files and detects the format from
Expand All @@ -84,30 +84,30 @@ the extension.
``` r
xlsx_example <- readxl_example("datasets.xlsx")
read_excel(xlsx_example)
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> # ℹ 147 more rows
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> # ℹ 29 more rows

xls_example <- readxl_example("datasets.xls")
read_excel(xls_example)
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> # ℹ 147 more rows
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> # ℹ 29 more rows
```

List the sheet names with `excel_sheets()`.

``` r
excel_sheets(xlsx_example)
#> [1] "iris" "mtcars" "chickwts" "quakes"
#> [1] "mtcars" "chickwts" "quakes"
```

Specify a worksheet by name or number.
Expand All @@ -121,7 +121,7 @@ read_excel(xlsx_example, sheet = "chickwts")
#> 2 160 horsebean
#> 3 136 horsebean
#> # ℹ 68 more rows
read_excel(xls_example, sheet = 4)
read_excel(xls_example, sheet = 3)
#> # A tibble: 1,000 × 5
#> lat long depth mag stations
#> <dbl> <dbl> <dbl> <dbl> <dbl>
Expand All @@ -136,34 +136,34 @@ specify the sheet here, if providing an Excel-style cell range.

``` r
read_excel(xlsx_example, n_max = 3)
#> # A tibble: 3 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> # A tibble: 3 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
read_excel(xlsx_example, range = "C1:E4")
#> # A tibble: 3 × 3
#> Petal.Length Petal.Width Species
#> <dbl> <dbl> <chr>
#> 1 1.4 0.2 setosa
#> 2 1.4 0.2 setosa
#> 3 1.3 0.2 setosa
#> disp hp drat
#> <dbl> <dbl> <dbl>
#> 1 160 110 3.9
#> 2 160 110 3.9
#> 3 108 93 3.85
read_excel(xlsx_example, range = cell_rows(1:4))
#> # A tibble: 3 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> # A tibble: 3 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
read_excel(xlsx_example, range = cell_cols("B:D"))
#> # A tibble: 150 × 3
#> Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl>
#> 1 3.5 1.4 0.2
#> 2 3 1.4 0.2
#> 3 3.2 1.3 0.2
#> # ℹ 147 more rows
#> # A tibble: 32 × 3
#> cyl disp hp
#> <dbl> <dbl> <dbl>
#> 1 6 160 110
#> 2 6 160 110
#> 3 4 108 93
#> # ℹ 29 more rows
read_excel(xlsx_example, range = "mtcars!B1:D5")
#> # A tibble: 4 × 3
#> cyl disp hp
Expand All @@ -178,14 +178,14 @@ If `NA`s are represented by something other than blank cells, set the
`na` argument.

``` r
read_excel(xlsx_example, na = "setosa")
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 <NA>
#> 2 4.9 3 1.4 0.2 <NA>
#> 3 4.7 3.2 1.3 0.2 <NA>
#> # ℹ 147 more rows
read_excel(xlsx_example, na = "0")
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 NA 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 NA 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> # ℹ 29 more rows
```

If you are new to the tidyverse conventions for data import, you may
Expand Down Expand Up @@ -249,7 +249,7 @@ openxlsx provides “a high level interface to writing, styling and
editing worksheets”.

``` r
l <- list(iris = iris, mtcars = mtcars, chickwts = chickwts, quakes = quakes)
l <- list(mtcars = mtcars, chickwts = chickwts, quakes = quakes)
openxlsx::write.xlsx(l, file = "inst/extdata/datasets.xlsx")
```

Expand Down
Binary file modified inst/extdata/datasets.xls
Binary file not shown.
Binary file modified inst/extdata/datasets.xlsx
Binary file not shown.
Binary file modified inst/extdata/deaths.xls
Binary file not shown.
8 changes: 6 additions & 2 deletions man/read_excel.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

26 changes: 13 additions & 13 deletions vignettes/articles/readxl-workflows.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -47,32 +47,32 @@ Solution: cache a CSV snapshot of your raw data tables at the time of export. Ev
Pipe the output of `read_excel()` directly into `readr::write_csv()` like so:

```{r}
iris_xl <- readxl_example("datasets.xlsx") %>%
read_excel(sheet = "iris") %>%
write_csv("iris-raw.csv")
mtcars_xl <- readxl_example("datasets.xlsx") %>%
read_excel(sheet = "mtcars") %>%
write_csv("mtcars-raw.csv")
```

```{r include = FALSE}
delete_on_exit <- c(delete_on_exit, "iris-raw.csv")
delete_on_exit <- c(delete_on_exit, "mtcars-raw.csv")
```

Why does this work? `readr::write_csv()` is a well-mannered "write" function: it does its main job *and returns its input invisibly*. The above command reads the iris sheet from readxl's `datasets.xlsx` example workbook and caches a CSV version of the resulting data frame to file.

Let's check. Did we still import the data? Did we write the CSV file?

```{r}
iris_xl
dir(pattern = "iris")
mtcars_xl
dir(pattern = "mtcars")
```

Yes! Is the data written to CSV an exact copy of what we imported from Excel?

```{r}
iris_alt <- read_csv("iris-raw.csv")
mtcars_alt <- read_csv("mtcars-raw.csv")
## readr leaves a note-to-self in `spec` that records its column guessing,
## so we remove that attribute before the check
attr(iris_alt, "spec") <- NULL
identical(iris_xl, iris_alt)
attr(mtcars_alt, "spec") <- NULL
identical(mtcars_xl, mtcars_alt)
```

Yes! If we needed to restart or troubleshoot this fictional analysis, `iris-raw.csv` is available as a second, highly accessible alternative to `datasets.xlsx`.
Expand Down Expand Up @@ -192,11 +192,11 @@ Rework examples from above but using base R only, other than readxl.
### Cache a CSV snapshot

```{r eval = FALSE}
iris_xl <- read_excel(readxl_example("datasets.xlsx"), sheet = "iris")
write.csv(iris_xl, "iris-raw.csv", row.names = FALSE, quote = FALSE)
iris_alt <- read.csv("iris-raw.csv", stringsAsFactors = FALSE)
mtcars_xl <- read_excel(readxl_example("datasets.xlsx"), sheet = "mtcars")
write.csv(iris_xl, "mtcars-raw.csv", row.names = FALSE, quote = FALSE)
mtcars_alt <- read.csv("mtcars-raw.csv", stringsAsFactors = FALSE)
## coerce iris_xl back to a data.frame
identical(as.data.frame(iris_xl), iris_alt)
identical(as.data.frame(mtcars_xl), mtcars_alt)
```

### Iterate over multiple worksheets in a workbook
Expand Down
Loading