-
Notifications
You must be signed in to change notification settings - Fork 5
/
README.Rmd
341 lines (246 loc) · 9.36 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
if (dir.exists("~/pins")) unlink("~/pins", recursive = TRUE)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
eval = TRUE
)
add_figure <- function(path, width = 400) {
char_html <- paste0("<img src='", path, "' width ='", width, "px'/><br/>")
htmltools::HTML(char_html)
}
toc <- function() {
re <- readLines("README.Rmd")
has_title <- as.logical(lapply(re, function(x) substr(x, 1, 2) == "##"))
only_titles <- re[has_title]
titles <- trimws(gsub("#", "", only_titles))
links <- trimws(gsub("`", "", titles))
links <- tolower(links)
links <- trimws(gsub(" ", "-", links))
links <- trimws(gsub(",", "", links))
toc_list <- lapply(
seq_along(titles),
function(x) {
pad <- ifelse(substr(only_titles[x], 1, 3) == "###", " - ", " - ")
paste0(pad, "[", titles[x], "](#", links[x], ")")
}
)
toc_full <- paste(toc_list, collapse = "\n")
cat(toc_full)
}
```
# connections
<!-- badges: start -->
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![R-CMD-check](https://github.com/rstudio/connections/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/rstudio/connections/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/rstudio/connections/branch/main/graph/badge.svg)](https://app.codecov.io/gh/rstudio/connections?branch=main)
[![CRAN status](https://www.r-pkg.org/badges/version/connections)](https://CRAN.R-project.org/package=connections)
<!-- badges: end -->
- [Installation](#installation)
- [Functions](#functions)
- [Uploading and referencing tables with `dplyr`](#uploading-and-referencing-tables-with-dplyr)
- [`pins`](#pins)
- [Pin a database connection](#pin-a-database-connection)
- [Pin a `dplyr` database query](#pin-a-dplyr-database-query)
- [Full `pins` example](#full-pins-example)
- [Back-end examples](#back-end-examples)
- [BigQuery, via `bigrquery`](#bigquery-via-bigrquery)
- [PostgreSQL, via `RPostgres`](#postgresql-via-rpostgres)
- [`DBI` connections](#dbi-connections)
The main goal of `connections` is to integrate `DBI`-compliant packages with the
RStudio IDE's [Connection
Pane](https://solutions.posit.co/connections/db/tooling/connections/). Packages such as
[RPostgres](https://github.com/r-dbi/RPostgres), [RSQLite](https://github.com/r-dbi/RSQLite),
[RMariaDB](https://github.com/r-dbi/RMariaDB) and
[bigrquery](https://github.com/r-dbi/bigrquery) connect R to those databases, but
do not provide a direct integration with the Connections Pane. `connections`
reads the configuration of the connection and creates the integration with RStudio.
A second goal is to provide integration with the [pins](https://pins.rstudio.com/)
package. The `connections` package allows you to pin database connections and
[dplyr](https://dplyr.tidyverse.org/) table objects.
## Installation
Install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("remotes")
remotes::install_github("rstudio/connections")
```
```{r setup, include = FALSE}
library(connections)
library(RSQLite)
library(dplyr)
library(pins)
if (file.exists("local.sqlite")) unlink("local.sqlite")
```
## Functions
The two main functions added by `connections` are:
- `connection_open()` - Opens the database connection. Use instead of
`dbConnect()`, but use the exact same arguments. It also automatically starts
the Connections pane.
- `connection_close()` - Closes the database connection.
```{r}
library(connections)
library(RSQLite)
con <- connection_open(SQLite(), "local.sqlite")
```
`r add_figure("man/figures/connection-1.png")`
The connection can now be closed by using the appropriate button in the
Connections pane, or by using `connection_close()`
```{r}
connection_close(con)
```
`r add_figure("man/figures/connection-2.png")`
The connection code is parsed when connecting to the database, and it is
visible once the connection is closed.
## Uploading and referencing tables with `dplyr`
`connections` integrates with `dplyr` by supporting the following two functions:
- `tbl()` - To create a pointer to a table or view within the database.
- `copy_to()` - To copy data from the R session to the database.
The version of `copy_to()` inside `connections` automatically updates the
Connections pane, so the new table automatically shows up.
```{r}
con <- connection_open(SQLite(), "local.sqlite")
copy_to(con, mtcars, temporary = FALSE, overwrite = TRUE)
```
To use an existing table inside the database use `tbl()`.
```{r}
db_mtcars <- tbl(con, "mtcars")
```
`r add_figure("man/figures/pane-1.png")`
The `tbl()` function opens the rest of the already available `dplyr`
database integration.
```{r}
db_mtcars %>%
group_by(am) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))
```
## `pins`
The `connections` package integrates with `pins`. It adds the ability to "pin"
database connections and queries. It follows the same approach as the
[vetiver](https://rstudio.github.io/vetiver-r/reference/vetiver_pin_write.html)
package. `connections` now has two new functions:
- `connection_pin_write()`
- `connection_pin_read()`
### Pin a database connection
The `connection_pin_write()` function does **not** save the R object. It records
the code necessary to recreate the connection.
```{r}
library(pins)
board <- board_folder("~/pins")
connection_pin_write(board, con, name = "my_conn")
```
`r add_figure("man/figures/pins-1.png")`
If you wish to see the code that `connections` will use when recreating the
conneciton from the pin, you can use `connection_code()`:
```{r}
connection_code(con)
```
`connection_pin_read()` will replay the exact same code used to initially
connect to the database. Assign the output to a variable, such as
`con1`. The variable will work just like any connection variable.
```{r}
con1 <- connection_pin_read(board, "my_conn")
```
The `con1` variable is now a regular database connection variable.
```{r}
db_mtcars <- tbl(con1, "mtcars") %>%
group_by(am) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))
db_mtcars
```
### Pin a `dplyr` database query
When `dplyr` works with database data, the resulting query is not executed until
the data is explicitly collected into R, or when printing the top results to the
R Console. The `pin` records two things:
- The `dplyr` R object that contains all of the transformations. **It does not
save the actual results**.
- The necessary information to recreate the database connection. This is to make
sure that the data is being retrieved from the original database connection.
```{r}
connection_pin_write(board, db_mtcars, name = "avg_mpg")
```
`r add_figure("man/figures/pins-2.png")`
`connection_pin_read()` will connect to the database, and return the `dplyr` object.
Without assigning it to a variable, the pin will immediately print the results of the
database. Those results are being processed at the time `connection_pin_read()` runs.
```{r}
connection_pin_read(board, "avg_mpg")
```
### Full `pins` example
The way `pins` integrates with databases, via the `connections` package, allows
to open the connection from a pin, and pipe all of the subsequent code into a
new pin. Afterwards, that pin can be used to collect or to continue using the
`dplyr` object.
```{r}
board <- board_folder("~/pins")
con <- connection_pin_read(board, "my_conn")
tbl_summary <- con %>%
tbl("mtcars") %>%
group_by(cyl) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))
connection_pin_write(board, tbl_summary, name = "cyl_mpg")
connection_close(con)
connection_pin_read(board, "cyl_mpg")
```
`r add_figure("man/figures/pins-3.png")`
## Back-end examples
There are a couple of examples of how the Connections pane will look when
opening the connection via `connections`.
### BigQuery, via `bigrquery`
```{r, eval = FALSE}
library(connections)
library(bigrquery)
con <- connection_open(
bigquery(),
project = "bigquery-public-data",
dataset = "austin_311",
billing = "my_project_billing",
use_legacy_sql = FALSE
)
```
`r add_figure("man/figures/bigquery-1.png")`
```{r, eval = FALSE}
connection_close(con)
```
`r add_figure("man/figures/bigquery-2.png")`
### PostgreSQL, via `RPostgres`
```{r, eval = FALSE}
library(connections)
library(RPostgres)
con <- connection_open(
Postgres(),
host = "localhost",
dbname = "datawarehouse",
user = "[user id]",
password = "[password]",
bigint = "integer",
port = "5432"
)
```
`r add_figure("man/figures/postgres-1.png")`
## `DBI` connections
It is possible to integrate `DBI` connections not opened via `connection_open()`.
To do that, use `connection_view()` and pass it the variable containing the
existing database connection.
```{r, eval = FALSE}
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
connection_view(con)
```
`r add_figure("man/figures/dbi-1.png", 200)`
Changes to the database will not automatically load in the Connections pane.
The `connection_update()` function will refresh the pane with the latest.
```{r, eval = FALSE}
dbWriteTable(con, "mtcars", mtcars)
connection_update(con)
```
`r add_figure("man/figures/dbi-2.png", 300)`
```{r, eval = FALSE}
connection_close(con)
```
`r add_figure("man/figures/dbi-3.png", 300)`