---
title: "Proposal: GridFS Design in mongolite"
output:
  word_document: default
  html_document:
    df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
The GridFS API in mongolite will be a new top-level object class, consistent with the current API for instantiating regular MongoDB collection objects.
```r
fs <- mongolite::gridfs(db = "test", url = "mongodb://localhost")
```
The initial API will focus on basic read/write/delete operations.
## Error Handling
All methods will automatically translate errors reported by `mongoc` into R errors.
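
From the caller's side, this means a failed operation can be handled with R's standard condition system. The following is a minimal sketch in base R only; `read_stub()` is a hypothetical stand-in for a GridFS method, and its error message is invented for illustration.

```r
# Hypothetical stand-in for a GridFS method whose underlying driver call failed.
# In the proposed design, the driver's error message would be raised via stop().
read_stub <- function(name) {
  stop(sprintf("file not found: %s", name), call. = FALSE)
}

# Callers handle the failure like any other R error.
result <- tryCatch(
  read_stub("missing.bin"),
  error = function(e) {
    message("GridFS operation failed: ", conditionMessage(e))
    NULL  # fall back to NULL when the operation fails
  }
)
```
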
## Listing Files
Returns a data frame listing the files and their fixed metadata (size, date, content type, etc.).
```r
files <- fs$list(filter = '{}', options = '{}')
```
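
As a rough illustration of the kind of result this could produce, the sketch below assembles a data frame from per-file records using base R only. The records, the column names (`name`, `size`, `type`), and their values are assumptions made for the example, not the final schema.

```r
# Hypothetical per-file records, as might be collected while iterating files.
records <- list(
  list(name = "a.bin", size = 1024, type = "application/octet-stream"),
  list(name = "b.txt", size = 12,   type = "text/plain")
)

# Bind the records row-wise into a single data frame.
files <- do.call(
  rbind,
  lapply(records, as.data.frame, stringsAsFactors = FALSE)
)

# Ordinary data-frame operations then apply, e.g. selecting larger files.
big <- files[files$size > 100, ]
```
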
__*References*__:
- [mongoc_gridfs_find_with_opts](http://mongoc.org/libmongoc/current/mongoc_gridfs_t.html) for listing
- [mongoc_gridfs_file_t](http://mongoc.org/libmongoc/current/mongoc_gridfs_file_t.html) for reading file properties
## Reading Files
A file can be read either into a buffer, or streamed to a file or connection. The default behavior is to read the entire file and return the data in a raw data vector:
```r
buf <- fs$read(name = "myfile.bin")
```
Alternatively, the user can supply an R connection object, which is used to stream the data to, for example, a file or network socket.
```r
fs$read(name = "myfile.bin", con = connection)
```
The latter is a memory-efficient way to read incrementally from GridFS and write out the data. It is similar to the `export()` method for regular mongo collection objects.
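
The incremental pattern can be sketched in base R as a chunked copy between two connections. This is only an illustration of the technique: `stream_copy()` and its `chunk_size` argument are hypothetical, and an in-memory `rawConnection()` stands in for the GridFS stream.

```r
# Copy bytes from one binary connection to another in fixed-size chunks,
# so that at most `chunk_size` bytes are held in memory at a time.
stream_copy <- function(src, dest, chunk_size = 1024L) {
  total <- 0
  repeat {
    chunk <- readBin(src, what = raw(), n = chunk_size)
    if (length(chunk) == 0L) break  # end of stream
    writeBin(chunk, dest)
    total <- total + length(chunk)
  }
  invisible(total)
}

# Stand-in source: an in-memory connection instead of a GridFS stream.
payload <- as.raw(rep(0:255, length.out = 5000))
src <- rawConnection(payload)
path <- tempfile(fileext = ".bin")
dest <- file(path, open = "wb")

n <- stream_copy(src, dest)
close(src)
close(dest)
```
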
__*References*__:
- [mongoc_gridfs_find_one_by_filename](http://mongoc.org/libmongoc/current/mongoc_gridfs_find_one_by_filename.html) to look up the file
- [mongoc_stream_gridfs_new](http://mongoc.org/libmongoc/current/mongoc_stream_gridfs_new.html) to create a stream reader
- [mongoc_stream_t](http://mongoc.org/libmongoc/current/mongoc_stream_t.html) methods for reading the stream
- [What exactly is a connection in R](https://stackoverflow.com/questions/30445875/what-exactly-is-a-connection-in-r)
## Writing Files
Analogous to reading, a write operation can either write a raw data vector from memory or stream data from a local file or connection object.
```r
fs$write(name = "myfile.bin", data = buffer)
```
When the `data` argument is an R connection object, the method will read incrementally from the connection and upload the data to GridFS.
```r
fs$write(name = "myfile.bin", data = connection)
```
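
One way to support both kinds of `data` input is to normalize the argument to a connection before reading from it. The sketch below uses base R only; `as_read_connection()` is a hypothetical helper and not part of the proposed API.

```r
# Hypothetical helper: accept either a raw vector or an existing connection
# and return a connection that can be read incrementally.
as_read_connection <- function(data) {
  if (is.raw(data)) {
    rawConnection(data)  # wrap in-memory bytes in a connection
  } else if (inherits(data, "connection")) {
    data                 # already a connection; use as-is
  } else {
    stop("'data' must be a raw vector or a connection", call. = FALSE)
  }
}

buffer <- charToRaw("hello gridfs")
con <- as_read_connection(buffer)
bytes <- readBin(con, what = raw(), n = 4L)  # read only the first 4 bytes
close(con)
```

With this in place, the upload loop only has to deal with connections, whichever form the caller passed.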
__*References*__:
- [mongoc_gridfs_create_file_from_stream](http://mongoc.org/libmongoc/current/mongoc_gridfs_create_file_from_stream.html) to create a new file using a stream
- [mongoc_stream_write](http://mongoc.org/libmongoc/current/mongoc_stream_write.html) to write to the stream
- [What exactly is a connection in R](https://stackoverflow.com/questions/30445875/what-exactly-is-a-connection-in-r)
## Removing Files
Removes a single file from the GridFS collection:
```r
fs$remove(name = "myfile.bin")
```
The `name` argument can be vectorized in standard R fashion, so that multiple files can be removed with a single call.
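
Vectorization over `name` could be a thin wrapper around the single-file operation. In this base R sketch, `remove_one()`, `remove_many()`, and the `stored` vector are hypothetical stand-ins; nothing is actually deleted.

```r
# Stand-in for the set of file names currently stored.
stored <- c("a.bin", "b.bin", "c.bin")

# Hypothetical single-file removal; here it only reports whether the
# name exists in the stand-in vector.
remove_one <- function(name) name %in% stored

# Vectorized wrapper: apply the single-file operation to each name and
# return one logical result per name.
remove_many <- function(names) {
  vapply(names, remove_one, logical(1))
}

removed <- remove_many(c("a.bin", "missing.bin"))
```
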
__*References*__:
- [mongoc_gridfs_remove_by_filename](http://mongoc.org/libmongoc/current/mongoc_gridfs_remove_by_filename.html) to delete the file
## Dropping GridFS
Requests that an entire GridFS be dropped, including all files associated with it.
```r
fs$drop()
```
__*References*__:
- [mongoc_gridfs_drop](http://mongoc.org/libmongoc/current/mongoc_gridfs_drop.html)