From ba6e90e13881107cb6d90f4f99a80a89b15e0d00 Mon Sep 17 00:00:00 2001 From: Mahdi Dibaiee Date: Fri, 22 Sep 2023 15:02:20 +0100 Subject: [PATCH 1/3] docs: source-mongodb docs updates --- .../Connectors/capture-connectors/mongodb.md | 22 ++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/site/docs/reference/Connectors/capture-connectors/mongodb.md b/site/docs/reference/Connectors/capture-connectors/mongodb.md index 2d3d158b1b..bb2fbff1a6 100644 --- a/site/docs/reference/Connectors/capture-connectors/mongodb.md +++ b/site/docs/reference/Connectors/capture-connectors/mongodb.md @@ -25,6 +25,18 @@ You'll need: [Role-Based Access Control](https://www.mongodb.com/docs/manual/core/authorization/) for more information. + * Read access to the `local` database and `oplog.rs` collection in that + database are also necessary. + + In order to grant these permissions with a command like so: + ``` + use admin; + db.createUser({ + user: "", + pwd: "", + roles: [ "readAnyDatabase" ] + }) + ``` * ReplicaSet enabled on your database, see [Deploy a Replica Set](https://www.mongodb.com/docs/manual/tutorial/deploy-replica-set/). @@ -89,8 +101,8 @@ connector's ability to do this depends on the size of the [replica set oplog](https://www.mongodb.com/docs/manual/core/replica-set-oplog/), and in certain circumstances, when the pause has been long enough for the oplog to have evicted old change events, the connector will need to re-do the backfill to -ensure data consistency. In such cases, the connector will error out the first -time it is run with a message indicating that it is going to re-do a backfill on -the next run, and it will be restarted by Flow runtime. This new run of the -connector will do a backfill and once fully caught up, will start capturing -change events. +ensure data consistency. 
In such cases, the connector will error, and to resolve +this case, first try to increase the size of your oplog to avoid this issue in +the future, and then you need to remove the binding that is unable to be +captured, publishing your task, and then adding the binding back so the backfill +is restarted. From b920e38e766b4e30439e57960820fd2c5031c91d Mon Sep 17 00:00:00 2001 From: Mahdi Dibaiee Date: Fri, 6 Oct 2023 17:05:23 +0100 Subject: [PATCH 2/3] source-mongodb: update docs --- .../Connectors/capture-connectors/mongodb.md | 38 ++++++++++++++----- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/site/docs/reference/Connectors/capture-connectors/mongodb.md b/site/docs/reference/Connectors/capture-connectors/mongodb.md index bb2fbff1a6..387ba86eb7 100644 --- a/site/docs/reference/Connectors/capture-connectors/mongodb.md +++ b/site/docs/reference/Connectors/capture-connectors/mongodb.md @@ -21,14 +21,20 @@ You'll need: * Credentials for connecting to your MongoDB instance and database - * Read access to your MongoDB database and desired collections, see + * Read access to your MongoDB database(s), see [Role-Based Access Control](https://www.mongodb.com/docs/manual/core/authorization/) for more information. * Read access to the `local` database and `oplog.rs` collection in that - database are also necessary. + database. + * We recommend giving access to read all databases, as this allows us to + watch an instance-level change stream, allowing for better guarantees of + reliability, and the possibility of capturing multiple databases in the same + task. However, if access to all databases is not possible, you can + give us access to a single database and we will watch a change stream on + that specific database. 
- In order to grant these permissions with a command like so: + In order to create a user with access to all databases, use a command like so: ``` use admin; db.createUser({ @@ -37,6 +43,18 @@ You'll need: roles: [ "readAnyDatabase" ] }) ``` + + In order to create a user with access to a specific database and the `local` database, + use a command like so: + + ``` + use ; + db.createUser({ + user: "", + pwd: "", + roles: ["read", { role: "read", db: "local" }] + }) + ``` * ReplicaSet enabled on your database, see [Deploy a Replica Set](https://www.mongodb.com/docs/manual/tutorial/deploy-replica-set/). @@ -93,7 +111,7 @@ The connector starts by backfilling data from the specified collections until it reaches the current time. Once all the data up to the current time has been backfilled, the connector then uses [**change streams**](https://www.mongodb.com/docs/manual/changeStreams/) to capture -change events from the database and emit those updates to the flow collection. +change events and emit those updates to their respective flow collections. If the connector's process is paused for a while, it will attempt to resume capturing change events since the last received change event, however the @@ -101,8 +119,10 @@ connector's ability to do this depends on the size of the [replica set oplog](https://www.mongodb.com/docs/manual/core/replica-set-oplog/), and in certain circumstances, when the pause has been long enough for the oplog to have evicted old change events, the connector will need to re-do the backfill to -ensure data consistency. In such cases, the connector will error, and to resolve -this case, first try to increase the size of your oplog to avoid this issue in -the future, and then you need to remove the binding that is unable to be -captured, publishing your task, and then adding the binding back so the backfill -is restarted. +ensure data consistency. 
In these cases it is necessary to [resize your +oplog](https://www.mongodb.com/docs/manual/tutorial/change-oplog-size/#c.-change-the-oplog-size-of-the-replica-set-member) or +[set a minimum retention +period](https://www.mongodb.com/docs/manual/reference/command/replSetResizeOplog/#minimum-oplog-retention-period) +for your oplog to be able to reliably capture data. +The recommended minimum retention period is at least 24 hours, but we recommend +higher values to improve reliability. From cd45e5966f4e6242d037c3546ecaf03a53969c5f Mon Sep 17 00:00:00 2001 From: Mahdi Dibaiee Date: Tue, 10 Oct 2023 09:50:21 +0100 Subject: [PATCH 3/3] docs: mongodb mention ?authSource=admin --- site/docs/reference/Connectors/capture-connectors/mongodb.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/site/docs/reference/Connectors/capture-connectors/mongodb.md b/site/docs/reference/Connectors/capture-connectors/mongodb.md index 387ba86eb7..11566f8c94 100644 --- a/site/docs/reference/Connectors/capture-connectors/mongodb.md +++ b/site/docs/reference/Connectors/capture-connectors/mongodb.md @@ -44,6 +44,10 @@ You'll need: }) ``` + If you are using a user with access to all databases, then in your MongoDB + address, you must specify the `?authSource=admin` parameter so that + authentication is done through your admin database. + In order to create a user with access to a specific database and the `local` database, use a command like so: