docs(self-hosted): external storage configurations #1269
@aldy505 is attempting to deploy a commit to the Sentry Team on Vercel. A member of the Team first needs to authorize it. |
@@ -0,0 +1,89 @@ | |||
--- | |||
title: External Storage |
Maybe we should generalize to "Data Storage" or something.
This way this document can explain where the data is stored by default and can list alternatives if there are any.
Could have the following sections:
- Sentry (with a general explanation about postgres, clickhouse and kafka maybe)
- Filestore (Uploads, Replays)
- Database
- Object Storage
- Nodestore (Event data)
- Database
- Filestore (Uploads, Replays)
- Vroom (Profiles)
- Docker volume
- Object Storage
We should probably either rename those sections to what they specifically store, or explain that in the intro, because "Vroom" is not very descriptive; once it's explained that that component is responsible for (ingesting and) storing profiling data, it makes a lot more sense.
Maybe wait until someone else also chimes in before rewriting the whole thing, in case I'm off base with this outline, but this sounds like a document I would love to have had when I started my self-hosted adventures 👍
For the Object Storage thing we might want to link to the relevant documentation instead of adding examples for every option under the sun because otherwise there is no bound to the size of this document.
After changing configuration files, re-run the <code>./install.sh</code> script, to rebuild and restart the containers. See the <Link to="/self-hosted/#configuration">configuration section</Link> for more information. | ||
</Alert> | ||
|
||
<!-- Should we add a description about what "external storage" is? --> |
I think we can safely assume that people can find that out for themselves if they are looking at this page
|
||
<!-- Hello! If you're reading this, you're in luck because I can't decide whether to make.. wait let me copy the text from Discord. | ||
|
||
I got some time before Monday to write up some docs about setting up an S3 storage for selfhosted instance, but I can't decide whether I should put it under a big "External Services" page, in which people can include external postgres, external redis, and that kind of things; or should I put it under a page called "External Storage"? |
External storage sounds fine. I don't think we should recommend people to use external postgres, redis, etc as that can introduce a lot of issues for people trying to set that up unless they really know what they're doing.
I agree with "don't think we should recommend people to use external postgres, redis, etc", but if there are some people who wish to do that... I don't know if there's a better way to tell them that "they're on their own".
|
||
Filestore handles storing attachment, sourcemap, and replays. Filestore configuration for Sentry should be configured on the `sentry/config.yml` file. | ||
|
||
### S3 backend |
Just to make sure, have you tried the Azure/s3 compatible backend without issues? We're using GCS so wanted to make sure
Yes! See my patch getsentry/self-hosted@master...teknologi-umum:sentry:teknum-patch
title: External Storage | ||
--- | ||
|
||
<!-- Hello! If you're reading this, you're in luck because I can't decide whether to make.. wait let me copy the text from Discord. |
nit: much easier to review if questions/comments like this are added in the GH comment system, rather than in-line in the PR.
|
||
I got some time before Monday to write up some docs about setting up an S3 storage for selfhosted instance, but I can't decide whether I should put it under a big "External Services" page, in which people can include external postgres, external redis, and that kind of things; or should I put it under a page called "External Storage"? | ||
|
||
There. Please help me decide this. I'll delete this comment afterwards --> |
I think we should have a separate page that, rather than "External Services", says something like "Unsupported Workflows".
For example, S3 storage support technically exists, but is a. untested, and b. unused by Sentry internally. So there's no real pressure ensuring that it stays functional over time. Ultimately, what we have at the moment is a (very possibly bit-rotted) thin wrapper around Django's FileStore capabilities. We do not want to indicate to users that it is something we'll offer support for, because realistically we can't offer very good support for it, and folks will be left disappointed.
For this specific doc, we need to be very clear that this is provided as a rough best effort template, and that we offer very limited support for it.
> because realistically we can't offer very good support for it, and folks will be left disappointed.
I know this, but I think the current S3 support is good enough for selfhosted. Users can always pull out their own Django plugins though. Like @stayallive's S3 Nodestore plugin https://github.com/stayallive/sentry-nodestore-s3
> we need to be very clear that this is provided as a rough best effort template
I agree. Need to take some time to come up with good enough copywriting for this lol.
After changing configuration files, re-run the <code>./install.sh</code> script, to rebuild and restart the containers. See the <Link to="/self-hosted/#configuration">configuration section</Link> for more information. | ||
</Alert> | ||
|
||
<!-- Should we add a description about what "external storage" is? --> |
What exactly do you mean by "external storage"? Does that essentially mean "storage supplied by a cloud services provider like AWS/GCP/Azure"?
I think it's something that's not strictly on the same filesystem as the sentry self-hosted instance. But it also excludes setups where you use something like a NAS or an external bind-mount to store Sentry data. For files, it's the blob storage provided by each cloud provider. For databases, it's an external database that's either managed or unmanaged, but it should be separate from the sentry self-hosted instance. Do you have any suggestions on how to phrase this better?
|
||
<!-- Should we add a description about what "external storage" is? --> | ||
|
||
## Filestore |
nit:
## Filestore | |
## Django Filestore |
Sentry (confusingly) maintains a separate service called `Filestore`, which acts as an intermediate layer in front of e.g. GCP, though we don't really recommend this for self-hosted use.
|
||
<!-- Hello! If you're reading this, you're in luck because I can't decide whether to make.. wait let me copy the text from Discord. | ||
|
||
I got some time before Monday to write up some docs about setting up an S3 storage for selfhosted instance, but I can't decide whether I should put it under a big "External Services" page, in which people can include external postgres, external redis, and that kind of things; or should I put it under a page called "External Storage"? |
I would call the proposed page "Integrating with Major Cloud Providers" or something similar, just to make it clear that we are specifically referring to GCS/AWS/Azure.
|
||
Additional environment variables should be provided: | ||
- `AWS_ACCESS_KEY=foobar` | ||
- `AWS_SECRET_KEY=foobar` |
nit (here and elsewhere): change the value from `foobar` to something else, to make clear that the two keys will be different in practice. I would suggest something like `your_secret_key` or similar.
- `disableSSL`: A value of "true" disables SSL when sending requests. | ||
- `s3ForcePathStyle`: A value of "true" forces the request to use path-style addressing. | ||
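For illustration only, here is a minimal Python sketch of how the two query parameters above could be attached to a bucket URL. The `s3://` scheme is an assumption made by analogy with the `gs://my-bucket` example elsewhere in this PR, and `build_s3_bucket_url` is a hypothetical helper, not part of Sentry or vroom.

```python
from urllib.parse import urlencode


def build_s3_bucket_url(bucket, disable_ssl=False, force_path_style=False):
    """Build a bucket URL carrying the two query parameters described above.

    Assumption: the bucket is addressed with an s3:// URL, mirroring the
    gs://my-bucket form shown for Google Cloud Storage in this PR.
    """
    params = {}
    if disable_ssl:
        # "true" disables SSL when sending requests.
        params["disableSSL"] = "true"
    if force_path_style:
        # "true" forces path-style addressing.
        params["s3ForcePathStyle"] = "true"
    query = urlencode(params)
    return f"s3://{bucket}?{query}" if query else f"s3://{bucket}"


if __name__ == "__main__":
    print(build_s3_bucket_url("my-bucket", disable_ssl=True, force_path_style=True))
```

Both flags are usually only needed for S3-compatible services that are not AWS itself; whether a given service needs them should be checked against that service's documentation.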
|
||
### Azure Blob Storage backend |
My inclination is to remove Azure for now, and indicate somewhere in this document that there are no shims for it atm. Since we don't have a `Filestore` shim for Azure, in practice it will be very hard to run `vroom` on Azure, and users will likely work themselves into a corner if they try.
Co-authored-by: Alex Zaslavsky <[email protected]>
title: External Storage | ||
--- | ||
|
||
In some cases, storing Sentry data on-disk is not really something people can do. Sometimes, it's better if they can offload it into some bucket storage (like AWS S3 or Google Cloud Storage). |
This seems a bit confusing, but I think we can add another page after this one, called "Unsupported Workflows", where we spell out the kinds of things we can't offer support for (external Redis, external Postgres, installing third-party plugins to extend some stuff).
See @azaslavsky's comment here #1269 (comment)
### Google Cloud Storage backend | ||
|
||
You will need to set `GOOGLE_APPLICATION_CREDENTIALS` environment variable. For more information, refer to the [Google Cloud documentation for setting up authentication](https://cloud.google.com/storage/docs/reference/libraries#setting_up_authentication). | ||
|
||
```bash | ||
gs://my-bucket | ||
``` |
I haven't tested this. I only know how to configure this. Can you guys test this out on your dogfood instance?
Sorry for the ping, but I need your feedback on this. @hubertdeng123 @azaslavsky @stayallive
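For anyone trying the Google Cloud Storage configuration, a small pre-flight check like the sketch below might help rule out the most common mistake before restarting the containers. It only verifies that `GOOGLE_APPLICATION_CREDENTIALS` is set and points at an existing file; `check_gcs_credentials` is a hypothetical helper written for this comment, and it does not validate the key or contact Google Cloud.

```python
import os


def check_gcs_credentials(environ=None):
    """Return a problem description, or None if the variable looks usable.

    Hypothetical helper: it only checks that GOOGLE_APPLICATION_CREDENTIALS
    is set and names an existing file, nothing more.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"credentials file not found: {path}"
    return None


if __name__ == "__main__":
    problem = check_gcs_credentials()
    print(problem or "GOOGLE_APPLICATION_CREDENTIALS looks OK")
```

Passing this check does not mean the bucket is reachable, so an end-to-end test on a real instance is still needed.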