Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

control-plane: use btree index for log_lines #1197

Merged
merged 1 commit into from
Sep 25, 2023
Merged

Conversation

psFried
Copy link
Member

@psFried psFried commented Sep 25, 2023

Description:

Replaces the current BRIN index on the internal.log_lines table with a btree index. The BRIN index would require queries to use the logged_at timestamp in the where clause in order for it to be effective. Neither the UI or flowctl do that currently. The btree index works well when only filtering on token, so this changes to just using that.

Notes for reviewers:

The main issue this is seeking to address is that clients are getting HTTP 500 responses when they try to fetch log lines during a publication or discover. Another option that might also work to address this would be to add where logged_at > (now() - '1h'::interval) to the query in the view_logs RPC. I decided to use the btree index instead because it felt somewhat less hacky.

BUT, we may still want a brin index on (only) logged_at, so that we can use it in a pg_cron job to clean out old logs. I've opted to ignore this for the time being, and address that in a future PR along with the cron job to delete the logs.


This change is Reviewable

Replaces the current BRIN index on the `internal.log_lines` table with a btree
index.  The BRIN index would require queries to use the `logged_at` timestamp
in the where clause in order for it to be effective.  Neither the UI or flowctl
do that currently. The btree index works well when only filtering on token, so
this changes to just using that.
@psFried psFried requested a review from jshearer September 25, 2023 16:33
Copy link
Contributor

@jshearer jshearer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading your testing of this, it looks like it should help query perf in the short/medium term, and if we want the BRIN index back we can re-create it. LGTM 👍

@psFried psFried merged commit 1cf02e0 into master Sep 25, 2023
3 checks passed
@psFried psFried deleted the phil/logs-index branch September 25, 2023 19:59
@jgraettinger
Copy link
Member

jgraettinger commented Sep 25, 2023

I see this is applied already, and it's not a big deal, but I would consider this a bug in usage of the table.

In any operational / live context, we're polling the table with a tight interval between SELECT's, and there's always a very recent timestamp for which you want logs after: either the timestamp at which the publication was created, or the biggest logged_at timestamp seen so far. The agent is designed such that logged_at is strictly ascending for exactly this purpose (you don't have to deal with late-arriving logs).

The BRIN index lets you very quickly discard pretty much the entire table, evaluating only the really recent stuff logged after the cursor logged_at.

@jgraettinger
Copy link
Member

This also explains a bug I've noticed in flowctl: each poll of logs is re-printing all publication logs, rather than just printing incremental new logs. Looks like I forgot to push down a gt on logged_at right here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants