Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add container indexes #699

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions db/migrate/20230803225825_container_indexes.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
class ContainerIndexes < ActiveRecord::Migration[6.1]
def change
# container projects page
add_index :container_routes, :container_project_id
add_index :container_services, :container_project_id
# didn't show much, since only had 1 record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't help future readers, so let's remove the comment. It's fine in the commit message and/or PR comments though)

add_index :container_replicators, :container_project_id

add_index :container_groups, [:container_project_id, :id], :name => "index_container_groups_on_cpid_id_not_del", :where => "deleted_on IS NULL"
# container providers screen

# NOTE: this gets index only scans, but has a larger index size
add_index :taggings, [:taggable_type, :taggable_id, :tag_id], :name => "index_taggings_on_type_id_id"
remove_index :taggings, :column => [:taggable_id, :taggable_type], :name => "index_taggings_on_taggable_id_and_taggable_type"
Comment on lines +12 to +14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this at all. Typically pattern for indexes is to put the highest cardinality first, unless there are queries where the lower cardinality one can stand alone. Also I don't understand adding tag_id

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is looked up using taggable_type,taggable_id - the taggable_type is held constant and there are many taggable_id values.

Adding the tag_id to the index allows the tag_id to be pulled from the index, so it doesn't need to hit the data page to look up the tag_id.

We use tagging a lot, so I felt that skiping the data page would help us. But this does make the index larger, so maybe we don't want to go this route.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against adding the tag_id - I'm questiong why taggable_type and taggable_id order were reversed - that's the part I'm questioning.


# container Services
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistent caps

Suggested change
# container Services
# container services

add_index :container_groups_container_services, [:container_service_id, :container_group_id], :name => "index_container_groups_on_csi_cgi"

# container groups
add_index :containers, [:container_group_id, :state]

# container nodes
add_index :container_conditions, [:container_entity_type, :container_entity_id, :name], :name => "index_container_conditions_on_cet_ceid_name"
Comment on lines +22 to +23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect id first.

Copy link
Member Author

@kbrock kbrock Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The virtual attribute that this is targeting is:

  (SELECT container_conditions.status
  FROM container_conditions
  WHERE container_conditions.name = 'Ready'
  AND container_conditions.container_entity_id = container_nodes.id
  AND container_conditions.container_entity_type = 'ContainerNode')
  AS ready_condition_status,

When we have 2 indexes, one with id first and the other with type first, the analyzer prefers the one with type first. When I delete the type first index, it is still able to use the id first index with similar query statistics.

Having name, type, id or WHERE name = 'READY' would be faster, but this way, the index can also be used for all container_conditions, not just the ready ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have 2 indexes, one with id first and the other with type first, the analyzer prefers the one with type first. When I delete the type first index, it is still able to use the id first index with similar query statistics.

I'm not sure that's an entirely accurate assessment. Or rather, statistically, if they are the same, it will just pick one, but that doesn't then imply that type,id is better than id,type.

Based on all my readings, higher cardinality should always be first because it eliminates more rows and allows the index to execute faster, because the row elimination happens on the first pass of the index. So id,type is nearly always preferable over type,id. The only time this shouldn't be true is if we have queries that don't provider the _id in the query at all.

Having name, type, id ...

Not my suggestion - I think it should be id, type, name

end
end