Enhance Job Query Performance by Adding Indexed Composite Field for Efficient Lookups #45

Open
fermentfan opened this issue Jul 25, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

fermentfan commented Jul 25, 2024

I think it's crucial to be able to query jobs more efficiently. Here is an example scenario to illustrate the problem:

Example Situation:

A customer books an appointment (Booking entity) for a date 4 weeks in the future. I want to notify the customer 1 week before the appointment starts. To achieve this, I create a job to send the notification at the appropriate time.

If the organizer needs to cancel the appointment due to personal reasons, I would issue a refund and take necessary actions. Additionally, I need to cancel the previously created notification job.

Currently, I would do this by querying the metadata field in the MongoDB jobs collection. However, this field is not indexed, so MongoDB has to perform a full collection scan, which becomes slow and costly on a large collection. Indexing the whole metadata field is not a good fix either, since it could be too expensive in terms of performance and storage.

Proposed Solution:

In my experience with NoSQL databases, a common approach is to create a single field consisting of attributes that are commonly queried. For instance, we can concatenate the customer ID and booking ID into a single field:

facebook|1234567890;a1a2b8e2-11b1-48d0-adb7-d4647a3e424d

This composite field should be indexed to allow fast querying. Combined with the job name, this would enable very specific and efficient lookups with a single index.
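To make the proposal concrete, here is a minimal sketch of how such a key could be built and queried. Every name in it (`lookupKey`, `buildLookupKey`, `cancelFilter`, the job name) is hypothetical and not part of the current Pulse API; the index and filter are shown as plain objects in the shape the MongoDB driver accepts.

```typescript
// Hypothetical names throughout: nothing here is an existing Pulse API.

/** Join the commonly queried attributes into one string, as in the example above. */
function buildLookupKey(customerId: string, bookingId: string): string {
  // ';' is the separator, so it must not appear in the first part,
  // otherwise the key could not be split back unambiguously.
  if (customerId.includes(";")) {
    throw new Error("customerId must not contain ';'");
  }
  return `${customerId};${bookingId}`;
}

// Index specification: job name first, then the composite key.
const lookupIndex = { name: 1, lookupKey: 1 } as const;

/** Filter that matches exactly the jobs created for one booking. */
function cancelFilter(jobName: string, customerId: string, bookingId: string) {
  return { name: jobName, lookupKey: buildLookupKey(customerId, bookingId) };
}

const filter = cancelFilter(
  "send-booking-reminder",
  "facebook|1234567890",
  "a1a2b8e2-11b1-48d0-adb7-d4647a3e424d",
);
console.log(lookupIndex, filter);
```

With the MongoDB Node.js driver, the index object could be passed to `collection.createIndex(...)` and the filter to `collection.deleteMany(...)`; how Pulse itself would expose this is left open.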

Benefits:

  • Improved performance for querying and managing jobs.
  • Reduced operational costs by avoiding full collection scans.
  • Enhanced scalability for handling a larger volume of jobs.

Implementing this solution would greatly enhance the efficiency and scalability of Pulse.

Note: one could of course use composite indexes over multiple fields (compound indexes, in MongoDB terms) instead of this composite field, but I think the Pulse API might get too complicated if the whole field and indexing API had to be exposed through this dependency's configuration properties.
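For comparison, the multi-field alternative could look roughly like the sketch below. The field names are hypothetical, and `matchedPrefixLength` is only a simplified model of MongoDB's rule that a compound index serves queries on a leading prefix of its fields; it is not how the query planner is implemented.

```typescript
// Hypothetical field names; Pulse does not necessarily store these on a job.
const compoundIndexFields = ["name", "customerId", "bookingId"] as const;

/**
 * Count how many leading index fields a filter pins down.
 * Simplified model: stops at the first index field missing from the filter.
 */
function matchedPrefixLength(
  indexFields: readonly string[],
  filter: Record<string, unknown>,
): number {
  let matched = 0;
  for (const field of indexFields) {
    if (!(field in filter)) break;
    matched += 1;
  }
  return matched;
}

// Full match: all three index fields are usable.
console.log(matchedPrefixLength(compoundIndexFields, {
  name: "send-booking-reminder",
  customerId: "facebook|1234567890",
  bookingId: "a1a2b8e2-11b1-48d0-adb7-d4647a3e424d",
}));

// Skipping customerId leaves only the leading `name` field usable.
console.log(matchedPrefixLength(compoundIndexFields, {
  name: "send-booking-reminder",
  bookingId: "a1a2b8e2-11b1-48d0-adb7-d4647a3e424d",
}));
```

The trade-off is the one described in the note: separate fields allow partial lookups (for example, all jobs of one customer), but the library would have to let users declare which fields exist and how they are indexed.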

@code-xhyun code-xhyun self-assigned this Jul 26, 2024
@code-xhyun code-xhyun added the enhancement New feature or request label Jul 26, 2024
@code-xhyun
Contributor

@fermentfan
In the current structure, the best option is to set the disableAutoIndex option to false, which would add the metadata information to the index. Your suggestions seem quite good, but they appear difficult to implement immediately under our current standards.
