Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in async removal after destroy #266

Closed
georgeclaghorn opened this issue Jun 19, 2023 · 1 comment · Fixed by #290
Closed

Race condition in async removal after destroy #266

georgeclaghorn opened this issue Jun 19, 2023 · 1 comment · Fixed by #290
Labels
bug Something isn't working

Comments

@georgeclaghorn
Copy link

georgeclaghorn commented Jun 19, 2023

With enqueue: true, an after_destroy callback is added which enqueues a MeiliSearch::Rails::MSJob intended to remove the destroyed record from the Meilisearch index:

elsif respond_to?(:after_destroy)
after_destroy { |searchable| searchable.ms_enqueue_remove_from_index!(ms_synchronous?) }

def ms_enqueue_remove_from_index!(synchronous)
if meilisearch_options[:enqueue]
unless self.class.send(:ms_indexing_disabled?, meilisearch_options)
meilisearch_options[:enqueue].call(self, true)
end
else
ms_remove_from_index!(synchronous || ms_synchronous?)
end
end

if options[:enqueue]
raise ArgumentError, 'Cannot use a enqueue if the `synchronous` option is set' if options[:synchronous]
proc = if options[:enqueue] == true
proc do |record, remove|
MSJob.perform_later(record, remove ? 'ms_remove_from_index!' : 'ms_index!')
end
elsif options[:enqueue].respond_to?(:call)
options[:enqueue]
elsif options[:enqueue].is_a?(Symbol)
proc { |record, remove| send(options[:enqueue], record, remove) }
else
raise ArgumentError, "Invalid `enqueue` option: #{options[:enqueue]}"
end
meilisearch_options[:enqueue] = proc do |record, remove|
proc.call(record, remove) unless ms_without_auto_index_scope
end
end

Line 372 above passes the destroyed record as an argument to the job. Active Job transparently serializes the record to a Global ID on enqueue and looks up the record from the Global ID when the job is performed.

The enqueued background job may be performed after the transaction wrapping the record’s destruction has committed. In this case, Active Job won’t be able to find the record in the DB from its Global ID, and the MSJob will fail with an ActiveJob::DeserializationError.

From a quick look, it seems that the plugin’s tests don’t catch this because they only assert that a job is enqueued with the correct arguments. They might have caught it if they exercised the end-to-end destroy→enqueue→perform flow using ActiveJob::TestHelper’s perform_enqueued_jobs or similar.

Suggested fix:

  1. Replace the after_destroy with an after_destroy_commit callback.
  2. Update the job to take a model and record ID as arguments. (Alternatively, add a new job to avoid breaking changes to the existing background job’s signature.)
  3. Attempt to look up the record. If it exists, index it. If it doesn't, delete it from the index.
    def perform(model, id)
      if record = model.unscoped.find_by(id: id)
        record.ms_index!
      else
        model.ms_remove_document_from_index_by_id!(id)
      end
    end

Further reading: elasticsearch-model’s documentation includes an example Sidekiq job implementing an alternative approach (taking an operation name argument rather than using the record’s existence to decide what to do).

History:

@brunoocasali
Copy link
Member

Hi @georgeclaghorn, thanks for your full debugging and plan to fix the bug!

I want to do it, but unfortunately, I can't now. The integrations team is changing how we work, so it could take a while until we have enough time to progress in issues like this.

Although, I would love to review a PR if you're willing to contribute! ❤️

@brunoocasali brunoocasali added the bug Something isn't working label Jun 20, 2023
@meili-bors meili-bors bot closed this as completed in 6d268c1 Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants