
Killed Integrity Urlkey process in console #13

Open

gifrancohe opened this issue Oct 17, 2020 · 7 comments

Comments

@gifrancohe

After the process finished and the percentage reached 100%, the console printed "Killed" and the backend showed the message "We are already refreshing the product url key's, just have a little patience". But nothing happens, and when I try to run the process again from the console, an exception is displayed with the same message: "An unexpected exception occured: 'We are already refreshing the product url key's, just have a little patience'"

I hope you can help me. Thank you very much.

@hostep
Member

hostep commented Oct 19, 2020

Hmm, that's strange. Does the 'Killed' message come from Magento, or from your operating system? Maybe you are running up against the limits of your system and the process is getting killed by the operating system?

You can always specify the --force flag to restart the process in case the previous run got stuck. See the help section:

$ bin/magento catalog:product:integrity:urlkey --help
Description:
  Checks data integrity of the values of the url_key product attribute.

Usage:
  catalog:product:integrity:urlkey [options]

Options:
  -f, --force           Force the command to run, even if it is already marked as already running
  -h, --help            Display this help message
  -q, --quiet           Do not output any message
  -V, --version         Display this application version
      --ansi            Force ANSI output
      --no-ansi         Disable ANSI output
  -n, --no-interaction  Do not ask any interactive question
  -v|vv|vvv, --verbose  Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

@gifrancohe
Author

Apparently it's my operating system, but I don't know why it's killing the process. I tried executing the command with the memory_limit parameter and the -f flag, but the same thing happens. My store has around 390,000 products; I think it's because of that volume of URLs to process.

The other commands work perfectly. Here is an example of how I am executing the command:

$ php -d memory_limit=6G bin/magento catalog:product:integrity:urlkey -f

Thanks for the help.

@hostep
Member

hostep commented Oct 20, 2020

Thanks for the feedback!

I've only tested this module with a collection of max ~40.000 products if I remember correctly, so your 390.000 products might indeed need a lot of memory and might trigger the Out Of Memory Killer on your OS.

I'll take a stab at trying out this approach later this week, maybe it will help in reducing the amount of memory needed: https://www.matheusgontijo.com/2018/02/10/magento-2-working-with-large-collections-php-fatal-error-allowed-memory-size-of-xxxx-bytes-exhausted/
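For reference, the technique in that article boils down to walking the collection's underlying SELECT with Magento's core Iterator, so rows are fetched and processed one at a time instead of hydrating every product model into memory. A minimal sketch, assuming the goal is checking url_key values (Iterator::walk(), joinAttribute() and the collection factory are real Magento 2 APIs; the class name and the callback logic are only illustrative):

<?php

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;
use Magento\Framework\Model\ResourceModel\Iterator;

class UrlKeyWalker
{
    private $collectionFactory;
    private $iterator;

    public function __construct(CollectionFactory $collectionFactory, Iterator $iterator)
    {
        $this->collectionFactory = $collectionFactory;
        $this->iterator = $iterator;
    }

    public function walkUrlKeys(): void
    {
        $collection = $this->collectionFactory->create();

        // joinAttribute() forces url_key into the SELECT itself; walk()
        // operates on the raw query, not on loaded product models.
        $collection->joinAttribute('url_key', 'catalog_product/url_key', 'entity_id', null, 'left');

        // walk() fetches one row at a time and hands it to the callback,
        // so memory stays flat regardless of catalog size.
        $this->iterator->walk(
            $collection->getSelect(),
            [[$this, 'checkRow']]
        );
    }

    public function checkRow(array $args): void
    {
        $row = $args['row']; // one plain DB row, e.g. ['entity_id' => ..., 'url_key' => ...]
        // ... validate $row['url_key'] here (illustrative only) ...
    }
}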

@hostep
Member

hostep commented Oct 21, 2020

Hi @gifrancohe!

I've made a first attempt at reducing the memory usage for generating product url key problems on the memory-optimisations branch.

You can run the following composer command to get that experimental branch:

composer require baldwin/magento2-module-url-data-integrity-checker:dev-memory-optimisations

From what I've seen, this:

  • reduces memory usage significantly
  • while keeping more or less the same execution time (which is slow, but that's something to be solved on another day)
  • and keeping the same amount of database queries (in theory, not actually tested)

Could you maybe test this out a bit and give me some feedback?

Thanks!

@gifrancohe
Author

Hi @hostep.

I apologize for not answering sooner. I tried the test with the new branch, but the result was the same. I also tried increasing the resources of my test server, going from 2 CPUs and 4 GB of RAM to 4 CPUs and 16 GB of RAM, but the same thing keeps happening: after the progress bar reaches 100%, the console processes for a few minutes until the process is killed.

@hostep
Member

hostep commented Oct 28, 2020

Okay, that's very useful information!

So this means the gathering of information now works without memory problems; it's writing that data out to storage (currently one big JSON file) that is problematic. So that's the next thing which needs to be optimised.
I also heard from a colleague of mine that loading in the JSON file and displaying that data in a grid in the Magento backend can crash if the data is too big, so that's also something we need to take a look at.
I'm thinking of storing the data in the database instead of one big JSON file; that can probably be done more efficiently in terms of memory usage.
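To illustrate the streaming idea in isolation (this is a generic sketch, not the module's actual storage code): append one JSON document per line as results are produced, and read them back one line at a time, so neither side ever holds the full result set in memory.

<?php

// $problems stands in for whatever produces the checker's results;
// a generator keeps the write side from accumulating anything.
$problems = (function () {
    yield ['sku' => 'ABC-1', 'storeId' => 1, 'problem' => 'duplicated url_key'];
    yield ['sku' => 'ABC-2', 'storeId' => 1, 'problem' => 'empty url_key'];
})();

// Write side: one json_encode() per record, flushed straight to disk.
$out = fopen('var/urlkey-problems.ndjson', 'wb');
foreach ($problems as $problem) {
    fwrite($out, json_encode($problem) . PHP_EOL);
}
fclose($out);

// Read side: a grid or report can consume one decoded line at a time.
$in = fopen('var/urlkey-problems.ndjson', 'rb');
while (($line = fgets($in)) !== false) {
    $row = json_decode($line, true);
    // ... hand a single $row to whatever renders or stores it ...
}
fclose($in);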

It might take me a while (a couple of weeks) to rewrite this part of the tool though. But it sounds like we need to take care of this for shops with a lot of products that have a lot of URL problems.

Thanks for the feedback!

@DominicWatts
Contributor

@gifrancohe

Just read your issue regarding large collections and memory usage.

I haven't had a chance to play with such a large collection just yet; the largest so far is around 80K. As a proof of concept, I produced a simple product CSV exporter after running into issues with large collections myself with a couple of different feed generator extensions on various Magento versions. Using the iterator massively slowed down the process I was working on, but it did get the job done.

My proof-of-concept extension is tagged: 1.0.1 uses a standard collection, 1.0.2 uses the iterator.

However, both dump to a CSV file and have the potential to use lots of memory. I also print out some stats.

I'm curious how the extension will fare with such a large collection.

Extension: https://github.com/DominicWatts/ProductCsvExport/

https://github.com/DominicWatts/ProductCsvExport/blob/1.0.1/Console/Command/Product.php
vs
https://github.com/DominicWatts/ProductCsvExport/blob/1.0.2/Console/Command/Product.php

I'd love to hear how it performs on such a large collection.
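For reference, a third option sits between those two tags: page through a standard collection in fixed-size batches and clear each page after processing. setPageSize(), setCurPage(), getLastPageNumber() and clear() are real Magento 2 collection APIs; the function name, batch size and per-product logic below are only a sketch.

<?php

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

function exportInBatches(CollectionFactory $collectionFactory, int $pageSize = 1000): void
{
    $collection = $collectionFactory->create();
    $collection->addAttributeToSelect(['sku', 'url_key']);
    $collection->setPageSize($pageSize);

    $lastPage = $collection->getLastPageNumber();
    for ($page = 1; $page <= $lastPage; $page++) {
        $collection->setCurPage($page);
        $collection->load();

        foreach ($collection as $product) {
            // ... write one CSV row per product here (illustrative only) ...
        }

        // Drop the loaded models so the next page starts from a clean slate.
        $collection->clear();
    }
}

This trades some of the iterator's memory safety for speed: only one page of models is in memory at a time, while each batch still benefits from collection-level loading.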
