Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search / My Jetpack: Function get_raw_post_type_breakdown can render back end inaccessible #38130

Closed
stevedaily opened this issue Jun 30, 2024 · 10 comments
Assignees
Labels
[Feature] Search For all things related to Search [Focus] Performance [Package] My Jetpack [Plugin] Backup A plugin that allows users to save every change and get back online quickly with one-click restores. [Pri] High [Status] Priority Review Triggered The guild in charge of triage has been notified of this issue in Slack Triaged [Type] Bug When a feature is broken and / or not performing as intended

Comments

@stevedaily
Copy link

Impacted plugin

Backup

Quick summary

I have a very large site with millions of records in the _posts table. The admin area (admin.php) is now inaccessible due to this query: "SELECT post_type, post_status, COUNT( * ) AS num_posts FROM ***_posts WHERE post_password = '' GROUP BY post_type, post_status"

I traced this back to plugins\jetpack-backup\jetpack_vendor\automattic\jetpack-my-jetpack\src\products\class-search-stats.php,
Line 140
The function get_raw_post_type_breakdown() is the culprit.

Steps to reproduce

  1. Start with a site having a very large number of posts in the posts table
  2. Visit admin.php
  3. Notice the time taken to process the query, "SELECT post_type, post_status, COUNT( * ) AS num_posts FROM ***_posts WHERE post_password = '' GROUP BY post_type, post_status"

A clear and concise description of what you expected to happen.

Back end should load without much delay.

What actually happened

The back end timed out.

Impact

All

Available workarounds?

No but the platform is still usable

Platform (Simple and/or Atomic)

Self-hosted

Logs or notes

Jetpack Backup 2.6

@stevedaily stevedaily added [Type] Bug When a feature is broken and / or not performing as intended Needs triage Ticket needs to be triaged labels Jun 30, 2024
@github-actions github-actions bot added the [Status] Priority Review Triggered The guild in charge of triage has been notified of this issue in Slack label Jun 30, 2024
@github-actions github-actions bot added [Plugin] Backup A plugin that allows users to save every change and get back online quickly with one-click restores. [Pri] High labels Jun 30, 2024
@jeherve jeherve moved this from Needs Triage to Triaged in Automattic Prioritization: The One Board ™ Jul 1, 2024
@jeherve jeherve added the [Feature] Search For all things related to Search label Jul 1, 2024
@jeherve
Copy link
Member

jeherve commented Jul 1, 2024

cc @kangzj who initially worked on it in #26463.

@jeherve jeherve changed the title Function get_raw_post_type_breakdown can render back end inaccessible Search / My Jetpack: Function get_raw_post_type_breakdown can render back end inaccessible Jul 1, 2024
@lsl
Copy link
Contributor

lsl commented Aug 8, 2024

Bump, would be nice to fix this.

@kangzj
Copy link
Contributor

kangzj commented Aug 20, 2024

Thanks for the ping. Seems we need to improve performance of the query, or even is the call necessary? I can take a look but I don't think I can get onto it before September.

Looping in @Automattic/jetpack-data-team in case they have something in mind 🙂

@obenland
Copy link
Member

@kangzj It's the end of September, is there any news on this?

@gibrown
Copy link
Member

gibrown commented Oct 2, 2024

Hmmm... we'll try to take a look at if there is a better way. That is a pretty big query on a large site and we should be able to do something smarter.

@kangzj
Copy link
Contributor

kangzj commented Oct 2, 2024

thanks @gibrown

I don't see any easy fixes. The only solution I can think of is to do the counting in batches. 🧐

@donnapep
Copy link
Contributor

donnapep commented Oct 8, 2024

@gibrown or @kangzj Just wondering if it would make sense to assign this issue to either of you or someone else on your team? I think it would be good to have a DRI.

@oikeme
Copy link
Contributor

oikeme commented Nov 15, 2024

@kangzj , I've looked in to this and @gibrown and I have also discussed the issue as well. There isn't an easy fix to this as you've noted and batching could be helpful to solve this.

There are a few questions including:

  • The question of the necessity of the call is still there however the function is called so there are likely use cases. If these aren't absolutely necessary though can we get rid of this?
  • The WHERE onpost_password isn't great. Do we actually need this?

A few thoughts on different solutions are:

  • Add limits to ensure we don't run the query in a situation where it will cause the backend issues being experienced.
  • Remove the WHERE clause on post_password if it's not necessary.
  • Batch where appropriate for sites with high posts counts within the limit.
  • Offload this to an endpoint that can be handled on wpcom if the site posts count is above the limit. We already have get_stats_from_wpcom which implements some of this and can possibly be updated to include what's missing from get_raw_post_type_breakdown. If that's not the case then a new endpoint can be implemented for this.

Are there additional thoughts on the proposed solutions or options to consider?

@oikeme
Copy link
Contributor

oikeme commented Dec 18, 2024

@kangzj, @gibrown I've worked on and tested out a resolution to another similar issue with the remote request and batching and that went well so far. I've created this PR which implements something similar for the My Jetpack get_raw_post_type_breakdown issue and will make a remote request for sites with over a million posts. From testing internally this number looked like a high enough number to go with the remote query instead but if there are concerns let me know.

@oikeme
Copy link
Contributor

oikeme commented Dec 19, 2024

Resolved and merged in #40635

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Search For all things related to Search [Focus] Performance [Package] My Jetpack [Plugin] Backup A plugin that allows users to save every change and get back online quickly with one-click restores. [Pri] High [Status] Priority Review Triggered The guild in charge of triage has been notified of this issue in Slack Triaged [Type] Bug When a feature is broken and / or not performing as intended
Development

No branches or pull requests

8 participants