Jetpack Sync: make jetpack media extraction more consistent #35369

Merged
robfelty merged 7 commits into trunk from fix/make-jetpack-media-extraction-more-consistent
Feb 2, 2024

Conversation

robfelty
Contributor

Currently we extract image size info in some cases, but not in all. For example, the from_slideshow method returns image width and height, but from_gallery does not. These changes aim to make this consistent across all the ways we can extract image information.

Fixes #

Proposed changes:

This copies much of the logic around extracting image size info to all of the various methods in the Jetpack_PostImages class, and adds a number of additional tests for that class and the media extractor class.

Other information:

  • Have you written new tests for your changes, if applicable?
  • Have you checked the E2E test CI results, and verified that your changes do not break them?
  • Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with a script to run)?

Does this pull request change what data or activity we track or use?

No

Testing instructions:

Run the newly added unit tests

$ jetpack docker phpunit -- --filter=WP_Test_Jetpack_MediaExtractor
$ jetpack docker phpunit -- --filter=WP_Test_Jetpack_PostImages

@robfelty robfelty requested review from jeherve and trakos January 31, 2024 12:29
Contributor

github-actions bot commented Jan 31, 2024

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WordPress.com Simple site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin, and enable the fix/make-jetpack-media-extraction-more-consistent branch.

  • To test on Simple, run the following command on your sandbox:

    bin/jetpack-downloader test jetpack fix/make-jetpack-media-extraction-more-consistent
    

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

@github-actions github-actions bot added the [Plugin] Jetpack Issues about the Jetpack plugin. https://wordpress.org/plugins/jetpack/ label Jan 31, 2024
@robfelty robfelty requested a review from gibrown January 31, 2024 12:30
@robfelty robfelty requested a review from sotirispl January 31, 2024 12:30
Contributor

github-actions bot commented Jan 31, 2024

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Team Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for your cooperation 🤖


The e2e test report can be found here. Please note that it can take a few minutes after the e2e tests checks are complete for the report to be available.


Once your PR is ready for review, check one last time that all required checks appearing at the bottom of this PR are passing or skipped.
Then, add the "[Status] Needs Team Review" label and ask someone from your team to review the code. Once reviewed, it can then be merged.
If you need an extra review from someone familiar with the codebase, you can update the labels from "[Status] Needs Team Review" to "[Status] Needs Review", and in that case Jetpack Approvers will do a final review of your PR.


Jetpack plugin:

The Jetpack plugin has different release cadences depending on the platform:

  • WordPress.com Simple releases happen daily.
  • WoA releases happen weekly.
  • Releases to self-hosted sites happen monthly. The next release is scheduled for February 6, 2024 (scheduled code freeze on February 5, 2024).

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.


Backup, Boost, Search, Social, Starter Plugin, Protect, Videopress, Migration, Mu Wpcom, and Inspect plugins:

  • Next scheduled release: March 5, 2024.
  • Scheduled code freeze: February 26, 2024.

@github-actions github-actions bot added [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! [Action] Repo Gardening Github Action: manage PR and issues in your Open Source project [Block] AI Assistant [Block] Sharing Buttons [Block] Sharing Button [Boost Feature] Cache [Feature] Contact Form [Feature] Custom Content Types Custom post or content types (usually for testimonials and portfolios) and their settings. [Feature] Masterbar WordPress.com Toolbar and Dashboard customizations [Feature] Tiled Gallery A different way to display image galleries on your site, in different organizations and shapes. [Focus] Compatibility Ensuring our products play well with third-parties [JS Package] AI Client [JS Package] Boost Score Api [JS Package] Connection [JS Package] Components [JS Package] IDC [JS Package] Image Guide [JS Package] Licensing [JS Package] Partner Coupon [JS Package] Publicize Components [JS Package] Shared Extension Utils [mu wpcom Feature] Launchpad labels Jan 31, 2024
@github-actions github-actions bot added [Plugin] Protect A plugin with features to protect a site: brute force protection, security scanning, and a WAF. [Plugin] Search A plugin to add an instant search modal to your site to help visitors find content faster. [Plugin] Social Issues about the Jetpack Social plugin [Plugin] Starter Plugin [Plugin] VideoPress A standalone plugin to add high-quality VideoPress videos to your site. Actions GitHub actions used to automate some of the work around releases and repository management Admin Page React-powered dashboard under the Jetpack menu Docs E2E Tests RNA labels Jan 31, 2024
@robfelty robfelty force-pushed the fix/make-jetpack-media-extraction-more-consistent branch from 24168f0 to 3644c95 Compare January 31, 2024 14:43
@robfelty
Contributor Author

A small Elasticsearch change is required before this can land: D136604-code

} else {
$ret_images[] = $image['src'];
$ret_image = $image['src'];
@trakos trakos Jan 31, 2024

I think this would overwrite src_width and src_height when alt_text isn't set. Is this intentional?

Contributor Author

Great question! Yes, this is intentional. Because of the way $extract_alt_text was implemented, when it is set to true the results are returned as an array of associative arrays; when it is set to false, a simple array of URLs is returned. I am fairly certain that only ES indexing uses the $extract_alt_text option, so I decided to expand on that in order to keep other uses backwards compatible.
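
The two return shapes described above can be illustrated with a minimal sketch. This is not the Jetpack implementation: the function name `sketch_extract_images` and the input array layout are hypothetical, and the output keys other than `src`, `alt_text`, `src_width`, and `src_height` (which appear in this conversation) are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper, not part of Jetpack: shows how one flag can switch
 * between a flat list of URLs and a list of associative arrays.
 *
 * @param array $images           Hypothetical input; each item has at least a 'src' key.
 * @param bool  $extract_alt_text When true, return detailed entries instead of bare URLs.
 * @return array
 */
function sketch_extract_images( array $images, bool $extract_alt_text = false ): array {
	$ret_images = array();
	foreach ( $images as $image ) {
		if ( $extract_alt_text ) {
			// Detailed shape: one associative array per image.
			$ret_images[] = array(
				'url'        => $image['src'], // 'url' is an assumed key name.
				'alt_text'   => $image['alt_text'] ?? '',
				'src_width'  => $image['src_width'] ?? null,
				'src_height' => $image['src_height'] ?? null,
			);
		} else {
			// Backwards-compatible shape: a bare URL string.
			$ret_images[] = $image['src'];
		}
	}
	return $ret_images;
}

$images = array(
	array(
		'src'        => 'https://example.com/a.jpg',
		'alt_text'   => 'A',
		'src_width'  => 100,
		'src_height' => 50,
	),
);

$urls_only = sketch_extract_images( $images );       // Flat array of URL strings.
$detailed  = sketch_extract_images( $images, true ); // Array of associative arrays.
```

Callers that never pass the flag keep receiving plain strings, which is the backwards-compatibility argument made in the reply above.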


@trakos trakos left a comment

Tests pass and the code looks right to me, though I'm not as familiar with the Jetpack codebase.

I have added a few minor notes on a couple of details.

@@ -543,14 +549,21 @@ public static function get_images_from_html( $html, $images_already_extracted, $
}

if ( ! in_array( $queryless, $image_list, true ) ) {

Noting that this check will probably work less often now since $image_list will store strings less often.

Contributor Author

That's a good point. I added another test (and updated one) to make this more explicit. If an image is used once in a gallery and again directly in the content, but with different sizes or alt_text, I think it makes sense to include both. If all the fields are the same, we should not duplicate it. It turns out that we were not deduplicating links either. I found a TODO for that in the code, so I implemented it.
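
The deduplication rule described above (keep entries that differ in any field, skip entries that are identical) can be sketched as follows. The helper name `sketch_add_unique_image` and the `url` key are hypothetical, and the actual change in this PR may compare entries differently; the sketch only relies on the documented behavior of PHP's strict `in_array()`.

```php
<?php
/**
 * Hypothetical helper, not a function from the Jetpack codebase:
 * appends $candidate only if an identical entry is not already in the list.
 */
function sketch_add_unique_image( array $image_list, $candidate ): array {
	// With strict mode, arrays are compared with ===, so every key, value,
	// and type must match (in the same order) for an entry to count as a duplicate.
	if ( ! in_array( $candidate, $image_list, true ) ) {
		$image_list[] = $candidate;
	}
	return $image_list;
}

$gallery = array( 'url' => 'https://example.com/a.jpg', 'src_width' => 300, 'src_height' => 200 );
$inline  = array( 'url' => 'https://example.com/a.jpg', 'src_width' => 1024, 'src_height' => 683 );

$list = array();
$list = sketch_add_unique_image( $list, $gallery );
$list = sketch_add_unique_image( $list, $inline );  // Same URL, different size: kept.
$list = sketch_add_unique_image( $list, $gallery ); // All fields identical: skipped.

// A bare URL string is never strictly equal to an array entry, so a
// string-based check against this list does not find the image.
$found_by_url = in_array( 'https://example.com/a.jpg', $list, true );
```

The last line illustrates the reviewer's note: once the list holds associative arrays, a check that passes a URL string no longer detects duplicates on its own.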

projects/plugins/jetpack/class.jetpack-post-images.php (outdated, resolved)
@sotirispl sotirispl left a comment

Tests are passing and overall changes look good to me.

@robfelty robfelty merged commit 0a65186 into trunk Feb 2, 2024
54 checks passed
@robfelty robfelty deleted the fix/make-jetpack-media-extraction-more-consistent branch February 2, 2024 14:28
@github-actions github-actions bot removed [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! [Status] Needs Team Review labels Feb 2, 2024
spsiddarthan pushed a commit that referenced this pull request Feb 15, 2024
* first commit with some new tests and made media extractor more consistent

* fixed and added more tests

* added more tests

* changelog

* removed debugging code

* changes from code review. Added another test. Also added link deduping

* fixed conditional