-
Notifications
You must be signed in to change notification settings - Fork 801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jetpack Sync: make jetpack media extraction more consistent #35369
Jetpack Sync: make jetpack media extraction more consistent #35369
Conversation
Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.
Interested in more tips and information?
|
Thank you for your PR! When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:
This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖 The e2e test report can be found here. Please note that it can take a few minutes after the e2e tests checks are complete for the report to be available. Once your PR is ready for review, check one last time that all required checks appearing at the bottom of this PR are passing or skipped. Jetpack plugin: The Jetpack plugin has different release cadences depending on the platform:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Backup plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Boost plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Search plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Social plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Starter Plugin plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Protect plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Videopress plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Migration plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Mu Wpcom plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. Inspect plugin:
If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack. |
24168f0
to
3644c95
Compare
There is a small elasticsearch change which is required before this can land D136604-code |
} else { | ||
$ret_images[] = $image['src']; | ||
$ret_image = $image['src']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would overwrite src_width
and src_height
when alt_text
isn't set, is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! yes, because the way that $extract_alt_text
was implemented, when it is set to true, then the results are returned as an array of associative arrays. When it is set to false, then a simple array of urls is returned. I am fairly certain that only ES indexing is using the $extract_alt_text
option, so I decided to just expand on that, in order to keep other uses backwards compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests pass and code looks right to me, though I'm not as familiar with Jetpack codebase.
I have added a few minor notes on a couple of details.
@@ -543,14 +549,21 @@ public static function get_images_from_html( $html, $images_already_extracted, $ | |||
} | |||
|
|||
if ( ! in_array( $queryless, $image_list, true ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that this check will probably work less often now since $image_list
will store strings less often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I added another test (and updated one) to make this a bit more explicit. In the case that an image is used once in a gallery and also simply in the content, but with different sizes or alt_text, I think it makes sense to include both. In the case that all the fields are the same, then we should not duplicate. It turns out that we were not uniquing links either. I found a TODO in the code, which I did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are passing and overall changes look good to me.
* first commit with some new tests and made media extractor more consistent * fixed and added more tests * added more tests * changelog * removed debugging code * changes from code review. Added another test. Also added link deduping * fixed conditional
Currently we extract image size info in some cases, but not in all. For example, the
from_slideshow
method will return image width and height, but notfrom_gallery
. These changes aim to make this more consistent across all the ways we can extract image information.Fixes #
Proposed changes:
This copies much of the logic around extracting image size info to all of the various methods in the
Jetpack_PostImages
class, and adds a number of additional tests for that class and the media extractor class.Other information:
Does this pull request change what data or activity we track or use?
No
Testing instructions:
Run the newly added unit tests