Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AutoExtractProvider now support the new scrapy-poet cache interface #31

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

ivanprado
Copy link
Contributor

This requires this change from scrapy-poet: scrapinghub/scrapy-poet#55

Additionally, the preferred pageType for HTML requests (AutoExtractProductData)
is now chosen always if listed as dependency instead of just choosing
the first dependency pageType to request the HTML

Todo:

  • Change the scrapy-poet dependency to Pipy once released

Additionaly, the preferred pageType for HTML requests (``AutoExtractProductData``)
is now chosen always if listed as dependency instead of just choosing
the first dependency ``pageType`` to request the HTML
@codecov
Copy link

codecov bot commented Dec 3, 2021

Codecov Report

Merging #31 (baf4a5d) into master (5baa342) will increase coverage by 0.57%.
The diff coverage is 97.56%.

❗ Current head baf4a5d differs from pull request most recent head 0f0dd55. Consider uploading reports for the commit 0f0dd55 to get more accurate results

@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
+ Coverage   85.24%   85.82%   +0.57%     
==========================================
  Files           9        9              
  Lines         488      515      +27     
==========================================
+ Hits          416      442      +26     
- Misses         72       73       +1     
Impacted Files Coverage Δ
scrapy_autoextract/providers.py 93.29% <97.56%> (+0.53%) ⬆️

@ivanprado ivanprado requested review from kmike and sortafreel December 3, 2021 12:16
tests/test_providers.py Outdated Show resolved Hide resolved
scrapy_autoextract/providers.py Outdated Show resolved Hide resolved
scrapy_autoextract/providers.py Show resolved Hide resolved
CHANGES.rst Show resolved Hide resolved
Copy link
Contributor

@sortafreel sortafreel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job Ivan 👍

@@ -30,6 +32,9 @@
_TASK_MANAGER = "_autoextract_task_manager"


AEDataType = TypeVar('AEDataType', bound=AutoExtractData, covariant=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate a little bit, please, why you created a covariant TypeVar here instead of just using AutoExtractData in typing? Is it because it could be product data, article data, and so on, so it won't be an invariant?

Copy link
Contributor

@BurnzZ BurnzZ Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's precisely right. Using covariant=True here would cover any subtypes derived from AutoExtractData.

setup.py Outdated
@@ -29,7 +29,7 @@ def get_version():
install_requires=[
'autoextract-poet>=0.3.0',
'zyte-autoextract>=0.7.0',
'scrapy-poet>=0.2.0',
'scrapy-poet @ git+https://[email protected]/scrapinghub/scrapy-poet@injector_record_replay_native#egg=scrapy-poet',
Copy link
Contributor

@BurnzZ BurnzZ Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to remove this after a new release of scrapy-poet to PyPI is done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants