AutoExtractProvider now supports the new scrapy-poet cache interface #31

ivanprado · 2021-12-03T12:08:05Z

This requires this change from scrapy-poet: scrapinghub/scrapy-poet#55

Additionally, the preferred pageType for HTML requests (AutoExtractProductData)
is now always chosen if it is listed as a dependency, instead of just using
the first dependency's pageType to request the HTML.
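For reviewers, the selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in `providers.py`: the function name `pick_html_page_type`, the constant `PREFERRED_HTML_PAGE_TYPE`, and the use of plain strings for page types are all assumptions made for the example.

```python
from typing import Sequence

# Assumption for illustration: page types are plain strings and "product"
# stands for the pageType associated with AutoExtractProductData.
PREFERRED_HTML_PAGE_TYPE = "product"


def pick_html_page_type(
    dependency_page_types: Sequence[str],
    preferred: str = PREFERRED_HTML_PAGE_TYPE,
) -> str:
    """Return the pageType that should be used to request the HTML.

    The preferred pageType wins whenever it appears among the dependencies;
    otherwise fall back to the first dependency (the previous behaviour).
    """
    if not dependency_page_types:
        raise ValueError("at least one dependency pageType is required")
    if preferred in dependency_page_types:
        return preferred
    return dependency_page_types[0]


chosen_with_preferred = pick_html_page_type(["article", "product"])
chosen_without_preferred = pick_html_page_type(["article", "productList"])
```

With the old behaviour the first call would have picked `"article"` simply because it came first; under the rule sketched here the preferred type is picked regardless of its position.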

Todo:

Change the scrapy-poet dependency to PyPI once released

codecov · 2021-12-03T12:09:06Z

Codecov Report

Merging #31 (baf4a5d) into master (5baa342) will increase coverage by 0.57%.
The diff coverage is 97.56%.

❗ Current head baf4a5d differs from pull request most recent head 0f0dd55. Consider uploading reports for the commit 0f0dd55 to get more accurate results

@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
+ Coverage   85.24%   85.82%   +0.57%     
==========================================
  Files           9        9              
  Lines         488      515      +27     
==========================================
+ Hits          416      442      +26     
- Misses         72       73       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| scrapy_autoextract/providers.py | `93.29% <97.56%> (+0.53%)` | ⬆️ |

sortafreel

Great job Ivan 👍

sortafreel · 2021-12-10T19:45:20Z

scrapy_autoextract/providers.py

@@ -30,6 +32,9 @@
 _TASK_MANAGER = "_autoextract_task_manager"


+AEDataType = TypeVar('AEDataType', bound=AutoExtractData, covariant=True)


Could you elaborate a bit on why you created a covariant TypeVar here instead of just using AutoExtractData in the type hints? Is it because the data could be product data, article data, and so on, so the type shouldn't be invariant?

Yep, that's precisely right. Using covariant=True here would cover any subtypes derived from AutoExtractData.
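To make the point above concrete, here is a small sketch of what a bound, covariant TypeVar allows. The classes below are hypothetical stand-ins (the real `AutoExtractData` types come from autoextract-poet), and the `Result` wrapper and `describe` function are invented for the example; how `AEDataType` is actually used in `providers.py` is not shown in this thread.

```python
from typing import Generic, TypeVar


# Hypothetical stand-ins for the real autoextract-poet classes.
class AutoExtractData:
    pass


class AutoExtractProductData(AutoExtractData):
    pass


AEDataType = TypeVar('AEDataType', bound=AutoExtractData, covariant=True)


class Result(Generic[AEDataType]):
    """Read-only wrapper (hypothetical), so covariance is sound."""

    def __init__(self, data: AEDataType) -> None:
        self._data = data

    def get(self) -> AEDataType:
        return self._data


def describe(result: Result[AutoExtractData]) -> str:
    # Accepts a Result of any AutoExtractData subtype thanks to covariance.
    return type(result.get()).__name__


product_result: Result[AutoExtractProductData] = Result(AutoExtractProductData())
described = describe(product_result)
```

With an invariant TypeVar, a type checker would reject passing `Result[AutoExtractProductData]` where `Result[AutoExtractData]` is expected. The `bound=` part restricts the variable to `AutoExtractData` subclasses, and `covariant=True` lets the generic type follow the subclass relationship.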

BurnzZ · 2021-12-22T03:24:27Z

setup.py

@@ -29,7 +29,7 @@ def get_version():
    install_requires=[
        'autoextract-poet>=0.3.0',
        'zyte-autoextract>=0.7.0',
-        'scrapy-poet>=0.2.0',
+        'scrapy-poet @ git+https://[email protected]/scrapinghub/scrapy-poet@injector_record_replay_native#egg=scrapy-poet',


Reminder to remove this after a new release of scrapy-poet to PyPI is done.

AutoExtractProvider now supports the new scrapy-poet cache interface

d82fa27

Additionally, the preferred pageType for HTML requests (``AutoExtractProductData``) is now always chosen if it is listed as a dependency, instead of just using the first dependency's ``pageType`` to request the HTML.

ivanprado added 2 commits December 3, 2021 13:11

Fix py36

332af78

Fix py36

15a8237

ivanprado requested review from kmike and sortafreel December 3, 2021 12:16

BurnzZ reviewed Dec 10, 2021

sortafreel approved these changes Dec 10, 2021

BurnzZ reviewed Dec 22, 2021

BurnzZ and others added 4 commits December 22, 2021 15:57

fix annotations, typos, and dep versioning

acc66ca

use the proper Cache Mixin from scrapy-poet

1e2be8a

Merge branch 'master' into cacheable_provider

def4914

small import fix

0f0dd55
