[feature request] ignore nested fields #89

Open
elacuesta opened this issue Jul 16, 2021 · 0 comments

elacuesta commented Jul 16, 2021

As discussed with @fcanobrash, I'm only opening this here so we can keep track of it.

The AUTOUNIT_DONT_TEST_OUTPUT_FIELDS setting cannot be used to ignore fields nested below the top level of an item. We considered JMESPath, but it only provides a way to query data, not to modify it. My current workaround is the following monkeypatch:

import operator
from contextlib import suppress
from functools import reduce

import scrapy_autounit
from scrapy_autounit.cassette import Cassette
from scrapy_autounit.player import Player


class IgnoreNestedFieldsPlayer(Player):
    """Patched player that allows nested fields to be ignored.

    Nested fields are given as dotted paths, e.g. "metadata.found_date".
    """

    @classmethod
    def from_fixture(cls, path):
        """This override is only needed while https://github.com/scrapinghub/scrapy-autounit/pull/88 is not merged"""
        cassette = Cassette.from_fixture(path)
        return cls(cassette)

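    def _filter_output_fields(self, item):
        dont_test = self.spider.settings.get("AUTOUNIT_DONT_TEST_OUTPUT_FIELDS", [])
        if not dont_test:
            dont_test = self.spider.settings.get("AUTOUNIT_SKIPPED_FIELDS", [])
        for entry in dont_test:
            *first_keys, last_key = entry.split(".")
            # Resolve every entry from the top-level item. Rebinding `item`
            # would make later entries start from a nested dict, and a missing
            # parent would cause a same-named top-level field to be dropped.
            with suppress(KeyError, TypeError, AttributeError):
                parent = reduce(operator.getitem, first_keys, item)
                parent.pop(last_key, None)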


scrapy_autounit.player.Player = IgnoreNestedFieldsPlayer

After this I'm able to do the following in settings.py:

AUTOUNIT_DONT_TEST_OUTPUT_FIELDS = ["metadata.found_date", "metadata.updated_date"]
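
For anyone who wants to try the dotted-path removal outside of scrapy-autounit, here is a minimal standalone sketch of the same idea. The helper name `drop_nested_fields` and the sample item are made up for illustration and are not part of the library; the sketch also assumes items behave like plain nested dicts.

```python
import operator
from collections.abc import MutableMapping
from functools import reduce


def drop_nested_fields(item, dotted_paths):
    """Remove each dotted path from ``item`` in place (hypothetical helper)."""
    for path in dotted_paths:
        *parent_keys, last_key = path.split(".")
        try:
            # Always start from the top-level item for each path.
            parent = reduce(operator.getitem, parent_keys, item)
        except (KeyError, TypeError):
            continue  # an intermediate key is missing or not subscriptable
        if isinstance(parent, MutableMapping):
            parent.pop(last_key, None)
    return item


# Made-up sample data, only to show the behaviour.
item = {
    "title": "example",
    "found_date": "top-level field, not listed below",
    "metadata": {
        "found_date": "2021-07-16",
        "updated_date": "2021-07-17",
        "source": "feed",
    },
    "stats": {"views": 10},
}
drop_nested_fields(
    item,
    ["metadata.found_date", "metadata.updated_date", "missing.found_date"],
)
print(item)
```

Each path is resolved from the top-level item, so entries with different parents do not interfere with each other, and a path whose parent is missing leaves a same-named top-level field untouched.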