Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subject.get_best_query's with_filenames argument is not honored #1673

Closed
gotmax23 opened this issue Aug 27, 2024 · 5 comments
Closed

Subject.get_best_query's with_filenames argument is not honored #1673

gotmax23 opened this issue Aug 27, 2024 · 5 comments
Assignees
Labels
Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take

Comments

@gotmax23
Copy link

Since the latest libdnf update, hawkey.Subject.get_best_query does not honor the with_filenames argument. Filenames are still matched even if with_filenames is explicitly set to False.

import dnf
import dnf.subject
import fedrq.config
import hawkey


def get_base() -> dnf.Base:
    """
    Use fedrq to generate a Base object with filelists loaded
.
    This can also be done manually.
    """
    base = (
        fedrq.config.get_config(backend="dnf", load_filelists="always")
        .get_rq(branch="rawhide")
        .base
    )
    assert "filelists" in base.conf.optional_metadata_types
    assert isinstance(base, dnf.Base)
    return base


def main() -> None:
    base = get_base()
    print("Hawkey version:", hawkey.VERSION)
    print("dnf version:", dnf.VERSION)
    path = "/usr/share/ansible"
    # Filenames are not enabled for the query
    subject = dnf.subject.Subject(path).get_best_query(base.sack, with_filenames=False)
    # Query length should be 0
    print("Query length:", len(subject))
    # for package in subject:
    #     print(package)


if __name__ == "__main__":
    main()
$ python with_filelists.py
Hawkey version: 0.73.0
dnf version: 4.19.0
Query length: 0
$ python with_filelists.py 
Hawkey version: 0.73.3
dnf version: 4.21.1
Query length: 3

This is breaking fedrq's unit tests that check that our wrapper of the Subject API returns the correct results when with_filenames is disabled.

@m-blaha
Copy link
Member

m-blaha commented Aug 28, 2024

After a bit of experimenting - this regression is a side effect of the patch #1670.
The string in question /usr/share/ansible is found during search in provides

libdnf/libdnf/sack/query.cpp

Lines 2751 to 2757 in fbd3447

if (with_provides) {
queryUnion(origQuery);
addFilter(HY_PKG_PROVIDES, HY_GLOB, subject);
if (!empty()) {
return {true, std::unique_ptr<Nevra>()};
}
}

This causes the query is returned even before the string was searched in file lists.

It looks like POOL_FLAG_ADDFILEPROVIDESFILTERED causes that when filelists metadata are loaded, they are always added to provides. @kontura can you please take a look?

@kontura kontura self-assigned this Aug 28, 2024
@github-project-automation github-project-automation bot moved this to Backlog in DNF team Aug 30, 2024
@jan-kolarik jan-kolarik added the Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take label Aug 30, 2024
@jan-kolarik jan-kolarik moved this from Backlog to Todo in DNF team Aug 30, 2024
@mlschroe
Copy link
Contributor

The change in behavior with POOL_FLAG_ADDFILEPROVIDESFILTERED is unintended. I changed the way libsolv sets up the lazy fileprovides lookup so that it works like before. See openSUSE/libsolv@2150865

I don't know if this is relevant to the slowness issue.

mlschroe referenced this issue in openSUSE/libsolv Sep 19, 2024
Turning on POOL_FLAG_ADDFILEPROVIDESFILTERED made the lookup of
all file dependencies lazy, not only the ones required by some
dependency. This changes the way searching works and may also
slow down some use cases.

We now changed addfileprovides() to record the needed non-standard
file dependencies and only make those lazy in createwhatprovides().
@kontura
Copy link
Contributor

kontura commented Sep 25, 2024

The change in behavior with POOL_FLAG_ADDFILEPROVIDESFILTERED is unintended. I changed the way libsolv sets up the lazy fileprovides lookup so that it works like before. See openSUSE/libsolv@2150865

@mlschroe the linked commit doesn't seem to affect this particular issue.

Though while looking into this I extracted the following snippet:

#include "solv/pool.h"
#include "solv/repo.h"

static const char * input =  "/usr/lib/file";

int main() {
    Id p, pp;
    Solvable * s;

    Pool * pool = pool_create();
    pool_set_flag(pool, POOL_FLAG_ADDFILEPROVIDESFILTERED, 1);

    Repo * repo = repo_create(pool, "repo");
    Repodata * data = repo_add_repodata(repo, 0);

    // add pkg with provide
    s = pool_id2solvable(pool, repo_add_solvable(repo));
    s->name = pool_str2id(pool, "foo", 1);
    s->evr = pool_str2id(pool, "1-2", 1);
    s->arch = pool_str2id(pool, "x86_64", 1);
    Id reldep = pool_rel2id(pool, s->name, s->evr, REL_EQ, 1);
    s->provides = repo_addid_dep(repo, s->provides, reldep, 0);

    // add "/usr/lib/file" file to pkg
    Id did = repodata_str2dir(data, "/usr/lib", 1);
    repodata_add_dirstr(data, s - pool->solvables, SOLVABLE_FILELIST, did, "file");

    repodata_internalize(data);

    pool_addfileprovides(pool);
    pool_createwhatprovides(pool);

    // Get all packages providing `input`

    Id path_id = pool_str2id(pool, input, 1);
    FOR_PROVIDES(p, pp, path_id) {
        s = pool->solvables + p;
        printf("FOR_PROVIDES: %s\n", pool_solvable2str(pool, s));
    }

    pool_free(pool);
    return 0;
}

It is odd to me that while the POOL_FLAG_ADDFILEPROVIDESFILTERED is set /usr/lib/file is found as a provide but without the flag its not.
I understand the flag changes how the file entries (that would not be included in the primary file) are added but I would expect the end result to be the same.

I did find another approach that seems to work regardless of the POOL_FLAG_ADDFILEPROVIDESFILTERED:

    int flags;
    Queue q;
    queue_init(&q);
    flags = SELECTION_PROVIDES;
    selection_make(pool, &q, input, flags);

    flags = SELECTION_FILELIST|SELECTION_SUBTRACT;
    selection_make(pool, &q, input, flags);

    Queue pq;
    queue_init(&pq);
    selection_solvables(pool, &q, &pq);

    for (int j = 0; j < pq.count; j++) {
        printf("provides wihtout filelists: %s\n", pool_solvable2str(pool, pool_id2solvable(pool, pq.elements[j])));
    }
    queue_free(&q);

@mlschroe
Copy link
Contributor

mlschroe commented Sep 25, 2024

Yes, this happens because the path_id is added after the pool_createwhatprovides(() call. Fixed now in commit openSUSE/libsolv@081580d

@kontura
Copy link
Contributor

kontura commented Sep 25, 2024

Thanks, that fixes this issue.

@kontura kontura closed this as completed Sep 25, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in DNF team Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take
Projects
Archived in project
Development

No branches or pull requests

5 participants