[Discussion] Pinned vs loose requirements, application vs library use cases #204
Comments
My own opinion, not on behalf of anyone else: One of the projects I work with is in that 10% of using pySHACL like a library and application. That community has some applications that benefit from accessing I think it is to that community's benefit that pySHACL favor looser requirements, for both the "Library" mode and the "Application" mode you're suggesting.
If requirements are too tightly specified, a worst-but-not-unworkable case can occur where pySHACL's CLI becomes incompatible with another application an end user in this "10%" group needs. The workaround of making a Python virtual environment per application is (probably) always technically available, but it would be a step of documentation I would not enjoy drafting. I prefer nudging a dependency's floor, or placing a (hopefully temporary) ceiling, in the downstream adopter.
Another argument for the looser requirement specification is that CI running on a schedule could catch when a dependency introduces an incompatibility. I don't pin dependencies often enough to have an opinion on whether that style of issue detection is also practical to do with pinning.
This is not exactly the case, at least not for most Python applications I use. Generally, Python applications are installed from wheels, and in most cases locked versions don't affect wheels. I have extracted the dependency specs from the following wheels: Data was extracted with this script. Most version restrictions from wheels are not that strict; they are rather broad. The strictest ones are filtered here: https://github.com/aucampia/examples/blob/4da918e062d5591fd59ec25a408df7eff9e3772f/202309-python_apps/output/restricted.csv In there, the exact version is selected in only 4 cases. Other cases follow semantic versioning restrictions for the most part, and in some cases this even spans multiple major versions, like this.
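For readers who want to repeat this kind of check on a wheel of their own, here is a minimal sketch. It is not the script linked above; the function names `wheel_requirements` and `is_exact_pin` are made up for illustration, and the pin check is a rough heuristic rather than a full PEP 440 parser. It relies only on the fact that a wheel is a zip archive carrying its declared dependencies as `Requires-Dist` headers in `*.dist-info/METADATA`.

```python
import re
import zipfile
from email.parser import Parser

# Matches one version clause such as ">=6.3.2" or "==23.1" (simplified, not full PEP 440).
_CLAUSE = re.compile(r"(===|==|~=|!=|<=|>=|<|>)\s*([^,;()\s]+)")


def wheel_requirements(wheel):
    """Return the Requires-Dist entries declared in a wheel's METADATA file.

    `wheel` may be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(wheel) as archive:
        # Core metadata lives at <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        text = archive.read(metadata_name).decode("utf-8")
    # METADATA uses an RFC 822 style header layout, so the email parser can read it.
    headers = Parser().parsestr(text)
    return headers.get_all("Requires-Dist") or []


def is_exact_pin(requirement):
    """Heuristic: True when the only version clause is '==' or '===' without a wildcard."""
    spec = requirement.split(";", 1)[0]   # drop environment markers
    spec = re.sub(r"\[.*?\]", "", spec)   # drop extras such as [html]
    clauses = _CLAUSE.findall(spec)
    if len(clauses) != 1:
        return False
    operator, version = clauses[0]
    return operator in ("==", "===") and "*" not in version
```

Running `wheel_requirements` over a downloaded `.whl` file and filtering with `is_exact_pin` gives a quick impression of how many dependencies a distributed wheel actually pins, which is separate from whatever lockfile the project keeps in its repository.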
While the pySHACL project has a lockfile (i.e. poetry.lock), this does not really affect users in most cases, as it does not affect the wheel. It will only be effective if users use pySHACL by cloning the repo and running poetry install, which is not part of the documented installation instructions [ref]. There is, of course, the problem of validating that the version ranges in the wheel you distribute are correct. For RDFLib I validate this by testing RDFLib with the minimum versions of the dependencies in addition to testing it with the latest versions of the dependencies: https://github.com/RDFLib/rdflib/blob/16047eb2f70d061dc7bee564a05e6ba880c7f0e2/devtools/constraints.min
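To make the "test with the minimum versions" idea concrete, here is a rough sketch of how a minimum-version constraints list could be derived from declared ranges. This is an assumption about one possible workflow, not a description of how the RDFLib file above is produced; `minimum_constraints` is a hypothetical helper and the parsing is deliberately simplified.

```python
import re

# First lower-bound style clause (>=, ~= or ==) followed by a version number.
_LOWER_BOUND = re.compile(r"(>=|~=|==)\s*([0-9][^,;()\s]*)")
# Leading distribution name, stopping before extras or version clauses.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def minimum_constraints(requirements):
    """Turn requirement strings into 'name==floor' lines (hypothetical helper).

    Requirements without an explicit floor are skipped, since there is
    no minimum version to pin them to.
    """
    lines = []
    for requirement in requirements:
        spec = requirement.split(";", 1)[0]  # drop environment markers
        name = _NAME.match(spec)
        floor = _LOWER_BOUND.search(spec)
        if name is None or floor is None or "*" in floor.group(2):
            continue
        lines.append(f"{name.group(1)}=={floor.group(2)}")
    return lines


# Illustrative input only; these ranges are made up for the example.
for line in minimum_constraints(["rdflib (>=6.3.2,<8.0.0)", "packaging"]):
    print(line)
```

The resulting lines can be written to a file and passed to pip with its `-c` (constraints) option in a CI job, so that the test suite runs once against the floor of every range and once against the latest releases.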
I don't think the situation is that unique, and most other tools I know of that are both a library and a CLI tool work fairly well by distributing one wheel with fairly broad version restrictions. I think the best option is to just validate your version ranges as I do for RDFLib. There are some other things to note though:
This stopped working after the move to poetry, but I will fix it tonight or tomorrow.
Fix here:
Thanks @aucampia for your comments.
In the Python world, packages (installable Python modules) usually take the form of Applications or Libraries.
Libraries are building-block packages that developers use to build their own applications. Applications are Python programs and utilities installed by end users.
Mechanically there is not much difference between a Python library and a Python application, and some packages have aspects of both. PySHACL was originally designed and developed as a Python Application: it has a command-line interface to execute the validation functionality. But PySHACL also has aspects of a Python library: its modules can be imported into other Python code and called programmatically. The README file includes examples of both use cases. (Note: you can also call the module from the command line. That is a third mechanism, but it is equal in functionality to the Application mode.)
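As a generic illustration of how small that mechanical difference is, here is a toy module that exposes the same functionality both ways. It has nothing to do with PySHACL's real code or API; `find_long_lines` and `main` are invented names for the sketch.

```python
import argparse


def find_long_lines(text, limit=80):
    """Library mode: importable function returning 1-based numbers of over-long lines."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


def main(argv=None):
    """Application mode: parse command-line arguments, then call the library function."""
    parser = argparse.ArgumentParser(description="Report lines longer than a limit.")
    parser.add_argument("path", help="text file to check")
    parser.add_argument("--limit", type=int, default=80, help="maximum line length")
    args = parser.parse_args(argv)
    with open(args.path, encoding="utf-8") as handle:
        long_lines = find_long_lines(handle.read(), args.limit)
    for number in long_lines:
        print(f"{args.path}:{number}: line longer than {args.limit} characters")
    # Non-zero exit status signals a failed check, as CLI tools conventionally do.
    return 1 if long_lines else 0
```

Another codebase would simply import `find_long_lines`, while an end user would reach `main` through a console-script entry point declared in the package metadata, or through a `__main__` module when running the package with `python -m`. Both paths load the same code and therefore the same dependencies, which is why the requirements question applies to both.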
From my experience interacting with PySHACL users on GitHub threads, on the SHACL Discord server, and in person, I estimate that around 50% of users use PySHACL as an Application, installing it only to execute it from the command-line interface. Around 40% of users treat it as a library, incorporating PySHACL into their broader codebase. The remaining 10% have a hybrid use case, using both the command-line interface and the module imports.
The tricky part comes when defining the runtime requirements of the package. Python Applications usually maintain a tightly controlled (or Pinned) list of required library versions to ensure best operation, and they ship with a package lockfile that tells the package manager which versions of the requirements to install. Libraries, on the other hand, tend to keep their requirements as loose as possible to ensure maximum compatibility with other codebases. They often do not ship a lockfile in the package, leaving it up to the developer to choose exactly which library versions to use in their application.
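The compatibility argument can be sketched in a few lines. The example below is a simplification under stated assumptions: versions are plain three-part numbers compared as integer tuples (not full PEP 440), and the release list and requirement values are made up. It shows that two packages pinning different versions of a shared dependency cannot share an environment, while two loose ranges usually overlap.

```python
def parse_version(text):
    """Turn '6.3.2' into (6, 3, 2); assumes purely numeric, same-length versions."""
    return tuple(int(part) for part in text.split("."))


def pinned(exact):
    """Requirement satisfied by exactly one version."""
    target = parse_version(exact)
    return lambda version: parse_version(version) == target


def at_least_below(floor, ceiling):
    """Requirement satisfied when floor <= version < ceiling."""
    low, high = parse_version(floor), parse_version(ceiling)
    return lambda version: low <= parse_version(version) < high


def shared_versions(candidates, requirements):
    """Versions acceptable to every requirement, i.e. installable side by side."""
    return [v for v in candidates if all(accepts(v) for accepts in requirements)]


# Hypothetical releases of a dependency that two installed packages both need.
releases = ["6.1.0", "6.2.0", "6.3.2", "7.0.0"]

# Two pinned packages: no single version satisfies both.
print(shared_versions(releases, [pinned("6.2.0"), pinned("7.0.0")]))

# Two loosely specified packages: the ranges overlap.
print(shared_versions(
    releases,
    [at_least_below("6.1.0", "8.0.0"), at_least_below("6.3.2", "8.0.0")],
))
```

The first call prints an empty list and the second prints `['6.3.2', '7.0.0']`. The empty result corresponds to the situation where the only remaining option is a separate virtual environment per application.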
PySHACL has always needed to balance on this line: loose requirements for the library use cases, while shipping a lockfile to ensure the application use case works out of the box. For example, we have always tried to maintain backward compatibility with the last three RDFLib releases, because it is surprisingly common for developers to need to use PySHACL in a codebase with an older version of RDFLib, so we leave the RDFLib requirement loose.
This thread is a discussion about the direction PySHACL should take here. Would it be a good idea to split the PySHACL codebase into two packages, one being the PySHACL library and the other the PySHACL CLI tool?
Related: #203, #197