Should we externalize framework integrations? #571

PGijsbers · 2023-07-20T10:46:10Z

PGijsbers
Jul 20, 2023
Maintainer

I have a feeling that we should make integrations less tightly coupled with the main repository.
I think it would be great if you could host a repository on GitHub that contains just the framework folder (with setup.sh, exec.py, …) and reference that directly. Essentially, everything else would stay the same, except that a git clone step would download the integration instead of it being shipped with the main project.
We would keep the baseline integrations in the repository itself, and have an overview of the supported integrations (akin to sklearn-contrib) that are explicitly quality checked (to be defined). The "approved" integrations could also be directly referenced in our frameworks.yaml file, so it should still work out of the box.
It would then be easy to develop integrations independently as long as we are consistent and deliberate about our versioning and releases.
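To make the git clone step concrete, here is a minimal sketch of how it could look. It is only an illustration: the `ExternalIntegration` record, its `repo_url` and `ref` fields, the helper names, and the example URL are all hypothetical and are not part of amlb or of the current frameworks.yaml format.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ExternalIntegration:
    """Hypothetical description of an integration hosted outside the main repository."""
    name: str
    repo_url: str  # assumed: git URL of a repository holding the framework folder
    ref: str       # assumed: tag or branch that pins the integration version


def clone_command(integration: ExternalIntegration, target_root: str) -> list[str]:
    """Build the git command that would download the integration folder."""
    target = Path(target_root) / integration.name
    return [
        "git", "clone",
        "--depth", "1",               # only the pinned revision is needed
        "--branch", integration.ref,  # accepts a tag or a branch name
        integration.repo_url,
        str(target),
    ]


def fetch(integration: ExternalIntegration, target_root: str) -> Path:
    """Clone the integration unless a folder with its name already exists."""
    target = Path(target_root) / integration.name
    if not target.exists():
        subprocess.run(clone_command(integration, target_root), check=True)
    return target
```

Pinning a tag rather than a moving branch would keep a benchmark run reproducible even if the integration author pushes changes later.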

Pros:

It is easier for people to share framework integrations
It should reduce maintenance work in the main repository (we are quickly heading to 20+ integrated frameworks): if framework authors have an easier way to update integration scripts on their own time, we may not have to do it ourselves.

Cons:

It becomes harder to verify integrations work as we expect them to (we'd need to keep track of the different repositories).
There are more versions to keep track of (amlb version, integration version, and framework version).
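As a rough illustration of the version-tracking concern, the three versions could be recorded together and checked before a run. This is a sketch under assumptions: the `IntegrationPin` record, the idea that an integration declares a supported amlb version range, and the plain dotted-integer version format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationPin:
    """Hypothetical record of the three versions involved in one benchmark run."""
    amlb_version: str
    integration_version: str
    framework_version: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.1.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(amlb_version: str, min_amlb: str, max_amlb_exclusive: str) -> bool:
    """Check that amlb_version lies in the half-open range [min_amlb, max_amlb_exclusive)."""
    current = parse_version(amlb_version)
    return parse_version(min_amlb) <= current < parse_version(max_amlb_exclusive)
```

For example, an integration that declares support for amlb from 2.0.0 up to but excluding 3.0.0 would be accepted on 2.1.0 and rejected on 3.0.0.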

Thoughts on the matter are welcome :)

Innixma · 2023-07-21T21:35:32Z

Innixma
Jul 21, 2023
Collaborator

This is a good idea, but you will want to be very confident in what a framework definition is and avoid making breaking changes to it too often. Currently, since everything is in one repo, it is easy to refactor something, for example to add logic such as the inference-time-for-n-rows feature to each framework. If it is split into N repos with N owners, you need to hope that each of the N owners updates their code correctly.

I'd recommend making a general template framework exec.py example that is very solid and comprehensive before externalizing.
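One small piece that could accompany such a template is a structural check on an integration folder. The sketch below only looks for the two files named in the proposal (setup.sh and exec.py); the full file list is elided in the original post, so `EXPECTED_FILES` is deliberately not exhaustive, and the `missing_files` helper is hypothetical.

```python
from pathlib import Path

# Only the files named in the proposal; the original list ends in "…",
# so this tuple is an assumption and not a complete definition.
EXPECTED_FILES = ("setup.sh", "exec.py")


def missing_files(integration_dir: str, expected: tuple[str, ...] = EXPECTED_FILES) -> list[str]:
    """Return the expected file names that are absent from integration_dir."""
    root = Path(integration_dir)
    return [name for name in expected if not (root / name).is_file()]
```

A check like this says nothing about whether exec.py behaves correctly; it would only catch an integration repository that does not follow the expected layout.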

Sometime in the future, AutoGluon itself may switch to this logic for model contributions. I thought about it last year but decided not to do it yet because of the potential breaking changes I'd like to make in the future to some core logic like AbstractModel.

Probably doing this with AMLB is less risky than with AutoGluon, since you don't have to worry about cross-dependency issues and conflicts. That is a huge boon.

0 replies

PGijsbers · 2023-07-21T21:42:41Z

PGijsbers
Jul 21, 2023
Maintainer Author

Thanks for weighing in! I would definitely want something like #279 before actually putting this into practice. Also, while it allows people to develop integrations independently, we could always choose to submit PRs to specific important integrations and/or have some "joint ownership" through GitHub collaborators or similar. Given that we would only do this after the interface is already fairly stable, I think that would be sufficient "risk" mitigation?

1 reply

Innixma Jul 21, 2023
Collaborator

Yeah I think that makes sense

mfeurer · 2023-11-03T17:02:33Z

mfeurer
Nov 3, 2023
Maintainer

I think this would be a great step forward. I really like the way BayesMark can be extended, as it does not require any commits. Yes, it comes with the drawback that users need to take responsibility, but it will also allow faster turnaround times.

0 replies
