-
This is a good idea, but you will want to be very confident in what a framework definition is and avoid making breaking changes to it too often. Currently, since everything is in one repo, it is easy to refactor something, for example to add logic such as the inference-time-for-n-rows feature to each framework. If it is split into N repos with N owners, you have to hope that each of the N owners updates their code correctly. I'd recommend making a general template framework exec.py example that is very solid and comprehensive before externalizing. Sometime in the future AutoGluon itself may switch to this logic for model contributions, which I had thought about last year but decided not to do yet because of the potential breaking changes I'd like to make in the future to some core logic like AbstractModel. Doing this with AMLB is probably less risky than with AutoGluon, since you don't have to worry about cross-dependency issues and conflicts. That is a huge boon.
-
Thanks for weighing in! I would definitely want something like #279 before actually putting this into practice. Also, while it allows people to develop integrations independently, we could always choose to submit PRs to specific important integrations and/or have some "joint ownership" through GitHub collaborators or similar. Given that we would only do this after the interface is already fairly stable, I think that would be sufficient "risk" mitigation?
-
I think this would be a great step forward. I really like the way BayesMark can be extended, as it does not require any commits. Yes, it comes with the drawback that users need to take responsibility, but it will also allow faster turnaround times.
-
I have a feeling that we should make integrations less tightly coupled with the main repository.
I think it would be great if you could host a repository on GitHub that just has the framework folder (with setup.sh, exec.py, …) and reference that directly. Essentially, it would keep everything else the same, except there would be a `git clone` step to download the integration instead of it being shipped with the main project.
We would keep the baseline integrations in the repository itself, and have an overview of the supported integrations (akin to sklearn-contrib) that are explicitly quality checked (to be defined). The "approved" integrations could also be directly referenced in our `frameworks.yaml` file, so it should still work out of the box.
It would then be easy to develop integrations independently as long as we are consistent and deliberate about our versioning and releases.
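A minimal sketch of what the download step could look like, assuming the integration repository is pinned to a tag and that a framework folder must contain at least setup.sh and exec.py. The function names, the `REQUIRED_FILES` list, and the idea of validating the folder after cloning are all hypothetical illustrations, not existing AMLB code.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumption: the two files named in the post; a real framework folder may need more.
REQUIRED_FILES = ("setup.sh", "exec.py")


def build_clone_command(repo_url: str, ref: str, dest: Path) -> List[str]:
    """Return a git command that shallow-clones repo_url at a pinned tag or branch."""
    return ["git", "clone", "--depth", "1", "--branch", ref, repo_url, str(dest)]


def missing_files(integration_dir: Path) -> List[str]:
    """List the required files that are absent from a downloaded integration."""
    return [name for name in REQUIRED_FILES if not (integration_dir / name).is_file()]


def fetch_integration(repo_url: str, ref: str, dest: Path) -> Path:
    """Clone the integration and fail early if it does not look like a framework folder."""
    subprocess.run(build_clone_command(repo_url, ref, dest), check=True)
    missing = missing_files(dest)
    if missing:
        raise FileNotFoundError(
            f"integration at {dest} is missing: {', '.join(missing)}"
        )
    return dest
```

Pinning to a tag rather than a moving branch is one way to keep the "consistent and deliberate" versioning mentioned above: a benchmark run would then always fetch the same integration code until the reference is deliberately bumped.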
Pros:
Cons:
Thoughts on the matter are welcome :)