Multi-fidelity tabular benchmark compatible additions #112

Closed
Wants to merge 53 commits

Conversation

@Neeratyoy
Collaborator

Main changes/updates (primarily w.r.t. the ML benchmarks):

  • Edits the abstract benchmark function signature to include an option for choosing the fidelity space type
    • Adds an extra parameter, fidelity_choice=None, which should not break existing code
    • Allows future extension of a benchmark to a different fidelity (e.g. dataset fraction) or even multi-multi-fidelity
  • Adds an ML-benchmark-specific template that can be inherited to easily add more parameter spaces by defining only their ConfigurationSpace and the model definition (see the sketch after this list)
  • The above naturally implies that the data preprocessing and the model pipeline for the ML models are fixed/standardized
    • More decisions are needed for the final design, but under this class design any future change to the pipeline would affect all ML parameter spaces
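
A rough sketch of these two ideas together; every name and method body below is illustrative rather than the merged API:

    class AbstractBenchmark:
        # Existing-style interface, extended with the proposed fidelity_choice
        # parameter; the default None preserves current behaviour, so existing
        # callers keep working unchanged.
        @staticmethod
        def get_fidelity_space(seed=None, fidelity_choice=None):
            raise NotImplementedError


    class MLBenchmark(AbstractBenchmark):
        """Template for ML benchmarks: a subclass defines only its
        ConfigurationSpace and its model; preprocessing and the training
        pipeline live here, shared by all ML parameter spaces."""

        @staticmethod
        def get_configuration_space(seed=None):
            raise NotImplementedError

        def init_model(self, config, fidelity=None, rng=None):
            raise NotImplementedError

        def objective_function(self, configuration, fidelity=None, rng=None, **kwargs):
            # Shared pipeline: data preprocessing, fitting, and evaluation
            # would happen here once, for every ML parameter space.
            model = self.init_model(configuration, fidelity, rng)
            ...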

TODO:

  • Verify parameter spaces
  • Hist Gradient Boosting is untested and included only as a placeholder

KEggensperger and others added 18 commits March 26, 2021 11:46
* Scikit-learn emitted a warning every time the benchmark was created, saying the surrogate had been created with a different scikit-learn version.
However, we can't suppress one further deprecation warning, since scikit-learn makes sure its deprecation warnings are not suppressed. (!)
* Increase benchmark version
* Add a reduced configuration space for the paramnet benchmark
* Codestyle: Add new line at end of file
SVM Surrogate Benchmark

* Initial commit of an SVM Surrogate Benchmark.

* Remove some incorrect information from the docstrings (also in the Paramnet benchmark)

* Change an old test case which tested for actual runtime (SVM).
  This test compared the actual wall-clock time needed to run an SVM configuration, so it often failed on different machines.

* SVM Surrogate: Add more references + min dataset fraction

* SVM: Add Container Recipe + ClientInterface
* Readme + small fix in client

* Test different singularity versions

* Print Container Version + HPOBench Version in Log.
Improve Container Integration

* Container-Client-Communication: The container no longer reads the hpobenchrc file. We now bind the directories (socket, data, ...) to fixed paths in the container.

* Increase Version Number for each Container

* Make the client's abstract benchmark function calls consistent.

* Update Logging Procedure on Client and in Container

* Update Recipe Files: 
  Add a clean-up step to the recipe to reduce the size of the containers. 

* Add a Container Configuration test

* Update Recipe Template + Fix Typo
@KEggensperger self-requested a review July 7, 2021 08:35
@KEggensperger
Contributor

Could you please rebase onto development?

@PhMueller
Contributor

Hey Neeratyoy,

thanks for your work!
Before changing the interface, maybe a quick question:

Couldn't we also solve this via inheritance?

I'm thinking of having a BaseMLBenchmark() and creating new classes which override the corresponding fidelity space.
Is this something you could use?
Something like this:

class BaseMLBenchmark:
    def __init__(self):
        do_stuff()

    def objective_function(self):
        do_compute()

    @staticmethod
    def get_fidelity_space(seed=None):
        raise NotImplementedError()


class SmallFidelityBenchmark(BaseMLBenchmark):
    @staticmethod
    def get_fidelity_space(seed=None):
        return SmallFidelitySpace()


class LargeFidelityBenchmark(BaseMLBenchmark):
    @staticmethod
    def get_fidelity_space(seed=None):
        return LargeFidelitySpace()
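
For concreteness, one such override might return a ConfigSpace object along these lines; dataset_fraction and its bounds are invented for illustration, not taken from the PR:

    import ConfigSpace as CS

    class SubsampleFidelityBenchmark(BaseMLBenchmark):
        # Concrete override of the hook above with a dataset-fraction fidelity.
        @staticmethod
        def get_fidelity_space(seed=None):
            cs = CS.ConfigurationSpace(seed=seed)
            cs.add_hyperparameter(
                CS.UniformFloatHyperparameter('dataset_fraction', lower=0.1, upper=1.0))
            return cs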

Since the other benchmarks don't use the fidelity_choice, it would be cool to talk a little bit about it.

@Neeratyoy
Collaborator Author

it would be cool to talk a little bit about it.

Sure, I think we should.

I went with that design since I believe there are checks on the abstract class's function signatures. And to avoid potentially affecting containerization, I thought of changing the abstract class definition itself, where I presumed a default of None would keep things safe.
I did try inheritance (which I am already using) but went back to this design for a reason I can't recall.

As for defining multiple classes for different fidelity spaces: sure, that is an alternative solution. However, I felt it makes the code more verbose, and it might also make things more brittle for code using these classes. Hence, I went with the parameterization approach (sketched below).

In any case, these are probably design choices more than functionality, so we just need to come to an agreement w.r.t. the scope of the package. Happy to discuss!
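
For comparison, a minimal sketch of that parameterization, reusing the BaseMLBenchmark name from the snippet above; the fidelity names and bounds are made up for illustration:

    import ConfigSpace as CS

    class ParameterizedFidelityBenchmark(BaseMLBenchmark):
        # One class serves several fidelity spaces; fidelity_choice=None falls
        # back to the default space, so existing callers are unaffected.
        @staticmethod
        def get_fidelity_space(seed=None, fidelity_choice=None):
            cs = CS.ConfigurationSpace(seed=seed)
            if fidelity_choice is None or fidelity_choice == 'iterations':
                cs.add_hyperparameter(
                    CS.UniformIntegerHyperparameter('n_estimators', lower=16, upper=512))
            elif fidelity_choice == 'subsample':
                cs.add_hyperparameter(
                    CS.UniformFloatHyperparameter('dataset_fraction', lower=0.1, upper=1.0))
            elif fidelity_choice == 'both':  # joint, "multi-multi-fidelity" space
                cs.add_hyperparameters([
                    CS.UniformIntegerHyperparameter('n_estimators', lower=16, upper=512),
                    CS.UniformFloatHyperparameter('dataset_fraction', lower=0.1, upper=1.0),
                ])
            else:
                raise ValueError(f'Unknown fidelity_choice: {fidelity_choice}')
            return cs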

@Neeratyoy
Collaborator Author

@PhMueller shall we close this?

@Neeratyoy
Collaborator Author

Deprecated.
Subsumed in #121.

@Neeratyoy closed this Nov 5, 2021