Add `save_model` and `load_model` functions (or something similar) #259

tomicapretto · 2020-10-28T13:11:14Z

PyMC3 allows saving traces via pm.save_trace() and loading them via pm.load_trace().
I think that having functionality to save and load model objects would favor interactivity when working with Bambi models.
Already, every time I reset my session I need to run the samplers again, which is annoying.

tomicapretto · 2020-10-28T14:17:21Z

@aloctavodia pointed out that we can already save fitted models via arviz.to_netcdf and load them with arviz.from_netcdf.

tomicapretto · 2021-04-09T01:55:02Z

I would like to add this feature soon, and I've been thinking that dill is a good candidate. However, a model has two "independent" objects associated with it: the Model instance itself and the InferenceData object that Model.fit() returns. I think this functionality would make more sense if it could be used like save_model(object, path) and load_model(path). I'm opening a new issue with some ideas about how we could achieve this.

canyon289 · 2021-04-09T03:10:48Z

Something that actually might make more sense for Bambi is a json or yaml format that records the model priors and formulae string, and then reconstructs the model from them. This would be diffable and could be saved to the netcdf file in the form of a string. I haven't thought through this much but am just posting the idea here.
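A minimal sketch of this idea, using only the Python standard library. The `ModelSpec` class, its fields, and the example prior layout are hypothetical names chosen for illustration and are not part of Bambi; priors are represented here as plain dictionaries rather than Bambi prior objects.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelSpec:
    """Hypothetical, diffable description of a model (not a Bambi class)."""

    formula: str
    family: str = "gaussian"
    # Prior descriptions as plain dicts,
    # e.g. {"x": {"dist": "Normal", "mu": 0, "sigma": 1}}
    priors: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys gives stable output, so two specs can be diffed line by line.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ModelSpec":
        return cls(**json.loads(text))


spec = ModelSpec(
    formula="y ~ x + (1|group)",
    priors={"x": {"dist": "Normal", "mu": 0, "sigma": 1}},
)

# Round trip: the string form carries everything needed to rebuild the spec.
restored = ModelSpec.from_json(spec.to_json())
assert restored == spec
```

Because the result is a plain string, it could be stored as an attribute on one of the InferenceData groups before saving, for example `idata.posterior.attrs["bambi_model_spec"] = spec.to_json()` (the attribute key is made up here). Rebuilding the actual Bambi model from such a spec would still require the original data.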

tomicapretto · 2021-04-09T11:02:06Z

Makes much sense! Much better than my proposal! I'll try to write something

tomicapretto · 2021-04-09T13:00:54Z

I'm realizing that we have more than a formula and the prior description. We also have a pandas DataFrame, the Bambi Term instances, and a formulae.DesignMatrices object. The design matrices and the terms could be reconstructed from the formula, the prior description, and the pandas DataFrame, but maybe problems can arise? For example, what if you have a model built and saved with one version of Bambi, and then you load the description with another version of Bambi where something has changed? I don't know, just thinking out loud here.

jankaWIS · 2024-09-05T11:29:32Z

Hi, I don't have much to contribute to this topic but I do have this question. Is there a (recommended) way to save and load models after fitting them? Let's say you want to run the model, save it, and then later come back to play with visualisations or inspections without rerunning it.
Thanks!

ColCarroll · 2024-09-05T13:44:25Z

Things have sort of changed for the better here -- I think you could save and load the inference results using .to_netcdf and .from_netcdf, respectively, on the inference data object.

A nice thing about xarray/netCDF/arviz (they're all kind of the same thing when it comes to InferenceData) is that they can carry JSON metadata. It might make sense for bambi to write some of that when running inference. @tomicapretto might have a better idea about how to do that, but it feels like it might be a fairly easy contribution (not trying to pressure you! 😁 )

tomicapretto · 2024-09-05T15:05:50Z

We're not offering anything at the moment unfortunately. But I think I can help you with some ideas.

When we talk about "saving and loading the model" we need to keep in mind there are two things we usually work with:

- The Bambi model
- The InferenceData object. This usually contains draws from the posterior, but it can contain other things (draws from the prior, prior predictive, posterior predictive, log likelihood, etc.)

These two objects differ in some respects regarding saving/loading.

- The Bambi model object "contains" a lot of objects and data of varying complexity, and it doesn't offer a method to store it to and load it from disk. However, it's relatively cheap to instantiate a Bambi model multiple times.
- The InferenceData object is much better structured, and it offers methods to save to and load from disk (the ones Colin mentioned above). On the other hand, it's expensive to regenerate every time when it has draws from the posterior (because one needs to sample from the posterior and that takes time!).

You could write a small program that creates a Bambi model and then checks if a specific .nc file already exists to determine if the model has been previously fitted. If the file exists, it loads the .nc file and moves on; if not, it samples the posterior and saves the results in the .nc file and moves on. This way, the posterior is sampled only once.

Is that the ideal approach? I don't think so. If you need things that require interacting with the underlying PyMC model after getting posterior draws, the PyMC model will be recompiled each time. However, that is usually much cheaper than getting draws from the posterior all the time.
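The check-then-fit workflow described above can be sketched roughly as follows. `load_or_fit` is a hypothetical helper, not something Bambi or ArviZ provides; it takes the fit, save, and load steps as callables, so the core logic depends only on the standard library. The Bambi/ArviZ usage in the trailing comment assumes a DataFrame named `data` with columns `y` and `x`.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_fit(
    path: Path,
    fit: Callable[[], T],
    save: Callable[[T, Path], object],
    load: Callable[[Path], T],
) -> T:
    """Return cached results from `path` if present; otherwise fit, save, return."""
    path = Path(path)
    if path.exists():
        # Results were saved by an earlier session: skip sampling entirely.
        return load(path)
    result = fit()  # the expensive step (e.g. sampling the posterior)
    save(result, path)
    return result


# Hypothetical usage with Bambi and ArviZ (not executed here):
#
#   import arviz as az
#   import bambi as bmb
#
#   model = bmb.Model("y ~ x", data)  # cheap to rebuild each session
#   idata = load_or_fit(
#       Path("model.nc"),
#       fit=model.fit,
#       save=lambda idata, p: idata.to_netcdf(str(p)),
#       load=lambda p: az.from_netcdf(str(p)),
#   )
```

Note that this sketch keys the cache on the file path alone: if the formula, priors, or data change, the stale file has to be deleted (or a new path chosen) for the model to be refitted.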

tomicapretto added the enhancement label Apr 9, 2021

tomicapretto mentioned this issue Apr 9, 2021

Some ideas about the Model class #332

Closed

canyon289 mentioned this issue May 13, 2021

Improve workflow around saving and loading fitted models pymc-devs/pymc#4687

Closed

krassowski mentioned this issue Aug 24, 2021

Coarse-grained multiprocessing with bambi #400

Open
