Running models with big dataset tips #853
Unanswered
danieltomasz asked this question in Q&A
Replies: 1 comment · 5 replies
-
I am experimenting with running big models on my laptop (an M1 with 16 GB of RAM); the project is to explore the feasibility and limitations of different approaches. I have a dataset containing responses in fMRI voxels (14,752 voxels per subject), with the model

```python
model = bmb.Model("value ~ (1|subject) + (1|voxel)", filtered_data_frame)
```

When I convert the data type to float32 I can add more subjects and still fit the object into memory in Jupyter, but obviously there are still limits before Jupyter crashes, not counting inference time. What would be the best practice, if any, for working with such big models on a CPU (leaving aside getting good GPUs and enough RAM)?
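A minimal sketch of the kind of downcasting described above, assuming a pandas DataFrame with the columns used in the formula (`value`, `subject`, `voxel`); the toy frame and its sizes here stand in for `filtered_data_frame` and are made up for illustration:

```python
import numpy as np
import pandas as pd

# Toy stand-in for filtered_data_frame; column names match the formula.
df = pd.DataFrame({
    "value": np.random.randn(1000),           # float64 by default
    "subject": np.repeat(np.arange(10), 100),
    "voxel": np.tile(np.arange(100), 10),
})

print(df.memory_usage(deep=True).sum())       # bytes before downcasting

# Halve the response column and store the grouping variables as
# categoricals, which hold small integer codes instead of int64 values.
df["value"] = df["value"].astype(np.float32)
df["subject"] = df["subject"].astype("category")
df["voxel"] = df["voxel"].astype("category")

print(df.memory_usage(deep=True).sum())       # bytes after downcasting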
-
Hi @danieltomasz, which family are you using? That seems like a good case for plain PyMC and perhaps sparse data structures. The reason I suggest PyMC is that Bambi almost always creates dense matrices (which can be very big in your case).
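For reference, here is a sketch of what the same varying-intercepts model could look like in plain PyMC, looking group effects up by per-observation integer index so that no dense dummy-coded matrix is ever built; the synthetic data and the priors are placeholders, not code from this thread:

```python
import numpy as np
import pandas as pd
import pymc as pm

# Toy stand-in for the real data; column names match the Bambi formula.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "value": rng.normal(size=1000).astype(np.float32),
    "subject": np.repeat(np.arange(10), 100),
    "voxel": np.tile(np.arange(100), 10),
})

# Integer codes for each grouping factor.
subject_idx, subjects = pd.factorize(df["subject"])
voxel_idx, voxels = pd.factorize(df["voxel"])

with pm.Model(coords={"subject": subjects, "voxel": voxels}) as model:
    intercept = pm.Normal("intercept", 0.0, 1.0)

    # Non-centered varying intercepts, indexed per observation instead
    # of multiplied through a dense one-hot design matrix.
    sigma_subject = pm.HalfNormal("sigma_subject", 1.0)
    sigma_voxel = pm.HalfNormal("sigma_voxel", 1.0)
    z_subject = pm.Normal("z_subject", 0.0, 1.0, dims="subject")
    z_voxel = pm.Normal("z_voxel", 0.0, 1.0, dims="voxel")

    mu = (
        intercept
        + sigma_subject * z_subject[subject_idx]
        + sigma_voxel * z_voxel[voxel_idx]
    )

    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("value", mu=mu, sigma=sigma, observed=df["value"].to_numpy())

    idata = pm.sample()
```

With the indexing approach the memory cost of the group effects stays proportional to the number of observations plus the number of levels, whereas a dense dummy-coded matrix for the voxel term alone would be on the order of n_observations × 14,752 floats.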