A strategy towards robust priors (without links, assuming intercept exists) #313

Closed
AlexanderFengler opened this issue Nov 1, 2023 · 0 comments · Fixed by #331
AlexanderFengler commented Nov 1, 2023

We will need two basic solutions; this issue focuses first on models that have an intercept.

Specifics:

In general, use Normal priors for common betas, with sigma fixed at around 0.25.
In general, expect z-scored covariates.

  1. Common terms (global)
    • Intercept --> use parameter bounds to set the prior mean (this will need a bit of specialization across models)
        • rule of thumb 1: if approx_differentiable --> use mean of bounds
        • rule of thumb 2: if analytical --> hook prior mean to approx_differentiable version where possible
    • Any other beta should be Normal(0, 0.25)
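
The two rules of thumb for the intercept could be sketched as follows. This is only an illustration: the function name `intercept_prior`, the dict layout of the returned prior, and the example bounds are hypothetical, and using sigma = 0.25 for the intercept is an assumption (the text above only fixes sigma for the other betas).

```python
import math


def intercept_prior(bounds, sigma=0.25):
    """Return a Normal prior spec centered at the midpoint of the bounds.

    `bounds` is a (lower, upper) tuple. The returned dict layout is a
    hypothetical stand-in for whatever prior object the package uses.
    sigma=0.25 is an assumption borrowed from the rule for other betas.
    """
    lower, upper = bounds
    if not (math.isfinite(lower) and math.isfinite(upper)):
        raise ValueError("midpoint rule needs finite bounds")
    if lower >= upper:
        raise ValueError("lower bound must be below upper bound")
    return {"name": "Normal", "mu": (lower + upper) / 2.0, "sigma": sigma}


# Rule of thumb 1: approx_differentiable --> use the mean of its bounds.
approx_bounds = (0.0, 2.0)  # hypothetical bounds for some parameter
approx_prior = intercept_prior(approx_bounds)

# Rule of thumb 2: analytical --> reuse the approx_differentiable bounds
# for the same parameter where such a version exists.
analytical_prior = intercept_prior(approx_bounds)
```

With unbounded parameters the midpoint is undefined, which is one place where the per-model specialization mentioned above would come in.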

In general, use Normals centered around 0 for group terms (the intercept is taken care of by assumption). Specifically:

  2. Group-terms
    • Intercept --> is now an offset --> Normal(mu_group, sigma_group)
    • Any other beta --> Normal(mu_group, sigma_group)

(The intercept gets no special treatment here.)

Group terms are hierarchical, so we need distributions for mu_group and sigma_group.

For mu

  • Normal(0, 0.25)

For sigma

  • Weibull(alpha = 1.5, beta = 0.3)

This fulfills the basic desiderata: it is zero-avoiding, and it caps the top end, where large values are usually unrealistic and create convergence problems.
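
A quick closed-form check of those two properties, as a minimal sketch. It assumes alpha is the shape and beta the scale (the convention PyMC uses for its Weibull distribution); the helper names and the cut-offs 0.01 and 1.0 are chosen here purely for illustration.

```python
import math

ALPHA, BETA = 1.5, 0.3  # assumed: alpha = shape, beta = scale


def weibull_pdf(x, alpha=ALPHA, beta=BETA):
    """Weibull density; zero for negative x."""
    if x < 0:
        return 0.0
    z = x / beta
    return (alpha / beta) * z ** (alpha - 1) * math.exp(-(z ** alpha))


def weibull_cdf(x, alpha=ALPHA, beta=BETA):
    """P(sigma <= x) under Weibull(alpha, beta)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / beta) ** alpha))


def weibull_mean(alpha=ALPHA, beta=BETA):
    """Mean of the Weibull: beta * Gamma(1 + 1/alpha)."""
    return beta * math.gamma(1.0 + 1.0 / alpha)


density_at_zero = weibull_pdf(0.0)    # vanishes because alpha > 1
p_near_zero = weibull_cdf(0.01)       # prior mass very close to zero
p_above_one = 1.0 - weibull_cdf(1.0)  # prior mass in the far upper tail
prior_mean = weibull_mean()

print(density_at_zero, p_near_zero, p_above_one, prior_mean)
```

Under these assumptions the density is exactly zero at the origin, less than 1% of the prior mass lies below 0.01 or above 1.0, and the prior mean sits a little above 0.25, on the same scale as the Normal(0, 0.25) used for the betas.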


Strategy:

We want to be able to intervene in prior setting before the model is compiled. The idea is to use the supplied data and the supplied regression formulas to infer exactly which terms need priors in the model.

We should be able to exploit the fact that Bambi parses the regression formulas via the formulae package under the hood. formulae splits each regression formula into a list of terms, which are defined in such a way that Bambi can flesh out the corresponding PyMC model.

There are essentially three types of terms (mapping onto the discussion above):

  1. Intercept (basic term, associate intercept specific prior)
  2. Term (basic term, associate basic prior)
  3. GroupSpecificTerm (group level term, associate group level distribution)

We should be able to use the formulae.design_matrices() function to get at all these terms, and set their priors according to the strategy outlined above.
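
The mapping from term type to default prior could look roughly like the sketch below. It deliberately does not call formulae: the `(name, kind, is_group)` tuples are a hypothetical stand-in for whatever the parsed design exposes, and `default_prior`, `assign_priors`, and the dict-based prior specs are illustrative names, not existing API. Using sigma = 0.25 for the common intercept is again an assumption.

```python
# Hyperpriors for group-level terms, as proposed above.
HYPER_MU = {"name": "Normal", "mu": 0.0, "sigma": 0.25}
HYPER_SIGMA = {"name": "Weibull", "alpha": 1.5, "beta": 0.3}


def default_prior(kind, is_group, intercept_mean=0.0):
    """Pick a default prior spec for one term.

    kind: hypothetical label such as "intercept" or "numeric".
    is_group: True for group-specific terms like "x|subject".
    intercept_mean: prior mean for the common intercept (e.g. from bounds).
    """
    if is_group:
        # Group terms: Normal(mu_group, sigma_group), no intercept special case.
        return {"name": "Normal", "mu": dict(HYPER_MU), "sigma": dict(HYPER_SIGMA)}
    if kind == "intercept":
        # Common intercept: mean supplied by the caller; sigma is an assumption.
        return {"name": "Normal", "mu": intercept_mean, "sigma": 0.25}
    # Any other common beta.
    return {"name": "Normal", "mu": 0.0, "sigma": 0.25}


def assign_priors(terms, intercept_mean):
    """Map each (name, kind, is_group) term to its default prior spec."""
    return {
        name: default_prior(kind, is_group, intercept_mean)
        for name, kind, is_group in terms
    }


# Hypothetical terms for a formula like "v ~ 1 + x + (1 + x | subject)".
terms = [
    ("Intercept", "intercept", False),
    ("x", "numeric", False),
    ("1|subject", "intercept", True),
    ("x|subject", "numeric", True),
]
priors = assign_priors(terms, intercept_mean=1.0)
```

Keeping the rule in one small function makes it easy to swap in a second strategy later for models without an intercept.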
