I'd like to compute the AIC goodness of fit for a fitted model. This requires the value of the likelihood function at the estimated von Mises-Fisher parameters. But what does VonMisesFisherMixture.log_likelihood() actually return? It is an array of shape (n_clusters, n_samples), and the values appear to be probabilities (in [0, 1]) that a given sample belongs to a given cluster. They do not look like log-likelihood values (log-probabilities would be <= 0, and log-densities would not be confined to [0, 1]). From this array, what is the correct way to compute the likelihood value needed for AIC? I think it is something like the code below, but I could be wrong, since I'm not yet certain what log_likelihood() returns.
import numpy as np
likelihood = vmf_soft.log_likelihood(x)  # shape (n_clusters, n_samples)
# Choose the cluster of highest probability, convert that probability to log-likelihood, and sum across all samples.
log_likelihood = np.sum(np.log(np.max(likelihood, axis=0)))
This is based on equation 3.2 of the 2005 paper. The cluster/class weights may need to be involved too. I'm not sure if they're already incorporated into the values returned by log_likelihood().
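For comparison, here is a minimal sketch of the usual mixture log-likelihood, sum over samples of log(sum over clusters of weight * density), computed directly from fitted parameters rather than from whatever log_likelihood() returns. It is a sketch under assumptions, not the library's method: the helper names (vmf_log_pdf, mixture_log_likelihood, aic) are hypothetical, the inputs are assumed to be the fitted centers, weights, and concentrations as NumPy arrays, and the free-parameter count (d - 1 per mean direction, 1 per concentration, K - 1 for the weights) is my assumption about how to count parameters for AIC.

```python
import numpy as np
from scipy.special import ive, logsumexp


def vmf_log_pdf(x, mu, kappa):
    """Log density of a von Mises-Fisher distribution at unit vectors x, shape (n, d)."""
    d = x.shape[1]
    nu = d / 2.0 - 1.0
    # ive is the exponentially scaled Bessel function: I_nu(k) = ive(nu, k) * exp(k)
    log_bessel = np.log(ive(nu, kappa)) + kappa
    log_norm = nu * np.log(kappa) - (d / 2.0) * np.log(2.0 * np.pi) - log_bessel
    return log_norm + kappa * (x @ mu)


def mixture_log_likelihood(x, centers, weights, concentrations):
    """Sum over samples of log(sum_h weight_h * f(x_i | mu_h, kappa_h))."""
    # log_joint[h, i] = log(weight_h) + log f(x_i | mu_h, kappa_h)
    log_joint = np.stack([
        np.log(w) + vmf_log_pdf(x, mu, k)
        for mu, w, k in zip(centers, weights, concentrations)
    ])
    return float(np.sum(logsumexp(log_joint, axis=0)))


def aic(log_lik, n_clusters, n_features):
    """AIC = 2k - 2 ln(L), with k counted under the assumption stated above."""
    n_params = n_clusters * (n_features - 1) + n_clusters + (n_clusters - 1)
    return 2.0 * n_params - 2.0 * log_lik
```

Unlike the max-over-clusters version above, this sums over all clusters inside the log, so the class weights enter explicitly. It uses the exact Bessel function, which may become numerically unstable in high dimensions.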
@jasonlaska, maybe you can help?