If the user chooses findBestK=c(TRUE,FALSE) with a range of values, e.g. ks=4:15, we are extremely inefficient: we run each of k=4-15 with findBestK=FALSE, and then for findBestK=TRUE we RERUN all of k=4-15 to find the best K. This is because everything is run in parallel without cross-talk.
Similarly, if findBestK=TRUE, we throw away the results for k=4-15 and save only the best, which seems like a waste given that we just calculated them.
Perhaps findBestK should be changed so that it calculates and saves k=4-15, then post-processes those to get the best. I.e., clusterMany would internally also set findBestK=FALSE, then do the findBestK clustering last as just a silhouette processing of the saved results.
Could also add a slot to save the silhouette values so they could easily be plotted later.
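To make the proposed flow concrete, here is a minimal, language-neutral sketch in Python of the post-processing step: clusterings for every k are computed once and kept, and the best k is then chosen from those saved results by average silhouette width, with the per-k scores retained for later plotting. Everything in it is an assumption for illustration: the names `mean_silhouette`, `best_k`, `points`, and `saved` are hypothetical, the saved labelings are hand-written stand-ins for results already produced with findBestK=FALSE, and none of this is clusterMany's actual interface or internals.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Average silhouette width of one clustering (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # singleton cluster: width taken as 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom else 0.0)
    return mean(widths)


def best_k(points, saved):
    """Pick the best k from already-saved clusterings; no re-clustering."""
    scores = {k: mean_silhouette(points, labels) for k, labels in saved.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: three tight pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]

# Stand-ins for clusterings saved from the findBestK=FALSE runs, keyed by k.
saved = {
    2: [0, 0, 0, 0, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
    4: [0, 0, 1, 1, 2, 3],
}

best, scores = best_k(points, saved)
print("best k:", best)
for k in sorted(scores):
    print(k, round(scores[k], 3))
```

Because `best_k` only reads the saved labelings, choosing the best K costs one silhouette pass per k rather than a second round of clustering, and the returned `scores` dictionary plays the role of the suggested silhouette slot.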