Questions about v3 #63
Thanks for writing!
Happy to receive your feedback and contributions to the usage docs! I definitely appreciate that I am too close to the library to understand where it's confusing. Hope this helps! Let me know if I can elaborate more on anything. I am so lazy and busy with life that I haven't made time to finish Ebisu v3, but it's quite close! |
Hey, thanks for coming back to me so quickly!
Once I am confident that I am using the tool correctly, I'll propose some additions to the readme. |
Interesting question! I haven't thought about this before, let me see if we can noodle through to something meaningful—
So note that each Ebisu model has a time-to-80%-recall, a floating point number. For young cards this will be minutes, and for more mature cards it will be days. This is a continuous variable and it's not obvious to me how to discretize it into bins like "unknown", "learning", "known" unless you do something ad hoc like: "unknown" if the time-to-80%-recall is under a day, "learning" if it's between a day and a week, and "known" if it's over a week.
These limits (1 day, 1 week) are magic numbers and it's not clear to me that you need them when the time-to-80%-recall will tell you more exactly when to review. From a user visualization/motivation perspective, one idea could be, like you suggest, to use the dominant atom of the ensemble as some kind of feedback. The default v3 model will have 5 atoms, so you could label flashcards as one-star through five-star?
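For concreteness, here's a minimal sketch of both ad hoc ideas (binning by time-to-80%-recall, and a dominant-atom star rating), assuming models are in hours, using the v2 `modelToPercentileDecay` call, and assuming an ensemble can be represented as a plain list of (weight, v2-model) pairs; the cutoffs are just the magic numbers above.

```python
import ebisu

def category(model):
    hoursTo80 = ebisu.modelToPercentileDecay(model, 0.8)  # time-to-80%-recall, in hours
    if hoursTo80 < 24:
        return "unknown"
    if hoursTo80 < 24 * 7:
        return "learning"
    return "known"

def stars(ensemble):
    # ensemble assumed to be [(weight, (alpha, beta, halflifeHours)), ...] with 5 atoms
    weights = [w for w, _atom in ensemble]
    return 1 + weights.index(max(weights))  # dominant atom -> one through five stars
```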
Ah, so, if the goal is to avoid overloading the learner with too many new facts, then maybe your rule can be something like, "don't introduce new flashcards until no flashcard has time-to-80%-recall < 2 days".

So those are a couple of ideas for mapping an Ebisu v3 model to a visualization for the user (find the atom with the highest weight) as well as how to pick when to introduce new flashcards (ensure there are no quickly-decaying flashcards). Both are a little ad hoc but that's fine, since these are app-level decisions that you make; Ebisu doesn't make these decisions.

But from the perspective of categorizing cards you've already learned for scheduling purposes, it's still not clear to me whether this is useful? Because, as described above, Ebisu gives you a real number for exactly what time the recall probability drops below a threshold (or equivalently, what the recall probability is right now), and it seems like that should be enough to decide whether or not to ask the student about this flashcard.
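A sketch of that "no new cards while anything is decaying fast" gate, again assuming models are in hours and using the v2 API; the two-day threshold is just the example number above.

```python
import ebisu

def okToIntroduceNewCards(models, thresholdHours=48):
    """True only if every existing card's time-to-80%-recall is at least two days."""
    return all(ebisu.modelToPercentileDecay(m, 0.8) >= thresholdHours for m in models)
```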
Anyway, hope this helps! Forgive me if I've misunderstood you, please feel free to tell me where I'm getting confused! |
Ok, this is turning into a nice noodle salad. I did do my best to keep it sorted.

**Relations Between Cards**

You really tickled my fancy with the concept of relationships between cards. Because that's essentially what I want to build across scales, and a lot of my thinking follows from that, so let's start here.
It is not about controlling the load for the learner (I agree with you that Anki is too rigid), it is about enabling the modeling of interdependencies between facts. I think about knowledge as interconnected and codependent: I need to know one fact before another fact that builds on it can really be learned.
One more thing about the
|
**How to update a meaning flashcard if you just reviewed the pronunciation flashcard**

There are two ways this can happen.

**Passive review**

When you quizzed the first flashcard (pronunciation of a word), the student answered and you showed them the meaning. This is an active (normal) review for the pronunciation flashcard and a passive review for the meaning: you didn't actively test recall on the meaning, and you have no evidence that that memory was strengthened or weakened or whatever. The very simple way I handle passive reviews is to keep the same model and just overwrite the "last seen" timestamp for the flashcard with the current timestamp.
**Correlation using noisy-binary**

The more interesting approach is if you don't show the meaning after quizzing for the pronunciation. This isn't a passive quiz for meaning—it's not a quiz at all, and you have to use math. I don't have a good mathematical way to do this 😢 but here's an approach I've experimented with: per https://fasiha.github.io/ebisu/#bonus-soft-binary-quizzes you can customize two extra parameters, q1 and q0, when you call `updateRecall`: q1 is the probability of passing the pronunciation quiz assuming the student truly remembers the meaning, and q0 is the probability of passing it assuming they truly forgot the meaning.
These are hard to guess at, and maybe a future version of Ebisu will help you find these numbers given a lot of quiz history. But you can guess: suppose q1 = 0.8 (if they truly knew the meaning, there's an 80% chance they pass the pronunciation quiz) and q0 = 0.4 (even if they truly forgot the meaning, there's still a 40% chance they pass the pronunciation quiz).
So you could do something like this: assume the student passed the pronunciation flashcard:

```python
import ebisu
meaningModel = (2, 2, 1) # one hour halflife
elapsedTime = 2 # it's been two hours since you last saw the meaning
newMeaningModel = ebisu.updateRecall(meaningModel, 0.8, 1, elapsedTime, q0=0.4)
print(ebisu.modelToPercentileDecay(newMeaningModel))
# prints 1.145123576480906
```

So our model for the meaning went from a halflife of 1 hour to 1.15 hours, because we guess that there's this link between passing the pronunciation flashcard and knowing the meaning (80% chance of knowing the pronunciation assuming you truly knew the meaning, 40% chance of knowing the pronunciation assuming you truly forgot the meaning). If you failed the pronunciation quiz but wanted to keep the same numbers, it'd be the same call with a failed result.
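A sketch of that failed-quiz update, keeping the same q1 = 0.8 and q0 = 0.4 assumptions (with a soft-binary result, anything below 0.5 counts as a failure, so 0.2 encodes a failure with q1 = 1 - 0.2 = 0.8):

```python
import ebisu

meaningModel = (2, 2, 1)  # same one-hour-halflife model as above
elapsedTime = 2           # two hours since the meaning was last seen

# soft-binary failure: 0.2 means "failed" with q1 = 1 - 0.2 = 0.8; q0 stays 0.4
newMeaningModel = ebisu.updateRecall(meaningModel, 0.2, 1, elapsedTime, q0=0.4)
print(ebisu.modelToPercentileDecay(newMeaningModel))
# roughly 0.9 (hours)
```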
i.e., the halflife for the meaning flashcard dropped from 1 hour to 0.9 hours because you failed the pronunciation card. You'd then overwrite the meaning flashcard's model with this new model (keeping its "last seen" timestamp the same).

I don't love this technique. It's totally ad hoc. I want to spend some time thinking about other ways to model (and measure) correlations between flashcards (there's an issue about this, #27) but haven't gotten to it. So there's no API planned for this, just some experimentation.

**Skill tree**

Ahh nice, thanks for this explanation! So it's like https://www.executeprogram.com (Gary Bernhardt is also a huge fan of spaced repetition 🙌). Honestly I don't know if you want to or need to use Ebisu for modeling the skill tree? Like, I don't know if, for example, Execute Program requires you to achieve some level of mastery in step 1 before allowing you to study step 2; I think it's perfectly reasonable to let the student see flashcard 1, ask them to commit it to memory, and then click "next" to go on to flashcard 2 that depends on 1. If you wanted to use Ebisu to prevent students from rushing too fast, then we have the various ad hoc techniques we discussed above (like, ensure flashcard 1 has a time-to-80%-recall > 1 week, etc.), but the skill tree itself is something that I think makes sense to keep at your app's level.

**Can different facets of the same concept share models/atoms?**

I don't think so. I haven't thought about this a lot, but it's never made sense to me to try and use a single memory model to capture pronunciation vs meaning vs writing—i.e., the three facets of the word are 階段 (written form), "kaidan" (pronunciation in Japanese), and "stairs" (meaning in English). It's always made sense to keep these as separate models.

One reason is just the learning experience. Some learners who know Chinese will already know the written form and they'll be really good at guessing the pronunciation in Japanese (階段 is pronounced jie1duan4, which maybe to English speakers sounds quite different from "kaidan" but these are actually quite related), but the meaning is quite different in Chinese—so the meaning card might actually be harder for Chinese speakers learning this than for English speakers.

But also from a statistics perspective, it seems harder to make a hierarchical model where there's a "base" model that represents the overall fact and then some transformed model for written form vs pronunciation vs meaning. It seems easier to keep these as separate models and enforce some kind of correlation when updating one without the other? So I don't have any clear idea of how I'd combine atoms or ensembles tracking one of these vs the other in Ebisu.

Now, your app should definitely keep track of the fact that these three flashcards are interrelated! Back when I used it (years ago) Anki didn't do this and it really annoyed me to get asked about the written form today when it asked me about the pronunciation yesterday. So even if you didn't track any kind of correlation between these different flashcards, you should at least try and space them out a bit so your users don't get annoyed. Or you could quiz users on all 3 sub-facts at the same time; maybe the rule is "when one is due, review all 3". Or you could review one sub-fact (written form to pronunciation) when it's due, and then after reviewing the answer, you could have a followup screen where the user can click to see the meaning.
That's the perfect use case for a noisy-binary quiz: if they click to see the meaning, that means they probably forgot it, and if they didn't click it, that means they probably remember it, but it's not 100% and you can tweak the numbers to decide how much it means.

So in summary, Ebisu for now doesn't know how to handle the different facets of the same card, but your app definitely should.
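A sketch of that follow-up-screen idea using the v2 soft-binary API; the specific numbers (0.15 for "clicked to reveal", 0.85 for "didn't click", and q0 = 0.3) are made-up knobs you'd tune for your app.

```python
import ebisu

meaningModel = (2.0, 2.0, 24.0)   # alpha, beta, halflife in hours
hoursSinceMeaningSeen = 30.0
clickedToRevealMeaning = True

# below 0.5 counts as a (noisy) failure, above 0.5 as a (noisy) success
softResult = 0.15 if clickedToRevealMeaning else 0.85
newModel = ebisu.updateRecall(meaningModel, softResult, 1, hoursSinceMeaningSeen, q0=0.3)
print(ebisu.modelToPercentileDecay(newModel))  # new halflife, in hours
```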
**Updating the atom weights in the ensemble**

It's a very standard statistical update: each atom's weight is multiplied by the likelihood of the observed quiz result under that atom's model, and then the weights are renormalized so they sum to one.
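A minimal sketch of that weight update, treating the ensemble as a plain list of (weight, v2-model) pairs; that layout is an assumption, and this ignores the per-atom Bayesian update that also happens on each quiz.

```python
import ebisu

ensemble = [(0.6, (2.0, 2.0, 1.0)), (0.3, (2.0, 2.0, 10.0)), (0.1, (2.0, 2.0, 100.0))]
elapsedHours = 5.0
quizPassed = True

# each weight gets multiplied by the likelihood of the observed result under its atom...
newWeights = []
for weight, atom in ensemble:
    pRecall = ebisu.predictRecall(atom, elapsedHours, exact=True)
    newWeights.append(weight * (pRecall if quizPassed else 1 - pRecall))

# ...and then the weights are renormalized to sum to one
total = sum(newWeights)
newWeights = [w / total for w in newWeights]
```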
You can fully control this: you can set each atom's model (an Ebisu v2 model, a 3-tuple of alpha, beta, and time) and its weight, or you can let the initialization do that for you by giving it just a first halflife (and how many atoms you want), and it creates atoms logarithmically spaced in halflife and in weight, starting from that halflife.
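Just to illustrate "logarithmically spaced in halflife": five halflives between two placeholder endpoints, so neighboring atoms differ by a constant ratio (the actual endpoints and the weight spacing are up to the initializer).

```python
import numpy as np

firstHalflife, lastHalflife, numAtoms = 10.0, 1e4, 5  # hours; placeholder endpoints
halflives = np.logspace(np.log10(firstHalflife), np.log10(lastHalflife), numAtoms)
print(halflives)  # [10, ~56, ~316, ~1778, 10000] hours: equal ratios between neighbors
```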
I haven't thought of any obvious reasons you'd tweak the weights/models of individual atoms but am happy to be surprised 😅

**Performance of recall predictions at scale**

Yeah, I've always worried about the cost of running `predictRecall` over every flashcard to find the ones most at risk of being forgotten.
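The scan in question looks something like this; the `cards` dict is a made-up structure for illustration, and leaving `exact=False` returns a log-probability, which is cheaper to compute and sorts the same way.

```python
import ebisu

cards = {
    "kaidan-pronunciation": {"model": (2.0, 2.0, 24.0), "hoursSinceSeen": 30.0},
    "kaidan-meaning": {"model": (3.0, 3.0, 100.0), "hoursSinceSeen": 5.0},
}

# the most at-risk card is the one with the lowest predicted recall right now
worst = min(cards, key=lambda k: ebisu.predictRecall(cards[k]["model"], cards[k]["hoursSinceSeen"]))
print(worst)
```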
There's no great solution. It's not been a problem since most apps seem to run this locally on users' devices so they're only running it over whatever hundreds~thousands of flashcards a few times a day but yes, I always worry this will become a deal-breaker for someone at some point. Hope this helps and is clear! |
sorry for not responding yet. i am building my app using ebisu and want to get more hands-on experience. so far i quite like the experience |
I am building and building. Ebisu is pretty deeply embedded in my system by now. Do you have a prediction when multiple models, i.e. v3, will drop? I've reached the first use case where that specific feature would be killer. |
@LazerJesus thanks for checking in! Sorry to make you wait for so long! I just published a release candidate, 3.0.0rc1, to PyPI with the Beta ensemble extension 🥳! The README is https://github.com/fasiha/ebisu/tree/v3-release-candidate#readme and you can install it like this:

```
python -m pip install "git+https://github.com/fasiha/ebisu@v3-release-candidate"
```

(Note, the above installs the latest version from GitHub; things have changed a tiny bit since I published rc1 to PyPI, but if you want that, you'd do `python -m pip install "ebisu>=3rc"`.)

The README has an example script and some verbal explanation for changes to the API. If you're able to beta-test for me, I would be SO grateful 🙇🙏🥹! Please let me know any and all questions, comments, feedback 😁! |
hey
i would love to beta test immediately but i am on the js version 🙃
|
@LazerJesus Ebisu.js now has a 3.0.0-rc.1 release candidate up at https://github.com/fasiha/ebisu.js/tree/v3#readme with instructions on how to install it (`npm i "https://github.com/fasiha/ebisu.js#v3"`) and how to use it. It's been tested to ensure it produces the same numbers as the Python v3 release candidate as well. Please check it out! |
cool. i’ll implement it next week. feedback inbound ✌️
|
```js
var model = ebisu.initModel({ firstHalflife: 24, numAtoms: 1 });
```
|
Sorry @LazerJesus, the single-atom model is just an Ebisu v2 model 😕. Sorry also for the delay! I work on quiz apps for some days, and then return to Ebisu, and then go back to working on apps 😅 |
hey @fasiha so, there are a few thoughts about the atoms. |
@fasiha, any updates? specifically on the migration |
@LazerJesus sorry for the delay! After some more testing (see #66) I'm no longer confident in the v3 release candidate. My apologies 😢🙇!

If you wanted to continue using the v3-rc we discussed above in this thread, one approach might be, for each model to port, to initialize a v3-rc model and replay all the quizzes, assuming you have that data somewhere in your database.

I have a JavaScript implementation of the newer algorithm from #66 (tentatively called split3):

```ts
import * as ebisu3split from './split3.ts';
const oldModel = [5.5, 5.5, 4]; // first and second elements should be very close, since ebisu v2 rebalances models
// Also I'm assuming the third element is in HOURS
const newModel = ebisu3split.initModel({alphaBeta: oldModel[0], halflifeHours: oldModel[2]})
```

This split3 model has performance comparable to v3-rc when run on old data, and is a very simple twist on v2. I'm hoping to start testing this in a real app in a few days, so I'll report back either way. If you want me to release this as a branch that you can install from, let me know.

I haven't entirely abandoned v3-rc, so I'm really sorry for sending you in so many different directions. These kinds of open-ended research problems are hard, and I'm really bad at them 😓 so thank you so much for bearing with me and your continued encouragement 🙇 |
for now, my needs are met by v2. i am under no pressure to change the SR algorithm. if you have something you deem worth testing, i'll play around with it. |
Hi,
I recently found your project and have started to build a learning app using Ebisu to time the repetitions. I have a few questions specifically concerned with version 3 of Ebisu.
I'll just list them:

Will Ebisu v3 also come to the JavaScript port (Ebisu.js)? I ask because I would love to stay on the edge but can't stand building complex systems in Python. I'd love to switch to JS.
Anki groups its cards into New, Learning, Review, Young, Mature, and Relearning.
From the way I understood the ensemble, it should be possible to model an (Ebisu) fact progressing from New -> Learning -> Learning -> Young -> Mature (with Relearning when a card lapses) by shifting the ensemble. Is that correct, and is that what you had in mind?

Thanks for building this. I find it conceptually quite appealing for its simplicity and elegance. I'll probably contribute some code examples soon if I am allowed; I had some trouble wrapping my head around the intended usage that a few dedicated examples would ease.
Best,
Finn