podcast script generation component #15
@stefanfrench: I am currently looking into this. I am trying to find a way to use the llama_cpp API so that we don't waste a tokenization call just for the sake of filtering.
If I don't find an easy solution today, maybe we can consider it a follow-up so it doesn't block the full PR.
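For context, an exact check through llama-cpp-python would look roughly like the sketch below. The model path and helper names are hypothetical and only illustrate why the tokenization pass gets spent purely on the filter:

```python
from llama_cpp import Llama

# Hypothetical model path, for illustration only.
llm = Llama(model_path="model.gguf", n_ctx=4096, verbose=False)

def exact_token_count(text: str) -> int:
    # Llama.tokenize expects bytes and returns a list of token ids.
    return len(llm.tokenize(text.encode("utf-8")))

def fits_exactly(text: str) -> bool:
    # Filtering this way spends a full tokenization pass just for the check.
    return exact_token_count(text) <= llm.n_ctx()
```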
@daavoo - sounds good!
I gave it a try (encoding first), but the code became way more complicated.
However, it seems that 1 token ≈ 4 characters is considered a common default, and it is the value typically used to estimate token consumption without spending extra calls.
So I updated the code to use this 4-character approximation and made a small improvement to use the context length associated with each model (before, I was too lazy and just hardcoded the number to 4096, as that was the value for OLMoE).
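A minimal sketch of the approximation-based check, assuming the per-model context length is passed in (the names here are illustrative, not the actual ones in the PR):

```python
# Common rule of thumb: roughly 4 characters per token.
CHARS_PER_TOKEN = 4

def fits_in_context(text: str, context_length: int) -> bool:
    """Cheap estimate that avoids a real tokenization call."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    return estimated_tokens <= context_length

# Example: a model with a 4096-token context window.
print(fits_in_context("some input text " * 100, context_length=4096))  # True
```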