
example to use grammar sampler ? #604

Open
edwin0cheng opened this issue Dec 17, 2024 · 4 comments

Comments

@edwin0cheng

Is there an example of how to use a grammar sampler in llama-cpp-rs?

In the original llama.cpp server example, the grammar sampler and the sampler chain are used separately. Do I have to use them that way here?

@edwin0cheng
Author

Here is how I do it, in case anyone is interested:

pub struct LlmSampler {
    grammar_sampler: Option<LlamaSampler>,
    chain: LlamaSampler,

    cur_p: Option<LlamaTokenDataArray>,
}

impl LlmSampler {
    pub fn new(model: &LlamaModel, grammar: &str) -> Self {
        // TODO: Implement a real sampler based on opts
        let samplers = vec![LlamaSampler::dist(0)];

        let sampler = LlamaSampler::chain(samplers, false);

        let grammar_sampler = if !grammar.is_empty() {
            Some(LlamaSampler::grammar(model, grammar, "root"))
        } else {
            None
        };

        Self { grammar_sampler, chain: sampler, cur_p: None }
    }

    pub fn accept(&mut self, token: LlamaToken, with_grammar: bool) {
        if with_grammar {
            if let Some(s) = self.grammar_sampler.as_mut() {
                s.accept(token);
            }
        }

        self.chain.accept(token);
    }

    pub fn sample(&mut self, ctx: &LlamaContext, idx: i32) -> LlamaToken {
        self.set_logits(ctx, idx);

        {
            let cur_p = self.cur_p.as_mut().unwrap();
            self.chain.apply(cur_p);

            let id = cur_p.data[cur_p.selected.unwrap()].id();

            // check if the sampled token fits the grammar
            if let Some(grammar_sampler) = self.grammar_sampler.as_mut() {
                let single_token_data = LlamaTokenData::new(id, 1.0, 0.0);
                let mut single_token_data_array =
                    LlamaTokenDataArray::new(vec![single_token_data], false);

                grammar_sampler.apply(&mut single_token_data_array);

                // the grammar sampler sets the logit of invalid tokens to -inf
                let logit = single_token_data_array.data[0].logit();
                let is_valid = !(logit.is_infinite() && logit.is_sign_negative());

                if is_valid {
                    return id;
                }
            }
        }

        // resampling:
        // if the token is not valid, sample again, but first apply the grammar sampler and then the sampling chain
        self.set_logits(ctx, idx);
        let cur_p = self.cur_p.as_mut().unwrap();

        if let Some(grammar_sampler) = self.grammar_sampler.as_mut() {
            grammar_sampler.apply(cur_p);
        }
        self.chain.apply(cur_p);

        cur_p.data[cur_p.selected.unwrap()].id()
    }

    fn set_logits(&mut self, ctx: &LlamaContext, i: i32) {
        let logits = ctx.get_logits_ith(i);
        let n_vocab = ctx.model.n_vocab();

        // build a candidate array over the full vocabulary
        let mut cur = Vec::with_capacity(n_vocab as usize);
        for token_id in 0..n_vocab {
            let token = LlamaToken(token_id);
            cur.push(LlamaTokenData::new(token, logits[token_id as usize], 0.0));
        }

        self.cur_p = Some(LlamaTokenDataArray::new(cur, false));
    }
}
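
For reference, a minimal decode loop driving this wrapper could look like the sketch below (ctx, batch, n_cur, and grammar_str are assumed to be set up as in the simple example; error handling is omitted):

let mut sampler = LlmSampler::new(&model, grammar_str);

loop {
    ctx.decode(&mut batch).expect("decode failed");

    // sample from the logits of the last token in the batch
    let token = sampler.sample(&ctx, batch.n_tokens() - 1);
    if model.is_eog_token(token) {
        break;
    }

    // feed the chosen token into both the grammar sampler and the chain
    sampler.accept(token, true);

    batch.clear();
    batch.add(token, n_cur, &[0], true).expect("failed to add token");
    n_cur += 1;
}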

@nkoppel
Contributor

nkoppel commented Dec 24, 2024

This should be possible with only:

let mut sampler = LlamaSampler::chain_simple([
    LlamaSampler::grammar(&model, grammar_str, "root"),
    LlamaSampler::dist(seed),
]);

As long as you use one of the sample or accept methods so that the grammar sampler is fed the generated tokens, the grammar state stays in sync. I am not quite sure how this will behave when no token in the vocabulary is valid under the grammar, but hopefully it defaults to outputting eos.
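
For example (a sketch, with ctx and batch set up as in the simple example):

// `sample` both selects a token and accepts it into every sampler in the
// chain, so the grammar sampler sees each generated token exactly once;
// do not call `accept` on the same token again afterwards.
let token = sampler.sample(&ctx, batch.n_tokens() - 1);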

@ethragur

For me that always results in a core dump:

llama-cpp-rs/target/release/build/llama-cpp-sys-2-ae79962bb25bc99c/out/llama.cpp/src/llama-grammar.cpp:1137: GGML_ASSERT(!grammar.stacks.empty()) failed
target/release/simple(+0x1be8cb) [0x5555557128cb]
target/release/simple(+0x1bedc7) [0x555555712dc7]
target/release/simple(+0x2c7873) [0x55555581b873]
target/release/simple(+0x2bd4c2) [0x5555558114c2]
target/release/simple(+0x19720a) [0x5555556eb20a]
target/release/simple(+0x1b0863) [0x555555704863]
target/release/simple(+0x1ae68d) [0x55555570268d]
target/release/simple(+0x48f8e7) [0x5555559e38e7]
target/release/simple(+0x19c855) [0x5555556f0855]
/nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/libc.so.6(+0x2a27e) [0x7ffff7a3127e]
/nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/libc.so.6(__libc_start_main+0x89) [0x7ffff7a31339]
target/release/simple(+0x192e65) [0x5555556e6e65]
Aborted (core dumped)

I tested it with 0.1.84 and the main (9af1286) branch.
Here is the diff to the simple example:


+    let grammar_str = r#"root ::= answer
+answer ::= "{"   ws   "\"response\":"   ws   string   "}"
+answerlist ::= "[]" | "["   ws   answer   (","   ws   answer)*   "]"
+string ::= "\""   ([^"]*)   "\""
+boolean ::= "true" | "false"
+ws ::= [ \t\n]*
+number ::= [0-9]+   "."?   [0-9]*
+stringlist ::= "["   ws   "]" | "["   ws   string   (","   ws   string)*   ws   "]"
+numberlist ::= "["   ws   "]" | "["   ws   string   (","   ws   number)*   ws   "]""#;
+
     let mut sampler = LlamaSampler::chain_simple([
+        LlamaSampler::grammar(&model, grammar_str, "root"),
         LlamaSampler::dist(seed.unwrap_or(1234)),
         LlamaSampler::greedy(),
     ]);

@nkoppel
Copy link
Contributor

nkoppel commented Dec 26, 2024

That is a bug in the simple example itself. Both LlamaSampler::sample and LlamaSampler::accept feed the token into the sampler, so the example accepts each generated token twice; the grammar then sees "{{" where the model produced "{", and the string "{{" is not valid within the grammar. Removing the line sampler.accept(token); fixed this error for me.
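
In other words, the loop body should rely on sample alone, roughly (a sketch):

// `sample` already accepts the token into the chain, including the grammar
// sampler, so there is no separate `sampler.accept(token)` call.
let token = sampler.sample(&ctx, batch.n_tokens() - 1);
// ...print the token and queue it for the next decode as before...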
