
example to use grammar sampler ? #604

Open
edwin0cheng opened this issue Dec 17, 2024 · 4 comments

Comments

@edwin0cheng

Is there an example of how to use a grammar sampler in llama-cpp-rs?

In the original llama.cpp server example, the grammar sampler and the sampler chain are used separately. Do I have to use them that way here?

@edwin0cheng
Author

Here is how I do it, in case anyone is interested:

pub struct LlmSampler {
    grammar_sampler: Option<LlamaSampler>,
    chain: LlamaSampler,

    cur_p: Option<LlamaTokenDataArray>,
}

impl LlmSampler {
    pub fn new(model: &LlamaModel, grammar: &str) -> Self {
        // TODO: Implement a real sampler based on opts
        let samplers = vec![LlamaSampler::dist(0)];

        let sampler = LlamaSampler::chain(samplers, false);

        let grammar_sampler = if !grammar.is_empty() {
            Some(LlamaSampler::grammar(model, grammar, "root"))
        } else {
            None
        };

        Self { grammar_sampler, chain: sampler, cur_p: None }
    }

    pub fn accept(&mut self, token: LlamaToken, with_grammar: bool) {
        if with_grammar {
            if let Some(s) = self.grammar_sampler.as_mut() {
                s.accept(token);
            }
        }

        self.chain.accept(token);
    }

    pub fn sample(&mut self, ctx: &LlamaContext, idx: i32) -> LlamaToken {
        self.set_logits(ctx, idx);

        {
            let cur_p = self.cur_p.as_mut().unwrap();
            self.chain.apply(cur_p);

            let id = cur_p.data[cur_p.selected.unwrap()].id();

            // check if the sampled token fits the grammar
            if let Some(grammar_sampler) = self.grammar_sampler.as_mut() {
                let single_token_data = LlamaTokenData::new(id, 1.0, 0.0);
                let mut single_token_data_array =
                    LlamaTokenDataArray::new(vec![single_token_data], false);

                grammar_sampler.apply(&mut single_token_data_array);

                // the grammar sampler sets the logit of invalid tokens to -inf
                let logit = single_token_data_array.data[0].logit();
                let is_valid = !(logit.is_infinite() && logit.is_sign_negative());

                if is_valid {
                    return id;
                }
            }
        }

        // resampling:
        // if the token is not valid, sample again, but first apply the grammar sampler and then the sampling chain
        self.set_logits(ctx, idx);
        let cur_p = self.cur_p.as_mut().unwrap();

        if let Some(grammar_sampler) = self.grammar_sampler.as_mut() {
            grammar_sampler.apply(cur_p);
        }
        self.chain.apply(cur_p);

        cur_p.data[cur_p.selected.unwrap()].id()
    }

    fn set_logits(&mut self, ctx: &LlamaContext, i: i32) {
        let logits = ctx.get_logits_ith(i);
        let n_vocab = ctx.model.n_vocab();

        // build a candidate array over the full vocabulary
        let mut cur = Vec::with_capacity(n_vocab as usize);
        for token_id in 0..n_vocab {
            let token = LlamaToken(token_id);
            cur.push(LlamaTokenData::new(token, logits[token_id as usize], 0.0));
        }

        self.cur_p = Some(LlamaTokenDataArray::new(cur, false));
    }
}
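
For reference, a minimal decode loop driving this wrapper could look like the sketch below (ctx, batch, n_cur, and grammar_str are assumed to be set up as in the simple example; error handling is omitted):

let mut sampler = LlmSampler::new(&model, grammar_str);

loop {
    ctx.decode(&mut batch).expect("decode failed");

    // sample from the logits of the last token in the batch
    let token = sampler.sample(&ctx, batch.n_tokens() - 1);
    if model.is_eog_token(token) {
        break;
    }

    // feed the chosen token into both the grammar sampler and the chain
    sampler.accept(token, true);

    batch.clear();
    batch.add(token, n_cur, &[0], true).expect("failed to add token");
    n_cur += 1;
}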

@nkoppel
Contributor

nkoppel commented Dec 24, 2024

This should be possible with only:

let mut sampler = LlamaSampler::chain_simple([
    LlamaSampler::grammar(&model, grammar_str, "root"),
    LlamaSampler::dist(seed),
]);

As long as you use one of the sample or accept methods so that the grammar sampler is fed the generated tokens, the grammar state stays in sync. I am not quite sure how this will behave when no token in the vocabulary is valid under the grammar, but hopefully it defaults to outputting eos.
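
For example (a sketch, with ctx and batch set up as in the simple example):

// `sample` both selects a token and accepts it into every sampler in the
// chain, so the grammar sampler sees each generated token exactly once;
// do not call `accept` on the same token again afterwards.
let token = sampler.sample(&ctx, batch.n_tokens() - 1);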

@ethragur

For me that always results in a core dump:

llama-cpp-rs/target/release/build/llama-cpp-sys-2-ae79962bb25bc99c/out/llama.cpp/src/llama-grammar.cpp:1137: GGML_ASSERT(!grammar.stacks.empty()) failed
target/release/simple(+0x1be8cb) [0x5555557128cb]
target/release/simple(+0x1bedc7) [0x555555712dc7]
target/release/simple(+0x2c7873) [0x55555581b873]
target/release/simple(+0x2bd4c2) [0x5555558114c2]
target/release/simple(+0x19720a) [0x5555556eb20a]
target/release/simple(+0x1b0863) [0x555555704863]
target/release/simple(+0x1ae68d) [0x55555570268d]
target/release/simple(+0x48f8e7) [0x5555559e38e7]
target/release/simple(+0x19c855) [0x5555556f0855]
/nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/libc.so.6(+0x2a27e) [0x7ffff7a3127e]
/nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/libc.so.6(__libc_start_main+0x89) [0x7ffff7a31339]
target/release/simple(+0x192e65) [0x5555556e6e65]
Aborted (core dumped)

I tested it with 0.1.84 and the main (9af1286) branch.
Here is the diff to the simple example:


+    let grammar_str = r#"root ::= answer
+answer ::= "{"   ws   "\"response\":"   ws   string   "}"
+answerlist ::= "[]" | "["   ws   answer   (","   ws   answer)*   "]"
+string ::= "\""   ([^"]*)   "\""
+boolean ::= "true" | "false"
+ws ::= [ \t\n]*
+number ::= [0-9]+   "."?   [0-9]*
+stringlist ::= "["   ws   "]" | "["   ws   string   (","   ws   string)*   ws   "]"
+numberlist ::= "["   ws   "]" | "["   ws   string   (","   ws   number)*   ws   "]""#;
+
     let mut sampler = LlamaSampler::chain_simple([
+        LlamaSampler::grammar(&model, grammar_str, "root"),
         LlamaSampler::dist(seed.unwrap_or(1234)),
         LlamaSampler::greedy(),
     ]);

@nkoppel
Copy link
Contributor

nkoppel commented Dec 26, 2024

That is a bug in the simple example itself. Both LlamaSampler::sample and LlamaSampler::accept feed the token into the sampler, so the example accepts each generated token twice; the grammar then sees "{{" where the model produced "{", and the string "{{" is not valid within the grammar. Removing the line sampler.accept(token); fixed this error for me.
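
In other words, the loop body should rely on sample alone, roughly (a sketch):

// `sample` already accepts the token into the chain, including the grammar
// sampler, so there is no separate `sampler.accept(token)` call.
let token = sampler.sample(&ctx, batch.n_tokens() - 1);
// ...print the token and queue it for the next decode as before...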
