Perplexity is off for Llama 2-7b #47
Comments
Hi, could you check whether the perplexity of the dense LLaMA2-7b matches our number in the paper as well?
Hi,
Hmm, the number I get from rerunning is still 5.12 for dense LLaMA2 (context size 4096); even with context size 2048, the number would be around 5.5, as verified by other works (e.g., Table 4 in https://arxiv.org/abs/2306.00978).
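For context, here is a minimal sketch of how wikitext2 perplexity can be measured at a chosen context length, since that is the variable being compared above. This is not the repo's own evaluation code; the model name and seqlen values simply mirror the ones mentioned in this thread.

```python
# Minimal sketch (not the repo's eval script): dense wikitext2 perplexity
# at a chosen context length, to see how seqlen shifts the number.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
seqlen = 4096  # try 2048 to compare against numbers reported at that length

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

nlls = []
nsamples = ids.numel() // seqlen
for i in range(nsamples):
    batch = ids[:, i * seqlen:(i + 1) * seqlen].to(model.device)
    with torch.no_grad():
        # labels=batch makes the model return the mean next-token NLL
        loss = model(batch, labels=batch).loss
    nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))
print(f"wikitext2 perplexity @ seqlen={seqlen}: {ppl.item():.2f}")
```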
I'm running with context size 4096 as well (nsamples = 333). This is so weird. What versions of datasets and transformers are you using?
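If it helps, the versions being asked about can be printed with a one-off script like this (plain Python, nothing specific to the wanda codebase):

```python
# Print the library versions relevant to this discussion
import datasets
import transformers

print("transformers:", transformers.__version__)
print("datasets:", datasets.__version__)
```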
Hello,
I hope this finds you well.
I was trying to prune Llama 2-7b with wanda (cloned directly from your codebase), so I ran the following command:
python main.py --model meta-llama/Llama-2-7b-hf --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama2_7b/unstructured/wanda/
but I get a perplexity of 10.27, which is way higher than what you report. The model is pruned with C4 calibration data and evaluated on wikitext2 (I changed nothing in the codebase). Do you maybe have a guess as to what I might be doing wrong? (One sanity check I could try is sketched below.)
TIA
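One way to narrow down an unexpectedly high perplexity is to confirm that the pruning mask was actually applied. The sketch below is hypothetical: it assumes the pruned weights were exported to a directory loadable by transformers (the command above may not save the model weights by itself), and the path is a placeholder.

```python
# Hedged sanity check: measure the fraction of exactly-zero weights in the
# pruned model's linear layers. The path is a placeholder and assumes the
# pruned weights were saved somewhere loadable by transformers.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/saved_pruned_llama2_7b", torch_dtype=torch.float16
)

total, zeros = 0, 0
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear) and "lm_head" not in name:
        w = module.weight.data
        total += w.numel()
        zeros += (w == 0).sum().item()

print(f"overall linear-layer sparsity: {zeros / total:.3f}")  # expect ~0.5
```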