Hi YaRN team,

Thank you for the awesome work. I'm currently evaluating several RoPE scaling methods, and fortunately they are all available in this repo. I have a question about the configuration of RoPE scaling.

I see that requirements.txt already pins transformers >= 4.34.0, which means I could use the "linear" and "dynamic" (NTK) scaling out of the box with transformers, simply by setting rope_scaling on the config returned by AutoConfig.from_pretrained():

```python
config.rope_scaling = {"type": "linear", "factor": args.linear}
```

or

```python
config.rope_scaling = {"type": "dynamic", "factor": args.dynamic_ntk}
```

I tried this, removed the patches for linear and dynamic-NTK, and the results look identical to those from your patched implementation. Moreover, transformers also supports the Falcon architecture (https://github.com/huggingface/transformers/blob/main/src/transformers/models/falcon/modeling_falcon.py#L162).

So my question is: are there any differences between these two implementations, or are your linear and dynamic-NTK patches there to keep the reproduction evals consistent?
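For reference, here is a minimal sketch of what the two scaling types do, based on my reading of the Llama rotary embedding code in transformers 4.34 (the tensor machinery is stripped down to plain Python, and the helper names are mine, not from either codebase):

```python
import math

def linear_scaled_positions(seq_len, factor):
    # "linear" scaling: position indices are divided by the factor;
    # the per-dimension rotary frequencies stay unchanged.
    return [t / factor for t in range(seq_len)]

def dynamic_ntk_inv_freq(seq_len, dim=128, base=10000.0,
                         max_position_embeddings=4096, factor=2.0):
    # "dynamic" (NTK) scaling: once the sequence exceeds the trained
    # context, the rotary base is enlarged instead of compressing the
    # positions (formula as in LlamaDynamicNTKScalingRotaryEmbedding).
    if seq_len > max_position_embeddings:
        base = base * ((factor * seq_len / max_position_embeddings)
                       - (factor - 1)) ** (dim / (dim - 2))
    # inv_freq[i] = 1 / base^(2i/dim), one entry per frequency pair
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]
```

If both the repo's patch and transformers implement exactly these rules, identical results would be expected, which may be why only the reproduction setup differs.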