Does Scout have NoPE layers or not?

#4
by jonaskuebler - opened

Hi folks, thanks for the great models.

There seems to be a discrepancy between the configs. In this model, "nope_layer_interval": 4 is set, and the conversion script would write that into the HF checkpoint: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama4/convert_llama4_weights_to_hf.py#L237

On the other hand, the released HF checkpoint has "no_rope_layers": []: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct/blob/main/config.json#L30

So the question is: does Scout have NoPE layers or not? Or is it safe to adjust the conversion script to put an empty list for no_rope_layers as well? https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama4/convert_llama4_weights_to_hf.py#L289

Okay, I think I am getting it now.

https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama4/configuration_llama4.py#L338-L343

So every fourth layer should use NoPE. I think the problem is then that the conversion script puts an integer, "no_rope_layers": 4, into the config (https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama4/convert_llama4_weights_to_hf.py#L289), which gives an error.
But from what I understand, if an empty list is put there instead, the config falls back to the default (which is the every-4th-layer strategy).
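For context, this is roughly what that default amounts to (a minimal standalone sketch of the linked config logic, not the actual transformers code; the helper name is mine):

```python
# Minimal sketch (assuming the linked configuration_llama4.py logic) of how the
# default is built when no_rope_layers is empty: a per-layer mask where
# 1 = the layer uses RoPE and 0 = the layer is NoPE.
def default_no_rope_layers(num_hidden_layers: int, nope_layer_interval: int = 4) -> list[int]:
    return [
        int((layer_idx + 1) % nope_layer_interval != 0)
        for layer_idx in range(num_hidden_layers)
    ]

# Scout has num_hidden_layers = 48, so layers 4, 8, 12, ... (1-indexed) are NoPE:
print(default_no_rope_layers(48)[:8])  # [1, 1, 1, 0, 1, 1, 1, 0]
```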

So this is probably rather a fix for the conversion script.
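E.g. something like this could replace writing the raw integer (a hypothetical sketch, not the actual script code; the variable `original_config` and the key "n_layers" are assumptions about Meta's original params, not checked against the script):

```python
# Hypothetical fix sketch for the conversion script: expand the interval from
# Meta's original config into the explicit per-layer 0/1 mask, instead of
# writing the bare integer into no_rope_layers.
nope_interval = original_config["nope_layer_interval"]  # 4 for Scout
num_layers = original_config["n_layers"]                # assumed key name

no_rope_layers = [
    int((layer_idx + 1) % nope_interval != 0) for layer_idx in range(num_layers)
]
# Alternatively, writing no_rope_layers = [] would trigger the same default
# inside Llama4TextConfig, per the configuration lines linked above.
```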
