This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method

This model was merged using the SLERP merge method.
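For reference, SLERP interpolates each pair of weight tensors along the arc of the unit sphere rather than the straight chord, which preserves parameter norms better than plain linear averaging. Below is a minimal illustrative sketch of the operation, not mergekit's exact implementation (mergekit additionally falls back to linear interpolation for near-parallel tensors, which this sketch mimics with a simple threshold):

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t = 0.0 returns a (here: pure Vistral), t = 1.0 returns b (pure Cydonia).
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two tensors
    if theta < eps:  # nearly parallel: fall back to linear interpolation
        return ((1.0 - t) * a_flat + t * b_flat).reshape(a.shape)
    sin_theta = np.sin(theta)
    coef_a = np.sin((1.0 - t) * theta) / sin_theta
    coef_b = np.sin(t * theta) / sin_theta
    return (coef_a * a_flat + coef_b * b_flat).reshape(a.shape)
```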
### Models Merged

The following models were included in the merge:

- /workspace/Vistral-24B-Instruct
- /workspace/Cydonia-24B-v4.2.0
### Configuration

The following YAML configuration was used to produce this model:
```yaml
# --- Gradient SLERP: Vistral (RU) x Cydonia (RP) ---
merge_method: slerp
base_model: /workspace/Vistral-24B-Instruct  # SLERP anchor (t = 0.0 = pure Vistral)
dtype: bfloat16
models:
  - model: /workspace/Vistral-24B-Instruct
  - model: /workspace/Cydonia-24B-v4.2.0
parameters:
  # Smooth profile: more Vistral (RU) at the bottom, more Cydonia (RP) at the top.
  # The 5 nodes are interpolated uniformly across all transformer layers.
  t:
    - filter: self_attn
      value: [0.35, 0.45, 0.55, 0.60, 0.65]  # lower -> closer to Vistral, higher -> closer to Cydonia
    - filter: mlp
      value: [0.30, 0.45, 0.55, 0.60, 0.60]
    - value: 0.50  # default for any tensors not covered by the filters above
  # SLERP the embed/lm_head tensors too, even with differing vocabularies
  # (only available when merging exactly 2 models)
  embed_slerp: true
# Build a combined vocabulary to correctly resolve the "+2
tokenizer_source: union
```
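The five values in each `t` list are gradient nodes, not per-layer values: mergekit spreads them evenly over the depth of the network and linearly interpolates between neighbouring nodes for each layer. A sketch of that scheduling, assuming 40 transformer layers (a hypothetical count for illustration; the real number comes from the model config):

```python
import numpy as np

def layer_t_values(nodes: list[float], num_layers: int) -> list[float]:
    """Linearly interpolate gradient nodes across layer depth.

    nodes      -- the values from the YAML `t` list (e.g. the self_attn profile)
    num_layers -- total transformer layers in the model
    """
    positions = np.linspace(0.0, 1.0, num=len(nodes))  # where each node sits
    depths = np.linspace(0.0, 1.0, num=num_layers)     # relative depth of each layer
    return np.interp(depths, positions, nodes).tolist()

# The self_attn profile from the config above, spread over 40 layers:
attn_t = layer_t_values([0.35, 0.45, 0.55, 0.60, 0.65], num_layers=40)
print(attn_t[0], attn_t[-1])  # 0.35 at the first layer, 0.65 at the last
```

The merge itself is produced by mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output --cuda`.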
Chat templates: Llama 3 or Mistral Tekken.
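To check which template the merged tokenizer actually carries after the union merge, the standard `transformers` API can render a prompt. A short sketch, assuming the repo id shown on this card:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Ilya626/Cydonia_Vistral")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
# Renders the prompt with whichever template (Llama 3 or Mistral Tekken)
# ended up in tokenizer_config.json after the merge.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```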