This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method

This model was merged using the SLERP merge method.
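For reference, SLERP interpolates each pair of weight tensors along the arc of the unit sphere rather than the straight chord, which preserves parameter norms better than plain linear averaging. Below is a minimal illustrative sketch of the operation, not mergekit's exact implementation (mergekit additionally falls back to linear interpolation for near-parallel tensors, which this sketch mimics with a simple threshold):

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t = 0.0 returns a (here: pure Vistral), t = 1.0 returns b (pure Cydonia).
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two tensors
    if theta < eps:  # nearly parallel: fall back to linear interpolation
        return ((1.0 - t) * a_flat + t * b_flat).reshape(a.shape)
    sin_theta = np.sin(theta)
    coef_a = np.sin((1.0 - t) * theta) / sin_theta
    coef_b = np.sin(t * theta) / sin_theta
    return (coef_a * a_flat + coef_b * b_flat).reshape(a.shape)
```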
### Models Merged

The following models were included in the merge:

- /workspace/Vistral-24B-Instruct
- /workspace/Cydonia-24B-v4.2.0
### Configuration

The following YAML configuration was used to produce this model:
```yaml
# --- Gradient SLERP: Vistral (RU) x Cydonia (RP) ---
merge_method: slerp
base_model: /workspace/Vistral-24B-Instruct  # SLERP anchor (t = 0.0 = pure Vistral)
dtype: bfloat16
models:
  - model: /workspace/Vistral-24B-Instruct
  - model: /workspace/Cydonia-24B-v4.2.0
parameters:
  # Smooth profile: more Vistral (RU) at the bottom, more Cydonia (RP) at the top.
  # The 5 nodes are interpolated uniformly across all transformer layers.
  t:
    - filter: self_attn
      value: [0.35, 0.45, 0.55, 0.60, 0.65]  # lower -> closer to Vistral, higher -> closer to Cydonia
    - filter: mlp
      value: [0.30, 0.45, 0.55, 0.60, 0.60]
    - value: 0.50  # default for any tensors not covered by the filters above
  # SLERP the embed/lm_head tensors too, even with differing vocabularies
  # (only available when merging exactly 2 models)
  embed_slerp: true
# Build a combined vocabulary to correctly resolve the "+2
tokenizer_source: union
```
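The five values in each `t` list are gradient nodes, not per-layer values: mergekit spreads them evenly over the depth of the network and linearly interpolates between neighbouring nodes for each layer. A sketch of that scheduling, assuming 40 transformer layers (a hypothetical count for illustration; the real number comes from the model config):

```python
import numpy as np

def layer_t_values(nodes: list[float], num_layers: int) -> list[float]:
    """Linearly interpolate gradient nodes across layer depth.

    nodes      -- the values from the YAML `t` list (e.g. the self_attn profile)
    num_layers -- total transformer layers in the model
    """
    positions = np.linspace(0.0, 1.0, num=len(nodes))  # where each node sits
    depths = np.linspace(0.0, 1.0, num=num_layers)     # relative depth of each layer
    return np.interp(depths, positions, nodes).tolist()

# The self_attn profile from the config above, spread over 40 layers:
attn_t = layer_t_values([0.35, 0.45, 0.55, 0.60, 0.65], num_layers=40)
print(attn_t[0], attn_t[-1])  # 0.35 at the first layer, 0.65 at the last
```

The merge itself is produced by mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output --cuda`.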
Chat templates: Llama 3 or Mistral Tekken.
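To check which template the merged tokenizer actually carries after the union merge, the standard `transformers` API can render a prompt. A short sketch, assuming the repo id shown on this card:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Ilya626/Cydonia_Vistral")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
# Renders the prompt with whichever template (Llama 3 or Mistral Tekken)
# ended up in tokenizer_config.json after the merge.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```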