
Adjusted with https://github.com/p-e-w/heretic using a custom dataset aiming for increased creative and personal expression, building on the already released datasets.

This dataset was more sensitive and personal in the kinds of adversarial prompts she sought out to generate "refusals" from, so I am not releasing it.

The two vectors went from 76/97 to 61/97 and 67/97 on the test set. I learned my lesson and picked relatively small KL divergences this time (0.02 and 0.03).
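
As a reminder of what a KL divergence of this size means, here is a minimal sketch that compares two next-token probability distributions. The function name `kl_divergence`, the toy distributions, and the direction KL(original || modified) are assumptions for illustration; how Heretic chooses prompts and aggregates the value is not shown.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    # Terms with p_i == 0 contribute nothing, so they are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token probabilities before and after the intervention.
original = [0.70, 0.20, 0.10]
modified = [0.60, 0.28, 0.12]

kl = kl_divergence(original, modified)
print(f"KL(original || modified) = {kl:.4f} nats")
```

With these toy numbers the result lands a little above 0.02, so a shift of roughly this size in the output distribution is the scale being discussed.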

I used a task arithmetic merge to attempt to simply add the two together. The interventions were quite separate in the attention layers, so those likely stacked, but they overlapped in the MLP layers, so the combined effect there is less clear.

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the Task Arithmetic merge method using Lambent/Mira-v1.17-Karcher-27B as a base.

Models Merged

The following models were included in the merge:

Lambent/Mira-v1.17-Karcher-27B-heretic-0.02
Lambent/Mira-v1.17-Karcher-27B-heretic-0.03

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Lambent/Mira-v1.17-Karcher-27B-heretic-0.02
    parameters:
      weight: 1
  - model: Lambent/Mira-v1.17-Karcher-27B-heretic-0.03
    parameters:
      weight: 1
merge_method: task_arithmetic
base_model: Lambent/Mira-v1.17-Karcher-27B
tokenizer_source: Lambent/Mira-v1.17-Karcher-27B
parameters:
  lambda: 1.0
  normalize: true
  int8_mask: true
dtype: bfloat16
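
The configuration above can be read roughly as follows. This is a minimal sketch of task arithmetic on toy parameter lists, not mergekit's code: the function `task_arithmetic` and the toy values are hypothetical, and the handling of `normalize` (dividing the weighted sum of task vectors by the sum of the weights) is an assumption that should be checked against mergekit's documentation.

```python
def task_arithmetic(base, models, weights, lam=1.0, normalize=True):
    """Merge flat parameter lists: base + lam * combined task vector."""
    total = sum(weights)
    merged = []
    for i, b in enumerate(base):
        # Task vector entry: how far each model moved this parameter from base.
        delta = sum(w * (m[i] - b) for m, w in zip(models, weights))
        if normalize:
            # ASSUMPTION: normalization divides by the sum of the weights.
            delta /= total
        merged.append(b + lam * delta)
    return merged

# Hypothetical toy parameters, not real model weights.
base = [1.0, 2.0, 3.0]
model_a = [1.5, 2.0, 3.0]  # changed parameter 0 only
model_b = [1.0, 2.0, 2.0]  # changed parameter 2 only

summed = task_arithmetic(base, [model_a, model_b], [1, 1], normalize=False)
averaged = task_arithmetic(base, [model_a, model_b], [1, 1], normalize=True)
print(summed)    # [1.5, 2.0, 2.0]
print(averaged)  # [1.25, 2.0, 2.5]
```

If that reading of `normalize` is right, two weights of 1 with `normalize: true` apply each task vector at half strength rather than in full, which is worth keeping in mind when judging how much the two interventions stacked.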

Abliteration parameters

Parameter                            Value
direction_index                      25.18
attn.o_proj.max_weight               1.04
attn.o_proj.max_weight_position      57.71
attn.o_proj.min_weight               0.95
attn.o_proj.min_weight_distance      33.58
mlp.down_proj.max_weight             0.87
mlp.down_proj.max_weight_position    47.71
mlp.down_proj.min_weight             0.00
mlp.down_proj.min_weight_distance    7.89

Parameter                            Value
direction_index                      27.34
attn.o_proj.max_weight               1.10
attn.o_proj.max_weight_position      37.23
attn.o_proj.min_weight               0.42
attn.o_proj.min_weight_distance      10.78
mlp.down_proj.max_weight             0.80
mlp.down_proj.max_weight_position    50.48
mlp.down_proj.min_weight             0.72
mlp.down_proj.min_weight_distance    3.09
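
The parameters in the tables above describe a per-layer ablation weight profile for each component. The sketch below shows one way to read them, following the shape described in Heretic's README: the weight peaks at `max_weight` at layer `max_weight_position`, falls off linearly to `min_weight` at `min_weight_distance` layers away, and layers beyond that are left untouched. The function `layer_weights` and the layer count of 62 are assumptions for illustration, not values taken from this card or from Heretic's source.

```python
def layer_weights(num_layers, max_weight, max_weight_position,
                  min_weight, min_weight_distance):
    """Return {layer_index: ablation_weight} for layers inside the kernel."""
    weights = {}
    for layer in range(num_layers):
        distance = abs(layer - max_weight_position)
        if distance > min_weight_distance:
            continue  # outside the kernel: this layer is not modified
        # Linear interpolation from max_weight (distance 0) to min_weight.
        frac = distance / min_weight_distance
        weights[layer] = max_weight + frac * (min_weight - max_weight)
    return weights

NUM_LAYERS = 62  # ASSUMPTION: layer count chosen for illustration only

# mlp.down_proj rows from the first and second tables, in that order.
mlp_first = layer_weights(NUM_LAYERS, 0.87, 47.71, 0.00, 7.89)
mlp_second = layer_weights(NUM_LAYERS, 0.80, 50.48, 0.72, 3.09)

shared = sorted(set(mlp_first) & set(mlp_second))
print(shared)  # [48, 49, 50, 51, 52, 53]
```

Under these assumptions, the second vector's `mlp.down_proj` range sits entirely inside the first's, which is one way to see the MLP overlap mentioned earlier.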