liuganghuggingface committed on
Commit 32d216f · verified · 1 Parent(s): 0fcc30e

Update README.md

Files changed (1): README.md (+14 −10)
README.md CHANGED
```diff
@@ -7,13 +7,17 @@ tags:
 - biology
 ---
 
-context_length: 150
-depth: 24
-diffusion_steps: 500
-hidden_size: 1280
-mlp_ratio: 4
-num_heads: 16
-task_name: pretrainv6reset
-tokenizer_name: pretrainv6reset
-vocab_ring_len: 300
-vocab_size: 3000
+### Model Configuration
+
+| Parameter | Value | Description |
+|------------|--------|-------------|
+| **context_length** | 150 | Maximum sequence length for the input context. |
+| **depth** | 24 | Number of transformer layers. |
+| **diffusion_steps** | 500 | Number of diffusion steps during training. |
+| **hidden_size** | 1280 | Hidden dimension size in the transformer. |
+| **mlp_ratio** | 4 | Expansion ratio in the MLP block. |
+| **num_heads** | 16 | Number of attention heads. |
+| **task_name** | `pretrain` | Task type for model training. |
+| **tokenizer_name** | `pretrain` | Tokenizer used for model input. |
+| **vocab_ring_len** | 300 | Length of the circular vocabulary window. |
+| **vocab_size** | 3000 | Total vocabulary size. |
```
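The configuration values from the new table can be collected into a small dataclass for downstream use. This is only an illustrative sketch — the class and field names below are hypothetical and not part of the model repository — but the values and the derived sizes (per-head dimension, MLP hidden width) follow directly from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Configuration values from the README table (class name is illustrative)."""
    context_length: int = 150    # maximum input sequence length
    depth: int = 24              # number of transformer layers
    diffusion_steps: int = 500   # diffusion steps used during training
    hidden_size: int = 1280      # transformer hidden dimension
    mlp_ratio: int = 4           # expansion ratio of the MLP block
    num_heads: int = 16          # number of attention heads
    task_name: str = "pretrain"
    tokenizer_name: str = "pretrain"
    vocab_ring_len: int = 300    # length of the circular vocabulary window
    vocab_size: int = 3000       # total vocabulary size


cfg = ModelConfig()

# Sizes implied by the table:
assert cfg.hidden_size % cfg.num_heads == 0
head_dim = cfg.hidden_size // cfg.num_heads    # 1280 / 16 = 80
mlp_hidden = cfg.hidden_size * cfg.mlp_ratio   # 1280 * 4 = 5120
print(head_dim, mlp_hidden)
```

Freezing the dataclass keeps the configuration immutable once constructed, which avoids accidental drift between the documented and runtime values.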