Update README.md

README.md CHANGED

@@ -3,6 +3,7 @@ license: mit
 ---
 
 
+
 <div align="center">
 <h1>dParallel: Learnable Parallel Decoding for dLLMs</h1>
 <div align="center">
@@ -12,7 +13,7 @@ license: mit
 <a href="https://github.com/czg1225/dParallel">
 <img src="https://img.shields.io/badge/Paper-Arxiv-darkred.svg" alt="Paper">
 </a>
-<a href="https://huggingface.co/Zigeng/dParallel-LLaDA-
+<a href="https://huggingface.co/Zigeng/dParallel-LLaDA-8B-instruct">
 <img src="https://img.shields.io/badge/HuggingFace-Model-FFB000.svg" alt="Project">
 </a>
 <a href="https://huggingface.co/datasets/Zigeng/dParallel_LLaDA_Distill_Data">
@@ -53,7 +54,7 @@ We introduce dParallel, a simple and effective method that unlocks the inherent
 </tr>
 <tr>
 <td>🤗 <strong>Model</strong></td>
-<td><a href="https://huggingface.co/Zigeng/dParallel-LLaDA-
+<td><a href="https://huggingface.co/Zigeng/dParallel-LLaDA-8B-instruct">dParallel-LLaDA-8b-instruct</a></td>
 </tr>
 <tr>
 <td><strong>Data</strong></td>
@@ -83,8 +84,8 @@ from generate import generate
 import torch
 
 device = 'cuda'
-model = LLaDAModelLM.from_pretrained('Zigeng/dParallel-LLaDA-
-tokenizer = AutoTokenizer.from_pretrained('Zigeng/dParallel-LLaDA-
+model = LLaDAModelLM.from_pretrained('Zigeng/dParallel-LLaDA-8B-instruct', trust_remote_code=True, torch_dtype=torch.bfloat16).to(device).eval()
+tokenizer = AutoTokenizer.from_pretrained('Zigeng/dParallel-LLaDA-8B-instruct', trust_remote_code=True)
 
 prompt = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? Please reason step by step, and put your final answer within \\boxed{}."
 
@@ -136,4 +137,11 @@ Our code builds on [LLaDA](https://github.com/ML-GSAI/LLaDA), [Dream](https://gi
 ## Citation
 If our research assists your work, please give us a star ⭐ or cite us using:
 ```
-```
+```
+
+
+
+
+
+
+