🪄 InferenceService name updated
#8 opened 8 months ago by ckavili
change-name
#7 opened 8 months ago by robertgshaw
Overview states 109b, should be 17b
#6 opened 8 months ago by jcordes
Failing to quantize using your method
#4 opened 10 months ago by redd2dead
vLLM launch parameters
👍 3
#3 opened 11 months ago by Clutchkin
Why not FP8 with static and per-tensor quantization?
👍 1
#2 opened 11 months ago by wanzhenchn
Thank you for uploading this.
❤️ 6
#1 opened 11 months ago by chriswritescode