🪄 InferenceService name updated
#8 opened 8 months ago by ckavili
change-name
#7 opened 8 months ago by robertgshaw
Overview states 109b, should be 17b
#6 opened 8 months ago by jcordes
Failing to quantize using your method
#4 opened 10 months ago by redd2dead
vLLM launch parameters
👍 3
#3 opened 11 months ago by Clutchkin
Why not FP8 with static and per-tensor quantization?
👍 1
#2 opened 11 months ago by wanzhenchn
Thank you for uploading this.
❤️ 6
#1 opened 11 months ago by chriswritescode