Text Generation
Transformers
Safetensors
PyTorch
nemotron_h
nvidia
nemotron-3
latent-moe
mtp
conversational
custom_code
8-bit precision
modelopt

CUDA Version -- Min requirement?

#6
by raymondlo84-nvidia - opened

I think we should state the minimum CUDA version / driver requirement here.

@raymondlo84-nvidia From my testing: if you run on the host (non-containerized vLLM), you need CUDA 12.9+. Via Docker it doesn't matter — 570+ drivers support CUDA forward compatibility, and the container will automatically use the embedded CUDA runtime that vLLM requires.
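One gotcha if the model card ever documents a check like this: dotted CUDA versions must be compared numerically, not as strings (lexically, "12.10" sorts *before* "12.9"). A minimal sketch — the helper name and the 12.9 floor from the comment above are the only assumptions:

```python
def meets_min_cuda(version: str, minimum: str = "12.9") -> bool:
    """Compare dotted version strings component-by-component as integers."""
    parse = lambda v: tuple(int(p) for p in v.split("."))
    return parse(version) >= parse(minimum)

print(meets_min_cuda("12.9"))   # True
print(meets_min_cuda("12.10"))  # True  (numeric: 10 >= 9; a string compare would say False)
print(meets_min_cuda("12.4"))   # False
```

You can feed it the `release X.Y` field from `nvcc --version` or the CUDA version reported by `nvidia-smi`.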
