NPU - QNN
Collection of leading models optimized for NPU deployment on Qualcomm Snapdragon (7 items).
phi-3.5-onnx-qnn is an ONNX QNN int4-quantized version of Microsoft Phi-3.5-mini-instruct, providing small, fast on-device inference optimized for the NPU on Windows ARM64 AI PCs with Snapdragon X Elite processors.
Base model
microsoft/Phi-3.5-mini-instruct
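
As a rough illustration of how a QNN-quantized ONNX model like this can be pointed at the Snapdragon NPU, the sketch below creates an ONNX Runtime session with the QNN execution provider. The file name `model.onnx`, the `backend_path` value, and the use of the onnxruntime-qnn package on Windows ARM64 are assumptions not stated above; an actual chat workflow would additionally need a tokenizer and a generation loop (for example via onnxruntime-genai).

```python
# Minimal sketch (assumptions noted above): load the quantized ONNX model on the
# Snapdragon NPU via ONNX Runtime's QNN execution provider (onnxruntime-qnn).
import onnxruntime as ort

# "model.onnx" is a placeholder path to the downloaded phi-3.5-onnx-qnn weights.
session = ort.InferenceSession(
    "model.onnx",
    providers=["QNNExecutionProvider"],
    provider_options=[{"backend_path": "QnnHtp.dll"}],  # HTP backend targets the NPU
)

# Confirm the QNN execution provider was actually selected (falls back to CPU otherwise).
print(session.get_providers())
```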