Updates needed?

#1
by xdavxd - opened

I noticed unsloth redid their quants using an updated llama.cpp, for reasons. Does that apply here as well?

Most fixes to llama.cpp were around the runtime, not conversion, but I do intend to revisit this over the next couple of days and see if it's worth remaking them all. For now they seem to work pretty well.

It doesn't look like regeneration is needed, as llama.cpp added a workaround for old GGUFs: https://github.com/ggml-org/llama.cpp/pull/21500/commits/4e19abc52b275f547d2b9968095cc599c6e2e2e2

Or maybe it is needed, if imatrix is used...

The BOS token is set in imatrix anyway, since it doesn't use the chat template :)

The 2B and 4B models support audio. Is there any audio in the imatrix generation set?

No, imatrix doesn't work on the mmproj files, which are the ones that support multimodal input (as far as I understand).

imatrix currently only supports text input
