Shaowei Liu
lsw825
AI & ML interests
None yet
Organizations
Was this really trained during QAT using a symmetric 4bit quant with only 15/16 values used?
9
3
#26 opened 3 months ago
by
jukofyork
Fix args in SGLang launch command
#9 opened 3 months ago
by
ispobock
Update tokenizer_config.json
12
5
#13 opened 7 months ago
by
bchenfireworks
AttributeError: 'MLACommonMetadataBuilder' object has no attribute 'page_size'
2
#6 opened 7 months ago
by
Jerry-PigeonG
Run 1T-param on A100/H100(80G)x8 using FP4
5
7
#9 opened 7 months ago
by
ghostplant
Question about the function call in chat template
2
#8 opened 7 months ago
by
PengYM
Any plan to release a Vision enabled version with the same or near the same base and instruct model?
2
8
#7 opened 7 months ago
by
drmcbride
Can you provide Machine Specs
11
#2 opened 7 months ago
by
kingabzpro
Kimi-K2-Mini
22
15
#1 opened 7 months ago
by
PSM24
you should mention this model use deepseek architecture
1
#3 opened 7 months ago
by
CHNtentes