SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving
(arxiv.org)
2 points | by matt_d | 11 hours ago
No comments yet.