Hacker Next
High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction
(jchandra.com)
14 points by jchandra 2 days ago | 1 comment
vivahir215 2 days ago
Interesting approach. Curious about the latency tradeoff: OLS + SVD are much heavier than Top-K. Have you benchmarked end-to-end inference latency?
jchandra 2 days ago
[dead]
jchandra 2 days ago
[dead]