NHacker Next
High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction (jchandra.com)
vivahir215 2 days ago [-]
Interesting approach. Curious about the latency tradeoff: OLS and SVD are much heavier than Top-K. Have you benchmarked end-to-end inference latency?
jchandra 2 days ago [-]
[dead]