Discussion about this post

Subendhu Rongali

This is really impressive! Do you have any metrics on long-context benchmarks such as RULER or NIAH? Long context seems to be the last advantage an attention mechanism would hold over a state-space approach like this.

Tim Post

I work mostly with quantized models and this is very exciting for me. While I can get great performance with most 8/12B Q4 models, *memory* is still a huge bottleneck (I work mostly on open source for assistive tech for brain trauma folks like myself).

I'm very excited to see where this goes. Glad I found Featherless!

