Happy December, NeurIPS!
We are proud to announce a triple release of model weights, a charm of finches:
Q-RWKV-6 32B Instruct Preview
Our latest frontier model.
A variant of RWKV-6, converted from an existing Qwen 32B model.
This is our strongest linear model to date: it beats all previous RWKV, State Space, and Liquid AI models, surpassing them on the key English benchmarks and evals.
Excitingly, this unlocks the option of converting existing transformer models to the more efficient RWKV linear-attention architecture.
Its limitation, however, is that it inherits its trained knowledge and tokenizer from the parent model, which in this case covers approximately 30 languages (compared to the 100+ languages supported by RWKV).
See more info: Announcement article
Try the model on our Featherless.ai inference platform.
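If you would rather run the weights locally, here is a minimal loading sketch using Hugging Face transformers. The repo id below is an assumption (check the announcement article for the published location), and we assume the repo ships its custom RWKV-6 modeling code, hence trust_remote_code.

```python
# Minimal sketch: loading Q-RWKV-6 32B Instruct Preview with transformers.
# The repo id is an assumption; see the announcement for the actual location.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "recursal/QRWKV6-32B-Instruct-Preview-v0.1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # custom RWKV-6 modeling code lives in the repo
    device_map="auto",       # shard the 32B weights across available GPUs
    torch_dtype="auto",
)

prompt = "Explain linear attention in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```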
RWKV-6 Finch MoE 37B
Our first RWKV MoE model, built on RWKV-6, with 11B of its 37B parameters active per token. It is currently one of our strongest multi-lingual models.
See more info: Announcement article
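To make the "active parameters" figure concrete, here is a conceptual sketch of top-k Mixture-of-Experts routing. This is an illustration only, not the Finch MoE implementation: a gate scores all experts per token, but only the top k expert MLPs actually run, so each token exercises only a fraction of the total parameter count. The same mechanism is how a 37B-parameter model can compute with only 11B parameters per token.

```python
# Conceptual sketch of top-k MoE routing (illustration, not the Finch MoE code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Route each token to its k highest-scoring experts.
        scores = self.gate(x)                                # (tokens, E)
        weights, idx = torch.topk(scores, self.k, dim=-1)    # (tokens, k)
        weights = F.softmax(weights, dim=-1)                 # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                     # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 expert MLPs ran per token
```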
RWKV-6 Finch 7B World 3
An overall multi-lingual upgrade of our v6 7B base model, and a major step up from our previous 7B models for multi-lingual and mixed-language use cases.
This was developed and released under the RWKV Foundation, with contributors from EleutherAI and the RWKV open-source group.
See more info: Announcement article
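Like our previous World models, these weights should run with the `rwkv` pip package. A minimal sketch follows, under the assumption that the package's usual loading path applies to this v6 checkpoint; the checkpoint filename is a placeholder, so point it at the actual downloaded file.

```python
# Minimal sketch: running Finch 7B World 3 with the `rwkv` package (pip install rwkv).
# The checkpoint path below is a placeholder; use the downloaded .pth file.
import os
os.environ["RWKV_JIT_ON"] = "1"   # enable the package's JIT kernels
os.environ["RWKV_CUDA_ON"] = "0"  # set to "1" to compile the optional CUDA kernel

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Use strategy "cpu fp32" if no GPU is available.
model = RWKV(model="RWKV-x060-World-7B-v3", strategy="cuda fp16")  # placeholder path
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # World models use the RWKV world vocab

args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
print(pipeline.generate("The capital of France is", token_count=64, args=args))
```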