We present the Eagle and Finch architecture paper on arXiv: https://arxiv.org/abs/2404.05892
The paper documents the architecture changes from RWKV-v4 onwards. It is a collaborative effort with the folks at Eleuther AI, who helped us in the paper-writing process.
Special shout-out to:
BlinkDL: The creator of the RWKV project
Eleuther AI: Who helped us throughout the paper writing process
Linux Foundation AI & Data: For hosting our project
Stability AI: Who sponsored the bulk of the compute for the models covered
Does this cover our latest model?
No. This covers our previously released Eagle and Finch line of models, trained on up to 1.1T tokens.
A reminder: as a fully open source project, we release in the following sequence: code, weights, then the paper. Not the other way around.
Stay tuned this week for more details on our upcoming models:
Eagle 7B: 2.25T tokens
Finch 1.6B: 2.5T tokens
(Some of you probably already know where to find them, if you search through our repos / Discord.)