Earlier this year, we took the world by storm when we announced that our Eagle model had beaten Meta's Llama-2 while requiring less training time, making it the world's most efficient model.
While Eagle still packs a powerful punch, and has been helping diverse use cases from multilingual applications to content moderation, gaming, and role-play, we've been working on something new to bring our insights on efficiency to a much broader realm.
Just this Friday, we launched Featherless AI, which enables serverless inference of every Llama-3 8B and 70B model on Hugging Face that we could get our hands on.
That's over 475 models, with many more being added daily.
This lets anyone quickly experiment with, try out, and choose the latest and best models from Hugging Face, starting from $10/month.
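To give a feel for what this looks like in practice, here is a minimal sketch of calling a model through an OpenAI-compatible endpoint. The base URL, API key placeholder, and model ID below are illustrative assumptions, not confirmed details from this announcement:

```python
# Minimal sketch: serverless inference against an OpenAI-compatible endpoint.
# The base_url and model ID are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",                    # placeholder key
)

response = client.chat.completions.create(
    # Any supported Hugging Face model ID would go here
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "user", "content": "Summarize what serverless inference means."}
    ],
)
print(response.choices[0].message.content)
```

Because requests are just per-model API calls rather than per-model deployments, swapping models is a one-line change to the `model` parameter.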
Previously, using even the smallest fine-tune required dedicated hardware, which translates to real hosting costs, whether you're experimenting with a model or ramping up production use. This is a barrier to a host of use cases, particularly agents, where each step in the agent computation might benefit from a different specialized model.
The goal of Featherless is to make every model on Hugging Face available serverlessly, and with these Llama- and RWKV-based models, we're a big step of the way there.
With Featherless, you can experiment with an entirely new range of models at completely different economics.