August 12, 2024 1:02 PM
Today, Abu Dhabi-backed Technology Innovation Institute (TII), a research organization working on new-age technologies across domains like artificial intelligence, quantum computing and autonomous robotics, released a new open-source model called Falcon Mamba 7B.
Available on Hugging Face, the causal decoder-only offering uses the novel Mamba State Space Language Model (SSLM) architecture to handle various text-generation tasks and outperform leading models in its size class, including Meta’s Llama 3 8B, Llama 3.1 8B and Mistral 7B, on select benchmarks.
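For readers who want to try it, the sketch below shows how a Hugging Face causal language model of this kind is typically loaded with the transformers library. The repository id "tiiuae/falcon-mamba-7b" is assumed from TII's usual naming and should be checked against the model card, and a transformers version recent enough to include Mamba-style architectures is required.

```python
# Minimal sketch: loading the model and sampling a completion with
# Hugging Face transformers. The repo id below is an assumption;
# verify it on the model's Hugging Face page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of the UAE is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```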
It comes as the fourth open model from TII after Falcon 180B, Falcon 40B and Falcon 2 but is the first in the SSLM category, which is rapidly emerging as a new alternative to transformer-based large language models (LLMs) in the AI domain.
The institute is offering the model under the ‘Falcon License 2.0,’ a permissive license based on Apache 2.0.
What does the Falcon Mamba 7B bring to the table?
While transformer models continue to dominate the generative AI space, researchers have noted that the architecture can struggle when dealing with longer pieces of text.
Essentially, transformers’ attention mechanism, which works by comparing every word (or token) with every other word in the text to understand context, demands more compute and memory as texts grow longer, since the number of comparisons scales quadratically with the length of the sequence.
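To make that scaling concrete, here is a minimal sketch (not TII's code) of scaled dot-product attention in plain NumPy. The n-by-n score matrix built in the middle is where the quadratic cost comes from: every token is compared against every other token.

```python
# Minimal sketch of scaled dot-product attention, illustrating why
# transformers get expensive on long inputs: the score matrix has one
# entry per (token, token) pair, i.e. O(n^2) in sequence length n.
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # shape (n, n): every token vs. every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # each output token is a weighted mix of value vectors

n, d = 1024, 64  # 1,024 tokens -> a 1,024 x 1,024 score matrix
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (1024, 64); doubling n quadruples the score-matrix work
```

Mamba-style state space models sidestep this matrix entirely, processing tokens sequentially through a fixed-size state, which is why they scale more gracefully to long contexts.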