What is TII Falcon 180B open source language model?

The Technology Innovation Institute (TII) has made a significant contribution to the open-source community with Falcon, a family of generative large language models (LLMs). The flagship model boasts an impressive 180 billion parameters, and the family also includes 40B, 7.5B, and 1.3B parameter versions.

When Falcon 40B was launched, it quickly gained recognition as the world’s top-ranked open-source AI model. This version of Falcon, with its 40 billion parameters, was trained on an astounding one trillion tokens. For two months following its launch, Falcon 40B held the number one spot on Hugging Face’s leaderboard for open-source large language models. What sets Falcon 40B apart is that it is offered completely royalty-free with weights, a revolutionary move that helps democratize AI and make it a more inclusive technology.

The Falcon 40B LLM is multilingual, working well with a variety of languages including English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish. This foundational LLM serves as a versatile base model that can be fine-tuned to meet specific requirements or objectives.
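
For readers who want to experiment, the Falcon weights are published on the Hugging Face Hub, so a base model can be loaded with the standard transformers API. The snippet below is a minimal sketch, not an official TII example; it assumes the transformers, accelerate, and torch packages are installed and that enough GPU memory is available for the chosen checkpoint.

```python
# Minimal sketch: loading a Falcon base model from Hugging Face and generating text.
# Assumes transformers, accelerate, and torch are installed and sufficient GPU memory
# is available (the 40B weights need roughly 80 GB in bfloat16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"  # smaller variants such as tiiuae/falcon-7b also exist

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # spread the layers across available GPUs
    # older transformers releases may additionally require trust_remote_code=True
)

inputs = tokenizer("The Falcon language model is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```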

Falcon 180B open source LLM

Falcon 180B, the super-powerful language model with 180 billion parameters, was trained on 3.5 trillion tokens. It currently holds the top spot on the Hugging Face Leaderboard for pre-trained open large language models and is available for both research and commercial use. The model excels at tasks such as reasoning, coding, and proficiency and knowledge tests, even outperforming competitors like Meta’s LLaMA 2.

Compared with closed-source models, Falcon 180B ranks just behind OpenAI’s GPT-4 and performs on par with Google’s PaLM 2, which powers Bard, despite being half that model’s size. This is a testament to the quality of the model, given that LLMs are particularly sensitive to the data they are trained on. The TII team built a custom data pipeline to extract high-quality pre-training data using extensive filtering and deduplication, implemented both at the sample level and at the string level.
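
To illustrate what sample-level and string-level deduplication mean in practice, here is a toy Python sketch. It is not TII’s actual pipeline, which relies on large-scale filtering and fuzzy matching; the function names and heuristics below are illustrative assumptions only.

```python
# Toy illustration of the two deduplication ideas mentioned above.
import hashlib

def sample_level_dedup(documents):
    """Drop documents whose full text has already been seen (exact duplicates)."""
    seen, unique = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def string_level_dedup(doc):
    """Drop repeated lines within a single document (a crude string-level pass)."""
    seen, kept = set(), []
    for line in doc.splitlines():
        if line.strip() and line in seen:
            continue  # skip an exact repeat of a non-empty line
        seen.add(line)
        kept.append(line)
    return "\n".join(kept)

docs = ["the quick brown fox", "the quick brown fox", "a different document"]
print(len(sample_level_dedup(docs)))  # 2 unique documents remain
```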


In an effort to encourage innovative use of the model, TII launched a “Call for Proposals” for Falcon 40B, inviting scientists, researchers, and innovators to submit use cases. The most exceptional proposals are set to receive an investment of training compute to shape transformative solutions on the model. Remarkably, Falcon 40B required only 75 percent of GPT-3’s training compute, 40 percent of Chinchilla AI’s, and 80 percent of PaLM-62B’s.

One of the distinguishing factors in the development of Falcon was the quality of the training data. The pre-training data collected for Falcon 40B was nearly five trillion tokens, gathered from a variety of sources including public web crawls (~80%), research papers, legal text, news, literature, and social media conversations.

Trained on 3.5 trillion tokens

The Falcon model’s training process involved up to 4,096 GPUs running simultaneously, totaling approximately 7 million GPU-hours. Falcon’s training data set comprises web data, complemented by a mix of curated content including conversations, technical papers, Wikipedia, and a small fraction of code. The model has also been fine-tuned on various conversational and instructional data sets, although the license excludes hosting use.
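
As a rough sanity check on those figures, dividing the quoted GPU-hours by the number of GPUs gives the implied wall-clock training time; the numbers below simply restate the approximations from the paragraph above.

```python
# Back-of-the-envelope check of the training figures quoted above.
total_gpu_hours = 7_000_000   # approximate GPU-hours reported for training
num_gpus = 4096               # GPUs used simultaneously

wall_clock_hours = total_gpu_hours / num_gpus
print(f"{wall_clock_hours:.0f} hours ≈ {wall_clock_hours / 24:.0f} days of continuous training")
# ~1709 hours, i.e. roughly 71 days on the full cluster
```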

Despite its impressive performance, the Falcon model does not have updated information on very recent events. However, its release is seen as a significant advancement for the open-source world. The model is 2.5 times larger than Llama 2 and outperforms Llama 2, Stable LM, Red Pajama, MPT, OpenAI’s GPT-3.5, and Google’s PaLM on various benchmarks. This makes it a powerful tool for both research and commercial use, and a significant contribution to the open-source community.


