Open source large-scale language model development project 'RedPajama' releases its first model, 'RedPajama-INCITE', free for commercial use



The open source project 'RedPajama', which is building a large-scale language model based on the paper Meta published on its large-scale language model 'LLaMA', has released 'RedPajama-INCITE', a base model that aims to reproduce LLaMA as faithfully as possible.

Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & chat models — TOGETHER

https://www.together.xyz/blog/redpajama-models-v1



RedPajama is a joint research project by AI startup Together, ETH Zurich, Stanford University, and others. Development is being carried out with the goal of creating a fully open source language model.

RedPajama consists of three components: 'pre-training data, which needs to be high quality and have broad coverage', 'base models trained at scale on that data', and 'instruction-tuning data and models that improve the base model and make it safe and easy to use'. In April 2023, the first stage, a 5TB training dataset containing more than 1.2 trillion tokens, was released. This dataset has been downloaded hundreds of times and used to train models such as MPT, OpenLLaMA, and OpenAlpaca.

'RedPajama', a project to develop a completely open source large-scale language model that is free for commercial use, has released a training dataset - GIGAZINE



This time, the second stage, the 'base model', has been released. In addition to the base model, RedPajama also released an instruction-tuned model and a chat model, and indicated a policy of further improvement.

RedPajama-INCITE comes in two sizes: one with 3 billion parameters and one with 7 billion parameters. The 3 billion parameter model is said to be the most powerful in its class, and its small size makes it extremely fast and accessible; RedPajama says it runs even on the 'RTX 2070', a GPU released in 2018. The 7 billion parameter model is said to outperform the 'Pythia 7B' model.
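As a rough idea of how easily such a model can be tried, the sketch below loads the 3 billion parameter base model with the Hugging Face Transformers library. The repository name is the one implied by the announcement but should be treated as an assumption, and loading in fp16 is one common way to fit a 3B model on a consumer GPU like the RTX 2070:

```python
# Minimal sketch: running the 3B base model with Hugging Face Transformers.
# The repository name below is assumed from the announcement; check the
# Together blog or the Hugging Face hub for the actual identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "togethercomputer/RedPajama-INCITE-Base-3B-v1"  # assumed repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# Loading in fp16 roughly halves memory use, helping the 3B model fit on
# consumer GPUs such as the RTX 2070 mentioned above.
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("The RedPajama project aims to", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```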

The published variations are:

◆RedPajama-INCITE-Base-3B-v1
A base model that outperforms other open models of similar scale on benchmarks.

◆RedPajama-INCITE-Chat-3B-v1
A chat model created by fine-tuning the base model RedPajama-INCITE-Base-3B-v1 on data from the open source large-scale language model 'Dolly 2.0' and the open source project 'Open Assistant' (a sketch of its prompt format appears after this list).

◆RedPajama-INCITE-Instruct-3B-v1
A model fine-tuned from the base model RedPajama-INCITE-Base-3B-v1 following almost the same recipe as 'GPT-JT', which was developed by linking computers around the world.

◆RedPajama-INCITE-Base-7B-v0.1
An early preview version of RedPajama 7B. Training on 800 billion of the planned 1 trillion tokens has been completed.

◆RedPajama-INCITE-Chat-7B-v0.1
An early preview version of the chat model, trained on top of the RedPajama-INCITE-Base-7B preview.

◆RedPajama-INCITE-Instruct-7B-v0.1
An early preview of a model designed for few-shot prompts, trained on top of the RedPajama-INCITE-Base-7B preview.
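For the chat variants, prompts are assembled as alternating turns. Below is a minimal sketch, assuming the '<human>:' / '<bot>:' turn format shown in the chat models' documentation at release; verify the exact format against the current model card:

```python
# Sketch of the <human>/<bot> turn format reportedly used by the
# RedPajama-INCITE chat models; the exact format is an assumption and
# should be checked against the published model card.
def build_chat_prompt(turns: list[tuple[str, str]], user_message: str) -> str:
    """Assemble a chat prompt from prior (human, bot) turns plus a new message."""
    parts = [f"<human>: {h}\n<bot>: {b}" for h, b in turns]
    parts.append(f"<human>: {user_message}\n<bot>:")
    return "\n".join(parts)

print(build_chat_prompt([], "Who developed the LLaMA model?"))
# -> <human>: Who developed the LLaMA model?
#    <bot>:
```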



RedPajama said, 'The biggest takeaway is that we have demonstrated that the open source community can quickly build high-performance large-scale language models. The feedback has been incredible, and we believe that open collaboration like this, at larger scale, will underpin the best AI systems of the future.'

in Software, Posted by log1p_kr