Japanese AI company 'Sakana AI' has developed a method for creating high-performance models by combining existing AI models, using evolutionary algorithms to try a huge number of combinations and produce high-performance LLMs and image generation models that would be difficult for humans to come up with



Tokyo-based AI company Sakana AI has developed a method for creating new models by combining multiple generative AI models using evolutionary algorithms. Sakana AI has already used the method to create large language models and an image generation model, and each has been confirmed to outperform existing models.

Building foundation models using evolutionary algorithms
https://sakana.ai/evolutionary-model-merge-jp/

[2403.13187] Evolutionary Optimization of Model Merging Recipes
https://arxiv.org/abs/2403.13187

◆Technology overview
Creating a generative AI model from scratch requires a large number of high-performance GPUs to carry out an enormous amount of computation. By contrast, 'model merging', a method that creates a new model by combining existing models, can be carried out at relatively low cost. For this reason, model merging is attracting attention as a technique that lowers the barriers to developing generative AI models.
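As a rough illustration of the idea, one simple way to merge models that share the same architecture is to take a weighted average of their parameters. The sketch below assumes this averaging scheme and uses tiny hypothetical parameter dictionaries in place of real checkpoints; the function name `merge_models` is made up for illustration, and the article does not specify which merging operations Sakana AI actually uses.

```python
# Minimal sketch: merging by weighted parameter averaging (an assumed scheme,
# not necessarily the one Sakana AI uses). Each toy "model" is a dict that
# maps a parameter name to a list of floats.

def merge_models(models, weights):
    """Return a new model whose parameters are the weighted average of `models`."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    merged = {}
    for name in models[0]:
        # Group the values that sit at the same index across all models.
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(w * v for w, v in zip(norm, col)) for col in columns]
    return merged


# Hypothetical stand-ins for a Japanese-language model and a math model.
japanese_model = {"layer0": [1.0, 0.0], "layer1": [2.0, 2.0]}
math_model = {"layer0": [0.0, 1.0], "layer1": [4.0, 0.0]}

merged = merge_models([japanese_model, math_model], weights=[1.0, 1.0])
print(merged)
```

The merge itself is only arithmetic on existing parameters, with no training step, which is why merging is so much cheaper than building a model from scratch. The hard part is choosing good weights.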

However, creating generative AI models through model merging depends largely on human intuition and experience, and there is no systematic theory of the form 'if generative AI models are combined in this way, a high-performance model will result.' Sakana AI therefore decided to adopt an 'evolutionary algorithm' as a systematic approach to performing model merging efficiently.

An evolutionary algorithm is an algorithm that searches for the best way to achieve a given goal by imitating the evolutionary process of living organisms. The following video is an easy-to-understand example of an evolutionary algorithm.

A genetic algorithm was used to learn how to swing. Long version/Physics engine [Munimuni] - YouTube


Evolutionary algorithms make it possible to mechanically try out a huge number of combinations and find the most effective one. Applying an evolutionary algorithm to model merging therefore makes it possible to discover combinations that human intuition tends to overlook. Sakana AI used evolutionary algorithms to optimize the task of 'combining multiple existing generative AI models to create a generative AI model that scores highly on a specific benchmark', and succeeded in creating high-performance generative AI models. These models are merged in ways that even experts would struggle to discover through trial and error, and they have been confirmed to outperform existing models.

◆Examples of generative models created using evolutionary algorithms
The outline of the generative AI models 'EvoLLM-JP', 'EvoVLM-JP', and 'EvoSDXL-JP' created by Sakana AI is as follows.

・EvoLLM-JP
EvoLLM-JP is a generative AI model created with the aim of being a 'language model that can solve math problems in Japanese.' It combines the Japanese-specialized language model 'Shisa Gamma 7B v1' with the math-specialized English language models 'WizardMath 7B v1.1' and 'Abel 7B 002'. It was created by repeating, over several hundred generations, a process of combining the models in various patterns, measuring the performance of each resulting model on 'MGSM', a benchmark for the mathematical ability of LLMs, and passing the highest-scoring model on to the next generation.
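The loop described above (combine, score, keep the best, repeat) can be sketched as a simple elitist evolutionary search over merge weights. Everything in the sketch is a toy stand-in: the three "models" are short vectors, `benchmark_score` is a hypothetical substitute for a real benchmark such as MGSM, and the mutate-and-select scheme is one common choice, not necessarily the algorithm Sakana AI uses.

```python
import random

# Toy sketch of evolutionary search over merge "recipes" (weight vectors).
# All names and the scoring function are illustrative assumptions.

def merge(models, weights):
    """Weighted average of toy models, each represented as a list of floats."""
    total = sum(weights)
    size = len(models[0])
    return [sum(w / total * m[i] for w, m in zip(weights, models)) for i in range(size)]

def benchmark_score(model, target):
    """Hypothetical benchmark: higher is better, 0.0 is a perfect score."""
    return -sum((a - b) ** 2 for a, b in zip(model, target))

def evolve_recipe(models, target, generations=200, children=20, sigma=0.1, seed=0):
    """Mutate the best recipe each generation and keep the top scorer (elitism)."""
    rng = random.Random(seed)
    best = [1.0] * len(models)  # start from an equal-weight merge
    best_score = benchmark_score(merge(models, best), target)
    for _ in range(generations):
        # Each child is the current best recipe plus Gaussian noise,
        # clamped so that every weight stays positive.
        candidates = [
            [max(1e-6, w + rng.gauss(0.0, sigma)) for w in best]
            for _ in range(children)
        ]
        for recipe in candidates:
            score = benchmark_score(merge(models, recipe), target)
            if score > best_score:  # only a better recipe replaces the parent
                best, best_score = recipe, score
    return best, best_score

# Three toy source models and a target behavior the merged model should match.
source_models = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [0.5, 0.3, 0.2]

baseline = benchmark_score(merge(source_models, [1.0, 1.0, 1.0]), target)
best_recipe, best_score = evolve_recipe(source_models, target)
print("equal-weight score:", baseline)
print("evolved score:", best_score)
```

Because the parent survives unless a child beats it, the score never gets worse from one generation to the next. In a real setting each evaluation means running a merged model on a benchmark, so the number of generations and children is limited by the evaluation budget.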

Below are the results of measuring the performance of the final EvoLLM-JP using MGSM. EvoLLM-JP (Ours) scored higher not only than its source models 'Shisa Gamma 7B v1', 'WizardMath 7B v1.1', and 'Abel 7B 002', but also than existing models such as 'Llama 2 70B' and 'GPT-3.5'.



In addition, it was confirmed that EvoLLM-JP performs well not only on math problems but also on Japanese in general. The table below shows the results of measuring Japanese-language performance using the 'Language Model Evaluation Harness'. EvoLLM-JP (Ours) recorded higher scores not only than language models with 7B parameters, but also than some language models with 70B parameters.



Examples of EvoLLM-JP's answers are below. The model was able to correctly answer a question that requires a combination of 'knowledge of Japanese culture' and 'numerical ability.'

[Question] This year's first day of spring is February 3rd. What is the sum of the month and day of Setsubun?
[Answer] Setsubun is the day before the first day of spring. This year's first day of spring is February 3rd, so Setsubun is February 2nd. Adding the dates gives 2 + 2 = 4.



The following example shows Japanese-language ability unrelated to calculation.

[Question] What is one typical Japanese dish?
[Answer] Japan's representative dish is 'sushi'. Sushi is a traditional Japanese dish made with rice and topped with various toppings. Sushi is popular all over the world and represents Japanese cuisine.



It is also possible to crack jokes using the Kansai dialect.

[Question] Please try telling a funny joke in Kansai dialect.
[Answer] Oh, listen! Yesterday, after drinking, I got so drunk that I hit a telephone pole. At that moment, my consciousness flew away, and the next thing I knew, I was lying in a hospital bed. That's what I thought at that time. ``This is an act of God. I was so drunk that I hit a telephone pole after drinking.'' Then the doctor came and told me. ``You ran into a telephone pole after drinking, getting drunk.'' I said, ``Oh, listen! I got drunk, got drunk, and hit a telephone pole.'' The doctor said. ``Is that God's work?'' I said, ``No, I didn't think so.'' Then the doctor said, ``That's good. If it's God's work, insurance won't cover it.'' At that moment, I thought. 'This doctor is an interesting guy.'



・EvoVLM-JP
Model merging using evolutionary algorithms can also combine generative AI models with different architectures that were created for different purposes. EvoVLM-JP is a model created by merging the English vision-language model 'LLaVa-1.6-Mistral-7B' and the Japanese language model 'Shisa Gamma 7B v1', and it can answer questions about images in Japanese.

An example of EvoVLM-JP's response is below. Given a photo of a carp streamer and the question 'What is the fish-shaped decoration shown in this image called in Japan?', it correctly answered that the decoration is called 'Koinobori'.



Furthermore, in the example below, the model describes the color of a traffic light as 'blue', in line with Japanese usage, where the green signal is conventionally called 'blue'.



You can try out EvoVLM-JP's response using the demo app linked below.

EvoVLM JP - a Hugging Face Space by SakanaAI
https://huggingface.co/spaces/SakanaAI/EvoVLM-JP



When I actually uploaded a photo of ramen and asked 'What is in this photo?', it correctly answered, 'Ramen is in this photo.'



Details of EvoLLM-JP and EvoVLM-JP are available at the links below.

GitHub - SakanaAI/evolutionary-model-merge: Official repository of Evolutionary Optimization of Model Merging Recipes

https://github.com/SakanaAI/evolutionary-model-merge/



・EvoSDXL-JP
EvoSDXL-JP is an image generation model created by model merging using an evolutionary algorithm. EvoSDXL-JP can generate images like the one below in just 4 inference steps. Details of EvoSDXL-JP are scheduled to be released soon.



In addition, Sakana AI is supported by the Generative AI Accelerator Challenge (GENIAC), a project to strengthen Japan's generative AI development capabilities, and plans to accelerate its research and development using the large-scale GPU supercomputer provided by GENIAC.

in Software,   Science, Posted by log1o_hf