Inflection announces 'Inflection-2.5', the base model for its personal AI 'Pi', which nearly matches GPT-4 with less than half the training compute



'Pi', the generative AI developed by the AI company Inflection, is a 'personal AI' tailored to each user. Inflection has announced 'Inflection-2.5', a base model that dramatically improves Pi's capabilities.

Inflection-2.5: meet the world's best personal AI

https://inflection.ai/inflection-2-5

Pi is an AI that you can talk to through Android and iOS smartphones, web browsers, and other platforms, and it also supports Japanese.



According to Inflection, Pi has 1 million daily active users and 4 million monthly active users. In addition, the average conversation lasts 33 minutes, one in 10 users talks for more than an hour, and 60% of users return to Pi the following week, figures that point to strong engagement.



Underpinning Pi's popularity is Inflection-2.5, the base model that powers it. Inflection-2.5 was trained with only 40% of the compute used for GPT-4 (its predecessor, Inflection-1, used only about 4%), yet it achieved scores comparable to GPT-4 on MMLU (Massive Multitask Language Understanding), which measures multitask performance, and on Diamond, the highest-quality subset of GPQA (A Graduate-Level Google-Proof Q&A Benchmark), which requires graduate-level knowledge.



Inflection-2.5 has also shown strength in STEM fields, as demonstrated on Hungarian Math, which is used to gauge an AI's mathematical ability, and on the Physics GRE, a standardized test used in North America.



Inflection-2.5 has achieved excellent results not only in science and mathematics but also in coding, with scores on the coding benchmarks MBPP+ and HumanEval+ improving significantly over Inflection-1.



The following bar graph compares these benchmark scores for Inflection-1 (yellow-green), Inflection-2.5 (green), and GPT-4 (gray).



Inflection said, 'In short, Inflection-2.5 maintains Pi's unique, friendly personality and exceptional safety standards, while making it an even more useful model overall.'

in Software, Posted by log1l_ks