Google Research, Google's technology research division, and Google DeepMind, its AI division, have published a look back at the AI models and research results in AI and computing that they announced in 2023.

2023: A year of groundbreaking advances in AI and computing – Google Research Blog

Google released Bard, a chatbot AI, on March 21, 2023. Bard can handle a variety of tasks, such as generating text and translating between languages.

Then, at its 'Google I/O' product announcement event in May 2023, Google announced PaLM 2, a large language model. According to Google, PaLM 2 combines compute-optimal scaling, an improved dataset, and an improved model architecture to handle advanced reasoning tasks, resulting in significantly better accuracy than the large language models developed up to that point.

Bard and PaLM 2 have been integrated into various Google products and features. The most notable example is the Search Generative Experience, which brings conversational search to Google's search engine by incorporating a large language model. Beyond searching, the Search Generative Experience also lets you create content, such as generating images and text.

In June 2023, 'AudioPaLM', which can understand and generate speech as well as text, was announced. AudioPaLM is a multimodal architecture that combines PaLM 2 with AudioLM, a speech-based language model.

In addition, 'Duet AI' is an AI assistant for Google Workspace and Google Cloud that helps users with writing, image creation, spreadsheet analysis, coding, and more. A preview version was introduced to Google Cloud in August 2023.

In June 2023, the image generation model Imagen Editor was released, which allows users to interactively generate and edit images using natural language.

In November 2023, the AI music generation model 'Lyria' was announced in partnership with YouTube. Lyria is expected to be incorporated into automatic music generation features on YouTube in the future.

In December 2023, Google announced its multimodal AI model 'Gemini.' Gemini comes in three sizes that differ in the number of parameters: Ultra, Pro, and Nano. The smallest model, Nano, is intended to run locally on the Pixel 8 Pro.

Furthermore, Bard has been significantly strengthened with Gemini Pro, and Google announced that 'Bard Advanced', powered by Gemini Ultra, will be released in 2024.

In addition, Google announced the programming AI 'AlphaCode 2', which combines Gemini with 'AlphaCode', announced in 2022. AlphaCode 2 is said to have 1.7 times the problem-solving ability of its predecessor AlphaCode, and it performed better than 85% of participants in competitive programming.

The Transformer, the core technology behind these large language models, is an architecture developed by Google researchers in 2017. It generates text by breaking sentences down into units called tokens and then choosing each following token probabilistically.
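The idea of splitting text into tokens and picking the next token by probability can be illustrated with a small sketch. This is not a Transformer: a real model computes the probabilities with a neural network that looks at all preceding tokens, whereas the hypothetical table `NEXT_TOKEN_PROBS` below is hand-written and depends only on the previous token. The function names `tokenize` and `generate`, the vocabulary, and the probabilities are all assumptions made for illustration.

```python
import random

# Hypothetical hand-written table (an assumption for illustration):
# for each token, the possible next tokens and their probabilities.
# A real language model would compute these values with a neural network.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, "runs": 0.3},
    "dog": {"sleeps": 0.3, "runs": 0.7},
    "sleeps": {"<end>": 1.0},
    "runs": {"<end>": 1.0},
}


def tokenize(sentence):
    """Split a sentence into lowercase word tokens (a toy tokenizer)."""
    return sentence.lower().split()


def generate(seed=0, max_tokens=10):
    """Build a sentence by repeatedly sampling the next token."""
    rng = random.Random(seed)  # seeded so a given seed always gives the same result
    token = "<start>"
    output = []
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS[token]
        # Pick one candidate at random, weighted by its probability.
        token = rng.choices(
            list(candidates), weights=list(candidates.values()), k=1
        )[0]
        if token == "<end>":
            break
        output.append(token)
    return output


if __name__ == "__main__":
    print(tokenize("The cat sleeps"))
    print(" ".join(generate(seed=42)))
```

Running the script with different seeds produces different short sentences from the same table, which is the sense in which the tokens are connected probabilistically.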

However, although Transformer-based large language models are good at producing natural-sounding sentences, their reasoning ability on arithmetic and logic problems is not high, and they often get even elementary-school-level arithmetic problems wrong. Google is therefore researching a method called 'algorithmic reasoning via in-context learning' to give large language models the reasoning ability needed to solve arithmetic problems. This approach reportedly improved accuracy on a junior-high-school-level math benchmark from 25.9% to 61.1%.

Furthermore, as part of its work on multimodal models that, like Gemini, can recognize text, images, and video, Google has developed 'PaLM-E', a language model that can understand both images captured by a camera and instructions given in text.

In July 2023, Google DeepMind announced 'Robotic Transformer 2 (RT-2),' a learning model that can convert images and text into actions.

Furthermore, in June 2023, Google DeepMind announced 'AlphaDev,' an AI that uses deep reinforcement learning to improve various computing algorithms.

In the natural sciences, AlphaFold, developed by Google DeepMind, is an AI that accurately predicts the three-dimensional structure of proteins. In genomics, Google says it collaborated with 119 scientists from 60 institutions to create a new map of the human genome.

Furthermore, in September 2023, 'AlphaMissense' was announced, an AI that helps predict how harmful genetic mutations are and identify the causes of genetic diseases.

In November 2023, Google released details about its neural weather prediction model 'MetNet-3', which can forecast the weather up to 24 hours ahead at 2-minute intervals. Furthermore, 'GraphCast', the weather forecasting AI announced by Google DeepMind, is a model that can produce a 10-day weather forecast in one minute.

Google Research and Google DeepMind said, 'As the power of multimodal models increases, people will be able to make incredible advances, from science to education to entirely new fields of knowledge. As Google's products and research also advance, AI will find many more interesting and creative uses. If pursued boldly and responsibly, we believe AI can be a foundational technology that changes the lives of people everywhere, and that's what excites us.'

