Google and Google DeepMind look back at research results on AI and computing in 2023

Google Research, Google's technology research division, and Google DeepMind, its AI division, have published a look back at the AI models and research results in AI and computing that they announced in 2023.

2023: A year of groundbreaking advances in AI and computing – Google Research Blog

Google released Bard, a chatbot AI, on March 21, 2023. Bard can handle a variety of tasks, such as text generation and language translation.

Google starts releasing ChatGPT's rival 'Bard' to the public, what kind of answer will you get when you actually ask a question? - GIGAZINE

Then, at the Google I/O event in May 2023, Google announced PaLM 2, a large language model. According to Google, PaLM 2 combines compute-optimized scaling, improved datasets, and a better model architecture for advanced reasoning tasks, and is significantly more accurate than the large language models developed up to that point.

Google announces large-scale language model 'PaLM 2', already introduced in 25 Google services - GIGAZINE

Bard and PaLM 2 have been integrated into various Google products and features. The most notable example is the Search Generative Experience, which enables conversational search by incorporating a large language model into Google's search engine. Beyond answering searches, the Search Generative Experience can also create content such as images and text.

'Search Generative Experience' that can generate AI images from Google search bar is now available - GIGAZINE

In June 2023, 'AudioPaLM', a large language model that can understand and generate speech, for example by translating spoken input into both text and audio, was announced. AudioPaLM is a multimodal architecture that integrates PaLM 2 and AudioLM, a speech-based language model.

Google announces 'AudioPaLM', a large-scale language model that can translate voice into "both text and audio" - GIGAZINE

In addition, 'Duet AI' is a group of AI services that assist users with writing, image creation, spreadsheet analysis, coding, and more in Google Workspace and Google Cloud. A preview version was introduced to Google Cloud in August 2023.

Google introduces preview version of 'Duet AI', a group of AI services that can provide coding assistance and data analysis, to Google Cloud - GIGAZINE

In June 2023, the image generation model Imagen Editor was released, which allows users to interactively generate and edit images using natural language. You can find out what Google's image generation AI 'Imagen' is by reading the article below.

'Imagen', an AI system that can automatically generate high-precision images even from unusual text - GIGAZINE

In November 2023, the AI music generation model 'Lyria' was announced in partnership with YouTube. Lyria is expected to be incorporated into YouTube's automatic music generation features in the future.

Google DeepMind announces AI model 'Lyria' specialized in music generation, melody is generated just by humming - GIGAZINE

The multimodal AI model 'Gemini' was announced by Google in December 2023. Gemini comes in three sizes that differ in parameter count, Ultra, Pro, and Nano, and the smallest, Nano, is scheduled to run locally on the Pixel 8 Pro.

Multimodal AI 'Gemini' with performance exceeding GPT-4, which can process text, voice, and images simultaneously and have more natural interactions than humans, will be released - GIGAZINE

Furthermore, Bard has been significantly strengthened by Gemini Pro, and it was announced that 'Bard Advanced' equipped with Gemini Ultra will also be released in 2024.

Bard, which has been significantly enhanced with the AI model 'Gemini Pro', will be available, and 'Bard Advanced' equipped with Gemini Ultra will also be released in 2024 - GIGAZINE

In addition, Google announced the programming AI 'AlphaCode 2', which combines Gemini with 'AlphaCode', announced in 2022. According to Google, AlphaCode 2 has 1.7 times the problem-solving ability of its predecessor AlphaCode and outperformed 85% of participants in competitive programming.

AI 'AlphaCode 2' specialized in programming based on Google's multimodal AI 'Gemini' has arrived, with performance that ranks in the top 15% of competitive programming - GIGAZINE

Transformer, the core technology behind these large language models, is an architecture developed by Google researchers in 2017. It generates text by breaking sentences down into units called tokens and then choosing each following token probabilistically. The article below explains how it works.

How does the machine learning model 'Transformer', which is also used for ChatGPT, generate natural sentences? - GIGAZINE
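
The idea of splitting text into tokens and picking each next token probabilistically can be sketched in a few lines of Python. This is only a toy illustration, not a Transformer: the probability table `NEXT_TOKEN_PROBS`, its values, and the helper names `tokenize` and `generate` are all made up for this example. A real model uses a subword tokenizer and computes the probabilities with a neural network that looks at the whole preceding context, whereas this sketch looks only at the last token.

```python
import random

# Hypothetical next-token probabilities, invented purely for illustration.
# A real Transformer computes such a distribution from the entire context.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, "runs": 0.3},
    "dog": {"sleeps": 0.2, "runs": 0.8},
    "sleeps": {"</s>": 1.0},
    "runs": {"</s>": 1.0},
}


def tokenize(text):
    """Split text into lowercase whitespace tokens (real models use subword units)."""
    return text.lower().split()


def generate(prompt_tokens, rng, max_new_tokens=10):
    """Extend the prompt one token at a time by sampling from the toy table."""
    tokens = list(prompt_tokens)
    current = tokens[-1] if tokens else "<s>"
    for _ in range(max_new_tokens):
        dist = NEXT_TOKEN_PROBS.get(current)
        if dist is None:  # unknown token: the toy table has nothing to offer
            break
        candidates = list(dist)
        weights = [dist[token] for token in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        if current == "</s>":  # end-of-sequence marker
            break
        tokens.append(current)
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same text
    print(" ".join(generate(tokenize("The cat"), rng)))
```

Because each step samples from a distribution rather than always taking the most likely token, different seeds can produce different continuations of the same prompt.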

However, although Transformer-based large language models are good at producing natural sentences, their reasoning ability is limited: they often get arithmetic and logic problems wrong, even at elementary school level. Google is therefore researching a method called 'algorithmic reasoning using in-context learning', which aims to give large language models the reasoning ability needed to solve arithmetic problems. According to Google, this approach improved accuracy on a middle-school-level mathematics benchmark from 25.9% to 61.1%.
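
As a rough illustration of what in-context learning for an algorithm can look like, the sketch below builds a prompt in which each worked example spells out grade-school addition digit by digit, so that a model reading the prompt can imitate the procedure on a new question. The prompt layout and the helper names `addition_steps` and `build_prompt` are assumptions made for this example, not Google's actual prompt format, and the code does not call any model.

```python
def addition_steps(a, b):
    """Spell out addition of two non-negative integers, right to left, with carries."""
    xs, ys = str(a)[::-1], str(b)[::-1]  # reversed so index 0 is the ones digit
    carry, digits, lines = 0, [], []
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        total = x + y + carry
        lines.append(
            f"{x} + {y} + carry {carry} = {total}: "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        lines.append(f"final carry {carry}: write {carry}")
        digits.append(str(carry))
    return lines, "".join(reversed(digits))


def build_prompt(examples, question):
    """Build a few-shot prompt: worked examples first, then the open question."""
    parts = []
    for a, b in examples:
        lines, answer = addition_steps(a, b)
        parts.append(f"Q: {a} + {b}\n" + "\n".join(lines) + f"\nA: {answer}")
    a, b = question
    parts.append(f"Q: {a} + {b}")  # left open for the model to continue
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt([(128, 367), (95, 7)], (4821, 679)))
```

The point of the design is that the examples show every intermediate step rather than only the final answer, which is what lets the model pick up the algorithm from the context alone.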

Furthermore, as one of its multimodal models that, like Gemini, can recognize text, images, and video, Google has developed 'PaLM-E', a language model that can understand both images captured by a camera and instructions given in text.

Google announces 'PaLM-E', a language model for robots that understands vision and text like a human, and can execute complex commands such as "bring me sweets" - GIGAZINE

In July 2023, Google DeepMind announced 'Robotic Transformer 2 (RT-2),' a learning model that can convert images and text into actions.

Google announces robot AI 'RT-2' that can execute complex instructions such as 'move ○○' even in an untrained environment - GIGAZINE

Furthermore, in June 2023, Google DeepMind announced 'AlphaDev,' an AI that uses deep reinforcement learning to improve various computing algorithms.

DeepMind announces AI 'AlphaDev' that uses deep reinforcement learning to improve algorithms, and has already succeeded in speeding up sorting algorithms and hash functions - GIGAZINE

In the field of natural science, AlphaFold, developed by Google DeepMind, is an AI that accurately predicts the three-dimensional structure of proteins. Google also reports that, in collaboration with 119 scientists from 60 institutions, it has created a new map of the human genome.

The latest version of the AI 'AlphaFold' that accurately predicts the three-dimensional structure of proteins has been released, enabling more accurate prediction of molecules than before - GIGAZINE

Furthermore, in September 2023, the AI 'AlphaMissense' was announced, which helps predict the harmfulness of genetic mutations and identify the causes of genetic diseases.

Google DeepMind announces AI 'AlphaMissense', which may help predict which genetic mutations are harmful and identify the cause of genetic diseases - GIGAZINE

In November 2023, Google released details of its neural weather prediction model 'MetNet-3', which forecasts the weather up to 24 hours ahead at 2-minute intervals. In addition, 'GraphCast', the weather forecasting AI announced by Google DeepMind, can produce a 10-day forecast in about one minute.

Google DeepMind announces AI 'GraphCast' that can predict the weather for 10 days in just 1 minute, more accurate than predictions that took several hours with a supercomputer - GIGAZINE

Google Research and Google DeepMind say, 'As the power of multimodal models increases, people will be able to make incredible advances, from science to education to entirely new fields of knowledge. As Google's products and research also advance, AI will find many more interesting and creative uses. If pursued boldly and responsibly, we believe AI can be a foundational technology that changes the lives of people everywhere, and that's what excites us.'

in Software, Posted by log1i_yk