Google CEO Sundar Pichai and DeepMind founder Demis Hassabis talk about "digital agents", "search engines", "integration with Chrome", and more, with a focus on the multimodal AI "Gemini"



On December 6, 2023, Google DeepMind released the multimodal AI "Gemini". Mr. Casey Newton of the IT-related newsletter "Platformer", who had been given the Gemini news in advance, interviewed Google CEO Sundar Pichai and Google DeepMind's Mr. Demis Hassabis.

Google unveils Gemini - by Casey Newton - Platformer
https://www.platformer.news/p/google-unveils-gemini

Mr. Casey Newton (hereinafter referred to as CN):
Today you shared various industry benchmarks that show Gemini's progress. But I'm curious about your own, personal model testing. Have you noticed anything that makes you feel like you've made a step forward?

Mr. Demis Hassabis (hereinafter referred to as Mr. Hassabis):
With the new Bard, you'll find that the overall quality is much improved over the previous model. I'm particularly interested in using it as a scientific assistant. It has been very useful for actually analyzing and interpreting scientific papers and their graphs, adding tables to graphs, extending graphs, etc., so I would like to further strengthen it.



CEO Sundar Pichai (hereinafter referred to as Pichai CEO):
Multimodality is very exciting. As we work to incorporate multimodality into our products and publish them thoughtfully, I think there will be a lot of new synapses coming in.

What's exciting to me is that this is just 1.0 for us. We have a strong roadmap for innovation into 2024. And what Hassabis and his team are really good at is constant iteration and developing new versions.



CN:
When I asked Eli Collins, DeepMind's vice president of products, if Gemini had shown any new capabilities, he said, 'Stay tuned.' Do you think Gemini will have capabilities beyond traditional large-scale language models (LLMs)? Or do you think it's more evolutionary?

Mr. Hassabis:
I think there will be some new capabilities coming. This is part of the purpose of testing Gemini Ultra. We're in something of a beta phase, partly to check safety and responsibility, but also to see what else we can fine-tune.

CN:
Your blog post says Gemini is good at reasoning. If so, how good is Gemini at planning? Can I imagine creating an agent that makes reservations using Gemini?

Mr. Hassabis:
That's spot on. This is something that we have emphasized since the early days of DeepMind. We are experts in these kinds of agent-based and planning systems, and we are putting a lot of effort into this field.

Multimodality is important and fundamental to building agents. You cannot take any useful action in the world unless you analyze your environment multimodally.

Pichai CEO:
But innovation is about to happen.

CN:
You say Gemini is coming in 2024. How do you think that will change the search experience?

Pichai CEO:
We're already experimenting with it in our Search Generative Experience and making improvements across the board at the same time. We consider Gemini to be foundational and will apply it across all our products. The same goes for search.

One of the things that search is strongly pushing for is multimodality in general. But until now, making search multimodal has required considerable effort. I think search is an area where Gemini will drive innovation because it natively provides multimodality as a foundation model.

CN:
Do you think Gemini in search will increase the number of times people get the information they need from a results page without visiting a website in the medium term?

Pichai CEO:
Fundamentally, our vision is that people use search to experience the richness and diversity of the web and content ecosystem. So even as we extend what we can do with the Search Generative Experience, we're designing our product so that people can go out and explore. I think that's what users want. I think that's the fundamental value proposition of search, so it's part of our goal as we evolve the product.

CN:
There are also reports that Gemini is coming to Chrome. What can you do with Gemini in your web browser?

Pichai CEO:
It can look at what's on the web page and answer your questions or help you with tasks related to it. You might look at something you want to understand, such as a diagram on a web page, and say, "Give me a quick summary of this." All of that becomes possible. Again, the idea is that Gemini becomes your assistant while browsing the web, helping you with your actions. These are all possibilities.

CN:
I would like to understand the current state of technology. I can imagine that most of 2024 will be spent improving Gemini 1.0. But when moving forward with Gemini 2.0 training, do you feel that it is simply a matter of throwing more data and computing power into the technology you have already developed? Or do we need fundamental research breakthroughs?

Mr. Hassabis:
That's a good question. We intend to push both frontiers. We are pursuing a lot of research into important capabilities that current systems do not have and that would be needed if we were to aim for an AGI-level system, and we are working hard on all of them.

On top of these new capabilities, a great deal of headroom remains in scaling, architectural improvements, and more advanced techniques. In fact, there are many promising areas of research.



Pichai CEO:
To me, it feels like we are in the very early stages. There is a clear prospect that Gemini 2.0 will be better. If you look at all the work that Google DeepMind is doing, there seem to be 10 to 15 areas, but right now we're seeing rapid progress in just one of them, right? Innovations will emerge from the other areas as well.

CN:
Apparently Gemini can even win coding competitions. Do you think that in a year's time it will be so good that you won't need to hire engineers?

Pichai CEO:
I think Gemini will make programmers much more productive and take some of the heavy lifting out of their jobs over time. I think programmers will have sophisticated tools and more people will become programmers. It should not be underestimated. The hurdles will change and access to the field will expand.



CN:
Mr. Pichai, when we spoke in early 2023, you said you wouldn't mind if the pace of development in the AI field slowed down a bit. How do you feel about the current pace of development?

Pichai CEO:
I have two points of view. I'm very optimistic about the possibilities. For example, if I take a step back and think that a breakthrough here could help make progress against cancer easier, I would want the research to move forward as quickly as possible. Right? But as we move toward higher-performance models, I think we need time to make sure we have safety measures in place.

I think the current pace is exciting. But there will be moments when you feel like you need to take a breather and catch up. I think it goes hand in hand.

Mr. Hassabis:
I think so too. This whole field is a bit of a rocket ride. I've been working in this field for 20, 30 years, and it's great to see everything fall into place. AI-powered technology will truly help cure diseases. It will also be useful in discovering new materials and in tackling climate change. I think there are almost limitless ways in which AI can be applied to help society. We're now really moving beyond games and what we used to do and into things that are actually practical and useful in the real world.

But at the same time, I've always believed that this would be one of the most transformative technologies humans have ever invented. I think more people are coming around to that idea now. So we need to be thoughtful, responsible and as far-sighted as possible.

in Software, Posted by log1i_yk