Oct 04, 2024 17:00:00

Google releases Gemini 1.5 Flash-8B, which is smaller and faster than previous models, with a 50% lower price and reduced latency

This article, originally posted in Japanese on 17:00 Oct 04, 2024, may contains some machine-translated parts.
If you would like to suggest a corrected translation, please click here.

Google announced the high-speed, high-performance, lightweight AI model 'Gemini 1.5 Flash' in May 2024. The product version of this even lighter and more affordable version 'Gemini 1.5 Flash-8B' was released on October 3, 2024.

Gemini 1.5 Flash-8B is now production ready - Google Developers Blog

https://developers.googleblog.com/en/gemini-15-flash-8b-is-now-generally-available-for-use/

Gemini 1.5 Flash-8B is now production ready
https://simonwillison.net/2024/Oct/3/gemini-15-flash-8b/

The Gemini 1.5 Flash, an AI model announced by Google, boasts performance comparable to the Gemini 1.5 Pro in benchmark tests, but is significantly cheaper to use than the Gemini 1.5 Pro.

Google announces 'Gemini Flash', a high-speed, high-performance lightweight AI model, with performance equivalent to that of Gemini Pro at one-tenth the price - GIGAZINE

Over the next few months, Google improved Gemini 1.5 Flash and developed a smaller, faster version called Gemini 1.5 Flash-8B, which was released as an experimental version in September and a production version in October.

Although the Gemini 1.5 Flash-8B is a lightweight model, it is said to perform on par with the Gemini 1.5 Flash in many benchmarks.

Gemini 1.5 Flash-8B is the lowest priced Gemini series, and Google claims it is '50% cheaper than Gemini 1.5 Flash.' The specific costs are '$0.0375 (about 5.5 yen) per 1 million tokens of prompt input,' '$0.15 (about 22 yen) per 1 million tokens of prompt output,' and '$0.01 (about 1.5 yen) per 1 million tokens of prompt cache,' and the amount doubles for any prompts exceeding 12,800 tokens.

For developers on paid plans , billing will begin on October 14. Google said, 'This new pricing, along with the work we've done to reduce costs for developers with Gemini 1.5 Flash and Gemini 1.5 Pro, underscores our commitment to empower developers to freely build products and services that move the world forward.'

Gemini 1.5 Flash-8B also boasts lower latency for small prompts than Gemini 1.5 Flash, and the rate limit has been doubled.

Developers can access Gemini 1.5 Flash-8B via Google AI Studio and the Gemini API.

Untitled prompt | Google AI Studio
https://aistudio.google.com/app/prompts/new_chat?model=gemini-1.5-flash-8b-002

Gemini models | Gemini API | Google AI for Developers
https://ai.google.dev/gemini-api/docs/models/gemini

Related Posts:

Oct 04, 2024 17:00:00 in Software, Web Service, Posted by log1h_ik

Archives

Categories: Note; Headline; Review; Coverage; Interview; Tasting; Mobile; Software; Web Service; Web Application; Hardware; Vehicle; Science; Creature; Video; Movie; Manga; Anime; Game; Design; Art; Food; Security; Notice; Pick Up; Column

Search

<	3, 2025					>
Sun	Mon	Tue	Wed	Thu	Fri	Sat
23	24	25	26	27	28	1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31	1	2	3	4	5

<	3, 2025					>
Sun	Mon	Tue	Wed	Thu	Fri	Sat
23	24	25	26	27	28	1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31	1	2	3	4	5

<	3, 2025					>
Sun	Mon	Tue	Wed	Thu	Fri	Sat
23	24	25	26	27	28	1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31	1	2	3	4	5

<	3, 2025					>
Sun	Mon	Tue	Wed	Thu	Fri	Sat
23	24	25	26	27	28	1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31	1	2	3	4	5