'LLM Visualization' is a site that visualizes the structure of large language models in 3D and shows, in an easy-to-follow way, what calculations they perform.



Chat AIs such as ChatGPT perform an enormous number of calculations internally to generate text. 'LLM Visualization' is a site that lets you see in 3D what parameters are stored inside such a model and what calculations it performs.

LLM Visualization

https://bbycroft.net/llm

This is what the site looks like when you access it: an explanation on the left side of the screen and a 3D model on the right. Click 'Continue' in the explanation to proceed.



The walkthrough uses 'nano-gpt', a model with approximately 85,000 parameters, to perform the task of sorting a short sequence made up of three kinds of characters, and traces the calculations inside the large language model step by step. Press the spacebar to advance to the next section.
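As a rough illustration of what that task looks like (the character set A, B, C comes from the article; the sequence length here is an assumption for illustration), a toy version in Python might be:

```python
import random

# Toy illustration of nano-gpt's task: rearrange (sort) a short sequence
# drawn from three characters. The length 6 is an assumption, not a value
# stated in the article.
vocab = ['A', 'B', 'C']
sequence = [random.choice(vocab) for _ in range(6)]
target = sorted(sequence)  # the ordering the model should produce

print(''.join(sequence), '->', ''.join(target))  # e.g. CABBCA -> AABBCC
```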



The relevant parts of the 3D model are highlighted at each stage of the explanation, making it easy to see which part is being discussed.



When you place the cursor on a part of the 3D model, you can check details such as which structure the part belongs to, its number of rows, and its number of columns. For 'Token Embed' in the figure below, the display shows that it belongs to 'Embedding', that its number of rows is 'C' (the number of channels, i.e. the embedding size), and that its number of columns is 'n_vocab' (the vocabulary size). Since nano-gpt is an extremely small model that handles only the three characters 'A', 'B', and 'C', Token Embed needs only three columns.
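To make that shape concrete, here is a minimal NumPy sketch. The vocabulary size of 3 comes from the article; the embedding size C = 48 is purely an assumption for illustration, since the article does not state nano-gpt's channel count.

```python
import numpy as np

n_vocab = 3   # 'A', 'B', 'C' (from the article)
C = 48        # assumed embedding (channel) size, for illustration only

# Token Embed: a C x n_vocab matrix with one learned column per character.
token_embed = np.random.randn(C, n_vocab) * 0.02

# Looking up a token's embedding is just selecting its column.
token_id = {'A': 0, 'B': 1, 'C': 2}['B']
embedding = token_embed[:, token_id]
print(embedding.shape)  # (48,)
```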



You can also see the calculation formula by placing the cursor on a part whose value is computed. 'Input Embed' is calculated by adding 'Token Embed' and 'Position Embed'. Green blocks are values calculated from the input, while blue blocks are parameters whose values are adjusted during training.
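A sketch of that addition, position by position, could look like the following (the sizes are illustrative assumptions, not values taken from the site):

```python
import numpy as np

C, n_vocab, T = 48, 3, 6  # assumed channel, vocabulary, and sequence sizes

token_embed = np.random.randn(C, n_vocab) * 0.02  # blue: trained parameters
pos_embed = np.random.randn(C, T) * 0.02          # blue: trained parameters

tokens = [2, 0, 1, 1, 0, 2]  # input token ids (green: computed from the input)

# Input Embed: for position t holding token tok, add the token's column of
# Token Embed to column t of Position Embed.
input_embed = np.stack(
    [token_embed[:, tok] + pos_embed[:, t] for t, tok in enumerate(tokens)],
    axis=1,
)
print(input_embed.shape)  # (C, T) = (48, 6)
```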



In places where the structure is complicated, the structure a part belongs to is displayed as a hierarchy.



If you want to check the internal structure in more detail, you can just zoom in.



After all these calculations, the model outputs 'A' as its prediction for the next character. By feeding this prediction back into the input, the model can generate predictions one after another.
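That feedback loop is ordinary autoregressive generation. A minimal sketch, assuming a hypothetical model(tokens) function that returns a probability distribution over the next token (this is the idea the site illustrates, not its actual code):

```python
def generate(model, tokens, n_steps):
    """Greedy autoregressive generation; 'model' is a hypothetical callable."""
    for _ in range(n_steps):
        probs = model(tokens)             # forward pass over the sequence so far
        next_token = int(probs.argmax())  # pick the most likely next character
        tokens = tokens + [next_token]    # feed the prediction back into the input
    return tokens
```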



You can also read detailed explanations of each internal structure on the site, so if you are interested, please check it out.



The explanation above used 'nano-gpt', which has approximately 85,000 parameters. Comparing nano-gpt with GPT-2 (small), which has 124 million parameters, gives the figure below.



The size of GPT-3, which has approximately 175 billion parameters, is shown in the figure below. Thanks to the 3D format, you can see at a glance how much the models differ in size.
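Using the parameter counts quoted in the article, the scale difference can be made concrete with a quick calculation:

```python
nano_gpt = 85_000            # approx. parameter counts from the article
gpt2_small = 124_000_000
gpt3 = 175_000_000_000

print(f"GPT-2 small / nano-gpt: {gpt2_small / nano_gpt:,.0f}x")  # ~1,459x
print(f"GPT-3 / nano-gpt:       {gpt3 / nano_gpt:,.0f}x")        # ~2,058,824x
print(f"GPT-3 / GPT-2 small:    {gpt3 / gpt2_small:,.0f}x")      # ~1,411x
```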
