'Animated AI' explains convolutional neural network processing with animation



A site called 'Animated AI' has been published that uses animation to explain convolutional neural networks (CNNs), which are widely used in machine learning. Operations that are hard to follow from text alone are explained visually in an easy-to-understand way.

Animated A.I.

https://animatedai.github.io/



A convolutional neural network applies filters to input data such as images, weighting the values so that features of the data, such as shapes, become easier to recognize. Details are explained in the article below.

An easy-to-understand explanation of convolutional neural networks (CNN) from basics to implementation – Experiential learning blog by zero to one

https://zero2one.jp/learningblog/cnn-for-beginners/

Animated AI presents animated GIFs on topics such as the basic operation of convolution, 'padding', which compensates for the way convolution shrinks the output data, and 'stride', the interval at which the filter moves. There are also links to YouTube videos that explain each topic.

For example, below is an animation showing the basic operation.



Below is a YouTube video explaining the basic operations.

Fundamental Algorithm of Convolution in Neural Networks - YouTube


The cube at the back is the input data. The data is divided into a grid, and the features of each point (pixel) are represented as numbers.



The cube in front is the filter. It takes a portion of the input data (for example, a 3 x 3 region) and weights it by multiplying each input value by the corresponding filter value; the products are then added together to give one output value. This weighting makes it easier to pick out features from the large amount of per-pixel information, such as color and shape.
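The weighting step for a single filter position can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input has a single channel (the animation uses a multi-channel cube), and the names `image`, `kernel`, and `apply_filter_at` as well as all the numbers are made up for the example.

```python
# Hypothetical 5x5 single-channel input; each number stands for one pixel value.
image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 3],
    [0, 2, 1, 0, 1],
]

# Hypothetical 3x3 filter. The weights are illustrative, not learned values.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]


def apply_filter_at(image, kernel, top, left):
    """Multiply the window whose top-left corner is (top, left) by the
    filter, element by element, and return the sum of the products."""
    total = 0
    for i, kernel_row in enumerate(kernel):
        for j, weight in enumerate(kernel_row):
            total += image[top + i][left + j] * weight
    return total


# One filter position produces one output number.
print(apply_filter_at(image, kernel, 0, 0))  # -4
```

The 3 x 3 window in the top-left corner yields a single number, which becomes one cell of the output data.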




The filter is applied to one region, then shifted to the next position and applied again, and this is repeated until every region has been processed. The distance the filter moves each time is the stride.
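Sliding the filter across the whole input might look like the following sketch. It assumes a single-channel input and no padding, and `convolve2d` is a hypothetical helper written for this example rather than a function from any library.

```python
def convolve2d(image, kernel, stride=1):
    """Slide the filter over every position it fits in, moving `stride`
    squares at a time, and collect the weighted sum for each position."""
    k_h, k_w = len(kernel), len(kernel[0])
    output = []
    for top in range(0, len(image) - k_h + 1, stride):
        row = []
        for left in range(0, len(image[0]) - k_w + 1, stride):
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 input and 2x2 filter, chosen so the sums are easy to check.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 1],
    [1, 1],
]

print(convolve2d(image, kernel, stride=1))  # [[14, 18, 22], [30, 34, 38], [46, 50, 54]]
print(convolve2d(image, kernel, stride=2))  # [[14, 22], [46, 54]]
```

With stride 1 the 2 x 2 filter fits in nine positions, while with stride 2 it only visits four, so the output is smaller.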




Once one filter has processed the whole input, the same processing is carried out with the next filter, and so on for every filter.
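Repeating the process with several filters can be sketched as below. The layout is an assumption made for this example: the input is stored as `[channel][row][column]`, each filter spans every input channel, and each filter produces one channel of output. `conv_layer` and the sample values are hypothetical, and stride is fixed at 1 with no padding.

```python
def conv_layer(inputs, filters):
    """inputs:  [channel][row][col]
    filters: [filter][channel][row][col]
    Returns one output channel per filter (stride 1, no padding)."""
    in_h, in_w = len(inputs[0]), len(inputs[0][0])
    outputs = []
    for flt in filters:  # one pass over the whole input per filter
        k_h, k_w = len(flt[0]), len(flt[0][0])
        channel_out = []
        for top in range(in_h - k_h + 1):
            row = []
            for left in range(in_w - k_w + 1):
                total = 0
                # The filter covers every input channel at this position.
                for c, kernel in enumerate(flt):
                    for i in range(k_h):
                        for j in range(k_w):
                            total += inputs[c][top + i][left + j] * kernel[i][j]
                row.append(total)
            channel_out.append(row)
        outputs.append(channel_out)
    return outputs


# Hypothetical input with 2 channels of 3x3 values.
inputs = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
# Two hypothetical filters, each with one 2x2 kernel per input channel.
filters = [
    [[[1, 0], [0, 1]], [[1, 1], [1, 1]]],
    [[[0, 1], [1, 0]], [[0, 0], [0, 0]]],
]

result = conv_layer(inputs, filters)
print(len(result))  # 2 output channels, one per filter
print(result)       # [[[8, 10], [14, 16]], [[6, 8], [12, 14]]]
```

The number of output channels equals the number of filters, regardless of how many channels the input has.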



Convolution leaves the output data smaller than the original data. To prevent this, the size is adjusted by adding pixels with a value such as '0' around the original data. This technique is called 'padding'.
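The effect of padding on the output size can be checked with a small sketch. It assumes zero padding, a single-channel input, and stride 1; `zero_pad` and `convolve2d` are hypothetical helpers defined only for this example.

```python
def zero_pad(image, pad):
    """Surround the image with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def convolve2d(image, kernel):
    """Plain stride-1 convolution without padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    return [
        [
            sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            for left in range(len(image[0]) - k_w + 1)
        ]
        for top in range(len(image) - k_h + 1)
    ]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

without_padding = convolve2d(image, kernel)
with_padding = convolve2d(zero_pad(image, 1), kernel)

print(without_padding)  # [[45]]
print(with_padding)     # [[12, 21, 16], [27, 45, 33], [24, 39, 28]]
```

Without padding, the 3 x 3 input shrinks to a single value. With one ring of zeros, the output keeps the 3 x 3 size of the original data.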



With stride 1 the filtered region shifts by 1 square at a time, and with stride 2 it shifts by 2 squares. The video below shows this in animation.
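How stride and padding together determine the output size along one dimension can be expressed with the commonly used formula floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, s the stride, and p the padding on each side. The function name `output_size` below is hypothetical, and the sketch assumes the same padding on both sides.

```python
def output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of filter positions along one dimension:
    floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


print(output_size(5, 3, stride=1))             # 3
print(output_size(5, 3, stride=2))             # 2
print(output_size(5, 3, stride=1, padding=1))  # 5
print(output_size(6, 3, stride=2, padding=1))  # 3
```

A 3 x 3 filter on a 5 x 5 input gives a 3 x 3 output at stride 1 and a 2 x 2 output at stride 2, while one ring of padding at stride 1 keeps the output at 5 x 5.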



In addition, methods called 'depthwise separable convolution' and 'pixel shuffle' are also explained.
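As a rough illustration of pixel shuffle, the sketch below rearranges groups of r x r channels into a single channel that is r times larger in height and width. The channel ordering is an assumption that follows the convention used by PyTorch's `PixelShuffle` layer, and the function `pixel_shuffle` here is a hypothetical pure-Python version, not code from Animated AI.

```python
def pixel_shuffle(channels, r):
    """channels: [channel][row][col] with a channel count divisible by r*r.
    Returns [channel // (r*r)][row * r][col * r]."""
    in_c = len(channels)
    if in_c % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    h, w = len(channels[0]), len(channels[0][0])
    out_c = in_c // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(out_c)]
    for c in range(out_c):
        for y in range(h):
            for x in range(w):
                for i in range(r):
                    for j in range(r):
                        # Channel c*r*r + i*r + j supplies the pixel at
                        # offset (i, j) inside the r x r output block.
                        out[c][y * r + i][x * r + j] = channels[c * r * r + i * r + j][y][x]
    return out


# Four hypothetical 1x2 channels become one 2x4 channel when r = 2.
channels = [[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]]
print(pixel_shuffle(channels, 2))  # [[[1, 2, 5, 6], [3, 4, 7, 8]]]
```

No values are computed or discarded in this step. The data is only rearranged, trading channels for spatial resolution.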

in Review, Software, Video, Posted by log1p_kr