Why it is not easy for computers to understand music through machine learning

Music is a subjective art that expresses something through sounds invisible to the eye, and at the same time an art that combines geometric and mathematical elements and can be constructed with the symbols of a score. Patrick Hutchinson, a composer and AI researcher, analyzes why it is difficult for computers to understand music, even with machine learning evolving at a remarkable pace.

Music and Machine Learning - ai.SensiLab

With recent advances in machine learning, computers have become able to recognize words and images. Words and images share a role as tools for communicating information, and the same is true of music. Mr. Hutchinson says music plays a communicative role within cultural groups or other specific groups. On the other hand, music does not have "meaning" in itself. An individual sound in a piece of music does not carry a meaning of its own the way each word in a language does.

Nonetheless, it is interesting that sounds which have no meaning on their own acquire a "meaning" that did not exist before once they are gathered into "music". Mr. Hutchinson explains this point with the following example.

· Both musical and non-musical content can consist of the same sound signal.
· If you are at the office listening to a sound to tell whether your cell phone is ringing, it is not "music".
· If, when you hear the ringtone, you are listening to the shape of the sound itself, its texture, its rhythm and so on, rather than to "who is calling", it is "music".

According to Mr. Hutchinson, this "music or non-music" distinction is interesting from the standpoint of culture and evolutionary biology, but it is also very important for data scientists. The first reason is that "natural language models that understand language are not effective tools for the analysis and creation (= composition) of music". Mr. Hutchinson says that the function and structure of the data are completely different in natural language and in music.

In the field of natural language processing, the emergence of word2vec, a neural network method that learns word embeddings (distributed representations of words), was a breakthrough, but he says this method does not apply to music, which has no "words". A musical sound has the elements of timbre, pitch and length, but it is almost impossible to attach meaning to the sound itself. For this reason, Mr. Hutchinson says that existing machine learning systems, at least, are better suited to analyzing natural language than music.

Language and music are also said to differ greatly in the background against which they were born and used. Language developed for the purpose of conveying information and communicating within a community, whereas Mr. Hutchinson says "enjoyment" was a major purpose of music. As a result, music has evolved largely without being bound by concepts such as the "syntax" found in language.

At the same time, however, people do speak of "good music" and "bad music", and paradoxically "good music" does seem to have something like a "syntax". Part of it is captured by the following concepts.

· 1: Repetition. Many people like it when the same phrase is repeated many times within a song. Conversely, a song with no memorable repeated phrase, such as a chorus or hook, is difficult for the listener to follow. The fact that people play the same song over and over can also be seen as a sign that people like to hear the same music repeatedly.

· 2: Change. Even if people like repetition, a phrase that is repeated endlessly becomes boring. For that reason, good music also needs a moderate amount of change.
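The balance between repetition and change can be made concrete with a small sketch. The following Python is an illustration only, not a method attributed to Mr. Hutchinson: the function `repetition_ratio`, the window length `n`, and the three example melodies are all hypothetical. It measures what fraction of short pitch patterns in a melody occur more than once.

```python
from collections import Counter


def repetition_ratio(pitches, n=4):
    """Fraction of length-n windows whose pitch pattern occurs more than once.

    Hypothetical illustrative measure: 1.0 means every pattern recurs,
    0.0 means no pattern is ever repeated.
    """
    windows = [tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1)]
    if not windows:
        return 0.0
    counts = Counter(windows)
    repeated = sum(1 for w in windows if counts[w] > 1)
    return repeated / len(windows)


# Made-up melodies written as MIDI pitch numbers (60 = middle C).
looped = [60, 62, 64, 65] * 4        # the same phrase four times
varied = [60, 62, 64, 65, 60, 62, 64, 67,
          69, 67, 65, 64, 60, 62, 64, 65]  # opening phrase returns at the end
wandering = list(range(60, 76))      # a rising line that never repeats

for name, melody in [("looped", looped), ("varied", varied),
                     ("wandering", wandering)]:
    print(name, round(repetition_ratio(melody), 2))
```

Under this toy measure the endlessly looped phrase and the never-repeating line sit at the two extremes, and the melody that returns to its opening phrase falls in between. A number like this says nothing about whether listeners would actually find the melody good.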

Freely available music data (MIDI / MusicXML) is often used when training machine learning systems on music, but Mr. Hutchinson thinks this approach is a mistake. Information is lost when music as sound is converted into a score, and in addition, Mr. Hutchinson says, much of the data on the net was created by music fans who converted it from scores written for beginners.
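One kind of information that disappears in such a conversion is expressive timing. The sketch below is an illustration under stated assumptions, not a description of how any particular MIDI or MusicXML file was produced: the helper names `quantize` and `timing_lost` and the onset times are hypothetical. It snaps performed note onsets to a sixteenth-note grid, which is roughly what happens when a performance is written down as notation.

```python
def quantize(onsets, grid):
    """Snap each onset time (in beats) to the nearest multiple of `grid`."""
    return [round(t / grid) * grid for t in onsets]


def timing_lost(onsets, grid):
    """Largest deviation (in beats) that quantization erases."""
    return max(abs(t - q) for t, q in zip(onsets, quantize(onsets, grid)))


# Made-up onsets of five notes as a human might play them, in beats.
performed = [0.00, 0.27, 0.48, 0.77, 1.02]

# Assumed grid: sixteenth notes, i.e. 0.25 of a beat.
as_scored = quantize(performed, grid=0.25)

print(as_scored)
print(timing_lost(performed, grid=0.25))
```

After quantization every note lands exactly on the grid, so the small pushes and pulls in the performance can no longer be recovered from the score data alone.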

Mr. Hutchinson also points out that learning from audio data instead of score data has problems of its own. On a CD, converting sound into digital data produces 44,100 samples per second. This rate is called the "sampling frequency", and for a CD it is written as "44.1 kHz". Lowering this value makes the data easier for a computer to handle, but the sound quality degrades and information that the music originally contained is lost. Conversely, raising the sampling frequency preserves more of the music's information but increases the workload on the computer.
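The scale of the problem is easy to work out. The following Python sketch assumes standard CD audio (44,100 samples per second, 16-bit samples, two channels); the function names and the three-minute track length are hypothetical choices made for illustration.

```python
CD_SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
CD_BYTES_PER_SAMPLE = 2      # 16-bit samples
CD_CHANNELS = 2              # stereo


def sample_count(duration_s, sample_rate_hz=CD_SAMPLE_RATE_HZ,
                 channels=CD_CHANNELS):
    """Number of individual sample values in a recording."""
    return int(duration_s * sample_rate_hz) * channels


def raw_size_bytes(duration_s, sample_rate_hz=CD_SAMPLE_RATE_HZ,
                   channels=CD_CHANNELS,
                   bytes_per_sample=CD_BYTES_PER_SAMPLE):
    """Uncompressed size of a recording in bytes."""
    return sample_count(duration_s, sample_rate_hz, channels) * bytes_per_sample


track_s = 180  # an assumed three-minute song

full_rate = sample_count(track_s)
half_rate = sample_count(track_s, sample_rate_hz=22_050)

print(full_rate)                 # values a model must process at 44.1 kHz
print(half_rate)                 # half as many at 22.05 kHz
print(raw_size_bytes(track_s))   # uncompressed bytes at CD quality
```

A single three-minute stereo track comes to roughly 16 million sample values, far more than the few thousand note events a score of the same song would contain. Halving the sampling frequency halves that count, at the cost of the audio detail described above.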

WaveNet, developed by Google's DeepMind, and DeepSpeech by Baidu can produce natural artificial speech that is almost indistinguishable from a human voice, but things do not go as well for music. Even though such systems can in fact generate piano sounds, Mr. Hutchinson points out that much remains to be improved when it comes to whether any "musicality" is reproduced in them.

The difficulty of "evaluation" is another reason music and machine learning have not yet been linked successfully. For many people it is not easy to put into words exactly why they like a particular piece of music. Mr. Hutchinson tried to solve this problem using statistics, but he says the data sets used for music analysis and composition were made up almost entirely of music that "I do not think people would want to hear at all". Mr. Hutchinson doubts that training on such data sets can produce good music.

Mr. Hutchinson also cites an even bigger problem: "cultural bias". Cultures around the world often have their own distinctive forms of music. Many people probably sense this intuitively, yet it is interesting that literature and painting do not differ between cultures as much as music does.

Mr. Hutchinson points out that when music is handled with machine learning, the evaluation of music is currently biased toward Western music produced over the past few hundred years.

in Software,   Art, Posted by darkhorse_log