NVIDIA announces technology that lets an AI reproduce a task just by observing a human perform it



NVIDIA has announced a technology that lets an AI work out an execution plan on its own, simply by observing a human action such as "moving an object" and comparing the positional relationships of the objects before and after the action.

[1805.07054] Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations
https://arxiv.org/abs/1805.07054

New AI Technique Helps Robots Work Alongside Humans - NVIDIA Developer News Center
https://news.developer.nvidia.com/new-ai-technique-helps-robots-work-alongside-humans/

The following video gives an easy-to-understand explanation of the new AI technology released by NVIDIA.

Research at NVIDIA: Researchers Help Robots Work Alongside Humans - YouTube


"Blue", "green", "red", and "yellow" blocks and a "red toy car" are arranged on a desk ...


A person stacks the blocks from the bottom up in the order "red", "yellow", "green", "blue".


After this, the blocks are laid out again, in positions different from the first arrangement.


The AI, which had been observing the scene, stacked the blocks in the order "red", "yellow", "green", "blue", just as the person had done earlier.


This technology is implemented with the four neural networks shown in red.


The first is the "object detection network". It perceives the pose of each object and predicts the positions of the object's vertices in space.


The second, the "relationship inference network", infers the positional relationships between objects.


The third is the "program generation network", which makes an execution plan for reproducing the positional relationships of the objects ...


The fourth, the "execution network", instructs the robot to carry out the execution plan generated by the AI.


Of these four networks, the object detection network and the relationship inference network output the information they generate in a human-readable form.


In the case of this demonstration ...


The AI recognizes five objects: "red block", "yellow block", "green block", "blue block", and "red toy car". The AI used for this demonstration appears to have learned two positional relationships, "above" and "left of".
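As a rough illustration of what a human-readable relationship output could look like, here is a minimal Python sketch. It is not NVIDIA's method: the article describes a neural network for this step, whereas this sketch substitutes a hand-written geometric rule. The `DetectedObject` type, the coordinate convention, and the `BLOCK_SIZE` and `ALIGN_TOL` thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class DetectedObject:
    """Hypothetical detector output: an object name and its centre position."""
    name: str
    x: float  # left-right position on the table in metres (larger = further right)
    y: float  # depth position on the table in metres
    z: float  # height of the object's centre above the table in metres


# Assumed values for 5 cm blocks; not taken from NVIDIA's work.
BLOCK_SIZE = 0.05
ALIGN_TOL = 0.02


def infer_relations(objects):
    """Return human-readable (subject, relation, reference) triples."""
    relations = []
    for a, b in permutations(objects, 2):
        dx, dy, dz = a.x - b.x, a.y - b.y, a.z - b.z
        aligned = abs(dx) < ALIGN_TOL and abs(dy) < ALIGN_TOL
        if aligned and abs(dz - BLOCK_SIZE) < ALIGN_TOL:
            # a sits directly on top of b
            relations.append((a.name, "above", b.name))
        elif (abs(dz) < ALIGN_TOL and abs(dy) < ALIGN_TOL
              and -2 * BLOCK_SIZE < dx < -ALIGN_TOL):
            # a is at the same height, immediately to the left of b
            relations.append((a.name, "left of", b.name))
    return relations


scene = [
    DetectedObject("red block", 0.00, 0.0, 0.025),
    DetectedObject("blue block", 0.00, 0.0, 0.075),
    DetectedObject("green block", 0.30, 0.0, 0.025),
    DetectedObject("yellow block", 0.30, 0.0, 0.075),
]
for subject, relation, reference in infer_relations(scene):
    print(f"{subject} {relation} {reference}")
```

Because the output is a list of plain triples such as ("blue block", "above", "red block"), a person can read what the system believes about the scene before any robot motion takes place.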


Next, the AI is shown a "yellow block above a green block" and "a blue block above a red block" ...


When the blocks are then laid out separately, the AI plans to "place blue on red" and "put yellow on green", and moves the blocks.
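The step from observed relationships to an execution plan can be sketched in a few lines of Python. This is only an illustration under assumptions: NVIDIA uses a neural network to generate the program, while the hypothetical `generate_plan` function below orders the "above" relationships with a simple dependency rule, so that a block is only placed once the block beneath it is in position.

```python
def generate_plan(relations):
    """Order (top, "above", bottom) triples into (top, bottom) placement steps."""
    above = [(top, bottom) for top, rel, bottom in relations if rel == "above"]
    tops = {top for top, _ in above}
    # Objects that are never on top of anything rest on the table already.
    placed = {bottom for _, bottom in above} - tops
    pending = list(above)
    plan = []
    while pending:
        # A step is ready once the object it rests on is in position.
        ready = [(top, bottom) for top, bottom in pending if bottom in placed]
        if not ready:
            raise ValueError("relations contain a cycle; no valid stacking order")
        for top, bottom in ready:
            plan.append((top, bottom))
            placed.add(top)
            pending.remove((top, bottom))
    return plan


# The four-block tower from the video, given in deliberately shuffled order.
tower = [
    ("blue block", "above", "green block"),
    ("green block", "above", "yellow block"),
    ("yellow block", "above", "red block"),
]
for top, bottom in generate_plan(tower):
    print(f"place {top} on {bottom}")
```

For the tower, the sketch emits the steps bottom-up (yellow on red, then green on yellow, then blue on green) regardless of the order in which the relationships were observed.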


At this point, the positional relationships estimated by the AI contain some error, so a step such as "place a yellow block on a green block" may fail during execution.


An ordinary program would stop here, but this AI checks the state again ...


It can redo the failed step and recover from the error by itself.
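The check-and-retry behavior described above can be sketched as a small closed loop in Python. Everything here is hypothetical: `execute_with_recheck`, the `max_attempts` limit, and the `SimulatedTable` class (a toy stand-in for the robot and camera with scripted failures) are assumptions for illustration, not NVIDIA's execution network.

```python
def execute_with_recheck(plan, attempt_step, observe_relations, max_attempts=3):
    """Run each (top, bottom) step, re-observing the scene and retrying on failure."""
    log = []
    for top, bottom in plan:
        for attempt in range(1, max_attempts + 1):
            attempt_step(top, bottom)
            # Check the state again instead of assuming the step worked.
            if (top, "above", bottom) in observe_relations():
                log.append((top, bottom, attempt))
                break
        else:
            raise RuntimeError(f"could not place {top} on {bottom}")
    return log


class SimulatedTable:
    """Toy world in which chosen steps fail a fixed number of times."""

    def __init__(self, failures_before_success):
        self.remaining_failures = dict(failures_before_success)
        self.relations = set()

    def attempt_step(self, top, bottom):
        key = (top, bottom)
        if self.remaining_failures.get(key, 0) > 0:
            self.remaining_failures[key] -= 1  # the block slips; scene unchanged
            return
        self.relations.add((top, "above", bottom))

    def observe_relations(self):
        return set(self.relations)


table = SimulatedTable({("yellow block", "green block"): 1})
plan = [("blue block", "red block"), ("yellow block", "green block")]
log = execute_with_recheck(plan, table.attempt_step, table.observe_relations)
for top, bottom, attempts in log:
    print(f"placed {top} on {bottom} after {attempts} attempt(s)")
```

In this run the first attempt to put the yellow block on the green block is scripted to fail, so the loop observes that the relationship is missing and repeats the step; a script without the re-observation would have carried on as if the placement had succeeded.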


AI is usually trained with "supervised learning", which requires a large amount of labeled training data. NVIDIA instead adopts "unsupervised learning", in which the AI learns from image data. According to NVIDIA, this has the advantage of reducing the labor needed to prepare labeled data.

According to NVIDIA, no real images of the "blocks" or the "toy car" used in this demonstration were used to train the AI. To learn the "red toy car", the AI was actually trained on the one shown at the upper left of the following image. NVIDIA's demonstration therefore shows that an AI can recognize actual objects even when it is trained with a method that uses almost no real-world information.

in Software, Video, Posted by darkhorse_log