"DQN" and three kinds of variations of reinforcement learning algorithm published by Artificial intelligence research organization "OpenAI"



Elon Musk, known as the founder of Tesla and SpaceX, was also involved in establishing the nonprofit artificial intelligence (AI) research organization "OpenAI". OpenAI has now released Deep Q-Learning (DQN) and three variations as part of "OpenAI Baselines", a set of high-quality implementations of reinforcement learning algorithms (RL algorithms).

OpenAI Baselines: DQN
https://blog.openai.com/openai-baselines-dqn/



The abbreviation "DQN" is also used for "Deep Q-Network", an artificial intelligence program developed by Google subsidiary DeepMind, but OpenAI uses it as an abbreviation for "Deep Q-Learning".

Deep Q-Learning combines "Q-Learning", a well-known machine learning method, with deep neural networks. It is used for reinforcement learning in complex, high-dimensional environments such as video games and robotics.
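To make the Q-Learning part concrete, here is a minimal sketch of the classic tabular update rule, assuming a small discrete environment. The function name, the learning rate `alpha`, and the discount factor `gamma` are illustrative choices and are not taken from OpenAI Baselines. Deep Q-Learning replaces the table with a neural network that estimates the same values.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Bootstrapped target: reward plus the discounted best value of the next state.
    best_next = 0.0 if done else np.max(q_table[next_state])
    td_target = reward + gamma * best_next
    td_error = td_target - q_table[state, action]
    # Move the current estimate a small step toward the target.
    q_table[state, action] += alpha * td_error
    return td_error

# Toy example: 3 states, 2 actions, all estimates start at zero.
q = np.zeros((3, 2))
q_learning_update(q, state=0, action=1, reward=1.0, next_state=2, done=False)
```

After this single update, only the entry for state 0 and action 1 has changed. Repeating the update over many transitions is what gradually propagates reward information through the table.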

Reinforcement learning results are tricky to reproduce: performance is very noisy, and a slight bug in an algorithm can produce large differences. OpenAI says it is publishing these algorithms as efficient, best-practice implementations so that the AI research community has excellent baselines to build on and can raise the level of research. The implementations use Python 3 and TensorFlow, Google's open source library.

In addition to Deep Q-Learning, three variations are available: "Double Q Learning", which fixes the tendency of existing Deep Q-Learning to sometimes overestimate the value of particular actions, "Prioritized Replay", and "Dueling DQN". All four are published on GitHub.
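The overestimation issue and the Double Q Learning fix can be sketched with NumPy as follows. This is an illustrative example with made-up Q-values, not code from the Baselines repository; the function names and the "online" and "target" network outputs are assumptions made for the sketch.

```python
import numpy as np

def dqn_target(reward, gamma, q_target_next, done):
    # Standard target: the target network both picks and scores the next action,
    # so noisy over-estimates are selected by the max and passed on.
    return reward + gamma * (1.0 - done) * np.max(q_target_next, axis=1)

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    # Double Q-learning: the online network picks the action,
    # and the target network supplies the value of that action.
    best_actions = np.argmax(q_online_next, axis=1)
    chosen = q_target_next[np.arange(len(best_actions)), best_actions]
    return reward + gamma * (1.0 - done) * chosen

# Hypothetical batch of two transitions with two possible actions each.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # the second transition ends the episode
q_online_next = np.array([[0.2, 0.5], [0.3, 0.1]])
q_target_next = np.array([[0.9, 0.4], [0.6, 0.7]])

standard = dqn_target(reward, 0.9, q_target_next, done)
double = double_dqn_target(reward, 0.9, q_online_next, q_target_next, done)
```

Because the Double Q Learning target uses the value of one specific action rather than the maximum over all actions, it can never exceed the standard target computed from the same target-network values.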

GitHub - openai/baselines: OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
https://github.com/openai/baselines


OpenAI also expects to release more algorithms in the future.

in Note, Posted by logc_nt