Reinforcement learning (RL) is an area of machine learning inspired by behavioral psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Reinforcement learning differs from standard supervised learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. Instead, the focus is on on-line performance, which involves finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). As with deep learning, using a neural network enables reinforcement learning to acquire massively parallel processing that humans could hardly design by hand, and potentially to surpass what humans design. Unlike supervised learning, reinforcement learning makes autonomous learning possible. It can therefore keep interference from human design to a minimum and realize very flexible, purposive learning over a huge number of degrees of freedom. For this reason, it is seen as opening a path toward artificial general intelligence (AGI), or strong AI.

Deep, or end-to-end, reinforcement learning extends reinforcement learning from learning only actions to learning the entire process from sensors to motors. Therefore, not only actions but also various functions, including recognition and memory, are expected to emerge. This matters especially for higher functions: they connect directly to neither sensors nor motors, so even deciding what their inputs and outputs should be is very difficult.
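
The balance between exploration and exploitation described above can be illustrated with a minimal sketch. The Python example below applies an epsilon-greedy strategy to a multi-armed bandit, where the agent receives only a scalar reward and is never told which action was correct. The bandit setting, the function name `run_epsilon_greedy`, the reward means, and all parameter values are illustrative assumptions rather than anything prescribed by the text, and epsilon-greedy is only one of several ways to strike this balance.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run epsilon-greedy on a Gaussian multi-armed bandit (illustrative only).

    true_means: hypothetical expected reward of each arm, unknown to the agent.
    epsilon:    probability of exploring a random arm at each step.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm has been pulled
    estimates = [0.0] * n_arms   # running mean of observed reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random arm, regardless of current estimates.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the highest estimated reward.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns only a noisy scalar reward,
        # never the identity of the best arm.
        reward = rng.gauss(true_means[arm], 1.0)

        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 0.9])
    print("estimated means:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
    print("average reward:", round(total / sum(counts), 3))
```

With `epsilon=0` the agent never explores and can lock onto whichever arm happened to look good first. With `epsilon=1` it explores constantly and never uses what it has learned. Intermediate values trade off the two, which is the balance that on-line performance depends on.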