The robot learning lab aims to find a general approach to acquiring motor skills. The algorithms and methods we develop are tested in skill learning tasks such as playing table tennis.
Creating autonomous robots that can learn to assist humans in situations of daily life is a fascinating challenge for machine learning. We focus on the first step: creating robots that can learn to accomplish many different tasks, triggered by environmental context or higher-level instruction, and we aim for a general approach to motor skill learning. We concentrate on (1) domain-appropriate machine learning approaches that allow for better control, imitation of behavior, and self-improvement, as well as (2) new robotics approaches to creating systems better suited to high-speed skill learning.
Starting from theoretically sound robotic control structures for task representation and execution, we replace analytic modules by more flexible learned ones pubLink{PetersKMNK2012}. To this end, we tackle problems such as accurate but compliant execution, learning of elementary behaviors, hierarchical composition of behaviors, and parsing complex demonstrations into elementary behaviors.
Accurate execution of movements ideally requires only low-gain control, so that the robot can accomplish its tasks without harming humans. Following a trajectory with little feedback requires accurate prediction of the needed torques, which often cannot be achieved using classical methods. However, learning such models is hard: the joint-space can never be fully explored, and the learning algorithm has to cope with a data stream in real time. We have developed learning methods for tasks represented in joint-space pubLink{6505} or task-space pubLink{NguyenTuongP2012}.
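As a minimal illustration of such real-time model learning, the sketch below fits an inverse dynamics model $\tau = f(q, \dot{q}, \ddot{q})$ incrementally, one sample at a time. Random Fourier features with stochastic gradient updates stand in for the more sophisticated local and Gaussian-process regression of the actual work; all names and parameter values are illustrative assumptions.

```python
import numpy as np

class OnlineInverseDynamics:
    """Incrementally learn torques tau = f(q, qdot, qddot) from a data stream.

    Illustrative sketch only: random Fourier features plus stochastic gradient
    updates stand in for the local/Gaussian-process regression of the actual work.
    """

    def __init__(self, n_joints, n_features=256, lr=0.3, seed=0):
        rng = np.random.default_rng(seed)
        d = 3 * n_joints                               # state: q, qdot, qddot
        self.W = rng.normal(size=(n_features, d))      # random projections
        self.b = rng.uniform(0.0, 2.0 * np.pi, n_features)
        self.scale = np.sqrt(2.0 / n_features)         # keeps ||phi|| near 1
        self.theta = np.zeros((n_features, n_joints))  # linear weights
        self.lr = lr

    def _features(self, q, qdot, qddot):
        x = np.concatenate([q, qdot, qddot])
        return self.scale * np.cos(self.W @ x + self.b)

    def predict(self, q, qdot, qddot):
        """Predicted feed-forward torques for the given joint state."""
        return self._features(q, qdot, qddot) @ self.theta

    def update(self, q, qdot, qddot, tau):
        """One gradient step on the squared torque prediction error."""
        phi = self._features(q, qdot, qddot)
        err = phi @ self.theta - tau
        self.theta -= self.lr * np.outer(phi, err)
```

Because each `update` touches only one sample, such a learner can run inside the control loop, refining its torque predictions as the robot moves.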
\emph{Executing} a task is important, but often the task itself needs to be learned. We focus on learning elementary tasks or movement primitives, which are parameterized nonlinear differential equations with desired attractor properties. We mimic how children learn new motor tasks, using imitation to initialize these movement primitives, and reinforcement learning to subsequently improve performance. We have learned elementary tasks such as Ball-in-a-Cup or bouncing a ball, and have gradually moved to more complex ones.
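To make the primitive idea concrete, here is a minimal one-dimensional sketch in the spirit of dynamic movement primitives: a spring-damper system with a learnable forcing term, fitted to a single demonstration (the imitation step). The constants and function names are illustrative, not those of our implementation.

```python
import numpy as np

ALPHA, BETA, ALPHA_X = 25.0, 6.25, 3.0   # spring, damper, and phase constants

def learn_dmp(y_demo, dt, n_basis=20):
    """Fit the forcing term of a 1-D movement primitive to a demonstration.

    The primitive is ydd = ALPHA * (BETA * (g - y) - yd) + f(x), where the
    phase x decays via xd = -ALPHA_X * x, so f vanishes and y converges to g.
    """
    T = len(y_demo)
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)
    g = y_demo[-1]                                          # goal attractor
    x = np.exp(-ALPHA_X * dt * np.arange(T))                # phase over the demo
    f_target = ydd - ALPHA * (BETA * (g - y_demo) - yd)     # required forcing
    c = np.exp(-ALPHA_X * np.linspace(0.0, 1.0, n_basis))   # basis centres
    h = n_basis / c                                         # basis widths
    psi = np.exp(-h * (x[:, None] - c) ** 2)                # (T, n_basis)
    # locally weighted regression: each basis fits f ~ w_i * x
    w = (psi * (x * f_target)[:, None]).sum(0) \
        / ((psi * (x ** 2)[:, None]).sum(0) + 1e-10)
    return w, c, h, g

def rollout(w, c, h, g, y0, dt, n_steps):
    """Integrate the learned primitive with simple Euler steps."""
    y, yd, x = float(y0), 0.0, 1.0
    traj = []
    for _ in range(n_steps):
        psi = np.exp(-h * (x - c) ** 2)
        f = x * (psi @ w) / (psi.sum() + 1e-10)  # forcing term, fades with x
        ydd = ALPHA * (BETA * (g - y) - yd) + f
        yd += ydd * dt
        y += yd * dt
        x -= ALPHA_X * x * dt
        traj.append(y)
    return np.array(traj)
```

A reinforcement learning step would then adjust the weights `w` to improve task performance beyond the demonstration, while the attractor structure guarantees that the movement still ends at the goal.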
For more complex tasks, hierarchical solutions compose behaviors from a large number of elementary ones. As the `drosophila' of complex behavior, we chose the task of returning table tennis balls over the net. This requires all the methods described in the previous paragraphs, as well as forms of reinforcement learning discussed below. We created a parser that segments the movements of a human teacher into elementary movements pubLink{6742,6743}. These segments then train the single primitives discussed above pubLink{6803,6802}. Novel behaviors, modulated by the opponent's incoming ball, are composed by mixing motor primitives pubLink{6745}, and the learning system generalizes between the primitives. This approach achieved successful returns of 88\% of balls; further improvement is limited by the robot's hardware and reaction time.
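A crude illustration of the segmentation idea: split a demonstrated one-dimensional trajectory into elementary movements wherever the velocity changes sign, i.e. where one movement ends and the next begins. The actual parser is probabilistic and operates on full arm trajectories; this sketch is purely illustrative.

```python
import numpy as np

def segment_demo(positions, dt):
    """Split a 1-D demonstrated trajectory into elementary movements.

    Cut points are placed at velocity sign changes -- a hand-crafted stand-in
    for the probabilistic segmentation used in the actual work.
    """
    vel = np.gradient(positions, dt)
    cuts = [0] \
        + [t for t in range(1, len(vel)) if vel[t - 1] * vel[t] < 0] \
        + [len(positions)]
    return [positions[a:b] for a, b in zip(cuts, cuts[1:])]
```

Each resulting segment could then serve as a demonstration for one elementary primitive, closing the loop between parsing and imitation.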
Human players infer the direction of incoming balls from the opponent's movement, and can thus prepare their stroke before the opponent even hits the ball. Inspired by this, we developed learning methods that anticipate the aim of the human opponent. They achieved a prediction accuracy of $35\,\mathrm{cm}$ already $320\,\mathrm{ms}$ before the ball is hit pubLink{WangDBVSP2012}, significantly increasing the available reaction time.
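The underlying prediction problem can be illustrated with a deliberately simple regression sketch: mapping hypothetical pre-hit opponent features (e.g. racket and arm marker positions) to the ball's target on the table. The actual work uses a probabilistic intention-inference model; the functions and features below are assumptions, not our method.

```python
import numpy as np

def fit_target_predictor(X, Y, lam=1e-3):
    """Ridge regression from pre-hit opponent features X (n_samples, n_feats)
    to ball target coordinates Y (n_samples, 2) on the table.

    Illustrative stand-in for the probabilistic intention-inference model.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    W = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ Y)
    return W

def predict_target(W, x):
    """Predicted (x, y) landing point for one pre-hit feature vector."""
    return np.append(x, 1.0) @ W
```

Every millisecond by which such a predictor beats ball-flight extrapolation is time the robot can spend moving toward the right stroke.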
Figure 0.1: The robot table tennis setup pubLink{6745}.
Recently, we developed a method that maintains multiple conflicting strategies in its motor policy pubLink{DanielNP2012_2}. This generalizes previous work on mixtures of motor primitives pubLink{6745} and puts it on a stronger theoretical footing. It was tested in a game of tetherball (Figure 0.2) and may become useful in table tennis, as it supports more ambiguous game play.
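The idea of keeping several conflicting strategies alive can be sketched as a gated mixture: each option proposes action parameters, and a softmax gating over situation-dependent scores decides which option to execute. This toy class only illustrates the structure; it is not the reinforcement learning algorithm of the actual work.

```python
import numpy as np

class MixturePolicy:
    """Softmax-gated mixture of sub-policies ('options').

    Toy illustration of maintaining multiple conflicting strategies in one
    motor policy; not the actual learning algorithm.
    """

    def __init__(self, options, score_fns, temperature=1.0):
        self.options = options        # callables: state -> action parameters
        self.score_fns = score_fns    # callables: state -> scalar preference
        self.temperature = temperature

    def gating(self, state):
        """Probability of choosing each option in the given state."""
        s = np.array([f(state) for f in self.score_fns]) / self.temperature
        p = np.exp(s - s.max())       # subtract max for numerical stability
        return p / p.sum()

    def act(self, state, rng):
        """Sample an option and return its index and proposed action."""
        k = rng.choice(len(self.options), p=self.gating(state))
        return k, self.options[k](state)
```

Because the gating is stochastic rather than winner-take-all, low-scoring strategies keep being explored instead of collapsing to a single behavior.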
Figure 0.2: The Tetherball task pubLink{DanielNP2012_2}, which won the IROS Best Cognitive Robotics Paper Award while also being a finalist for both the best paper and the best student paper award.
Motor skill learning can also be applied to grasping and manipulation. We have developed methods pubLink{6436,6636} for learning how to grasp novel objects (a preliminary version pubLink{6436} won the ICINCO Best Paper Award) and for generalizing such basic manipulations to different objects pubLink{KroemerUOP2012}. We also studied how to recognize surfaces from the tactile frequency patterns obtained by sliding a finger over them pubLink{7054}. The robot autonomously discovered the most relevant dimensions of the tactile data by training jointly on vision and tactile data from various textured surfaces. Subsequently, when presented with only tactile stimuli, it was able to quickly and reliably recover the surface type.
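As a simplified sketch of such a tactile pipeline: the vibration signal recorded while sliding a finger over a surface is turned into coarse spectral features and classified by its nearest class centroid. The actual system learns its feature dimensions jointly from vision and touch; the hand-crafted features and functions below are illustrative assumptions.

```python
import numpy as np

def spectral_features(signal, n_bins=32):
    """Pool the magnitude spectrum of a tactile vibration signal into coarse
    frequency bins -- a hand-crafted stand-in for the learned tactile features."""
    mag = np.abs(np.fft.rfft(signal))
    return np.array([chunk.mean() for chunk in np.array_split(mag, n_bins)])

def classify_surface(train_feats, train_labels, query):
    """Assign the query to the class whose feature centroid is closest."""
    labels = sorted(set(train_labels))
    centroids = {
        l: np.mean([f for f, y in zip(train_feats, train_labels) if y == l], axis=0)
        for l in labels
    }
    return min(labels, key=lambda l: np.linalg.norm(query - centroids[l]))
```

Different textures excite different vibration frequencies under the sliding finger, so even these coarse spectral bins separate surface classes well.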