Intent Classification during Human-Robot Contact
Robots are used in many areas of industry and automation. Currently, human safety is ensured through physical separation and safeguards. However, there is increasing interest in allowing robots and humans to work in close proximity or on collaborative tasks. In these cases, the robot itself must recognize when a collision has occurred and respond in a way that prevents further damage or harm. At the same time, robots must respond appropriately to intentional contact during interactive and collaborative tasks. This thesis proposes a classification-based approach for differentiating between several intentional contact types, accidental contact, and no-contact situations. A dataset is developed using the Franka Emika Panda robot arm. Several machine learning algorithms, including Support Vector Machines, Convolutional Neural Networks, and Long Short-Term Memory Networks, are applied to perform classification on this dataset.
First, Support Vector Machines were used to perform feature identification. Classification on raw sensor data was compared with classification on data calculated from a robot dynamic model, and linear features were compared with nonlinear features. The results show that very few features are needed to achieve the best results, and that accuracy is highest when raw sensor data is combined with model-based data. Accuracies of up to 87% were achieved. Classification on the basis of each individual joint, as opposed to the whole arm, was also tested and shown not to provide additional benefits.
Second, Convolutional Neural Networks and Long Short-Term Memory Networks were evaluated for the classification task. A simulated dataset was generated and augmented with noise for training the classifiers. Experiments show that additional simulated and augmented data can improve accuracy in some cases, as well as lower the amount of real-world data required to train the networks.
Accuracies up to 93% and 84% were achieved by the CNN and LSTM networks, respectively. The CNN achieved an accuracy of 87% using all real data, and up to 93% using only 50% of the real data with simulated data added to the training set, as well as with augmented data. The LSTM achieved an accuracy of 75% using all real data, and nearly 80% accuracy using 75% of real data with augmented simulation data.
Cite this version of the work
Wesley Fisher (2020). Intent Classification during Human-Robot Contact. UWSpace. http://hdl.handle.net/10012/16438