Robbie – Voice Control for Robots
Robbie is a robot that can be controlled by speech.
The goal of the project is to control a robot by speech. The robot should understand basic driving commands such as forward, backward and rotations. It must give feedback to the operator and execute a command only after it has recognized its name (“Robbie”). The robot is built on the Lego EV3 platform; its brain is a Raspberry Pi 3, on which several neural networks analyse the data coming from the microphone.
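The name requirement can be pictured as a small gate in front of the drive commands. The following is a minimal sketch, not the project's code: the label strings, the command set beyond forward, backward and rotations, and the policy that the name must be the last word heard before the command are all assumptions made for illustration.

```python
from typing import Iterable, List

WAKE_WORD = "robbie"          # the robot's name, from the text
NOISE = "background_noise"    # hypothetical label for the noise category
# Hypothetical command labels; the text only names forward, backward and rotations.
COMMANDS = {"forward", "backward", "left", "right", "stop"}


def gate_commands(labels: Iterable[str]) -> List[str]:
    """Return the commands that would be executed for a stream of recognized labels.

    Assumed policy: a command is accepted only if the wake word was the last
    word recognized before it. Background noise in between is ignored; any
    other label (for example an unknown word) disarms the gate.
    """
    executed: List[str] = []
    armed = False
    for label in labels:
        if label == WAKE_WORD:
            armed = True
        elif label == NOISE:
            continue  # silence or noise does not change the state
        elif label in COMMANDS and armed:
            executed.append(label)
            armed = False  # each command needs its own wake word
        else:
            armed = False
    return executed


if __name__ == "__main__":
    stream = ["forward", "robbie", "forward", NOISE, "robbie", NOISE, "left"]
    print(gate_commands(stream))
```

In this sketch the first "forward" is dropped because the name was not heard before it; whether the real robot allows a pause or several commands per wake word is not stated in the text.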
A particular challenge was implementing speech recognition on the comparatively low-powered Raspberry Pi 3. On this hardware, only the recognition of a fixed set of previously defined terms is practical. For this purpose, we created a data set of around 10000 recordings of the 10 commands. A further 14000 samples were generated by randomly applying noise, reverberation and other changes to the existing recordings. Each training file is 4 seconds long at a sampling rate of 16 kHz.
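At 16 kHz, a 4-second file holds 4 × 16000 = 64000 samples. The augmentation step could look roughly like the sketch below, which uses only the Python standard library. The function names, the parameter ranges and the single-echo stand-in for reverberation are assumptions for illustration; the text does not say how the project actually generated its additional samples.

```python
import random
from typing import List

SAMPLE_RATE = 16_000                        # Hz, from the text
CLIP_SECONDS = 4                            # seconds, from the text
CLIP_SAMPLES = SAMPLE_RATE * CLIP_SECONDS   # 64000 samples per training file


def add_noise(clip: List[float], noise_level: float, rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return [s + rng.gauss(0.0, noise_level) for s in clip]


def add_echo(clip: List[float], delay_s: float, decay: float) -> List[float]:
    """Crude reverberation stand-in: mix in one delayed, attenuated copy."""
    delay = round(delay_s * SAMPLE_RATE)
    out = list(clip)
    for i in range(delay, len(clip)):
        out[i] += decay * clip[i - delay]
    return out


def augment(clip: List[float], rng: random.Random) -> List[float]:
    """Return a randomly perturbed copy of a clip (parameter ranges are assumed)."""
    noisy = add_noise(clip, noise_level=rng.uniform(0.001, 0.02), rng=rng)
    echoed = add_echo(noisy, delay_s=rng.uniform(0.02, 0.1),
                      decay=rng.uniform(0.1, 0.5))
    peak = max(abs(s) for s in echoed)
    # Rescale only if the mix left the [-1, 1] range, to avoid clipping.
    return [s / peak for s in echoed] if peak > 1.0 else echoed


if __name__ == "__main__":
    rng = random.Random(0)
    clip = [0.0] * CLIP_SAMPLES
    clip[1000] = 1.0  # a single click as a toy recording
    print(len(augment(clip, rng)))
```

Passing a seeded `random.Random` keeps each generated variant reproducible, which makes it easier to regenerate the same augmented set later.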
In the first layer, the neural network converts the input into a Mel spectrogram. This is followed by convolutional and LSTM layers to extract features from the input signal. In addition to the commands, the network can also distinguish the categories “unknown word” and “background noise”.
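A Mel spectrogram groups frequencies on the Mel scale, which is roughly linear at low frequencies and logarithmic at high ones. The sketch below shows only this frequency mapping, not a full spectrogram layer. It assumes the common formula mel = 2595 · log10(1 + f/700) and a hypothetical count of 40 Mel bands; the text gives neither the formula variant nor the number of bands used in the project.

```python
import math
from typing import List

SAMPLE_RATE = 16_000        # Hz, from the text
NYQUIST = SAMPLE_RATE / 2   # highest representable frequency: 8000 Hz


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to Mel (assumed formula variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float = 0.0,
                           f_max: float = NYQUIST) -> List[float]:
    """Center frequencies in Hz of n_mels bands spaced evenly on the Mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    centers = mel_center_frequencies(40)  # 40 bands is a hypothetical choice
    print([round(c) for c in centers[:5]], "...", round(centers[-1]))
```

The bands sit close together at low frequencies and spread out towards 8 kHz, which is why this representation resolves the lower part of the spectrum more finely than the upper part.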
In principle, the recognition of voice commands works reliably and within a short time despite the weak hardware. Feedback is returned to the operator via an LED ring and voice output. Problems are caused by the microphone, whose high background noise level limits the range. Although it has been possible to record and generate a very large amount of training data, many more recordings of different people are required to optimize recognition. For comparison: similar open source projects work with about 100000 samples.
Maximilian Thiel, Zeynep Aydeniz, Michael Kleiner, Maximilian Spiegel, Daniel Zettler
Prof. Dr. Rieck, Kempten University (Project management)
SS 2019, Faculty of Mechanical Engineering