Voice control as input

Is there any demo code for moving any component of the Hello Robot using voice commands/input?

Hey @sitaneja, welcome to the forum! Check out this command line tool: stretch_robot_respeaker_teleop.py. Just as stretch_robot_keyboard_teleop.py is a quick way to teleoperate the robot around with a keyboard, stretch_robot_respeaker_teleop.py is how you would teleoperate Stretch around using voice commands.

The Python script makes use of Mozilla’s offline speech-to-text engine, DeepSpeech v0.6.1, but you can use other online/offline solutions.

Would it require any additional library or tool to execute this code?

The requirements for all of our command line tools and ROS repos are installed at our factory. The install scripts can be found in stretch_install. Since stretch_robot_respeaker_teleop.py is currently in a feature branch (a work in progress), its requirements aren’t in stretch_install yet.

I’ll send the setup steps when you’re ready with the robot.

Hi @sitaneja, we also have some example code showing how we have used Microsoft Azure’s speech-to-text and LUIS to control our Stretch robot. It also includes other functionality, like text-to-speech and QnA Maker, so the Stretch can answer back. We’re currently working on the repo as well as a blog post that will explain this in more detail.

Hi @bshah,

I was wondering where I could find stretch_robot_respeaker_teleop.py? I could not find it in the stretch_body repo anymore.

Thanks,
Lionel

Hi @Lionel_Zhang, you can find the script at this commit, but I removed it from shipping by default because it didn’t perform reliably enough to teleop the robot with voice commands. If you’d like to test it out anyways, you can use the following instructions to set up DeepSpeech and pull the script from the linked commit. Additionally, DeepSpeech has released v0.9.3 since, so the installation procedure to use this new version may be different.

There are many ways the script could have been improved to perform voice teleop more reliably. If you are interested in using DeepSpeech, you might look into writing a custom scorer on just the keywords needed to teleop the robot. This would likely improve the model’s inferences.
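
As a rough illustration of restricting recognition to a small vocabulary, here is a minimal sketch that post-filters a transcript for teleop keywords. To be clear, this is not a DeepSpeech scorer and not how stretch_robot_respeaker_teleop.py works; it is a simpler stand-in that ignores everything except a few known words. The keyword list and the `match_command` helper are hypothetical names made up for this example.

```python
# Hypothetical teleop vocabulary; these keywords are illustrative and are
# not taken from stretch_robot_respeaker_teleop.py.
KEYWORDS = ("forward", "back", "left", "right", "up", "down", "stop")


def match_command(transcript):
    """Return the first teleop keyword found in a transcript, or None.

    The transcript is whatever text the speech-to-text engine produced.
    Words outside the vocabulary are ignored.
    """
    for word in transcript.lower().split():
        # Drop punctuation that some engines attach to words.
        cleaned = word.strip(".,!?;:")
        if cleaned in KEYWORDS:
            return cleaned
    return None


if __name__ == "__main__":
    print(match_command("Please move FORWARD now"))
    print(match_command("hello there"))
```

Filtering after the fact only discards stray words; a custom scorer would instead bias the decoder toward these keywords during inference, which is why it would likely help more.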

Thanks, Binit. Do you remember if the sub-optimal performance was due to the mic quality/volume or the voice algorithm?

We’re actually not too interested in the voice teleop itself; I just wanted to test it quickly to see how well the mic works with a different program. The mic input volume seemed low when the user was standing around 1-2 meters away from the robot, but we haven’t done a lot of testing to see if it’s our app or the mic itself that is the problem. I was planning to play with it some more today.

For the voice teleop script, it’s likely a combination of both factors. ReSpeaker has a tutorial for isolating voice, and there are other audio filtering techniques that would likely improve the quality of the audio.

For the low input volume problem, I’d start with easy changes, like tweaking volume settings in the OS Settings Center or within the recording app itself. Also, try other recording programs like Audacity. If the volume remains consistently low, I can suggest some other ideas for debugging the issue.
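
If it helps to put a number on "low volume", here is a minimal sketch that measures the RMS level of a recording in dB relative to full scale (dBFS), so you can compare recordings made at different distances or slider settings. It assumes a 16-bit PCM WAV file (for example, one exported from Audacity); `read_wav_samples` and `rms_dbfs` are hypothetical helper names, and only the Python standard library is used.

```python
import math
import struct
import wave

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit PCM sample


def read_wav_samples(path):
    """Read all samples from a 16-bit PCM WAV file as a tuple of ints.

    Multi-channel files are returned interleaved, so the level computed
    from them is an average over all channels.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch assumes 16-bit PCM audio")
        raw = wav.readframes(wav.getnframes())
    return struct.unpack("<{}h".format(len(raw) // 2), raw)


def rms_dbfs(samples):
    """Return the RMS level of 16-bit samples in dBFS (-inf for silence)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (FULL_SCALE ** 2))


def sine(amplitude, n=16000, freq=440.0, rate=16000):
    """Generate a synthetic test tone, standing in for a real recording."""
    return [int(amplitude * math.sin(2 * math.pi * freq * i / rate))
            for i in range(n)]


if __name__ == "__main__":
    print("loud tone: ", round(rms_dbfs(sine(32767)), 1), "dBFS")
    print("quiet tone:", round(rms_dbfs(sine(327)), 1), "dBFS")
```

With a real recording, you would call `rms_dbfs(read_wav_samples("recording.wav"))`, where the file name is a placeholder. A reading that stays far below 0 dBFS even when speaking close to the mic would point at gain settings rather than at your app.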

Thanks for the suggestions. Maxing out the input slider in pavucontrol seems to have alleviated the volume issue a bit. With the slider at 100%, the mic would not pick up any audio from ~1 m away; it only did so once the slider was maxed out at 153%. I’ll report back here if we need additional help.
