Audio Test Robot

Updated 6/4/16


The audio around a robot can be used to interact with it and trigger certain functions. Here, we are going to try to have the robot's main processor decode these audio waveforms and identify them. If this can be done, voice-controlled robots could be used around the home to perform various tasks. Here are some new images of the audio processor's output and configuration. All waveform images here were taken from an HP digital storage scope.

 Left: Schematic diagram of the high-gain electret microphone amp. The two 10k resistors on the plus input set the output DC level to mid-range.
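As a quick check of that bias point, here is a minimal sketch of the divider arithmetic. It assumes a 5 V single supply, which the caption does not state; that value is chosen only because it agrees with the 2.5 V DC level seen on the scope traces. The function name is made up for illustration.

```python
def divider_output(v_supply, r_top, r_bottom):
    """Unloaded voltage at the junction of a two-resistor divider."""
    return v_supply * r_bottom / (r_top + r_bottom)

# Assumption: 5 V single supply (not stated in the caption).
V_SUPPLY = 5.0
R_TOP = 10_000.0     # 10k from supply to the plus input
R_BOTTOM = 10_000.0  # 10k from the plus input to ground

bias = divider_output(V_SUPPLY, R_TOP, R_BOTTOM)
print(f"Bias at the plus input: {bias:.2f} V")
```

Two equal resistors always land at half the supply, so the audio can swing equally above and below the resting level.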


 Left: Schematic diagram of audio envelope generation. Here we first strip out the bottom half of the waveform with the 0.1 µF cap, 1N4148 diode and a 10k ground reference. This rectified waveform enters the second set of amps, which form a 16 Hz Butterworth-type low-pass filter. This removes the frequency bursts inside each sound and presents a clean envelope waveform ready for the analog input of a processor. The filter's measured roll-off is shown on the lower right.
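The same idea can be tried in software before building anything. The sketch below is a numerical model, not the analog circuit itself: it assumes an ideal diode (no forward drop) and a second-order Butterworth response, since the filter order is not given here. The function names, the 8 kHz sample rate and the 1 kHz test burst are all made up for illustration.

```python
import math

def butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz):
    """Biquad coefficients for a 2nd-order Butterworth low-pass filter."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    q = 1.0 / math.sqrt(2.0)            # Q of a 2nd-order Butterworth section
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def envelope(samples, sample_rate_hz, cutoff_hz=16.0):
    """Half-wave rectify the samples, then low-pass filter them."""
    b, a = butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for s in samples:
        x0 = max(s, 0.0)                # ideal diode: keep only the top half
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

# Hypothetical test input: a 1 kHz tone burst from 0.2 s to 0.5 s.
FS = 8000
signal = [math.sin(2.0 * math.pi * 1000.0 * n / FS) if 0.2 <= n / FS < 0.5 else 0.0
          for n in range(FS)]
env = envelope(signal, FS)
print(f"Envelope before burst: {env[800]:.3f}")
print(f"Envelope inside burst: {env[3600]:.3f}")
print(f"Envelope after burst:  {env[7200]:.3f}")
```

The output rises to a steady level while the tone is present and falls back to zero afterwards, with the 1 kHz content itself removed, which is the behavior described for the hardware.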


 Left: Here is the circuit board, with electret microphone and oscilloscope probe. Electrets are very sensitive to sound; I can even detect the subsonic sound of my hand waving a foot away. The idea is to be able to talk to the robot from across the room to give it instructions. Is this possible? We shall see!

 Left: Output of the first microphone amplifier for the word "TESTING" - 0 V is two lines up from the bottom, and the waveform here sits at 2.5 V DC so you can see both the top and bottom of the burst. I don't quite have the technique down yet for photographing the screen of the O'scope with the web cam.

 Left: After passing through the rectifier circuit, the DC level is set to zero volts and the bottom half is chopped off. This, when filtered by the 16 Hz filter, gives you the envelope output.

 Left: Final Envelope detection waveform - "TESTING"

Can you see the tiny glitch at the start of the first hump? That's the "T" sound that starts the word. The gap between the humps is the syllable break.

 Left: Envelope detection waveform - "ROBOT" is being said here. I've labeled the components of this word so you can see how this works. The scope was set for single trigger, so I started the two-second-long trace with a click, then said the word. The shape of the envelope-detected waveform is pretty repeatable; in other words, if I say the same word twice, the shape is very similar, and VERY different from other words. Obviously, words that rhyme will be a problem, so I'll use very distinct words for now. The frequency components are filtered out on this set; I am just working with the overall shape of the word.
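One possible way to put that repeatability to work is to compare a freshly captured envelope against stored examples of each word. The sketch below is only an illustration of that idea, not the decoding method used on this robot: it stretches each envelope to a fixed length and scores the match with a normalized correlation, so loudness and speaking speed matter less than shape. All names and sample values are hypothetical.

```python
import math

def resample(envelope, length=32):
    """Linearly resample an envelope to a fixed number of points."""
    if len(envelope) == 1:
        return [float(envelope[0])] * length
    out = []
    for i in range(length):
        pos = i * (len(envelope) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(envelope) - 1)
        frac = pos - lo
        out.append(envelope[lo] * (1.0 - frac) + envelope[hi] * frac)
    return out

def similarity(env_a, env_b, length=32):
    """Normalized correlation of two envelope shapes, from -1 to 1."""
    a = resample(env_a, length)
    b = resample(env_b, length)
    mean_a = sum(a) / length
    mean_b = sum(b) / length
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0

def best_match(envelope, templates):
    """Return the template word whose shape correlates best."""
    return max(templates, key=lambda word: similarity(envelope, templates[word]))

# Made-up envelope samples, not measurements from the scope.
templates = {
    "BYE": [0, 2, 8, 10, 8, 4, 1, 0],            # one hump
    "HELLO": [0, 3, 4, 1, 0, 6, 10, 7, 2, 0],    # two humps, second larger
}
heard = [0, 1, 4, 5, 4, 2, 0.5, 0]               # a quieter one-hump word
print(best_match(heard, templates))
```

Because the score is normalized, a word spoken at half the volume still matches its template; rhyming words with similar shapes would still be hard to tell apart.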

 Left: Envelope detection waveform - "HELLO" Again, two syllables, with the second one emphasized.

 Left: Envelope detection waveform - "BYE" A quick one-hump sound with lots of energy.

 Left: Envelope detection waveform - "STOP" OK, three parts of this word are seen here: the initial "S", then the main hump for the short "O", followed by the ending pulse for the "P". You can see that different words have very different envelopes. Deciphering this is the key! The next posting will detail this.
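As a first rough feature, a processor could simply count the humps in a sampled envelope. This is a minimal sketch of that idea and not the deciphering method promised for the next posting: it uses two thresholds (hysteresis) so that ripple near a single level does not get counted twice. The function name, thresholds and sample values are assumptions and would need tuning against real ADC readings.

```python
def count_humps(envelope, on_level, off_level):
    """Count separate humps using two thresholds (hysteresis)."""
    humps = 0
    inside = False
    for value in envelope:
        if not inside and value >= on_level:
            inside = True          # rising through the upper threshold
            humps += 1
        elif inside and value <= off_level:
            inside = False         # fallen back below the lower threshold
    return humps

# Made-up envelope samples, not measurements from the scope.
one_hump = [0, 2, 8, 10, 8, 4, 1, 0]
two_humps = [0, 3, 6, 7, 3, 1, 0, 6, 10, 7, 2, 0]

print(count_humps(one_hump, on_level=5, off_level=2))
print(count_humps(two_humps, on_level=5, off_level=2))
```

Small features such as the leading "S" or a trailing "P" pulse may sit below the upper threshold, so the levels chosen decide whether those parts count as humps of their own.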


Previous Uploads on this robot: