A.T.R. Audio Test Robot
Above: The robot has a microphone on a short boom at the rear, and the mic amp/signal conditioner is the small board on the other rear side. The eventual goal is for the robot to understand a small vocabulary of spoken commands.
Updated 6/29/16
Key Search Words: ROBOT, ROBOTICS, ROBOTIC VISION, ARTIFICIAL INTELLIGENCE, AI
Word recognition using a very crude digitization of the wave shape gave insufficient accuracy on similar-sounding words. More research on the subject revealed that using amplitude alone in the recognition process was never very effective; today, high-powered processors and DSPs perform Fourier transforms to separate the frequency spectra of various words and decode them. That is beyond what my little microcontroller is capable of, so I'm now pursuing a different concept that has successfully decoded six different commands.
Let's discuss how this was done, then show some movies (YouTube) of the process working in the actual robot.
When the robot is first turned on, there is a 2-second delay to let the amplifiers settle. Then a background audio level reading is taken with the A/D converter set for 8-bit resolution; this is around 2 ADU. The sound-trigger threshold is then set in the program 10 ADU higher. Once triggered by a sound, an LED is lit and the signal is examined for 2 seconds. During that time, an accumulator records, in 1 ms increments, the time the sound level stays above the preset threshold. At left is a graphical example of the word "ROBOT". The accumulator is read at the end of the 2-second window, and the total is used to identify the spoken word based on how long the sound lasted. While this sounds a bit sketchy, it works great, since we choose the half dozen words the robot (or a pet, for that matter) has to differentiate anyway.
The following words, or sounds, were used for testing; most of the time the robot got the word correct and sent it to the LCD display as an echo:
"Click", "Robot", "Stop", "Light On". Here are the avenge durations measured a dozen times for each word:
"CLICK" = 60ms
"STOP" = 308ms
"ROBOT" = 380ms
"LIGHT ON" = 580ms
A range of 50 ms was given to each word. The measured duration of the spoken word was compared against these ranges, a decision was made as to which range the word fell in, and the result was sent to the LCD display.
Movie1 A small program was written to display the duration of each word as spoken. A beep indicates it is ready to record the next 2-second interval.
Movie2 Identification of each word is shown here.
The next step obviously is to have the robot perform tasks based on what is spoken to it.
Movie3 This is it! Four commands are given to the robot verbally. CLICK - rotate 90 degrees; ROBOT - beep twice to say "I recognize you are calling me!"; GO - go forward for 3 sec; LIGHT ON - turn on the white light.
Previous Uploads on this robot:
1. The sound envelope waveforms with the mic amp board
2. Sound activated movement sequence
3. More waveforms, a scope camera, and digitizing the shapes