Voice Recognition Software for Raspberry Pi

Install Instructions

(this requires git)
Here are the dependencies required to run and build:
sudo apt-get install libboost1.50-dev libboost-regex1.50-dev youtube-dl axel curl xterm libcurl4-gnutls-dev mpg123 flac sox

sudo apt-get install git-core

git clone https://github.com/StevenHickson/PiAUISuite.git

cd PiAUISuite/Install/
./InstallAUISuite.sh

 

Update Instructions

cd PiAUISuite

git pull

cd Install

sudo ./UpdateAUISuite.sh

Note: If the file .commands.conf doesn't exist in your home directory and you have an old version of the code, the program will exit. Either grab the newest code from GitHub or create the file yourself.

To make it listen for longer, edit the file /usr/bin/speech-recog.sh and change -f cd -t wav -d 3 to -f cd -t wav -d #, where # is the number of seconds it should listen.
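
The edit described above can also be scripted with sed. This is a minimal sketch, not part of the suite: the sample line is a hypothetical stand-in for the real contents of /usr/bin/speech-recog.sh (only the "-f cd -t wav -d 3" fragment comes from the note above), and it works on a scratch copy so nothing on the system is changed.

```shell
#!/bin/sh
# Sketch: change the listening duration from 3 seconds to a new value.
seconds=5                      # assumed new duration, pick your own
script=$(mktemp)               # scratch copy, NOT the real speech-recog.sh

# Hypothetical sample line; the real script may look different.
echo 'arecord -D plughw:1,0 -f cd -t wav -d 3 -r 16000' > "$script"

# Replace the number that follows "-f cd -t wav -d" and keep a .bak backup.
sed -i.bak "s/-f cd -t wav -d [0-9][0-9]*/-f cd -t wav -d $seconds/" "$script"

cat "$script"
```

To apply the same substitution to the installed script, run the sed command with sudo against /usr/bin/speech-recog.sh instead of the scratch file, and check the result before relying on it.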


The default special options are as follows:
!keyword==pi
!verify==1
!continuous==1
!quiet==0
!ignore==0
!filler==1
!thresh==0.7
!response==Yes sir?

response and keyword can be any string.
verify, continuous, quiet, filler, and ignore can be 1 or 0 (true or false respectively).
thresh can be any floating-point number; it sets the volume threshold.
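
As a rough illustration of how these options sit in a config file, the sketch below writes a few of them to a scratch file and reads one back. It assumes every option is written as !name==value with a double equals sign; get_option is a hypothetical helper, not something shipped with voicecommand.

```shell
#!/bin/sh
# Sketch: read a special option ("!name==value") back out of a config file.
conf=$(mktemp)                 # scratch file standing in for ~/.commands.conf
cat > "$conf" <<'EOF'
!keyword==pi
!verify==1
!thresh==0.7
!response==Yes sir?
EOF

# get_option FILE NAME: print the value of the first "!NAME==" line (hypothetical helper).
get_option() {
    sed -n 's/^!'"$2"'==//p' "$1" | head -n 1
}

get_option "$conf" thresh
get_option "$conf" response
```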

I've also created a man page for voicecommand. You can access it with man voicecommand. It is shown below:

OPTIONS

?
Same as -h
-b
Turns off the FILL audio. The filler exists because the Raspberry Pi (or mine at least) cuts off the first few seconds of audio. This flag turns that feature off. You only need it if you hear FILL before everything it says.
-c
Makes voicecommand run in continuous mode, where it will keep listening over and over again.
-d
Sets the duration for listening to the audio for voice commands.
-D
Sets the audio hardware. The default is plughw:1,0
-e
Edits the voicecommand config file.
The format is voice==command
You can use any character except for newlines or ==
If the voice starts with ~, the program looks for the keyword anywhere. Ex: ~weather would pick up on weather or what's the weather
You can use ... at the end of the command to specify that everything after the given keyword should be options to the command.
Ex: play==playvideo ...
This means that if you say "play Futurama", it will run the command playvideo Futurama
You can use $# (where # is any number 1 to 9) to represent a variable. These should go in order from 1 to 9
Ex: $1 season $2 episode $3==playvideo -s $2 -e $3 $1
This means if you say game of thrones season 1 episode 2, it will run playvideo with the -s flag as 1, the -e flag as 2, and the main argument as game of thrones, i.e. playvideo -s 1 -e 2 game of thrones
Because of these options, it is important that the arguments range from most strict to least strict.
This means that ~ arguments should probably be at the end.
You can also put comments if the line starts with # and special options if the line starts with a !
Default options are shown as follows:
!keyword==pi
!verify==1
!continuous==1
!quiet==0
!ignore==0
!thresh==0.7
!maxResponse==-1
!api==BLANK
!filler==FILLER FILL
!response==Yes Sir?
!duration==3
!com_dur==2
!hardware==plughw:1,0
!language==en_us
The keyword, filler, and response options accept strings. verify, continuous, quiet, and ignore accept 1 or 0 (true or false respectively). thresh accepts a floating-point number. These let you set some of the flags as permanent options (if they are set, you can still override them with the command-line flags).
You can set a WolframAlpha API key and maxResponse (the number of branches) like !api==XXXXXX-XXXXXXXXXX and !maxResponse==3
You can now customize the language used for speech recognition and some text to speech with the language option. Look up your country code and use that. Ex: for the US, !language==en_us; for Spain, !language==es; for Germany, !language==de.
-f /my-location/config-file
This allows you to load a different config file located in a different spot. The default one is in your home directory and is ~/.commands.conf
The config file must be formatted the same way.
-h
Shows this man page.
-i
Sets the ignore mode. When this flag is activated, if a command is not in the config file, nothing happens. The default behavior is to try to find an answer or response to that question and then speak it. This turns off that behavior.
-I string
Sets the forced input mode. This allows you to test it without the microphone or get it to parse typed information. It will not run in continuous mode with this.
-k word
Sets the keyword. The default is pi. If this flag is set, the verify and continuous flags are also set since this is only checked during those two modes.
      Ex. voicecommand -c -v -k Jarvis
-l
Sets the duration for listening to the audio for the command keyword. This is different than the -d flag that listens for the voice commands.
-s
Runs a setup operation that attempts to set all of the config options in the config file so that voicecommand works properly.
-r word
Sets the response. The default is "Yes Sir?" (for version 1.0, it was "Ready?"). If the response is more than one word, put it in quotes; otherwise quotes are not needed.
      Ex. voicecommand -r Ready?
-t #
Sets the threshold for volume to determine if the keyword was spoken. This should be a floating point number. The default value is 0.7 which works well with the Logitech C310 camera/mic from about 6 feet away.
Ex. voicecommand -t 1.2
-p
Sets passthrough mode on so that instead of running the commands, it just prints them. This is going to be used for the XBMC plugin and Android app.
-q
Sets quiet mode on so that voicecommand never speaks through the audio output. It still prints everything but doesn't ever respond. This includes the keyword response.
-v
Makes voicecommand verify the keyword. This only happens in continuous mode so if this flag is set, the continuous flag will be set as well. The default mode is to not verify. When voicecommand hears any sound above the threshold, it says the response then listens for a command. The default keyword is pi. When the verify flag is set, after the threshold is met, voicecommand verifies that the keyword was spoken.
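
The config-file rules described under -e (the $# variables, the trailing ..., and the ~ prefix) can be pictured with a small sketch. It hand-translates three entries into shell patterns and only prints the command that would run; it is not how voicecommand is implemented, expand_phrase is a hypothetical helper, and weather-report is a made-up command for the ~weather entry.

```shell
#!/bin/sh
# Sketch of three config entries, ordered from most strict to least strict:
#   $1 season $2 episode $3==playvideo -s $2 -e $3 $1
#   play==playvideo ...
#   ~weather==weather-report     (weather-report is hypothetical)
expand_phrase() {
    phrase=$1
    case $phrase in
        *' season '*' episode '*)
            title=${phrase%% season *}      # $1: text before " season "
            rest=${phrase#* season }        # text after " season "
            season=${rest%% episode *}      # $2: text before " episode "
            episode=${rest#* episode }      # $3: text after " episode "
            echo "playvideo -s $season -e $episode $title"
            ;;
        'play '*)
            # "...": everything after the keyword becomes the arguments
            echo "playvideo ${phrase#play }"
            ;;
        *weather*)
            # "~": the keyword may appear anywhere in the phrase
            echo "weather-report"
            ;;
        *)
            echo "no match"
            ;;
    esac
}

expand_phrase "game of thrones season 1 episode 2"
expand_phrase "play Futurama"
expand_phrase "what's the weather"
```

The ordering matters in the same way the man page describes: if the ~weather rule came first, a phrase such as "play weather girls" would never reach the play rule.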

 

 

playvideo:
    Uses a special locate database and omxplayer to quickly locate and play videos.
    http://stevenhickson.blogspot.com/2013/03/playing-videos-intelligently-with.html

downloader:
    Uses curl and transmission to find the best torrent based on your input and then starts downloading it.
    http://stevenhickson.blogspot.com/2013/03/automatically-downloading-torrents-with.html

gvapi:
    Uses curl and my Google Voice API. It can check, send, and delete SMS messages. See the man page for more by typing man gvapi
    http://stevenhickson.blogspot.com/2013/05/using-google-voice-c-api.html

gtextcommand:
    Uses curl and my Google Voice API to make the computer check for text messages every minute and run a command that you send.
    http://stevenhickson.blogspot.com/2013/03/controlling-raspberry-pi-via-text.html

youtube:
    Uses youtube-dl and other scripts to stream YouTube videos.
    http://stevenhickson.blogspot.com/2013/06/playing-youtube-videos-in-browser-on.html
    http://stevenhickson.blogspot.com/2013/04/using-youtube-on-raspberry-pi-without.html

youtube-safe:
    Uses youtube-dl, get_flash_videos, and other scripts to stream other video files (The Daily Show, Hulu, etc.).
    http://stevenhickson.blogspot.com/2013/06/getting-huluvimeo-to-work-on-raspberry.html
    http://stevenhickson.blogspot.com/2013/06/streaming-other-hd-video-sites-on.html

voicecommand:
    Uses Google's API and a special config to run commands based on what you say.
    http://stevenhickson.blogspot.com/2013/05/voice-command-v20-for-raspberry-pi.html
    http://stevenhickson.blogspot.com/2013/04/voice-control-on-raspberry-pi.html

Copyright
GPLv3
Steven Hickson
Date: 8th July, 2015. Written by Satya
