Speech Recognition

VoiceGuide can use Google Cloud to perform Speech-To-Text (STT) either in real time or by transcribing recorded sound files.

VoiceGuide also supports any MRCPv2 compliant speech recognition engine.

Please contact sales@voiceguide.com to discuss your Speech Recognition requirements.


Configuring Dialogic Cards

If using analog Dialogic cards, a special "CSP Enabled" firmware file must be selected. Go to the Dialogic Configuration Manager, open the properties page for the analog card, and on the Misc tab select the following firmware file:

D/41JCT : d41jcsp.fwl

D/120JCT : d120csp.fwl

Similarly, when using the Dialogic JCT series T1 and E1 cards, the "CSP Enabled" firmware needs to be selected in the Dialogic Configuration Manager. T1 and E1 JCT cards also require changes to the dxxx resource specification in the Config.xml file. Please consult support@voiceguide.com when deploying Speech Recognition on T1 or E1 JCT family cards.


Installing the Speech Recognition engine

It is recommended that the Speech Recognition engine be installed on a separate server. Speech Recognition engines have high CPU and other resource requirements, and the resulting load can degrade overall IVR performance if the engine is installed on the same system as VoiceGuide.


MRCPv2 setup

An mrcp.xml configuration file needs to be placed in VoiceGuide's \conf\ subdirectory.

A sample MRCP config file called "sample_mrcp.xml" is placed in the \conf\ subdirectory on installation. It needs to be renamed to mrcp.xml and its entries updated with the correct IP addresses.
Please see the comments in that file.
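
The file preparation step above can be scripted. The following is a minimal Python sketch, not part of VoiceGuide: the function name activate_mrcp_config is hypothetical, and the install path shown in the usage comment is only an assumed example. It copies the sample rather than renaming it, so the original sample stays available for reference. The IP addresses inside mrcp.xml still need to be edited by hand afterwards.

```python
import shutil
from pathlib import Path


def activate_mrcp_config(conf_dir):
    """Create mrcp.xml from sample_mrcp.xml inside conf_dir.

    Hypothetical helper for illustration only. An existing mrcp.xml
    is left untouched so that manual edits are never overwritten.
    """
    conf = Path(conf_dir)
    sample = conf / "sample_mrcp.xml"
    target = conf / "mrcp.xml"
    if not target.exists():
        shutil.copyfile(sample, target)
    return target


# Example call (the directory is an assumption, adjust to your install):
# activate_mrcp_config(r"C:\Program Files\VoiceGuide\conf")
```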


ASR Grammars

To have VoiceGuide recognize speech while a Play module is playing, a grammar associated with that Play module must be defined.
A grammar is defined by creating a text file that contains the grammar for the particular Play module and placing this file in the same directory as the script. The filename format is:


The ModuleTitle identifies which Play module the grammar file is for.

The grammar file contents are read in when the Play module starts.

Other data files can be used to set MRCPv2 parameters and behaviour.
Please contact VoiceGuide Support for more information.


Speech Recognition Engine Responses

When VoiceGuide receives the response from the Speech Recognition Engine it will create the following Result Variables:

$RV[ModuleTitle_ASR_Instance] : Contains the speech recognition engine's response. This is also called the "Semantic Interpretation".
$RV[ModuleTitle_ASR_Input] : Contains the text of what the caller said.
$RV[ModuleTitle_ASR_Confidence] : Contains the confidence level, in the range 0-100, indicating how confident the speech recognition engine is that the recognition is correct.

VoiceGuide first checks whether a path matching the value stored in $RV[ModuleTitle_ASR_Instance] exists.
If a matching path is found then that path is taken.
Next, VoiceGuide checks whether a path matching the value stored in $RV[ModuleTitle_ASR_Input] exists.
If a matching path is found then that path is taken.
Otherwise the Success path is taken if some response was returned, or the Fail path is taken if the recognition attempt was not successful.
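
The path selection order described above can be modelled as follows. This is an illustrative Python sketch only, not VoiceGuide's actual implementation: the function choose_path, its parameters, and the "Success" and "Fail" string labels are hypothetical names. It also assumes exact string matching, and the real matching rules (for example case sensitivity) may differ.

```python
from typing import Iterable, Optional


def choose_path(paths: Iterable[str],
                instance: Optional[str],
                spoken_input: Optional[str],
                response_returned: bool) -> str:
    """Return the label of the path that would be taken.

    paths             labels of the paths defined on the module
    instance          value of $RV[ModuleTitle_ASR_Instance]
    spoken_input      value of $RV[ModuleTitle_ASR_Input]
    response_returned True if the engine returned some response
    """
    available = set(paths)
    # 1. A path matching the semantic interpretation has priority.
    if instance and instance in available:
        return instance
    # 2. Otherwise try a path matching the text of what the caller said.
    if spoken_input and spoken_input in available:
        return spoken_input
    # 3. Fall back to the generic Success or Fail path.
    return "Success" if response_returned else "Fail"
```

For example, with paths labelled "sales" and "support", an engine response whose semantic interpretation is "sales" selects the "sales" path regardless of the exact words spoken. A response that matches neither path falls through to Success, and a failed recognition attempt falls through to Fail.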



© Katalina Technologies Pty. Ltd.