Play Sound File While Generating TTS?

Hello,

 

The ability to play back sound while a script is running is a great feature!

I was wondering if there is a way to do something similar and play back sound while TTS is being generated.

 

Right now the caller hears silence while the TTS engine is working.

 

Thank you.

How long is the text that you are converting?

TTS generation does not usually take that long.

 

Can you post vgEngine/ktTel/ktTts traces capturing the call?

(Please .ZIP up the traces before posting.)

I am using the NeoSpeech Paul TTS engine for the main menu, and there is a delay of about 3 seconds while the TTS engine prepares the WAV for this menu.

 

Click here for my log files.

 

There you'll see exactly how long the delay is.

 

Thank you

It looks like this TTS engine takes longer to fully generate the TTS file when <pitch> or <rate> modifiers are used, but is fairly fast when plain text is used.

Right now there is no built-in way to play 'on-hold' sound while TTS is being generated (there was just no need for it before).

Are you able to run a command-line tool for this TTS engine that generates the WAV sound file? That way you may be able to run the generation from a Run Program module, which can play 'on-hold' sound in the meantime.

 

 

---------------------------------------------------------------------------------------

 

151312.062 4612 fn TtsToWavFile

 

<?xml version="1.0"?>

<voice required="Name=Paul;">

For zmonim in <pitch middle="-3">Lakewood New Jersey</pitch>, press 1.

For zmonim in a different <pitch middle="-2">location</pitch>, press 2.

<silence msec="300"/>to change settings, press 3.

<silence msec="300"/>to make a donation, to <pron sym="k aa n t ae k t ax s">contact us</pron>, <rate speed="-6">or</rate> to get international numbers, press 4.

<silence msec="300"/>to request a zmonim <pron sym="t ae k s m eh s ax jh">text message</pron>, press 5.

<silence msec="300"/><pron sym="t eh">to</pron>repeat the main menu, press star.

<silence msec="400"/>You can return <pron sym="t ih dh ax">to the</pron> main menu at any time, by <pitch middle="-4">pressing star</pitch>

 

151313.515 5824 1 task 2 speak returned. now waiting

151313.765 5824 1 task 2 generation completed

 

1.7 seconds

 

---------------------------------------------------------------------------------------

 

151342.828 4612 fn TtsToWavFile

 

Please enter your PIN.

 

151342.843 5824 1 task 3 speak returned. now waiting

151342.890 5824 1 task 3 generation completed

 

0.06 seconds

 

---------------------------------------------------------------------------------------

 

151349.328 4612 fn TtsToWavFile

 

<voice required="Name=Paul;">

<pitch middle="-2">

Welcome

<pitch middle="-2">to</pitch>

myzmonim

<pitch middle="2">info</pitch> <pitch middle="-3">line</pitch>.

</pitch>

 

151351.046 5824 1 task 4 speak returned. now waiting

151351.125 5824 1 task 4 generation completed

 

1.8 seconds

 

---------------------------------------------------------------------------------------

 

151355.531 4612 fn TtsToWavFile

 

<?xml version="1.0"?>

<voice required="Name=Paul;">

For zmonim in <pitch middle="-3">Gateshead</pitch>, press 1.

For zmonim in a different <pitch middle="-2">location</pitch>, press 2.

<silence msec="300"/>to change settings, press 3.

<silence msec="300"/>to make a donation, to <pron sym="k aa n t ae k t ax s">contact us</pron>, <rate speed="-6">or</rate> to get international numbers, press 4.

<silence msec="300"/><pron sym="t eh">to</pron>repeat the main menu, press star.

<silence msec="400"/>You can return <pron sym="t ih dh ax">to the</pron> main menu at any time, by <pitch middle="-4">pressing star</pitch>

 

151357.031 5824 1 task 5 speak returned. now waiting

151357.375 5824 1 task 5 generation completed

 

1.8 seconds

 

---------------------------------------------------------------------------------------

 

151403.984 4612 fn TtsToWavFile

 

<?xml version="1.0"?>

<voice required="Name=paul;">

Please enter A 5-digit postal code, a 2-digit area code, <rate speed="-5">or</rate> A location I.D. number, followed by the pound sign.

 

151405.546 5824 1 task 6 speak returned. now waiting

151407.328 5824 1 task 6 generation completed

 

3.3 seconds

 

---------------------------------------------------------------------------------------

 

151413.546 4612 fn TtsToWavFile

 

<?xml version="1.0"?> <voice required="Name=paul;"> The location you typed <voice required="Name=Mike16;"><spell>11559</spell></voice>, could not be found. Please try <pitch middle="-4">again</pitch>.

 

151415.093 5824 1 task 7 speak returned. now waiting

151415.687 5824 1 task 7 generation completed

 

2.1 seconds

 

---------------------------------------------------------------------------------------

 

151424.906 4612 fn TtsToWavFile

 

<?xml version="1.0"?>

<voice required="Name=paul;">

Please enter A 5-digit postal code, a 2-digit area code, <rate speed="-5">or</rate> A location I.D. number, followed by the pound sign.

 

151426.453 5824 1 task 8 speak returned. now waiting

151426.531 5824 1 task 8 generation completed

 

1.6 seconds

 

---------------------------------------------------------------------------------------

 

151435.796 4612 fn TtsToWavFile

 

<?xml version="1.0"?> <voice required="Name=Paul;"> Here are the zmonim for <pitch middle="-2">Beitar Illit</pitch>.<silence msec="100"/>It's now 3:14. <rate speed="-2"> mihn<pron sym="h ah">cha</pron> <pron sym="g ah d ow l ax">Gedola</pron></rate>, was at 1:19. <rate speed="-3"> shki<pron sym="y aa"/></rate>, will be at 7:53. <rate speed="-5">| <pron sym="t s ey s s">Tzeis</pron></rate>three stars, 8:31. <rate speed="-5">| <pron sym="t s ey s s s">Tzeis</pron></rate>72 minutes, 9:06. Zmonim for <pitch middle="-4">tomorrow</pitch>. <rate speed="-3">alow<pron sym="s s"/></rate>degrees, 4:09. <rate speed="-3">alow<pron sym="s s"/></rate><silence msec="50"/><pitch middle="10">fixed</pitch>, 4:25. zmon<pron sym="t s ih t s ih s s">Tzitzis</pron><silence msec="100"/><pron sym="uw t t f ih l ih n">U'Tfillin</pron>, 4:38. <rate speed="-5">naitz</rate>, 5:37 and 06 seconds. <pron sym="sh ax m aa">Shema</pron><pron sym="m ah g ax n ax v r ah m">Magen Avraham</pron> Degrees, 8:26. <pron sym="sh ax m aa">Shema</pron><pron sym="m ah g ax n ax v r ah m">Magen Avraham</pron> Fixed, 8:33. <pron sym="sh ax m aa">Shema</pron>Graw and <pron sym="b ao l ih t ao n y ax">Baal Hatanya</pron>, 9:09. zmon <rate speed="-3">tfeelah</rate>, 10:20. , 12:42. <rate speed="-2"> mihn<pron sym="h ah">cha</pron> <pron sym="g ah d ow l ax">Gedola</pron></rate>, 1:19. . <silence msec="300"/>To <rate speed="-3">repeat</rate>, press 1. <pron sym="t uh">to</pron>return <pitch middle="-3">to</pitch> the main menu, press star. <pron sym="t ax">to</pron>end your call, <pitch middle="-4">just hang up</pitch>.

 

151437.343 5824 1 task 9 speak returned. now waiting

151438.656 5824 1 task 9 generation completed

 

2.9 seconds

 

---------------------------------------------------------------------------------------
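
The per-task figures above are just the difference between the "fn TtsToWavFile" timestamp and the matching "generation completed" timestamp. A minimal Python sketch of that arithmetic follows, assuming the trace timestamps are in HHMMSS.mmm form and that a task does not run across midnight. The function names are made up for this illustration and are not part of VoiceGuide.

```python
def trace_seconds(stamp: str) -> float:
    """Convert a trace timestamp such as '151312.062' to seconds since midnight.

    Assumes the layout is HHMMSS.mmm (an assumption based on the traces above).
    """
    hours = int(stamp[0:2])
    minutes = int(stamp[2:4])
    seconds = float(stamp[4:])
    return hours * 3600 + minutes * 60 + seconds


def generation_time(started: str, completed: str) -> float:
    """Seconds between the TtsToWavFile call and 'generation completed'."""
    return round(trace_seconds(completed) - trace_seconds(started), 3)


if __name__ == "__main__":
    # Task 2 and task 6 from the traces above.
    print(generation_time("151312.062", "151313.765"))
    print(generation_time("151403.984", "151407.328"))
```

Running the same calculation over every task in a trace makes it easy to see which prompts are slow to generate.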

Another workaround suggestion:

 

Are you perhaps able to pre-generate some of the various sound files used in this system and then just concatenate the existing .wav files when playing the prompts? This may be a workaround for the prompts that do not differ much from call to call.

 

Can you use Say Number modules to speak back numbers, etc.?
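
If you would rather join the pre-generated pieces into one file outside of VoiceGuide, here is a rough Python sketch of the concatenation idea using the standard `wave` module. It assumes all parts are uncompressed PCM WAV files with the same channel count, sample width and sample rate; mu-law or A-law telephony files would need a different tool. The function name is only for illustration.

```python
import wave


def concat_wavs(parts, out_path):
    """Join PCM WAV files that share one audio format into a single file."""
    params = None
    chunks = []
    for path in parts:
        with wave.open(path, "rb") as src:
            fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                # Mixing formats would produce garbled audio, so refuse.
                raise ValueError(f"{path}: format {fmt} differs from {params}")
            chunks.append(src.readframes(src.getnframes()))
    if params is None:
        raise ValueError("no input files given")
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        for chunk in chunks:
            dst.writeframes(chunk)
```

A call such as `concat_wavs(["intro.wav", "city.wav", "outro.wav"], "menu.wav")` (hypothetical file names) would then produce one prompt file to play.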

How fast is the system used here? What is the CPU/memory/OS? Is anything else running on the system apart from VoiceGuide?

System specs: Pentium 4 CPU, 2.20 GHz, 1.5 GB RAM, Windows XP Pro.

Nothing else is running besides VoiceGuide.

 

I cannot use Say Number modules because the dynamic content is more than just numbers, and sometimes involves names too.

Concatenating dynamic WAV with static WAV on the fly seems like a great idea, but I've never done something like this before. Does VoiceGuide come with any sample scripts that show you how to concatenate?

Using a command-line tool for generating TTS is also an interesting idea; I will look into it.

"Concatenating dynamic WAV with static WAV on the fly seems like a great idea, but I've never done something like this before. Does voiceguide come with any sample scripts that show you how to concatenate?"

Just use a chain of Play modules, some playing static WAV files and some doing short, simple TTS.

 

Pre-generating would be the better approach though.
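
As a rough illustration of pre-generating, here is a Python sketch that keeps one WAV per distinct prompt text, so the slow TTS step only runs the first time a given text is needed. The `synthesize(text, wav_path)` callable is hypothetical: it stands in for whatever command-line tool or API your TTS engine offers, and nothing here is a VoiceGuide function.

```python
import hashlib
from pathlib import Path


def cached_prompt(text, cache_dir, synthesize):
    """Return the WAV path for `text`, generating it only on a cache miss.

    `synthesize(text, wav_path)` is a placeholder for the real TTS call.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Name the file after a hash of the exact prompt text (markup included),
    # so any change to the text produces a new file.
    key = hashlib.sha1(text.encode("utf-8")).hexdigest()
    wav_path = cache_dir / f"{key}.wav"
    if not wav_path.exists():
        # Write to a temporary name first so a half-written file is never played.
        tmp_path = wav_path.with_suffix(".tmp")
        synthesize(text, tmp_path)
        tmp_path.replace(wav_path)
    return wav_path
```

Prompts that do not change from call to call (such as the main menu for a given location) could be generated this way ahead of time, leaving only the short dynamic parts for live TTS.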
