Sapi On Multiple Lines

07/21/2003 03:48 PM

Our Voice Guide script has several instances of TTS. While this works fine in a single-line application, we are attempting to accommodate a multiple-line situation and are looking for the easiest way to configure multiple simultaneous TTS requests.

Any assistance would be greatly appreciated.

Thank you very much.

John Cope

Medusa Research, Inc.

07/22/2003 12:03 AM

Can you forward us the script that you are using? Then we can see what demands your script places on the TTS system.

Do you use TTS for applications like speaking back addresses, etc., or do you retrieve large blocks of description text?

07/23/2003 05:47 PM

We're not necessarily overtaxing the TTS system, as we're not TTSing large blocks of text. Most of it is reading names pulled from a database, but the script will ultimately be running on a machine accommodating 8 telephone lines, so we're concerned about the possibility of multiple callers reaching a TTS module at the same time. How does Voice Guide handle this?

07/24/2003 01:51 AM

It sounds like each of the Text To Speech requests in your application would not take longer than about 3 seconds.

On a fast computer (say current release architecture 2GHz) TTS takes less than 50ms (1/20th of a second) to generate 3 seconds worth of speech. So even if all 8 lines issued a request at the same time, the line which had its TTS request processed last should experience a delay of less than half a second before it could play its TTS file. (see here for some more discussion on this)

Keep in mind though that in real life requests will never arrive *exactly* at the same time, and if all 8 requests arrived in the same one second but 100ms apart then each line would only have to wait its own 50ms before it could start playing - so there will be no delays at all due to multiple requests even though all 8 requests arrived within the same second...
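To make the two cases above concrete, here is a minimal sketch in Python. It assumes a single TTS engine that processes requests one at a time in arrival order and takes 50ms per request; the function name `fifo_waits` is just for illustration and is not part of Voice Guide.

```python
def fifo_waits(arrivals_ms, service_ms=50):
    """Return how long each request waits before its TTS generation starts.

    Assumes one TTS engine that handles requests one at a time, in arrival order.
    """
    waits = []
    free_at = 0  # time at which the engine next becomes idle
    for arrival in sorted(arrivals_ms):
        start = max(arrival, free_at)
        waits.append(start - arrival)
        free_at = start + service_ms
    return waits


# All 8 lines ask at the same instant: the last one queues behind 7 others.
simultaneous = fifo_waits([0] * 8)
# All 8 lines ask within the same second, but 100ms apart: nobody queues.
staggered = fifo_waits([i * 100 for i in range(8)])

print(simultaneous)  # [0, 50, 100, 150, 200, 250, 300, 350]
print(staggered)     # [0, 0, 0, 0, 0, 0, 0, 0]
```

The waits shown are queueing time only; each line still needs its own 50ms of generation before playback can start.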

So how can we estimate the likelihood of delays due to multiple TTS requests?

Well this would depend on how many TTS requests are made while the script is running.

As an example: while the script is running, a TTS request is made every 20 seconds, i.e. 3 seconds out of every 20 seconds are spent playing TTS.

So the probability that at any one time a line is waiting for its TTS request to be completed is 1 in 400 (50ms out of every 20 seconds).

So with all 8 lines running, the probability that one of the other 7 lines issues a TTS request while the first line is generating TTS is 7/400. (This includes TTS requests which would arrive near the end of the first request being processed, which would not affect the delay by much.)

Then the probability that a 3rd line issues a TTS request while the second line is waiting to have its TTS processed is 7/400 * 6/400 = 42/160000

The probability that a 4th line issues a TTS request while two other lines are waiting to have their TTS processed is 42/160000 * 5/400 = 210/64000000

So there is a 0.00000328125 chance that, while there is a TTS request being processed, 3 other lines will ask for their TTS to be processed.

And what happens then?

The second line would wait between 0 and 50ms to get its TTS request started.

The third line would wait between 50ms and 100ms to get its TTS request started.

The fourth line would wait between 100ms and 150ms to get its TTS request started.

So there is a 0.00000328125 chance that a line will have its TTS request delayed by between 100ms and 150ms.

The chance that a 5th line gets delayed by between 150ms and 200ms is even less and so it goes on until we get to the possibility of the 8th line TTS request getting delayed by between 300ms and 350ms (ie about 1/3rd of a second)...

5th line : 150-200ms : 210/64000000 * 4/400 = 840/25,600,000,000

6th line : 200-250ms : 840/25,600,000,000 * 3/400 = 2520/10,240,000,000,000

7th line : 250-300ms : 2520/10,240,000,000,000 * 2/400 = 5040/4,096,000,000,000,000

8th line : 300-350ms : 5040/4,096,000,000,000,000 * 1/400 = 5040/1,638,400,000,000,000,000

5040/1,638,400,000,000,000,000 ... so if all 8 lines on this system were continuously used since the universe began, a caller would have experienced the 300-350ms TTS generation delay maybe a couple of times... and the final question is: would the caller have even noticed such a short delay?
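The chain of multiplications above can be reproduced with a short Python sketch. It only mirrors the estimate used in this thread (each extra queued line multiplies the probability by "remaining lines * 1/400"); the function name and parameters are made up for illustration.

```python
from fractions import Fraction


def delay_band_probabilities(lines, service_ms=50, interval_ms=20_000):
    """Follow the thread's estimate: for each additional line that joins the
    queue, multiply by (lines not yet queued) * (fraction of time spent in TTS)."""
    busy = Fraction(service_ms, interval_ms)  # 50ms out of 20s = 1/400
    bands = []
    p = Fraction(1)
    for queued, remaining in enumerate(range(lines - 1, 0, -1)):
        p *= remaining * busy
        # (lower bound in ms, upper bound in ms, probability)
        bands.append((queued * service_ms, (queued + 1) * service_ms, p))
    return bands


bands = delay_band_probabilities(8)
for low, high, p in bands:
    print(f"{low:>3}-{high:<3} ms : {float(p):.3e}")
```

Using exact fractions avoids rounding in the very small values; the last band for 8 lines comes out as 5040/1,638,400,000,000,000,000, matching the figure above.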

So how does the single queue TTS scale?

Say you have the same app running on a 30 line system. Will the simple $35 single thread TTS SAPI engine be OK, or should you buy a TTS engine which can accommodate several requests at once for a few thousand dollars?

You can calculate all the probabilities in a similar way to above:

Probability a TTS request is delayed between 0-50ms : 29/400

Probability a TTS request is delayed between 50-100ms : 29/400 * 28/400 = 812/160000 = 0.005075

Probability a TTS request is delayed between 100-150ms : 812/160000 * 27/400 = 21924/64000000 = 0.000343

Probability a TTS request is delayed between 150-200ms : 21924/64000000 * 26/400 = 570024/25600000000 = 0.0000223

This means that on a fully utilized 30 line application which plays an average of 3 seconds of TTS on each line every 20 seconds, 99.999% of TTS requests would not get delayed by more than 200ms...
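A rough way to cross-check an estimate like this is to simulate it. The Python sketch below is only an illustration under simplifying assumptions: one TTS engine serving requests in arrival order, 50ms per request, and each of the 30 lines making one request at a random moment in every 20 second window. The function name and the fixed random seed are arbitrary choices.

```python
import random


def simulate_waits(lines=30, windows=2000, interval_s=20.0, service_s=0.05, seed=1):
    """Simulate one FIFO TTS engine; each line makes one request at a random
    moment in every interval. Returns the wait (in seconds) of every request."""
    rng = random.Random(seed)
    arrivals = sorted(
        w * interval_s + rng.uniform(0, interval_s)
        for w in range(windows)
        for _ in range(lines)
    )
    waits = []
    free_at = 0.0  # time at which the engine next becomes idle
    for arrival in arrivals:
        start = max(arrival, free_at)
        waits.append(start - arrival)
        free_at = start + service_s
    return waits


waits = simulate_waits()
delayed = sum(w > 0 for w in waits) / len(waits)
over_200ms = sum(w > 0.2 for w in waits) / len(waits)
print(f"requests that waited at all:     {delayed:.2%}")
print(f"requests that waited over 200ms: {over_200ms:.4%}")
```

The simulation ignores the fact that a line cannot issue a new request while it is still playing its previous TTS, so it should be read as a sanity check rather than an exact model.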

Conclusion:

The standard TTS engine is more than good enough for the average application.

07/24/2003 03:08 PM

Thank you very much for your comprehensive reply. You've made it much easier for us to evaluate the possibility of delay due to TTS.

We appreciate your assistance!

Medusa Research, Inc.

07/02/2008 05:49 PM

We handle TTS a little differently, since the message can go out on one or more lines, up to several thousand calls, and we do not want the TTS engine to build the message for every call.

We process any TTS request first and create a single wav file. We use Cepstral swift.exe which creates the wav file in the voice we want, and saves the wav file to disk. We build the call list and pass the location of the wav file <FilenameToPlay>c:\stt\wav\633491893390150000.wav</FilenameToPlay>

Using this method, there are no delays regardless of the number of calls being placed.
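A minimal sketch of this "generate once, reuse for every call" idea in Python follows. It is not the code used in the post above: the file naming by text hash and the `synthesize` callback are assumptions made for illustration. In a real setup the callback would run the TTS tool (for example swift.exe) and write the audio to the given path.

```python
import hashlib
import tempfile
from pathlib import Path


def get_or_create_wav(text, wav_dir, synthesize):
    """Return the path of the WAV for `text`, generating it only if missing.

    `synthesize(text, path)` is a hypothetical callback that must write the
    audio to `path`, e.g. by running a command-line TTS tool.
    """
    wav_dir = Path(wav_dir)
    wav_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: name the file after a hash of the text so repeats are reused.
    name = hashlib.sha1(text.encode("utf-8")).hexdigest() + ".wav"
    path = wav_dir / name
    if not path.exists():
        synthesize(text, path)
    return path


# Demo with a stand-in synthesizer that records how often it is called.
calls = []


def fake_synthesize(text, path):
    calls.append(text)
    path.write_bytes(b"RIFF")  # placeholder bytes, not real audio


with tempfile.TemporaryDirectory() as tmp:
    first = get_or_create_wav("Hello", tmp, fake_synthesize)
    second = get_or_create_wav("Hello", tmp, fake_synthesize)
    print(first == second, len(calls))  # True 1
```

The returned path is what would be placed in the FilenameToPlay element for every call in the list.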

Jeff

07/02/2008 09:07 PM

Pre-generating the WAV files before the call is the best approach. If this is possible in your application then this approach is recommended.

If you are able to use TTS to pre-generate the .WAV files as in the above example, then the TTS engine will not be used at all during the call itself, which will reduce CPU usage during the call and eliminate all delay.

Keep in mind that if you will be using the TTS generated .WAV file concatenated with other .WAV files, then you will need to ensure that all the concatenated .WAV files are of the same format.
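One possible way to check this before a call is sketched below using Python's standard `wave` module. The helper names are made up for illustration, and note that the `wave` module only reads uncompressed PCM files, so this sketch would not work as-is for u-law or A-law telephony files.

```python
import os
import tempfile
import wave


def wav_format(path):
    """Return (channels, sample width in bytes, frame rate, compression type)."""
    with wave.open(str(path), "rb") as wav:
        return (wav.getnchannels(), wav.getsampwidth(),
                wav.getframerate(), wav.getcomptype())


def mismatched_files(paths):
    """Return the files whose format differs from the first file in `paths`."""
    paths = list(paths)
    if not paths:
        return []
    expected = wav_format(paths[0])
    return [p for p in paths[1:] if wav_format(p) != expected]


def write_silence(path, rate):
    """Demo helper: write a short mono 16-bit silent WAV at the given rate."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * 100)


with tempfile.TemporaryDirectory() as tmp:
    prompt = os.path.join(tmp, "prompt.wav")
    name_8k = os.path.join(tmp, "name_8k.wav")
    name_16k = os.path.join(tmp, "name_16k.wav")
    write_silence(prompt, 8000)
    write_silence(name_8k, 8000)
    write_silence(name_16k, 16000)
    bad_names = [os.path.basename(p)
                 for p in mismatched_files([prompt, name_8k, name_16k])]
    print(bad_names)  # ['name_16k.wav']
```

Any file reported here would need to be converted to the format of the other prompts before being concatenated.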

Guest Cope

SupportTeam

Guest Cope

SupportTeam

Guest Medusa

jewillis

SupportTeam
