

Text to speech

Is anyone using TTS functionality with CS? We have implemented the speech-to-text and text-to-speech options in the PHP versions of our bots. Speech to text works really well. TTS, on the other hand, is buggy. The first time our bots speak, they speak in one gender, then for the next answer they switch.

Basically they seem to be using the Female UK voice for the first answer then switch to the Male UK voice for the second, and vice versa.

Anyone using TTS routinely? We would like to use some voices other than the UK versions as well.


Doug Danforth


  [ # 1 ]

I’m guessing from the context that you are using the built-in Windows voices? I think you might get better results with a cloud service like Amazon’s Polly, one of the open source frameworks like Festival or OpenMary, or a hybrid like iSpeech.


  [ # 2 ]

A company that I’ve used is

Worth looking at.

Reasonably priced, good range of voices and the SSML is a nice touch.


  [ # 3 ]

Thanks for the responses. I am trying to use the PHP files that are included with ChatScript. They access some Google voices, but the choices seem pretty limited. I am open to other TTS options and will investigate the suggestions above.

Anyone using IBM Watson TTS? I have a Bluemix account and we are trying to get it working with ChatScript and our Virtual Patients in Unity3D.


  [ # 4 ]

I’m curious if you have made any progress on this? I’m really interested in an embedded, non-web-based text-to-speech to use with ChatScript.



  [ # 5 ]

Yes, we have implemented two TTS versions of ChatScript.

One uses a Unity3D exe approach and the second is a Unity-based iPad version. Both use Watson TTS and STT. CS is not embedded in either platform. The Unity application uses the hardware (laptop or iPad) microphone to capture the utterance and sends it to Watson for STT. Watson returns the text, which gets sent to CS running on a local or remote server. CS matches the input and returns the answer text, which gets routed to Watson for TTS, and the bot then speaks the answer.
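
For anyone trying to reproduce that loop outside Unity, here is a rough Python sketch of one turn (audio in, STT, CS, TTS, audio out). It is not our Unity code. The STT and TTS steps are passed in as plain functions so that Watson or any other service can be plugged in. The function names, the "student" user name, and the default host are all made up for illustration. The CS part assumes the usual ChatScript server convention of three NUL-terminated strings (user, bot, message) on a TCP port such as 1024, so check that against your own server setup.

```python
import socket
from typing import Callable


def chatscript_payload(user: str, bot: str, message: str) -> bytes:
    """Encode one volley as three NUL-terminated strings (assumed wire format)."""
    return b"".join(part.encode("utf-8") + b"\x00" for part in (user, bot, message))


def ask_chatscript(message: str, user: str = "student", bot: str = "",
                   host: str = "localhost", port: int = 1024) -> str:
    """Send one message to a CS server and return its reply text.

    Assumes the server answers and then closes the connection.
    An empty bot name is assumed to select the server's default bot.
    """
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(chatscript_payload(user, bot, message))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the socket: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace").strip("\x00 \r\n")


def run_turn(audio: bytes,
             speech_to_text: Callable[[bytes], str],
             chatbot: Callable[[str], str],
             text_to_speech: Callable[[str], bytes]) -> bytes:
    """One conversational turn: microphone audio in, spoken answer out."""
    utterance = speech_to_text(audio)   # e.g. a Watson STT call
    answer = chatbot(utterance)         # e.g. ask_chatscript
    return text_to_speech(answer)       # e.g. a Watson TTS call
```

Passing the three stages in as functions also makes it easy to swap in stubs and test the routing without a live speech service or CS server.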

There is some other voodoo in the loop and we record all audio to do some NLP research. That slows things down a bit but not that much. We have to build in a delay anyway to “guess” when the student is done speaking. Long pauses make for some pretty comical conversations since CS interprets the partial inputs as separate questions.
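
To illustrate the "guess when the student is done" problem, here is a small Python sketch that glues STT fragments together unless the silence between them exceeds a threshold. It is a simplified offline illustration, not what runs in our app. The fragment format (start time, end time, text) and the 1.5 second default are assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

Fragment = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def group_fragments(fragments: Iterable[Fragment], max_pause: float = 1.5) -> List[str]:
    """Merge time-ordered STT fragments into utterances.

    A new utterance starts only when the gap since the previous
    fragment ended is longer than max_pause seconds.
    """
    utterances: List[str] = []
    current: List[str] = []
    last_end = None
    for start, end, text in fragments:
        if current and last_end is not None and start - last_end > max_pause:
            utterances.append(" ".join(current))  # pause was long: close the utterance
            current = []
        current.append(text.strip())
        last_end = end
    if current:
        utterances.append(" ".join(current))
    return utterances


fragments = [
    (0.0, 1.0, "I have had"),
    (1.8, 2.5, "chest pain"),
    (5.0, 6.0, "since Monday"),
]
print(group_fragments(fragments, max_pause=1.5))
# ['I have had chest pain', 'since Monday']
```

With a short threshold the speaker's hesitation splits one question into several inputs, which is the comical case described above. A longer threshold keeps the question together but makes the bot slower to answer.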

We have not really pursued the PHP approach since we got the other approach to work pretty well.


