Who are the real vendors (not resellers) of voice-to-text?

It seems like well guarded information, so what are the rumors?
- I hear what sounds like a firm rumor that Apple’s Siri uses Nuance.
- IBM is a vendor, and they are not the kind of company to license it from someone else.
- Any rumors about Assistant, Cortana, and Alexa?
Since these are younger companies and voice-to-text is a pretty hard problem that requires an extended history of research, I cannot see how Google, Microsoft, or Amazon had the time to develop the technology. I see the comment “it is easy now with neural nets” and believe that to be nonsense. I might be wrong.

Also, Amazon is basically a catalogue, so the kinds of speech it needs to understand are much more limited than for Siri, etc. People coming back from CES are saying Alexa does a better job recognizing language, and I want to know: is this because they have a better underlying voice-to-text capability, or because it is better integrated into household devices and more successful because of how it is deployed, focused, and marketed, with easier applications?


I guess it depends on the OS, Peter. For Windows 7+ systems, the built-in voice recognition has been pretty good, though it takes a while to properly “train” it to get a decently low failure rate. I have no liking for Cortana, as I feel it’s far too invasive with my current Windows 10 setup, so I use Nuance’s Dragon software when I need to dictate something. As for *nix and Mac, I have no clue, other than the voice recognition built into Google Chrome, which I use very rarely (I DO have a custom web page on my local/dev servers that allows me to “trick” Chrome into taking dictation for me, but it’s so prone to errors that it’s often better to just type). Beyond that, I haven’t the foggiest.


Thanks Dave, good to know. I work in Windows. I find packages from Nuance priced the same as, for example, Visual Studio from Microsoft. I did not know v2t was built into Windows. I should poke around on my own PC! I also see things that look good at first, but a closer look shows a system that recognizes questions and a few relational adjectives.

I also see Google charging a lot and IBM charging a little. My large behemoth employer uses antiquated customer service mechanisms and that is an opportunity, even for the simplest keyword-based call forwarding. If I were younger, acquiring new skills would be more exciting :) But in any case, there are a lot of potential customers for call automation - even though it is not a complete language solution such as you are interested in.
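To make "the simplest keyword-based call forwarding" concrete, here is a minimal Python sketch. It assumes the caller's speech has already been turned into text by whatever voice-to-text engine is in use; the department names, keyword lists, and the `route_call` function are all made up for illustration and are not taken from any real product.

```python
import re

# Hypothetical keyword table: department -> words that suggest it.
# These names and keywords are illustrative assumptions only.
ROUTES = {
    "billing": {"bill", "invoice", "payment", "charge"},
    "support": {"broken", "error", "help", "problem"},
    "sales": {"buy", "price", "quote", "upgrade"},
}
DEFAULT_ROUTE = "operator"  # fallback when no keyword matches


def route_call(transcript: str) -> str:
    """Return the department whose keywords occur most often in the transcript."""
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Count keyword hits per department.
    scores = {dept: sum(w in kws for w in words) for dept, kws in ROUTES.items()}
    # On a tie, the department listed first in ROUTES wins.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_ROUTE


if __name__ == "__main__":
    print(route_call("I have a question about my invoice and a payment"))
```

Something this crude only needs the recognizer to get a handful of words right, which is part of why call automation is a much easier target than full language understanding.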


I’ve no idea how old you are, Peter, but I’ll be 56 in a couple of months, and I’m still eagerly learning things. In fact, I’m currently a student at Western Nevada College here in Carson City, going for a degree in Graphic Communications, and having the time of my life in the process. :D You’re never too old to learn! :)


I agree with Dave. I’m only a few months younger than Dave but I can never stop learning either.


You are right. Learning how language works and trying to capture its essence in software is of great interest to me, but learning how call centers work, how to integrate software into a telephony system, and things like latency when using the cloud... I probably would not have enjoyed these at any age.

I wish I had a team with the right combination of interests. Trying to develop an automated phone assistant for my employer as a Python “skunk” project won’t be easy inside an already busy C++ schedule. So a certain amount of whining and sighing should be expected.


lol I get that, and won’t take offense at a bit of whinging from time to time. :)


Note: apparently Google IS rolling its own speech recognition - claiming to have slashed the error rate by 30% AND using neural nets.


@Peter that’s so true, there isn’t time to learn all the things that I want to learn, never mind the things that might be useful but not very interesting.


