Voice Technology Progresses Beyond Siri

    Even the writers of the original Star Trek knew that voice was the ultimate user interface. Half a century later, folks are doing much the same thing as Captain Kirk did when he asked whether the planet below was “Type M” or not.

    Apple’s Siri was the first of the voice-based smartphone interfaces. It has been joined by Google Now and, soon, Microsoft’s Cortana. Pocket-lint compares the three personal assistants in a number of areas. It starts with a general description of each and then covers voice functionality, whether it is possible to type instead of voicing queries, the nature of the search functions, and a number of more specific attributes.

    The conclusion is interesting. Writer Elyse Betters suggests that Google and Microsoft, due to their other assets, have a leg up on Siri. The reality, she writes, is that “no matter how many bells and whistles it may offer, Siri can’t quite take on Cortana and Google Now. At least not right now.” In terms of a comparison of the Google and Microsoft products, it’s close:

    In a nutshell: Cortana and Google Now are equally powerful assistants. Cortana technically has more to offer in the personality department (and let’s not forget that it also offers People reminders), but Google Now is still perfectly capable and just as addictive.

    Microsoft’s Cortana is the new voice in voice interfaces. Indeed, it is not yet generally available. Microsoft Research Distinguished Engineer Larry Heck blogged that the technology today is akin to where search was in the early days of that discipline. Cortana, which is currently in the hands of developers, now deals only with voice and text. However, the goal is to incorporate gesture-based actions as well.  

    The Economist says that voice recognition has improved during the past few years because the statistical approach to guessing what word follows another word has improved. Systems are better at creating a context, which improves their ability to correctly predict the meaning of a sentence.
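
    The idea of guessing which word follows another can be illustrated with a toy bigram counter. This is only a minimal sketch: the corpus, the function names and the candidate words are invented for illustration, and production recognizers rely on far larger statistical models than simple word-pair counts.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def pick_word(follows, prev_word, candidates):
    """Choose the candidate seen most often after prev_word in training."""
    counts = follows[prev_word.lower()]
    return max(candidates, key=lambda w: counts[w])


# Toy corpus, invented for illustration only.
corpus = [
    "recognize speech with a phone",
    "the phone can recognize speech",
    "please recognize speech quickly",
    "we went to wreck a nice beach",
]
model = train_bigrams(corpus)

# Two similar-sounding candidates for the word after "recognize".
best = pick_word(model, "recognize", ["speech", "beach"])
print(best)
```

    Here the word pair counts act as the context: because "speech" follows "recognize" in the toy corpus and "beach" never does, the sketch prefers "speech" when the audio alone is ambiguous.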

    For instance, a person could say, “Name a movie with Gene Hackman and Clint Eastwood.” The system would respond, “Unforgiven.” In the next iteration of voice recognition and machine intelligence, the person would be able to subsequently ask, “Did it get good reviews?” and have the system understand that “it” refers to the movie.
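
    The follow-up question in that example depends on the system remembering what was just discussed. A very rough sketch of that bookkeeping, assuming a hypothetical `DialogContext` class that stores only the last answer and substitutes it for the word "it", might look like this. Real assistants resolve references with much richer language models; nothing here reflects how any named product works.

```python
import re


class DialogContext:
    """Hypothetical helper that remembers the most recent answer."""

    def __init__(self):
        self.last_entity = None

    def remember(self, entity):
        """Store the entity the system most recently named."""
        self.last_entity = entity

    def resolve(self, question):
        """Replace the standalone word 'it' with the remembered entity."""
        if self.last_entity is None:
            return question
        # A function replacement avoids treating the entity as a regex template.
        return re.sub(r"\bit\b", lambda m: self.last_entity, question,
                      flags=re.IGNORECASE)


ctx = DialogContext()
ctx.remember("Unforgiven")  # answer to the first question
resolved = ctx.resolve("Did it get good reviews?")
print(resolved)
```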

    At the Cable Show last week in Los Angeles, I had the opportunity to chat with an executive from Nuance, a big player in the voice sector. The conversation focused on the voice user interface of living room entertainment systems, but the technology is much the same. He said that the ability to have a real conversation with a computer system is progressing.

    The industry is also working on elements such as identifying people by their voice. This can be useful in creating profiles and, for instance, keeping kids from accessing restricted content.

    Carl Weinschenk
    Carl Weinschenk is a long-time IT and telecom journalist. His coverage areas include the IoT, artificial intelligence, drones, 3D printing, LTE and 5G, SDN, NFV, net neutrality, municipal broadband, unified communications and business continuity/disaster recovery. Weinschenk has written about wireless and phone companies, cable operators and their vendor ecosystems. He also has written about alternative energy and runs a website, The Daily Music Break, as a hobby.
