Friday, November 28, 2008

IBM Predicts Talking Web

IBM's annual crystal ball list of Innovations That Will Change Our Lives in the Next Five Years includes a forecast of a voice-enabled talking web. "You will be able to sort through the Web verbally to find what you are looking for and have the information read back to you," the article predicts.
IBM itself has launched several voice-enabled products and initiatives over the years, most notably the WebSphere Voice family of web servers, which adds various voice functionality to its flagship WebSphere platform, leveraging it in areas such as unified messaging and call-center automation.
Some problems exist with a vision such as the one advocated by the article. Speech recognition accuracy and noise filtering have come a long way and may pose only a minor impediment.
The user's desire to speak rather than type or click is another problem. Issuing voice commands in the presence of others may not always be desirable and can be disruptive, for instance at work or on public transport. Lastly, there are usability concerns, beyond the quality of speech technology, when converting a visual 2- or even 3-dimensional representation of information into a 1-dimensional audio stream. The cognitive load increases significantly with tasks more complex than, for instance, obtaining timetable information or finding the nearest Italian restaurant.
The effort behind the vision, to put voice technology to uses beyond call-center automation, is laudable. Mobile internet access and computing on the road may indeed do their part to make this vision come true. And clearly, there are use cases, such as improved accessibility for users with impairments, that on their own merit making the web voice-accessible. Widespread usage of a voice-enabled web, however, may be more than five years off.


enbert said...

Enabling the information on the web to be accessible via a voice user interface sounds really charming to me ... as a first impression.
But on second thought, I know there are too many situations in which I would not use this interface at all. Just imagine the noise an internet café would produce ... You would have to wear headphones all the time ... and checking your mail on public transport could be annoying.
What really would make sense is to have a voice user interface as an additional user interface, side by side with established UIs like the GUI. This could extend the possibilities of applications ... but I see it as a mistake to try establishing a voice-based interface as an alternative for all purposes.

Just my 2 cents


Okko said...

True - an optional voice interface (sort of like a mobile one) is probably a great idea.
Some efforts in this direction exist already, such as VXML+HTML or CSS3.
Most of that is geared toward addressing usability/accessibility problems. However, with speech recognition coming to the iPhone and Android phones, who knows what will happen!