Thursday, February 26, 2009

Kindle Speech Synthesis

News about speech and language technology tends to be an in-industry affair, interesting largely to those who need and use it on a daily basis or those who produce (develop or market) it. Every so often, however, mainstream news surfaces that raises issues of broad interest. Google's efforts with speech recognition are an example of this. Last month, Amazon's Kindle 2 e-book reader created a buzz with its text-to-speech "audio book" functionality.

The underlying issue is that Amazon is selling e-books, which can be listened to using speech synthesis, without owning the rights to produce audio book versions. The Authors Guild argues that this undermines the lucrative audio book market. While it is debatable whether listening to a synthesized voice is comparable to the experience of listening to a well-produced audio book, Amazon decided not to fight this one out.

What do you think? Can synthesized audio books provide an experience comparable to real voice productions?

Monday, February 16, 2009

Microsoft Recite Preview - Note Dictation and Voice Search

Ars Technica reports today on the release of Microsoft Recite "Technology Preview" for Windows Mobile. The application lets users record short notes as audio snippets, which can later be searched for content by speaking key words. Apparently it does not use speech recognition but rather simpler pattern matching, meaning the notes cannot be searched in text form, but the search may work more robustly and avoids the training effort that speaker independence would require.

While not a full product yet, this sounds like a nifty little application for cognitive off-loading.
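To illustrate how audio can be searched without ever being converted to text, here is a minimal sketch of one classic pattern-matching technique, dynamic time warping (DTW). This is an assumption for illustration only: nothing in the report says which technique Recite actually uses. The function names, the note labels and the feature values are all made up, and a real system would compare multi-dimensional acoustic features and spot keywords inside longer recordings rather than match whole snippets.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_match(query, notes):
    """Return the label of the stored note whose features are closest to the query."""
    return min(notes, key=lambda name: dtw_distance(query, notes[name]))


# Toy per-frame energy values standing in for real acoustic features (hypothetical).
notes = {
    "buy milk": [1, 3, 4, 3, 1],
    "call bob": [5, 5, 1, 1, 5],
}
query = [1, 1, 3, 4, 4, 3, 1]  # same contour as "buy milk", spoken more slowly

print(best_match(query, notes))
```

Because DTW stretches or compresses the time axis, the slower query still aligns with its stored note. The labels here only identify the snippets; the matching itself never produces text, which is why such notes could not be searched in written form.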

Have you tried Microsoft Recite?




Sunday, February 8, 2009

More speech on the iPhone

The iPhone has proved a game-changer in many regards and speech is no exception. Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone.
Today I came across another sighting of iPhone speech recognition: Vocalia by Creaceed, which uses the open-source ASR engine Julius as its back-end technology. There is no "push to talk" button but a "shake to retry" gesture, which may prove useful when recognition goes awry. The app supports French, English and German for now and costs €2.99. Dictation is not available at this point, though Julius is certainly capable of it from an architectural point of view.

Other speech and language related iPhone apps:


Has anyone used these extensively? What is your experience with speech on the iPhone?

Monday, February 2, 2009

Zumba Lumba - iPhone killer or simply a hoax?

A no-frills phone with the unlikely name of Zumba Lumba has recently received some attention from the BBC. The phone is said to be top-secret, developed by a defense-aviation company. It does without frills like a camera or an applications platform, but touts some interesting security and computational features (not only) related to speech technology:

  • Cloud computing - the phone uses no local storage for contacts or other data.
  • Network speech recognition - user input is recognized over the internet. This should avoid hardware-intensive local computing for voice input, but requires internet access.
  • Voice identification - enhanced security, because the phone will only respond to a single user's voice.
Some seem to think this is a potential iPhone killer, at least in terms of making use of innovative input modalities (though Google has already released a speech recognition app for the iPhone). Others simply think it's a hoax.

Either way, the idea of joining mobile with cloud computing is interesting. Using voice identification for security has its appeal as well, even if it's unclear whether keeping data in the cloud and sending voice data over the internet is any more secure than simply keeping data locally on your phone.