Tuesday, December 29, 2009

Bye Bye 2009, new URL

Good-bye 2009. With you I will close the doors on this blog. At least at its current URL.

After precious few posts in the past months, in part due to parenthood, in part due to facing new challenges in academia (both are challenging in their own ways, of course), I have decided to move everything to a new location. I will continue to talk about "news" in speech and language tech; however, since these are hard to come by (at least the really good stuff), I want to add a bit of a personal note. More howtos, gotchas, what's-Okko-up-to. Less whooo-look-who-is-talking-about-speech. I'm sure you know what I mean.

With that, have a happy start to 2010 and please continue to follow at www.okkoblog.com.


Saturday, July 11, 2009

Speech and Dialog Conferences / Speech for iPhone and Android

Conference time: I will be spending a couple of days in London and Brighton from September 5th, attending Interspeech and SIGDIAL as well as a researcher round-table. If you are interested in meeting up, feel free to get in touch.

Also, here is some more or less recent, interesting news for Android (at about 6:20, thanks Schamai) and iPhone speech developers.

Thursday, June 18, 2009

Incrementality in Verbal Interaction

Since I joined a research program at Potsdam University at the end of last year (as a researcher and PhD student), I've decided to use this blog for some additional, more personal updates. This is the first :-).

Our research is concerned with human-machine spoken dialog systems from an incremental, i.e. real-time processing, perspective. As such, members of our team, including me, were recently invited to a workshop on "Incrementality in Verbal Interaction." The workshop brought together an interesting mix of perspectives on incrementality from Psycholinguistics as well as Theoretical and Computational Linguistics. Slides from our project presentation are available here.

Thursday, April 2, 2009

Tim O'Reilly: Google Voice Search Key Technology

ReadWriteWeb reports that Tim O'Reilly addressed attendees at the San Francisco Web 2.0 Expo this week, talking about key technologies for the Web >2.0. Voice search (Google iPhone App), he claimed, was a tipping point in terms of "sensor-based interfaces".

While not the only vendor to provide voice search (e.g. Yahoo oneSearch, powered by Vlingo), Google certainly seems ahead of the game in what appears to be the gradual unfolding of a broad voice strategy, with moves such as Voice Search and the recent rebranding of a feature-enhanced GrandCentral as Google Voice. Future work we can expect on the voice front includes promotion of its own speech recognition capacities through Android, Google Gears bringing speech capacities to all browsers, tighter integration of Gaudi (audio indexing) with other services, and perhaps one day opening up voice services over APIs.

As I've previously pointed out, to Google, voice is just another form of data. What is slowly beginning to emerge, however, is a central role for speech and voice technologies in coming developments for the web and in how we search and interface with it.

Wednesday, April 1, 2009

Language Technology April Fools

Just posting some gems from today concerning speech and language technology, such as natural language generation, speech recognition and natural language processing.

Have you found any others?

Thursday, February 26, 2009

Kindle Speech Synthesis

News about speech and language technology tends to be an in-industry affair, interesting largely to those who need and use it on a daily basis or those who produce (develop or market) it. Every so often, however, mainstream news surfaces that raises issues of broad interest. Google's efforts with speech recognition are an example of this. Last month, Amazon's Kindle 2 e-book reader created a buzz with its text-to-speech "audio book" functionality.

The underlying issue is that Amazon is selling e-books, which can be listened to using speech synthesis, without owning the rights to produce audio book versions. The Authors Guild argues that this undermines the lucrative audio book market. While it is debatable whether a synthesized voice is comparable to the experience of listening to a well-produced audio book, Amazon decided not to fight this one out.

What do you think? Can synthesized audio books provide an experience comparable to real voice productions?

Monday, February 16, 2009

Microsoft Recite Preview - Note Dictation and Voice Search

Ars Technica reports today on the release of Microsoft Recite "Technology Preview" for Windows Mobile. The application lets users record short notes as audio snippets, which can later be searched for content by speaking key words. Apparently it does not use speech recognition but rather simpler pattern matching, meaning the notes cannot be searched in text form. On the other hand, it may work more robustly, and it eliminates the effort of training for speaker independence.
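For the curious, here is a rough idea of what "pattern matching instead of recognition" could look like. I don't know what Recite actually does under the hood, so this is only a sketch of one classic technique, dynamic time warping (DTW), under a few assumptions: each note and each spoken query has already been turned into a sequence of acoustic feature vectors (the one-dimensional toy numbers below stand in for real features), and the function names `subsequence_dtw` and `rank_notes` are made up for illustration.

```python
import math


def subsequence_dtw(query, note):
    """Cost of the best match of `query` anywhere inside `note`.

    Both arguments are sequences of equal-length feature tuples.
    Lower cost means a better match; 0.0 is a perfect match.
    """
    if not query or not note:
        return float("inf")
    n, m = len(query), len(note)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # The query may start at any frame of the note without penalty.
    for j in range(m + 1):
        cost[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], note[j - 1])
            # Allow stretching or compressing either sequence in time.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # The query may also end at any frame of the note.
    return min(cost[n][1:])


def rank_notes(query, notes):
    """Return (cost, name) pairs for all notes, best match first."""
    return sorted((subsequence_dtw(query, feats), name)
                  for name, feats in notes.items())


# Toy data: hypothetical 1-D "features", not real audio.
notes = {
    "groceries": [(0,), (0,), (1,), (2,), (2,), (3,), (0,)],
    "meeting": [(5,), (5,), (4,), (5,)],
}
query = [(1.0,), (2.0,), (3.0,)]
best_cost, best_name = rank_notes(query, notes)[0]
print(best_name, best_cost)
```

Nothing is ever converted to text here: the spoken key word is compared directly against the recorded audio, which is why such notes could not be searched by typing, and why no recognizer needs to be trained.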

While not a full product yet, this sounds like a nifty little application for cognitive off-loading.

Have you tried Microsoft Recite?