Monday, April 9, 2007

Web 3.0 and Natural Language Processing

Web 3.0 is getting some buzz in the blogosphere. Like Web 2.0, it raises a question that was recently put to readers: what is it? This time around, however, things seem a bit easier.

Web 2.0 seems to be happy with being vaguely defined (delimited may be a better term) and with being equally a social and a technological movement. Web 3.0 clearly hovers over the idea of the "Semantic Web", a term coined by Tim Berners-Lee, in which richly marked-up hypertext and data allow for novel, more meaningful human-machine and machine-machine communication. Radar Networks (currently in stealth mode) claim to be driving some interesting developments in this direction and are followed closely by those interested.

This has already raised some questions: Will content require expensive hand labor, or can it be bootstrapped by machine? What new privacy policies will we have to live with? How does one separate style and content? What are the alternatives to RDF?

Sadly, there's very little inspiring out there about potential applications.

My question (though not uniquely mine) to add to this: What role will natural language processing play in this (i.e. how "semantic" is this talk of Semantics)? Semantic content in RDF appears to be little more than a means for one machine to tell another who authored a particular book or what the postal codes in the greater Boston area are. Semantics, to me, is just as much about intentions ("Why is web-service A dispensing such information?") and about interpreting such information for the purposes of action ("What can web-service B - or my browser or I - do with it?").
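To make the point concrete, here is a rough sketch of what such machine-to-machine statements amount to. It is plain Python rather than real RDF tooling, and the `ex:` identifiers, the predicate names, and the sample values are all made up for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement, the basic shape of an RDF fact."""
    subject: str
    predicate: str
    obj: str


# Hypothetical data: the "ex:" names below are illustrative, not a real vocabulary.
triples = [
    Triple("ex:book/moby-dick", "ex:author", "Herman Melville"),
    Triple("ex:area/greater-boston", "ex:postalCode", "02108"),
    Triple("ex:area/greater-boston", "ex:postalCode", "02139"),
]


def objects(store, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]


print(objects(triples, "ex:book/moby-dick", "ex:author"))
print(objects(triples, "ex:area/greater-boston", "ex:postalCode"))
```

A lookup like this can answer "who wrote it?" or "which postal codes?", but nothing in the statements says why they are being offered or what a receiver should do with them.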

Perhaps this misses the mark and "semantic" here really isn't about natural language. But there is a weaker, more realistic form of this "language and technology" concern: Insofar as semantics is just information, can it be bootstrapped by a machine (perhaps even by one that is linguistically informed rather than purely statistical)?