April 24, 2008

Semantic Web W3C Track at WWW2008

Yesterday I chaired a Semantic Web session at the W3C Track at WWW2008. Nice turnout (about 100 people), and I had to cut the discussions to keep within schedule, which is always a good sign…

Three presentations, fairly different from one another. Tom Heath and Chris Bizer made a presentation (co-authored with Tim Berners-Lee) on the Linking Open Data project. Really good stuff. Maybe the most impressive part was when Chris flipped through the figures on the “current” status of the linked dataset, starting from a year ago at WWW2007 up to April 2008. And the fact that, actually, we have essentially lost track of how many triples are out there; there are simply too many of them! I also did not know that Tom's work on Revyu automatically adds information coming from DBpedia to an entry. I really hope that the coming year will see lots of user applications that rely on this huge amount of public RDF data out there…

Raphaël Troncy made a presentation on managing multimedia content on the Semantic Web. The situation today is really a maze, with all kinds of standards, semi-standards, etc., on how to describe, annotate, and reason about, say, video. Lots of work ahead, both in the Semantic Web area and in others. Think of the fact that we still do not have a generally accepted URI to describe something like an area in an image, or a specific point in time in a video. (There was, actually, a short discussion after the presentation on how some of the current URI schemes fit, or do not fit, the general Web Architecture…)

Huajun Chen gave an overview of what is happening in the Semantic Web area in China. In two words: a lot. Some of the technologies developed in China are now well known everywhere, some less so. We should realize that there are more Semantic Web related blogs and subscribers to local mailing lists there than anywhere else… I think one of the challenges is to bind the various SW communities across language boundaries, where Chinese is probably the largest “local” community. I do not have any magic bullet here, but presentations like Huajun’s are important to have…

  2. Ivan, thanks for yet another very useful event write-up!
    One question that comes to mind after looking at the Linking Open Data presentation: they seem to live in an entirely schema-less world…? Wouldn’t there be any value in exploiting schemas/ontologies to improve the linking between the datasets (which is currently quite sparse)? The whole notion of schemas didn’t appear once in their presentation…

    Comment by Frank van Harmelen — June 13, 2008 @ 0:43

  3. Hi Frank and Ivan,

    We are not living in an entirely schema-less world, but we are for sure more relaxed about schemata than many people with an ontology engineering background.

    There is a general tendency in the LOD community to reuse as many existing properties and classes from well-known vocabularies like FOAF, DC, SKOS or SIOC as possible. So the usual procedure is to start with these terms and only define additional dataset-specific terms if no fitting terms are found in well-known vocabularies.

    The schemata of many datasets within the LOD cloud are defined using RDF-S (hardly any OWL) and there are also some datasets in the cloud that do not even define subclass/subproperty relationships, but are still considered useful by many applications (for instance DBpedia).

    What is, in contrast, considered very important in the LOD community is that properties and classes can be looked up on the Web. Many applications use this mechanism to retrieve labels for properties. Some brave applications (for instance the FalconS search engine) also do subclass inferencing on Web data after applying some trust heuristics.

    I personally have the feeling that the usefulness of Linked Data does not depend too much on the capability to do sophisticated inferences on top of the data, but more on the possibility to discover related data by following RDF links between data sources.

    I don’t understand your proposal about using schemata/ontologies to improve the linking between the datasets. How do schemata help to find out that two records in different databases talk about the same thing? I think only very indirectly, through a combination of schema mapping and identity resolution algorithms?

    Domains where more sophisticated schema-level information would certainly be very useful in the Linked Data setting are, in my opinion, data visualization, schema mapping and data fusion.

    It would be great if schema authors started annotating their schemata with hints on how instances should be rendered by Semantic Web clients. The Fresnel display vocabulary, for instance, provides for this.

    It would also be great if schema authors started to publish more mappings between their terms and existing terms from other schemata on the Web. The Neologism vocabulary publishing tool is already a good first step in this direction, as it provides for defining subclass and subproperty links between different vocabularies.

    Let’s also hope that the new RIF rule language will provide for publishing more sophisticated schema/term mappings on the Web.



    Comment by Chris Bizer — June 13, 2008 @ 16:22
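
As a rough illustration of the subclass inferencing and cross-vocabulary subclass links that Chris mentions, here is a minimal sketch in plain Python. It is not how FalconS, Neologism, or any real application works: triples are represented as simple tuples of strings, the `infer_types` function is a made-up helper, and `ex:Musician` and `ex:alice` are hypothetical terms invented for the example. No trust heuristics are applied.

```python
# Minimal sketch: derive extra rdf:type triples from rdfs:subClassOf links.
# Triples are plain (subject, predicate, object) tuples of strings.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer_types(triples):
    """Return the input triples plus the rdf:type triples entailed by subclass links."""
    # Map each class to its directly declared superclasses.
    parents = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents.setdefault(s, set()).add(o)

    def ancestors(cls):
        # Walk the subclass hierarchy upwards; 'seen' also guards against cycles.
        seen = set()
        stack = [cls]
        while stack:
            current = stack.pop()
            for parent in parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        if p == TYPE:
            for ancestor in ancestors(o):
                inferred.add((s, TYPE, ancestor))
    return inferred


triples = {
    # Hypothetical dataset-specific class, linked to a well-known FOAF class.
    ("ex:Musician", SUBCLASS_OF, "foaf:Person"),
    ("foaf:Person", SUBCLASS_OF, "foaf:Agent"),
    # Hypothetical instance data.
    ("ex:alice", TYPE, "ex:Musician"),
}

result = infer_types(triples)
for triple in sorted(result):
    print(triple)
```

With a published subclass link from the dataset-specific class to `foaf:Person`, a client that only understands FOAF can still recognise `ex:alice` as a person, which is the kind of benefit the mappings discussed above are meant to provide.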

