Ivan’s private site

October 27, 2010

Publication of the Last Call for RDFa Core 1.1

The W3C RDFa Working Group has just published the “Last Call Working Draft” for RDFa Core 1.1. As Manu Sporny, the co-chair of the group, said in his tweet, this W3C jargon is equivalent to a “feature freeze”. I.e., the group does not know of any outstanding technical issues or of missing features that it would reasonably plan to add. Put another way, this is the last round of commenting before proceeding to final implementation testing and, hopefully, to a final W3C Standard. Last Call doesn’t mean that the group takes no more comments; on the contrary, technical comments are very welcome and necessary to make sure that the final outcome is correct. Please send your comments to the group’s mailing list: public-rdfa-wg@w3.org (there is also a public archive).

Although lots of things have been discussed in the past few months (i.e., since the last draft published in August), not many things have significantly changed. Most of the changes are editorial, making the text clearer, more precise, etc. (You can look at the “diff” file, if you are interested.) This document is for the Core, i.e., the generic RDFa processing that can be used for any DOM. A similar document for XHTML+RDFa 1.1 is expected to be published in a few days by the same Working Group, along with an HTML5+RDFa 1.1 document by the HTML Working Group.

I have also worked, in parallel to the specification work, on a modified version of the RDFa distiller. While the “official” service remains unchanged and relies on the current RDFa Recommendation, there is now a “Shadow” version that relies on RDFa 1.1. The underlying code has undergone some cleanups beyond the adaptation to RDFa 1.1, so I am sure there are bugs…

Finally, a blatant self-promotion: Stéphane Corlosquet, Lin Clark and I will give a tutorial at the upcoming ISWC conference in Shanghai on RDFa and Drupal. The RDFa part relies on 1.1… (There are links to the slides on the page but you do not expect us not to touch them any more before the tutorial itself, do you? So make sure you look at them again after the event…)

June 8, 2010

RDFa API draft

A few weeks ago I blogged about the publication of the RDFa 1.1 drafts. An essential, additional piece of the RDFa puzzle has now been published (as a First Public Working Draft): the RDFa API. (Note that there is no reference to “1.1” in the title: the API is equally valid for versions 1.0 and 1.1 of RDFa. More about this below.)

Defining an API for RDFa was a slightly complex exercise. Indeed, this API has very different constituencies, i.e., possible user communities, and these communities have different needs and background knowledge. On the one hand, you have the (for lack of a better word) “RDF” community, i.e., people who are familiar and comfortable with RDF concepts, and are used to handling triples, triple stores, iterating through triples, etc. Essentially, people who either already have a background in using, say, Jena or RDFLib, or who can grasp these concepts easily due to their background. But, on the other hand, there are also people coming more from the traditional Web Application programmers’ community who may not be that familiar with RDF, and who are looking for an easy way to get to the additional data in the (X)HTML content that RDFa provides. Providing a suitable level of abstraction for both of these communities took quite a while, and this is the reason why the RDFa API FPWD could not be published together with RDFa 1.1.

But we have it now. I will give a few examples below on how this API can be used; look at the draft for more details (there are more examples there; the examples in this blog are, in fact, taken from the document!). The usual caveat applies: this is a working draft with new releases in the future; comments are more than welcome! (Please send them to public-rdfa-wg@w3.org, rather than replying to this blog.)

The “non-RDF” user

Let me use the same RDFa example as in the previous blog:

<div prefix="relation: http://vocab.org/relationship/ foaf: http://xmlns.com/foaf/0.1/">
   <p about="http://www.ivan-herman.net/foaf#me"
      rel="relation:spouse" resource="http://www.ivan-herman.net/Eva_Boka"
      property="foaf:name">Ivan Herman</p>
</div>

Encoding the (RDF) triples:

<http://www.ivan-herman.net/foaf#me> <http://vocab.org/relationship/spouse> <http://www.ivan-herman.net/Eva_Boka> .
<http://www.ivan-herman.net/foaf#me> <http://xmlns.com/foaf/0.1/name> "Ivan Herman" .
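
To see how the two CURIEs in the markup turn into the full IRIs of the triples, here is a minimal sketch of the prefix expansion step only. It is not the RDFa processing algorithm; the helper names parsePrefixes and expandCurie are invented for this illustration and do not come from any specification or library.

```javascript
// Hypothetical helpers, written only to illustrate prefix expansion.

// Turn the value of a @prefix attribute ("name: iri name: iri ...")
// into a plain { name: iri } mapping.
function parsePrefixes(attr) {
  var tokens = attr.trim().split(/\s+/);
  var map = {};
  for (var i = 0; i + 1 < tokens.length; i += 2) {
    var name = tokens[i].replace(/:$/, ""); // drop the trailing colon
    map[name] = tokens[i + 1];
  }
  return map;
}

// Expand "prefix:reference" by simple concatenation; a value whose
// prefix is not declared is returned unchanged.
function expandCurie(curie, prefixes) {
  var pos = curie.indexOf(":");
  if (pos < 0) return curie;
  var prefix = curie.slice(0, pos);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return curie;
  return prefixes[prefix] + curie.slice(pos + 1);
}

var prefixes = parsePrefixes(
  "relation: http://vocab.org/relationship/ foaf: http://xmlns.com/foaf/0.1/");
var spouse = expandCurie("relation:spouse", prefixes);
var name = expandCurie("foaf:name", prefixes);
console.log(spouse);
console.log(name);
```

The two printed values are the predicates of the two triples; the subject and the object of the first triple are already full IRIs in the markup, so they need no expansion.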

So how can one get to these using the API? The simplest approach is to get the collection of property-value pairs assigned to a specific subject. Using the API, one can say:

>> var ivan = document.getItemBySubject("http://www.ivan-herman.net/foaf#me")

yielding an instance of what the document calls a “Property Group”. This object contains all the property-value pairs (well, in RDF terms, the predicate-object pairs) that are associated with http://www.ivan-herman.net/foaf#me. It is then possible to find out what properties are defined for a subject:

>> print(ivan.properties)
<http://vocab.org/relationship/spouse>
<http://xmlns.com/foaf/0.1/name>

It is also possible to relate back to the DOM node that holds the subject via ivan.origin, and to use the get method to get to the values of a specific property, i.e.:

>> print(ivan.get("http://xmlns.com/foaf/0.1/name"))
Ivan Herman

Note that, on this level, the user does not really have to understand the details of RDF, of predicates, etc.; what one gets back is, essentially, a literal or a representation of an IRI, and that is about it. Simple, isn’t it?
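
For readers who want to see what such a grouping amounts to, here is a toy model over a hard-coded list of the two triples. It is only a sketch of the idea: groupBySubject is a made-up helper, the triples are not extracted from any page, and nothing here is the RDFa API or an implementation of it.

```javascript
// Illustration only: a toy model, not the RDFa API.
var triples = [
  { subject: "http://www.ivan-herman.net/foaf#me",
    predicate: "http://vocab.org/relationship/spouse",
    object: "http://www.ivan-herman.net/Eva_Boka" },
  { subject: "http://www.ivan-herman.net/foaf#me",
    predicate: "http://xmlns.com/foaf/0.1/name",
    object: "Ivan Herman" }
];

// Hypothetical helper: collect the predicate-object pairs of one subject.
function groupBySubject(triples, subject) {
  var values = Object.create(null); // predicate -> list of objects
  triples.forEach(function (t) {
    if (t.subject !== subject) return;
    if (!values[t.predicate]) values[t.predicate] = [];
    values[t.predicate].push(t.object);
  });
  return {
    properties: Object.keys(values),
    // Return the first value of a property, or undefined if there is none.
    get: function (predicate) {
      return values[predicate] ? values[predicate][0] : undefined;
    }
  };
}

var ivan = groupBySubject(triples, "http://www.ivan-herman.net/foaf#me");
console.log(ivan.properties.join("\n"));
console.log(ivan.get("http://xmlns.com/foaf/0.1/name"));
```

The point of the sketch is that the caller only ever handles strings keyed by property; the triple structure stays hidden inside the helper.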

Of course, there is slightly more to it. First of all, finding the property groups only through the subjects may be too restrictive. The API therefore includes similar methods to search through the content via properties (“return all the property groups whose subject has a specific property”) or via type (i.e., rdf:type in RDF terms, @typeof in RDFa terms). One can also search for DOM Nodes rather than for Property Groups. E.g., using

>> document.getElementsByType("http://xmlns.com/foaf/0.1/Person")

one can get hold of the elements that are used for persons (e.g., to highlight these nodes or their corresponding subtrees on the screen by manipulating their CSS style). I also used full URI-s everywhere in the examples; it is also possible to set CURIE-like prefixes to make the coding a bit simpler and less error-prone.

And that is really it for simple applications. Note that much RDFa content (e.g., Facebook’s Open Graph protocol, or Google’s snippets) includes only a few simple statements whose management is perfectly possible with these few methods.

The RDF user

Of course, RDF users may want more and, sometimes, the complexity of the RDF content does require more complex methods. The RDFa API spec does indeed provide a more complex set of possibilities.

The underlying infrastructure for the API is based on the abstract concept of a store. One can create such a store, or can get hold of the default one via:

>> var store = document.data.store

The default store contains the triples that are retrieved from the current page. It is then possible to iterate through the triples, possibly via an extra, user-specified filter. Furthermore, it is possible to create new (RDF) triples and add them to the store. One can add converter methods that control how typed literals are mapped onto, say, JavaScript objects (although an implementation will provide a default set for you). In some ways, this is fairly standard RDF programming stuff, leading, for example, to the following code:

>> triples = document.data.store.filter(); // get all the triples
>> for( var i=0; i < triples.size; i++ ) {
>>   print(triples[i]);
>> }
<http://www.ivan-herman.net/foaf#me> <http://vocab.org/relationship/spouse> <http://www.ivan-herman.net/Eva_Boka>
<http://www.ivan-herman.net/foaf#me> <http://xmlns.com/foaf/0.1/name> "Ivan Herman"
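
The converter idea mentioned above can be pictured as a small registry keyed by datatype IRI. The following is only a sketch under that assumption: registerConverter and convertLiteral are names invented here, not methods of the RDFa API draft, and the three registered converters are arbitrary examples.

```javascript
// Hypothetical converter registry; names are invented for this sketch.
var XSD = "http://www.w3.org/2001/XMLSchema#";
var converters = Object.create(null); // datatype IRI -> conversion function

function registerConverter(datatype, fn) {
  converters[datatype] = fn;
}

// Convert the lexical form of a typed literal; fall back to the plain
// string when no converter is registered for the datatype.
function convertLiteral(lexical, datatype) {
  var fn = converters[datatype];
  return fn ? fn(lexical) : lexical;
}

registerConverter(XSD + "integer", function (s) { return parseInt(s, 10); });
registerConverter(XSD + "boolean", function (s) { return s === "true" || s === "1"; });
registerConverter(XSD + "dateTime", function (s) { return new Date(s); });

var count = convertLiteral("42", XSD + "integer");
var flag = convertLiteral("true", XSD + "boolean");
var when = convertLiteral("2010-06-08T00:00:00Z", XSD + "dateTime");
var other = convertLiteral("abc", "http://example.org/unknownType");
console.log(typeof count, typeof flag, when instanceof Date, other);
```

A user who prefers, say, a different date representation would simply register another function for the same datatype IRI.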

There is one aspect that is very well worth emphasizing. The parser that feeds the store is conceptually separated from the store itself (similarly to, say, RDFLib). What this means is that, though an implementation provides a default parser for the document, it is perfectly possible to add another parser parsing the same document and putting the triples into the same store. The RDFa API document does not require the parser to be RDFa 1.1; one can create a separate store and use, for example, an RDFa 1.0 parser. But much more interesting is the fact that one can also add, say, an hCard microformat parser that produces RDF triples. The triples may be added to the same store, essentially merging the underlying RDF graphs. E.g., one can have the following:

>> store = document.data.store; // by default, this store has all the RDFa 1.1 content
>> document.data.createParser("hCard",store).parse();

merging the RDFa and the hCard terms in one triple store. I think the possibility to use the RDFa API to, at last, merge the different RDF-interpretable syntaxes in this area in one place is very powerful. (It must be noted that the details of the parser interface are still changing; e.g., it is not yet clear how various parsers should be identified. This is for one of the next releases…)
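
The separation of parser and store can be sketched with a toy model in which the store only holds triples and each parser is a function that pushes into whichever store it is given. Everything below is hypothetical: createStore and the two fake parsers are invented names, the parsers return fixed triples instead of reading a page, and the use of the vCard namespace for the hCard output is just an illustrative choice.

```javascript
// Toy model of "parser separated from store"; not the RDFa API.
var ME = "http://www.ivan-herman.net/foaf#me";
var FOAF = "http://xmlns.com/foaf/0.1/";
var VCARD = "http://www.w3.org/2006/vcard/ns#"; // illustrative choice

function createStore() {
  var seen = Object.create(null);
  var triples = [];
  return {
    add: function (s, p, o) {
      var key = JSON.stringify([s, p, o]);
      if (seen[key]) return; // a graph is a set: skip duplicates
      seen[key] = true;
      triples.push({ subject: s, predicate: p, object: o });
    },
    // Return the triples accepted by an optional test function.
    filter: function (test) {
      return test ? triples.filter(test) : triples.slice();
    }
  };
}

// Stand-ins for real parsers: each one "extracts" fixed triples.
function fakeRdfaParser(store) {
  store.add(ME, FOAF + "name", "Ivan Herman");
}
function fakeHCardParser(store) {
  store.add(ME, FOAF + "name", "Ivan Herman"); // same statement again
  store.add(ME, VCARD + "fn", "Ivan Herman");
}

var store = createStore();
fakeRdfaParser(store);
fakeHCardParser(store); // second parser, same store: the graphs merge
var all = store.filter();
var names = store.filter(function (t) { return t.predicate === FOAF + "name"; });
console.log(all.length, names.length);
```

Because the store treats its content as a set, a statement found by both parsers appears only once after the merge.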

As I said, comments are more than welcome, there is still work to do, but the first, extremely important step has been made!

May 12, 2010

RIF (Core) and LOD

W3C has just published a Proposed Recommendation for the Rule Interchange Format (RIF); this means, in the W3C jargon, that the technical work is done, and the W3C asks its members for a seal of approval to publish it as a Recommendation.

Somehow the RIF development was not on the radar screen of the Semantic Web community. There may be many reasons for that, and I think we should just accept this as part of history. The future is much more important; we should indeed realize that RIF is an important piece of the Semantic Web technical architecture and let us do our best to get it embraced widely.

RIF Core is the simplest variant of RIF. It is not very complicated. It is a simple rule language: one can define a series of Horn rules; there are some safety features built in so that the rules can be executed, conceptually, by a forward chaining engine; it has the familiar XSD datatypes with the usual operations; it operates on URI-s; and it has a notion analogous to RDF blank nodes. There is a separate document that describes how RIF (Core) rules operate with RDF data and how the various semantics (RIF, RDF(S), OWL) work together. The details are not really important here; suffice it to say that it, essentially, works like one would expect as a layperson… The RIF syntax is a little bit convoluted for the moment, but there may be work coming up to improve that in the form of alternative, more readable syntaxes.

So what can it be used for? At the W3C LOD Camp in Raleigh (held as part of the WWW2010 conference), Sandro Hawke already gave a simple example of why RIF should be interesting for LOD applications. Let me add a few further examples that might be of interest.

Remember OWL-RL? The OWL Working Group has defined a subset of OWL that can be handled by rules. The rules themselves were also published by the OWL WG, albeit using an abstract notation. Those rules can be described in RIF Core as well; the RIF group has published this mapping in a separate document. Following those rules a RIF Core engine can handle OWL-RL directly.

Why is that interesting, you might ask? Well, there have been quite a few discussions, when defining OWL RL, on whether the features included in OWL RL represent the right set for users. Some claimed that there are other OWL features that could be added; others said that, on the contrary, the complexity of OWL RL is already too high and the features should be reduced to make them more palatable to users. In some ways, the usage of RIF Core may make this discussion moot. Indeed, users, or user communities, can define the rules they are interested in using RIF, by cherry-picking the rules described by the RIF WG in the document cited above. They can send those rules to their RIF Core reasoner alongside their data, and get what they want. If that rule set consists only of 2-3 OWL rules, because that is all the application cares about, then all the better: the RIF inference engine will just do its job faster. If the user wants to add OWL features that are not in OWL RL, that may also be doable; the OWL 2 RDF-Based semantics specification is such that, in many cases, the extra rules can be extracted fairly easily from the OWL 2 Full semantics, using the patterns in the RIF/OWL RL document (although I have to emphasize that this does not work in all cases!). Note that this model of “sending” the RIF rule set alongside the RDF data to a reasoner is exactly the way RIF reasoning is being defined for SPARQL 1.1 in the separate Entailment Regimes document (still in draft). Note also that I referred to OWL RL here, but the same approach could be used with RDFS with, obviously, a smaller RIF rule set.
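
The cherry-picking idea can be made concrete with a small sketch of naive forward chaining over two of the OWL RL rules (cax-sco for subclass typing and prp-symp for symmetric properties). This is not RIF syntax and not a RIF engine: the rules are written as plain JavaScript functions, the function names and the example.org data are invented, and a real engine would be far more efficient.

```javascript
// Sketch only: hand-written rule functions, not RIF and not a RIF engine.
var RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
var SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf";
var SYMMETRIC = "http://www.w3.org/2002/07/owl#SymmetricProperty";

// Triples are [subject, predicate, object] arrays. Each rule takes the
// current triples and returns the triples it can derive from them.

// cax-sco: (c1 subClassOf c2), (x type c1) => (x type c2)
function subClassRule(triples) {
  var out = [];
  triples.forEach(function (a) {
    if (a[1] !== SUBCLASS) return;
    triples.forEach(function (b) {
      if (b[1] === RDF_TYPE && b[2] === a[0]) out.push([b[0], RDF_TYPE, a[2]]);
    });
  });
  return out;
}

// prp-symp: (p type SymmetricProperty), (x p y) => (y p x)
function symmetricRule(triples) {
  var out = [];
  triples.forEach(function (a) {
    if (a[1] !== RDF_TYPE || a[2] !== SYMMETRIC) return;
    triples.forEach(function (b) {
      if (b[1] === a[0]) out.push([b[2], b[1], b[0]]);
    });
  });
  return out;
}

// Naive forward chaining: apply the chosen rules until nothing new appears.
function materialize(data, rules) {
  var triples = data.slice();
  var seen = Object.create(null);
  triples.forEach(function (t) { seen[JSON.stringify(t)] = true; });
  var changed = true;
  while (changed) {
    changed = false;
    rules.forEach(function (rule) {
      rule(triples).forEach(function (t) {
        var key = JSON.stringify(t);
        if (seen[key]) return;
        seen[key] = true;
        triples.push(t);
        changed = true;
      });
    });
  }
  return triples;
}

var EX = "http://example.org/"; // made-up data for the illustration
var data = [
  [EX + "Student", SUBCLASS, EX + "Person"],
  [EX + "Person", SUBCLASS, EX + "Agent"],
  [EX + "alice", RDF_TYPE, EX + "Student"],
  [EX + "spouse", RDF_TYPE, SYMMETRIC],
  [EX + "alice", EX + "spouse", EX + "bob"]
];
var onlySubclass = materialize(data, [subClassRule]);
var both = materialize(data, [subClassRule, symmetricRule]);
console.log(onlySubclass.length, both.length);
```

Passing a shorter rule list to materialize is the toy equivalent of sending a smaller rule set to the reasoner: fewer rules are tried on every pass, and only the inferences the application asked for are produced.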

Another, albeit related, application of RIF came to my mind reading an email discussion on whether inferences should be materialized for large LOD datasets or not and, if yes, which ones. As an answer to Vasiliy Faronov’s question, Leigh Dodds also proposed a text to be added to his Linked Data Patterns book. The resulting discussion thread was really about which inferences should be materialized. Materializing them all may not be realistic; but if only a selection of the possible inferences is used (e.g., a subset of RDFS or OWL), how would consumers of the data know? It looks like RIF may come to the rescue. Publishers could simply publish the rules they use for materializing their inferences in RIF. (Again, this is not always possible; RIF cannot cover the whole of OWL. But it does cover a very large percentage of the use cases.) Consumers may then choose whether they want to download all the triples, including the inferred triples, or whether they download data from the “core” dataset only, together with the RIF file, and materialize the inferences locally using a local RIF engine (or use the RIF file with a SPARQL 1.1 engine that is aware of RIF entailment).

RIF is and should be considered as an integral and essential part of the Semantic Web Technology landscape. Let us hope many implementations of, at least, RIF Core will bloom to make this a reality! (There is a public list of existing implementations so far.)

January 15, 2009

OWL panel, documents, comments…

Filed under: Semantic Web,Work Related — Ivan Herman @ 8:56

Right before Christmas the videos from the ISWC2008 conference were released. (Nice Christmas present from Tim Finin and his colleagues…) So I re-listened to the panel discussion “An OWL 2 Far?”. It is always good to listen to such discussions after a few months and, shall we say, with a slightly cooler head… It is clear from the discussion (and I guess all parties can agree on that) that the different views on OWL still generate passionate feelings and discussions…

A few weeks after the conference a bunch of OWL documents were published; I also had a blog on a particular profile of it, namely OWL-RL (one of the issues discussed quite vehemently at that panel, including by yours truly…). What is slightly surprising is that, in contrast to the passions raised there, not many comments have been submitted to the Working Group yet (look at the public archive of the mailing list). This is the time to do this, though: the OWL 2 documents are in Last Call in the W3C process, i.e., this is when the technical design is getting finalized. The Last Call period ends on the 23rd of January, which is not that far away… So, please, if you have concerns or issues, or even if you just want to say what wonderful work this is, speak up now!

December 15, 2008

W3C’s Validators are Not Only HTML…

Filed under: Semantic Web,Work Related — Ivan Herman @ 18:15

A few days ago W3C launched its validator donation program (see also Olivier’s extra blog on this). The Semantic Web community might think that this is relevant for HTML, CSS, etc., only, but this is not so. The same validator program of W3C also includes the RDF and the Feed validators; moreover, the HTML validator has recently been upgraded to handle the RDFa+XHTML DTD. (Although it is not required by RDFa to use this DTD, it is a good idea to have it there if you want to be sure about the validity of your file.) I.e., even if you are interested in the Semantic Web only (which I do not believe is anybody’s case, actually), you should consider the program…
