Ivan’s private site

March 1, 2013

RDFa 1.1, microdata, and turtle-in-HTML now in the core distribution of RDFLib

This has been in the works for a while, but it is done now: the latest version (3.4.0) of the Python RDFLib library has just been released, and it includes RDFa 1.1, microdata, and turtle-in-HTML parsers. In other words, the user can add structured data to an HTML file, and that will be parsed into RDF and added to an RDFLib Graph structure. This is a significant step; thanks to Gunnar Aastrand Grimnes, who helped me add those parsers to the main distribution.

I wrote a blog post last summer on some of the technical details of those parsers; although there have been updates since then, essentially following the minor changes that the RDFa Working Group has defined for RDFa, as well as changes/updates to the microdata→RDF algorithm, the general approach described in that post remains valid, and there is no need to repeat it here. For further details on these different formats, some of the useful links are:

Enjoy!

May 17, 2011

HTTP Protocol for RDF Stores

Filed under: Semantic Web,Work Related — Ivan Herman @ 9:43

Last week the W3C SPARQL Working Group published a number of Last Call Working Drafts for SPARQL 1.1. Much has already been said on various fora about the new features of SPARQL 1.1, like update, entailment regimes, and property paths; I will not repeat that here. But I think it is worth calling attention to one of the documents that may not be seen as a “core” SPARQL query language document, namely the Graph Store HTTP Protocol.

Indeed, this document stands a little bit apart. Instead of adding to the query (and now also update) language, it concentrates on how the HTTP protocol should be used in conjunction with graph stores. I.e., what is the meaning of the well-known HTTP verbs like PUT, GET, POST, or DELETE for graph stores, what should the response codes be, etc. It is important to emphasize that this HTTP behaviour is not bound to SPARQL endpoints; instead, it is valid for any Web site that serves as a graph store. This could include, for example, a Web site simply storing a number of RDF graphs with minimal services to get or change their content. (In this respect, this document is closer to, e.g., the Atom Publishing Protocol, which includes similar features for Atom data, and which also plays an important role for technologies like, for example, OData.) Because such setups, i.e., “just” stores of RDF graphs without a SPARQL endpoint, are fairly frequent, it is important to have these HTTP details set. So… it is worth looking at this document and sending feedback to the Working Group! (Use the public-sparql-dev@w3.org mailing list for comments.)
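The verb semantics can be sketched with a toy in-memory store (a rough sketch of the general idea only; the class, names, and status codes are my shorthand, not the document's normative text):

```python
class GraphStore:
    """Toy in-memory graph store: named graphs are sets of triples, keyed by IRI."""

    def __init__(self):
        self.graphs = {}

    def get(self, iri):
        """GET retrieves the graph, or 404 if it does not exist."""
        if iri not in self.graphs:
            return 404, None
        return 200, self.graphs[iri]

    def put(self, iri, triples):
        """PUT replaces the whole graph (201 if it is newly created)."""
        status = 201 if iri not in self.graphs else 204
        self.graphs[iri] = set(triples)
        return status

    def post(self, iri, triples):
        """POST merges the incoming triples into the existing graph."""
        self.graphs.setdefault(iri, set()).update(triples)
        return 200

    def delete(self, iri):
        """DELETE removes the graph entirely."""
        if iri not in self.graphs:
            return 404
        del self.graphs[iri]
        return 204

store = GraphStore()
g = "http://example.org/graph/1"
store.put(g, {("ex:s", "ex:p", "ex:o1")})       # create/replace
store.post(g, {("ex:s", "ex:p", "ex:o2")})      # merge in one more triple
status, triples = store.get(g)
print(status, len(triples))
```

Note how none of this requires a SPARQL endpoint: a site that only stores graphs can still implement exactly these operations.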


April 20, 2011

RDFa 1.1 Primer (draft)

Filed under: Semantic Web,Work Related — Ivan Herman @ 10:21

I have had several posts in the past on the new features of RDFa 1.1 and where it adds functionality beyond RDFa 1.0. The Working Group has just published a first draft of an RDFa 1.1 Primer, which gives an introduction to RDFa. We did have such a primer already for RDFa 1.0, but the new version has been updated in the spirit of RDFa 1.1… Check it out if you are interested in RDFa!

April 9, 2011

Announcement on rNews

Filed under: Semantic Web,Work Related — Ivan Herman @ 6:38
Image: Semantic Web Bus / Bandwagon (by dullhunk via Flickr)

A few days ago IPTC published a press release on rNews: “Standard draft for embedding metadata in online news”. This is, potentially, a huge thing for Linked Data and the Semantic Web. Without going into too much technical detail (no reason to repeat what is on the IPTC pages on rNews; you can look it up there), what this means is that, potentially, all major online news services on the globe, from the Associated Press to the AFP, or from the New York Times to the Süddeutsche Zeitung, will have their news items enriched with metadata, and this metadata will be expressed in RDFa. In other words, the news items will be usable, by extracting RDF, as part of any Semantic Web application, can be mashed up with other types of data easily, etc. In short, news items will become a major part of the Semantic Web landscape, with the extra specificity of being an extremely dynamic set of data that is renewed every day. That is exciting!

Of course, it will take some time to get there, but we should realize that IPTC is the major standard setting body in the news publishing world. I.e., rNews has a major chance to be largely adopted. It is time for the Semantic Web community to pay attention…


April 1, 2011

2nd Last Call for RDFa 1.1

Filed under: Semantic Web,Work Related — Ivan Herman @ 2:58

The W3C RDFa Working Group published a “Last Call” for RDFa 1.1 back at the end of October last year. This was meant to be a “feature freeze” version and asked for public comments. Well, the group received quite a number of those: lots of small things, requiring changes to the documents in many places to make them more precise even in various corner cases, and some more significant ones. In some ways, it shows that the W3C process works, ensuring a real influence of the community on the final shape of the documents. Because of the many changes the group decided to re-issue a Last Call (yes, the jargon is a bit misleading here…), aimed at a last check before the document goes to its next phase on the road to becoming a standard. Almost all the changes are minor for users, though important for, e.g., implementers to ensure interoperability. “Almost all”, because there is one, I believe, very important though controversial new feature, namely the so-called default profiles.

I have already blogged about profiles when they were first published back in April last year. In short, profile documents provide an indirection mechanism to define prefixes and terms for an RDFa source: publishers may collect all the prefixes they deem important for a specific application and authors, instead of being required to define a whole set of prefixes in the RDFa file itself, can just refer to the profile file to have them all at their disposal. I think the profile feature was the one stirring the biggest interest in the RDFa 1.1 work: profiles are undeniably useful, and undeniably controversial… Indeed, in theory at least, profiles represent yet another HTTP round trip when extracting RDF from an RDFa file, which is never a good thing. But a good caching mechanism or other implementation tricks can greatly alleviate the pain… (By the way, the group has also created some guidelines for profile publishers to help implementers.)

This draft goes one step further by introducing default profiles. These are profiles just like any other, but they are defined with fixed URI-s (namely http://www.w3.org/profile/rdfa-1.1 for RDFa 1.1 in general, and, additionally, http://www.w3.org/profile/html-rdfa-1.1 for the various HTML variants) and the user does not have to declare them in an RDFa source. Which means that a very simple HTML+RDFa file of the sort:

<html>
  <body>
    <p about="xsd:maxExclusive" rel="rdf:type" resource="owl:DatatypeProperty">
      An OWL Axiom: "xsd:maxExclusive" is a Datatype Property in OWL.
    </p>
  </body>
</html>

(note the missing prefix declarations!) will produce the RDF triple that you might expect. Can’t be simpler, can it?
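To see why this works, here is a sketch of the expansion a processor with a built-in default profile would perform (the prefix URIs are the standard ones, but this excerpt of the default profile and the expansion logic are my simplification, not the spec's algorithm):

```python
# A tiny excerpt of what a default profile might declare; the real
# default profile's content is still being decided, as discussed below.
DEFAULT_PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

def expand(curie, prefixes=DEFAULT_PREFIXES):
    """Expand a prefix:reference CURIE against the (default) prefix map."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference

# The triple produced by the HTML fragment above, without any
# prefix declarations in the file itself:
triple = (expand("xsd:maxExclusive"),
          expand("rdf:type"),
          expand("owl:DatatypeProperty"))
print(triple)
```

The file author never declared xsd, rdf, or owl; the default profile supplied them.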

Why? Why was it necessary to introduce this? Well, experience shows that many HTML+RDFa authors forget to declare the prefixes. One can look, for example, at the pages that include Facebook’s Open Graph Protocol RDFa statements: although I do not have exact numbers, I would suspect that around 50% of these pages do not have them. That means that, strictly speaking, those statements cannot be interpreted as RDF triples. The Semantic Web community may ask, try to convince, beg, etc., the HTML authors (or the underlying tools) to do “the right thing”, and we certainly should continue doing so, but we also have to accept this reality. A default profile mechanism can alleviate that, thereby greatly extending the number of triples that can become part of a Web of Data. And even for seasoned RDF(a) users, not having to declare anything for some of the common prefixes is a plus.

Of course, the big, nay, the BIG issue is: what prefixes and terms would those default profiles declare? What is the decision procedure? At this time, we do not have a final answer yet. It is quite obvious that all the vocabularies defined by W3C Recommendations and official Notes and that have a fixed prefix (most of them do) should be part of the list. We may want to add Member Submissions to this list. If you look at the default profile, these are already there in the first table (i.e., the code example above is safe). The HTML variant would add all the traditional @rel values, like license, next, previous, etc.

But what else? At the moment, the profiles include a set of prefixes and terms that are just there for testing purposes (although they do indicate a tendency), so do not take the default profile as the final content. For the HTML @rel values, we would, most probably, rely on any policy that the HTML5 Working Group will eventually define; the role of the HTML default profile will simply be to reflect those. That seems quite straightforward. However, the issue of default prefixes is clearly different. For those, the Working Group is contemplating two different approaches:

  1. Set up some sort of a registration mechanism, not unlike the xpointer registry. This would also include some accompanying mailing lists where objections can be raised against the inclusion of a specific prefix, etc.
  2. Try to get some information from search engines on the Semantic Web (Sindice, Yahoo!, anyone else?) that may provide a list of, say, the top 20 prefixes as used on the Semantic Web. Such a list would reflect the real usage of vocabularies and prefixes. (We still have to see whether this is information these engines can provide or not.)

At this moment it is not yet clear which way is realistic. Personally, I am more in favour of the second approach (if technically feasible), but the end result may be different; this is a policy that W3C will have to set up.

Apart from the content, another issue is how, and how often, the default profile may change. First of all, the set of default prefixes can only grow: once a prefix has made it onto the default profile, it has to stay there with an unchanged URI. That is obviously important to ensure stability. New prefixes coming to the fore by virtue of being used by the community can be added to the set, but no prefix can be removed. As for the frequency: a balance has to be found between stability, i.e., that RDFa processors can rely (e.g., for caching) on a not-too-frequent change of the default profiles, and relevance, i.e., that new vocabularies can find their way into the set of default prefixes. Again, my personal feeling is that an update of the profiles once every 6 months, or even once a year, might strike a good balance here. To be decided.

As before, all comments are welcome but, again as before, I would prefer if you sent those comments to the RDFa WG’s mailing list rather than commenting on this blog: public-rdfa-wg@w3.org (see also the archives).

Finally: I have worked on a new version of my RDFa distiller to include all the 1.1 features. This version of the distiller is now public, so you can try out the different new features. Of course, it is still not a final release, there are bugs, so…


March 13, 2011

Example for the power of open data…

Earthquakes around the globe on the week of the 11th of March

I wish I did not have to use this example… But I just hit it this morning via a tweet of Jim Hendler. RPI has an example of how one can combine public government data (in this case, a Data.gov dataset on earthquakes), its RDF version with a SPARQL query, and a visualization tool like Exhibit. The result is an interactive map of the earthquakes of the last week. Running the demo today reveals an incredible number (over 160) of events on the coast of Honshu, Japan, which led to the earthquake and tsunami disaster on the 11th of March. I do not know how much time it took Li Ding to prepare the original demo, but I suspect it was not a big deal once the tools were in place.

The demo is dynamic, in the sense that in a week it will probably show different data than today. So I have made a screen dump as a memento (I hope that is all right with Jim and Ding). If you are looking at it now, it is worth zooming into the area around Japan to gain some more insight into the sheer dimensions of the disaster: there were 325 quakes (out of 411 around the globe) in that area during the week! I must admit I did not know that…

I have the, hopefully not too naïve, belief that tools like this may not only increase our factual knowledge, but would also help, in the future, those who are now struggling to cope with the aftermath of this disaster. Yes, having open data, and tools to handle and integrate them, is really important.

August 3, 2010

New RDFa Core 1.1 and XHTML+RDFa 1.1 drafts

Filed under: Semantic Web,Work Related — Ivan Herman @ 20:49

W3C has just published updated drafts of RDFa Core 1.1 and XHTML+RDFa 1.1. These are “just” new heartbeat documents, meaning that they are not fundamentally new (the first drafts of these documents were published last April) but not yet “Last Call” documents, i.e., the group does not yet consider the specification work finished. Although… in fact it is not far from that point. The WG has spent the last few weeks working through open issues, and not many are left open at this moment.

So what has changed since my last blog post on the subject, where I introduced the new features compared to RDFa 1.0? In fact, nothing spectacular. Lots of minor clarifications to make things more precise. There has been a change in the treatment of XML Literals: whereas, in RDFa 1.0, XML Literals are automatically generated any time XML markup is in the text, RDFa 1.1 explicitly requires a corresponding datatype specification; otherwise a plain literal is created in RDF. (This is the only backward incompatibility with RDFa 1.0, as foreseen by the charter.)

Probably the most important addition to RDFa Core was triggered by a comment of Jeni Tennison (though the problem was raised by others, too). Jeni emphasized a slightly dangerous aspect of the profile mechanism in RDFa 1.1. To remind the reader: using the @profile attribute the author of an RDFa 1.1 file can refer to another file somewhere on the Web; that “profile file” may include, in one place, prefix declarations, term specifications, and (this is also new in this version!) a default term URI (see again my earlier blog post on the details). The question is: what happens if the profile file is unreachable? The danger is that an RDFa 1.1 processor would possibly generate wrong triples, which is actually worse than generating no triples at all. The decision of the group (as Jeni actually proposed) was that the whole DOM subtree would be dropped, i.e., all triples starting with the element carrying the unreachable profile.
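That fallback rule can be sketched as follows (the toy DOM tree, the profile URIs, and the fetch function are my own illustration, not the spec's processing model):

```python
def fetch_profile(uri):
    """Stand-in for dereferencing a profile; returns None when unreachable."""
    reachable = {"http://example.org/ok-profile": {"foaf": "http://xmlns.com/foaf/0.1/"}}
    return reachable.get(uri)

def extract(node, prefixes, triples):
    """Walk a toy DOM; drop the whole subtree if its profile cannot be fetched."""
    profile = node.get("profile")
    if profile is not None:
        defs = fetch_profile(profile)
        if defs is None:
            return                    # skip this element and everything below it
        prefixes = {**prefixes, **defs}
    triples.extend(node.get("triples", []))
    for child in node.get("children", []):
        extract(child, prefixes, triples)

tree = {
    "children": [
        {"profile": "http://example.org/ok-profile", "triples": ["t1"]},
        {"profile": "http://example.org/missing", "triples": ["t2"],
         "children": [{"triples": ["t3"]}]},
    ]
}
out = []
extract(tree, {}, out)
print(out)   # only "t1" survives; "t2" and "t3" are dropped
```

Dropping the subtree is deliberately conservative: generating nothing is preferred over generating wrong triples.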

The profile mechanism has stirred quite some interest both among users of RDFa and elsewhere. Martin Hepp was probably the first to publish an RDFa 1.1 profile for GoodRelations and related vocabulary prefixes at http://www.heppnetz.de/grprofile/. To use, essentially, his example, this means that one can write:

<div profile="http://www.heppnetz.de/grprofile/">
  <span about="#company" typeof="gr:BusinessEntity">
    <span property="gr:legalName">Hepp's bakery</span>,
    see also the <a rel="rdfs:seeAlso" href="http://example.org/bakery">
    home page of the bakery.</a>
  </span>
</div>

Because Martin’s profile includes a prefix definition for rdfs, too (alongside a number of other prefixes), the profile definition replaces a whole series of namespace declarations that were necessary in RDFa 1.0. I would guess that similar profile files, with term or prefix definitions, will be defined for foaf or for Dublin Core, too. Other obvious candidates for such profile definitions are the “big” users of RDFa information like Facebook or Google, who can specify the vocabularies they understand, i.e., index. (This did come up at the W3C camp in Raleigh, during the exciting discussion on the Facebook vocabulary.) Finally, another interesting discussion generated by RDFa’s profile mechanism occurred at the “RDF Next” Workshop in Palo Alto a few weeks ago: some participants proposed to consider a similar mechanism in a next version of Turtle (I must admit this came as a surprise, although it does make sense…)

As for implementations of profiles: profiles are defined in such a way that an RDFa processor can recursively invoke itself to extract the necessary information for processing; indeed, RDFa is also used to encode the prefix, term, etc., definitions (Turtle or RDF/XML can also be used, but RDFa is the only required format). This means that an RDFa processor does not have to implement a different parser to handle the profile information. My “shadow” RDFa distiller implements this (as well as all RDFa 1.1 features) and it was not complicated. It actually implements a caching mechanism, too: some well-known and well-published profiles can be stored locally so that the distiller does not go through an extra HTTP request every time (yes, I know, this may lead to inconsistencies in theory, but if such a cache is refreshed regularly via, say, a crontab job, it should be o.k. in practice). At the moment the content of that cache is of course curated by hand. (The usual caveat applies: this is code in development, with bugs, with possibly frequent and unannounced changes…) You are all welcome to try the shadow distiller to see what RDFa is capable of. Of course, other RDFa 1.1 implementations are in the making. If you have one, it would be good to know about it; the Working Group is constantly looking for implementation experiences…

June 8, 2010

RDFa API draft

A few weeks ago I blogged about the publication of the RDFa 1.1 drafts. An essential, additional piece of the RDFa puzzle has now been published (as a First Public Working Draft): the RDFa API. (Note that there is no reference to “1.1” in the title: the API is equally valid for versions 1.0 and 1.1 of RDFa. More about this below.)

Defining an API for RDFa was a slightly complex exercise. Indeed, this API has very different constituencies, ie, possible user communities, and these communities have different needs and background knowledge. On the one hand, you have the (for lack of a better word) “RDF” community, ie, people who are familiar and comfortable with RDF concepts, and are used to handling triples, triple stores, iterating through triples, etc. Essentially, people who either already have a background in using, say, Jena or RDFLib, or who can grasp these concepts easily due to their background. But, on the other hand, there are also people coming more from the traditional Web application programmers’ community who may not be that familiar with RDF, and who are looking for an easy way to get to the additional data that RDFa provides in the (X)HTML content. Providing a suitable level of abstraction for both of these communities took quite a while, and this is the reason why the RDFa API FPWD could not be published together with RDFa 1.1.

But we have it now. I will give a few examples below of how this API can be used; look at the draft for more details (there are more examples there; this post actually uses some of the examples from the document!). The usual caveat applies: this is a working draft with new releases to come; comments are more than welcome! (Please send them to public-rdfa-wg@w3.org, rather than replying to this blog.)

The “non-RDF” user

Let me use the same RDFa example as in the previous blog:

<div prefix="relation: http://vocab.org/relationship/ foaf: http://xmlns.com/foaf/0.1/">
   <p about="http://www.ivan-herman.net/foaf#me"
      rel="relation:spouse" resource="http://www.ivan-herman.net/Eva_Boka"
      property="foaf:name">Ivan Herman</p>
</div>

Encoding the (RDF) triples:

<http://www.ivan-herman.net/foaf#me> <http://vocab.org/relationship/spouse> <http://www.ivan-herman.net/Eva_Boka> .
<http://www.ivan-herman.net/foaf#me> <http://xmlns.com/foaf/0.1/name> "Ivan Herman" .

So how can one get to these using the API? The simplest approach is to get the collection of property-value pairs assigned to a specific subject. Using the API, one can say:

>> var ivan = document.getItemBySubject("http://www.ivan-herman.net/foaf#me")

yielding an instance of what the document calls a “Property Group”. This object contains all the property-value pairs (well, in RDF terms, the predicate-object pairs) that are associated with http://www.ivan-herman.net/foaf#me. It is then possible to find out what properties are defined for a subject:

>> print(ivan.properties)
<http://vocab.org/relationship/spouse>
<http://xmlns.com/foaf/0.1/name>

It is also possible to relate back to the DOM node that holds the subject via ivan.origin, and to use the get method to get the values of a specific property, ie:

>> print(ivan.get("http://xmlns.com/foaf/0.1/name"))
Ivan Herman

Note that, at this level, the user does not really have to understand the details of RDF, of predicates, etc; what one gets back is, essentially, a literal or a representation of an IRI, and that is about it. Simple, isn’t it?
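For readers who prefer to see the shape of this abstraction in a different language, here is a pure-Python analogue of a property group over the two triples above (my own sketch for illustration, not the API itself):

```python
class PropertyGroup:
    """Toy analogue of the draft's Property Group: the predicate-object
    pairs associated with a single subject."""

    def __init__(self, subject, triples):
        self.subject = subject
        self._props = {}
        for s, p, o in triples:
            if s == subject:
                self._props.setdefault(p, []).append(o)

    @property
    def properties(self):
        return list(self._props)

    def get(self, prop):
        """Return the value(s) for a property, a single value when unique."""
        values = self._props.get(prop, [])
        return values[0] if len(values) == 1 else values

triples = [
    ("http://www.ivan-herman.net/foaf#me",
     "http://vocab.org/relationship/spouse",
     "http://www.ivan-herman.net/Eva_Boka"),
    ("http://www.ivan-herman.net/foaf#me",
     "http://xmlns.com/foaf/0.1/name",
     "Ivan Herman"),
]

ivan = PropertyGroup("http://www.ivan-herman.net/foaf#me", triples)
print(ivan.get("http://xmlns.com/foaf/0.1/name"))   # Ivan Herman
```

The caller never touches triples directly; it simply asks for the values of a property on a subject.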

Of course, there is slightly more to it. First of all, finding the property groups only through the subjects may be too restrictive. The API therefore includes similar methods to search through the content via properties (“return all the property groups whose subject has a specific property”) or via type (i.e., rdf:type in RDF terms, @typeof in RDFa terms). One can also search for DOM Nodes rather than for Property Groups. Eg, using

>> document.getElementsByType("http://xmlns.com/foaf/0.1/Person")

one can get hold of the elements that are used for persons (e.g., to highlight these nodes or their corresponding subtrees on the screen by manipulating their CSS style). I also used full URI-s everywhere in the examples; it is also possible to set CURIE-like prefixes to make the coding a bit simpler and less error prone.

And that is really it for simple applications. Note that much RDFa content (eg, Facebook’s Open Graph protocol, or Google’s snippets) includes only a few simple statements whose management is perfectly possible with these few methods.

The RDF user

Of course, RDF users may want more and, sometimes, the complexity of the RDF content does require more complex methods. The RDFa API spec does indeed provide a more complex set of possibilities.

The underlying infrastructure for the API is based on the abstract concept of a store. One can create such a store, or can get hold of the default one via:

>> var store = document.data.store

The default store contains the triples retrieved from the current page. It is then possible to iterate through the triples, possibly via an extra, user-specified filter. Furthermore, it is possible to create new (RDF) triples and add them to the store. One can add converter methods that control how typed literals are mapped onto, say, Javascript objects (although an implementation will provide a default set for you). In some ways, fairly standard RDF programming stuff, yielding, for example, the following code:

>> triples = document.data.store.filter(); // get all the triples
>> for( var i=0; i < triples.size; i++ ) {
>>   print(triples[i]);
>> }
<http://www.ivan-herman.net/foaf#me> <http://vocab.org/relationship/spouse> <http://www.ivan-herman.net/Eva_Boka>
<http://www.ivan-herman.net/foaf#me> <http://xmlns.com/foaf/0.1/name> "Ivan Herman"

There is one aspect that is very well worth emphasizing. The parser is conceptually separated from the store (similarly to, say, RDFLib). What this means is that, though an implementation provides a default parser for the document, it is perfectly possible to add another parser parsing the same document and putting the triples into the same store. The RDFa API document does not require that the parser be RDFa 1.1; one can create a separate store and use, for example, an RDFa 1.0 parser. But much more interesting is the fact that one can also add, say, an hCard microformat parser that produces RDF triples. The triples may be added to the same store, essentially merging the underlying RDF graphs. Eg, one can have the following:

>> store = document.data.store; // by default, this store has all the RDFa 1.1 content
>> document.data.createParser("hCard",store).parse();

merging the RDFa and the hCard terms in one triple store. I think the possibility to use the RDFa API to, at last, merge the different RDF-interpretable syntaxes in this area in one place is very powerful. (It must be noted that the details of the parser interface are still changing; e.g., it is not yet clear how various parsers should be identified. This is for one of the next releases…)
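The effect of running two parsers into one store can be sketched in a few lines of Python (the parser functions and the triples they emit are hypothetical; the point is only that the two graphs merge in one store):

```python
def rdfa_parser(document):
    """Stand-in for the default RDFa 1.1 parser."""
    return {("ex:me", "foaf:name", "Ivan Herman")}

def hcard_parser(document):
    """Stand-in for an hCard microformat parser that produces RDF triples."""
    return {("ex:me", "vcard:fn", "Ivan Herman")}

store = set()                       # a store is, abstractly, a set of triples
document = "<html>...</html>"       # placeholder for the page being parsed
for parse in (rdfa_parser, hcard_parser):
    store |= parse(document)        # merging the underlying RDF graphs

print(len(store))
```

Because a graph is a set of triples, "adding a parser" is nothing more than a set union; duplicate triples from different syntaxes simply collapse into one.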

As I said, comments are more than welcome; there is still work to do, but the first, extremely important step has been made!

May 28, 2010

Self-documenting vocabularies using RDFa

Filed under: Semantic Web,Work Related — Ivan Herman @ 18:17

This was one of the use cases some of us had in mind when RDFa was being developed, and it is nice to see it happening in practice… Olaf Hartig and Jun Zhao have recently published a provenance vocabulary. I am not knowledgeable enough to get into the details of the provenance part. However, what also caught my attention is the way the vocabulary is defined: it uses XHTML+RDFa. So, while the URI above leads to a nice XHTML version of the vocabulary, readable by humans, the same source can also be used to get to the formal, RDF version of the vocabulary. Just use a distiller or extractor of any kind. I.e., do not repeat yourself, even when defining a formal vocabulary… I find this cool.

April 22, 2010

RDFa 1.1 Drafts

Filed under: Semantic Web,Work Related — Ivan Herman @ 15:52

W3C has just published two RDFa 1.1 documents. In W3C jargon these are called “First Public Working Drafts”, which means that they are by no means complete and features may still change based on community feedback. However, they document the intentions of the Working Group as to the direction it thinks it should take. (Note that another FPWD, namely an RDFa API specification, should follow very soon; hopefully a new version of HTML+RDFa will be issued soon, too, incorporating these changes.) It may be interesting to summarize some of the important changes compared to the previous version of RDFa; that is what I will attempt to do. By the way, if you want to comment on RDFa 1.1, I would prefer you commented on the Group’s dedicated mailing list rather than on this blog: public-rdfa-wg@w3.org (see also the archives).

So, what are the new things?

1. Separation of RDFa 1.1 Core and RDFa 1.1 XHTML. This is one of the differences visible immediately: instead of one specification (which was the case for RDFa 1.0) we now have two. This comes from the fact that the RDFa attributes may be, and have already been, used in XML languages other than XHTML (SVG or ODF are good examples). Separation of the Core and the XHTML-specific features is just a way to make such adaptations cleaner. (I will use XHTML examples in this blog, though.)

2. Default term URI (a.k.a. @vocab attribute). RDFa 1.0 has a number of terms that can be used with attributes like @rel or @rev without any further qualifications. Examples are ‘next’ or ‘license’. These values were “inherited” by RDFa 1.0 from HTML; the spec simply assigned a URI for each of those to use them as RDF properties (eg, http://www.w3.org/1999/xhtml/vocab#next or http://www.w3.org/1999/xhtml/vocab#license).

However, a more flexible way of defining such terms, ie, not sticking to a fixed set and URI-s, is very useful for HTML authors in general. That is achieved by the new @vocab attribute: it defines a ‘default term URI’ that is concatenated with any term to form a URI. The mechanism can be applied to most of the RDFa attributes, not only to @rel and @rev. For example, the following RDFa 1.1 code:

<div vocab="http://xmlns.com/foaf/0.1/">
   <p about="#me" typeof="Person" property="name">Ivan Herman</p>
</div>

will generate the familiar FOAF triples:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<#me> a foaf:Person;
      foaf:name "Ivan Herman" .

The effect of @vocab is of course valid for the whole subtree starting at <div> (unless another @vocab takes it over somewhere down the line). This makes simple annotations with RDFa very easy for authors who do not want to deal with the intricacies of URI-s and CURIE-s.
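The @vocab mechanics can be sketched in a few lines (my simplification of the processing rules; the rdf:type URI is the standard one):

```python
def expand_term(term, vocab):
    """@vocab: the default term URI is concatenated with any bare term."""
    return vocab + term

vocab = "http://xmlns.com/foaf/0.1/"       # from vocab="..." on the <div>

# The two triples generated from the <p> element above:
triples = [
    ("#me",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",   # from typeof="Person"
     expand_term("Person", vocab)),
    ("#me",
     expand_term("name", vocab),                          # from property="name"
     "Ivan Herman"),
]
print(triples)
```

Note that the expansion is pure concatenation: no registry lookup, no prefix table, which is exactly what makes it attractive for authors who want to stay away from CURIEs.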

3. Profile documents for terms. The problem with the @vocab mechanism is that it works only with one vocabulary. However, in many cases, one wants to mix vocabularies: after all, this is one of the main advantages of using RDF (and hence RDFa). Eg, one would like to encode something like:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://vocab.org/relationship/> .
<#me> a foaf:Person;
      foaf:name "Ivan Herman" ;
      rel:spouseOf <http://www.ivan-herman.net/Eva_Boka> .

A solution is to use a profile document. This is a simple RDF document that can be made available in various serializations (though only RDFa is mandatory for an RDFa processor) and describes the mapping of terms to URIs. Eg, one could define the http://example.org/simple-profile.ttl file (using Turtle syntax here for simplicity):

@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
[] rdfa:uri "http://vocab.org/relationship/spouseOf";
  rdfa:term "spouse" .
[] rdfa:uri "http://xmlns.com/foaf/0.1/name";
  rdfa:term "name" .
[] rdfa:uri "http://xmlns.com/foaf/0.1/Person";
  rdfa:term "Person" .

and use that document in an RDFa file as follows:

<div profile="http://example.org/simple-profile.ttl">
   <p about="#me" typeof="Person"
      rel="spouse" resource="http://www.ivan-herman.net/Eva_Boka" 
      property="name">Ivan Herman</p>
</div>

yielding the triples we wanted. Note that the @profile attribute allows for several URIs, ie, for several profile documents, and the corresponding term definitions are merged. Here again, a @profile attribute down the tree will supersede term definitions.
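In processor terms, a profile simply contributes a term-to-URI map, and several @profile URIs contribute maps that are merged, with definitions further down the tree superseding earlier ones. A sketch (the dictionaries mirror the hypothetical profile document above; the merge logic is my simplification):

```python
# Term -> URI map a processor would build from the profile document above.
profile_terms = {
    "spouse": "http://vocab.org/relationship/spouseOf",
    "name": "http://xmlns.com/foaf/0.1/name",
    "Person": "http://xmlns.com/foaf/0.1/Person",
}

def resolve_term(term, *term_maps):
    """Merge the term maps in order; later maps (profiles further down
    the tree, or listed later) supersede earlier definitions."""
    merged = {}
    for tm in term_maps:
        merged.update(tm)
    return merged.get(term)

print(resolve_term("spouse", profile_terms))
```

A second profile redefining "name" would win for the subtree where it is declared, which is exactly the superseding behaviour described above.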

Of course, using the profile document is a heavier mechanism and requires, at least conceptually, an extra HTTP request. I am sure there will be community comments on that. However, I personally do not expect average authors to fiddle around much with those profile files. Instead, vocabulary publishers like Google, Yahoo, Facebook, Dublin Core, or Creative Commons may publish the terms they use and understand in the form of such profile documents, and authors could simply refer to those. RDFa tools could (and are encouraged to) cache the information stored in those widely used profile documents which alleviates the HTTP request issue in practice.

4. Profile documents for prefixes. Using profile documents for terms is great, but it does not scale very well. If the vocabularies are large then their publishers would have to create and maintain fairly large files. In such cases the CURIE mechanism, already defined in RDFa 1.0, becomes a good alternative: instead of having a separate URI defined explicitly for each term, one can use abbreviations for the “base” URIs of those vocabularies via prefixes.

In RDFa 1.0 this required an author to add a series of ‘xmlns:XXX’ attributes to the RDFa content. That mechanism is still valid in RDFa 1.1, but one can also define prefixes as part of a profile document. Eg, the previous Turtle fragment could have been written as:

@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
[] rdfa:uri "http://vocab.org/relationship/";
  rdfa:prefix "relation" .
[] rdfa:uri "http://xmlns.com/foaf/0.1/";
  rdfa:prefix "foaf" .

and the RDFa fragment would then look like:

<div profile="http://example.org/simple-profile.ttl">
   <p about="#me" typeof="foaf:Person"
      rel="relation:spouse" resource="http://www.ivan-herman.net/Eva_Boka" 
      property="foaf:name">Ivan Herman</p>
</div>

to generate the same triples as before. This mechanism may become important, as I said, when several large vocabularies (or simply a large number of vocabularies) are used.

5. Usage of @prefix to define CURIE prefixes. Actually, the usage of the xmlns:XXX syntax to set CURIE prefixes was (and is) controversial; there may be host languages that do not work with such attributes. RDFa 1.1 therefore provides an alternative which, though semantically identical, avoids the usage of a ‘:’ character in the attribute name. Using this @prefix attribute, an alternative to the previous RDFa could be:

<div prefix="relation: http://vocab.org/relationship/ foaf: http://xmlns.com/foaf/0.1/">
   <p about="#me" typeof="foaf:Person"
      rel="relation:spouse" resource="http://www.ivan-herman.net/Eva_Boka" 
      property="foaf:name">Ivan Herman</p>
</div>

which looks very much like an RDFa 1.0 document but using the @prefix attributes instead of @xmlns:foaf and @xmlns:relation. The old @xmlns: approach remains valid, of course, but the new one is preferred from now on.

6. URIs everywhere. The final item I mention is the possibility to use URI-s everywhere, ie, to bypass the CURIE abbreviation mechanism if so desired. Whereas some of the RDFa 1.0 attributes required the usage of CURIE-s (eg, @rel or @property), this is no longer true in RDFa 1.1. The rules are fairly simple: if an attribute value is of the form ‘pp:xxx’ and the ‘pp’ string cannot be interpreted as a CURIE prefix then the string ‘pp:xxx’ is considered to be a URI and is treated as such in the generated RDF. That also means that CURIE-s can be used in @about and @resource without the slightly awkward ‘safe’ CURIE-s.
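That rule can be sketched as follows (my simplification; the real attribute grammar has more cases, e.g. safe CURIEs and terms):

```python
def resolve(value, prefixes):
    """If the part before ':' is a declared CURIE prefix, expand it;
    otherwise treat the whole value as a URI (RDFa 1.1 rule, simplified)."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return value

prefixes = {"foaf": "http://xmlns.com/foaf/0.1/"}
print(resolve("foaf:name", prefixes))             # expanded as a CURIE
print(resolve("http://example.org/x", prefixes))  # 'http' is no prefix: kept as a URI
```

The "http" string is not a declared prefix, so a full URI passes through untouched, which is what lets the same attribute accept either form.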

The development of RDFa 1.1 is obviously not done yet; lots of details are to be checked, and some additional minor features (eg, possible changes on the handling of XML Literals) are still to be worked out. And, first and foremost, community comments on the directions taken are important. But these First Public Working Drafts give the general direction…


The Rubric Theme Blog at WordPress.com.
