<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Ivan’s private site &#187; Semantic Web</title>
	<atom:link href="http://ivan-herman.name/category/work-related/semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://ivan-herman.name</link>
	<description></description>
	<lastBuildDate>Thu, 25 Apr 2013 05:56:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='ivan-herman.name' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/b36c82dd81cc7fc066d729227bbf8cba?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Ivan’s private site &#187; Semantic Web</title>
		<link>http://ivan-herman.name</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://ivan-herman.name/osd.xml" title="Ivan’s private site" />
	<atom:link rel='hub' href='http://ivan-herman.name/?pushpress=hub'/>
		<item>
		<title>Multilingual Linked Open Data?</title>
		<link>http://ivan-herman.name/2013/03/16/multilingual-linked-open-data/</link>
		<comments>http://ivan-herman.name/2013/03/16/multilingual-linked-open-data/#comments</comments>
		<pubDate>Sat, 16 Mar 2013 12:13:43 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Vocabulary]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=976</guid>
		<description><![CDATA[Experts developing Web sites for various cultures and languages know that it is way better to include such features into Web pages at the start, i.e., at the time of the core design, rather than to &#8220;add&#8221; them once the site is done. What is valid for Web sites is also valid for data deployed on [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.multilingualweb.eu"><img class="alignright" alt="Logo of the EU Multilingual Web Project" src="http://www.multilingualweb.eu/images/mw3.png" width="205" height="196" /></a>Experts developing Web sites for various cultures and languages know that it is way better to include such features into Web pages at the start, i.e., at the time of the core design, rather than to &#8220;add&#8221; them once the site is done. What is valid for Web sites is also valid for data deployed on the Web, and that is especially true for Linked Data whose mantra is to <em>combine</em> data and datasets from all over the place.</p>
<p>Why do I say all this? I had the pleasure of participating, earlier this week, in the <a href="http://www.multilingualweb.eu/rome-program">MultilingualWeb Workshop</a> in Rome, Italy. One of the topics of the workshop was Linked (Open) Data and its multilingual (and, also, multicultural) aspects. There were a number of presentations at a dedicated session (the presentations are online, linked from the <a href="http://www.multilingualweb.eu/rome-program">Workshop Page</a>; just scroll down and look for a session entitled &#8220;Machines&#8221;), and there was also a separate <a href="http://www.multilingualweb.eu/documents/rome-workshop/rome-lod">break-out session</a> (the slides are not yet online, but they should be soon). There are also a number of interesting projects and issues in this area beyond those presented at the event, for example the <a href="http://lemon-model.net/">lemon model</a> or the (related) <a href="http://www.monnet-project.eu/">Monnet</a> EU project.</p>
<p>All these projects are great. However, the overall situation in the Linked Data world is, in this respect, not that great, at least in my view. If one looks at the various Linked Data (or Semantic Web) related mailing lists, discussion fora, workshops, etc., multilingual or multicultural issues are almost never discussed. I have not made any systematic analysis of the various datasets on the LOD cloud, but I have the impression that only a few of them are prepared for multilingual use (e.g., by providing alternative labels and other metadata in different languages). URIs are defined in English, and most of the vocabularies we use are documented in only one language; they may be hard to use for non-English speakers. Worse, vocabularies may not even be properly <em>prepared</em> for multicultural use (just consider the <a href="http://www.w3.org/International/multilingualweb/rome/slides/22-ishida.pdf">complexity of personal names</a>, which is hardly ever properly reflected in vocabularies). And this is where we hit the same problem as for Web sites: for all its successes, we are still at the beginning of the deployment of Linked Data, and our community should have much more frequent discussions on how to handle this issue <em>now</em>, because after a while it may be too late.</p>
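<p>To make &#8220;prepared for multilingual use&#8221; a bit more concrete, here is a minimal Python sketch of a resource that carries labels in several languages, together with a lookup that honours a user's language preferences. It is only an illustration: the URI, the labels, and the <code>pick_label</code> helper are all made up, and a real dataset would of course store the labels as language-tagged RDF literals.</p>

```python
# Hypothetical data: one resource with labels in several languages,
# keyed by language tag (the URI and the labels are made up).
LABELS = {
    "http://example.org/id/city/1": {
        "en": "Rome",
        "it": "Roma",
        "hu": "Róma",
    },
}


def pick_label(uri, preferred, labels=LABELS, fallback="en"):
    """Return the first available label following the user's language preferences."""
    available = labels.get(uri, {})
    for tag in preferred:
        # Try the full tag first ("it-CH"), then its primary language ("it").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return available[candidate]
    # Nothing matched: use the fallback language, or the URI itself as a last resort.
    return available.get(fallback, uri)


print(pick_label("http://example.org/id/city/1", ["it-CH", "fr"]))
print(pick_label("http://example.org/id/city/1", ["fr"]))
```

<p>A dataset that only offers the English label leaves the second kind of caller with nothing better than the fallback, which is the situation described above.</p>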
<p>By the way, one of the outcomes of the break-out session at the Workshop was that a W3C Community Group should be created soon to produce some best practices for Multilingual Linked Open Data. Some work has already been done in the area; look at the <a href="http://www.weso.es/MLODPatterns/">page set up by José Emilio Labra Gayo, Dimitris Kontokostas, and Sören Auer</a>, which may very well be the starting point. Watch this space!</p>
<p>It is hard. But it will be harder if we miss this particular boat.</p>
]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2013/03/16/multilingual-linked-open-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

		<media:content url="http://www.multilingualweb.eu/images/mw3.png" medium="image">
			<media:title type="html">Logo of the EU Multilingual Web Project</media:title>
		</media:content>

	</item>
		<item>
		<title>RDFa 1.1, microdata, and turtle-in-HTML now in the core distribution of RDFLib</title>
		<link>http://ivan-herman.name/2013/03/01/rdfa-1-1-microdata-and-turtle-in-html-now-in-the-core-distribution-of-rdflib/</link>
		<comments>http://ivan-herman.name/2013/03/01/rdfa-1-1-microdata-and-turtle-in-html-now-in-the-core-distribution-of-rdflib/#comments</comments>
		<pubDate>Fri, 01 Mar 2013 12:14:45 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[microdata]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[RDFLib]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[Turtle]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=972</guid>
		<description><![CDATA[This has been in the works for a while, but it is done now: the latest (3.4.0 version) of the Python RDFLib library has just been released, and it includes an RDFa 1.1, microdata, and turtle-in-HTML parser. In other words, the user can add structured data to an HTML file, and that will be parsed [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>This has been in the works for a while, but it is done now: the latest (<a href="https://github.com/RDFLib/rdflib/archive/3.4.0.zip">3.4.0 version</a>) of the Python <a class="zem_slink" title="RDFLib" href="http://rdflib.net/" target="_blank" rel="homepage">RDFLib</a> library has just been released, and it includes an RDFa 1.1, microdata, and turtle-in-HTML parser. In other words, the user can add structured data to an HTML file, and that will be parsed into RDF and added to an RDFLib Graph structure. This is a significant step, and thanks go to Gunnar Aastrand Grimnes, who helped me add those parsers to the main distribution.</p>
<p>I wrote a <a href="http://ivan-herman.name/2012/08/31/rdfa-microdata-turtle-in-html-and-rdflib/">blog post last summer</a> on some of the technical details of those parsers. There have been updates since then, essentially following the minor changes that the RDFa Working Group has defined for RDFa, as well as changes/updates to the microdata-&gt;RDF algorithm, but the general approach described in that post remains valid, and it is not necessary to repeat it here. For further details on these different formats, some useful links are:</p>
<ul>
<li>For RDFa, there is a new version of an RDFa 1.1 Primer in preparation. It is probably worth keeping an eye on the <a href="http://www.w3.org/2010/02/rdfa/sources/rdfa-primer/Overview-src.html">editor’s draft of the primer</a>. The primer has the links to the official recommendations if one wants to look up the gory details. Alternatively, look at the <a href="http://rdfa.info">RDFa community page</a>!</li>
<li>For microdata, the <a href="http://www.w3.org/TR/microdata/">official specification</a> is of course available; the conversion to RDF is the subject of a <a href="http://www.w3.org/TR/microdata-rdf/">separate W3C Note</a>.</li>
<li>For turtle-in-HTML, you can look at the <a href="http://www.w3.org/TR/turtle/#in-html">latest version of the Turtle spec</a>.</li>
</ul>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2013/03/01/rdfa-1-1-microdata-and-turtle-in-html-now-in-the-core-distribution-of-rdflib/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

	</item>
		<item>
		<title>Nice RDFa 1.1 example…</title>
		<link>http://ivan-herman.name/2012/11/26/nice-rdfa-1-1-example/</link>
		<comments>http://ivan-herman.name/2012/11/26/nice-rdfa-1-1-example/#comments</comments>
		<pubDate>Mon, 26 Nov 2012 21:20:10 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Work Related]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[Linked Library Data]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=947</guid>
		<description><![CDATA[I know I had seen that before, but I ran into this again: the WorldCat.org site (a must for book lovers…) has a nice structure using RDFa 1.1. Let us take an example page for a book, say, one of the latest books by Amitav Ghosh, the “Sea of poppies”. The page itself has all [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><img class="alignright" title="Cover page for Ghosh's novel, the Sea of Poppies" alt="Cover page for Ghosh's novel, the Sea of Poppies" src="http://coverart.oclc.org/ImageWebSvc/oclc/+-+552694167_140.jpg?SearchOrder=+-+OT,OS,TN,AV,FA,GO" height="209" width="140" />I know I had seen that before, but I ran into this again: the <a href="http://www.worldcat.org/">WorldCat.org</a> site (a must for book lovers…) has a nice structure using RDFa 1.1. Let us take an example page for a book, say, one of the latest books by Amitav Ghosh, the <a href="http://www.worldcat.org/title/sea-of-poppies/oclc/216941700">“Sea of poppies”.</a> The page itself has all kinds of data; what is interesting here is that the formal, bibliographical data is also encoded in RDFa 1.1. Running, for example, an <a href="http://www.w3.org/2012/pyRdfa/extract?format=json&amp;uri=http%3A//www.worldcat.org/title/sea-of-poppies/oclc/216941700">RDF distiller on the page</a>, you get the bibliographical data. Here is an excerpt (in <a href="http://json-ld.org/">JSON-LD</a>):</p>
<pre>{
    "@context": {
        "library": "http://purl.org/library/", 
        "oclc": "http://www.worldcat.org/oclc/", 
        "skos": "http://www.w3.org/2004/02/skos/core#", 
        "schema": "http://schema.org/", 
        . . .
    }, 
    "@id": "oclc:216941700", 
    "@type": "schema:Book", 
    "schema:about": [
        {
            "@id": "http://id.worldcat.org/fast/1122346", 
            "@type": "skos:Concept", 
            "schema:name": {
                "@value": "Social classes", 
                "@language": "en"
            }
        }, 
        . . .
    ],
    "schema:bookEdition": {
        "@value": "1st American ed.", 
        "@language": "en"
    }, 
    "schema:inLanguage": {
        "@value": "en", 
        "@language": "en"
    }, 
    "library:placeOfPublication": {
        "@type": "schema:Place", 
        "schema:name": {
            "@value": "New York :", 
            "@language": "en"
        }
    },
    . . .</pre>
<p>Note that WorldCat.org uses the <a href="http://schema.org">schema.org</a> vocabulary, where appropriate, but mixes it with a number of other vocabularies; exactly where the power of RDFa lies! Great for bibliographic applications that can use this type of data, possibly mixed with data coming from other libraries…</p>
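<p>To see the mixing of vocabularies at work, here is a small Python sketch that expands the prefixed names of the excerpt into full URIs. It uses a hand-copied subset of the data above (the elided parts are simply left out), and the <code>expand</code> helper is my own simplification for illustration; it is not the full JSON-LD expansion algorithm.</p>

```python
# A small, complete subset of the excerpt above (the elided parts are left out).
doc = {
    "@context": {
        "library": "http://purl.org/library/",
        "oclc": "http://www.worldcat.org/oclc/",
        "schema": "http://schema.org/",
    },
    "@id": "oclc:216941700",
    "@type": "schema:Book",
    "schema:bookEdition": {"@value": "1st American ed.", "@language": "en"},
    "library:placeOfPublication": {
        "@type": "schema:Place",
        "schema:name": {"@value": "New York :", "@language": "en"},
    },
}


def expand(term, context):
    """Expand a prefixed name such as 'schema:Book' using the prefixes in the context."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term  # already a full URI, or an unknown prefix


ctx = doc["@context"]
print(expand(doc["@id"], ctx))
print(expand(doc["@type"], ctx))
# The properties of the book come from two different vocabularies.
for key in doc:
    if not key.startswith("@"):
        print(expand(key, ctx))
```

<p>The last loop prints one property from schema.org and one from the library vocabulary, side by side on the same resource.</p>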
<p>By the way, I was reminded to look at the site by a recent document just published by the Library of Congress: <a href="http://www.loc.gov/marc/transition/pdf/marcld-report-11-21-2012.pdf">“Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services”</a>. It is still a draft, and there is quite some discussion around it in the library community, but the overall picture is what counts: the library community may (let us be optimistic: will!) become one of the major actors in the Linked Data world, as well as users of structured data on the Web, most probably RDFa. Yay!</p>
]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/11/26/nice-rdfa-1-1-example/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

		<media:content url="http://coverart.oclc.org/ImageWebSvc/oclc/+-+552694167_140.jpg?SearchOrder=+-+OT,OS,TN,AV,FA,GO" medium="image">
			<media:title type="html">Cover page for Ghosh&#039;s novel, the Sea of Poppies</media:title>
		</media:content>
	</item>
		<item>
		<title>RDFa 1.1 and microdata now part of the main branch of RDFLib</title>
		<link>http://ivan-herman.name/2012/11/06/rdfa-1-1-and-microdata-now-part-of-the-main-branch-of-rdflib/</link>
		<comments>http://ivan-herman.name/2012/11/06/rdfa-1-1-and-microdata-now-part-of-the-main-branch-of-rdflib/#comments</comments>
		<pubDate>Tue, 06 Nov 2012 19:34:40 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[microdata]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[RDFLib]]></category>
		<category><![CDATA[Turtle]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=945</guid>
		<description><![CDATA[A while ago I wrote about having adapted my RDFa and microdata parsers to RDFLib. Although I have done some work on them since then, nothing really spectacular happened (e.g., I have updated the microdata part to the latest version of the microdata-&#62;RDF conversion note, and I have also gone through the tedious exercise [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>A while ago I wrote about having adapted my <a href="http://ivan-herman.name/2012/08/31/rdfa-microdata-turtle-in-html-and-rdflib/">RDFa and microdata parsers to RDFLib</a>. Although I have done some work on them since then, nothing really spectacular happened (e.g., I have updated the microdata part to the latest version of the <a href="http://www.w3.org/TR/2012/NOTE-microdata-rdf-20121009/">microdata-&gt;RDF</a> conversion note, and I have also gone through the tedious exercise of making the modules usable with <a class="zem_slink" title="Python (programming language)" href="http://www.python.org/" target="_blank" rel="homepage">Python3</a>).</p>
<p>Nevertheless, a significant milestone has now been reached, not by me but by Gunnar Aastrand Grimnes, who “maintains” <a href="http://rdflib.net">RDFLib</a>: the separate branch for RDFa and microdata has now been merged with the <a href="https://github.com/RDFLib/rdflib">master branch of RDFLib on GitHub</a>. So here we are; whenever the next official release of RDFLib comes, these parsers will be part of it…</p>
]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/11/06/rdfa-1-1-and-microdata-now-part-of-the-main-branch-of-rdflib/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

	</item>
		<item>
		<title>RDFa, microdata, turtle-in-HTML, and RDFLib</title>
		<link>http://ivan-herman.name/2012/08/31/rdfa-microdata-turtle-in-html-and-rdflib/</link>
		<comments>http://ivan-herman.name/2012/08/31/rdfa-microdata-turtle-in-html-and-rdflib/#comments</comments>
		<pubDate>Fri, 31 Aug 2012 15:38:44 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[microdata]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[RDFLib]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[Turtle]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=926</guid>
		<description><![CDATA[For those of us programming in Python, RDFLib is certainly one of the RDF packages of choice. Several years ago, when I developed a distiller for RDFa 1.0, some good souls picked the code up and added it to RDFLib as one of the parser formats. However, years have gone by, and have seen the [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>For those of us programming in Python, <a class="zem_slink" title="RDFLib" href="http://rdflib.net/" rel="homepage" target="_blank">RDFLib</a> is certainly one of the RDF packages of choice. Several years ago, when I developed a <a href="http://www.w3.org/2007/08/pyRdfa/">distiller for RDFa 1.0</a>, some good souls picked the code up and added it to RDFLib as one of the parser formats. However, years have gone by, and have seen the development of <a href="http://www.w3.org/TR/rdfa-primer/">RDFa 1.1</a>, of <a href="http://www.w3.org/TR/microdata/">microdata</a>, and also the specification of <a href="http://www.w3.org/TR/2012/WD-turtle-20120710/#in-html">directly embedding Turtle into HTML</a>. It is time to bring all these into RDFLib…</p>
<p>Some time ago I developed both a <a href="http://www.w3.org/2012/pyRdfa/">new version of the RDFa distiller</a>, adapted to the RDFa 1.1 standard, and a <a href="http://www.w3.org/2012/pyMicrodata/">microdata to RDF distiller</a>, based on the Interest Group note on <a href="http://www.w3.org/TR/2012/NOTE-microdata-rdf-20120308/">converting microdata to RDF</a>. Both of these were packages and applications <em>on top</em> of RDFLib, which is fine because they can be used with the deployed RDFLib installations out there. But, ideally, these should be retrofitted into the core of RDFLib; I have used the last few quiet days of the vacation period in August to do just that (thanks to Niklas Lindström and Gunnar Grimnes for some email discussion and for helping me through the hoops of RDFLib-on-github). The results are in a separate <a href="https://github.com/RDFLib/rdflib/tree/structured_data_parsers">branch of the RDFLib github repository</a>, under the name <code>structured_data_parsers</code>. Using these parsers, here is what one can do:</p>
<pre>from rdflib import Graph

g = Graph()
# URI_of_SVG_file and URI_of_HTML_file stand for the address of the document to parse
# parse an SVG+RDFa 1.1 file and store the results in 'g':
g.parse(URI_of_SVG_file, format="rdfa1.1")
# parse an HTML+microdata file and store the results in 'g':
g.parse(URI_of_HTML_file, format="microdata")
# parse an HTML file for any structured content and store the results in 'g':
g.parse(URI_of_HTML_file, format="html")</pre>
<p>The third option is interesting (thanks to Dan Brickley who suggested it): this will parse an HTML file for <em>any</em> structured data, let that be in microdata, RDFa 1.1, or in Turtle embedded in a <code>&lt;script type="text/turtle"&gt;...&lt;/script&gt;</code> tag.</p>
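<p>To give an idea of what the turtle-in-HTML part involves, here is a minimal sketch that pulls the content of <code>&lt;script type="text/turtle"&gt;</code> elements out of a page, using only the Python standard library. This is an illustration and not the code in the branch: the <code>TurtleScriptExtractor</code> class and the sample page are made up, and the extracted text would still have to be handed to a Turtle parser.</p>

```python
from html.parser import HTMLParser


class TurtleScriptExtractor(HTMLParser):
    """Collect the text of every script element whose type is text/turtle."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # one string per matching script element
        self._inside = False    # True while inside a text/turtle script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/turtle":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data


# Made-up page, used only for illustration.
PAGE = """<html><body>
<script type="text/turtle">
@prefix dc: <http://purl.org/dc/terms/> .
<http://example.org/doc> dc:title "An example" .
</script>
<script type="text/javascript">var x = 1;</script>
</body></html>"""

extractor = TurtleScriptExtractor()
extractor.feed(PAGE)
extractor.close()
print(extractor.blocks[0].strip())
```

<p>Only the first script element is collected; the JavaScript one is ignored because its type does not match.</p>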
<p>The core of the RDFa 1.1 parser has gone through very thorough testing, using the extensive <a href="http://rdfa.info/test-suite/">test suite</a> on <a href="http://rdfa.info">rdfa.info</a>. This is less true for microdata, because there is no extensive test suite for that one yet (but the code is also simpler). On the other hand, any restructuring like this may introduce some extra bugs. I would very much appreciate it if interested geeks in the community could install and test it, and forward me the bugs that are undeniably still there… Note that the <a href="http://www.w3.org/TR/2012/NOTE-microdata-rdf-20120308/">microdata-&gt;RDF mapping specification</a> may still undergo some changes in the coming few weeks/months (primarily catching up with some development around <a href="http://schema.org">schema.org</a>); I hope to adapt the code to the changes quickly.</p>
<p>I have also made some arbitrary decisions here, which are minor, but arbitrary nevertheless. Any feedback on those is welcome:</p>
<ul>
<li>I decided not to remove the old, 1.0 parser from this branch. Although the new version of the RDFa 1.1 parser can switch into 1.0 mode if the necessary switches are in the code (e.g., <code>@version</code> or an RDFa 1.0 specific DTD), in the absence of those 1.1 will be used. As, unfortunately, 1.1 is not 100% backward compatible with 1.0, this may create some issues with deployed applications. This also means that the <code>format="rdfa"</code> argument will refer to 1.0 and <em>not</em> to 1.1. Am I too cautious here?</li>
<li>The format argument in parse can also hold media types. Some of those are fairly obvious: e.g., <code>application/svg+xml</code> will map to the new parser with RDFa 1.1. But what should be the default mapping for <code>text/html</code>? At present, it maps to the “universal” extractor (i.e., extracting everything).</li>
</ul>
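<p>The media type question can be pictured as a simple lookup table. The sketch below is only an illustration of the two mappings mentioned above and not the code in the branch; the <code>resolve_format</code> helper is hypothetical, and any other media type would need its own decision.</p>

```python
# Only the two media types discussed above are listed; the names on the right
# are the values of the format argument used in this post.
MEDIA_TYPE_TO_FORMAT = {
    "application/svg+xml": "rdfa1.1",  # SVG: the new RDFa 1.1 parser
    "text/html": "html",               # HTML: the "universal" extractor
}


def resolve_format(value):
    """Hypothetical helper: turn a media type or a short name into a parser name."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = value.split(";")[0].strip().lower()
    # Anything that is not a known media type is assumed to be a short name already.
    return MEDIA_TYPE_TO_FORMAT.get(media_type, value)


print(resolve_format("text/html; charset=utf-8"))
print(resolve_format("application/svg+xml"))
print(resolve_format("microdata"))
```

<p>Changing the default for <code>text/html</code> would then be a one-line change in the table, which is why feedback on that choice is cheap to act on.</p>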
<p>Of course, at some point, this branch will be merged with the <a href="https://github.com/RDFLib">main branch of RDFLib</a>, meaning that, eventually, this will be part of the core distribution. I cannot say at this point when this will happen, as I am not involved in the day-to-day management of the RDFLib development.</p>
<p>I hope this will be useful…</p>
]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/08/31/rdfa-microdata-turtle-in-html-and-rdflib/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

	</item>
		<item>
		<title>Structured Data in HTML in the mainstream</title>
		<link>http://ivan-herman.name/2012/04/18/structured-data-in-html-in-the-mainstream/</link>
		<comments>http://ivan-herman.name/2012/04/18/structured-data-in-html-in-the-mainstream/#comments</comments>
		<pubDate>Wed, 18 Apr 2012 06:31:11 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[microdata]]></category>
		<category><![CDATA[RDFa]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=874</guid>
		<description><![CDATA[As referred to in my previous blog on LDOW2012, Hannes Hühleisen and Chris Bizer, but also Peter Mika and Tim Potter, published some findings on structured data in HTML based on Web Crawl results and analysis. Both Hannes’ and Peter’ papers are now on line. Hannes and Chris based their results on CommonCrawl, whereas Peter [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=874&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>As referred to in my <a href="http://ivan-herman.name/2012/04/17/linked-data-on-the-web-workshop-lyon/">previous blog on LDOW2012</a>, Hannes Mühleisen and Chris Bizer, but also Peter Mika and Tim Potter, published some findings on structured data in HTML based on Web Crawl results and analysis. Both <a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-2.pdf">Hannes’</a> and <a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-1.pdf">Peter’s</a> papers are now online. Hannes and Chris based their results on CommonCrawl, whereas Peter and Tim rely on Bing.</p>
<p>Although there are some controversies as to the usability of these crawls as well as the interpretation of their results (see <a href="http://www.w3.org/mid/1FBCC4EF-2706-48D3-A6A1-78A232EEC05A@unibw.de">Martin Hepp’s</a> mail, and the answer by <a href="http://www.w3.org/mid/4F8D830A.2000009@yahoo-inc.com">Peter Mika</a> as well as the resulting thread on the mailing list), I think what is really important is the big picture which emerges from both sets of results: no one can reasonably dispute the importance of structured data in HTML any more. Although I vividly remember a time when this <em>was</em> a matter of bitter discussions, I think we can put this issue behind us now. I do not think I can summarize it better than Peter did in <a href="http://www.w3.org/mid/4F8D86FA.2080309@yahoo-inc.com">another of his emails</a>:</p>
<blockquote><p>…both studies confirm that the Semantic Web, and in particular metadata in HTML, is taking on in major ways thanks to the efforts of Facebook, the sponsors of schema.org and many other individuals and organizations. Comparing to our previous numbers, for example we see a five-fold increase in RDFa usage with 25% of webpages containing RDFa data (including OGP), and over 7% of web pages containing microdata. These are incredibly impressive numbers, which illustrate that this part of the Semantic Web has gone mainstream.</p></blockquote>
<br />Filed under: <a href='http://ivan-herman.name/category/work-related/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/category/work-related/'>Work Related</a> Tagged: <a href='http://ivan-herman.name/tag/html/'>HTML</a>, <a href='http://ivan-herman.name/tag/microdata/'>microdata</a>, <a href='http://ivan-herman.name/tag/rdfa/'>RDFa</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/ivanherman.wordpress.com/874/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/ivanherman.wordpress.com/874/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=874&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/04/18/structured-data-in-html-in-the-mainstream/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>
	</item>
		<item>
		<title>Linked Data on the Web Workshop, Lyon</title>
		<link>http://ivan-herman.name/2012/04/17/linked-data-on-the-web-workshop-lyon/</link>
		<comments>http://ivan-herman.name/2012/04/17/linked-data-on-the-web-workshop-lyon/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 07:00:06 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[Access Control]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[microdata]]></category>
		<category><![CDATA[OWL]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[SPARQL]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=863</guid>
		<description><![CDATA[(See the Workshop’s home page for details.) The LDOW20** series have become more than workshops; they are really small conferences. I did not count the number of participants (the meeting room had a fairly odd shape which made it a bit difficult) but I think it was well over a hundred. Nice to see… [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=863&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>(See the <a href="http://events.linkeddata.org/ldow2012/">Workshop’s home page</a> for details.)</p>
<p>The LDOW20** series have become more than workshops; they are really small conferences. I did not count the number of participants (the meeting room had a fairly odd shape which made it a bit difficult) but I think it was well over a hundred. Nice to see…</p>
<p>The usual caveat applies for my notes below: I am selective here with some papers which is no judgement on any other paper at the workshop. These are just some of my thoughts jotted down…</p>
<p>Giuseppe Rizzo made a presentation on all the tools we now have to tag texts, and thereby to use these resources in linked data (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-02.pdf">“NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud”</a>), i.e., the Zemanta or Open Calais services of this world. As these services become more and more important, having a clear view of what they can do, how one can use them individually or together, etc., is essential. Their project, called <a href="http://nerd.eurecom.fr/">NERD</a>, will become an important source for this community; bookmark that page:-)</p>
<p>Jun Zhao made a presentation (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-03.pdf">“Towards Interoperable Provenance Publication on the Linked Data Web”</a>) essentially on the work of the <a href="http://www.w3.org/2011/prov/wiki/Main_Page">W3C Provenance Working Group</a>. I was pleased to see and listen to this presentation: I believe the outcome of that group is very important for this community and, having played a role in the creation of that group, I am anxious to see it succeed. B.t.w., a new round of publications coming from that group should happen very soon, watch the news…</p>
<p>Another presentation, namely Arnaud Le Hors’ on <a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-04.pdf">“Using read/write Linked Data for Application Integration — Towards a Linked Data Basic Profile”</a>, was also closely related to W3C work. Arnaud and his colleagues (at IBM) came to this community after a long journey working on application integration; think, e.g., of systems managing software updates and error management. These systems are fundamentally data oriented and IBM has embarked on a Linked Data based approach (after having tried others). The particularity of this approach is to stay very “low” level, insofar as they use only the basic HTTP protocol for reading <em>and writing</em> RDF data. This approach seems to strike a chord with a number of other companies (Elsevier, EMC, Oracle, Nokia) and their work forms the basis of a new W3C Working Group that should be started this coming summer. This work may become a significant element of the palette of technologies around Linked Data.</p>
<p>Luca Costabello talked about Access Control, Linked Data, and Mobile (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-05.pdf">“Linked Data Access Goes Mobile: Context-Aware Authorization for Graph Stores”</a>). Although Luca emphasized that their solution is not a complete solution for Linked Data access control issues in general, it may become an important contribution in that area nevertheless. Their approach is to modify SPARQL queries “on-the-fly” by including access control clauses; for that purpose, an <a href="http://ns.inria.fr/s4ac">access control ontology (S4AC)</a> has been developed and used. One issue is: how would that work with a purely HTTP level read/write Linked Data Web, like the one Arnaud is talking about? Answer: we do not know yet:-)</p>
<p>Igor Popov concentrated on user interface issues (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-12.pdf">“Interacting with the Web of Data through a Web of Inter-connected Lenses”</a>): how to develop a framework whereby data-oriented applications can cooperate quickly, so that ordinary users could explore data, switching easily to applications that are well adapted to a particular dataset, without being forced to use complicated programming or overly “geeky” tools. This is still alpha-level work, but their site-in-development, called <a href="http://mashpoint.net/">Mashpoint</a>, is a place to watch. There is (still) not enough work on user-facing data exploration tools; I was pleased to see this one…</p>
<p>What are the dynamics of Linked Data? How does it change? This is the question Tobias Käfer and his friends will try to answer in the future (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-14.pdf">“Towards a Dynamic Linked Data Observatory”</a>). For that, data is necessary, and Tobias’ presentation was on how to determine what collection of resources to regularly watch and measure. The plan is to produce a snapshot of the data once a week for a year; the hope is that based on this collected data we will learn more about the overall evolution of linked data. I am really curious to see the results of that. One more reason to be at LDOW2013:-)</p>
<p>Tobias’ presentation has an important connection to the last presentation of the day, made by Axel Polleres (<a href="http://events.linkeddata.org/ldow2012/papers/ldow2012-paper-16.pdf">“OWL: Yet to arrive on the Web of Data?”</a>), insofar as what he presented was based on the analysis of the Linked Data out there. The issue has been around, with lots of controversy, for a while: what level of OWL should/could be used for Linked Data? OWL 2 as a whole seems to be too complex for the amount of data we are talking about, both in terms of program efficiency and in terms of conceptual complexity for end users. OWL 2 has defined a much simpler profile, called OWL 2 RL, which does have some traction but may still be too complex, e.g., for implementations. Axel and his friends analyzed the usage of OWL statements out there, and also established some criteria on what type of rules should be used to make OWL processing really efficient; their result is another profile called <a href="http://semanticweb.org/OWLLD/">OWL LD</a>. It is largely a subset of OWL 2 RL, though it does adopt some datatypes that OWL 2 RL does not have.</p>
<p>There are some OWL 2 RL features left out of OWL LD whose omission I am not fully convinced of; after all, their measurement was based on data from 2011, and it is difficult to say how much time it takes for new OWL 2 features to really catch on. I think that keys and property chains could be really useful for Linked Data, and can be managed by rule engines, too. So the jury is still out on this, but it would be good to find a way to stabilize this at some point and see the LD crowd look at OWL (i.e., the subset of OWL) more positively. Of course, another approach would be to concentrate on an easy way to encode Rules into RDF, which might make this discussion moot in a certain sense; one of the things we have not succeeded in doing yet:-(</p>
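<p>To illustrate why property chains are within reach of a simple rule engine, here is a minimal Python sketch of a forward-chaining rule for a chain of length two. It is only an illustration under stated assumptions: the function name <code>apply_chain</code>, the plain-tuple triple representation, and the <code>ex:</code> property names are all hypothetical, and the code is not taken from OWL LD, OWL 2 RL, or any existing reasoner.</p>

```python
def apply_chain(triples, chain, derived_property):
    """Apply one two-step property chain rule until nothing new is inferred.

    triples: iterable of (subject, predicate, object) tuples (hypothetical format)
    chain: pair (first, second) of predicates forming the chain
    derived_property: predicate asserted between the two ends of the chain
    """
    first, second = chain
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # If (s first o1) and (o1 second o2) both hold, infer (s derived o2).
        new = {
            (s, derived_property, o2)
            for (s, p1, o1) in inferred if p1 == first
            for (s2, p2, o2) in inferred if p2 == second and s2 == o1
        }
        if not new.issubset(inferred):
            inferred |= new
            changed = True
    return inferred


data = {
    ("ex:alice", "ex:hasParent", "ex:bob"),
    ("ex:bob", "ex:hasBrother", "ex:carl"),
}
result = apply_chain(data, ("ex:hasParent", "ex:hasBrother"), "ex:hasUncle")
```

<p>The loop runs to a fixpoint so that the sketch also terminates correctly when the derived property is itself part of the chain (e.g., a transitive property expressed as a chain over itself).</p>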
<p>The day ended with a panel, in which I also participated; I would let others judge whether the panel was good or not. However, the panel was preceded by a presentation by Chris on the current deployment of RDFa and microdata, which was really interesting. (His slides will be on the workshop’s page soon.) The deployment of RDFa, microdata, and microformats has become really strong now; structured data in HTML is a well established approach out there. RDFa and microdata now cover half of the cases, the other half being microformats, which seems to indicate a clear shift towards RDFa/microdata, i.e., a more syntax oriented approach (with a clear mapping to RDF). Microdata is used almost exclusively with schema.org vocabularies (which is to be expected) whereas RDFa makes use of a larger palette of various other vocabularies. All of this was to be expected, but it is nice to see it reflected in collected data.</p>
<p>It was a great event. Chris, Tim, and Tom: thanks!</p>
<br />Filed under: <a href='http://ivan-herman.name/category/work-related/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/category/work-related/'>Work Related</a> Tagged: <a href='http://ivan-herman.name/tag/access-control/'>Access Control</a>, <a href='http://ivan-herman.name/tag/linked-data/'>Linked Data</a>, <a href='http://ivan-herman.name/tag/microdata/'>microdata</a>, <a href='http://ivan-herman.name/tag/owl/'>OWL</a>, <a href='http://ivan-herman.name/tag/rdfa/'>RDFa</a>, <a href='http://ivan-herman.name/tag/resource-description-framework/'>Resource Description Framework</a>, <a href='http://ivan-herman.name/tag/sparql/'>SPARQL</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/ivanherman.wordpress.com/863/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/ivanherman.wordpress.com/863/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=863&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/04/17/linked-data-on-the-web-workshop-lyon/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>
	</item>
		<item>
		<title>Nice reading on Semantic Search</title>
		<link>http://ivan-herman.name/2012/01/24/nice-reading-on-semantic-search/</link>
		<comments>http://ivan-herman.name/2012/01/24/nice-reading-on-semantic-search/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 16:53:15 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[OWL]]></category>
		<category><![CDATA[semantic search]]></category>
		<category><![CDATA[Web search engine]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=835</guid>
		<description><![CDATA[I had a great time reading a paper on Semantic Search[1]. Although the paper is on the details of a specific Semantic Web search engine (DERI’s SWSE), I was reading it as somebody not really familiar with all the intricate details of such a search engine setup and operation (i.e., I would not dare to [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=835&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I had a great time reading a paper on Semantic Search<a href="#l1">[1]</a>. Although the paper is on the details of a specific Semantic Web search engine (<a class="zem_slink" title="DERI" href="http://deri.ie" rel="homepage">DERI</a>’s <a href="http://swse.deri.org/">SWSE</a>), I was reading it as somebody not really familiar with all the intricate details of such a search engine setup and operation (i.e., I would not dare to give an opinion on whether the choices made by this group are better or worse than the ones made by the developers of other engines) and wanting to gain a good picture of what is happening in general. And, for that purpose, this paper was really interesting and instructive. It is long (ca. 50 pages), i.e., I did not even try to understand everything at my first reading, but it did give a great overall impression of what is going on.</p>
<p>One of the “associations” I had, maybe somewhat surprisingly, is with another paper I read lately, namely a report on basic profiles for Linked Data<a href="#l2">[2]</a>. In that paper Nally et al. look at what “subsets” of current Semantic Web specifications could be defined, as “profiles”, for the purpose of publishing and using Linked Data. This was also a general topic at a <a href="http://www.w3.org/2011/09/LinkedData/">W3C Workshop on Linked Data Patterns</a> at the end of last year (see also the <a href="http://www.w3.org/2011/09/LinkedData/Report">final report</a> of the event) and it is not a secret that W3C is considering setting up a relevant Working Group in the near future. Well, the experiences of an engine like SWSE might come in very handy here. For example, SWSE uses a subset of the <a href="http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules">OWL 2 RL Profile</a> for inferencing; that may be a good input for a possible Linked Data profile (although the differences are really minor, if one looks at the appendix of the paper that lists the rule sets the engine uses). The idea of “Authoritative Reasoning” is also interesting and possibly relevant; that approach makes a lot of pragmatic sense, and I wonder whether this is not something that should be, somehow, documented for general use. And I am sure there are more: in general, analyzing the experiences of major Semantic Web search engines on handling Linked Data might provide a great set of input for such pragmatic work.</p>
<p>I was also wondering about a very different issue. A great deal of work had to be done in SWSE on the proper handling of <code>owl:sameAs</code>. On the other hand, one of the recurring discussions on various mailing lists and elsewhere is on whether the usage of this property is semantically o.k. or not (see, e.g., <a href="#l3">[3]</a>). A possible alternative would be to define (beyond <code>owl:sameAs</code>) a set of properties borrowed from the <a href="http://www.w3.org/TR/2009/REC-skos-reference-20090818/">SKOS Recommendation</a>, like <code>closeMatch</code>, <code>exactMatch</code>, <code>broadMatch</code>, etc. It is almost trivial to generalize these SKOS properties for the general case but, reading this paper, I was wondering: what effect would such predicates have on search? Would it make it more complicated or, in fact, would such predicates make the life of search engines easier by providing “hints” that could be used for the user interface? Or both? Or is it already too late, because the usage of <code>owl:sameAs</code> is already so prevalent that it is not worth touching that stuff? I do not have a clear answer at this moment…</p>
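<p>To give a feel for what handling <code>owl:sameAs</code> involves at its simplest, here is a minimal Python sketch that groups identifiers into equivalence classes with a union-find structure. This is an assumption-laden illustration, not SWSE’s actual consolidation algorithm: the function name <code>same_as_classes</code> and the <code>ex:</code> identifiers are hypothetical, and the sketch treats every <code>owl:sameAs</code> statement as fully trustworthy, which is exactly the assumption the discussion above questions.</p>

```python
def same_as_classes(pairs):
    """Group identifiers linked by owl:sameAs into equivalence classes.

    pairs: iterable of (a, b) tuples, each standing for "a owl:sameAs b".
    Returns a sorted list of sorted lists, one per equivalence class.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root,
        # shortening the path on the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for node in list(parent):
        classes.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in classes.values())


groups = same_as_classes([
    ("ex:a", "ex:b"),
    ("ex:b", "ex:c"),
    ("ex:d", "ex:e"),
])
```

<p>A weaker predicate such as <code>closeMatch</code> would not be merged this way, since it is not meant to be transitive; an engine could instead keep such links as “hints” alongside the merged classes.</p>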
<p>Thanks to the authors!</p>
<ol>
<li id="l1">A. Hogan, et al., “<a href="http://www.websemanticsjournal.org/index.php/ps/article/view/240">Searching and Browsing Linked Data with SWSE: the Semantic Web Search Engine</a>”, <a class="zem_slink" title="Journal of Web Semantics" href="http://www.elsevier.com/locate/websem" rel="homepage">Journal of Web Semantics</a>, vol. 9, no. 4, pp. 365-401, 2011.</li>
<li id="l2">M. Nally and S. Speicher, “<a href="http://www.ibm.com/developerworks/rational/library/basic-profile-linked-data/index.html">Toward a Basic Profile for Linked Data</a>”, IBM developerWorks, 2011.</li>
<li id="l3">H. Halpin, et al., “<a href="http://www.w3.org/TR/2009/REC-skos-reference-20090818/">When owl:sameAs Isn&#8217;t the Same: An Analysis of Identity in Linked Data</a>”, Proceedings of the International Semantic Web Conference, pp. 305-320, 2010.</li>
</ol>
<br />Filed under: <a href='http://ivan-herman.name/category/work-related/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/category/work-related/'>Work Related</a> Tagged: <a href='http://ivan-herman.name/tag/linked-data/'>Linked Data</a>, <a href='http://ivan-herman.name/tag/owl/'>OWL</a>, <a href='http://ivan-herman.name/tag/semantic-search/'>semantic search</a>, <a href='http://ivan-herman.name/tag/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/tag/web-search-engine/'>Web search engine</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/ivanherman.wordpress.com/835/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/ivanherman.wordpress.com/835/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=835&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2012/01/24/nice-reading-on-semantic-search/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=a4155037-cf51-44fa-9bb9-5cbb0e3a7725" medium="image" />
	</item>
		<item>
		<title>Where we are with RDFa 1.1?</title>
		<link>http://ivan-herman.name/2011/12/16/where-we-are-with-rdfa-1-1/</link>
		<comments>http://ivan-herman.name/2011/12/16/where-we-are-with-rdfa-1-1/#comments</comments>
		<pubDate>Fri, 16 Dec 2011 11:48:46 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[RDFa 1.1 Lite]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[schema.org]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=809</guid>
		<description><![CDATA[There has been a flurry of activities around RDFa 1.1 in the past few months. Although a number of blogs and news items have been published on the changes, all those have become “officialized” only in the past few days with the publication of the latest drafts, as well as with the publication of RDFa 1.1 [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=809&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div class="mceTemp">
<div class="wp-caption alignright" style="width: 130px"><a href="http://commons.wikipedia.org/wiki/File:Rdface.gif"><img class="zemanta-img-inserted zemanta-img-configured" title="English: RDFa Content Editor" src="http://upload.wikimedia.org/wikipedia/commons/4/48/Rdface.gif" alt="English: RDFa Content Editor" width="120" height="120" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
</div>
<p style="text-align:left;">There has been a flurry of activities around RDFa 1.1 in the past few months. Although a number of blogs and news items have been published on the changes, all those have become “officialized” only in the past few days with the <a href="http://www.w3.org/blog/SW/2011/12/16/new-versions-of-rdfa-core-1-1-and-the-xhtmlrdfa-1-1-drafts/">publication of the latest drafts</a>, as well as with the <a href="http://www.w3.org/blog/SW/2011/12/09/rdfa-lite-1-1-draft-published-rdfa-1-1-primer-updated/">publication of RDFa 1.1 Lite</a>. It may be worth looking back at the past few months to have a clearer idea of what happened. I make references to a number of other blogs that were published in the past few months; interested readers should consult those for details.</p>
<p style="text-align:left;">The latest official drafts for RDFa 1.1 were published in Spring 2011. However, a lot has happened since. First of all, the <a href="http://www.w3.org/2010/02/rdfa/wiki/Main_Page">RDF Web Applications (RDFWA) Working Group</a>, working on this specification, has received a significant number of comments. Some of those were rooted in implementations and the difficulties encountered therein; some came from potential authors who asked for further simplifications. Also, the announcement of <a href="http://schema.org">schema.org</a> had an important effect: indeed, this initiative drew attention to the importance of structured data in Web pages, which also raised further questions on the usability of RDFa for that usage pattern. This came to the fore even more forcefully at the <a href="http://www.w3.org/QA/2011/09/impressions_on_the_schemaorg_w.html">workshop organized by the stakeholders of schema.org</a> in Mountain View. A new <a href="http://www.w3.org/wiki/Html-data-tf">task force on the relationships of RDFa and microdata</a> has been set up at W3C; beyond looking at the relationship of these two syntaxes, that task force also raised a number of issues on RDFa 1.1. These issues have been, by and large, accepted and handled by the Working Group (and reflected in the new drafts).</p>
<p style="text-align:left;">What does this mean for the new drafts? The bottom line: there have been some fundamental changes in RDFa 1.1. For example, profiles, introduced in earlier releases of RDFa 1.1, have been removed due to implementation challenges; however, management of vocabularies has acquired an <em>optional</em> feature that helps vocabulary authors to “bind” their vocabularies to other vocabularies, without introducing an extra burden on authors (see <a href="http://www.w3.org/blog/SW/2011/09/19/recent-changes-in-rdfa-1-1/">another blog</a> for more details). Another long-standing issue was whether RDFa should include a syntax for ordered lists; this has been done now (see the <a href="http://www.w3.org/blog/SW/2011/09/19/recent-changes-in-rdfa-1-1/">same blog</a> for further details).</p>
<p style="text-align:left;">A more recent important change concerns the usage of <code>@property</code> and <code>@rel</code>. Although usage of these attributes was never a real problem for RDF-savvy authors (the former is for the creation of literal objects, whereas the latter is for URI references), they have proven to be a major obstacle for average HTML authors. This issue came up quite forcefully at the schema.org workshop in Mountain View, too. After a long technical discussion in the group, the new version reduces the usage difference between the two significantly. Essentially, if, on the same element, <code>@property</code> is present together with, say, <code>@href</code> or <code>@resource</code>, and <code>@rel</code> or <code>@rev</code> is <em>not</em> present, a URI reference is generated as an object of the triple. I.e., when used on, say, a <code>&lt;link&gt;</code> or <code>&lt;a&gt;</code> element, <code>@property</code> behaves exactly like <code>@rel</code>. It turns out that this usage pattern is so widespread that it covers most of the important use cases for authors. The new version of the <a href="http://www.w3.org/TR/rdfa-primer/">RDFa 1.1 Primer</a> (as well as the <a href="http://www.w3.org/TR/2011/WD-rdfa-core-20111215/">RDFa 1.1 Core</a>, actually) has a number of examples that show these. There are also some other changes related to the behaviour of <code>@typeof</code> in relation to <code>@property</code>; please consult the specification for these.</p>
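<p style="text-align:left;">The rule described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the normative processing steps of the specification nor the code of any distiller: the function name <code>object_kind</code> is hypothetical, it looks only at which attribute names are present on an element, it considers only the <code>@href</code> and <code>@resource</code> attributes named above, and it ignores every other attribute the real specification takes into account.</p>

```python
# Attributes that, per the simplified rule sketched here, supply a URI.
RESOURCE_ATTRIBUTES = frozenset({"href", "resource"})
# Attributes whose presence keeps @property in its literal-generating role.
RELATION_ATTRIBUTES = frozenset({"rel", "rev"})


def object_kind(attribute_names):
    """Say whether @property yields a 'uri' or a 'literal' object.

    attribute_names: the names of the RDFa/HTML attributes present on
    one element, without the leading '@' (hypothetical input format).
    """
    names = set(attribute_names)
    if "property" not in names:
        raise ValueError("element has no @property")
    has_resource = bool(names.intersection(RESOURCE_ATTRIBUTES))
    has_relation = bool(names.intersection(RELATION_ATTRIBUTES))
    if has_resource and not has_relation:
        # @property behaves like @rel: the object is a URI reference.
        return "uri"
    # Otherwise @property keeps generating a literal object.
    return "literal"


on_anchor = object_kind({"property", "href"})
with_rel = object_kind({"property", "rel", "href"})
text_only = object_kind({"property"})
```

<p style="text-align:left;">In the second call the URI in <code>@href</code> is consumed by <code>@rel</code>, so <code>@property</code> falls back to its original role; that is the case RDF-savvy authors were already used to.</p>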
<p style="text-align:left;">The publication of <a href="http://www.w3.org/TR/rdfa-lite/">RDFa 1.1 Lite</a> was also a very important step. This defines a “sub-set” of the RDFa attributes that can serve as a guideline for HTML authors to express simple structured data in HTML without bothering about more complex features. This is the subset of RDFa that <a href="http://blog.schema.org/2011/11/using-rdfa-11-lite-with-schemaorg.html">schema.org will “accept”</a>, as an alternative to <a href="http://dev.w3.org/html5/md/">microdata</a>, as a possible syntax for schema.org vocabularies. (There are some examples of what schema.org markup looks like in RDFa 1.1 Lite on a <a href="http://www.w3.org/QA/2011/11/schemaorg_and_rdfa_11_lite_how.html">different blog</a>.) In some sense, RDFa 1.1 Lite can be considered the equivalent of microdata, except that it leaves the door open for more complex vocabulary usage, mixture with different vocabularies, etc. (The <a href="http://www.w3.org/wiki/Html-data-tf">HTML Task Force</a> will soon publish a more detailed comparison of the different syntaxes.)</p>
<p style="text-align:left;">So here is, roughly, where we are today. The recent publications by the W3C RDFWA Working Group have, as I said, “officialized” all the changes that were discussed since spring. The group decided not to publish a Last Call Working Draft, because the last few weeks of work of the <a href="http://www.w3.org/wiki/Html-data-tf">HTML Task Force</a> may reveal some new requirements; if not, the last round of publications will follow soon.</p>
<p style="text-align:left;">And what about implementations? Well, <a href="http://www.w3.org/2007/08/pyRdfa/Shadow.html">my “shadow” implementation of the RDFa distiller</a> (which also includes a separate “<a href="http://www.w3.org/2007/08/pyRdfa/Validator.html">validator</a>” service) incorporates all the latest changes. I also added a new feature a few weeks ago, namely the possibility to <a href="http://www.w3.org/QA/2011/11/rdfa_11_meets_json-ld_in_the_d.html">serialize the output in JSON-LD</a> (although this has become outdated a few days ago, due to some <a href="http://json-ld.org/minutes/2011-12-13/">changes in JSON-LD</a>…). I am not sure of the exact status of Gregg Kellogg’s <a href="http://rdf.greggkellogg.net/distiller">RDF Distiller</a>, but, knowing him, it is either already in line with the latest drafts or it is only a matter of a few days to be so. And there are surely more around that I do not know about.</p>
<p style="text-align:left;">This last series of publications has provided a nice closure for a busy RDFa year. I guess the only thing now is to wish everyone a Merry Christmas, a peaceful and happy Hanukkah, or other festivities you honor at this time of the year. In any case, a very happy New Year!</p>
<br />Filed under: <a href='http://ivan-herman.name/category/work-related/code/python/'>Python</a>, <a href='http://ivan-herman.name/category/work-related/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/category/work-related/'>Work Related</a> Tagged: <a href='http://ivan-herman.name/tag/html/'>HTML</a>, <a href='http://ivan-herman.name/tag/html5/'>HTML5</a>, <a href='http://ivan-herman.name/tag/json/'>JSON</a>, <a href='http://ivan-herman.name/tag/rdfa/'>RDFa</a>, <a href='http://ivan-herman.name/tag/rdfa-1-1-lite/'>RDFa 1.1 Lite</a>, <a href='http://ivan-herman.name/tag/resource-description-framework/'>Resource Description Framework</a>, <a href='http://ivan-herman.name/tag/schema-org/'>schema.org</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/ivanherman.wordpress.com/809/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/ivanherman.wordpress.com/809/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=809&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2011/12/16/where-we-are-with-rdfa-1-1/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/commons/4/48/Rdface.gif" medium="image">
			<media:title type="html">English: RDFa Content Editor</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/zemified_e.png?x-id=374ecad3-7da1-4de6-a4f1-d8d92bb1ba64" medium="image">
			<media:title type="html">Enhanced by Zemanta</media:title>
		</media:content>
	</item>
		<item>
		<title>W3C Library Linked Data Reports</title>
		<link>http://ivan-herman.name/2011/11/07/w3c-library-linked-data-reports/</link>
		<comments>http://ivan-herman.name/2011/11/07/w3c-library-linked-data-reports/#comments</comments>
		<pubDate>Mon, 07 Nov 2011 15:16:53 +0000</pubDate>
		<dc:creator>Ivan Herman</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Work Related]]></category>
		<category><![CDATA[Library]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>

		<guid isPermaLink="false">http://ivan-herman.name/?p=800</guid>
		<description><![CDATA[There was an official announcement a few days ago about the publication of the W3C Library Linked Data Incubator Group final report. The report is an interesting read even though I am probably not part of the “typical” target readership. After all, the primary goal of this report is really to convince the reader of the [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=800&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div class="wp-caption alignright" style="width: 310px"><a href="http://commons.wikipedia.org/wiki/File:Library_of_Congress.jpg"><img class="zemanta-img-inserted zemanta-img-configured" title="The Library of Congress main reading room, Jef..." src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Library_of_Congress.jpg/300px-Library_of_Congress.jpg" alt="The Library of Congress main reading room, Jef..." width="300" height="191" /></a><p class="wp-caption-text">The main reading room of the Library of Congress</p></div>
<p>There was an official announcement a few days ago about the publication of the <a href="http://www.w3.org/blog/SW/2011/10/27/w3c-library-linked-data-xg-final-report-published/">W3C Library Linked Data Incubator Group final report</a>. The report is an interesting read even though I am probably not part of the “typical” target readership. After all, the primary goal of this report is really to convince the reader of the value and importance of combining the activities of libraries with Linked Data… But the key recommendations of the report are worth repeating:</p>
<ul>
<li>That <strong>library leaders</strong> identify sets of data as possible candidates for early exposure as Linked Data and foster a discussion about Open Data and rights;</li>
<li>That <strong>library standards bodies</strong> increase library participation in Semantic Web standardization, develop library data standards that are compatible with Linked Data, and disseminate best-practice design patterns tailored to library Linked Data;</li>
<li>That <strong>data and systems designers</strong> design enhanced user services based on Linked Data capabilities, create <abbr title="Uniform Resource Identifiers">URIs</abbr> for the items in library datasets, develop policies for managing <abbr title="Resource Description Framework">RDF</abbr> vocabularies and their <abbr title="Uniform Resource Identifiers">URIs</abbr>, and express library data by re-using or mapping to existing Linked Data vocabularies;</li>
<li>That <strong>librarians and archivists</strong> preserve Linked Data element sets and value vocabularies and apply library experience in curation and long-term preservation to Linked Data datasets.</li>
</ul>
<p>However, what was not entirely clear from the original announcement is that the official report also has two “companion” documents, namely a <a href="http://www.w3.org/2005/Incubator/lld/XGR-lld-usecase-20111025/">Use Case collection</a>, and a list of references to metadata element sets in RDF, to relevant vocabularies, and to published element sets (e.g., the <a href="http://id.loc.gov/vocabulary/countries.html">sets of URIs</a>, set up by the <a class="zem_slink" title="Library of Congress" href="http://maps.google.com/maps?ll=38.8886111111,-77.0047222222&amp;spn=0.01,0.01&amp;q=38.8886111111,-77.0047222222%20%28Library%20of%20Congress%29&amp;t=h" rel="geolocation">US Library of Congress</a>, listing all countries in the world). This second document, entitled <a href="http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset-20111025/">“Datasets, Value Vocabularies, and Metadata Element Sets”</a>, is a real, somewhat hidden gem: a possible starting point for practitioners who want to work with Library Linked Data! Thanks to Antoine Isaac and his friends for collecting these. I wonder how we could get it regularly updated…</p>
<br />Filed under: <a href='http://ivan-herman.name/category/work-related/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/category/work-related/'>Work Related</a> Tagged: <a href='http://ivan-herman.name/tag/library/'>Library</a>, <a href='http://ivan-herman.name/tag/linked-data/'>Linked Data</a>, <a href='http://ivan-herman.name/tag/metadata/'>Metadata</a>, <a href='http://ivan-herman.name/tag/open-data/'>Open Data</a>, <a href='http://ivan-herman.name/tag/semantic-web/'>Semantic Web</a>, <a href='http://ivan-herman.name/tag/uniform-resource-identifier/'>Uniform Resource Identifier</a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/ivanherman.wordpress.com/800/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/ivanherman.wordpress.com/800/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ivan-herman.name&#038;blog=557157&#038;post=800&#038;subd=ivanherman&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ivan-herman.name/2011/11/07/w3c-library-linked-data-reports/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&#38;d=identicon" medium="image">
			<media:title type="html">ivanherman</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Library_of_Congress.jpg/300px-Library_of_Congress.jpg" medium="image">
			<media:title type="html">The Library of Congress main reading room, Jef...</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/zemified_e.png?x-id=8d1b59c0-c0d5-4389-a225-45b50fb3a862" medium="image">
			<media:title type="html">Enhanced by Zemanta</media:title>
		</media:content>
	</item>
	</channel>
</rss>
