Ivan’s private site

February 21, 2008

RDFa Syntax LC is out

Filed under: Python,Semantic Web,Work Related — Ivan Herman @ 19:41

The RDFa Syntax Last Call document has just been published; yay!

I have also updated the RDFa processor that I coded last summer; it is now available for download and is also used by the “RDFa Distiller” service page. I have played with RDFa in practical terms, too; my foaf file in HTML, the W3C SW Activity Home page, and the Semantic Web FAQ page are all annotated with RDFa now. Once one is used to it, it is fairly straightforward to add even complex RDF statements to HTML pages with an arbitrarily large number of different vocabularies mixed in. Of course, authoring tools would be good, but let us take things one step at a time… Having the Last Call published (ie, the Working Groups believe they have taken care of all technical issues) is a major step forward!

B.t.w., Benjamin Nowack jumped on the SW-FAQ RDF file to make a nice little hack; here is the mail he sent on the SW SWEO list the other day:

Heh, silly stuff, just FYI: On the #foaf channel is foafbot (a SPARQLy reincarnation
of an earlier bot we had there years ago). It understands RDFa, and allows the
specification of custom commands at [1]. I made it load Ivan’s FAQ, and
created an “faq” command, so that you can now pass a keyword or phrase
to the bot and it will respond with a pointer to the FAQ (if something
matched the RDFa-encoded question), e.g.:

<bengee> foafbot, faq giant ontology

<foafbot> bengee, see http://www.w3.org/2001/sw/SW-FAQ#whgiantont

;)

Benji

[1] http://semsol.org/semcamp/sparqlbot

Isn’t that cool? As far as I could see, it took him about 10 minutes to add this hack, thanks to the SW-FAQ being in RDF…

February 14, 2008

New version of the SPARQL Python wrapper

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 13:49

About half a year ago I announced the availability of a SPARQL endpoint interface to Python. It was really a beta release back then (ie, last July), but recently it went through a more thorough testing and improvement cycle. The credit for this is not mine; all praise should go to Sergio Fernàndez and Carlos Tejo (both from CTIC Foundation, Spain), who decided to use the package in one of their internal projects. They uncovered some problems (of course…), and we then worked together to prepare a proper 1.0 release. It is my pleasure to consider them co-authors of this small package!

As before, the code is available from my site; the API documentation is included in the distribution (and is also available online). However, the project has also been moved to SourceForge, and is now available there, too (including the on-line documentation).

September 3, 2007

Yet another RDFa processor…

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 17:30

The summer months were quite relaxed, so at some point I decided to write an RDFa processor (in Python). I know, I could have used Elias Torres’ parser (also included in RDFLib), but my goal was a bit different. At the time, the RDFa task force was having long technical discussions on the details of the main RDFa parsing/processing rules, and I wanted to test whether those rules, as described at that moment, were correct and implementable (they were). And, while I was at it, I decided to properly finish the implementation to make it generally usable.

The result is a Python package (it can also be downloaded as a compressed tar file) which uses RDFLib to build up the graph as well as for the final serialization. To the best of my knowledge the parser follows the latest (not yet published :-( ) version of RDFa, and I definitely plan to keep it that way in the future. There is also a “distiller” that can be used on-line. The implementation (mainly for the distiller) is not complete: indeed, I should work on proper error handling rather than relying on Python’s xml minidom package simply throwing an exception in the user’s face for, say, an invalid XHTML file…
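As an illustration of the kind of error handling mentioned above, here is a minimal sketch in current Python 3. It is not the distiller's code: parse_xhtml is a hypothetical helper, and it only catches well-formedness errors reported by the expat parser behind minidom, not XHTML validity problems.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parse_xhtml(text):
    """Hypothetical helper: return (dom, None) on success, (None, message) on failure."""
    try:
        return minidom.parseString(text), None
    except ExpatError as err:
        # ExpatError carries the position at which the parser gave up
        message = "Input is not well-formed XML (line %d, column %d)" % (
            err.lineno,
            err.offset,
        )
        return None, message


dom, problem = parse_xhtml("<html><body></html>")
if problem is not None:
    print(problem)
```

A service could show the returned message to the user instead of a raw traceback.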

I also decided to test it on something more complicated, so I created an RDFa version of my foaf data. I now have an XHTML file with my foaf data that can be used (either via the distiller or directly using Python) to generate my RDF/XML foaf file. It shows one of the real advantages of RDFa: the foaf data mixes quite a number of different vocabularies, but that is absolutely no problem for something like RDFa. In any case, I do not intend to edit my foaf data in RDF/XML any more…
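To make the point about mixed vocabularies concrete, the following toy sketch (current Python 3, standard library only) pulls a few triples out of a made-up XHTML fragment. It is emphatically not the RDFa processing rules and not the package described here: it only understands an about attribute for the subject and a property attribute with a prefixed name, and the sample document, toy_triples and the other helper names are all invented for illustration.

```python
from xml.dom import minidom

# Made-up XHTML fragment mixing two vocabularies (FOAF and Dublin Core)
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <body>
    <div about="http://example.org/me#id">
      <span property="foaf:name">Jane Example</span>
      <span property="dc:description">A made-up person</span>
    </div>
  </body>
</html>"""


def prefixes(root):
    """Collect prefix -> namespace URI from the xmlns:* attributes of the root."""
    result = {}
    attrs = root.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        if attr.name.startswith("xmlns:"):
            result[attr.name[len("xmlns:"):]] = attr.value
    return result


def text_of(node):
    """Concatenate all text found below the given element."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


def toy_triples(xhtml):
    """Return (subject, predicate, literal) tuples; a toy, not real RDFa."""
    root = minidom.parseString(xhtml).documentElement
    ns = prefixes(root)
    triples = []

    def walk(node, subject):
        if node.hasAttribute("about"):
            subject = node.getAttribute("about")
        if node.hasAttribute("property") and subject is not None:
            prefix, _, local = node.getAttribute("property").partition(":")
            if prefix in ns:
                # expand the prefixed name into a full predicate URI
                triples.append((subject, ns[prefix] + local, text_of(node)))
        for child in node.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                walk(child, subject)

    walk(root, None)
    return triples


for triple in toy_triples(XHTML):
    print(triple)
```

Adding a third vocabulary only takes another xmlns declaration in the markup, which is the convenience the paragraph above describes.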

July 22, 2007

Yet another RDFa converter

Filed under: Code,Semantic Web,Work Related — Ivan Herman @ 9:31

I realized a week ago that Dave Beckett’s triplr tool (“Stuff in, triples out”) also includes an RDFa converter now, see his news item of 2007-07-17. Ie, I can now use the URI http://triplr.org/rdfa-rdf/http://rdfa.info/ to extract or refer to the RDF content from the RDFa info page’s RDFa statements. Of course (after all, this is Dave’s tool!) I could also put “turtle” in the URI instead of “rdf” to yield, well, turtle.

The converter, of course, is still based on the latest public release of the RDFa syntax, and many things will change as a result of the current work in the RDFa group (which has become really active in the last few months, so I think a new and significantly better release of the spec will come soon!). But I am sure an update of triplr will follow soon afterwards…

July 6, 2007

SPARQL Endpoint interface to Python

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 12:43

I played with SPARQL on my local machine, and I also got inspired by Lee’s SPARQL library for JavaScript. But, well, I prefer Python… So I made a set of utility classes, first for myself, but then I decided to package them more properly. Maybe others can find them useful, too.

The goal is to give some help in turning a SPARQL query into the corresponding HTTP GET request of the SPARQL Protocol, sending it to a SPARQL endpoint somewhere on the Web, and doing something with the results. The simplest usage is something like:

from SPARQL import SPARQLWrapper
queryString = "SELECT * WHERE { ?s ?p ?o. }"
sparql = SPARQLWrapper("http://localhost:2020/sparql")
# add a default graph, though that can also be done in the query string
sparql.addDefaultGraph("http://www.example.com/data.rdf")
sparql.setQuery(queryString)
try:
    # ret is a file-like object streaming the results in XML
    ret = sparql.query()
except Exception:
    # eg, a syntax error in the query; deal_with_the_exception() is a placeholder
    deal_with_the_exception()
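For readers who want to see what such a request looks like on the wire, here is a minimal sketch in current Python 3 using only the standard library. It is an illustration, not the package's own code: build_get_url is a made-up name, while query and default-graph-uri are the parameter names defined by the SPARQL Protocol.

```python
from urllib.parse import urlencode


def build_get_url(endpoint, query, default_graphs=()):
    """Hypothetical helper: encode a SPARQL query as an HTTP GET URL."""
    params = [("query", query)]
    # default-graph-uri may be repeated, once per default graph
    params.extend(("default-graph-uri", graph) for graph in default_graphs)
    return endpoint + "?" + urlencode(params)


url = build_get_url(
    "http://localhost:2020/sparql",
    "SELECT * WHERE { ?s ?p ?o. }",
    default_graphs=["http://www.example.com/data.rdf"],
)
print(url)
```

The resulting URL could then be fetched with any HTTP client; the wrapper hides this step together with the handling of the response.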

To make it even easier to use, the results can also be converted to more Python-friendly formats: eg, turn them into a proper DOM tree if the result is XML, use Bob Ippolito’s simplejson module to convert a JSON result into a Python dictionary, or parse them with RDFLib and return an RDFLib Graph if the result is in RDF/XML. Ie, one could have done:

import SPARQL  # for the SPARQL.JSON constant
try:
    sparql.setReturnFormat(SPARQL.JSON)
    ret = sparql.query()
    results = ret.convert()
except Exception:
    deal_with_the_exception()

where “results” is a Python dictionary. There are some more tricks in the library, but that is essentially it…
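What one does with that dictionary depends on the query, but for a SELECT query it follows the layout of the SPARQL Query Results JSON format: a head with the variable names and a list of bindings. The sketch below (current Python 3) walks such a structure; the sample data and the rows helper are made up for illustration and are not part of the package.

```python
# Made-up result in the SPARQL Query Results JSON layout
sample = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/me#id"},
                "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
                "o": {"type": "literal", "value": "Jane Example"},
            }
        ]
    },
}


def rows(result):
    """Hypothetical helper: yield one {variable: value} dict per solution."""
    variables = result["head"]["vars"]
    for binding in result["results"]["bindings"]:
        # a variable may be unbound in a given solution, so check first
        yield {v: binding[v]["value"] for v in variables if v in binding}


for row in rows(sample):
    print(row["s"], row["p"], row["o"])
```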

The code is available from my site; the API documentation is included in the distribution (and is also available online).

It is an early release. There are some problems, and I expect some more. I have primarily tested it with two different SPARQL endpoints running on my local machine (joseki3 and virtuoso) and also with some public SPARQL endpoints. The return media types differ for, eg, JSON or N3; the non-standard arguments (eg, setting the return format) still diverge a bit; etc. I would expect these to converge over time. However, I am sure that my code will have problems with at least some of the endpoints on those grounds (or others)…

March 30, 2007

Yet another converter to RSS1.0

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 17:19

One of the problems I have long had with the RSS feeds of the different blogging systems is the difficulty of generating an RSS feed for a specific category only. For example, this blog includes entries on, say, the Semantic Web, but also on Hungarian issues (possibly in Hungarian). Obviously, I would not want to export an RSS feed for a Semantic Web audience that would include those Hungarian entries… But it was impossible to do that with WordPress, for example.

I therefore wrote such a conversion script in Python. It really is only a wrapper around Mark Pilgrim’s excellent Universal Feed Parser. You can use this script as some sort of off-line converter or (via a separate small script) as a CGI script on your server. Just copy the Python files into the appropriate directories for your local setup. For convenience’s sake I have also added the source of the Universal Feed Parser to the distribution.
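The filtering step at the heart of such a script can be sketched in a few lines of current Python 3. This is only an illustration under assumptions, not the script itself: entries_in_category and the sample data are made up, and the entries are plain dictionaries shaped like those the Universal Feed Parser produces, where each entry has a "tags" list whose items carry a "term". Generating the RSS 1.0 output is left out.

```python
def entries_in_category(entries, category):
    """Hypothetical helper: keep only the entries tagged with the given category."""
    wanted = category.strip().lower()
    selected = []
    for entry in entries:
        # a tag's term may be missing or None, so fall back to an empty string
        terms = [(tag.get("term") or "").lower() for tag in entry.get("tags", [])]
        if wanted in terms:
            selected.append(entry)
    return selected


# Made-up entries mimicking the structure of parsed feed entries
sample_entries = [
    {"title": "RDFa Syntax LC is out",
     "tags": [{"term": "Python"}, {"term": "Semantic Web"}]},
    {"title": "A post in Hungarian",
     "tags": [{"term": "Hungarian"}]},
    {"title": "An untagged post"},
]

semweb_only = entries_in_category(sample_entries, "Semantic Web")
print([entry["title"] for entry in semweb_only])
```

With the real library installed, the list passed in would typically be the entries attribute of the object returned by feedparser.parse() for the blog's full feed.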
