Ivan’s private site

February 14, 2008

New version of the SPARQL Python wrapper

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 13:49

About half a year ago I announced the availability of a SPARQL endpoint interface to Python. It was really a beta release back then (i.e., last July), but it has recently gone through a more thorough testing and improvement cycle. The credit is not mine; all praise should go to Sergio Fernández and Carlos Tejo (both from CTIC Foundation, Spain), who decided to use the package in one of their internal projects. They uncovered some problems (of course…), and we then worked together to prepare a proper 1.0 release. It is my pleasure to consider them co-authors of this small package!

As before, the code is available from my site; the API documentation is included in the distribution (and is also available online). However, the project has also been moved to SourceForge, and is now available there, too (including the online documentation).

September 3, 2007

Yet another RDFa processor…

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 17:30

The summer months were quite relaxed, so at some point I decided to write an RDFa processor (in Python). I know, I could have used Elias Torres’ parser (also included in RDFLib), but my goal was a bit different. At the time the RDFa task force was having long technical discussions on the details of the main RDFa parsing/processing rules, and I wanted to test whether those rules, as described at that moment, were correct and implementable (they were). And, while I was at it, I decided to finish up the implementation properly to make it generally usable.

The result is a Python package (it can also be downloaded as a compressed tar file) which uses RDFLib to build up the graph as well as for final serialization. To the best of my knowledge the parser follows the latest (not yet published :-( ) version of RDFa, and I definitely plan to keep it that way in the future. There is also a “distiller” that can be used online. The implementation (mainly for the distiller) is not complete: indeed, I should work on proper error handling rather than relying on Python’s xml.dom.minidom module simply throwing an exception in the user’s face for, say, an invalid XHTML file…
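
To illustrate the kind of error handling mentioned above, here is a minimal sketch in present-day Python 3. It is not the code in the package: the function name `parse_xhtml` and the `InvalidXHTMLError` class are hypothetical, invented for this example. It relies only on the fact that minidom raises `ExpatError` for input that is not well-formed XML; it does not check XHTML validity beyond that.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class InvalidXHTMLError(Exception):
    """Hypothetical error type for this sketch, not part of the RDFa package."""


def parse_xhtml(source):
    """Parse an XHTML string into a DOM tree, or raise InvalidXHTMLError."""
    try:
        return minidom.parseString(source)
    except ExpatError as err:
        # ExpatError records where the first well-formedness problem was found
        raise InvalidXHTMLError(
            "Input is not well-formed XML (line %d, column %d)"
            % (err.lineno, err.offset)
        ) from err
```

A caller such as the distiller could then catch `InvalidXHTMLError` and show its message to the user instead of a traceback.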

I also decided to test it on something more complicated, so I created an RDFa version of my foaf data. I now have an XHTML file with my foaf data that can be used (either via the distiller or directly using Python) to generate my RDF/XML foaf file. It shows one of the real advantages of RDFa: the foaf data mixes quite a number of different vocabularies, but that is absolutely no problem for something like RDFa. In any case, I do not intend to edit my foaf data in RDF/XML any more…

July 6, 2007

SPARQL Endpoint interface to Python

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 12:43

I played with SPARQL on my local machine, and I also got inspired by Lee’s SPARQL library for JavaScript. But, well, I prefer Python… So I first made a set of utility classes for myself, but then decided to package them more properly. Maybe others can find them useful, too.

The goal is to give some help in turning a SPARQL query into the corresponding HTTP GET request, sending it to a SPARQL endpoint somewhere on the Web, and doing something with the results. The simplest usage is something like:

from SPARQL import SPARQLWrapper
queryString = "SELECT * WHERE { ?s ?p ?o. }"
sparql = SPARQLWrapper("http://localhost:2020/sparql")
# add a default graph, though that can also be done in the query string
sparql.addDefaultGraph("http://www.example.com/data.rdf")
sparql.setQuery(queryString)
try:
    ret = sparql.query()  # ret is a file-like object streaming the results in XML
except Exception:
    deal_with_the_exception()  # e.g., a syntax error in the query
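
To show what “turning a SPARQL query into an HTTP GET” amounts to, here is a minimal standalone sketch in present-day Python 3. It is not the wrapper's own code, and the helper name `build_query_url` is hypothetical. The parameter names `query` and `default-graph-uri` are those defined by the SPARQL Protocol; the endpoint and graph URLs are the same placeholders as in the example above.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, query, default_graphs=()):
    """Return the GET URL for a SPARQL query (hypothetical helper)."""
    # "query" carries the query text; "default-graph-uri" may be repeated,
    # once per default graph.
    params = [("query", query)]
    params.extend(("default-graph-uri", graph) for graph in default_graphs)
    # urlencode percent-encodes the values so they are safe in a URL
    return endpoint + "?" + urlencode(params)


url = build_query_url(
    "http://localhost:2020/sparql",
    "SELECT * WHERE { ?s ?p ?o. }",
    ["http://www.example.com/data.rdf"],
)
print(url)
```

Fetching that URL (for instance with `urllib.request.urlopen`) would return the same kind of file-like result stream, assuming an endpoint is actually listening at that address.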

To make it even easier to use, conversions to more Python-friendly formats can also be done on the results: e.g., turn the result into a proper DOM tree if it is XML, use Bob Ippolito’s simplejson module to convert a JSON result into a Python dictionary, or parse it with RDFLib and return an RDFLib Graph if the result is in RDF/XML. I.e., one could have done:

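import SPARQL  # the earlier import only brought in SPARQLWrapper; this gives access to SPARQL.JSON
try:
    sparql.setReturnFormat(SPARQL.JSON)
    ret = sparql.query()
    results = ret.convert()  # avoid naming this "dict", which would shadow the built-in
except Exception:
    deal_with_the_exception()

where “results” is a Python dictionary. There are some more tricks in the library, but that is essentially it…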
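
For a feel of what such a converted dictionary looks like, here is a small sketch. It assumes the endpoint answers in the standard SPARQL JSON results layout, and it uses the standard library `json` module (a descendant of simplejson) rather than the wrapper itself. The sample response is hand-written for this illustration, not real endpoint output.

```python
import json

# A tiny, made-up response in the SPARQL JSON results format
raw = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://www.example.com/me"},
     "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
     "o": {"type": "literal", "value": "Example Name"}}
  ]}
}
"""

results = json.loads(raw)

# Flatten each binding into a plain {variable: value} dictionary;
# a variable may be unbound in a given row, hence the membership check.
rows = [
    {var: binding[var]["value"]
     for var in results["head"]["vars"] if var in binding}
    for binding in results["results"]["bindings"]
]
print(rows)
```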

The code is available from my site; the API documentation is included in the distribution (and is also available online).

It is an early release. There are some problems, and I expect some more. I have primarily tested it with two different SPARQL endpoints running on my local machine (Joseki3 and Virtuoso) and also with some public SPARQL endpoints. There are some differences in the returned media type for, e.g., JSON or N3, the non-standard arguments (e.g., for setting the return format) still diverge a bit, etc. But I would expect these to converge over time. However, I am sure that my code will have problems with some of the endpoints, at least on those grounds (or others)…
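
One way to cope with diverging media types is to be tolerant when reading the `Content-Type` header. The sketch below is an assumption about how that could be done, not what the library does: the function name `guess_format`, the short format names, and the table of media types are illustrative, and the table is certainly not exhaustive.

```python
# Media types an endpoint might plausibly send for each result format.
# Illustrative only: real endpoints may use others.
KNOWN_MEDIA_TYPES = {
    "application/sparql-results+xml": "xml",
    "application/sparql-results+json": "json",
    "application/json": "json",
    "application/rdf+xml": "rdf",
    "text/n3": "n3",
    "text/rdf+n3": "n3",
}


def guess_format(content_type, default="xml"):
    """Map a Content-Type header value to a short format name."""
    # Drop parameters such as "; charset=utf-8" and normalise the case
    media_type = content_type.split(";", 1)[0].strip().lower()
    return KNOWN_MEDIA_TYPES.get(media_type, default)


print(guess_format("application/sparql-results+json; charset=UTF-8"))
```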

March 30, 2007

Yet another converter to RSS1.0

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 17:19

One of the problems I have long had with the RSS feeds of the different blogging systems is the difficulty of generating an RSS feed for a specific category only. For example, this blog includes entries on, say, the Semantic Web, but also on Hungarian issues (possibly in Hungarian). Obviously, I would not want to export an RSS feed for a Semantic Web audience that would include those Hungarian entries… But it was impossible to do that with WordPress, for example.

I therefore wrote such a conversion script in Python. It is really only a wrapper around Mark Pilgrim’s excellent Universal Feed Parser. You can use this script as a sort of offline converter or (via a separate small script) as a CGI script on your server. Just copy the Python files into the appropriate directories for your local setup. For convenience’s sake I have also added the source of the Universal Feed Parser to the distribution.
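
The category filtering at the heart of the idea can be sketched as follows. This is not the script from the distribution: it swaps the Universal Feed Parser for the standard library's ElementTree so that it runs without extra packages, it only handles a plain RSS 2.0 layout, and it stops short of producing RSS 1.0. The feed content and the function name `titles_in_category` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed with two entries in different categories
FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item><title>RDFa notes</title><category>Semantic Web</category></item>
  <item><title>Budapest</title><category>Hungarian</category></item>
</channel></rss>"""


def titles_in_category(feed_xml, category):
    """Return the titles of the items tagged with the given category."""
    root = ET.fromstring(feed_xml)
    return [
        item.findtext("title")
        for item in root.iter("item")
        # an item may carry several <category> elements
        if category in [c.text for c in item.findall("category")]
    ]


print(titles_in_category(FEED, "Semantic Web"))
```

A full converter would then write the selected items out again in the target feed format instead of just listing their titles.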
