Ivan’s private site

April 27, 2009

Simple OWL 2 RL service

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 14:12

The W3C OWL Working Group published a number of OWL 2 documents last week, including an updated version of the OWL 2 RL profile. I have already blogged about this profile (“Bridge Between SW communities: OWL RL”) when the previous release was published; there are no radical changes in this release, so there is no reason to repeat what was said there.

I have been playing with a simple and naive implementation of OWL 2 RL for a while; I have now decided to live dangerously;-) and release the software and the corresponding service. So… you can go to the OWL 2 RL generator service, give it an RDF graph, and see what RDF triples an OWL 2 RL system should generate. It should give you some idea of what OWL 2 RL is all about.

I cannot emphasize enough that this is not a production level tool. Beyond the bugs that I have not yet found, a proper implementation would, for example, optimize the owl:sameAs triples and, instead of storing them in the graph, would generate those on the fly when, say, a SPARQL request is issued. But my goal was not to produce something optimal; instead, I wanted to see whether OWL 2 RL can be implemented without any sophisticated tool or not. The answer is: yes it can. This also means that if I could do it, anybody with a basic knowledge of the underlying RDF environment and programming language (RDFLib and Python in this case) can do it, too. No need to be familiar with any complex algorithms, rule language implementation tricks, complicated external tools, description logic concepts, whatever…
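
To give a feeling for how little machinery is needed, here is a minimal sketch of the naive approach: apply the rules over and over until no new triple appears. It is not the code of the service. Plain tuples of strings stand in for an RDFLib graph, the `ex:` names are invented, and only three of the OWL 2 RL rules (cax-sco, scm-sco and eq-sym) are covered.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SAME_AS = "owl:sameAs"


def closure(triples):
    """Naively apply three OWL 2 RL rules until a fixpoint is reached."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            if p == SUBCLASS:
                for s2, p2, o2 in graph:
                    # cax-sco: (x type s), (s subClassOf o) => (x type o)
                    if p2 == RDF_TYPE and o2 == s:
                        new.add((s2, RDF_TYPE, o))
                    # scm-sco: subClassOf is transitive
                    if p2 == SUBCLASS and s2 == o:
                        new.add((s, SUBCLASS, o2))
            elif p == SAME_AS:
                # eq-sym: owl:sameAs is symmetric
                new.add((o, SAME_AS, s))
        if new <= graph:
            # nothing new was derived in this round, so we are done
            return graph
        graph |= new


# Hypothetical input data, for illustration only
data = {
    ("ex:Cat", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:tom", RDF_TYPE, "ex:Cat"),
    ("ex:tom", SAME_AS, "ex:thomas"),
}
inferred = closure(data) - data
```

A full implementation has many more rules (including the ones that propagate statements across owl:sameAs), which is exactly where the triple explosion mentioned above comes from.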

December 6, 2008

New Python releases

The fact that there are new Python releases is nothing new. But this time it is a bit different. While there is a new, 2.6.1 version of Python (which is “just” an upgrade), there is now also a 3.0 version (a.k.a. Python 3000). And Python 3.0 is not backward compatible with the older Python versions. Although the differences are not radical (see the “what is new?” page), it is still true that older Python applications may not run with Python 3.0.
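
A few of the better-known incompatibilities, as a small sketch written for Python 3 with the Python 2 behaviour noted in the comments. The values are arbitrary examples and the list is far from complete.

```python
# print is a function in Python 3; the Python 2 statement form
# "print 'hello'" is a syntax error there.
print("hello")

# "/" on two integers is true division in Python 3 (it floors in Python 2).
half = 1 / 2
floored = 1 // 2  # explicit floor division, the same in both lines

# str is Unicode text in Python 3; bytes is a separate type.
text = "café"
encoded = text.encode("utf-8")

# dict.keys() returns a view in Python 3, not a list as in Python 2.
keys = {"a": 1}.keys()
```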

I must admit that I am a bit skeptical about this move. I just do not want to spend my time changing my old Python applications to run on Python 3.0, even if they need further development, and I am probably not the only one. Of course, for the time being, I can get by, because the Python community plans to maintain the 2.X line in parallel with the 3.X line. But for how long?

The beauty of Python was (and still is) its simplicity and, compared to many other programming languages, its ease of use. It has already grown a little bit too complex for my taste in the past few years (e.g., I have never really grasped the big importance of, say, decorators and I never used those), but I could safely ignore those features if I wanted. As far as I am concerned, none of the new, incompatible features in Python 3000 warranted such a radical change (well, maybe the better handling of Unicode makes a major difference). I am a little bit afraid that the Python community has shot itself in the foot with this move, which may become a maintainers’ nightmare. I am happy to be proven wrong, though…

February 21, 2008

RDFa Syntax LC is out

Filed under: Python, Semantic Web, Work Related — Ivan Herman @ 19:41

The RDFa Syntax Last Call document has just been published; yay!

I have also updated the RDFa processor that I coded last summer; it is now available for download and is also used by the “RDFa Distiller” service page. I have played with RDFa in practical terms, too; my foaf file in HTML, the W3C SW Activity Home page, and the Semantic Web FAQ page are all annotated with RDFa now. Once one is used to it, it is fairly straightforward to add even complex RDF statements to HTML pages with an arbitrarily large number of different vocabularies mixed in. Of course, authoring tools would be good, but let us take things one step at a time… Having the Last Call published (ie, the Working Group believes it has taken care of all technical issues) is a major step forward!

B.t.w., Benjamin Nowack jumped on the SW-FAQ RDF file to make a nice little hack; here is the mail he sent on the SW SWEO list the other day:

Heh, silly stuff, just FYI: On the #foaf channel is foafbot (a SPARQLy reincarnation
of an earlier bot we had there years ago). It understands RDFa, and allows the
specification of custom commands at [1]. I made it load Ivan’s FAQ, and
created an “faq” command, so that you can now pass a keyword or phrase
to the bot and it will respond with a pointer to the FAQ (if something
matched the RDFa-encoded question), e.g.:

<bengee> foafbot, faq giant ontology

<foafbot> bengee, see http://www.w3.org/2001/sw/SW-FAQ#whgiantont

;)

Benji

[1] http://semsol.org/semcamp/sparqlbot

Isn’t that cool? As far as I could see, it took him about 10 minutes to add this hack, thanks to the SW-FAQ being in RDF…

February 14, 2008

New version of the SPARQL Python wrapper

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 13:49

About half a year ago I announced the availability of a SPARQL endpoint interface to Python. It was really a beta release back then (ie, last July), but recently it went through a more thorough testing and improvement cycle. The credit for this is not mine; all praise should go to Sergio Fernàndez and Carlos Tejo (both from CTIC Foundation, Spain), who decided to use the package in one of their internal projects. They revealed some problems (of course…), and we then worked together to prepare a proper 1.0 release. It is my pleasure to consider them co-authors of this small package!

As before, the code is available from my site; the API documentation is included in the distribution (and is also available online). However, the project has also been moved to SourceForge, and is now available there, too (including the online documentation).

September 3, 2007

Yet another RDFa processor…

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 17:30

The summer months were quite relaxed, so at some point I decided to write an RDFa processor (in Python). I know, I could have used Elias Torres’ parser (also included in RDFLib), but my goal was a bit different. It was at a time when the RDFa task force had long technical discussions on the details of the main RDFa parsing/processing rules, and I wanted to test whether those rules, as described at that moment, were correct and implementable (they were). And, while I was at it, I decided to properly finish up the implementation to make it generally usable.

The result is a Python package (it can also be downloaded as a compressed tar file) which uses RDFLib to build up the graph as well as for final serialization. To the best of my knowledge the parser follows the latest (not yet published:-( version of RDFa, and I definitely plan to keep it that way in the future. There is also a “distiller” that can be used on-line. The implementation (mainly for the distiller) is not complete: indeed, I should work on proper error handling rather than relying on Python’s xml minidom package simply throwing an exception in the user’s face for, say, an invalid XHTML file…
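
As a sketch of what friendlier error handling could look like (this is not what the distiller does), the minidom call can be wrapped so that a malformed document produces a readable message instead of a traceback. The helper name `parse_xhtml` and the message format are invented for this example, and the code is written for Python 3.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError, ErrorString


def parse_xhtml(source):
    """Return (document, None) on success or (None, message) on bad XML."""
    try:
        return minidom.parseString(source), None
    except ExpatError as err:
        # ExpatError carries the position and an error code that
        # ErrorString() turns into a human-readable explanation.
        message = "Invalid XHTML at line %d, column %d: %s" % (
            err.lineno, err.offset, ErrorString(err.code))
        return None, message


# A well-formed and a broken document, for illustration only
good_doc, good_error = parse_xhtml("<html><body><p>ok</p></body></html>")
bad_doc, bad_error = parse_xhtml("<html><body><p>unclosed</body></html>")
```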

I also decided to test it on something more complicated, so I created an RDFa version of my foaf data. I now have an XHTML file with my foaf data that can be used (either via the distiller or directly using Python) to generate my RDF/XML foaf file. It shows one of the real advantages of RDFa: the foaf data mixes quite a number of various vocabularies, but that is absolutely no problem for something like RDFa. In any case, I do not intend to edit my foaf data in RDF/XML any more…
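
To illustrate the point about mixed vocabularies, here is a deliberately tiny toy, which is neither my processor nor an implementation of the real RDFa processing rules. It assumes a single subject given by `about` on the root element, prefixes declared on that same element, and plain text content only. The sample markup and the function `toy_triples` are made up for the example.

```python
from xml.dom import minidom

SAMPLE = """<div xmlns:foaf="http://xmlns.com/foaf/0.1/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/me#me">
  <span property="foaf:name">Jane Doe</span>
  <span property="dc:description">An example person</span>
</div>"""


def toy_triples(source):
    """Collect (subject, predicate URI, literal) from a toy RDFa fragment."""
    root = minidom.parseString(source).documentElement
    # Gather the prefix declarations made on the root element
    prefixes = {}
    for i in range(root.attributes.length):
        attr = root.attributes.item(i)
        if attr.name.startswith("xmlns:"):
            prefixes[attr.name[len("xmlns:"):]] = attr.value
    subject = root.getAttribute("about")
    triples = []
    for element in root.getElementsByTagName("*"):
        curie = element.getAttribute("property")
        if not curie:
            continue
        # Expand "prefix:local" into a full URI
        prefix, _, local = curie.partition(":")
        text = "".join(node.data for node in element.childNodes
                       if node.nodeType == node.TEXT_NODE)
        triples.append((subject, prefixes[prefix] + local, text))
    return triples


triples = toy_triples(SAMPLE)
```

Each vocabulary just needs its own prefix declaration; the statements themselves sit side by side in the markup.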

July 6, 2007

SPARQL Endpoint interface to Python

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 12:43

I played with SPARQL on my local machine, and I also got inspired by Lee’s SPARQL library for JavaScript. But, well, I prefer Python… So I made a set of utility classes, first for myself, but then I decided to package them more properly. Maybe others can find them useful, too.

The goal is to give some help in turning a SPARQL query into the corresponding HTTP GET request, sending it to a SPARQL endpoint somewhere on the Web, and doing something with the results. The simplest usage is something like:

from SPARQL import SPARQLWrapper

queryString = "SELECT * WHERE { ?s ?p ?o. }"
sparql = SPARQLWrapper("http://localhost:2020/sparql")
# add a default graph, though that can also be done in the query string
sparql.addDefaultGraph("http://www.example.com/data.rdf")
sparql.setQuery(queryString)
try:
    # ret is a file-like object streaming the results in XML
    ret = sparql.query()
except Exception:
    # eg, a syntax error in the query
    deal_with_the_exception()
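
For illustration, the first step mentioned above (turning a query into an HTTP GET request) can be sketched with the standard library alone. This is not the package's code: `build_query_url` is a hypothetical helper written for Python 3, and it assumes the `query` and `default-graph-uri` parameter names defined by the SPARQL protocol.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, query, default_graphs=()):
    """Build the GET URL for a SPARQL query (no request is sent)."""
    params = [("query", query)]
    # One default-graph-uri parameter per default graph
    params.extend(("default-graph-uri", graph) for graph in default_graphs)
    # urlencode takes care of escaping spaces, braces, question marks, etc.
    return endpoint + "?" + urlencode(params)


url = build_query_url(
    "http://localhost:2020/sparql",
    "SELECT * WHERE { ?s ?p ?o. }",
    default_graphs=["http://www.example.com/data.rdf"],
)
```

The resulting URL could then be opened with, for example, `urllib.request.urlopen`, which returns a file-like object much like the `ret` above.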

To make it even easier to use, conversions to more Python-friendly formats can also be done on the results: eg, turn it into a proper DOM tree if the result is XML, use Bob Ippolito’s simplejson module to convert a return format in JSON into a Python dictionary, or parse it with RDFLib and return an RDFLib Graph in case the return is in RDF/XML. Ie, one could have done:

import SPARQL  # for the SPARQL.JSON constant

try:
    sparql.setReturnFormat(SPARQL.JSON)
    ret = sparql.query()
    results = ret.convert()
except Exception:
    deal_with_the_exception()

where “results” is a Python dictionary. There are some more tricks in the library, but that is essentially it…
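
As a rough idea of what such a dictionary looks like and how it can be used, here is a sketch that uses the standard `json` module on a hand-written response in the SPARQL JSON results format. The sample data and the helper `bindings_to_rows` are made up for illustration and are not part of the package.

```python
import json

# A hand-written response, following the SPARQL JSON results layout
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "p": {"type": "uri", "value": "http://example.org/knows"},
     "o": {"type": "literal", "value": "b"}}
  ]}
}
"""


def bindings_to_rows(response_text):
    """Flatten the result bindings into one {variable: value} dict per row."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    return [
        # A variable may be unbound in a given solution, hence the check
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


rows = bindings_to_rows(SAMPLE_RESPONSE)
```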

The code is available from my site; the API documentation is included in the distribution (and is also available online).

It is an early release. There are some problems, and I expect some more. I have primarily tested it with two different SPARQL endpoints running on my local machine (joseki3 and virtuoso) and also with some public SPARQL endpoints. There are some differences in the returned media type for, eg, JSON or N3; the non-standard arguments (eg, setting the return format) still diverge a bit; etc. But I would expect these to converge over time. However, I am sure that my code will have problems with some of the endpoints, at least on those grounds (or others)…

March 30, 2007

Yet another converter to RSS1.0

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 17:19

One of the problems I have had for a long time with the RSS feeds of the different blogging systems is the difficulty of generating an RSS feed for a specific category only. For example, this blog includes entries on, say, the Semantic Web, but also on Hungarian issues (possibly in Hungarian). Obviously, I would not want to export an RSS feed for a Semantic Web audience that would include those Hungarian entries… But it was impossible to do that with WordPress, for example.

I therefore wrote such a conversion script in Python. It really is only a wrapper around Mark Pilgrim’s excellent universal feed parser. You can use this script as some sort of off-line converter or (via a separate small script) as a CGI script on your server. Just copy the Python files into the appropriate directories corresponding to your local setup. For convenience’s sake I have also added the source of the universal feed parser to the distribution.
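
The core idea, keeping only the entries of one category, can be sketched as follows. Note that this is not the script itself: instead of the universal feed parser it uses the standard library's ElementTree, it assumes a simple RSS 2.0 input with `category` elements, and the sample feed and the function `items_in_category` are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up feed mixing two audiences
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<item><title>OWL 2 RL</title><category>Semantic Web</category></item>
<item><title>Budapest</title><category>Hungary</category></item>
<item><title>RDFa</title><category>Semantic Web</category><category>Python</category></item>
</channel></rss>"""


def items_in_category(feed_xml, category):
    """Return the titles of the feed items filed under the given category."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        item.findtext("title")
        for item in channel.findall("item")
        if category in [c.text for c in item.findall("category")]
    ]


semantic_web_titles = items_in_category(SAMPLE_FEED, "Semantic Web")
```

The selected items would then still have to be serialized in the target format (RSS 1.0 in the case of the script).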
