Ivan’s private site

March 30, 2007

Yet another converter to RSS1.0

Filed under: Code,Python,Semantic Web,Work Related — Ivan Herman @ 17:19

One of the problems I have had for a long time with the RSS feeds of the different blogging systems is the difficulty of generating an RSS feed for a specific category only. For example, this blog includes entries on, say, the Semantic Web, but also on Hungarian issues (possibly in Hungarian). Obviously, I would not want to export an RSS feed for a Semantic Web audience that would include those Hungarian entries… But it was impossible to do that with WordPress, for example.

I therefore wrote such a conversion script in Python. It really is only a wrapper around Mark Pilgrim’s excellent Universal Feed Parser. You can use this script as some sort of an off-line converter or (via a separate small script) as a CGI script on your server. Just copy the Python files into the appropriate directories corresponding to your local setup. For convenience’s sake I have also added the source of the Universal Feed Parser to the distribution.
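To illustrate the idea (this is not the script itself, only a minimal sketch), here is how filtering entries by category might look. It assumes entries shaped like the ones the Universal Feed Parser returns, i.e., dictionaries with a `tags` list whose items carry a `term`. The function names and the sample data are hypothetical, and the sketch only does the filtering; it does not write out RSS 1.0.

```python
def entry_categories(entry):
    """Return the set of category terms attached to one feed entry."""
    return {tag.get("term") for tag in entry.get("tags", []) if tag.get("term")}


def filter_by_category(entries, category):
    """Keep only the entries tagged with `category` (case-insensitive)."""
    wanted = category.lower()
    return [
        entry for entry in entries
        if wanted in {term.lower() for term in entry_categories(entry)}
    ]


# Hypothetical sample entries, shaped like the dictionaries feedparser returns.
SAMPLE_ENTRIES = [
    {"title": "Yet another converter to RSS1.0",
     "tags": [{"term": "Python"}, {"term": "Semantic Web"}]},
    {"title": "Gherkin building",
     "tags": [{"term": "General"}, {"term": "Private"}]},
]

if __name__ == "__main__":
    # Print the titles of the entries filed under "Semantic Web".
    for entry in filter_by_category(SAMPLE_ENTRIES, "semantic web"):
        print(entry["title"])
```

With the real parser installed, the entries would presumably come from `feedparser.parse(url).entries` instead of the sample list.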

Taxonomy of musical instruments in SKOS

Filed under: Semantic Web,Work Related — Ivan Herman @ 16:27

Frédéric Giasson has published a new version of the Music Ontology (revision 1.11). One of the discussions on the mailing list concerned the taxonomy used for musical instruments (both classic and not). The current Music Ontology leaves this open; users can refer to their own taxonomy.

At the time of the discussion I made a conversion of a taxonomy on MusicBrainz (I hope I used the most up-to-date version). I converted it into SKOS, which seems to be the most appropriate format for something like that. I created a purl URI today; ie, here is a stable namespace for it: http://www.purl.org/net/MusicInstruments.

Of course, any update request is welcome…

Rectification, 2007-04-06: we agreed with Frédéric that it would be better to use another namespace for the taxonomy that would be closer to the music ontology itself. The final namespace for the taxonomy is therefore: http://purl.org/ontology/mo/mit#.
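For readers who have not seen SKOS, here is a minimal sketch of what such a taxonomy looks like, generated as Turtle from a small list of concepts. The two instruments, their identifiers and the `to_turtle` function are hypothetical illustrations, not taken from the published taxonomy; only the two namespaces (SKOS and the final one above) are real.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
MIT = "http://purl.org/ontology/mo/mit#"

# Hypothetical sample: (concept id, label, id of the broader concept or None).
SAMPLE = [
    ("String_instruments", "String instruments", None),
    ("Violin", "Violin", "String_instruments"),
]


def to_turtle(concepts):
    """Serialize (id, label, broader) triples as SKOS concepts in Turtle."""
    lines = [f"@prefix skos: <{SKOS}> .", f"@prefix mit: <{MIT}> .", ""]
    for concept_id, label, broader in concepts:
        lines.append(f"mit:{concept_id} a skos:Concept ;")
        if broader is None:
            # Top-level concept: the label closes the statement.
            lines.append(f'    skos:prefLabel "{label}"@en .')
        else:
            lines.append(f'    skos:prefLabel "{label}"@en ;')
            lines.append(f"    skos:broader mit:{broader} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(SAMPLE))
```

The hierarchy is carried by `skos:broader` only; no OWL class structure is implied.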

March 17, 2007

Gherkin building

Filed under: General,Private — Ivan Herman @ 18:50

The Gherkin building

While I was in London I decided to have a closer look at the 30 St Mary Axe building, also known as the “Gherkin building”. I had seen it in photos and on TV before, and I was curious how it looks in reality. I must say I liked it. It is very unusual to see such rounded forms in the middle of the city, and even the fact that it is surrounded by other, relatively low buildings did not spoil the overall impression. I was lucky to have some sunshine, which resulted in all kinds of reflections of the surrounding buildings in the Gherkin building itself, as well as the reflection of the Gherkin building in the surrounding buildings. Unfortunately, for security reasons, I was not allowed to go into the building; a real pity. I would have liked to see that.

Infotech for Pharma & Biotech

Filed under: Semantic Web,Work Related — Ivan Herman @ 13:50

I spent a few days in London, UK, at the Infotech for Pharma & Biotech conference. I must admit that, sometimes, I felt like I was back at university, trying (in vain…) to understand what the talk was all about. After all, this is not really my world… Nevertheless, one of the things I saw (again!) is that this community has indeed immense data integration problems. Immense both in terms of the amount of data and the amount of money… Hence their real interest in Semantic Web technologies, too (witness the very active Interest Group on Health Care and Life Sciences at W3C). And that is also why I was invited, ie, to give a small introduction to the Semantic Web. It was pretty well received, by the way; I had a series of nice discussions with people afterwards.

It was really good that my talk was followed by a presentation by Giles Day, from Pfizer. He talked about an annotation system they are working on that would provide common annotations on all kinds of data formats (internal wikis, Word documents, web sites, specialized drug discovery tools, etc). The annotations are then collected in an RDF triple store, queried and analyzed by semantic agents and mining tools. The goal is to make the lifecycle of drug discovery more efficient. One of his interesting remarks was: “without ontologies we are doomed to failure”, ie, just using some general tagging mechanism would not be appropriate. This reminded me of something I said earlier, ie, that the Web today is really a collection of very different communities; what may work very well for one community (loose tagging in this case) does not necessarily work for others…

March 11, 2007

And what about SKOS? (re: Freebase discussion)

Filed under: Semantic Web,Work Related — Ivan Herman @ 11:04

There were quite a number of blog posts last weekend on Freebase and on Tim O’Reilly’s blog on it. His remark on the Semantic Web has led to a number of replies, too, eg, from Jim Hendler, Danny Ayers, Shelley Powers, Kingsley Idehen, Henry Story, and others that I may forget (sorry to those). The sentence that people objected to is: “But unlike the W3C approach to the semantic web, which starts with controlled ontologies, Metaweb adopts a folksonomy approach, in which people can add new categories (much like tags), in a messy sprawl of potentially overlapping assertions.”

It always strikes me when people seem to equate the usage of W3C’s Semantic Web with some sort of a mythical, centrally managed and controlled ontology. As if the goal of W3C were to become some sort of a Big Brother on the Web… I do not want to repeat all the arguments that have already been blogged (there is nothing in the ones cited above that I would disagree with). Jim, among other things, already referred to the OWL FAQ; let me also refer to the SW FAQ (still a draft, but hopefully useful nevertheless). Some of these questions are addressed in both.

However, beyond what is said in those FAQs, let me also refer to SKOS here. Not yet a finished technology, true, but quite mature already (a new version should be published pretty soon, by the way). It strikes me that SKOS may be a potentially interesting technology to organize the tags that Freebase intends to use. It does not impose a strong logical structure like OWL, but gives a way of “structuring” the tags that might be very useful in, say, linking them to other tagging systems outside of Freebase. It is also possible to put it into a smart user interface. If I were active in the development of Freebase, I would certainly have a look. Just as Shelley puts it: “…In other words, MetaWeb could have used RDF to implement it’s functionality, and none of us would know and most people wouldn’t care. There is nothing in what the W3C has proposed that’s counter to anything MetaWeb hopes to achieve.” The same holds for SKOS!

Finally… What strikes me, re-reading Tim’s blog, is that the core of the article is an enthusiastic description of Freebase, which I have absolutely no problem with. That remark on W3C’s Semantic Web does not add anything to the article’s core message whatsoever. I just wonder why it was necessary to put that remark into the article in the first place… I really do not believe looking for confrontation at all costs and all the time is beneficial for the Web (whichever version we are talking about). But that may be only me…

March 9, 2007

“Anyone who is not blind and not stupid cannot confuse the two”

Filed under: Hungary,Private — Ivan Herman @ 12:59

(Moved to my Hungarian Blog)

March 3, 2007

On “Where is XML Going?”

Filed under: Semantic Web,Work Related — Ivan Herman @ 9:30

A few days ago Kurt Cagle posted a blog entitled “Where is XML Going”. He lists a number of technologies and trends of interest in the XML development community. There is nothing in his post I would disagree with (well… I am always a bit cautious when RDF is presented as part of XML; this is not really true and this strong association has done a lot of harm to the image of RDF in the past. Oh well…). He refers to XSLT2, XQuery, XHTML (note the ‘X’!), XForms, SVG, etc., even RDFa (!) as technologies to watch for.

What caught my attention was not really the content of the blog itself. Instead, it made me realize again how diverse the Web has become. Kurt clearly represents and refers to the XML development community; however, we all know that there are large Web communities out there that frown upon anything that begins with an ‘X’ and would prefer it never existed. And anything in between these two… Which is, in fact, all right: all major fields diversify over the years. Although there are some fundamental architectural principles that underpin the Web (and we can be grateful to the W3C TAG for reminding us of those time and time again), different communities with different views, working in different worlds (corporate intranets, social networks, digital libraries, you-name-it) do and should coexist. It is just that none of the different communities should decide that they represent the Web, and that all others should be thrown out…

When working on the Semantic Web, mainly on messaging (eg, in groups like the SWEO IG), we should always remind ourselves of this. The Semantic Web has the potential to be applicable in very different areas and application domains, and we should build bridges to as many of the different communities as possible, recognizing their differences, not concentrating on one only (whichever it is). It is not easy: the different fields begin to have different cultures and, consequently, the messaging should be different and adapted to those but, well, that is the way it is…

(It reminds me of my previous life, when I used to work in Computer Graphics. 25 years ago, when I began to work there, it was one discipline; today, it has split into diverse fields like realistic and non-realistic rendering, information and scientific visualization, animation, etc, etc. And yes, each of those communities had the tendency to claim they represented Computer Graphics, and that the others were aliens at best or bastardising the field at worst. And I do not think any of those communities were right. Nevertheless, some common principles still bind these together even if it is not always easy; associations like ACM SIGGRAPH or Eurographics have a major role to play in that.)
