Ivan’s private site

October 28, 2009

ISWC2009 2-3

Second day

In fact, there is much less to say… In the morning I was at two workshops; I was at the Uncertainty Reasoning on the SW one for a while, but then I was asked to participate in a panel at the Semantics for the Rest of Us one, so I had to switch. This was a bit unfortunate, because I could not really ‘dive in’ to either of the two. And my afternoon was taken up by ‘networking’, catching up with some people on many, many issues that are not worth blogging about (yet?).

I listened to Kathryn Laskey’s presentation on how to combine probability theory in the mathematical sense (the good old Kolmogorov axiomatic theory on probability that I learned at university in a distant past…) with first order logic. I cannot claim to have really understood all the details but it made me curious enough to put reading her paper on my to do list…

As for the panel “Little vs Large Semantics: What’s next for the Semantic Web languages?”, with Leigh, Kendall and Ora on the panel besides me… it was not that exciting, I must admit. Maybe the main message I took away from it was the passionate request of Chris Welty to re-open RDF (see also Pat’s keynote below!).

Third day (well, first real conference day)

Preamble: I would have wanted to add links to papers. And I couldn’t: I have not found the papers on the Web. Neither on Springer’s site nor elsewhere. I may have missed a reference somewhere, if somebody knows then tell me. But if the papers are not available, I think it is a shame…

The conference began with a keynote by Pat Hayes. Entertaining and also thought provoking; Pat is a great speaker. What really interested me is his talk on ‘RDF Redux’; I was actually anxious to listen to that one at SemTech last June but he had to call it off back then. So he repeated it here. This is typically the kind of talk that needs more thinking afterwards to understand it (and Pat has promised to write it down!), but he essentially proposed to re-think and re-do some of the fundamentals of RDF semantics. Instead of the set-based model theory which we have today, and which makes the treatment of b-nodes, shall we say, a bit complicated (some would use harsher words:-) we should consider RDF graphs as ‘things’ on a ‘surface’ (think of it as a real surface on a sheet of paper) and b-nodes are just ‘scratches on that surface’. (A bit like the ‘context’ of a graph?) Because these surfaces are different from one graph to the other, when a merge occurs a new surface is in fact created where the unified graph is put, and the issue of b-nodes becomes natural (instead of the ‘renaming’ procedure that the current semantics document describes). Pat claims that the whole semantics could be re-written that way and none of the current RDF implementations would change. But one can go one step further: there may be different kinds of surfaces (eg, negations) and surfaces can have a name (a bit like named graphs) and all can be put together to provide a powerful semantics for these entities. His further claim was that such an extended semantics of RDF could be powerful enough to describe, conceptually, RDFS or even OWL, ie, the semantics should not be layered any more.
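
For reference, the ‘renaming’ step that the current semantics requires when merging graphs can be sketched as follows. This is only a minimal illustration in Python, not Pat’s surface proposal: triples are plain tuples, blank nodes are assumed to be strings starting with `_:`, and the helper names `is_bnode` and `merge_graphs` are made up for this sketch.

```python
from itertools import count


def is_bnode(term):
    """Blank nodes are modelled as strings starting with '_:' (an assumption of this sketch)."""
    return isinstance(term, str) and term.startswith("_:")


def merge_graphs(*graphs):
    """Merge graphs, giving the blank nodes of each graph fresh labels first.

    Without this renaming, two unrelated graphs that both happen to use
    the label '_:x' would wrongly end up sharing a single node.
    """
    fresh = count()
    merged = set()
    for graph in graphs:
        renaming = {}  # local to one graph: a label only means something inside it
        for triple in graph:
            new_triple = []
            for term in triple:
                if is_bnode(term):
                    if term not in renaming:
                        renaming[term] = f"_:b{next(fresh)}"
                    term = renaming[term]
                new_triple.append(term)
            merged.add(tuple(new_triple))
    return merged


g1 = {("_:x", "foaf:name", '"Alice"')}
g2 = {("_:x", "foaf:name", '"Bob"')}
merged = merge_graphs(g1, g2)
bnodes = {subject for subject, _, _ in merged}
naive_bnodes = {subject for subject, _, _ in (g1 | g2)}
```

A naive union `g1 | g2` leaves one node `_:x` that is named both Alice and Bob; after the renaming the merged graph has two distinct blank nodes, which is what the two original graphs actually said.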

No way I would accept all this argumentation on face value:-), so I have to think about this and, mainly, read whatever Pat may want to write down to understand it. In the meantime, I may have to look into the concepts of conceptual graphs, and the Peircian notation of logic that Pat referred to as inspiration…

A more general take away (see also Chris’ remark above): maybe it is time to look into RDF again? A scary thought. Touching something that is fundamental on the SW has to be done with extreme care… We will see.

There were two papers in the same session that were very close in subject: one by Jesse Weaver and Jim Hendler on the parallel materialization of RDFS graphs and one by Jacopo Urbani et al on using MapReduce for RDFS reasoning. (Sigh…, this is where I would like to put a reference!) Both aimed at similar challenges, namely the materialization of the RDFS inference results of a graph using parallel computing methods. And there was one more similarity: both had some sort of classification of the rules in the rule set described in the RDF Semantics document to help improve the processing. (Eg, to analyze which rules should be duplicated among processing nodes and which ones can be handled without, or which ones need special treatment for a map-reduce pair.) It seems that it would be worthwhile to see if some of these classifications (‘ontology rules’ and the like) could be extended to OWL 2 RL (Jesse Weaver told me afterwards that they want to look into this). But, to put things into perspective: we are at the point where billions of triples can be expanded with relative ease. Who would have thought a few years ago? There was also a remark on one of Jesse’s slides (I do not remember the exact wording) which said that RDFS is insanely parallelizable:-) It was a really interesting session.
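
To give a rough feel for why this kind of materialization parallelizes so well, here is a minimal sketch in Python. It is not the algorithm of either paper. It handles only two rules of the RDF Semantics document (rdfs11, subclass transitivity, and rdfs9, type propagation), and it assumes that the small set of ‘schema’ triples is copied to every worker while the instance triples are split arbitrarily. The function names are made up for illustration, and the thread pool is merely a stand-in for real processing nodes.

```python
from concurrent.futures import ThreadPoolExecutor

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def superclass_closure(schema):
    """rdfs11: transitive closure of rdfs:subClassOf, as {class: all its superclasses}."""
    supers = {}
    for s, p, o in schema:
        if p == SUBCLASS:
            supers.setdefault(s, set()).add(o)
    changed = True
    while changed:
        changed = False
        for ups in supers.values():
            extra = set()
            for up in ups:
                extra |= supers.get(up, set())
            if not extra <= ups:
                ups |= extra
                changed = True
    return supers


def materialize_partition(instances, supers):
    """rdfs9: add (x rdf:type D) for every (x rdf:type C) where C is a subclass of D.

    A worker needs only the replicated schema closure and its own partition,
    so the workers never have to talk to each other.
    """
    inferred = set(instances)
    for s, p, o in instances:
        if p == TYPE:
            for up in supers.get(o, ()):
                inferred.add((s, TYPE, up))
    return inferred


def materialize(schema, instances, workers=2):
    """Split the instance triples, expand each part independently, union the results."""
    supers = superclass_closure(schema)
    items = sorted(instances)
    parts = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(materialize_partition, parts, [supers] * workers))
    closure = set()
    for result in results:
        closure |= result
    return closure


schema = {("ex:Dog", SUBCLASS, "ex:Mammal"), ("ex:Mammal", SUBCLASS, "ex:Animal")}
instances = {("ex:rex", TYPE, "ex:Dog"), ("ex:tom", TYPE, "ex:Mammal")}
closure = materialize(schema, instances)
```

The point of the sketch is the split itself: once the schema closure is known everywhere, each instance triple can be expanded without looking at any other instance triple. Rules that join two instance triples do not have this property, which is presumably where the rule classifications of the papers come in.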

The SW in use session included a paper from Landong Zuo et al, “Supporting multi-view network analysis to understand company value chains”. The idea is to take a bunch of data on companies in the UK, integrate them in an RDF store, and let users get information on the ‘value chains’, ie, how companies relate to one another as producers/consumers. Technically, the interesting point was the fact that users had the possibility to interactively add new relationships, new classifications to the system, essentially new rules that could be evaluated. The whole system seemed to be a really cool, well engineered and well functioning piece of machinery. As the speaker put it, although all conclusions drawn from the system could be found by the users by analyzing databases, it would take weeks to do what this system can give them in a few minutes. This is exactly the kind of message we need for the outside world about the usefulness of Semantic Web technologies.

In another session Martin Szomszor presented an experiment they conducted at the ESWC conference, combining RFID-based personal badges with an underlying SW system. The resulting system could be used to show personal contacts among delegates, could help people find others with similar interests, could retrace later whom one met at what point (“I remember talking to that chap, but I do not remember his name!”), etc. There are lots of privacy issues, for example, but I would have liked to see that in practice, that is for sure!

Stéphane Corlosquet’s presentation on SW and Drupal was really exciting. I already knew about the plans of Drupal 7 to incorporate RDF management from the start, and that all Drupal 7 pages will be annotated via RDFa. The RDFa community has been fairly excited about that for a while now. But the work done by Stéphane and others provides some additional modules that make it easy to add a SPARQL endpoint to a Drupal based site, to import other RDF content, or to manage the vocabularies used on the pages and the like. They already have such a system running with the current Drupal, but these modules will become part of the standard Drupal 7 module set that one can download from the Drupal site. And that is cool. It significantly lowers the barrier to building Web sites that are prepared to be part of the Linked Data cloud, even if the system administrators are not SW experts. I expect this to open up quite a lot of possibilities…
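
To illustrate what a SPARQL endpoint on such a site makes possible, here is a minimal sketch in Python of how a client could build a request following the SPARQL protocol (the query travels as the `query` parameter of an HTTP GET). The endpoint URL is a placeholder of mine, not an actual Drupal path, and the Dublin Core title property is just an example of what a page might expose.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL, for illustration only

QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?page ?title WHERE { ?page dcterms:title ?title } LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for the JSON serialization of the SPARQL query results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
# urllib.request.urlopen(request) would send it; omitted because the endpoint is fictitious.
```

Nothing here is specific to Drupal, and that is the point: once a site exposes a standard endpoint, any generic client can query it.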

Off to the next day! More papers and the presentation of the Semantic Web challenge finalists…

October 15, 2008

Semantic Web and uncertainty

The issue of uncertainty on the Semantic Web has been around for a while now, although it is still largely a research issue. (Though not only: C&P (Clark & Parsia) has an extension of their Pellet tool to handle a particular probabilistic extension of OWL, but I am not aware of any other commercial system of the kind.) Ken and Kathryn Laskey and Paulo Costa have been organizing a series of workshops (the URSW series) on the subject for several years now (there will be one at the coming ISWC2008 conference, too!), and there was also a W3C Incubator Group on the subject that issued a report not long ago. But there is still a lot to be done…

The reason I remembered all that is that I found a survey by Thomas Lukasiewicz and Umberto Straccia[1] that is worth reading if you are interested in the subject. The survey gives a separate description of probabilistic, possibilistic, and fuzzy extensions of the DL dialect that is at the basis of OWL DL, together with further references if one wants to dig deeper (156 of those!). It is not an easy read at all, and I couldn’t say I understood all the details described there, far from it… But, as all good surveys do, it gives you an idea of, or refreshes your memory on, what is happening in the area. And that is always incredibly useful.

The approaches described in the paper are fairly high level in the sense that (as the authors emphasize, too) the extensions are all on top of SHOIN(D), ie, OWL DL. That makes the constructions sometimes quite complex (mainly in the probabilistic case) and they are probably difficult to use for an average user. (Although, who knows. They are certainly complex to implement, but maybe the usage is not that bad. I am not sure.) However, the authors themselves refer to alternative approaches on top of simpler DL dialects (without giving too many details). It would be nice to have a survey on the extensions of a level corresponding to simpler OWL profiles like OWL RL that the OWL WG at W3C is also working on now. Just as OWL RL might be a good “entry point” for a large family of users into the world of OWL, an uncertainty extension of that level might be of great interest, too…

Reading this survey also reminded me of a short paper by Fensel and van Harmelen[2], “Unifying Reasoning and Search to Web Scale”, on which I had a very short blog a while ago. I just wonder whether fuzzy or probabilistic reasoning may not be a good approach to the problems they describe there… Although this is clearly still a long way off.

Anyway. I learned something today…

  1. Lukasiewicz, Thomas, and Umberto Straccia. “Managing uncertainty and vagueness in description logics for the Semantic Web.” Journal of Web Semantics: Science, Services and Agents on the World Wide Web 6, no. 4 (2008). Available online as a pre-print.
  2. Fensel, Dieter, and Frank van Harmelen. “Unifying Reasoning and Search to Web Scale.” IEEE Internet Computing 11, no. 2 (March/April 2007).

May 25, 2008

SemTech 2008

I just came back from the SemTech conference in San Jose: a great (almost) week. Obviously, it had a social aspect: meeting lots of people is always a pleasure. Friends and colleagues I have not seen for a long time (possibly years) or, a frequent occurrence, meeting people with whom I had contacts for a long time in email, various fora, even telcos, but never met personally. I do not even want to make an attempt to list names here, it would be too long!

As for the conference… it so happens that I was part of the closing panel. The moderator asked us, to close the session, what we would tell our colleagues when we get back home. Well, here is, more or less, what I said: for all of us who have been part of the Semantic Web community for several years it was, we must admit, sometimes an uphill battle. Experts, journalists, personalities and pundits of all kinds have repeatedly buried the Semantic Web as an unnecessary, uninteresting, if not harmful technology, that has no interest for the “real” world, which should simply be ignored. Well, we just saw how misguided these opinions are: there was a real “buzz” at the event, with companies showing their newest products, with a double digit increase in the attendance of the conference compared to last year (more than a thousand participants this year!). One of the organizers told me that the various business card collections at the conference were full of hitherto unknown small companies and new startups that see this space as a possibility for growth. It is not clear whether the Semantic Web has reached an “inflection point” already but, and that is sure, the buzz, the interest, the excitement is there, and this was clearly in the air at the conference.

The usual caveat applies: way too many things happened during the week to have seen everything. That being said, and in no particular order, some of the things I noted…

There was a great presentation on Open Calais. This service can have a really important role to play in “bridging” tagging and a more systematic terminology usage. Another way of putting it: the service also binds natural language understanding to the Semantic Web. Great stuff. I would really like to see this and similar services being an integral part of some more “social” sites…

Eric Miller gave a great keynote representing Zepheira, which comes with some nice products based on the combination of a number of open source projects (Exhibit, Potluck,…). Eric used the terms “reuse, repurpose, remix [data]” as a characterization of what they do: quick and easy combination of various types of data on the Web. This slogan could also characterize the Semantic Web in general… Actually, although I was not present at Lee Feigenbaum’s presentation, he showed me in a break an application (an easy mix of spreadsheet data with SW tools) which could be characterized with exactly the same description. Convergence of thoughts…

By the way, Zepheira also had an announcement on the combination of Aduna’s Sesame environment with the open source triple store Mulgara. One of the prominent RDF programming environments combined with a triple store written bottom up (as opposed to adapting a relational database for triple storage): that can be a very powerful combination. Another project shown by Zepheira is the renewed purl.org software (soon to be deployed on the purl site, too). The promise is to have software that is much easier to deploy and install; it also has some new features, like a direct implementation of the httpRange-14 resolution to define non-information resource URIs. (The fact that they would do that is not new, I even wrote about this almost a year ago, but the project is now completed.)
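
For those who do not have the httpRange-14 discussion in their head: the resolution boils down to the rule that a URI naming a non-information resource (a person, a concept) should not answer with a 200 but redirect, with a 303 See Other, to a document about that thing. A minimal sketch in Python, which is in no way the purl.org implementation; the paths and the `resolve` helper are made up for illustration:

```python
# Hypothetical registry: URIs of 'things' mapped to documents that describe them.
NON_INFORMATION_RESOURCES = {
    "/id/ivan": "/doc/ivan.rdf",
}
# Hypothetical set of ordinary documents served directly.
DOCUMENTS = {"/doc/ivan.rdf", "/about.html"}


def resolve(path):
    """Return (status code, headers) for a GET on `path`."""
    if path in NON_INFORMATION_RESOURCES:
        # The URI names a thing, not a document: point to a description of it.
        return 303, {"Location": NON_INFORMATION_RESOURCES[path]}
    if path in DOCUMENTS:
        return 200, {}
    return 404, {}


thing_response = resolve("/id/ivan")
document_response = resolve("/doc/ivan.rdf")
```

A client that gets the 303 knows that `/id/ivan` identifies something that cannot be sent over the wire, and that the description lives at the `Location` it was given.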

The other keynote at the opening session was Nova Spivack’s on Twine. I am a regular user of Twine; a lot has already been written on it in the blogosphere, I do not want to repeat it here. Twine was certainly one of the new services that was referred to and discussed a lot at the event, with half of the audience being a beta tester already and the other half asking for it…

Natasha Noy and Tania Tudorache presented Collaborative Protégé; they also showed a web version of (Collaborative) Protégé. With the role of large communities developing ontologies together, such a tool is really important…

And then we got Yahoo!. Actually, two very different presentations. The first was Dave Beckett’s, on the Semantic Web back-end for sites like Yahoo! Finance or Yahoo! Kids. These sites are driven by a combination of MySQL, Redland, simple vocabularies (external to Yahoo! like Dublin Core, or maintained internally), and with RDF everywhere. And, as Dave said: all this works well even on the scale of Yahoo!. Peter Mika’s presentation was on a very different part of the company, namely SearchMonkey. Ben Adida has just published a blog on this (he was writing it while we were sitting at Peter’s presentation:-), and he describes it better than I could, so go there for more details…

Speaking of RDFa (also used by SearchMonkey): I may be biased by my work on RDFa, but I really felt that RDFa was a buzz within the overall buzz of the conference. A number of middleware products will generate (or already generate) RDFa on their output and there was a general interest in the technology. Ben (and his colleague, Nathan Yergler) also gave a presentation on RDFa and on its usage in Creative Commons. Unfortunately, there was some last minute change in the session order, and this session became a victim, so there was a much smaller audience than the topic deserved. Oh well…

The room was packed full for Pavel Klinov’s presentation on Pronto. Although this is not the first time that Clark & Parsia talks about Pronto (I have seen several articles on their blog), it may have been the first big conference where they talked about it (I might be wrong on that, though). Pronto is a combination of OWL with probabilistic reasoning. In some application areas this type of tool is very important (and technically exciting, although, I must admit, I have only a very general idea on how it works, the theory behind it is not an easy read…)

Mike Dean gave a mini-tutorial on SWRL’s usage on the last day. In spite of the general fatigue of the last day of a long week, it was really interesting and he could keep his audience. All this combination of rules, OWL, RIF, ontology mapping, database schema to RDF mapping, etc, continues to be an exciting area of R&D…

Speaking of tutorials, there was also a separate session on XBRL on the first day. XBRL stands for “eXtensible Business Reporting Language”; it is a true data integration project but restricted to a very specific application area (financial reporting) and developed using very complex XML schemas. The community behind XBRL is now seriously considering the Semantic Web as a possibility to break out of their “community silo”. Huge amount of public data, possibly linked to the rest of the Web of Data…

Did I miss some great sessions at the conference? That is for sure. For example, I was not there at Chime’s presentation on GRDDL, or Jim Hendler’s presentation on the role of ontologies (Jim always finds catchy titles for his talks, this one is entitled “The Fellowship of the (Semantic) Web: The Two Towers”; we can have lots of discussions on who Frodo, Gandalf or Sauron are but, well, maybe we should not go there…), or the session on Health Care and Life Sciences. But this is always what happens. Nevertheless, I could see a lot, I had lots of hallway conversations,… it was a great week!

P.S.: I used the URI-s of the conference’s web site for the various presentations; as far as I know, the presentation slides will be put on those pages eventually. At least I hope…
