Ivan’s private site

September 28, 2008

ESTC2008, Vienna

I had the pleasure, last week, to be at the 2nd European Semantic Technology Conference in Vienna, Austria. I had a really good week…

The conference was not all that big: circa 200 participants, 70-75% from industry, plus a small exhibition. It would be all too easy to dismiss the conference due to its size when compared with SemTech, which had more than 1000 participants. But that would really be unfair. First of all, despite the best efforts of the European Commission, Europe is still very much a divided market (one piece of feedback I heard was that “Austria is really too small as a market”), which does affect attendance. And, as Mark Greaves reminded us at the closing panel, SemTech was also a smaller event a few years ago, and it is only lately that it made a huge jump in attendance. I.e., there is room and good prospects for ESTC to grow!

It is always difficult to give a good overview of such a conference; due to the parallel sessions one is bound to miss most of it:-(. Just some highlights from my own perspective.

A passage in an old palace in Vienna

One interesting aspect was the high profile of technologies aiming at extracting structures from unstructured content, typically text. Three out of the four keynotes (Peter Jackson’s from Reuters, Hans Uszkoreit’s from DFKI, and Hugo Zaragoza’s from Yahoo!) were either fully or partially concentrating on this. A number of other presentations also touched upon this as part of developing Semantic Web applications, and there were also hallway conversations on the usage of public services like Open Calais or Zemanta. Although these services have not been developed exclusively for the Semantic Web, they are clearly extremely useful for that application area, too. It is also interesting to see that Wikipedia (and, by extension, DBPedia) URI-s begin to play a more and more important role, through these services, as reference URI-s. (I had a blog post a while ago which also generated a modest discussion, if you are interested.)

Another recurring topic was the “long tail” (shame on me but I must admit I did not know this business term). Orestis Terzidis, from SAP, gave a nice keynote showing, through some SAP case studies, how Semantic Web technologies can be very useful in exploiting the business opportunities in this “long tail” through the flexibility, the possibilities for adaptation and personalization, etc., that they can provide. A good example was the presentation given by Liberté Crozon from Discotheka outlining the plan to exploit SW techniques to build a really good archiving and search service for classical music (as a fan of classical music I know all too well what a stepchild it is on the music related web sites…).

What else? I had discussions and chats with people from well known tool vendors like Franz Inc, Aduna, or Ontotext; it is always nice to see what they do, what new and cool things they come up with (and they do come up with new things, check out the presentations like the ones of …). The talk on the NeOn project (by Mathieu d’Aquin) was interesting; the NeOn toolkit has the potential to become a major player as a tool to develop various ontologies. Not only OWL-DL, but also other dialects of OWL (OWL Full, OWL RL, etc.), SKOS, RDFS, and all that in a possibly distributed setting. Cool stuff in the making! Leo Sauermann gave a nice presentation on Semantic Desktops; although there is a SW public Use Case on it already (beyond a bunch of papers), it is always good to have an additional insight. Raphael Volz announced a new tool (still in alpha, though) on managing information on persons for persons. Finally, let me quote here one of the slides of David Norheim (who presented a nice application for Norwegian public schools):

- New standards (e.g. SPARQL), proposals for standardization (e.g. SPARUL), new tools (e.g. Jena), open source (e.g. Tomcat, Apache), lack of good documentation all say high risk!!!!

- However, the support and maintenance from the W3C community and open source developers (e.g. Jena team) has been impressive, the support through IRC channels, mailing lists etc has been invaluable for the project.

I take that as a compliment for the SW community at large!

Finally, on a more private note: there was a time when I had to go to Vienna quite frequently for business reasons but, nevertheless, the city never ceases to amaze me with all the things you can see there. I discovered this lovely passage just across the conference site…

There will be an ESTC2009: the dates are already set (2009-09-30 to 2009-10-02) although the location isn’t. But it is worth adding this to your agenda…

June 13, 2008

Web data visualization with ontologies

Filed under: General,Semantic Web,Work Related — Ivan Herman @ 10:14

It is nice to see when very different communities reuse one another’s work, i.e., when the fragmentation of research and development into different fields is, at least a little bit, reduced… I ran into a paper by Gilson et al. [1] on “From Web data to visualization via ontology mapping” in a journal (the Computer Graphics Forum) that is usually not read by Semantic Web experts. So it may be worth drawing their attention to it… Instead of trying to paraphrase the content of the paper, why not simply reproduce the abstract:

In this paper, we propose a novel approach for automatic generation of visualizations from domain-specific data available on the web. We describe a general system pipeline that combines ontology mapping and probabilistic reasoning techniques. With this approach, a web page is first mapped to a Domain Ontology, which stores the semantics of a specific subject domain (e.g., music charts). The Domain Ontology is then mapped to one or more Visual Representation Ontologies, each of which captures the semantics of a visualization style (e.g., tree maps). To enable the mapping between these two ontologies, we establish a Semantic Bridging Ontology, which specifies the appropriateness of each semantic bridge. Finally each Visual Representation Ontology is mapped to a visualization using an external visualization toolkit. Using this approach, we have developed a prototype software tool, SemViz, as a realisation of this approach. By interfacing its Visual Representation Ontologies with public domain software such as ILOG Discovery and Prefuse, SemViz is able to generate appropriate visualizations automatically from a large collection of popular web pages for music charts without prior knowledge of these web pages.

Worth reading. And thanks to my friend David Duce for telling me about it…
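The pipeline in the abstract (domain ontology, visual representation ontologies, and a bridge that rates how appropriate each mapping is) can be pictured with a toy sketch. Everything below is an illustrative assumption: the field names, the visualization styles, and the scores are made up, and a simple product of hand-set scores stands in for the probabilistic reasoning of the paper. It is not SemViz's actual data model or API.

```python
from itertools import permutations

# Hypothetical "domain ontology" for a music chart: field -> semantic type.
DOMAIN_ONTOLOGY = {
    "artist": "nominal",
    "title": "nominal",
    "position": "ordinal",
    "weeks_on_chart": "quantitative",
}

# Hypothetical "visual representation ontologies": style -> slot -> expected type.
VISUAL_ONTOLOGIES = {
    "bar_chart": {"label": "nominal", "height": "quantitative"},
    "tree_map": {"group": "nominal", "area": "quantitative"},
    "scatter_plot": {"x": "quantitative", "y": "quantitative"},
}

# Hypothetical "semantic bridging" scores: how appropriate it is to map a
# domain type onto a visual slot type (pairs not listed score 0).
BRIDGE_SCORES = {
    ("nominal", "nominal"): 1.0,
    ("quantitative", "quantitative"): 1.0,
    ("ordinal", "quantitative"): 0.6,
}


def best_bridge(style_slots):
    """Return (score, slot -> field) for the best assignment of fields to slots."""
    slots = list(style_slots)
    best_score, best_assignment = 0.0, None
    for fields in permutations(DOMAIN_ONTOLOGY, len(slots)):
        score = 1.0
        for field, slot in zip(fields, slots):
            pair = (DOMAIN_ONTOLOGY[field], style_slots[slot])
            score *= BRIDGE_SCORES.get(pair, 0.0)
        if score > best_score:
            best_score, best_assignment = score, dict(zip(slots, fields))
    return best_score, best_assignment


def rank_visualizations():
    """Rank visualization styles by the score of their best bridge."""
    ranked = []
    for style, slots in VISUAL_ONTOLOGIES.items():
        score, assignment = best_bridge(slots)
        if assignment is not None:
            ranked.append((score, style, assignment))
    ranked.sort(key=lambda entry: entry[0], reverse=True)
    return ranked


if __name__ == "__main__":
    for score, style, assignment in rank_visualizations():
        print(style, score, assignment)
```

In a real system the last step would hand the chosen assignment to a visualization toolkit; here the ranking itself is the whole output.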

[1] O. Gilson et al., “From Web Data to Visualization via Ontology Mapping,” Computer Graphics Forum, vol. 27, no. 3, 2008 (the paper is also available on-line). The paper was originally presented at the joint Eurographics/IEEE Symposium on Visualization, where it won the best paper award.

May 25, 2008

SemTech 2008

I just came back from the SemTech conference in San Jose: a great (almost) week. Obviously, it had a social aspect: meeting lots of people is always a pleasure. Friends and colleagues I have not seen for a long time (possibly years) or, a frequent occurrence, meeting people with whom I had contacts for a long time in email, various fora, even telcos, but never met personally. I do not even want to make an attempt to list names here, it would be too long!

As for the conference… it so happens that I was part of the closing panel. The moderator asked us, to close the session, what we would tell our colleagues when we get back home. Well, here is, more or less, what I said: for all of us who have been part of the Semantic Web community for several years it was, we must admit, sometimes an uphill battle. Experts, journalists, personalities and pundits of all kinds have repeatedly buried the Semantic Web as an unnecessary, uninteresting, if not harmful technology that has no interest for the “real” world and should simply be ignored. Well, we just saw how misguided these opinions are: there was a real “buzz” at the event, with companies showing their newest products, with a double digit increase in the attendance of the conference compared to last year (more than a thousand participants this year!). One of the organizers told me that the various business card collections at the conference were full of hitherto unknown small companies and new startups that see this space as a possibility for growth. It is not clear whether the Semantic Web has reached an “inflection point” already but, and that is sure, the buzz, the interest, the excitement is there, and this was clearly in the air at the conference.

The usual caveat applies: way too many things happened during the week to have seen everything. That being said, and in no particular order, some of the things I noted…

There was a great presentation on Open Calais. This service can have a really important role to play in “bridging” tagging and a more systematic terminology usage. Another way of putting it: the service also binds natural language understanding to the Semantic Web. Great stuff. I would really like to see this and similar services being an integral part of some more “social” sites…

Eric Miller gave a great keynote representing Zepheira, which comes with some nice products based on the combination of a number of open source projects (Exhibit, Potluck,…). Eric used the terms “reuse, repurpose, remix [data]” as a characterization of what they do: quick and easy combination of various types of data on the Web. This slogan could also characterize the Semantic Web in general… Actually, although I was not present at Lee Feigenbaum’s presentation, he showed me in a break an application (easy mix of spreadsheet data with SW tools) which could be characterized with exactly the same description. Convergence of thoughts…

By the way, Zepheira also had an announcement on the combination of Aduna’s Sesame environment with the open source triple store Mulgara. One of the prominent RDF programming environments combined with a triple store written from the bottom up (as opposed to adapting a relational database for triple storage): that can be a very powerful combination. Another project shown by Zepheira is the renewed purl.org software (soon to be deployed on the purl site, too). The promise is software that is much easier to deploy and install; it also has some new features, like a direct implementation of the httpRange-14 resolution to define non-information resource URIs. (The fact that they would do that is not new, I even wrote about this almost a year ago, but the project is now completed.)
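For readers who have not met httpRange-14: the W3C TAG resolution says that a 2xx answer to a GET means the URI identifies an information resource (a document), while a 303 See Other answer means the URI may identify anything, e.g., a person or a concept, with the redirect pointing to a document about it. A minimal sketch of what a PURL-style resolver could do with this is below; the registry, the paths, and the `is_document` flag are all hypothetical and are not the purl.org software's actual data model or API.

```python
from http import HTTPStatus

# Hypothetical registry: path -> redirect target plus a flag telling whether
# the PURL names a document or a "thing" (person, concept, ...).
PURLS = {
    "/net/people/alice": {
        "target": "http://example.org/alice.rdf",
        "is_document": False,
    },
    "/net/docs/report": {
        "target": "http://example.org/report.html",
        "is_document": True,
    },
}


def resolve(path):
    """Return (status, location) for a GET on the given PURL path."""
    entry = PURLS.get(path)
    if entry is None:
        return HTTPStatus.NOT_FOUND, None
    if entry["is_document"]:
        # Ordinary redirect to the current location of the document.
        return HTTPStatus.FOUND, entry["target"]
    # 303 See Other: the URI names a thing, the target only describes it.
    return HTTPStatus.SEE_OTHER, entry["target"]


if __name__ == "__main__":
    for path in ("/net/people/alice", "/net/docs/report", "/net/unknown"):
        status, location = resolve(path)
        print(path, int(status), location)
```

The point of the 303 is that a client never mistakes the RDF file describing Alice for Alice herself.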

The other keynote at the opening session was Nova Spivack’s on Twine. I am a regular user of Twine; a lot has already been written on it in the blogosphere, and I do not want to repeat it here. Twine was certainly one of the new services that was referred to and discussed a lot at the event, with half of the audience being a beta tester already and the other half asking for it…

Natasha Noy and Tania Tudorache presented Collaborative Protégé; they also showed a web version of (Collaborative) Protégé. With the role of large communities developing ontologies together, such a tool is really important…

And then we got Yahoo!. Actually, two very different presentations: the one of Dave Beckett on the Semantic Web back-end for sites like Yahoo! Finance, or Yahoo! Kids. These sites are driven by a combination of MySQL, Redland, simple vocabularies (external to Yahoo! like Dublin Core or maintained internally), and with RDF everywhere. And, as Dave said: all this works well even on the scale of Yahoo!. Peter Mika’s presentation was on a very different part of the company, namely SearchMonkey. Ben Adida has just published a blog on this (he was writing it while we were sitting at Peter’s presentation:-), and he describes it better than I could, so go there for more details…

Speaking of RDFa (also used by SearchMonkey): I may be biased by my work in RDFa, but I really felt that RDFa was a buzz within the overall buzz of the conference. A number of middleware products generate (or will generate) RDFa on their output and there was a general interest in the technology. Ben (and his colleague, Nathan Yergler) also gave a presentation on RDFa and on its usage in Creative Commons. Unfortunately, there was some last minute change in the session order, and this session became a victim, so there was a much smaller audience than the topic deserved. Oh well…

The room was packed full for Pavel Klinov’s presentation on Pronto. Although this is not the first time that Clark & Parsia talks about Pronto (I have seen several articles on their blog), it may have been the first big conference where they talked about it (I might be wrong on that, though). Pronto is a combination of OWL with probabilistic reasoning. In some application areas this type of tool is very important (and technically exciting, although, I must admit, I have only a very general idea on how it works, the theory behind it is not an easy read…)

Mike Dean gave a mini-tutorial on SWRL’s usage on the last day. In spite of the general fatigue of the last day of a long week, it was really interesting and he could keep his audience. All this combination of rules, OWL, RIF, ontology mapping, database schema to RDF mapping, etc, continues to be an exciting area of R&D…

Speaking of tutorials, there was also a separate session on XBRL on the first day. XBRL stands for “eXtensible Business Reporting Language”; it is a true data integration project but restricted to a very specific application area (financial reporting) and developed using very complex XML schemas. The community behind XBRL is now seriously considering the Semantic Web as a possibility to break out of their “community silo”. Huge amount of public data, possibly linked to the rest of the Web of Data…

Did I miss some great sessions at the conference? That is for sure. For example, I was not there at Chime’s presentation on GRDDL, or Jim Hendler’s presentation on the role of ontologies (Jim always finds catchy titles for his talks, this one is entitled “The Fellowship of the (Semantic) Web: The Two Towers”; we can have lots of discussions on who Frodo, Gandalf or Sauron are but, well, maybe we should not go there…), or the session on Health Care and Life Sciences. But this is always what happens. Nevertheless, I could see a lot, I had lots of hallway conversations,… it was a great week!

P.S.: I used the URI-s of the conference’s web site for the various presentations; as far as I know, the presentation slides will be put on those pages eventually. At least I hope…
