May 25, 2008

SemTech 2008

I just came back from the SemTech conference in San Jose: a great (almost full) week. Obviously, it had a social aspect: meeting lots of people is always a pleasure. I saw friends and colleagues I had not seen for a long time (possibly years) and, a frequent occurrence, met people with whom I had been in contact for a long time via email, various fora, even telcos, but had never met in person. I will not even attempt to list names here; it would be too long!

As for the conference… it so happens that I was part of the closing panel. To close the session, the moderator asked us what we would tell our colleagues when we got back home. Well, here is, more or less, what I said: for all of us who have been part of the Semantic Web community for several years it was, we must admit, sometimes an uphill battle. Experts, journalists, personalities and pundits of all kinds have repeatedly buried the Semantic Web as an unnecessary, uninteresting, if not harmful technology of no interest to the “real” world, one that should simply be ignored. Well, we just saw how misguided these opinions are: there was a real “buzz” at the event, with companies showing their newest products, and with a double-digit increase in attendance compared to last year (more than a thousand participants this year!). One of the organizers told me that the various business card collections at the conference were full of hitherto unknown small companies and new startups that see this space as a possibility for growth. It is not clear whether the Semantic Web has already reached an “inflection point” but what is sure is that the buzz, the interest, and the excitement are there, and this was clearly in the air at the conference.

The usual caveat applies: way too many things happened during the week to have seen everything. That being said, and in no particular order, some of the things I noted…

There was a great presentation on Open Calais. This service can play a really important role in “bridging” tagging and a more systematic use of terminology. Another way of putting it: the service also binds natural language understanding to the Semantic Web. Great stuff. I would really like to see this and similar services become an integral part of some more “social” sites…

Eric Miller gave a great keynote representing Zepheira, which comes with some nice products based on the combination of a number of open source projects (Exhibit, Potluck,…). Eric used the terms “reuse, repurpose, remix [data]” to characterize what they do: quick and easy combination of various types of data on the Web. This slogan could also characterize the Semantic Web in general… Actually, although I was not present at Lee Feigenbaum’s presentation, he showed me during a break an application (an easy mix of spreadsheet data with Semantic Web tools) which could be characterized with exactly the same description. Convergence of thoughts…

By the way, Zepheira also had an announcement on the combination of Aduna’s Sesame environment with the open source triple store Mulgara. One of the prominent RDF programming environments combined with a triple store written from the bottom up (as opposed to adapting a relational database for triple storage): that can be a very powerful combination. Another project shown by Zepheira is the renewed purl.org software (soon to be deployed on the purl site, too). The promise is software that is much easier to deploy and install; it also has some new features, like a direct implementation of the httpRange-14 resolution to define URIs for non-information resources. (The fact that they would do that is not new, I even wrote about this almost a year ago, but the project is now completed.)

The other keynote at the opening session was Nova Spivack’s on Twine. I am a regular user of Twine; a lot has already been written on it in the blogosphere, and I do not want to repeat it here. Twine was certainly one of the new services that were referred to and discussed a lot at the event, with half of the audience already being beta testers and the other half asking for access…

Natasha Noy and Tania Tudorache presented Collaborative Protégé; they also showed a web version of (Collaborative) Protégé. Given the role of large communities developing ontologies together, such a tool is really important…

And then there was Yahoo!, with two very different presentations. Dave Beckett spoke about the Semantic Web back-end for sites like Yahoo! Finance or Yahoo! Kids. These sites are driven by a combination of MySQL, Redland, and simple vocabularies (either external to Yahoo!, like Dublin Core, or maintained internally), with RDF everywhere. And, as Dave said: all this works well even on the scale of Yahoo!. Peter Mika’s presentation was on a very different part of the company, namely SearchMonkey. Ben Adida has just published a blog post on this (he was writing it while we were sitting at Peter’s presentation:-), and he describes it better than I could, so go there for more details…

Speaking of RDFa (also used by SearchMonkey): I may be biased by my work on RDFa, but I really felt that RDFa was a buzz within the overall buzz of the conference. A number of middleware products generate (or will generate) RDFa in their output, and there was a general interest in the technology. Ben (and his colleague, Nathan Yergler) also gave a presentation on RDFa and on its usage in Creative Commons. Unfortunately, there was a last-minute change in the session order, and this session became a victim, so there was a much smaller audience than the topic deserved. Oh well…

The room was packed for Pavel Klinov’s presentation on Pronto. Although this is not the first time that Clark & Parsia has talked about Pronto (I have seen several articles on their blog), it may have been the first big conference where they talked about it (I might be wrong on that, though). Pronto is a combination of OWL with probabilistic reasoning. In some application areas this type of tool is very important (and technically exciting, although, I must admit, I have only a very general idea of how it works; the theory behind it is not an easy read…)

Mike Dean gave a mini-tutorial on the usage of SWRL on the last day. In spite of the general fatigue of the last day of a long week, it was really interesting and he kept his audience’s attention. All this combination of rules, OWL, RIF, ontology mapping, database schema to RDF mapping, etc., continues to be an exciting area of R&D…

Speaking of tutorials, there was also a separate session on XBRL on the first day. XBRL stands for “eXtensible Business Reporting Language”; it is a true data integration project but restricted to a very specific application area (financial reporting) and developed using very complex XML schemas. The community behind XBRL is now seriously considering the Semantic Web as a possibility to break out of their “community silo”. A huge amount of public data, possibly linked to the rest of the Web of Data…

Did I miss some great sessions at the conference? That is for sure. For example, I was not at Chime’s presentation on GRDDL, or Jim Hendler’s presentation on the role of ontologies (Jim always finds catchy titles for his talks; this one was entitled “The Fellowship of the (Semantic) Web: The Two Towers”; we can have lots of discussions on who Frodo, Gandalf or Sauron are but, well, maybe we should not go there…), or the session on Health Care and Life Sciences. But this is what always happens. Nevertheless, I saw a lot and had lots of hallway conversations… it was a great week!

P.S.: I used the URIs of the conference’s web site for the various presentations; as far as I know, the presentation slides will be put on those pages eventually. At least I hope so…

  1. Ivan – thanks for the mention – the slides are now publicly available at http://www.cs.rpi.edu/~hendler/presentations/SemTech2008-2Towers.pdf — Jim H.

    Comment by Jim Hendler — May 26, 2008 @ 4:43

  2. It was a pleasure and honor to meet you and Jim: The highlight of the conference for me. A close second was meeting a lot of fellow Twinerians and Radarians whom I haven’t already met.

    What disappointed me was that we couldn’t get any press to attend. None. Not even bloggers!! Not even with the offer of free logoware. ;-)

    How many A-list bloggers live in Silicon Valley? Probably most. Also, a lot of Gartner analysts live in the area, and so do some other top industry analysts like Rob Enderle (you could throw a rock to his place from the Fairmont). But nobody showed up.

    So, we have to assume that semweb is still viewed as a research endeavor. Further, the idea of a company even touting semweb may be unproductive, perhaps even counterproductive. In other words, it isn’t about what Twine or Powerset do using semweb, it’s about what they do relative to competitive offerings — regardless of the core tech.

    I’ll be focusing more and more effort on my “Apps: On Semantic Web & Related Applications” (like intelligent agents and knowledge discovery) twine. Apps, quite frankly, are what it’s all about outside of academia. More apps are needed, as is more apps-centered experimentation and research.

    See http://www.twine.com/twine/apps.


    - David Scott Lewis

    Comment by David Scott Lewis — May 26, 2008 @ 18:36

  3. Ivan, it was great meeting you. Thanks for this write-up; I will include it in the wrap-up post that I am writing for my colleagues and clients who were unable to attend this year.
    I managed to get the final keynote panel on Bringing SemTech Back to the Business on video and the SemTech conference folks have said it was ok for me to publish- so here is the URL http://blip.tv/file/933309

    i will send you a separate email as well with my follow-ups.


    Comment by daniela barbosa — May 30, 2008 @ 1:32

  4. [...] A few weeks ago was SemTech’08, the annual Semantic Technologies conference, aimed at a commercial audience (both buyers and sellers). This year it was (again) larger than previous years (now over 1000, up from under 300 a few years ago). A good write-up of some of the highlights is at Ivan Herman’s  Blog. [...]

    Pingback by LarKC weblog » Blog Archive » SemTech 2008 — June 11, 2008 @ 21:52

  5. Ivan, thanks for the very informative write-up! I’ve forwarded it to the blog of the LarKC project:
    http://blog.larkc.eu/?p=81. (For the LarKC project, see http://www.larkc.eu ). Thanks again!

    Comment by Frank van Harmelen — June 11, 2008 @ 21:57

