I had a great time reading a paper on Semantic Search[1]. Although the paper covers the details of a specific Semantic Web search engine (DERI’s SWSE), I read it as somebody who is not really familiar with the intricate details of setting up and operating such a search engine (i.e., I would not dare to give an opinion on whether the choices made by this group are better or worse than those made by the developers of other engines) and who wants to get a good picture of what is happening in general. For that purpose, this paper was really interesting and instructive. It is long (ca. 50 pages), so I did not even try to understand everything on a first reading, but it did give a great overall impression of what is going on.
One of the “associations” I had, maybe somewhat surprisingly, was with another paper I read recently, namely a report on basic profiles for Linked Data[2]. In that paper Nally et al. look at which “subsets” of the current Semantic Web specifications could be defined, as “profiles”, for the purpose of publishing and using Linked Data. This was also a general topic at a W3C Workshop on Linked Data Patterns at the end of last year (see also the final report of the event), and it is no secret that W3C is considering setting up a relevant Working Group in the near future. Well, the experiences of an engine like SWSE might come in very handy here. For example, SWSE uses a subset of the OWL 2 RL Profile for inferencing; that may be a good input for a possible Linked Data profile (although the differences are really minor, if one looks at the appendix of the paper that lists the rule sets the engine uses). The idea of “Authoritative Reasoning” is also interesting and possibly relevant. That approach makes a lot of pragmatic sense, and I wonder whether it should not be documented somehow for general use. And I am sure there are more examples: in general, analyzing the experiences of major Semantic Web search engines in handling Linked Data might provide a great set of inputs for such pragmatic work.
I was also wondering about a very different issue. A great deal of work had to be done in SWSE on the proper handling of owl:sameAs. On the other hand, one of the recurring discussions on various mailing lists and elsewhere is whether the usage of this property is semantically o.k. or not (see, e.g., [3]). A possible alternative would be to define (beyond owl:sameAs) a set of properties borrowed from the SKOS Recommendation, like closeMatch, exactMatch, broadMatch, etc. It is almost trivial to generalize these SKOS properties for the general case but, reading this paper, I was wondering: what effect would such predicates have on search? Would they make it more complicated or, in fact, would such predicates make the life of search engines easier by providing “hints” that could be used for the user interface? Or both? Or is it already too late, because the usage of owl:sameAs is already so prevalent that it is not worth touching? I do not have a clear answer at this moment…
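To make the question a bit more concrete, here is a minimal sketch of what the difference might look like on the engine side. It is not SWSE’s actual consolidation code; the function name `consolidate`, the use of plain string triples, and the reuse of the `skos:` prefix for the generalized match predicates are all assumptions made for illustration. The sketch merges owl:sameAs aliases into equivalence classes with a simple union-find, while the weaker match predicates are only collected as “hints” and never cause a merge.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"
# Hypothetical: generalized SKOS-style predicates, treated as hints only.
HINT_PREDICATES = {"skos:exactMatch", "skos:closeMatch", "skos:broadMatch"}


def consolidate(triples):
    """Merge owl:sameAs aliases into classes; keep weaker matches as hints."""
    parent = {}

    def find(term):
        # Return the canonical identifier of the class that term belongs to.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    hints = []
    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                # Keep the lexicographically smaller identifier as canonical.
                if b < a:
                    a, b = b, a
                parent[b] = a
        elif p in HINT_PREDICATES:
            hints.append((s, p, o))

    # Group every term seen in an owl:sameAs triple under its canonical id.
    classes = defaultdict(set)
    for term in list(parent):
        classes[find(term)].add(term)

    # Rewrite hints onto canonical identifiers; they never trigger a merge.
    canonical_hints = [(find(s), p, find(o)) for s, p, o in hints]
    return dict(classes), canonical_hints
```

With such a split, a user interface could, for example, show merged resources as a single result and list the hinted ones as “related” entries; whether that actually makes the life of an engine easier is exactly the open question.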
Thanks to the authors!
[1] A. Hogan, et al., “Searching and Browsing Linked Data with SWSE: the Semantic Web Search Engine”, Journal of Web Semantics, vol. 9, no. 4, pp. 365-401, 2011.
[2] M. Nally and S. Speicher, “Toward a Basic Profile for Linked Data”, IBM developerWorks, 2011.
[3] H. Halpin, et al., “When owl:sameAs Isn’t the Same: An Analysis of Identity in Linked Data”, Proceedings of the International Semantic Web Conference, pp. 305-320, 2010.












