This should really be a comment on Péter’s comment on my previous blog post, but it has become a separate topic, so I decided to put it into a separate post. Besides, it is a bit too long for a comment…
To summarize: the JWS journal has a pre-print service that runs, as its back end, the openacademia software developed by Péter and his friends. This also means that the JWS data should be accessible in RDF, probably following the SWC ontology (although I have not found a pointer on the JWS site).
But, if so, don’t we have a low-hanging, hm, dogfood here for the SW community? We are beginning to have most of the recent SW publications in RDF somewhere on the net. Beyond the JWS papers, the Semantic Web Conference Corpus site not only includes the RDF data for ISWC, ESWC, ASWC, and some related workshops, but it also has a SPARQL endpoint. I know that Daniel Schwabe is working on getting the WWW2008 conference material into a similar format and, hopefully, we can have the material for the WWW200X conferences available somewhere on the Web. I maintain a list of books on a wiki (well, hopefully, the community maintains it…) but I also keep the same list on Bibsonomy, and the list is therefore available in RDF, too (again, using the SWC ontology). And there might be other resources that I do not know about.
So… the easy thing to do is to integrate all this RDF data behind a single SPARQL endpoint. Because the data is already in RDF, that does not cost anything (although I am not 100% sure all the data follow the same vocabulary, so querying might be a bit tricky). But what I would love to see is a general service with a nice user interface on top of the data. I want to be able to search easily through the data without writing SPARQL queries or diving into the RDF graph directly with an RDF browser. The scale can be tricky. A few weeks ago David Huynh created a nice Exhibit page for the ESWC2008 data. It really looks great and helps a lot in searching the data. However… as an experiment I copied his file and added a few more datasets from the SW Corpus. Well… it turned out to be too much for Exhibit (I may have made a mistake somewhere, of course, but I do not believe Exhibit is good enough for that amount of data). I.e., a more dedicated interface should be created to provide this service for end users (maybe along the lines of openacademia?).
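To make the vocabulary worry a bit more concrete, here is a minimal Python sketch of what merging two RDF sources might involve. It is only an illustration: the triples are made up, the `example.org` URIs are hypothetical, and the assumption that one source uses `dc:title` while another uses `rdfs:label` is mine. I have not checked which properties the actual datasets use.

```python
# Triples are plain (subject, predicate, object) tuples; all data below is made up.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Assumption: one source names papers with dc:title, another with rdfs:label.
PREDICATE_ALIASES = {RDFS_LABEL: DC_TITLE}


def merge(*datasets):
    """Union several triple collections, mapping aliased predicates to one."""
    merged = set()
    for triples in datasets:
        for s, p, o in triples:
            merged.add((s, PREDICATE_ALIASES.get(p, p), o))
    return merged


def search_titles(graph, keyword):
    """Return sorted (subject, title) pairs whose title contains the keyword."""
    needle = keyword.lower()
    return sorted(
        (s, o) for s, p, o in graph if p == DC_TITLE and needle in o.lower()
    )


# Hypothetical sample data standing in for a conference dump and a journal dump.
conference_data = [("http://example.org/paper/1", DC_TITLE, "Querying Linked Data")]
journal_data = [("http://example.org/preprint/7", RDFS_LABEL, "Linked Data in Practice")]

graph = merge(conference_data, journal_data)
hits = search_titles(graph, "linked data")
```

Merging really is just a set union, which is why it "does not cost anything"; the alias table is where the tricky part would hide, since without it the search would silently miss the second source.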
And, of course, it is easy to have nice ideas on how to add new features with all the data around… For example, the book wiki page has references to Chris Bizer’s bookmashup data via the ISBN numbers. We could use DBpedia and Geonames to access information on conference cities, FOAF data on authors and editors… We could use some good service (like MOAT) to have a uniform tagging system for the papers’ topics, or use Ed Summers’ Library of Congress Subject Headings in SKOS… In other words, this could become a nice LOD application, too! (Hm, maybe it is not such a low-hanging dogfood after all?)
What I would really like is to get a comment on this blog saying “you uninformed fool, this already exists here and here!”. I would humbly stand corrected, and would happily use the service. Anyone with this comment?
