Ivan’s private site

February 22, 2008

Setting up an RDFa file with apache

Filed under: Semantic Web, Work Related — Ivan Herman @ 17:56

As I said yesterday, the SW FAQ file is now in XHTML/RDFa. However, I was wondering how to set up the environment so that the right URI-s would lead to the right format, ie, either HTML or RDF. Of course, one could generate the SW-FAQ.rdf file offline and put that on the server, but that sounded a little bit like cheating (although, I must admit, that is what I did first). What one would like is

  • http://www.w3.org/2001/sw/SW-FAQ should return
    • XHTML by default
    • RDF/XML if so requested, but generated from the XHTML file on-the-fly via an RDFa processor, and to do that with an HTTP 303 redirect (to make it really neat)
  • http://www.w3.org/2001/sw/SW-FAQ.rdf should return RDF/XML, again generated on-the-fly
  • http://www.w3.org/2001/sw/SW-FAQ.html should return, well, XHTML
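
From the client's side, the only difference between the two cases for the plain SW-FAQ URI is the Accept header. A minimal sketch in Python, using only the standard library; the helper name `faq_request` is made up for illustration, and the request is only built here, not sent:

```python
from urllib.request import Request

FAQ_URI = "http://www.w3.org/2001/sw/SW-FAQ"


def faq_request(want_rdf=False):
    """Build (but do not send) a request for the FAQ.

    want_rdf=True asks for RDF/XML through the Accept header; otherwise no
    Accept header is set and the server returns its default, ie, XHTML.
    """
    headers = {"Accept": "application/rdf+xml"} if want_rdf else {}
    return Request(FAQ_URI, headers=headers)


if __name__ == "__main__":
    req = faq_request(want_rdf=True)
    print(req.full_url, req.get_header("Accept"))
```

Passing such a request to `urllib.request.urlopen` would follow the 303 automatically, so the caller would simply receive the RDF/XML.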

It so happens that, on apache, a little bit of .htaccess wizardry works. The problem is that you have to be the wizard, which I am not. Luckily, my colleague and friend Ralph Swick is :-) . So here is the .htaccess file:

RewriteEngine On
RewriteBase /2001/sw/

#This is where the RDFa distiller is called on-the-fly:
RewriteRule ^SW-FAQ\.rdf$ /2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/SW-FAQ.html [L]

# Take care of the RDF case when so requested
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^SW-FAQ$ SW-FAQ.rdf [R=303,L]

RewriteRule ^SW-FAQ$ SW-FAQ.html [L]

And voilà! Thanks Ralph…
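
For those who, like me, are not .htaccess wizards, the decision the rules make can be modelled in a few lines of Python. This is only a sketch of the logic, not of how apache works internally; the function name `resolve` and the (status, target) return value are my own invention, and the patterns are anchored and escaped here for strictness:

```python
import re

# Taken from the rewrite rule: the place where the distiller is called.
DISTILLER = "/2007/08/pyRdfa/extract?uri=http://www.w3.org/2001/sw/SW-FAQ.html"


def resolve(path, accept=""):
    """Model the rules for a path relative to /2001/sw/.

    Returns (status, target): 303 means an external redirect, 200 means
    the target is served directly (ie, an internal rewrite).
    """
    # First rule: the .rdf URI is handed over to the distiller.
    if re.search(r"^SW-FAQ\.rdf$", path):
        return 200, DISTILLER
    if re.search(r"^SW-FAQ$", path):
        # Second rule: RDF/XML was requested, so redirect to the .rdf URI.
        if re.search(r"application/rdf\+xml", accept):
            return 303, "SW-FAQ.rdf"
        # Third rule: everybody else gets the XHTML file.
        return 200, "SW-FAQ.html"
    # No rule applies; the path is served as it is.
    return 200, path


if __name__ == "__main__":
    for path, accept in [("SW-FAQ", ""), ("SW-FAQ", "application/rdf+xml"),
                         ("SW-FAQ.rdf", ""), ("SW-FAQ.html", "")]:
        print(path, repr(accept), "->", resolve(path, accept))
```

Like the RewriteCond, the sketch only checks whether application/rdf+xml appears somewhere in the Accept header; it does not look at q-values.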

February 21, 2008

RDFa Syntax LC is out

Filed under: Python, Semantic Web, Work Related — Ivan Herman @ 19:41

The RDFa Syntax Last Call document has just been published; yey!

I have also made an update of the RDFa processor that I coded last summer; it is now available for download and is also used through the “RDFa Distiller” service page. I have played with RDFa in practical terms, too; my foaf file in HTML, the W3C SW Activity Home page, and the Semantic Web FAQ page are all annotated with RDFa now. Once one is used to it, it is fairly straightforward to add even complex RDF statements to HTML pages, with an arbitrarily large number of different vocabularies mixed in. Of course, authoring tools would be good, but let us take things one step at a time… Having the Last Call published (ie, the Working Groups believe they have taken care of all technical issues) is a major step forward!

B.t.w., Benjamin Nowack jumped on the SW-FAQ RDF file to make a nice little hack; here is the mail he sent on the SW SWEO list the other day:

Heh, silly stuff, just FYI: On the #foaf channel is foafbot (a SPARQLy reincarnation
of an earlier bot we had there years ago). It understands RDFa, and allows the
specification of custom commands at [1]. I made it load Ivan’s FAQ, and
created an “faq” command, so that you can now pass a keyword or phrase
to the bot and it will respond with a pointer to the FAQ (if something
matched the RDFa-encoded question), e.g.:

<bengee> foafbot, faq giant ontology

<foafbot> bengee, see http://www.w3.org/2001/sw/SW-FAQ#whgiantont

;)

Benji

[1] http://semsol.org/semcamp/sparqlbot

Isn’t that cool? As far as I could see, it took him about 10 minutes to add this hack, thanks to the SW-FAQ being in RDF…

SW for Health Care and Life Sciences Workshop, W3C Track

Filed under: Semantic Web, Work Related — Ivan Herman @ 11:10

The program for WWW2008 is really shaping up. I already blogged a while ago about the SW-related events at the conference, and about the LOD workshop program yesterday. Well, the program of the Health Care and Life Sciences Workshop is also public now. Again, lots of great stuff there. Last but not least: the program of the W3C Track is also public with, as usual, a SW session (and others!).

It will be an interesting week (in an interesting place).

February 20, 2008

Linked Data on the Web Workshop in Beijing

Filed under: Semantic Web, Work Related — Ivan Herman @ 17:06

The preliminary programme for the “Linked Data on the Web” Workshop (one of the workshops at the WWW2008 conference) is now online. It looks really good… worth checking out!

February 14, 2008

New version of the SPARQL Python wrapper

Filed under: Code, Python, Semantic Web, Work Related — Ivan Herman @ 13:49

About half a year ago I announced the availability of a SPARQL endpoint interface to Python. It was really a beta release back then (ie, last July), but recently it went through a more thorough testing and improvement cycle. The credit for this is not mine; all praise should go to Sergio Fernàndez and Carlos Tejo (both from CTIC Foundation, Spain), who decided to use the package in one of their internal projects. They revealed some problems (of course…), and we then worked together to prepare a proper 1.0 release. It is my pleasure to consider them co-authors of this small package!

As before, the code is available from my site; the API documentation is included in the distribution (and is also available online). However, the project has also been moved to sourceforge, and is now available there, too (including the on-line documentation).
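
For readers who wonder what such a wrapper hides, here is a rough, standard-library-only sketch of the two chores involved: putting the query into the endpoint URL and unpacking a result in the SPARQL JSON format. This is not the API of the package; the function names and the example.org endpoint are made up, and the sketch assumes an endpoint URL without a query string of its own:

```python
import json
from urllib.parse import urlencode


def query_url(endpoint, query):
    """Build the GET URL for a SPARQL query (sent as the 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


def bindings(results_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    print(query_url("http://example.org/sparql",
                    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"))
```

The flattening throws away the type and datatype information of each value, which a real client would probably want to keep.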

February 12, 2008

On the “Google generation”

Filed under: General, Private, Semantic Web, Work Related — Ivan Herman @ 17:03

I stumbled across an interesting study made for the British Library and the UK Joint Information Systems Committee (JISC) on the “Google generation” and the Web, more exactly search. The goal of the study is to analyse this generation’s behaviour in terms of Web usage, more specifically in terms of finding information on the Web and the role that libraries can play. The target audience of the study is libraries and librarians, ie, it is a somewhat specialized view, but it is nevertheless an interesting read.

Unfortunately, it is long. But, luckily, there is an executive summary; although it is 35 pages, it is worth reading. (I must admit I have not yet found the time or energy to read more than this summary.) I reproduced (in slightly shortened form) slides 18 to 20 at the end of this post: these are “myth buster” results that I found fairly interesting…

There are also some nice comments and predictions on future evolution (and on the need for libraries to react to it), which include a note on the Semantic Web. I quote from the pages on “looking into the future”:

The world wide web as we have seen and experienced it so far could be completely revolutionised by the advent of the `semantic web’. [...] Some pundits believe that this scenario is very far away and, indeed that it may never happen on a wide scale. Our view is that the semantic web is a tool that will reach its tipping point fairly soon. In five years, 2013, there could be substantial developments that might allow a whole generation of undergraduates to begin to experience its potential.

This is especially likely to be the case in niche areas, like e-Science, especially biology, creating new opportunities for major research libraries to be involved in completely new forms of activity such as real-time publishing and the sharing of experimental data on the internet.

Note that the text also refers to “sharing experimental data”. Amen! :-)


So here are some of the myths about this generation and the related findings of the study:

They are more competent with technology
Our verdict: Generally true, we think, but older users are catching up fast. [...]
They have very high expectations of ICTs
Our verdict: Probably true, since we live in a global web culture dominated by a handful of unifying brands. [...] this expectation is relative, all of us are information consumers now.
They prefer interactive systems and are turning away from being passive consumers of information
Our verdict: Generally true, as borne out by young people’s media consumption patterns: passive media such as television and newspapers are in decline.
They have shifted decisively to digital forms of communication: texting rather than talking
Our verdict: Open. it is very difficult to see messaging as a fundamental trend, its current popularity is certainly influenced by its relatively low cost compared with voice.
They multitask in all areas of their lives
Our verdict: Open. There is no hard evidence. However, it is likely that being exposed to online media early in life may help to develop good parallel processing skills. The wider question is whether sequential processing abilities, necessary for ordinary reading, are being similarly developed.
They are used to being entertained and now expect this of their formal learning experience at university
Our verdict: Open. [...] We are a little concerned by the current interest in using games technologies to enhance students’ learning and library-based experience. When broadcast news makers introduced entertainment show production techniques 20-30 years ago, research showed that these enhanced `interest’ but impeded the absorption of information.
They prefer visual information over text
Our verdict: A qualified yes, but text is still important. As technologies improve and costs fall, we expect to see video links beginning to replace text in the social networking context. However, for library interfaces, there is evidence that multimedia can quickly lose its appeal, providing short-term novelty.
They have zero tolerance for delay and their information needs must be fulfilled immediately
Our verdict: No. We feel that this is a truism of our time and there is no hard evidence to suggest that young people are more impatient in this regard.[...]
They find their peers more credible as information sources than authority figures
Our verdict: On balance, we think this is a myth. Research in the specific context of the information resources that children prefer and value in a secondary school setting shows that teachers, relatives and textbooks are consistently valued above the internet. We feel this statement has more to do with social networking sub-culture and teenagers’ naturally rebellious tendencies[...]
They need to feel constantly connected to the Web
Our verdict: We do not believe that this is a specific Google generation trait. Recent research by Ofcom shows that the over-65s spend four hours a week longer online than 18-24s. We suspect that factors specific to the individual, personality and background, are much more significant than generation.
They are the `cut-and-paste’ generation
Our verdict: We think this is true, there is a lot of anecdotal evidence and plagiarism is a serious issue.
They pick up computer skills by trial-and-error
Our verdict: This is a complete myth. The popular view that Google generation teenagers are twiddling away on a new device while their parents are still reading the manual is a complete reversal of reality[...]
They prefer quick information in the form of easily digested chunks, rather than full text
Our verdict: This is a myth. CIBER deep log studies show that, from undergraduates to professors, people exhibit a strong tendency towards shallow, horizontal, `flicking’ behaviour in digital libraries. Power browsing and viewing appear to be the norm for all. The popularity of abstracts among older researchers rather gives the game away. Society is dumbing down.
They are expert searchers
Our verdict: This is a dangerous myth. Digital literacies and information literacies do not go hand in hand. A careful look at the literature over the past 25 years finds no improvement (or deterioration) in young people’s information skills.
They think everything is on the web (and it’s all free)
Our verdict: Open. Anecdotally, this appears to be true for a large minority of young people, but no one seems to have framed a research question in this form and investigated it more deeply. Certainly this was a prevalent view earlier in the evolution of the internet, indeed its central ethos. To reverse the question, there is much evidence that young people are unaware of library-sponsored content, or at least reluctant to use it. This is the library’s problem, not the fault of young people.
They do not respect intellectual property
Our verdict: This seems to be only partly true. Findings from Ofcom surveys reveal that both adults and children (aged 12-15) have very high levels of awareness and understanding of the basic principles of intellectual property. However, young people feel that copyright regimes are unfair and unjust and a big age gap is opening up.[...]
They are format agnostic
Our verdict: This may be true of some users, young and old, but not all. We have not found any careful analysis of this question, which is surprising given its import for libraries and publishers alike. We suspect that this is no longer a meaningful issue: content is no longer format dependent in cyberspace.

February 4, 2008

Europe: the nationality of a First Lady

Filed under: General, Private — Ivan Herman @ 9:54

One of the nice and also interesting aspects of the European Union is that people take many of its advantages for granted. At a time when “Euroscepticism” has a certain (in my view, unjustified) popularity, it is worth reminding people of the small things (like the Schengen agreement) that are really part of our life in a unified Europe, and of the advantages it brings.

Living as a European national in another European Union country has become the most natural thing of all. I live in the Netherlands with a French passport, my colleague with whom I share an office is a British national; I have several colleagues (French, German, Dutch, Italian) who do not live in the country of their origin. And, administratively, this has become a breeze, not much more complicated than moving to another province (yeah, it is not all that rosy: the “portability” of pensions, for example, is not yet solved, but there is progress).

This weekend has seen a highly visible case. The current French president, Nicolas Sarkozy, has married a woman called Carla Bruni. The publicity around this took (in my view) a completely ridiculous turn in the French celebrity magazines, but I did find one interesting aspect of the whole story. Indeed, the new First Lady of France is… an Italian national! I wonder whether she will be asked to take French nationality; I sincerely hope not. There is absolutely no reason: her presence in that position is just another proof that, well, Europe works…

February 1, 2008

New DCMI documents published

Filed under: Semantic Web, Work Related — Ivan Herman @ 14:16

I must admit I did not realize this… but the DCMI has just published two important documents. It was not such a long time ago, though, so I may not be the only one who missed it (thanks to Kjetil Kjernsmo, who drew my attention to this in a mail he sent to the Linking Open Data mailing list).

The first document is an update of the well-known DC terms that we all use. This updated “DCMI Metadata Terms” defines the ranges and domains of most of the terms and puts all these terms into the same namespace (http://purl.org/dc/terms/). The old namespace (http://purl.org/dc/elements/1.1/) remains untouched, ie, no documents using it become “invalid” in any sense, but it is probably better to start using the /terms/ namespace: it offers better possibilities for semantic inference. There is separate documentation for the terms, and the namespace dereferences (of course…) to the appropriate RDF/XML document.
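
For those who want to try the /terms/ namespace on existing data, the mechanical part of the move is small. The sketch below maps a property URI from the old namespace to the new one; the helper name `to_terms` is made up, and it only touches the fifteen classic element names. Treat it as a starting point rather than a migration tool: because the /terms/ properties come with ranges, a value that was fine as a plain string under the old property may need a second look under the new one.

```python
OLD_NS = "http://purl.org/dc/elements/1.1/"
NEW_NS = "http://purl.org/dc/terms/"

# The fifteen classic Dublin Core element names.
LEGACY_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_terms(uri):
    """Map an old-namespace property URI to its /terms/ counterpart.

    URIs outside the old namespace, or with an unknown local name,
    are returned unchanged.
    """
    if uri.startswith(OLD_NS):
        local_name = uri[len(OLD_NS):]
        if local_name in LEGACY_ELEMENTS:
            return NEW_NS + local_name
    return uri


if __name__ == "__main__":
    print(to_terms("http://purl.org/dc/elements/1.1/title"))
```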

For those who are familiar with the DCMI “Abstract model” but also care about the Semantic Web, the second document is also interesting. The title of the document tells it all: “Expressing Dublin Core metadata using the Resource Description Framework (RDF)”…

See also the DCMI news items on the new term document and the RDF one for more details.
