Ivan’s private site

November 12, 2009

Pay to be free…

Filed under: Social aspects,Work Related — Ivan Herman @ 17:00

I may not be well informed, so this may be a known approach for some of you, but this is the first time I have seen it…

There has been tension between (scientific) publishers and authors for a while over whether one is allowed to put one’s publication on the Web. When dealing with traditional publishers, the author usually gives away his/her copyright and the papers are rarely available on the Web (which is a source of constant frustration for readers). Fortunately, this is not always the case; for example, the proceedings of the World Wide Web conference series are published by ACM, but the papers are nevertheless available on the Web for free (thanks to IW3C2).

Well, a counter-proposal from a publisher is quite amazing. A Hungarian publisher, Akadémiai Kiadó, offers authors a deal called the “Optional Open Article”: if you pay the nice sum of 900€, then the paper is also put into an online edition and is made freely available on the Web. (The fact that it is then freely available is clear in the agreement posted on the web site.) Pay for your freedom. Isn’t this wonderful?

And, to make it clear: this is a very prestigious publisher in Hungary; it is related to the Hungarian Academy of Sciences and is, therefore, the prime local publisher for Hungarian scientists…

I find it appalling. But this may only be me.

October 16, 2009

Seduce with free services?

Filed under: General,Private,Social aspects,Work Related — Ivan Herman @ 8:01

I ran into this twice in a week. I hope it is just a coincidence…

The story is simple. You find some service on the Web which looks nice and helpful. There are various options: you may take a minimal service, which is free of charge, or you can also choose extra services for a fee. It sounds like a decent choice: if the minimal service fits your needs, you are happy; if you need more, you pay something. I presume we all use services like that.

But then… if you take the free option, you may get a mail after 2-3 years of usage saying that, sorry, the free service will be discontinued next month; you are welcome to upgrade to the paying service, otherwise, well, goodbye. As I said, I got this type of mail twice in a week: one from a service giving a minimal synchronization of my phone’s calendar with Google’s, the other from a service providing a simple email certificate for signing my mails. As a matter of principle I will not upgrade; I do not find this approach really acceptable.

So… will Gmail, WordPress, or other similar services decide that they have attracted enough customers and can now start charging? As I said, I hope this was just a coincidence and not some sort of general direction…

April 29, 2009

Drawing conclusions from a large corpus of data…

Filed under: Social aspects,Work Related — Ivan Herman @ 18:15

I spent some time today reading through the WWW2009 paper on “Mapping the World’s Photos”, by David Crandall et al. [1]. The paper reports on work analyzing a large number (35 million) of photographs extracted from Flickr, including their metadata. The interesting point of the paper is that the authors combine various analysis tools: they analyse the users’ tags, the geolocation in the photos’ metadata, timing information of a series of photos from the same user, and image processing analysis of the photos’ content. The combination of many different types of information leads to a better clustering of the photo data: photos can be organized in terms of location (either on a large scale, i.e., on the level of, say, a city, or on a much smaller scale, i.e., on the level of a landmark like the Eiffel Tower). This clustering can be done without a priori knowledge of the image contents themselves.

There is the technology/algorithmic side of the paper that I cannot really comment on, because I am not familiar enough with the clustering algorithms they used. However, at least for me, the more interesting aspect of the paper is the “social” one. As the authors say:

As researchers discovered a decade ago with large-scale collections of Web pages, studying the connective structure of a corpus at a global level exposes a fascinating picture of what the world is paying attention to. In the case of global photo collections, it means that we can discover, through collective behavior, what people consider to be the most significant landmarks both in the world and within specific cities […]; which cities are most photographed […] which cities have the highest and lowest proportions of attention-drawing landmarks […]; which views of these landmarks are the most characteristic […]; and how people move through cities and regions as they visit different locations within them […]. These resulting views of the data add to an emerging theme in which planetary-scale datasets provide insight into different kinds of human activity — in this case those based on images[…].

And this, of course, is really fascinating. But… it can also be dangerous if not done with care, because it is way too easy to jump to false conclusions. Indeed, the study of such a corpus cannot and should not be done, at least in my view, without a careful consideration of social, cultural, and economic issues. (This is of course no criticism of the authors at all, who concentrated on the algorithmic aspect only and did great work at that!)

Let me take one example from the paper: the clustering algorithm produces a table “showing the most photographed places on Earth ranked by number of distinct photographers”. The first 15 cities on the list are: New York, London, San Francisco, Paris, Los Angeles, Chicago, Washington, Seattle, Rome, Amsterdam, Boston, Barcelona, San Diego, Berlin, and Las Vegas. 9 cities from the US, 6 from Western Europe. None from Canada, Asia, Africa, Australia, Latin America… Fascinating (and highly photogenic!) cities like Kyoto, Beijing, Rio de Janeiro, or Istanbul are missing. This is not the fault of the authors: this is what this particular data set, i.e., Flickr, gives you. However, can we, should we say that the World is not paying attention to these cities? I do not think so. To really draw conclusions, one would have to look at the demography of Flickr users, at economic issues, at whether different communities use Flickr or some other photo site elsewhere in the World… The lack of Japanese cities in the list (knowing that Japanese take tons of pictures everywhere they go!) seems to indicate that their attitude towards social sites like Flickr might be different from what we are used to in “the West”. People going to Cairo may not have the same type of sophisticated cameras and easy Internet access to produce Flickr-quality pictures. And there may be many other aspects that I do not even think of at this moment…
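To make the ranking criterion concrete, here is a minimal sketch of what "ranked by number of distinct photographers" means, as opposed to ranking by the raw number of photos. It is only an illustration: the records and the function names are hypothetical, and this is not the clustering method used in the paper.

```python
from collections import defaultdict

# Hypothetical (photographer_id, place) records, invented for illustration.
# They are NOT data from the paper or from Flickr.
photos = [
    ("u1", "Paris"), ("u1", "Paris"), ("u1", "Paris"), ("u1", "Paris"),
    ("u2", "Paris"),
    ("u2", "London"), ("u3", "London"), ("u4", "London"),
    ("u5", "Kyoto"),
]


def rank_by_distinct_photographers(records):
    """Return (place, photographer_count) pairs, most photographers first."""
    photographers = defaultdict(set)
    for user, place in records:
        photographers[place].add(user)  # a set counts each user only once
    counts = [(place, len(users)) for place, users in photographers.items()]
    # Sort by descending count, then by name so that ties are deterministic.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


def rank_by_photo_count(records):
    """Return (place, photo_count) pairs, most photos first."""
    counts = defaultdict(int)
    for _user, place in records:
        counts[place] += 1  # every photo counts, even from the same user
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(rank_by_distinct_photographers(photos))
    print(rank_by_photo_count(photos))
```

In this toy data one prolific user puts Paris first by photo count, while London comes first by distinct photographers. Either way, the ranking can only reflect who uploads to the site in the first place.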

This is indeed an exciting line of research. But we, computer scientists, should be modest enough to realize that drawing social conclusions from such data requires us to work with experts in other disciplines. We could then come up with defensible conclusions that would be interesting to explore and exploit. I.e., the future, in this respect, lies in interdisciplinary work.

  1. Crandall, David, Backstrom, Lars, Huttenlocher, Daniel and Kleinberg, Jon (2009) ‘Mapping the World’s Photos’, In Maarek, Y. and Nejdl, W. (eds.), Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain, ACM Press, pp. 761-770. Available online.
