Ivan’s private site

August 11, 2013

The value of community driven content (OSM vs. Google Map)

Filed under: Links,Social aspects,Work Related — Ivan Herman @ 12:51

This is just a nice little example which might be worth noting for those who do not know Open Street Map (I am also a relatively new user of it).

I had a nice walk in Marseille yesterday, which included going down from the big cathedral on the top of the hill (“Notre Dame de La Garde”) to the seaside. There is a not-very-well-known path behind the church that one can take which is, for my taste anyway, a gorgeous way of doing it.

The path of course appears on Google’s Map: look at the small path going from the church to the “Rue du Bois Sacré”. However, look at the same area using Open Street Map: not only is the path there, but the map gives a bunch of details. Indeed, it is not really a simple path: it is a long series of steps, i.e., do not even try to ride a bike there:-( And because it is a hot city, it is also good to know that there is a small public fountain along the path (and, indeed, it is there and it works!)…

It is not really Google’s fault. They probably got the material from some sort of an official mapping system (they could not get their camera cars or bikes up there…) and there is no way a company, even as huge as Google, can cover such details. But a community-driven site can: people can add such details easily. (Actually, there was part of the path that was missing, and I will add it soon using my GPS readings.) Therein lies the power:-)

May 12, 2012

The fallacy of scientific publications…

Filed under: Social aspects,Work Related — Ivan Herman @ 14:08
Image of Bolyai's original manuscript for non-Euclidean geometry

János Bolyai’s seminal work on non-Euclidean geometry was published as an “Appendix” to his father’s mathematical textbook. It would hardly be considered an academic publication today…

Yesterday a colleague in the UK, Jeni Tennison, published a great blog on her site. The title is probably quite opaque to the non-initiated (“Using “Punning” to Answer httpRange-14”) and the details are not of relevance for now. Suffice it to say that she touches on one of the “permathread” discussions that regularly rage on the various technical mailing lists related to the Semantic Web. Jeni’s blog gives a very clear explanation of the problem and offers a way forward.

Apart from the technical content I was wondering: would that blog ever be considered as part of Jeni’s academic achievements if she were working at an academic institution? And the answer is, sadly, a clear “no”. “No”, because she “just” wrote it as a personal communication, and she did not go through the time-consuming road of “official” publication in a journal or a conference. “No”, because she does not have formal scientific references, “just” references to mailing lists, wiki pages and the like. “No”, because the blog was not officially peer reviewed; alas! the fact that she had very long discussions on some of her ideas on public mailing lists with some of the best known experts in the field does not count. “No”, in spite of the fact that, if her ideas are accepted by the community (which is, of course, in no way sure), these would influence the technical direction for the work of hundreds of people, as well as the practical deployment of systems, software, etc.; at the minimum, there will be dozens if not hundreds of reactions and references to this blog in the days and weeks to come. I can easily make the bet that her piece will have a greater influence on the advancement of a particular area of science and technology than many of the hundreds of academically highly valued papers that are published this year.

Is this Jeni’s loss? If she is to pursue an academic career then, of course, it is. But it is a much greater loss for science, which ignores such intellectual achievements by keeping to its outdated rules of scholarly communication. In fact, it shows that science may have to go back to its old traditions of communication: after all, in the good old times, many of the greatest achievements of science were first published as personal letters or journals. Something has been lost…

(If you are interested in these issues, you may consider looking at the Force11 Community’s Web site and the Force11 Manifesto… that community will, hopefully, evolve significantly in the months to come.)

September 28, 2010

ICT2010 Event Brussels, 2nd day: eGov (#ict2010eu for twitter…)

The main event today, as far as I am concerned, was the Governmental Linked Data session that some of us organized under the auspices of the Open Knowledge Foundation. The idea was to talk about the goals, dreams, and problems of Governmental Linked Data to the non-initiated (and the non-converted:-). I believe (although one is never objective about one’s own child) that the session went really well. There were about 140 people in the audience which, frankly, exceeded my expectations. Josema gave a nice overview of his “dreams”, i.e., what the goals and promises of this whole move are; this was followed by Jonathan’s dreams that were, of course, largely identical to Josema’s, but he also gave some data and facts about what is happening in Europe these days (e.g., in the area of data catalogues). He also referred to the upcoming European data catalogue project (PublicData.eu) which will be a great asset when it comes. Jeni talked not only about her dreams but also about some of the practical experiences in deploying that stuff; as somebody deeply involved in the UK governmental project, i.e., as a person in the trenches, so to say, Jeni was really a great person to talk about that. The fourth and last speaker was Andreas, showing some existing applications on linked governmental data, and also talking about his dream of an application that would, e.g., help in the discussion on problematic societal issues like the Stuttgart 21 project. (Actually, Andreas had the temerity to use the Internet for live demos; with the absolutely awful quality of the network at the conference I would not have dared to do so!) There was also a lively discussion and questions after the presentations, both as part of the official session and after it. It is difficult to say how many people we “reached”, of course, but I think we were successful in getting the idea of Governmental Linked Data more accepted by a wider audience. (B.t.w., there is also a page with all the slide references.)

It was interesting that, later in the day, I had a chat with a colleague who claimed that by now the very idea of linked data, and of governmental linked data, is widely accepted by everybody as a way to go, though, of course, lots of details have to be fleshed out. I may not be as upbeat as he is, but, well, it may just be my usual pessimism…

Other than this session, I also listened to several sessions on the Future Internet. There is now a new funding round on this topic (with a deadline in mid-January), so it obviously drew quite some attention, in spite of the fact that it is quite difficult to grasp what this thing is all about. The goals described by various speakers put an emphasis on the societal aspects of upcoming work, on trying to understand what the profound, societal consequences of the ubiquitous internet presence are, what social changes that will bring, how we can understand, via interdisciplinary work, the evolutions, etc. These are all really exciting questions, although also very difficult. What bothered me a little bit was that all this sounded very familiar: it was the same set of goals outlined by the Web Science Initiative, these days Web Science Trust: just make a global change of “internet” to “Web”, and you get the same! This was all the more disturbing because, when asked about other organizations doing similar work, the representative of the Commission referred to “a UK project called Web Science Initiative, you know, started by Wendy Hall and Tim Berners-Lee…”, i.e., they completely missed the fact that WST is not a UK thing… Missing communications here?

I ranted yesterday on some of the oddities of the conference organization. Sorry, I have to add some more: we (the organizers of the session) sent them the detailed program of the session a few weeks ago. They did put it up on the Web in… Microsoft Word format. What would it have cost them to convert that at least into PDF (or ask us to do it, if necessary), let alone turn it into HTML? At a time when everybody is talking about mobile devices and mobile internet, putting up a piece of information that no mobile phone, for example, can read… (B.t.w., they distributed the program of the conference on a USB stick, which is fine, but with a bunch of programs running on Windows only… When will such organizers learn that there are people out there using Linux or a Mac? Sigh…)

B.t.w.: if you have not realized yet, the #ict2010eu twitter feed contains a huge number of entries, a bunch of which are related to our session…

September 27, 2010

ICT2010 Event Brussels, 1st day (#ict2010eu for twitter…)

ICT2010: the obligatory get-together for many in the European ICT world, to know and understand what the EU Commission plans to do, but also to meet possible partners for the next round of EU project proposals… Huge crowd (to be honest, a bit too huge for my taste), many exhibitions, plenaries (both keynotes and panels) as well as parallel sessions.

The day began with the usual opening ceremony with the opening speech of the Belgian prime minister (Yves Leterme) who (well, we are in Belgium!) made his speech by cutting it into three sections, one in English, one in French, one in Dutch. Fortunately there were translation services for those who do not understand all three… What was much more interesting was the plenary session that followed, which included keynotes and a panel. What to remember of that plenary? Some points:

  • I liked Neelie Kroes’ keynote (for our non-European friends, she is the commissioner for ICT, the direct “boss” of this branch here, although I am not sure she would like to be addressed this way). Not primarily for the content of what she said but more for the (I think) genuine enthusiasm that she seemed to have for the future of ICT in Europe, for the will to do and improve things. She was very upbeat, saying that Europe should get its act together (a theme that came back several times during the day) to move ahead big time. With my background at W3C, I was also sensitive to the emphasis she put on the fundamental importance of standards, on cooperation, on openness.
  • Silvana Koch-Mehrin, vice-president of the European Parliament, had an interesting remark on the internet of things and on the various services that it would generate: that people should have the right, by default, to opt in to services, and not merely the possibility to opt out. After the privacy stories with Facebook, that resonated with me… She also said that the parliament is currently conducting two studies on how the various social ICT tools (email, social sites, etc.) influence the work of the parliament (both in the positive and the negative sense). I would be very interested to read the results of those studies.
  • Christian Reinaudo, CEO of Agfa-Gevaert, gave a talk on how the current plans of the commission fit the goals of a huge company like his, primarily in the area of health care related machinery. It was interesting. And he raised an issue that came up many times during the day: that social relationships are also changing with the ICT evolution (he referred to doctor-patient relationships in his example) and that serious work has to be done to understand those changes and where they lead in the future. I fully agree, I just do not know how such research should be conducted efficiently…

The keynotes were followed by a panel. A question that came up during the discussion was what the measure of success is, i.e., how to tell that the coming years have really evolved in the right way in ICT land. Many of the answers were very “techie”: rate of broadband penetration, number of services, that sort of thing. But then Neelie Kroes intervened, essentially saying “well, I am an economist by training, so I should like the numbers, but I do not. What interests me is whether the quality of life will be improved with those evolutions, if we have more role models for entrepreneurs in Europe, so that people would not always look elsewhere.” Yes, it is way too easy to forget what is important…

Another question that came up a lot, and that was also the main thrust of discussion at the evening panel, is how to ensure that more global companies are born in Europe, that there are more successful start-ups, that not everyone constantly uses Silicon Valley as a reference, etc. This question comes up fairly often and we heard a number of the answers that are usually given: lack of venture capital, different attitudes vis-à-vis, for example, failure, the melting pot aspect of the Californian culture, etc. But it was interesting that there was a clear upbeat tone as well. First of all, that these things have greatly improved in the past few years, with centers around Europe that attract lots of entrepreneurs (Cambridge, Berlin, Sophia Antipolis, Paris, etc.) but also that Europe has a lot to offer in terms of quality of life that does attract many people, and more and more at that. One thing that is really missing is educated people. The universities in Europe are not efficient enough, and the US (at least the Silicon Valley area but, I think, elsewhere, too) is much more open to foreigners coming in, bringing their expertise to start up something (after all, Google, Yahoo, or YouTube have all been set up by foreigners). This has to improve if Europe is ever to get on a par with the rest of the world. (I must say that when I see the successes of such politicians and parties as Geert Wilders in the Netherlands, the Jobbik in Hungary or the Le Pen family in France, I am not very optimistic on the issue of foreigner friendliness. I think Europe is heading in the wrong and frightful direction here. But that would be the topic of another blog…)

The organization of the conference is, however,… awful. An ICT conference where there is no decent wireless access: for crying out loud, this is ridiculous! Such a conference should have exemplary access, letting people tweet and blog and email about what is happening, and letting people not on site chime in via the same tools. Isn’t that ridiculous when we talk about broadband access for everyone, about mobile computing, and all that jazz? Of course, chiming in from the outside may not have worked anyway: the panels were not organized counting on the involvement of the audience. Which is also a shame. The organizers should really learn…

Let us see what tomorrow brings!

November 12, 2009

Pay to be free…

Filed under: Social aspects,Work Related — Ivan Herman @ 17:00

I may not be well informed, so this may be a known approach for some of you, but it is the first time I have seen this…

There has been a tension between (scientific) publishers and authors for a while on whether one is allowed to put one’s publication on the Web. When dealing with traditional publishers the author usually gives away his/her copyright and the papers are rarely available on the Web (which is a source of constant frustration to readers). Fortunately, this is not always the case; for example, the proceedings of the World Wide Web conference series are published by ACM, but the papers are nevertheless available on the Web for free (thanks to IW3C2).

Well, a counter-proposal from a publisher is quite amazing. A Hungarian publisher, Akadémiai Kiadó, offers authors a deal, called the “Optional Open Article”: if you pay the nice sum of 900€, then the paper is also put into an online edition and is made freely available on the Web. (The fact that it is then freely available is clear in the agreement posted on the web site.) Pay for your freedom. Isn’t this wonderful?

And, to make it clear: this is a very prestigious publisher in Hungary; it is related to the Hungarian Academy of Sciences and is, therefore, the prime local publisher for Hungarian scientists…

I find it appalling. But this may only be me.

October 16, 2009

Seduce with free services?

Filed under: General,Links,Social aspects,Work Related — Ivan Herman @ 8:01

I ran into this twice in a week. I hope it is just a coincidence…

The story is simple. You find some service on the Web which looks nice and helpful. There are various options: you may take a minimal service, which is free of charge, or you can also choose extra services for a fee. It sounds like a decent choice: if the minimal service fits your needs, you are happy, if you need more, you pay something. I presume we all use services like that.

But then… if you take the free option, you may get a mail after 2-3 years of usage saying that sorry, the free service will be discontinued next month; you are welcome to upgrade to the paid service, otherwise, well, good bye. As I said, I got this type of mail twice in a week: one from a service giving a minimal synchronization of my phone’s calendar with Google’s, the other providing a simple email certificate for signing my mails. As a matter of principle I will not upgrade; I do not find this approach really acceptable.

So… will Gmail, WordPress, or other similar services decide that they have attracted enough customers and that they can now start charging? As I said, I hope this was just a coincidence and not some sort of a general direction…

April 29, 2009

Drawing consequences on large corpus of data…

Filed under: Social aspects,Work Related — Ivan Herman @ 18:15

I spent some time today reading through the WWW2009 paper on “Mapping the World’s Photos”, from David Crandall et al. [1]. The paper reports on work analyzing a large number (35 million) of photographs extracted from Flickr, including their metadata. The interesting point of the paper is that they combine various analysis tools: they analyze the users’ tags, the geo location in the photos’ metadata, timing information of a series of photos from the same user, and image processing analysis of the photos’ content. The combination of many different types of information leads to a better clustering of the photo data: photos can be organized in terms of location (either on a large scale, i.e., on the level of, say, a city, or on a much smaller scale, i.e., on the level of a landmark like the Eiffel Tower). This clustering can be done without a priori knowledge of the image contents themselves.

There is the technology/algorithmic side of the paper that I cannot really comment on; I am not familiar enough with the clustering algorithms they used. However, at least for me, the more interesting aspect of the paper is the “social” one. As the authors say:

As researchers discovered a decade ago with large-scale collections of Web pages, studying the connective structure of a corpus at a global level exposes a fascinating picture of what the world is paying attention to. In the case of global photo collections, it means that we can discover, through collective behavior, what people consider to be the most significant landmarks both in the world and within specific cities […]; which cities are most photographed […] which cities have the highest and lowest proportions of attention-drawing landmarks […]; which views of these landmarks are the most characteristic […]; and how people move through cities and regions as they visit different locations within them […]. These resulting views of the data add to an emerging theme in which planetary-scale datasets provide insight into different kinds of human activity — in this case those based on images[…].

And this, of course, is really fascinating. But… it can also be dangerous if not done with care, because it is way too easy to jump to false conclusions. Indeed, the study of such a corpus cannot and should not be done, at least in my view, without a careful consideration of social, cultural, and economic issues. (This is of course no criticism of the authors at all, who concentrated on the algorithmic aspect only and did great work at that!)

Let me take one example from the paper: the clustering algorithm produces a table “showing the most photographed places on Earth ranked by number of distinct photographers”. The first 15 cities on the list are: New York, London, San Francisco, Paris, Los Angeles, Chicago, Washington, Seattle, Rome, Amsterdam, Boston, Barcelona, San Diego, Berlin, and Las Vegas. 9 cities from the US, 6 from Western Europe. None from Canada, Asia, Africa, Australia, Latin America… Fascinating (and highly photogenic!) cities like Kyoto, Beijing, Rio de Janeiro, or Istanbul are missing. This is not the fault of the authors: this is what this particular data set, i.e., Flickr, gives you. However, can we, should we say that the World is not paying attention to these cities? I do not think so. To really draw conclusions, one would have to look at the demography of Flickr users, at economic issues, at whether different communities use Flickr or some other photo site elsewhere in the World… The lack of Japanese cities in the list (knowing that Japanese take tons of pictures everywhere they go!) seems to indicate that their attitude towards social sites like Flickr might be different from what we are used to in “the West”. People going to Cairo may not have the same type of sophisticated cameras and easy Internet access to produce Flickr-quality pictures. And there may be many other aspects that I do not even think of at this moment…
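To make the phrase “ranked by number of distinct photographers” a bit more concrete, here is a minimal Python sketch. It is not the authors' code: the function name `rank_places`, the `(user_id, place)` record layout and the sample data are all made up for illustration, and it assumes each photo already carries a place label, whereas the paper derives places by clustering the geotags.

```python
from collections import defaultdict


def rank_places(photos):
    """Rank places by number of distinct photographers, not by photo count.

    `photos` is an iterable of (user_id, place) pairs; both fields are
    hypothetical stand-ins for Flickr metadata.
    """
    photographers = defaultdict(set)
    for user_id, place in photos:
        # A set keeps each photographer once per place,
        # however many photos that person uploaded there.
        photographers[place].add(user_id)
    counts = [(place, len(users)) for place, users in photographers.items()]
    # Sort by descending count, then by name so that ties are deterministic.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Made-up sample: one prolific user should not dominate the ranking.
sample = [
    ("alice", "Paris"), ("alice", "Paris"), ("alice", "Paris"),
    ("bob", "Kyoto"), ("carol", "Kyoto"),
    ("dave", "Paris"), ("erin", "Kyoto"),
]
print(rank_places(sample))
```

In this toy sample Paris has more photos but Kyoto has more distinct photographers, so Kyoto comes first. Whatever the counting rule, the ranking can only reflect the people who upload to the data set in the first place, which is exactly the caveat above.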

This is indeed an exciting line of research. But we, computer scientists, should be modest enough to realize that drawing social conclusions from such data requires us to work with experts in other disciplines. We could then come up with defensible conclusions that would be interesting to explore and exploit. I.e., the future, in this respect, lies in interdisciplinary work.

  1. Crandall, David, Backstrom, Lars, Huttenlocher, Daniel and Kleinberg, Jon (2009) ‘Mapping the World’s Photos’, In Maarek, Y. and Nejdl, W. (eds.), Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain, ACM Press, pp. 761-770. Available online.
