Imagine the (legitimate) outcry if a local municipality, a State government, or the Federal government in the US deployed an infrastructure that would systematically identify and track people as they went about their daily lives, without a viable option to opt out. While the US has laws that govern when and how data about individuals can be used, the mere availability of such data would create temptations that would be irresistible in practice, yet unnecessary for the functioning of this society.
Blog Archive: 2011
I am seeing an interesting not-quite-yet-a-trend on the emergence of collaborative search tools. I am not talking about research tools such as SearchTogether or Coagmento, but of real companies started for the purpose of putting out a search tool that supports explicit collaboration. The two recent entries in this category of which I am aware are SearchTeam and Searcheeze. While they share some similarities, they are actually quite different tools.
Google recently unveiled Citations, its extension to Google Scholar that helps people to organize the papers and patents they wrote and to keep track of citations to them. You can edit metadata that wasn’t parsed correctly, merge or split references, connect to co-authors’ citation pages, etc. Cool stuff. When it comes to using this tool for information seeking, however, we’re back to that ol’ Google command line. Sigh.
Stephen Robertson’s talk at the CIKM 2011 Industry event caused me to think about recall and precision again. Over the last decade precision-oriented searches have become synonymous with web searches, while recall has been relegated to narrow verticals. But is precision@5 or nDCG@1 really the right way to measure the effectiveness of interactive search? If you’re doing a known-item search, looking up a common factoid, etc., then perhaps it is. But for most searches, even ones that might be classified as precision-oriented, the searcher may need several attempts to get at the answer. Dan Russell’s A Google a Day lists exactly those kinds of challenges: find a fact that’s hard to find.
So how should we think about evaluating the kinds of searches that take more than one query, ones we might term session-based searches?
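To make the contrast concrete, here is a minimal sketch in Python of the standard per-query measures alongside a toy whole-session score. The per-query functions follow the usual definitions of precision@k and nDCG@k; the session measure is a hypothetical illustration I made up for this sketch (it credits the first query in a session that retrieves anything relevant, discounting later queries), not a standard metric.

```python
import math

def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (binary judgments)."""
    return sum(relevances[:k]) / k

def ndcg_at_k(relevances, k):
    """Normalized discounted cumulative gain over the top-k results."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

def session_score(per_query_relevances, k=5):
    """Toy session-level measure (hypothetical): reward the first query
    in the session whose top-k results contain anything relevant,
    discounted by how many queries it took to get there."""
    for q, rels in enumerate(per_query_relevances):
        if precision_at_k(rels, k) > 0:
            return 1.0 / (q + 1)
    return 0.0
```

Under a per-query lens, a session whose first two queries fail completely looks like two zero-precision searches; the toy session measure at least registers that the third query succeeded, which is closer to how a searcher would judge the session.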
HCIR 2011 took place almost three weeks ago, but I am just getting caught up after a week at CIKM 2011 and an actual almost-no-internet-access vacation. I wanted to start off my reflections on HCIR with a summary of Gary Marchionini‘s keynote, titled “HCIR: Now the Tricky Part.” Gary coined the term “HCIR” and has been a persuasive advocate of the concepts represented by the term. The talk used three case studies of HCIR projects as a lens to focus the audience’s attention on one of the main challenges of HCIR: how to evaluate the systems we build.
We are about to deploy an experimental system for searching through CiteSeer data. The system, Querium, is designed to support collaborative, session-based search. This means that it will keep track of your searches, help you make sense of what you’ve already seen, and help you to collaborate with your colleagues. The short video shown below (recorded on a slightly older version of the system) will give you a hint about what it’s like to use Querium.
Many people have asked me why I decided to write a book. A better question is: “When you realized that writing the book was going to be orders of magnitude harder and take much longer than you thought it would, what made you decide to continue writing the book?”
My co-author, Wolfgang Polak, and I recently received a book review of the sort that is the dream of every author. A dream review is, of course, positive. But more importantly, it praises the aspects of the book that were most important to the author – the reasons the author kept going after other books on the subject came out and the author had a more reasonable (but still too optimistic) estimate of the vast amount of effort it would take to finish it. (The review appeared in Computing Reviews, but is behind a paywall. Excerpts appear on the book’s Amazon and MIT Press web pages.)
In our case, one of the things that kept us going…
Critiques of software patents are all the rage lately, from bloggers like Daniel Tunkelang to NPR. The list of problems with them includes that they stifle innovation, that they are tools to beat up small companies and startups, and that they are simply trading cards that big corporations use to protect each other at everyone else’s expense. So why are software patents different from other patents? Why aren’t people arguing about scrapping the patent system entirely?
Last week I had the opportunity to attend a debate-style talk featuring Bob Zeidman (pro) and Prof. Edward A. Lee (con) about software patents hosted by the Computer History Museum, which I found quite helpful in understanding the issues. The motion under consideration was “Software patents encourage innovation.”
The discussion on my previous post has raised some interesting and valid points regarding holding conferences in countries like China that block some (or all) internet traffic. Given that the conference has an audience that extends beyond the location of the conference, how can this audience be served in the presence of country-sponsored firewalls? Specifically, how can we get access to the Twitter stream and to other media being generated by the conference?
A number of ACM groups have recently decided to hold their conferences in China. The list of major conferences includes CSCW 2011, SIGIR 2011, Ubicomp 2011, and ICSE 2011, just to name a few. This seems like a strange trend. The purpose of academic conferences is to disseminate ideas in an open and public manner, and thus the argument has been made that taking these conferences to China will help expose China and Chinese researchers to these Western ideals. Yet what we see, conference after conference, are the restrictions that China imposes on electronic communication.