First, I agree with Daniel Tunkelang and many others that this is an important step for Google, an important departure from the ranked list. Hurray! My sense about this way of performing exploratory search is that Google Squared addresses (at least) two different kinds of activity: filtering through structured data, and collecting different instances of some topic. The former, of interest to the set-oriented retrieval crowd, should make it easier to do feature comparison for a variety of domains, typically with some sort of a shopping angle. In this post, I will focus on the second kind, the less structured exploratory search.
To me, exploratory search typically evokes memories of systems built for searching on TREC-6 topics, looking for instances of ferry boat sinkings, for example. I approached Google Squared with that spirit in mind. For some topics, Google Squared can offer some results given just a couple of keywords (old habits die hard), whereas for others it wants several examples of instances that it can then use to find other, presumably related, documents.
I tried the ferry boat accident query in a number of ways, but failed to get any useful hits. Given the news of the day, I then searched for airplane accidents, and got an interesting list. While Google Squared seems quite good at naming aspects of the topic, it is less effective at populating them for some of the hits. The most reliable data seems to have come from Wikipedia, where perhaps it’s available in a sufficiently structured form for the Google scraper to work reliably. But then it took an aggregation page from http://www.planecrashinfo.com/ and reported it as a single event. It also seemed to find a bunch of airline sites that don’t yield useful information on specific accidents. It wasn’t particularly good at filling in values for some aspects, including (oddly) the date of the event and the number of passengers involved. I don’t know how they generate the aspect schema, but it might be useful to be able to give the system feedback (and corrected values) when it makes incorrect inferences.
Having exhausted one macabre topic, I moved on to the next, creating a new square based on the query ‘acts of terrorism.’ I wanted to find descriptions of specific instances of terrorism, both recent and not so recent. The initial result was disappointing, returning a single hit, an article from the Treasury Department that wasn’t really on topic. I then tried to teach the system about what I was interested in by suggesting the World Trade Center bombing. It promptly identified a useful hit, and also suggested a long list of other instances of the kind I was looking for: Oklahoma City bombing, US Embassy bombings in Africa, Bali, Madrid, September 11th attacks. I had to fish around a bit to get the USS Cole, but then it suggested London underground bombings and the anthrax episode in 2001. So far, so good.
But then the suggestion engine ran dry, offering me several other hits on the 9/11 attacks, with no way for me to dismiss them even though I already had those documents. Here is where we get into the tricky parts of relevance feedback in exploratory search: I should be able to say that a particular result is relevant but redundant, or not relevant, or relevant and useful. But Google Squared just offers me suggestions, with no way of controlling them short of coming up with additional queries to run.
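To make the three kinds of judgment concrete, here is a minimal Python sketch of how a suggestion list might respond to them. Nothing here reflects how Google Squared actually works: the `Judgment`, `Candidate`, and `FeedbackSession` names, the `event` label attached to each document, and the term-overlap scoring are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Judgment(Enum):
    USEFUL = "relevant and useful"
    REDUNDANT = "relevant but redundant"
    NOT_RELEVANT = "not relevant"


@dataclass(frozen=True)
class Candidate:
    title: str
    event: str        # hypothetical label for the instance the page describes
    terms: frozenset  # hypothetical bag of descriptive terms


@dataclass
class FeedbackSession:
    saved: list = field(default_factory=list)
    covered_events: set = field(default_factory=set)
    liked_terms: set = field(default_factory=set)
    disliked_terms: set = field(default_factory=set)

    def judge(self, doc, judgment):
        if judgment is Judgment.NOT_RELEVANT:
            # Off-topic: push similar documents down the list.
            self.disliked_terms |= doc.terms
            return
        # Both relevant labels confirm the topic and mark the event as covered.
        self.liked_terms |= doc.terms
        self.covered_events.add(doc.event)
        if judgment is Judgment.USEFUL:
            # Only useful documents are kept; redundant ones are not saved.
            self.saved.append(doc)

    def suggest(self, candidates):
        # Drop further documents about events the user already has.
        fresh = [c for c in candidates if c.event not in self.covered_events]

        def score(c):
            return len(c.terms & self.liked_terms) - len(c.terms & self.disliked_terms)

        return sorted(fresh, key=score, reverse=True)
```

The point of the sketch is that "relevant but redundant" does two things at once: it confirms the topic, and it tells the system to stop offering more documents about the same event, which is exactly the control I was missing.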
So I tried that: I typed “lebanon,” hoping to boost some events I know have occurred there over the last 25 years. I got an overview article from about.terrorism.com. Related, but not what I was looking for. And still no good suggestions. I tried “marine barracks,” and got an article about something in the DC area. “marine barracks lebanon” elicited a useful document, but no useful suggestions. Same for “Khobar Towers.”
In short, the recommendations it was willing to provide based on a two-word query were the standard precision-oriented results, presented in a more interactive tabular format, with some ability on my part to determine the order. But once that “top-10” list was exhausted, the system was unable to generate more useful results, despite the many additional terms I added to specify potentially useful facets. Furthermore, it is interesting to note that the hits were mostly about US and Western interests being attacked, as reported in the Western media. My queries on ‘israel’ and ‘suicide bombings’ yielded only single overview articles and did not generate any more recommendations. Of course I could try starting a new square for that, but the whole point of exploratory search is to support evolving information needs, and Google Squared appears not to be able to do that well.
So the results are mixed. I will need to spend more time with this interesting tool to understand its capabilities. Overall, I am pleased that these steps have been taken, and hope that Google continues to push the envelope by providing truly novel ways for people to explore all the information they have so assiduously organized.