Query suggestion vs. term suggestion


Diane Kelly presented an interesting (and much tweeted-about) paper at SIGIR this week. The paper, “A Comparison of Query and Term Suggestion Features for Interactive Searching,” co-written with Karl Gyllstrom and Earl Bailey, looks at the effects that query and term suggestions have on users’ performance and preferences. These are important topics for interactive information seeking, both for known-item and exploratory search.

Term suggestion is based on relevance feedback (or pseudo-relevance feedback), but is sometimes problematic because people may not understand why particular terms are selected. There are at least two ways to mitigate this problem. The first, presenting recommended terms in context, has been suggested by Diriye, Blandford and Tombros in their recent JCDL paper “Polyrepresentational Approach to Interactive Query Expansion,” which I blogged about earlier. The second alternative, evaluated in this paper, is to offer users complete query suggestions rather than individual terms. The experiment compared the effectiveness of user-generated query and term recommendations with that of automatically generated ones.

A few things struck me in this paper:

  • The authors generated automated query recommendations by clustering results of a user’s query, and then deriving query terms characteristic of each cluster. They found that often the best queries were generated from clusters whose average rank in the original result list was in the 50-100 range. This is further evidence in support of our collaborative search algorithms that examine the tails of ranked lists of documents for potentially-useful information. While people will typically overlook these results, systems that support exploratory search may profit from mining that part of the result lists.
  • The experiments described in this paper showed that “subjects who received user-generated query suggestions performed the best,” that is, better than those who received term suggestions or automatically-generated query suggestions. Could the relatively higher lexical coherence of these queries account for their advantage? It would be interesting to measure the extent to which the more naturally-phrased queries improved performance or people’s perception. In other words, would a query suggestion like “auto manufacturer recall” result in better performance than the more awkward “minor major reasons automobile”?
  • There was some disagreement among participants as to the value of suggestions, with some participants declaring that suggestions were useless, while others adopted them as the primary means of constructing queries. These results parallel my findings in an experiment where subjects were allowed to construct queries either by typing words, by selecting passages, or by clicking on automatically-generated query-mediated links. That experiment showed no difference in performance among the three interface conditions on a TREC-like search task, but a cluster analysis showed that subjects preferred particular styles of interaction: some did a lot of typing, while others preferred to follow links. The implication here is that information exploration interfaces should support multiple means of eliciting users’ information needs, as cognitive styles (and perhaps prior experience) can affect people’s choices and preferences.
  • People found suggestions useful even when they did not select them, because the suggestions prompted them to think about the search topic in different ways. This result has interesting methodological implications for measuring the effectiveness of query suggestions. More exploration, please!
  • Finally, it is not surprising that subjects relied more on suggestions for difficult topics. Again, this argues for an increased range of search support tools for exploratory search, as these kinds of searches tend to be more difficult to express than known-item searches.
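
To make the cluster-then-derive-terms idea from the first point above a bit more concrete, here is a minimal Python sketch. It is not the authors' method (the post does not give their clustering or term-selection details); the greedy Jaccard clustering, the term-scoring heuristic, the stopword list, and every function name below are assumptions made purely for illustration.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "for", "in", "to", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_results(docs, threshold=0.2):
    """Greedy single-pass clustering of a ranked result list.

    Each result joins the first cluster whose seed (first) document is
    similar enough; otherwise it starts a new cluster. Ranks are 1-based.
    """
    clusters = []  # each cluster is a list of (rank, token_set)
    for rank, text in enumerate(docs, start=1):
        tokens = set(tokenize(text))
        for cluster in clusters:
            if jaccard(tokens, cluster[0][1]) >= threshold:
                cluster.append((rank, tokens))
                break
        else:
            clusters.append([(rank, tokens)])
    return clusters


def suggest_queries(docs, n_terms=3, threshold=0.2):
    """Derive one candidate query per cluster, plus the cluster's average rank."""
    clusters = cluster_results(docs, threshold)
    # Number of results (across all clusters) that contain each term.
    doc_freq = Counter(t for c in clusters for _, toks in c for t in toks)
    suggestions = []
    for cluster in clusters:
        in_cluster = Counter(t for _, toks in cluster for t in toks)
        # Favor terms that are frequent in this cluster and rare elsewhere;
        # ties are broken alphabetically so the output is deterministic.
        ranked_terms = sorted(
            in_cluster,
            key=lambda t: (-in_cluster[t] * in_cluster[t] / doc_freq[t], t),
        )
        avg_rank = sum(rank for rank, _ in cluster) / len(cluster)
        suggestions.append({
            "query": " ".join(ranked_terms[:n_terms]),
            "avg_rank": avg_rank,
            "size": len(cluster),
        })
    return suggestions


def tail_suggestions(suggestions, low, high):
    """Keep only suggestions whose cluster's average rank lies in [low, high]."""
    return [s for s in suggestions if low <= s["avg_rank"] <= high]


if __name__ == "__main__":
    results = [
        "toyota recall brake pedal",
        "toyota recall accelerator pedal",
        "honda airbag recall lawsuit",
        "honda airbag defect lawsuit",
    ]
    for suggestion in suggest_queries(results):
        print(suggestion)
```

With a realistic result list, something like `tail_suggestions(suggest_queries(results), 50, 100)` would isolate suggestions drawn from the part of the ranking that people rarely look at, which is where the post reports the best automatic queries often came from. Note that a bag-of-terms heuristic like this tends to produce exactly the kind of awkward, unordered queries discussed in the second point.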

Overall, this was an interesting paper that only scratches the surface in terms of analysis of the collected data. I look forward to more results from this experiment, and to other follow-on experiments that attempt to tease apart the factors that contribute to the uptake of search suggestions.


