Most popular web search engines are optimized for precision: getting the one useful document into the top five or ten hits so that the user doesn’t have to page through the results to find it. This works well for known-item search (finding the address of a restaurant, the birthday of a movie star, etc.) and for searches that rely on combinations of keywords.
But some kinds of information needs don’t fit that pattern well. Sometimes the information being sought is spread over multiple documents; sometimes people need to find several documents that match a query so they can compare or contrast them. The task becomes recall-oriented rather than precision-oriented. Furthermore, these searches may be repeated over time, as the user finds information that causes the information need to change. Medical information seeking is one obvious example. Are there others?
Some possibly useful directions for research in this space:
- Identify, describe, and catalog classes of information seeking that exhibit these recall-oriented behaviors
- Identify web-based resources that are associated with specific areas of recall-oriented search
- Identify evidence of such searches in web logs
- Create heuristics for detecting recall-oriented search behavior in near-real-time
- Devise user interfaces to support recall-oriented search
- Don’t reinvent the wheel: build on existing search tools such as Yahoo BOSS or the Google Search API
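To make the heuristics item a little more concrete, here is a rough sketch of what a session-level detector might look like. Everything in it is an assumption for illustration: the `QueryEvent` schema is hypothetical and not taken from any real log format, and the three signals and their thresholds are guesses that would need to be checked against actual log data.

```python
from dataclasses import dataclass, field


@dataclass
class QueryEvent:
    """One query in a session log (hypothetical schema, for illustration only)."""
    terms: frozenset                 # normalized query terms
    max_page_viewed: int = 1         # deepest results page the user opened
    clicked_docs: set = field(default_factory=set)


def jaccard(a, b):
    """Term overlap between two queries, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def looks_recall_oriented(session, min_signals=2):
    """Return True if at least `min_signals` of three illustrative signals fire.

    All thresholds below are assumed values, not empirically derived.
    """
    if not session:
        return False
    # Signal 1: the user pages well past the first results page.
    deep_paging = any(q.max_page_viewed >= 3 for q in session)
    # Signal 2: many distinct documents are clicked across the session.
    distinct_clicks = set().union(*(q.clicked_docs for q in session))
    many_clicks = len(distinct_clicks) >= 5
    # Signal 3: consecutive queries share terms, suggesting the user is
    # reformulating a single evolving information need.
    pairs = list(zip(session, session[1:]))
    related = sum(1 for a, b in pairs if jaccard(a.terms, b.terms) >= 0.3)
    reformulating = len(pairs) >= 2 and related / len(pairs) >= 0.5
    return sum([deep_paging, many_clicks, reformulating]) >= min_signals
```

Requiring two of three signals rather than any single one is a deliberate choice in this sketch: paging deep into results on its own could just mean the top hits were poor, which is a precision failure rather than a recall-oriented task.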
I am sure that much of this work has already been done (as Daniel Tunkelang points out), but it would be useful, I think, to bring it together in a coherent way to inform system design.