Blog Archive: 2011

Released: Reverted Indexing source code

Comments (1)

I am pleased to announce that we are releasing a version of the reverted indexing framework as open source software! The release includes the framework and an implementation in Lucene.

Reverted indexing is an information retrieval technique for query expansion, relevance feedback, and a variety of other operations. The details are described on our web site, in several posts on this blog, and in our CIKM 2010 paper. The source code and JAR file can be downloaded from the Reverted Indexing page; see the Javadocs for details of the API.

Reverted Indexing

Comments (8)

Traditional interactive information retrieval systems function by creating inverted lists, or term indexes. For every term in the vocabulary, a list is created that contains the documents in which that term occurs and its relative frequency within each document. Retrieval algorithms then use these term frequencies alongside other collection statistics to identify the matching documents for a query.
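As a rough illustration of the inverted lists described above, here is a minimal Python sketch. The function name `build_inverted_index`, the whitespace tokenizer, and the sample documents are all assumptions made for this example; a real engine such as Lucene stores postings far more compactly and tracks additional collection statistics.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a postings list of (doc_id, relative frequency)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization stands in for a real analyzer.
        tokens = text.lower().split()
        counts = Counter(tokens)
        for term, count in counts.items():
            # Relative frequency = occurrences of the term / document length.
            index[term].append((doc_id, count / len(tokens)))
    return dict(index)


# Hypothetical toy collection, used only for illustration.
docs = {
    "d1": "reverted index maps documents to queries",
    "d2": "inverted index maps terms to documents",
}
inverted = build_inverted_index(docs)
```

Looking up `inverted["index"]` returns one posting per document that contains the term, which is the information a retrieval algorithm would combine with other collection statistics to score matches.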

In a paper to be published at CIKM 2010, Jeremy Pickens, Matt Cooper and I describe a way of using the inverted index to associate document ids with the queries that retrieve them. Our approach combines the inverted index with the notion of retrievability to create an efficient query expansion algorithm that is useful for a number of applications, including relevance feedback. We call this kind of index a reverted index because, rather than mapping terms onto documents, it maps document ids onto the queries that retrieved the associated documents.
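The mapping from document ids to queries can be sketched in a few lines of Python. This is a toy under stated assumptions, not the algorithm from the paper or the API of the released framework: the basis queries are single terms, the `retrieve` function scores documents by relative term frequency, and the names `build_reverted_index` and `suggest_queries` are hypothetical.

```python
from collections import defaultdict


def retrieve(query, docs):
    """Toy retrieval: score each document by the query term's relative frequency."""
    results = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        count = tokens.count(query)
        if count:
            results[doc_id] = count / len(tokens)
    return results


def build_reverted_index(basis_queries, docs):
    """Map each document id to the basis queries that retrieved it, with scores."""
    reverted = defaultdict(dict)
    for query in basis_queries:
        for doc_id, score in retrieve(query, docs).items():
            reverted[doc_id][query] = score
    return dict(reverted)


def suggest_queries(feedback_doc_ids, reverted, k=3):
    """Rank basis queries by their summed score over the feedback documents."""
    totals = defaultdict(float)
    for doc_id in feedback_doc_ids:
        for query, score in reverted.get(doc_id, {}).items():
            totals[query] += score
    # Sort by descending total, breaking ties alphabetically for stable output.
    return sorted(totals, key=lambda q: (-totals[q], q))[:k]


# Hypothetical toy collection; every distinct term serves as a basis query.
docs = {
    "d1": "reverted index maps documents to queries",
    "d2": "inverted index maps terms to documents",
    "d3": "query expansion adds terms to queries",
}
basis_queries = sorted({t for text in docs.values() for t in text.lower().split()})
reverted = build_reverted_index(basis_queries, docs)
```

Given a set of documents judged relevant, `suggest_queries(["d1", "d3"], reverted)` looks them up in the reverted index and returns the basis queries that retrieved them most strongly, which is the sense in which the structure supports query expansion and relevance feedback.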

What is this thing called Search?

Comments (3)

In a recent blog post, Vegard Sandvold proposed a taxonomy of search systems based on two dimensions: algorithmic vs. user-powered, and information accessibility. The first dimension represents a tradeoff between systems and people in terms of who does the information seeking, and the second measures the ease of finding information in some search space. His blog post was intended to solicit discussion, and, in that spirit, here is my take on his ideas.

Communicating about Collaboration

Comments (25)

What does it mean to collaborate while searching?

There are many different ways to characterize collaborative information seeking, and many dimensions along which collaborative search systems can be categorized.

For the past few years Jeremy Pickens and I have been thinking that our model of collaborative exploratory search needs some further explication. Or maybe we’re just trying to understand it better ourselves. We have found that to explain what our model is, we have to simultaneously explain what our model is not. This has led to numerous discussions not only about the various dimensions of collaboration, but also about the relative importance of those dimensions for distinguishing between systems.

Models of interaction, part 1

Comments (2)

Recently, I’ve been involved in a lot of discussions about exploratory search on this blog and in comments on The Noisy Channel. One way to look at exploratory search (and there are many others!) is to separate issues of interaction from issues of retrieval. The two are complementary: for example, recently Daniel Tunkelang posted about using sets rather than ranked lists as a way of representing search results. This has implications on one hand for how the retrieval engine identifies promising documents, and on the other for how results are to be communicated to the user, and how the user should interact with them.

CFP: Special Issue of IP&M on Collaborative Information Seeking


Meredith Ringel Morris, Jeremy Pickens and I are editing a Special Issue of Information Processing & Management on Collaborative Information Seeking. Our goal is to bring together papers that describe explicit (intentional) collaboration during various aspects of online information seeking. In contrast to recommendation or collaborative filtering work, we are looking for work that describes small groups of people working toward a common goal.

The deadline for submission is May 8, 2009.

More details on the call are available here.