Blog Archive: 2009

Never mind about the Turkers, what do YOU think?

on Comments (4)

Let’s do an experiment. Here’s a TREC topic that specifies an information need:

Food/Drug Laws

Description: What are the laws dealing with the quality and processing of food, beverages, or drugs?

Narrative: A relevant document will contain specific information on the laws dealing with such matters as quality control in processing, the use of additives and preservatives, the avoidance of impurities and poisonous substances, spoilage prevention, nutritional enrichment, and/or the grading of meat and vegetables. Relevant information includes, but is not limited to, federal regulations targeting three major areas of label abuse: deceptive definitions, misleading health claims, and untrue serving sizes and proposed standard definitions for such terms as high fiber and low fat.

Below are links to four documents that have been identified by some systems as being relevant to the above topic. Are they?

(I apologize in advance for the primitive nature of this form and its many usability defects.)

Turk vs. TREC

on Comments (9)

We’ve been dabbling in the world of Mechanical Turk, looking for ways to collect judgments of relevance for TREC documents. TREC raters’ coverage is spotty, since it is based on pooled (and sometimes sampled) documents identified by the small number of systems that participated in a particular workshop. When evaluating our research systems against TREC data from prior years, we found that many of the documents our systems retrieved had not received any judgments (relevant or non-relevant) from TREC assessors. Thus we turned to Mechanical Turk for answers.

Continue Reading

Lack of progress as an opportunity for progress

on Comments (2)

Timothy G. Armstrong, Alistair Moffat, William Webber, and Justin Zobel have written what will undoubtedly be a controversy- and discussion-inspiring paper for the upcoming CIKM 2009 conference. The paper compares over 100 studies of information retrieval systems based on various TREC collections, and concludes that not much progress has been made over the last decade in terms of Mean Average Precision (MAP). They also found that studies that use the TREC data outside the TREC competition tend to pick weak baselines to show short-term improvement (which is publishable) without demonstrating long-term gains in system performance. This interesting analysis is summarized in a blog post by William Webber.
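
For readers who have not worked with the measure, here is a minimal sketch of how MAP is commonly computed. It is not taken from the paper: the function names and toy data are made up for illustration, and it assumes the usual convention that a document with no relevance judgment counts as non-relevant.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision for one topic.

    Sums the precision at each rank where a relevant document appears,
    then divides by the total number of relevant documents for the topic.
    Documents not listed in relevant_ids (including unjudged ones) are
    treated as non-relevant, which is an assumption of this sketch.
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs, qrels):
    """Mean of the per-topic average precision over all judged topics.

    runs:  dict mapping topic id -> ranked list of document ids
    qrels: dict mapping topic id -> set of relevant document ids
    """
    if not qrels:
        return 0.0
    total = sum(average_precision(runs.get(topic, []), relevant)
                for topic, relevant in qrels.items())
    return total / len(qrels)


# Hypothetical toy data: two topics, one ranked list each.
toy_runs = {"t1": ["d1", "d2", "d3", "d4"], "t2": ["a", "b"]}
toy_qrels = {"t1": {"d1", "d3"}, "t2": {"b"}}
toy_map = mean_average_precision(toy_runs, toy_qrels)
```

Because the score depends on which documents have been judged relevant, two systems can only be compared fairly on the same topics and the same judgments.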

Continue Reading

Is TREC good for Information Retrieval research?

on Comments (1)

In his comment on an earlier post, Miles Efron reiterated the usefulness of the various TREC competitions in fostering IR research. I agree with him (and with others) that TREC has certainly been a good incubator, both in its annual competition and in follow-on studies that use its data in other ways. And, as Miles points out, we have seen a proliferation of collections: everything from the original newspaper articles to blogs, video, large corpora, etc.

Continue Reading