Why is accuracy not used as an evaluation measure in information retrieval models?
Answers
The standard approach to information retrieval system evaluation revolves around the notion of relevant and nonrelevant documents. With respect to a user information need, a document in the test collection is given a binary classification as either relevant or nonrelevant. This decision is referred to as the gold standard or ground truth judgment of relevance. The test document collection and suite of information needs have to be of a reasonable size: you need to average performance over fairly large test sets, as results are highly variable over different documents and information needs. As a rule of thumb, 50 information needs has usually been found to be a sufficient minimum.
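This binary relevance setup also suggests why accuracy misleads: relevant documents are a tiny fraction of the collection, so the "predict nonrelevant for everything" system scores near-perfect accuracy. A minimal sketch, using hypothetical counts (the 10,000-document collection and the two systems below are invented for illustration):

```python
# Toy illustration of accuracy vs. precision/recall on skewed relevance data.
def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical collection: 10,000 documents, only 20 relevant to the need.
# System A retrieves nothing at all:
a_acc = accuracy(tp=0, fp=0, fn=20, tn=9980)   # 0.998 -- looks excellent
a_rec = recall(tp=0, fn=20)                    # 0.0   -- but finds nothing

# System B retrieves 25 documents, 15 of them relevant:
b_acc = accuracy(tp=15, fp=10, fn=5, tn=9970)  # 0.9985 -- barely different
b_p   = precision(tp=15, fp=10)                # 0.6
b_r   = recall(tp=15, fn=5)                    # 0.75
```

Accuracy barely distinguishes the useless system from the useful one, while precision and recall expose the difference immediately; that is why IR evaluation is built on relevance-based measures instead.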
Relevance is assessed relative to an information need, not a query. For
example, an information need might be:
Information on whether drinking red wine is more effective at reducing
your risk of heart attacks than white wine.
This might be translated into a query such as:
wine AND red AND white AND heart AND attack AND effective
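The translation from need to query can be sketched as a simple boolean AND match over a toy collection (the documents and tokenization below are hypothetical, invented for illustration):

```python
# Minimal sketch: a boolean AND query matches a document only if the
# document contains every query term.
query_terms = {"wine", "red", "white", "heart", "attack", "effective"}

docs = {
    "d1": "red wine and white wine compared for heart attack risk; red is more effective",
    "d2": "white wine pairs well with fish",
}

def matches(text, terms):
    # Crude tokenization for the sketch: lowercase, strip one punctuation mark.
    tokens = set(text.lower().replace(";", " ").split())
    return terms <= tokens  # every query term must appear

hits = [doc_id for doc_id, text in docs.items() if matches(text, query_terms)]
# hits == ["d1"]
```

Note that a document can match the query yet still be judged nonrelevant to the underlying need (e.g. it mentions all six terms without addressing the red-versus-white comparison), which is exactly why relevance is judged against the information need rather than the query.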