TREC NOVELTY TRACK AT IRIT-SIG
[Abstract] In TREC 2004, IRIT modified important features of the strategy that was developed for TREC 2003. Changes include tuning parameter values, topic expansion, and exploitation of sentence context. According to our method, a sentence is considered relevant if it matches the topic with a certain level of coverage. This coverage depends on the category of the terms used in the texts. Four types of terms have been defined: highly relevant, scarcely relevant, non-relevant (like stop words), and highly non-relevant terms (negative terms). Term categorization is based on topic analysis: highly non-relevant terms are extracted from the narrative parts that describe what will be a non-relevant document. The three other types of terms are extracted from the rest of the query. Each term of a topic is weighted according to both its occurrence and the topic part it belongs to (title, descriptive, narrative). Additionally, we increase the score of a sentence when either the previous or the next sentence is relevant. When topic expansion is applied, terms from relevant sentences (task 3) or from the first retrieved sentences (task 1) are added to the initial terms. With regard to the novelty part, a sentence is considered novel if its similarity with each of the previously processed sentences selected as novel does not exceed a certain threshold. In addition, this sentence should not be too similar to a virtual sentence made of the n best
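The first novelty condition in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity over raw term counts and a threshold of 0.8; the abstract names neither the similarity measure nor the threshold value, so both are assumptions, as are the function names `term_counts`, `cosine`, and `select_novel`. The second condition (the virtual sentence) is cut off in the abstract text and is not modeled here.

```python
import math
import re
from collections import Counter


def term_counts(sentence):
    """Lowercase the sentence and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_novel(sentences, threshold=0.8):
    """Keep a sentence only if its similarity to every sentence already
    kept as novel does not exceed the threshold (hypothetical value)."""
    kept, kept_vectors = [], []
    for sentence in sentences:
        vec = term_counts(sentence)
        if all(cosine(vec, prev) <= threshold for prev in kept_vectors):
            kept.append(sentence)
            kept_vectors.append(vec)
    return kept


if __name__ == "__main__":
    relevant = [
        "The cat sat on the mat.",
        "the cat sat on the mat",
        "Stock prices fell sharply today.",
    ]
    # The second sentence repeats the first and is filtered out.
    print(select_novel(relevant))
```

Sentences are processed in order, so an earlier sentence always wins over a later near-duplicate; this follows the abstract's wording about previously processed sentences.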
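[Publication date]  [Publishing institution] 
[Effectiveness level]  [Subject classification] Social sciences, humanities and arts (general)
[Keywords]  [Timeliness] 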