Grounding natural language phrases in images and video
[Abstract] Grounding language in images has been shown to improve performance on many image-language tasks. To spur research on this topic, this dissertation introduces a new dataset that provides ground-truth annotations of the locations of noun phrase chunks in image captions. I begin by introducing a constituent task termed phrase localization, where the goal is to localize an entity known to exist in an image when provided with a natural language query. To address this task, I introduce a model that learns a set of models, each of which captures a different concept useful for the task. These concepts can be predefined, such as attributes gleaned from adjectives, or learned automatically in a single end-to-end neural network. I also address the more challenging detection-style task, where the goal is to localize a phrase and determine whether it is associated with an image. Multiple applications of the models presented in this work demonstrate their value beyond the phrase localization task.
[Keywords] Computer Vision, Natural Language Processing, Phrase Grounding