Unified Models for Recovering Semantics and Geometry from Scenes.
[Abstract] Understanding the contents of an image, or scene understanding, is an important yet very challenging problem in computer vision. In the last few years, substantially different approaches have been adopted for understanding "things" (object categories that have a well-defined shape, such as people and cars), "stuff" (object categories that have an amorphous spatial extent, such as grass and sky), and the "geometry" of scenes. In this thesis, we propose coherent models for the simultaneous recognition of "things", "stuff", and "geometry". The key contributions are i) to model their individual properties as well as their relative properties, and ii) to propose a coherent framework that efficiently solves complicated scene understanding tasks. We demonstrate that each task can be improved by also solving the other tasks jointly. The proposed models can handle different types of input, such as RGB, RGB-D, or hierarchically organized images. We have carried out extensive quantitative and qualitative experimental analysis to demonstrate the effectiveness of our theoretical findings, and we show that our approaches achieve performance competitive with state-of-the-art methods.
[Publication date] [Publisher] University of Michigan
[Validity level] Scene Understanding [Subject classification]
[Keywords] Computer Vision; Scene Understanding; Conditional Random Field; Computer Science; Electrical Engineering; Engineering; Electrical Engineering: Systems [Timeliness]