Integrating bottom-up and top-down information
[Abstract] In this thesis I present a framework for integrating bottom-up and top-down computer vision algorithms. I developed this framework, which I call the Map-Dictionary Pixel framework, because I believe there is a need for tools that make it easier to build computer vision systems that mimic the way the human visual system processes information. In particular, we humans create models of the objects around us, and we use these models, top-down, to interpret, analyze, and discern objects in the information that comes bottom-up from the visual world. After introducing the Map-Dictionary Pixel framework, I demonstrate how it empowers computer vision algorithms. I implement two different systems that extract the pixels of an image that correspond to a human. Although each system uses a different set of algorithms, both use the Map-Dictionary Pixel framework as the connecting pipeline. The two implementations demonstrate the utility of the framework and provide examples of how it can be used.
[Publication date] [Publisher] Massachusetts Institute of Technology