The dynamics of invariant object and action recognition in the human visual system
[摘要] Humans can quickly and effortlessly recognize objects, and people and their actions from complex visual inputs. Despite the ease with which the human brain solves this problem, the underlying computational steps have remained enigmatic. What makes object and action recognition challenging are identity-preserving transformations that alter the visual appearance of objects and actions, such as changes in scale, position, and viewpoint. The majority of visual neuroscience studies examining visual recognition either use physiology recordings, which provide high spatiotemporal resolution data with limited brain coverage, or functional MRI, which provides high spatial resolution data from across the brain with limited temporal resolution. High temporal resolution data from across the brain is needed to break down and understand the computational steps underlying invariant visual recognition. In this thesis I use magenetoencephalography, machine learning, and computational modeling to study invariant visual recognition. I show that a temporal association learning rule for learning invariance in hierarchical visual systems is very robust to manipulations and visual disputations that happen during development (Chapter 2). I next show that object recognition occurs very quickly, with invariance to size and position developing in stages beginning around 100ms after stimulus onset (Chapter 3), and that action recognition occurs on a similarly fast time scale, 200 ms after video onset, with this early representation being invariant to changes in actor and viewpoint (Chapter 4). Finally, I show that the same hierarchical feedforward model can explain both the object and action recognition timing results, putting this timing data in the broader context of computer vision systems and models of the brain. This work sheds light on the computational mechanisms underlying invariant object and action recognition in the brain and demonstrates the importance of using high temporal resolution data to understand neural computations.
[发布日期] [发布机构] Massachusetts Institute of Technology
[效力级别] [学科分类]
[关键词] [时效性]