Benchmarking models of the ventral stream
[Abstract] This work establishes a benchmark for evaluating models of the ventral stream against crowd-sourced human behavioral measurements. We collected human error patterns on an object recognition task across a variety of images. By comparing a model's error pattern to that of humans, we can measure how similar the model's behavior is to human behavior. Each model we tested was composed of two parts: an encoding phase, which translates images to features, and a decoding phase, which translates features to a classifier decision. We measured the behavioral consistency of three encoder models: a convolutional neural network, and a particular view of neural activity in either area V4 or area IT. We measured three decoder models: logistic regression and two different types of support vector machine (SVM). The most consistent error pattern came from the combination of IT neurons and logistic regression, but this model performed far worse than humans. After accounting for performance, the only model that was not invalidated was the combination of IT neurons and an SVM.
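The comparison of error patterns described in the abstract can be illustrated with a minimal sketch. The abstract does not state which consistency metric was used, so this example assumes one plausible choice: the Pearson correlation between per-image error rates of humans and of a model. The function names (`error_pattern`, `pearson`, `consistency`) and the trial format `(image_id, was_correct)` are hypothetical and are not taken from the original work.

```python
from math import sqrt


def error_pattern(trials):
    """Return the per-image error rate from (image_id, was_correct) trials."""
    totals, errors = {}, {}
    for image_id, was_correct in trials:
        totals[image_id] = totals.get(image_id, 0) + 1
        errors[image_id] = errors.get(image_id, 0) + (0 if was_correct else 1)
    return {img: errors[img] / totals[img] for img in totals}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("an error pattern with no variance cannot be correlated")
    return cov / sqrt(var_x * var_y)


def consistency(human_trials, model_trials):
    """Assumed metric: correlate error rates over the images both sides saw."""
    human = error_pattern(human_trials)
    model = error_pattern(model_trials)
    shared = sorted(set(human) & set(model))
    return pearson([human[i] for i in shared], [model[i] for i in shared])


# Toy data, invented for illustration only.
human_trials = [("a", True), ("a", True), ("b", False), ("b", True),
                ("c", False), ("c", False)]
model_trials = [("a", True), ("b", False), ("c", False)]
score = consistency(human_trials, model_trials)
```

A score near 1 would mean the model tends to fail on the same images that humans find hard. As the abstract notes, a consistency score alone is not sufficient: a model can have a human-like error pattern while still performing far worse than humans overall, so raw performance has to be accounted for separately.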
[Publication date] [Publishing institution] Massachusetts Institute of Technology