A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex

Thomas Serre, Minjoon Kouh, Charles Cadieu, Ulf Knoblich, Gabriel Kreiman and Tomaso Poggio

Center for Biological and Computational Learning, McGovern Institute for Brain Research, Computer Science and Artificial Intelligence Laboratory, Brain Sciences Department, Massachusetts Institute of Technology

AI Memo 2005-036, December 2005
CBCL Memo 259
2005 Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139 USA, www.csail.mit.edu

Abstract

We describe a quantitative theory to account for the computations performed by the feedforward path of the ventral stream of visual cortex and the local circuits implementing them. We show that a model instantiating the theory is capable of performing recognition on datasets of complex images at the level of human observers in rapid categorization tasks. We also show that the theory is consistent with (and in some cases has predicted) several properties of neurons in V1, V4, IT and PFC. The theory seems sufficiently comprehensive, detailed and satisfactory to represent an interesting challenge for physiologists and modelers: either disprove its basic features or propose alternative theories of equivalent scope. The theory suggests a number of open questions for visual physiology and psychophysics.

This version replaces the preliminary “Halloween” CBCL paper from Nov. 2005.

This report describes research done within the Center for Biological [...]

[...] Reynolds et al., 1999.
Very few address a generic, high-level computational function such as object recognition (see Fukushima, 1980; Amit and Mascaro, 2003; Wersing and Koerner, 2003; Perrett and Oram, 1993). We are not aware of any model which does it in a quantitative way while being consistent with psychophysical data on recognition and physiological data throughout the different areas of visual cortex while using plausible neural circuits. In this paper, we propose a quantitative theory of object recognition in primate visual cortex that 1) bridges several levels, from biophysics to physiology to behavior, and 2) achieves human-level performance in rapid recognition of complex natural images. The theory is restricted to the feedforward path of the ventral stream and therefore to the first 150 ms or so of visual recognition; it does not describe top-down influences, though it is in principle capable of incorporating them.

Recognition is computationally difficult. The visual system rapidly and effortlessly recognizes a large number of diverse objects in cluttered, natural scenes. In particular, it can easily categorize images or parts of them, for instance as faces, and identify a specific one. Despite the ease with which we see, visual recognition (one of the key issues addressed in computer vision) is quite difficult for computers and is indeed widely acknowledged as a very difficult computational problem. The problem of object recognition is even more difficult from the point of view of neuroscience, since it involves several levels of understanding, from the information processing or computational level to the level of circuits and of cellular and biophysical mechanisms. After decades of work in striate and extrastriate cortical areas that have produced a significant and rapidly increasing amount of data, the emerging picture of how cortex performs object recognition is in fact becoming too complex for any simple, qualitative “mental” model.
It is our belief that a quantitative, computational theory can provide a much needed framework for summarizing and organizing existing data and for planning, coordinating and interpreting new experiments.

Recognition is a difficult trade-off between selectivity and invariance. The key computational issue in object recognition is the specificity-invariance trade-off: recognition must be able to finely discriminate between different objects or object classes while at the same time be tolerant to object transformations such as scaling, translation, illumination, viewpoint changes, change in context and clutter, non-rigid transformations (such as a change of facial expression) and, for the case of categorization, also to shape variations within a class. Thus the main computational difficulty of object recognition is achieving a very good trade-off between selectivity and invariance.

Architecture and function of the ventral visual stream. Object recognition in cortex is thought to be mediated by the ventral visual pathway (Ungerleider and Haxby, 1994) running from primary visual cortex, V1, over extrastriate visual areas V2 and V4 to inferotemporal cortex, IT. Based on physiological experiments in