A visual routine is a means of extracting information from a visual scene. In his studies on human visual cognition, Shimon Ullman proposed that the human visual system's task of perceiving shape properties and spatial relations is split into two successive stages: an early "bottom-up" stage during which base representations are generated from the visual input, and a later "top-down" stage during which high-level primitives dubbed "visual routines" extract the desired information from the base representations. In humans, the base representations generated during the bottom-up stage correspond to retinotopic maps (more than 15 of which exist in the cortex) for properties like color, edge orientation, speed of motion, and direction of motion. These base representations rely on fixed operations performed uniformly over the entire field of visual input, and do not make use of object-specific knowledge, task-specific knowledge, or other higher-level information.
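The idea of a fixed operation applied uniformly over the whole input can be illustrated with a toy example. The sketch below is not a model of retinotopic maps; it assumes the input is a small grid of intensities and computes two hypothetical maps, `dx` and `dy`, by taking the same local difference at every position. The function name `base_maps` and the map names are illustrative choices, not terms from Ullman's work.

```python
from typing import Dict, List

Grid = List[List[int]]


def base_maps(image: Grid) -> Dict[str, Grid]:
    """Apply the same fixed local operation at every position of the grid.

    No object-specific or task-specific knowledge is used: each output
    cell depends only on the cell itself and its immediate neighbor.
    """
    height, width = len(image), len(image[0])
    dx = [[0] * width for _ in range(height)]
    dy = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # At the grid border, reuse the cell itself so the difference is 0.
            right = image[r][c + 1] if c + 1 < width else image[r][c]
            below = image[r + 1][c] if r + 1 < height else image[r][c]
            dx[r][c] = right - image[r][c]  # responds to vertical edges
            dy[r][c] = below - image[r][c]  # responds to horizontal edges
    return {"dx": dx, "dy": dy}


# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
maps = base_maps(image)
```

Here the `dx` map is nonzero only along the column where the dark and bright regions meet, while the `dy` map is zero everywhere, since the image has no horizontal edges.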
The visual routines proposed by Ullman are high-level primitives that parse the structure of a scene, extracting spatial information from the base representations. These visual routines are composed of a sequence of elementary visual operators specific to the task at hand. Visual routines differ from the fixed operations of the base representations in that they are not applied uniformly over the entire visual field; rather, they are only applied to objects or areas specified by the routines. Ullman lists the following as examples of visual operators: shifting the processing focus, indexing a salient item for further processing, spreading activation over an area delimited by boundaries, tracing boundaries, and marking a location or object for future reference. When combined into visual routines, these elementary operators can be used to perform relatively sophisticated spatial tasks such as counting the number of objects satisfying a certain property, or recognizing a complex shape.
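As a rough illustration of how elementary operators might be combined into a routine, the sketch below counts objects in a toy scene. It is a minimal sketch under stated assumptions, not Ullman's formulation: the base representation is assumed to be a binary grid in which 1 means "property present", objects are assumed to be 4-connected regions, and the function names `index_salient`, `spread_activation`, and `count_objects` are hypothetical.

```python
from collections import deque
from typing import List, Optional, Set, Tuple

Grid = List[List[int]]
Pos = Tuple[int, int]


def index_salient(base: Grid, marked: Set[Pos]) -> Optional[Pos]:
    """Indexing: pick the first location with the property that is not yet marked."""
    for r, row in enumerate(base):
        for c, value in enumerate(row):
            if value and (r, c) not in marked:
                return (r, c)
    return None


def spread_activation(base: Grid, start: Pos) -> Set[Pos]:
    """Spreading activation: fill outward from start, stopping at boundaries (0 cells)."""
    height, width = len(base), len(base[0])
    region = {start}
    frontier = deque([start])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and base[nr][nc] and (nr, nc) not in region:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region


def count_objects(base: Grid) -> int:
    """A counting routine: index an item, spread over it, mark it, repeat."""
    marked: Set[Pos] = set()
    count = 0
    while True:
        focus = index_salient(base, marked)  # shift the processing focus
        if focus is None:
            return count
        marked |= spread_activation(base, focus)  # mark the object for future reference
        count += 1


scene = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_objects(scene))
```

Unlike a base representation, the routine does not process every location in the same way: the spreading step is applied only to the region selected by the indexing step, and the set of marked locations carries state from one step to the next.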
A number of researchers have implemented visual routines for processing camera images, to perform tasks like determining the object a human in the camera image is pointing at. Researchers have also applied the visual routines approach to artificial map representations, for playing real-time 2D video games. In those cases, however, the map of the video game was provided directly, removing the need to deal with real-world perceptual tasks like object recognition and occlusion compensation.