IU Indianapolis ScholarWorks :: Browsing by Subject "Visual perception"

A spatial map: a propitious choice for constraining the binding problem
(Frontiers Media, 2024-07-02) Han, Zhixian; Sereno, Anne B.; Medicine, School of Medicine
Many studies have shown that the human visual system has two major functionally distinct cortical visual pathways: a ventral pathway, thought to be important for object recognition, and a dorsal pathway, thought to be important for spatial cognition. According to our own and others' previous studies, artificial neural networks with two segregated pathways can determine objects' identities and locations more accurately and efficiently than one-pathway artificial neural networks. In addition, we showed that these two segregated artificial cortical visual pathways can each process identity and spatial information of visual objects independently and differently. However, when such networks are used to process multiple objects' identities and locations, a binding problem arises because the networks may not associate each object's identity with its location correctly. In a previous study, we constrained the binding problem by training the artificial identity pathway to retain relative location information of objects. This design uses a location map to constrain the binding problem. One limitation of that study was that we considered only two attributes of our objects (identity and location) and only one possible map (location) for binding. However, the brain typically needs to process and bind many attributes of an object, and any of these attributes could be used to constrain the binding problem. In our current study, using visual objects with multiple attributes (identity, luminance, orientation, and location) that need to be recognized, we tried to find the best map (among an identity map, a luminance map, an orientation map, and a location map) to constrain the binding problem. We found that in our experimental simulations, when visual attributes are independent of each other, a location map is always a better choice than the other kinds of maps examined for constraining the binding problem. Our findings agree with previous neurophysiological findings showing that the organization or map in many visual cortical areas is primarily retinotopic or spatial.
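The binding problem described in the abstract above can be illustrated with a toy example. This is a minimal sketch in plain Python, not the authors' neural network: the scene format and the function names `unbound_readout` and `location_map_readout` are hypothetical, chosen only to show why two separate readouts are ambiguous and why indexing identities by location removes the ambiguity.

```python
# Toy illustration only: plain Python data structures, not a neural network.
# The scene format and both function names are hypothetical.

def unbound_readout(scene):
    """Report identities and locations as two separate, unlinked lists."""
    identities = sorted(obj["identity"] for obj in scene)
    locations = sorted(obj["location"] for obj in scene)
    return identities, locations


def location_map_readout(scene, n_locations):
    """Report identities in a location-indexed map: slot i holds what is at location i."""
    slots = [None] * n_locations
    for obj in scene:
        slots[obj["location"]] = obj["identity"]
    return slots


scene_a = [{"identity": "circle", "location": 0}, {"identity": "square", "location": 1}]
scene_b = [{"identity": "square", "location": 0}, {"identity": "circle", "location": 1}]

# Separate readouts cannot tell the two scenes apart (the binding problem).
ambiguous = unbound_readout(scene_a) == unbound_readout(scene_b)

# A location-indexed map keeps each identity attached to its position.
resolved = location_map_readout(scene_a, 2) != location_map_readout(scene_b, 2)
```

Here both scenes contain the same identities and the same occupied locations, so only the location-indexed readout distinguishes them.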
Exploring neural architectures for simultaneously recognizing multiple visual attributes
(Springer Nature, 2024-12-03) Han, Zhixian; Sereno, Anne B.; Psychology, School of Science
Much experimental evidence in neuroscience has suggested a division of higher visual processing into a ventral pathway specialized for object recognition and a dorsal pathway specialized for spatial recognition. Previous computational studies have suggested that neural networks with two segregated pathways (branches) perform better in visual recognition tasks than neural networks with a single pathway (branch). One previously proposed possibility is that two pathways increase the learning efficiency of a network by allowing separate networks to process information about different visual attributes separately. However, most of these previous studies were limited, considering the simultaneous recognition of only two visual attributes, identity and location, with a restricted number of classes in each attribute. We investigate whether it is always advantageous to use two-pathway networks when recognizing other visual attributes, and examine whether the advantage of two-pathway networks changes with the number of classes in each attribute. We find that it is always advantageous to use segregated pathways to process different visual attributes separately, with this advantage increasing with a greater number of classes. Thus, using a computational approach, we demonstrate that it is computationally advantageous to have separate pathways if the amount of variation of a given visual attribute is high or that attribute needs to be finely discriminated. Hence, when the size of the computer vision model is limited, a segregated pathway (branch) for a given visual attribute should be designed only when it is computationally advantageous to do so.
Multiresolution variance-based image fusion
(2013-05) Ragozzino, Matthew; Salama, Paul; Christopher, Lauren; Rizkalla, Maher E.; King, Brian
Multiresolution image fusion is an emerging area of research for use in military and commercial applications. While many methods for image fusion have been developed, improvements can still be made. In many cases, image fusion methods are tailored to specific applications and are limited as a result. In order to make improvements to general image fusion, novel methods have been developed based on the wavelet transform and empirical variance. One particular novelty is the use of directional filtering in conjunction with wavelet transforms. Instead of treating the vertical, horizontal, and diagonal sub-bands of a wavelet transform the same, each sub-band is handled independently by applying custom filter windows. Results of the new methods exhibit better performance across a wide range of images highlighting different situations.
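The idea of handling each wavelet sub-band with its own filter window can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: it uses a one-level Haar transform, a "keep the coefficient with the larger local variance" rule, and window shapes (1x3, 3x1, 3x3) picked purely for the example. All function names are hypothetical, and inputs are assumed to be equally sized 2D lists with even dimensions.

```python
# Sketch only: one-level Haar transform, variance-based selection, and
# per-sub-band window shapes are all assumptions made for illustration.

def haar2d(img):
    """One-level 2D Haar transform; returns (ll, lh, hl, hh) sub-bands."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 4.0)  # local average
            row_lh.append((a + b - d - e) / 4.0)  # top minus bottom
            row_hl.append((a - b + d - e) / 4.0)  # left minus right
            row_hh.append((a - b - d + e) / 4.0)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d; rebuilds the full-resolution image."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, h, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [s + v + h + d, s + v - h - d]
            bottom += [s - v + h - d, s - v - h + d]
        out += [top, bottom]
    return out


def local_variance(band, r, c, half_h, half_w):
    """Variance of the coefficients in a window centered at (r, c), clipped at borders."""
    vals = [band[i][j]
            for i in range(max(0, r - half_h), min(len(band), r + half_h + 1))
            for j in range(max(0, c - half_w), min(len(band[0]), c + half_w + 1))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def fuse_band(band_a, band_b, half_h, half_w):
    """Per coefficient, keep the source whose local variance is larger (ties keep A)."""
    fused = []
    for r in range(len(band_a)):
        row = []
        for c in range(len(band_a[0])):
            var_a = local_variance(band_a, r, c, half_h, half_w)
            var_b = local_variance(band_b, r, c, half_h, half_w)
            row.append(band_a[r][c] if var_a >= var_b else band_b[r][c])
        fused.append(row)
    return fused


def fuse(img_a, img_b):
    """Fuse two images: average the approximation, select details per sub-band."""
    ll_a, lh_a, hl_a, hh_a = haar2d(img_a)
    ll_b, lh_b, hl_b, hh_b = haar2d(img_b)
    ll = [[(x + y) / 2.0 for x, y in zip(ra, rb)] for ra, rb in zip(ll_a, ll_b)]
    lh = fuse_band(lh_a, lh_b, 0, 1)  # 1x3 window (assumed shape)
    hl = fuse_band(hl_a, hl_b, 1, 0)  # 3x1 window (assumed shape)
    hh = fuse_band(hh_a, hh_b, 1, 1)  # 3x3 window (assumed shape)
    return ihaar2d(ll, lh, hl, hh)
```

Fusing an image with itself returns the original image, which is a quick sanity check that the transform and its inverse agree.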
Video anatomy : spatial-temporal video profile
(2014-07-31) Cai, Hongyuan; Zheng, Jiang Yu; Tuceryan, Mihran; Popescu, Voicu Sebastian; Tricoche, Xavier; Prabhakar, Sunil; Gorman, William J.
A massive number of videos are uploaded to video websites, so smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and express cinematic effects. To digest heterogeneous videos in video websites and databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexed to each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as their combinations. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and video segmentation for smooth camera operations, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm has been designed to extract the major flow direction and convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by the camera ego-motion. A motion blur technique is also used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
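The notion of a 2D profile that combines spatial and temporal information and indexes back to each frame can be sketched with a simple slit sampling. This is a simplified illustration, not the framework described above: it assumes a fixed vertical sampling column rather than one derived from the major flow direction, and it represents a video as a list of frames, each a 2D list of pixel values. The function name `slit_profile` is hypothetical.

```python
# Sketch only: a fixed vertical slit is an assumption; the abstract describes
# sampling across the major flow, which this example does not estimate.

def slit_profile(frames, slit_col=None):
    """Build a 2D profile whose column t is one pixel column taken from frame t."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    col = width // 2 if slit_col is None else slit_col
    # Row r of the profile follows image row r through time, so the vertical
    # axis is spatial and the horizontal axis is temporal.
    return [[frame[r][col] for frame in frames] for r in range(height)]


# Three synthetic 2x3 frames; pixel value encodes (frame, row, column).
frames = [[[t * 10 + r * 3 + c for c in range(3)] for r in range(2)] for t in range(3)]
profile = slit_profile(frames)
```

Because column `t` of the profile comes from frame `t`, a position in the profile maps directly back to a frame index.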