Browsing by Author "Wang, Jinqiao"
Now showing 1 - 3 of 3
Item: CoupleNet: Coupling Global Structure with Local Parts for Object Detection (IEEE, 2017-10)
Zhu, Yousong; Zhao, Chaoyang; Wang, Jinqiao; Zhao, Xu; Wu, Yi; Lu, Hanqing; Medicine, School of Medicine

Region-based convolutional neural network (CNN) detectors such as Faster R-CNN and R-FCN have shown promising results for object detection by combining a region proposal subnetwork with a classification subnetwork. Although R-FCN achieves higher detection speed while maintaining detection performance, its position-sensitive score maps ignore global structure information. To fully exploit both local and global properties, in this paper we propose a novel fully convolutional network, named CoupleNet, that couples the global structure with local parts for object detection. Specifically, the object proposals obtained by the Region Proposal Network (RPN) are fed into a coupling module consisting of two branches. One branch adopts position-sensitive RoI (PSRoI) pooling to capture local part information of the object, while the other employs RoI pooling to encode global and contextual information. We then design different coupling strategies and normalization schemes to make full use of the complementary advantages of the global and local branches. Extensive experiments demonstrate the effectiveness of our approach: we achieve state-of-the-art results on all three challenging datasets, i.e., a mAP of 82.7% on VOC07, 80.4% on VOC12, and 34.4% on COCO. Code will be made publicly available.

Item: Domain Adaptation Tracker With Global and Local Searching (IEEE, 2018)
Zhao, Fei; Zhang, Ting; Wu, Yi; Wang, Jinqiao; Tang, Ming; Medicine, School of Medicine

Most convolutional neural network (CNN)-based trackers locate the target only within a local area, which makes it hard for them to recapture the target after it drifts into the background.
Moreover, most state-of-the-art trackers spend a large amount of time training CNN-based classification networks online to adapt to the current domain. In this paper, to address these two problems, we propose a robust domain adaptation tracker based on CNNs. The proposed tracker contains three CNNs: a local location network (LL-Net), a global location network (GL-Net), and a domain adaptation classification network (DA-Net). For the first problem, if the output of the LL-Net indicates that the tracker has drifted into the background, we search for the target over a global area of the current frame using the GL-Net. For the second problem, we propose a CNN-based DA-Net with a domain adaptation (DA) layer. By pre-training the DA-Net offline, it can adapt to the current domain by updating only the parameters of the DA layer in a single training iteration whenever online training is triggered, which makes the tracker run five times faster than MDNet with comparable tracking performance. Experimental results show that our tracker performs favorably against state-of-the-art trackers on three popular benchmarks.

Item: Feature Distilled Tracking (IEEE, 2017-12)
Zhu, Guibo; Wang, Jinqiao; Wang, Peisong; Wu, Yi; Lu, Hanqing; Medicine, School of Medicine

Feature extraction and representation is one of the most important components of fast, accurate, and robust visual tracking. Very deep convolutional neural networks (CNNs) provide effective tools for feature extraction with good generalization ability. However, extracting features with very deep CNN models requires high-performance hardware due to their large computational complexity, which prohibits their use in real-time applications. To alleviate this problem, we aim to obtain small, fast-to-execute shallow models via model compression for visual tracking.
Specifically, we propose a small feature distilled network (FDN) for tracking that imitates the intermediate representations of a much deeper network. The FDN extracts rich visual features at higher speed than the original deeper network. To further speed up the tracker, we introduce a shift-and-stitch method that reduces arithmetic operations while keeping the spatial resolution of the distilled feature maps unchanged. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle scale variation of the target. Comprehensive experiments on object tracking benchmark datasets show that the proposed approach achieves a 5x speed-up with performance competitive with state-of-the-art deep trackers.
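The core idea behind the FDN above is feature-level distillation: training a small student network to reproduce the intermediate feature maps of a frozen deeper teacher. As a minimal NumPy sketch of that idea only (not the authors' architecture), the toy below fits a single linear "student" layer to a fixed linear "teacher" by gradient descent on an L2 imitation loss; all dimensions, names, and the linear stand-ins are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; illustrative assumptions, not the paper's actual setup.
n, d_in, d_feat = 64, 32, 16
x = rng.normal(size=(n, d_in))

# Frozen "teacher": a fixed linear map standing in for a deep network's
# intermediate feature extractor.
W_teacher = rng.normal(size=(d_in, d_feat))
teacher_feat = x @ W_teacher

# Small "student" layer, trained from scratch to imitate the teacher.
W_student = rng.normal(size=(d_in, d_feat)) * 0.01

def distill_loss(W):
    """Mean-squared imitation loss between student and teacher feature maps."""
    diff = x @ W - teacher_feat
    return 0.5 * np.mean(diff ** 2)

loss_before = distill_loss(W_student)

lr = 3.0
for _ in range(1000):
    # Gradient of the imitation loss with respect to the student weights.
    grad = x.T @ (x @ W_student - teacher_feat) / (n * d_feat)
    W_student -= lr * grad

loss_after = distill_loss(W_student)
print(f"imitation loss: {loss_before:.4f} -> {loss_after:.6f}")
```

In this toy the student can match the teacher exactly because both are linear; in the paper's setting the student is a much shallower CNN, so it recovers the teacher's features only approximately, trading a small loss in fidelity for a large gain in speed.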