Abstract:
Despite the significant advances in camouflaged object detection achieved by convolutional neural network (CNN) methods and vision transformer (ViT) methods, both have limitations. CNN-based methods fail to capture long-range dependencies due to their limited receptive fields, while ViT-based methods lose detailed information due to large-span aggregation. To address these issues, we introduce a novel model, the double-extraction and triple-fusion network (DTNet), which combines the global context modeling of ViT-based encoders with the detail capture of CNN-based encoders through decision-level feature fusion, compensating for their respective shortcomings to achieve more complete segmentation of camouflaged objects. Specifically, DTNet incorporates a boundary guidance module, which aggregates high-level and low-level boundary information through multi-scale feature decoding and thereby guides the local detail representation of the transformer. It also includes a global context aggregation module, which shrinks the information of adjacent channels from top to bottom and aggregates high-level and low-level scale information from bottom to top for feature decoding. Finally, a multi-feature fusion module fuses global context features and local detail features, employing a channel-wise attention mechanism to assign different weights to long-range and short-range information. Extensive experiments demonstrate that DTNet significantly surpasses 20 recent state-of-the-art methods. The related code and datasets will be posted at https://github.com/KungFuProgrammerle/DTNet.
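The decision-level fusion described above can be illustrated with a minimal sketch: per-channel attention weights decide, channel by channel, how much of the long-range (ViT) branch versus the short-range (CNN) branch survives in the fused map. This is a hypothetical toy version, not the paper's actual module; the function name, shapes, and softmax gating are illustrative assumptions.

```python
import numpy as np

def channel_attention_fusion(global_feat, local_feat):
    """Toy sketch (not DTNet's real module) of decision-level fusion:
    channel-wise attention weights trade off long-range (ViT) against
    short-range (CNN) information. Both inputs have shape (C, H, W)."""
    # Stack the two branches along the channel axis -> (2C, H, W)
    stacked = np.concatenate([global_feat, local_feat], axis=0)
    # Global average pooling gives one descriptor per channel -> (2C,)
    descriptor = stacked.mean(axis=(1, 2))
    # Softmax over the two branches, separately for each original channel
    c = global_feat.shape[0]
    logits = descriptor.reshape(2, c)
    weights = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    # Channel-wise weighted sum of the two branches
    return (weights[0][:, None, None] * global_feat
            + weights[1][:, None, None] * local_feat)
```

In a real network the softmax logits would come from a small learned MLP rather than raw pooled activations; the sketch only shows the weighting structure.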
Abstract:
Camouflaged instance segmentation (CIS) focuses on handling instances that attempt to blend into the background. However, existing CIS methods emphasize global interactions but overlook hidden clues at various scales, resulting in inaccurate recognition of camouflaged instances. To address this, we propose a multi-scale pooling network (MSPNet) to mine the hidden cues that camouflaged instances offer at various scales. The network achieves enhanced fusion of multi-scale information mainly through multilayer pooling. Specifically, the pyramid pooling transformer (P2T) serves as a robust backbone for extracting multi-scale features. We then introduce an end-to-end pooling learning transformer (PLT) to obtain instance-aware parameters and high-quality mask features. To further strengthen the fusion of mask features, we design a novel multi-scale complementary feature pooling (MCFP) module. Additionally, we propose an instance normalization module with fused spatial attention (FSA-IN) to combine instance-aware parameters and mask features, producing the final camouflaged instances. Experimental results demonstrate the effectiveness of MSPNet, which surpasses existing CIS models on the COD10K-Test and NC4K datasets with average precision (AP) scores of 49.6% and 53.4%, respectively. Our code will be published at https://github.com/another-u/MSPNet-main.
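The core idea behind pyramid-style multi-scale pooling can be sketched as follows: pool the same feature map onto several grid sizes, upsample each pooled map back, and concatenate, so each position sees context at several scales. This is a generic illustration of the technique, assuming grid sizes that divide the input evenly; it is not MSPNet's or P2T's actual implementation, and the function names are invented for the sketch.

```python
import numpy as np

def adaptive_avg_pool(x, out):
    """Average-pool a (C, H, W) map down to (C, out, out)."""
    c, h, w = x.shape
    pooled = np.zeros((c, out, out))
    for i in range(out):
        for j in range(out):
            hs, he = i * h // out, (i + 1) * h // out
            ws, we = j * w // out, (j + 1) * w // out
            pooled[:, i, j] = x[:, hs:he, ws:we].mean(axis=(1, 2))
    return pooled

def pyramid_pool_features(x, scales=(1, 2, 4)):
    """Generic multi-scale pooling sketch: pool at several grid sizes,
    upsample (nearest-neighbour) back to the input resolution, and
    concatenate along channels. Assumes H and W are divisible by each scale."""
    c, h, w = x.shape
    outs = [x]
    for s in scales:
        p = adaptive_avg_pool(x, s)
        up = p.repeat(h // s, axis=1).repeat(w // s, axis=2)
        outs.append(up)
    return np.concatenate(outs, axis=0)
```

A real backbone would follow each pooled branch with learned convolutions before fusing; the sketch shows only the multi-scale aggregation pattern.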
Abstract:
Camouflaged instance segmentation (CIS) aims to segment instances that are seamlessly embedded in their surroundings. Existing CIS methods often exploit global information but neglect local information, resulting in incomplete feature representation and reduced accuracy. To address this, we propose a global-to-local network (GLNet) for CIS that leverages both global and local information for enhanced feature representation and segmentation. Specifically, GLNet consists of two main components: global capture and local refinement. For global capture, we introduce a novel dual-branch convolutional feed-forward network (Dual-FFN), which more effectively captures camouflaged instances in complex scenes. For local refinement, we design a U-shape feature fusion module (UFFM) and an edge-guide fusion module (EFM), which fuse multi-scale features in a cascaded manner, giving the network an enhanced ability to discern the intricate details of camouflaged instances. Experimental results demonstrate that GLNet outperforms existing methods, achieving 49.3% average precision (AP) on the COD10K-Test.
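The cascaded U-shaped fusion idea can be sketched in miniature: starting from the deepest (coarsest, most global) map, repeatedly upsample and merge with the next shallower map, so local detail is progressively refined by global context. This is a bare-bones illustration under assumed shapes, not GLNet's UFFM; real modules would use learned convolutions at each merge.

```python
import numpy as np

def u_shape_fuse(features):
    """Toy sketch of cascaded U-shaped fusion. `features` is ordered
    shallow -> deep, each of shape (C, H, W) with spatial size halving
    at every level. Deep maps are upsampled and merged upward."""
    fused = features[-1]
    for feat in reversed(features[:-1]):
        scale = feat.shape[1] // fused.shape[1]
        # nearest-neighbour upsample the deeper map to this level's size
        up = fused.repeat(scale, axis=1).repeat(scale, axis=2)
        fused = feat + up  # element-wise merge; real models apply convs here
    return fused
```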
Abstract:
In high-power laser systems such as petawatt lasers, the laser beam can be intense enough to saturate the nonlinear refractive index of the medium. We present an analytical and numerical investigation of hot-image formation when an intense laser beam passes through a saturable nonlinear medium slab, based on the Fresnel-Kirchhoff diffraction integral and the standard split-step Fourier method. The analytical results agree with the numerical simulations. It is shown that hot images can still form in an intense laser beam passing through a saturable nonlinear medium slab. The saturable nonlinearity does not change the location of the hot images, but it may decrease their intensity: the hot-image intensity decreases as the saturation light intensity is lowered, and it stops increasing as the incident beam intensity rises, owing to the saturation of the nonlinearity. Moreover, variations of the hot-image intensity with the obscuration type and the slab thickness are discussed.
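The split-step Fourier method mentioned above alternates a nonlinear phase step in real space with a linear diffraction step in Fourier space. Below is a minimal 1-D sketch with a saturable Kerr phase of the commonly used form n2*I/(1 + I/I_sat); all parameter values, the function name, and the 1-D reduction are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def split_step_propagate(field, dz, steps, k0, n2, i_sat, dx):
    """1-D split-step Fourier sketch of beam propagation through a
    saturable Kerr medium. Each step applies a saturable nonlinear
    phase in real space, then a paraxial diffraction step in k-space.
    Illustrative only; parameters are not the paper's values."""
    n = field.size
    kx = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    # paraxial (Fresnel) diffraction propagator for one step of length dz
    lin = np.exp(-1j * kx**2 * dz / (2 * k0))
    e = field.astype(complex)
    for _ in range(steps):
        intensity = np.abs(e)**2
        # saturable nonlinear phase: grows with I but clamps near I_sat
        e *= np.exp(1j * k0 * n2 * intensity / (1 + intensity / i_sat) * dz)
        e = np.fft.ifft(np.fft.fft(e) * lin)
    return e
```

Because the nonlinear step is a pure phase factor and the linear propagator has unit modulus, the scheme conserves total power, which is a useful sanity check on any implementation.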
Keywords:
Spin Hall effect of light; vector beam; spin accumulation
Abstract:
We report the demonstration of the intrinsic spin Hall effect (SHE) of a cylindrical vector beam. When a fan-shaped aperture blocks part of the vector beam, the intrinsic vortex phases are no longer continuous in the azimuthal direction, which results in spin accumulation at opposite edges of the light beam. Because the effect arises from the inherent phase structure of the beam and is independent of light-matter interaction, the observed SHE is intrinsic. By modulating the topological charge of the vector beam, the spin-dependent splitting can be enhanced and the direction of spin accumulation switched.