Author affiliations:
[Fangyan Wang; Guowen Yue] College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Engineering Research Center of Cyberspace Security Technology and Applications, Hengyang Normal University, Hengyang, 421002, Hunan, China; [Ge Jiao] College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Engineering Research Center of Cyberspace Security Technology and Applications, Hengyang Normal University, Hengyang, 421002, Hunan, China
Corresponding institution:
[Ge Jiao] College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, Hunan, China; Hunan Engineering Research Center of Cyberspace Security Technology and Applications, Hengyang Normal University, Hengyang, 421002, Hunan, China
Abstract:
Currently, Camouflaged Object Detection (COD) methods often rely on single-view feature perception, which struggles to fully capture camouflaged objects due to environmental interference such as background clutter, lighting variations, and viewpoint changes. To address this, we propose the Multi-view Collaboration Network (MCNet), inspired by human visual strategies for complex scene analysis. MCNet incorporates multiple perspectives for enhanced feature extraction. The global perception module takes the original, far, and near views, using different large-kernel convolutions and multi-head attention mechanisms for global feature embedding. In parallel, the local perception module processes the tilted, projected, and color-jittered views, extracting fine-grained local features through multi-branch deep convolutions and dilated convolutions. To facilitate deep interaction between global and local features, we introduce the hybrid interactive module, which explores the correlation of multi-view feature information and adaptively fuses features. For feature decoding, the dynamic pyramid shrinkage module integrates dynamic gated convolutions with a pyramid shrinkage mechanism, progressively aggregating semantic features through a hierarchical shrinking strategy and group fusion strategy. Experimental results on popular COD benchmark datasets show that MCNet outperforms 18 state-of-the-art methods.
Author affiliations:
College of Computer Science and Technology, Hengyang Normal University, Hengyang, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, China; [Yezhou Zhang; Lang Li; Yu Ou] College of Computer Science and Technology, Hengyang Normal University, Hengyang, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, China
Corresponding institution:
[Lang Li] College of Computer Science and Technology, Hengyang Normal University, Hengyang, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, China
Abstract:
Deep learning algorithms are increasingly employed to exploit side-channel information, such as power consumption and electromagnetic leakage from hardware devices, significantly enhancing attack capabilities. However, relying solely on power traces for side-channel information often requires considerable domain knowledge. To address this limitation, this work proposes a new attack scheme. Firstly, a Convolutional Neural Network (CNN)-based plaintext-extended bilinear feature fusion model is designed. Secondly, multi-model intermediate layers are fused and trained, increasing the amount of effective information and improving generalization ability. Finally, the model is employed to predict output probabilities on three public side-channel datasets (ASCAD, AES_HD, and AES_RD), and the guessing entropy of the recovered key is analyzed to efficiently assess attack efficiency. Experimental results showcase that the plaintext-extended bilinear feature fusion model can effectively enhance Side-Channel Attack (SCA) capabilities and prediction performance. Deploying the proposed method, the number of traces required for a successful attack on the ASCAD_R dataset is significantly reduced to fewer than 914, a 70.5% reduction in traces compared to the CNN-VGG16 (Convolutional Neural Network-Visual Geometry Group) network with plaintext, which incorporates plaintext features before the fully connected layer. Compared to existing solutions, the proposed scheme requires only 80% of the power traces to attack the masked design, using only 75 epochs. As a result, the power of the proposed method is well demonstrated through the various experiments and comparisons.
Abstract:
Epicanthus refers to the longitudinal curved skin folds that cover the medial canthus, which affect aesthetics by covering the medial canthal angle and lacrimal caruncle. Various surgical methods exist for correcting epicanthus, each with its own advantages and disadvantages, and the lack of a standardized operational protocol makes the procedure difficult for beginners to master and for clinical promotion. This article aims to explore a standardized and simplified five-step procedure for treating epicanthus and to report our clinical experience and effectiveness. A retrospective analysis was conducted from October 2019 to September 2022 at the Burn and Plastic Surgery Department of the Second Affiliated Hospital of South China University. A consistent team of doctors utilized a five-step method to correct the medial canthus in 306 patients with epicanthus. All patients were followed up for more than 6 months. We observed 306 patients and used iris diameter as a reference value to subjectively evaluate the clinical effect through photo evaluation and scar scoring. Objective evaluation of clinical efficacy was achieved through the intercanthal distance (ICD) and palpebral fissure length (PFL). The study included 295 females and 11 males, with an average follow-up time of 14.2 months. The average increase rate of PFL was 14.9%, and the average reduction rate of ICD was 8.6%. Two cases of bleeding and swelling were promptly treated, with no long-term complications. Eighty-five cases of scar hyperplasia were treated with a combination of KELO-COTE® silicone gel, triamcinolone injection, and appropriate laser therapy, and the scars gradually resolved after 12 months. Four cases of recurrence and two cases of asymmetry underwent reoperation. Among the 306 patients, the overall satisfaction and effectiveness rate exceeded 95%, and about 96.40% of patients were satisfied with the surgery and would recommend it to family and friends. The paired t-test was used for statistical analysis, and the results were statistically significant. The five-step method for correcting epicanthus proves to be a simple, efficient, and reliable technique that is easily mastered by beginners. It boasts high patient satisfaction and carries a low risk of scar formation.
Corresponding institution:
[Zhao, HH] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421008, Peoples R China; Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421008, Peoples R China.
Keywords:
Three-dimensional human pose estimation;Transformer;GCN;Prior knowledge
Abstract:
Transformer-based approaches have significantly driven recent progress in three-dimensional human pose estimation. However, existing transformer-based approaches are still deficient in capturing localized features, and because they obtain queries, keys, and values through simple linear mappings, they lack task-specific prior information. Existing methods also lack effective human-body constraints for model training. We introduce the Spatial Encoding Graph Convolutional Network Transformer (SEGCNFormer), designed to enhance the model's capacity to capture local features. In addition, we propose a Temporal-Aware Network, which generates queries, keys, and values endowed with prior knowledge of human motion, enabling the model to better understand the structural information of human poses. Finally, we leverage knowledge of human anatomy and motion to design the Human Structural Science Loss, which assesses the plausibility of human actions and imposes physical constraints on the generated poses. Our method outperforms existing methods on the Human3.6M dataset with both 27 and 81 sampling frames, and our predicted poses are closer to the ground-truth poses with lower error. For the three issues above, we proposed effective methods and conducted targeted experiments, which confirmed the effectiveness of our strategies.
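As a minimal sketch of how a physical constraint on generated poses might look (the joint indices, bone list, and loss form here are illustrative assumptions, not the paper's actual Human Structural Science Loss), one can penalize predicted bone lengths that deviate from reference skeleton lengths:

```python
import numpy as np

# Hypothetical toy skeleton: parent-child joint index pairs
BONES = [(0, 1), (1, 2), (2, 3)]

def bone_length_loss(pred: np.ndarray, ref_lengths: np.ndarray) -> float:
    """Mean squared deviation between predicted and reference bone lengths.
    pred: (J, 3) predicted 3D joint coordinates;
    ref_lengths: (len(BONES),) target bone lengths."""
    lengths = np.array([np.linalg.norm(pred[j] - pred[i]) for i, j in BONES])
    return float(np.mean((lengths - ref_lengths) ** 2))

# a pose whose three bones each have unit length incurs zero loss
pose = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
loss = bone_length_loss(pose, np.ones(3))
```

Such a term would be added to the standard joint-position loss during training to discourage anatomically implausible outputs.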
Corresponding institution:
[Zhao, HH] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Hunan, Peoples R China.
Keywords:
Text to image;Image generation;Generative Adversarial Network;Attention
Abstract:
Text-to-image generation is a challenging and significant research task. It aims to synthesize high-quality images that match given descriptive statements. Existing methods still fuse semantic information insufficiently during generation, so the generated images cannot properly represent the descriptive statements. Therefore, a novel method named EMF-GAN (Efficient Multilayer Fusion Generative Adversarial Network) is proposed. It uses a Multilayer Fusion Module (MF Module) and an Efficient Multi-Scale Attention Module (EMA Module) to gradually fuse semantic information into the feature maps, making full use of the semantic information and producing high-quality, realistic images. Extensive experimental results show that EMF-GAN is highly competitive in image generation quality and semantic consistency. Compared with state-of-the-art methods, EMF-GAN shows significant performance improvement on both the CUB (FID from 14.81 to 10.74) and COCO (FID from 19.32 to 16.86) datasets. It can generate photorealistic images with richer details and better text-image consistency. Code can be found at https://github.com/zxcnmmmmm/EMF-GAN-master .
Abstract:
Rice is a staple food for nearly half the global population and, with rising living standards, the demand for high-quality grain is increasing. Chalkiness, a key determinant of appearance quality, requires accurate detection for effective quality evaluation. While traditional 2D imaging has been used for chalkiness detection, its inherent inability to capture complete 3D morphology limits its suitability for precision agriculture and breeding. Although micro-CT has shown promise in 3D chalk phenotype analysis, high-throughput automated 3D detection for multiple grains remains a challenge, hindering practical applications. To address this, we propose a high-throughput 3D chalkiness detection method using micro-CT and VSE-UNet. Our method begins with non-destructive 3D imaging of grains using micro-CT. For the accurate segmentation of kernels and chalky regions, we propose VSE-UNet, an improved VGG-UNet with an SE attention mechanism for enhanced feature learning. Through comprehensive training optimization strategies, including the Dice focal loss function and dropout technique, the model achieves robust and accurate segmentation of both kernels and chalky regions in continuous CT slices. To enable high-throughput 3D analysis, we developed a unified 3D detection framework integrating isosurface extraction, point cloud conversion, DBSCAN clustering, and Poisson reconstruction. This framework overcomes the limitations of single-grain analysis, enabling simultaneous multi-grain detection. Finally, 3D morphological indicators of chalkiness are calculated using triangular mesh techniques. Experimental results demonstrate significant improvements in both 2D segmentation (7.31% improvement in chalkiness IoU, 2.54% in mIoU, 2.80% in mPA) and 3D phenotypic measurements, with VSE-UNet achieving more accurate volume and dimensional measurements compared with the baseline. These improvements provide a reliable foundation for studying chalkiness formation and enable high-throughput phenotyping.
Author affiliations:
College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, China; [Shengcheng Xia; Yu Ou; Jiahao Xiang] College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, China
Abstract:
Label distribution learning techniques can significantly enhance the effectiveness of side-channel analysis. However, this method relies on probability density functions to estimate the relationships between labels, and the parameter settings strongly affect attack performance. This study introduces a non-parametric statistical method to calculate the distribution between labels, specifically smoothing with a Gaussian kernel function and adjusting the bandwidth. The aggregation of the results for each label processed by the Gaussian kernel then yields an assumption-free estimate of the label distribution. This method accurately represents the actual leakage distribution, speeding up guessing entropy convergence. Secondly, we exploit similarities between profiling traces, proposing an analysis scheme based on local sample correlation for label distribution learning. Furthermore, the Signal-to-Noise Ratio (SNR) is employed to re-extract features and reduce the dataset dimension to 500 power consumption points, resulting in noise reduction. Our results showcase successful training with only 800 profiling traces using our local sample correlation scheme for label distribution learning, with the findings indicating its exceptional performance.
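The kernel-smoothing step this abstract describes can be sketched as follows; the function name, class count, and bandwidth value are illustrative assumptions, not the authors' code. Each observed label contributes a Gaussian kernel, and the aggregated, normalized sum serves as a non-parametric label distribution:

```python
import numpy as np

def kde_label_distribution(labels, n_classes: int = 256, bandwidth: float = 4.0):
    """Non-parametric label distribution estimate: place a Gaussian kernel on
    each observed label value, aggregate the kernels, and normalize to a
    probability distribution over all classes. The bandwidth is the tunable
    smoothing parameter the abstract refers to."""
    grid = np.arange(n_classes)[None, :]                 # (1, C) class positions
    centers = np.asarray(labels, dtype=float)[:, None]   # (N, 1) observed labels
    kernels = np.exp(-0.5 * ((grid - centers) / bandwidth) ** 2)
    dist = kernels.sum(axis=0)                           # aggregate over labels
    return dist / dist.sum()                             # normalize to sum to 1

# toy example over 256 classes (e.g. byte-valued intermediate states)
d = kde_label_distribution([100, 102, 200], n_classes=256, bandwidth=2.0)
```

Unlike a parametric fit, no distributional hypothesis is imposed; the shape follows the observed labels directly.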
Journal:
Expert Systems with Applications, 2025, 272: 126693. ISSN: 0957-4174
Corresponding author:
Chen, WH
Author affiliations:
[Yan, Li; Chen, Wenhui; Zhao, Huihuang; Yang, Yanqing; Wang, Weijie] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China; [Chen, Wenhui; Yang, Yanqing] Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China; [Zhao, Huihuang] Hunan Univ, Natl Engn Lab Robot Visual Percept & Control Techn, Hengyang, Peoples R China.
Corresponding institution:
[Chen, WH] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China; Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China.
Keywords:
Time series floating point data;Lossless compression;Internet of things;Compression algorithm;Heuristic genetic algorithm
Abstract:
The processing of large volumes of time series data across various fields presents significant challenges, particularly when it comes to effectively managing floating-point numbers. Current double-precision floating-point lossless compression algorithms often struggle to deliver strong performance on diverse datasets, highlighting their inherent limitations. To address this issue, we propose a novel method called the Heuristic Genetic Algorithm Parameter Optimizer for Lossless Compression of Time Series Floating Point Data (HGA-ACTF). This method features a highly effective parameter optimizer designed specifically for compression algorithms that exploit leading zeros. The combination of our parameter optimizer and the HGA-ACTF algorithm strategy has been shown to outperform leading existing compression algorithms across multiple fields. This approach not only enhances the compression ratio but also significantly reduces both compression and decompression times. In our comparative study, we evaluated the HGA-ACTF algorithm against eleven well-performing algorithms and a variant of the algorithm, integrating our parameter optimizer and algorithmic strategy into other adaptable algorithms and demonstrating notable improvements. Experimental results indicate that the HGA-ACTF algorithm achieves an average compression ratio improvement of 38.87%, with some datasets showing improvements of up to 54.36%. Our approach effectively addresses the transmission and storage of time series data, significantly reducing the overhead associated with data processing. The code can be found at https://github.com/wwj10/HGA-ACTF .
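For context on the "leading zeros" that such compressors exploit (a generic Gorilla-style sketch, not the HGA-ACTF algorithm itself), consecutive doubles are XORed bit-wise; when values change little, the XOR word has many leading zero bits, which the encoder can elide:

```python
import struct

def xor_deltas(values):
    """Gorilla-style first step of float compression: XOR each double with
    its predecessor. The leading-zero count of the 64-bit XOR word indicates
    how compressible the delta is (identical values give a zero word)."""
    bits = [struct.unpack('>Q', struct.pack('>d', v))[0] for v in values]
    out = []
    prev = bits[0]
    for b in bits[1:]:
        x = prev ^ b
        lz = 64 - x.bit_length() if x else 64   # leading zeros of the XOR word
        out.append((x, lz))
        prev = b
    return out

# repeated value -> all-zero XOR; a small change -> long run of leading zeros
deltas = xor_deltas([15.5, 15.5, 15.625])
```

A parameter optimizer in this setting would tune how many leading-zero/meaningful-bit cases the encoder distinguishes and how they are coded.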
Abstract:
Hyperspectral images (HSIs) contain rich spectral and spatial information, motivating the development of a novel circulant singular spectrum analysis (CiSSA) and multiscale local ternary pattern fusion method for joint spectral-spatial feature extraction and classification. Due to the high dimensionality and redundancy in HSIs, principal component analysis (PCA) is used during preprocessing to reduce dimensionality and enhance computational efficiency. CiSSA is then applied to the PCA-reduced images for robust spatial pattern extraction via circulant matrix decomposition. The spatial features are combined with the global spectral features from PCA to form a unified spectral-spatial feature set (SSFS). Local ternary pattern (LTP) is further applied to the principal components (PCs) to capture local grayscale and rotation-invariant texture features at multiple scales. Finally, the performance of the SSFS and multiscale LTP features is evaluated separately using a support vector machine (SVM), followed by decision-level fusion to combine results from each pipeline based on probability outputs. Experimental results on three popular HSIs show that, under 1% training samples, the proposed method achieves 95.98% accuracy on the Indian Pines dataset, 98.49% on the Pavia University dataset, and 92.28% on the Houston2013 dataset, outperforming several traditional classification methods and state-of-the-art deep learning approaches.
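The decision-level fusion step described at the end of this abstract can be sketched generically (weights and toy outputs are illustrative, not the paper's configuration): the per-class probability outputs of the two pipelines are combined before taking the final class decision:

```python
import numpy as np

def fuse_decisions(prob_a: np.ndarray, prob_b: np.ndarray, w: float = 0.5):
    """Decision-level fusion: weighted average of the per-class probability
    outputs of two classifiers (e.g. an SVM on spectral-spatial features and
    an SVM on multiscale texture features), then argmax per sample."""
    fused = w * np.asarray(prob_a) + (1 - w) * np.asarray(prob_b)
    return fused.argmax(axis=1)

# toy probability outputs of two 3-class classifiers on two samples
pa = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])
pb = np.array([[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]])
labels = fuse_decisions(pa, pb)
```

Fusing at the decision level lets each pipeline keep its own feature space while their probability outputs are reconciled at the end.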
Abstract:
The objective of image-based virtual try-on is to seamlessly integrate clothing onto a target image, generating a realistic representation of the character in the specified attire. However, existing virtual try-on methods frequently encounter challenges, including misalignment between the body and clothing, noticeable artifacts, and the loss of intricate garment details. To overcome these challenges, we introduce a two-stage high-resolution virtual try-on framework that integrates an attention mechanism, comprising a garment warping stage and an image generation stage. During the garment warping stage, we incorporate a channel attention mechanism to effectively retain the critical features of the garment, addressing challenges such as the loss of patterns, colors, and other essential details commonly observed in virtual try-on images produced by existing methods. During the image generation stage, with the aim of maximizing the utilization of the information proffered by the input image, the input features undergo double sampling within the normalization procedure, thereby enhancing the detail fidelity and clothing alignment efficacy of the output image. Experimental evaluations conducted on high-resolution datasets validate the effectiveness of the proposed method. Results demonstrate significant improvements in preserving garment details, reducing artifacts, and achieving superior alignment between the clothing and body compared to baseline methods, establishing its advantage in generating realistic and high-quality virtual try-on images.
Authors:
Long Chen;Huihuang Zhao*;Jiaxin He;Weiliang Meng
Journal:
Digital Signal Processing, 2025: 105321. ISSN: 1051-2004
Corresponding author:
Huihuang Zhao
Author affiliations:
National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, China; [Long Chen; Jiaxin He] College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, 421002, China; [Weiliang Meng] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China; [Huihuang Zhao] National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, China; College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, 421002, China
Corresponding institution:
[Huihuang Zhao] National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, China; College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, 421002, China
Abstract:
With the rapid development of generative models, there is an increasing demand for universal fake image detectors. In this paper, we investigate fake image detection for generative-model synthesis, aiming to detect fake images produced by multiple generative methods. Recent research explores the benefits of pre-trained models and mainly adopts a fixed paradigm of separately training additional classifiers, but we find that this fixed paradigm hinders the full learning of forgery features, leading to insufficient representation learning in the detector. The main reason is that the fixed paradigm pays too much attention to global features and neglects local features, which limits the model's ability to perceive image details and leads to information loss or confusion. To address this, building on a pre-trained vision-language space, our method introduces two core designs. First, we design a Deep Window Triple Attention (DWTA) module, which adopts a dense sliding-window strategy to capture multi-scale local anomaly patterns and enhances sensitivity to generated artifacts through a triple attention mechanism. Second, we propose a Cross-Space Feature Alignment (CSFA) module, which establishes a two-way interaction channel between global and local features and uses an alignment loss function to achieve semantic alignment across the feature spaces. The aligned features are adaptively fused through a gating mechanism to obtain the final adaptive forgery features. Experiments demonstrate that our method, when trained solely on ProGAN data, achieves superior cross-generator generalization: it attains an average accuracy of 94.7% on unseen GANs and generalizes to unseen diffusion models with 94% accuracy, surpassing existing methods by 2.1%. The source code is available at https://github.com/long2580h/GLFAFormer .
Abstract:
Significant energy challenges are faced in real-time secure data monitoring in intelligent environmental monitoring systems. A lightweight block cipher, LECipher, is proposed to secure such systems at low energy cost. Firstly, a generalized Feistel variant structure (GFS) with good diffusion is proposed. Secondly, an 8-bit S-box generation scheme is constructed based on Boolean functions; the hardware area of the S-box requires only 25.32 Gate Equivalents (GE), a significant advantage over currently published 8-bit S-boxes. Then, in the linear layer, two 8th-order binary involutory circulant matrices are constructed using XOR and cyclic shift operations combined with a recursive depth-first search (DFS) strategy, and on this basis a 16th-order binary involutory circulant matrix is constructed to further improve the security of LECipher. The key schedule algorithm adopts a dynamic adjustment design: more specifically, different key schedule operations are performed alternately every three rounds to improve the randomness of the round keys. According to a detailed hardware performance evaluation, encryption and decryption with LECipher require only 1542 GE and 6.50 μJ/bit on the UMC 0.18 μm process; the energy is reduced by 61% and 29% compared with the SKINNY and Midori ciphers, respectively. Furthermore, comprehensive security analyses show that LECipher maintains sufficient security margins. Finally, an experimental platform for encrypted transmission in intelligent environmental monitoring systems based on LECipher is established, further verifying the feasibility of its application in Internet of Things (IoT) devices.
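The involutory circulant matrices mentioned above have a simple checkable property: over GF(2), the matrix must be its own inverse (M·M = I). A minimal sketch, with toy first rows rather than LECipher's actual matrices (a real design would search, e.g. by DFS, for rows balancing involutivity against diffusion):

```python
import numpy as np

def circulant_gf2(first_row):
    """Build a binary circulant matrix from its first row; each subsequent
    row is the previous row cyclically shifted right by one position."""
    r = np.array(first_row, dtype=np.uint8)
    return np.array([np.roll(r, i) for i in range(len(r))], dtype=np.uint8)

def is_involutory_gf2(m) -> bool:
    """Check M @ M == I over GF(2), i.e. the matrix is its own inverse,
    so the same circuit serves both encryption and decryption."""
    sq = (m.astype(int) @ m.astype(int)) % 2
    return bool(np.array_equal(sq, np.eye(len(m), dtype=int)))

# first row x^4 (a half-rotation) gives a trivially involutory 8x8 circulant
m = circulant_gf2([0, 0, 0, 0, 1, 0, 0, 0])
print(is_involutory_gf2(m))  # True
```

Involutory diffusion layers are popular in lightweight ciphers precisely because decryption reuses the encryption hardware, saving area and energy.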
Abstract:
The adoption of deep learning-based side-channel analysis (DL-SCA) is crucial for leak detection in secure products, and many previous studies have applied it to break targets protected with countermeasures. Despite the growing number of studies, the problem of model overfitting persists. Recent research focuses mainly on exploring hyperparameters and network architectures, offering limited insight into the effects of external factors on side-channel attacks, such as the number and type of models. This paper proposes a side-channel analysis method based on a stacking ensemble, called Stacking-SCA, in which multiple models are deeply integrated. Through the extended application of base models and a meta-model, Stacking-SCA effectively improves the model's output class probabilities, leading to better generalization. Furthermore, the method shows that attack performance is sensitive to the number of models. Next, five mutually independent subsets are extracted from the original ASCAD database as multi-segment datasets, and the method shows how these subsets are used as inputs to Stacking-SCA to enhance attack convergence. The experimental results show that Stacking-SCA outperforms current state-of-the-art results on several considered datasets, significantly reducing the number of attack traces required to reach a guessing entropy of 1. Additionally, different hyperparameter sizes are adjusted to further validate the robustness of the method.
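The stacking idea can be sketched minimally: base models emit class probabilities, and a meta-level combines them into a sharper final prediction. The toy data and base models below are stand-ins, not the paper's DL-SCA architectures, and the meta-model here is a plain average rather than a trained meta-learner:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data standing in for labelled side-channel traces.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def nearest_centroid_proba(Xtr, ytr, Xte):
    """Base model 1: softmax over negative squared distances to class centroids."""
    cents = np.stack([Xtr[ytr == c].mean(0) for c in (0, 1)])
    d = ((Xte[:, None, :] - cents[None]) ** 2).sum(-1)
    e = np.exp(-d)
    return e / e.sum(1, keepdims=True)

def mean_difference_proba(Xtr, ytr, Xte):
    """Base model 2: logistic link on the class-mean-difference projection."""
    w = Xtr[ytr == 1].mean(0) - Xtr[ytr == 0].mean(0)
    p1 = 1.0 / (1.0 + np.exp(-(Xte @ w)))
    return np.stack([1 - p1, p1], 1)

# Stacking: base-model class probabilities become meta-features; the
# meta-level (an average here) produces the final class probabilities.
Xtr, ytr, Xte, yte = X[:150], y[:150], X[150:], y[150:]
meta = np.stack([nearest_centroid_proba(Xtr, ytr, Xte),
                 mean_difference_proba(Xtr, ytr, Xte)]).mean(0)
acc = (meta.argmax(1) == yte).mean()
print(acc)
```

In the paper's setting the meta-model is itself trained on the base models' outputs, which is what lets the ensemble correct systematic errors of individual networks rather than merely averaging them.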
Abstract:
To address the limitations of existing visually meaningful image encryption algorithms in terms of visual quality and adaptability across different wavelet types, this paper introduces a novel greedy algorithm-based embedding method. Our approach innovatively selects the optimal decomposition basis dynamically by calculating an error value for each candidate basis, tailored to the specific characteristics of both the host and plaintext images. This error-based selection process significantly enhances visual quality across a spectrum of wavelet transformations. Furthermore, we propose a novel two-dimensional hyperchaotic map (2D-RECM), characterized by its hyperchaotic behavior across a broad range of control parameters. The integration of 2D-RECM with our embedding algorithm results in a robust visually meaningful image encryption algorithm (RECM-VMIEA). Extensive simulations and analyses confirm that RECM-VMIEA not only maintains high visual quality but also exhibits exceptional security and robustness, outperforming existing algorithms in these critical aspects.
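As an illustration of how a 2D chaotic map can drive an encryption keystream, the sketch below iterates the classic Hénon map as a stand-in, since the 2D-RECM equations are not reproduced here; the byte quantization and the seeds are illustrative assumptions:

```python
import numpy as np

def henon_sequence(x0, y0, n, a=1.4, b=0.3):
    """Iterate the classic Henon map, a stand-in for the paper's 2D-RECM,
    to show how a 2D chaotic orbit yields a pseudo-random trajectory."""
    xs, ys = np.empty(n), np.empty(n)
    x, y = x0, y0
    for i in range(n):
        x, y = 1 - a * x * x + y, b * x
        xs[i], ys[i] = x, y
    return xs, ys

# Common construction in chaos-based ciphers: quantize the orbit to bytes.
xs_a, _ = henon_sequence(0.0, 0.0, 500)
keystream = (np.abs(xs_a * 1e6) % 256).astype(np.uint8)

# Sensitivity to initial conditions: a 1e-9 perturbation of the seed
# produces a completely different trajectory.
xs_b, _ = henon_sequence(1e-9, 0.0, 500)
print(keystream[:8], np.abs(xs_a - xs_b).max())
```

The same sensitivity is what makes the control parameters and seeds of a hyperchaotic map usable as a secret key.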
Author affiliations:
[Lujie Wang; Xiyu Sun; Chenchen He] College of Computer Science and Technology, Hengyang Normal University, Hengyang, China;Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China;[Zhong Chen] College of Computer Science and Technology, Hengyang Normal University, Hengyang, China;Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China
Corresponding institution:
[Zhong Chen] C;College of Computer Science and Technology, Hengyang Normal University, Hengyang, China;Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China
Keywords:
Image encryption;Region of interest;Lifting scheme;Chaos;NMS;Object detection
Abstract:
Securing image transmission has become critical as demand for image sharing and storage grows. In some commercial applications, to balance encryption security and efficiency, researchers encrypt only the user's region of interest in an image. To encrypt regions of interest accurately, this paper proposes a region-of-interest image encryption algorithm based on a lifting scheme and object detection. The algorithm automatically identifies the region of interest in an image and encrypts it securely and efficiently. Firstly, the region of interest is detected with an object detection model, and the non-maximum suppression algorithm is modified to solve the problem that the detection box output by the model does not completely contain the region of interest. Then the existing pixel extraction method for region-of-interest images is improved to extract pixels faster, improving the overall efficiency of the algorithm. Finally, based on the idea of the wavelet lifting transform combined with a chaotic system, a two-layer hybrid lifting-scheme encryption algorithm is proposed to encrypt the extracted pixels. Experimental results and security analysis show that the proposed algorithm can effectively protect all objects at once with high security.
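The lifting idea the encryption layer builds on can be shown with the textbook Haar lifting steps (split, predict, update), which are exactly invertible; this is a generic sketch, not the paper's two-layer hybrid scheme:

```python
import numpy as np

def haar_lift_forward(signal):
    """One level of the Haar wavelet via the lifting scheme:
    split into even/odd samples, predict odds from evens, update evens."""
    even, odd = signal[::2].astype(float), signal[1::2].astype(float)
    detail = odd - even            # predict step
    approx = even + detail / 2     # update step (preserves the local mean)
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order; reconstruction is exact."""
    even = approx - detail / 2
    odd = detail + even
    out = np.empty(even.size + odd.size)
    out[::2], out[1::2] = even, odd
    return out

x = np.array([5, 7, 2, 4, 9, 1, 6, 6])
a, d = haar_lift_forward(x)
print(a, d)
assert np.allclose(haar_lift_inverse(a, d), x)  # lifting is exactly invertible
```

Because each lifting step is trivially reversible, the same structure admits chaotic, key-dependent predict/update operators while still guaranteeing exact decryption, which is the property the hybrid scheme exploits.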
Abstract:
In 5G and beyond-5G networks, function placement is a crucial strategy for enhancing the flexibility and efficiency of the Radio Access Network (RAN). However, determining optimal function splitting and placement to meet diverse user demands remains a significant challenge. The function placement problem is known to be NP-hard, and previous studies have attempted to address it with Deep Reinforcement Learning (DRL). Nevertheless, many existing methods fail to capture the network state in RANs with specific topologies, leading to suboptimal decision-making and resource allocation. In this paper, we propose GDRL, a deep reinforcement learning approach that utilizes graph neural networks (GNNs) to address the function placement problem. To ensure policy stability, we design a policy gradient algorithm called Graph Proximal Policy Optimization (GPPO), which integrates GNNs into both the actor and critic networks. By incorporating both node and edge features, GDRL enhances feature extraction from the RAN's nodes and links, providing richer observational data for decision-making and evaluation and, in turn, more accurate and effective decisions. In addition, we formulate the problem as a mixed-integer nonlinear program that minimizes the number of active computational nodes while maximizing the centralization level of the virtualized RAN (vRAN). We evaluate GDRL across RAN scenarios with varying node configurations. The results demonstrate that our approach achieves superior network centralization and outperforms several existing methods in overall performance.
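Why a GNN helps capture RAN topology can be sketched with a round of neighbourhood aggregation: each node's representation mixes in its neighbours', so placement decisions can see link structure. The mean-aggregation update below is an illustrative stand-in, not GPPO's actual layers:

```python
import numpy as np

def mean_message_passing(node_feats, edges, steps=2):
    """GNN-style aggregation: each node repeatedly averages its
    neighbours' features with its own. Illustrative only; GPPO's real
    GNN layers (and its edge features) are not specified here."""
    n = node_feats.shape[0]
    adj = np.zeros((n, n))
    for u, v in edges:            # undirected links
        adj[u, v] = adj[v, u] = 1.0
    deg = np.maximum(adj.sum(1, keepdims=True), 1.0)
    h = node_feats.astype(float)
    for _ in range(steps):
        h = 0.5 * h + 0.5 * (adj @ h) / deg   # self + mean-neighbour mix
    return h

# Toy 4-node chain topology standing in for a small RAN: 0-1-2-3.
feats = np.eye(4)                 # one-hot node features
out = mean_message_passing(feats, [(0, 1), (1, 2), (2, 3)])
print(out.shape)  # (4, 4)
```

After two steps, each node's row is a convex mixture over nodes up to two hops away, which is exactly the topological context a flat MLP observation would miss.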
Abstract:
Amodal instance segmentation refers to perceiving the entire instance in an image, segmenting both the visible parts of an object and the regions that may be occluded. However, existing amodal instance segmentation methods predict rough mask edges and perform poorly on objects with significant size differences. In addition, occluded environments greatly limit model performance. To address these problems, this work proposes an amodal instance segmentation method called MFC-Net to accurately segment objects in an image. Against rough mask-edge prediction, the model introduces a multi-path transformer structure to obtain finer object semantic features and boundary information, improving the accuracy of edge-region segmentation. For the poor segmentation of object instances with significant size differences, we design an adaptive feature fusion module, AFF, which dynamically captures scale changes related to object size and fuses multi-scale semantic features so that the model obtains a receptive field adapted to the object size. To address poor segmentation in occluded environments, we design a context-aware mask segmentation module, CMS, in the prediction module to make a preliminary prediction of the object's amodal region; the module enhances the model's amodal perception by modeling long-range dependencies among objects and capturing contextual information about the occluded parts. Compared with state-of-the-art methods, MFC-Net achieves a mAP of 73.3% on the D2SA dataset, and 33.9% and 36.9% on the KINS and COCOA-cls datasets, respectively. Moreover, MFC-Net produces complete and detailed amodal masks.
Corresponding institution:
[Li, L ] H;Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China.;Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China.
Keywords:
Deep learning;Side-channel attack;Multilabel;Machine learning;Information security
Abstract:
Deep learning methods have had a significant impact on the side-channel attack (SCA) community. However, the training and verification phases of deep learning-based side-channel attacks (DL-SCA) typically focus on a single byte, which requires training numerous models to recover all key bytes. To resolve this problem, this paper proposes TripM, a triple-key attack model that can attack three bytes in a single training session. First, TripM leverages label groups to learn the leakage of multiple bytes in one training session, where labels are divided into groups according to the attacked bytes; TripM uses three label groups, each containing the points-of-interest information of the corresponding key byte. Second, the architecture of TripM features two identical convolutional branches that share weights, reducing the model's parameter size and accelerating training. Finally, TripM employs multithreading in the key recovery phase, where three threads concurrently compute the guessing entropy (GE) of the three bytes. Experimental results demonstrate that TripM efficiently processes the public ASCAD and TinyPower datasets, requiring an average of 80 and 89 traces, respectively, to recover a key. Average Layer-wise Correlation (AVE-LWC) visualizations also illustrate that TripM possesses excellent feature extraction capabilities.
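Guessing entropy, the metric TripM's threads compute per byte, is the (averaged) rank of the true key hypothesis after accumulating per-trace scores; GE reaching 0-rank (or 1 in 1-indexed terms) means the key is recovered. A minimal single-run sketch, with synthetic log-probabilities standing in for model outputs:

```python
import numpy as np

def guessing_entropy_rank(log_probs, true_key):
    """Rank of the true key after accumulating per-trace log-probabilities.

    log_probs: (n_traces, n_keys) model log-likelihoods, one row per
    attack trace, one column per key hypothesis. Guessing entropy is
    this rank averaged over many attack runs; one run is shown here.
    """
    scores = log_probs.sum(axis=0)                  # combine all traces
    order = np.argsort(scores)[::-1]                # best hypothesis first
    return int(np.where(order == true_key)[0][0])   # 0 => key recovered

rng = np.random.default_rng(1)
n_traces, n_keys, true_key = 50, 256, 42
lp = rng.normal(size=(n_traces, n_keys))
lp[:, true_key] += 0.5          # simulated leakage favouring the true key
print(guessing_entropy_rank(lp, true_key))
```

Since the three byte-wise rank computations are independent given the model outputs, they parallelize trivially, which is what the multithreaded recovery phase exploits.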
Corresponding institution:
[Li, L ] H;Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China.;Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China.
Keywords:
Internet of Things;Lightweight block cipher;Generalized Feistel;Lai-Massey;High diffusion
Abstract:
Lightweight block ciphers are critical for securing data transmission in resource-limited Internet of Things (IoT) devices. In designing secure and efficient lightweight block ciphers, balancing diffusion and resource consumption is a key metric. This paper proposes QLW, a highly diffusive lightweight block cipher designed to meet the growing security needs of resource-constrained devices. QLW employs a combined variant of the generalized Feistel structure (GFS) and the Lai-Massey structure as its underlying structure. The round function adopts a GFS refined into a double half-round structure, while the branch XOR and F-function follow the Lai-Massey structure; under their combined effect, QLW achieves full diffusion in just two rounds. Meanwhile, QLW uses a standard genetic algorithm (GA) to optimize a 4-bit S-box, ensuring robust security; the final S-box occupies only 15.01 gate equivalents (GE) and requires eight logic gates, minimizing hardware overhead. Moreover, QLW achieves high diffusion at low resource consumption using a linear matrix built from bitwise operations and logic gates. Furthermore, QLW increases the unpredictability of the rotation by incorporating a dynamic round constant T from the key schedule, enhancing resistance to algebraic attacks. Finally, QLW is subjected to security evaluation and hardware implementation. The results demonstrate that the hardware implementation requires only 1655.26 GE of area, consumes 7.37 μJ/bit of energy, and resists known attacks such as differential cryptanalysis, linear cryptanalysis, and integral attacks, with good security margin.
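The inversion property that makes Feistel-type structures attractive for lightweight hardware (decryption reuses the same F-function with reversed round keys, so no inverse S-box circuit is needed) can be sketched generically. The toy F-function and key values below are illustrative assumptions; QLW's actual GFS/Lai-Massey variant, S-box, and linear layer are not reproduced:

```python
def feistel_encrypt(left, right, round_keys, f):
    """Generic balanced Feistel network: each round swaps the halves and
    XORs the F-function output into one branch."""
    for k in round_keys:
        left, right = right, left ^ f(right, k)
    return left, right

def feistel_decrypt(left, right, round_keys, f):
    """Decryption runs the rounds in reverse with the same F-function,
    so F never needs to be invertible."""
    for k in reversed(round_keys):
        left, right = right ^ f(left, k), left
    return left, right

# Toy F-function on 8-bit branches: rotate-left by 3, then XOR the key.
f = lambda x, k: ((x << 3 | x >> 5) & 0xFF) ^ k

keys = [0x1A, 0x2B, 0x3C, 0x4D]          # illustrative round keys
ct = feistel_encrypt(0xDE, 0xAD, keys, f)
assert feistel_decrypt(*ct, keys, f) == (0xDE, 0xAD)
print(ct)
```

A dynamic round constant like QLW's T would enter through `round_keys`, varying the per-round transformation without changing this inversion argument.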