Abstract:
Current camouflaged object detection (COD) methods rely heavily on large-scale datasets with pixel-level annotations. To reduce this reliance, we propose a semi-supervised iterative learning network (SILNet). SILNet employs a co-training strategy with convolutional networks and Transformers as encoders, followed by a binary gated decoder (BGD) for feature fusion. To optimize the use of labeled data, we introduce an optimal representative election mechanism (OREM) to identify key sequences of unlabeled images, guiding iterative learning and pseudo-label generation. To reduce noise in pseudo-labels, we incorporate a long-range representation module (LRM) leveraging Mamba’s background modeling. Experiments show that SILNet trained with only 10% of the labeled data outperforms state-of-the-art unsupervised and weakly supervised methods, achieving performance competitive with fully supervised models.
Abstract:
This paper proposes an A* artificial intelligence pathfinding algorithm based on the hexagonal grid map of Unreal Engine 5. This algorithm utilizes the rich tools and resources provided by Unreal Engine 5 to evaluate each node through a heuristic function, thus finding the shortest path. Test results show that this algorithm not only can quickly find the shortest path, but also can effectively avoid obstacle grids, with advantages such as high efficiency, flexibility, and scalability. This research result has high practical value for solving pathfinding problems on hexagonal grid maps and can provide strong support for game development and other fields of artificial intelligence applications.
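The abstract does not give the algorithm's details, so the following is only a minimal sketch of A* on a hexagonal grid, written in plain Python rather than Unreal Engine 5. It assumes axial (q, r) coordinates, a uniform step cost of 1, and the hex distance as the heuristic; the names `a_star_hex`, `hex_distance`, and `walkable` are illustrative and do not come from the paper.

```python
import heapq

# Six neighbour offsets in axial (q, r) hex coordinates (an assumed layout).
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_distance(a, b):
    """Number of hex steps between two axial coordinates (the heuristic)."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def a_star_hex(start, goal, walkable):
    """Return a shortest path from start to goal as a list of cells, or None.

    `walkable` is the set of passable cells; obstacle cells are simply absent.
    """
    if start not in walkable or goal not in walkable:
        return None
    # Heap entries are (estimated total cost, cost so far, cell).
    open_heap = [(hex_distance(start, goal), 0, start)]
    came_from = {start: None}
    cost_so_far = {start: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if cost > cost_so_far[current]:
            continue  # stale entry: a cheaper route to this cell was found
        for dq, dr in HEX_DIRECTIONS:
            nxt = (current[0] + dq, current[1] + dr)
            if nxt not in walkable:
                continue  # off the map or an obstacle
            new_cost = cost + 1
            if new_cost < cost_so_far.get(nxt, float("inf")):
                cost_so_far[nxt] = new_cost
                came_from[nxt] = current
                priority = new_cost + hex_distance(nxt, goal)
                heapq.heappush(open_heap, (priority, new_cost, nxt))
    return None
```

Because the hex distance never overestimates the remaining number of steps, the first time the goal is popped the returned path is a shortest one under these assumptions.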
Author affiliations:
[Yuxing Lu] Peking University;[Weichen Zhao; Ge Jiao; Yuan Yang] Hengyang Normal University
Conference name:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference date:
14 April 2024
Conference location:
Seoul, Korea, Republic of
Proceedings title:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords:
Multi-Modal Media Manipulation;DeepFake Detection;Mask Signal Modeling
Abstract:
Detecting and Grounding Multi-Modal Media Manipulation (DGM^4) is an emerging task that aims to identify and locate manipulated elements in both textual and visual media. Given the complexity of this task, the model requires more sophisticated reasoning capabilities to align multi-modal features and capture forgery traces. To this end, we propose a Concentrated reasoning and Unified reconstruction framework (CrUr) for DGM^4. Instead of adhering to traditional hierarchical reasoning paradigms, we directly carry out all inference tasks using integrated multi-modal features. Specifically, we extract and align features at a finer granularity, capturing subtle differences that may indicate manipulation by leveraging advanced mask signal modeling. Moreover, to adapt to fine-grained reasoning tasks, we design a transformer-based Reconstruction Harmonizer to facilitate more complex interactions among the reconstructed features, ultimately obtaining integrated features. Experimental results on the DGM^4 datasets show that our method achieves state-of-the-art performance.
Author affiliations:
College of Computer Science and Technology, Hengyang Normal University, Hunan, Hengyang, 421002, China;Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hunan, Hengyang, 421002, China
Conference name:
13th International Conference on Computer Engineering and Networks, CENet 2023
Author affiliations:
[Mingcong Gao] Hengyang Normal University, Hengyang, China
Conference name:
2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS)
Conference date:
23 August 2024
Conference location:
Hassan, India
Proceedings title:
2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS)
Keywords:
bidirectional-long short term memory;convolutional neural network;deep learning;mel-frequency cepstral coefficients;music genre classification
Abstract:
Music is time series data, and growth in data size poses a significant challenge to building a robust music genre classification system. Such a system requires a large amount of labelled music data and must capture effective data features to improve genre classification. The proposed research develops a Deep Learning (DL) framework for classification in four steps. First, labelled music data is collected from the GTZAN and Ballroom datasets. The collected data is pre-processed by normalization, which equalizes the volume of all samples in the database to a constant level. The pre-processed data then undergoes feature extraction to obtain Mel-Frequency Cepstral Coefficients (MFCCs), which characterize the short-term power spectrum of the audio signals on the mel scale. The extracted features are classified using a Bidirectional-Long Short Term Memory network combined with a Convolutional Neural Network (Bi-LSTM-CNN). The proposed Bi-LSTM-CNN achieved 98.79% accuracy, 97.84% precision, and a 97.05% F1-score, surpassing existing DL approaches such as the Masked Predictive Encoder-Recurrent Neural Network (MPE-RNN) and VGG16-Bi-LSTM.
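As a rough illustration of the pre-processing and mel-scale steps mentioned above, the sketch below shows peak normalization and one widely used Hz-to-mel conversion. The abstract does not state which normalization scheme or mel formula the authors used, so both choices, and the function names `peak_normalize`, `hz_to_mel`, and `mel_to_hz`, are assumptions made for this example.

```python
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty clip: nothing to scale
    scale = target_peak / peak
    return [s * scale for s in samples]


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f / 700) form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

A full MFCC pipeline would additionally frame the signal, apply a mel filterbank to each frame's power spectrum, take logarithms, and apply a discrete cosine transform; those steps are omitted here.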
Author affiliations:
[Xigui Lei; Ge Jiao; Hai Liu; Yuxin Hu] College of Computer Science and Technology, Hengyang Normal University, Hengyang, Hunan, China
Conference name:
CAIBDA '24: Proceedings of the 2024 4th International Conference on Artificial Intelligence, Big Data and Algorithms
Proceedings title:
Artificial Intelligence, Big Data and Algorithms
Abstract:
In recent years, there has been a rise in drowning accidents in swimming pools. As a result, there is a growing interest in using deep learning methods to detect drowning incidents. However, current research has identified several issues with existing drowning detection methods, including poor real-time performance, a high number of parameters, and heavy computation. To address these problems, a lightweight and real-time drowning detection algorithm, YPMNet, was developed. This algorithm uses the MobileNetV3 lightweight network to reconstruct the YOLOPose backbone network, resulting in faster feature extraction. Additionally, the SE attention mechanism is incorporated to balance the trade-off between model accuracy and lightweight design. To further optimize the algorithm for mobile devices, the h-swish activation function is used to enhance the model's nonlinear mapping power and robustness. The experimental results demonstrate that YPMNet achieves a detection accuracy of 90% and a detection speed of 32 fps, meeting the real-time requirements for drowning detection.
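For reference, the h-swish activation mentioned above is usually defined, as in MobileNetV3, by h-swish(x) = x * ReLU6(x + 3) / 6. The scalar sketch below only illustrates that formula; it is not taken from the YPMNet implementation, and the function names are chosen for this example.

```python
def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x):
    """Hard swish: a piecewise approximation of x * sigmoid(x)."""
    return x * relu6(x + 3.0) / 6.0
```

The function is zero for inputs at or below -3, equals the identity for inputs at or above 3, and avoids the exponential needed by the smooth swish, which is the usual argument for using it on mobile hardware.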
Author affiliations:
[Yang Liu; Shuaikang Song; Hua Lv] Hengyang Normal University, Hengyang, China
Conference name:
ICCSIE '24: Proceedings of the 2024 9th International Conference on Cyber Security and Information Engineering
Proceedings title:
Cyber Security and Information Engineering
Abstract:
To address the shortcomings of traditional methods for identifying key nodes in public opinion dissemination, such as low identification accuracy, low coverage of key nodes, and shallow central relevance, this paper proposes an optimization method based on knowledge graphs. Public opinion data is collected from Weibo to build a social network, and a community discovery algorithm is combined with an improved PageRank algorithm to identify key nodes effectively. Experimental comparison shows that, when opinion leaders account for the top 3%, the method reaches a key-node coverage rate of 46.5% and a recognition accuracy rate of 94.1%, significantly better than the hierarchical analysis method (coverage rate 21.2%, accuracy rate 90.7%) and the clustering analysis method (coverage rate 15.8%, accuracy rate 91.1%). The study thus shows that integrating knowledge graph technology can significantly improve the identification accuracy and coverage rate of key nodes in public opinion dissemination, and provides a powerful tool for public opinion management.
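The abstract does not describe how PageRank was improved, so the sketch below shows only the standard power-iteration PageRank that such a method would build on. The graph representation (a dict mapping each node to the nodes it points to), the damping factor of 0.85, and the function name `pagerank` are assumptions for illustration; community discovery is not covered.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    `graph` maps each node to a list of nodes it links to. Nodes that
    appear only as targets are included automatically.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

Sorting the returned scores in descending order and keeping a fixed top fraction of nodes is one simple way to pick candidate key nodes from such a ranking.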
Authors:
Ge Jiao;Yingjie Jiang;Lingcheng Zeng;Lei Zeng;Guangyong Zheng;...
Author affiliations:
[Lingcheng Zeng] Songbai Middle School, China;[Lei Zeng] Qidong Middle School Attached to HYNU, China;[Yingjie Jiang] Chuanshan Experimental Middle School, China;[Ge Jiao; Guangyong Zheng; Kangman Li] College of Computer Science and Technology, Hengyang Normal University, China
Conference name:
IECT '24: Proceedings of the 2024 International Conference on Intelligent Education and Computer Technology
Proceedings title:
Intelligent Education and Computer Technology
Abstract:
With the rapid advancement of AI technology in the field of basic education, the concept, form, method, mode, and environment of education and teaching are undergoing significant changes. As a result, information technology teachers are confronted with the challenge of restructuring their subject knowledge and developing intelligent literacy. In the context of applying AI to basic education, intelligent literacy has become an essential professional skill for IT teachers. This paper proposes a framework for cultivating intelligent literacy among information technology teachers based on Artificial Intelligence Technological Pedagogical Content Knowledge (AI-TPACK) theory. The study aims to elucidate the meaning of AI-TPACK for information technology teachers, establish AI-TPACK models and practice paths for them, propose strategies for integrating subject teaching knowledge with artificial intelligence technology, and enhance cultivation and development plans for their intelligent literacy. The goal is to enhance the teaching abilities of information technology teachers so that they can acquire comprehensive artificial intelligence literacy in their work and better adapt to the new requirements and challenges presented by the era of AI.
Author affiliations:
[Yuxing Lu] Peking University;[Weichen Zhao; Ge Jiao; Yuan Yang] Hengyang Normal University
Conference name:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference date:
14 April 2024
Conference location:
Seoul, Korea, Republic of
Proceedings title:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords:
Person search;Text-based person re-identification;Color reasoning
Abstract:
Text-based Person Search (TBPS) aims to retrieve person images that match given text descriptions. The heterogeneity between modalities and the fine granularity of the person make the task challenging. Existing methods often overlook granularity consistency across different color channels, leaving considerable room to improve retrieval performance. In this paper, we propose a Dual-Color Granularity Alignment (DCGA) method for TBPS. DCGA harnesses both color and grayscale information to address issues of color reliance and granularity consistency. Moreover, by employing an improved CR Loss with grayscale information used as additional weak supervision, DCGA addresses intra-class variance and dataset scarcity. Extensive experiments demonstrate that the proposed DCGA method achieves state-of-the-art results on all three public datasets.
Author affiliations:
[Tai Wang] Faculty of Artificial Intelligence Education, Central China Normal University, Wuhan, China;College of Educational Science, Hengyang Normal University, Hengyang, China;[Shuhui Wang] Faculty of Artificial Intelligence Education, Central China Normal University, Wuhan, China;College of Educational Science, Hengyang Normal University, Hengyang, China
Conference name:
2024 4th International Conference on Educational Technology (ICET)
Conference date:
13 September 2024
Conference location:
Wuhan, China
Proceedings title:
2024 4th International Conference on Educational Technology (ICET)
Keywords:
new CEE reform;admission quality;college major group;moderating effect
Abstract:
The "3+1+2" subject selection mechanism, combined with the college major group admission model, has become the most popular scheme in the ongoing reform of the new college entrance examination (CEE) in China. This study uses CEE admission panel data from local higher normal colleges in Hunan Province from 2018 to 2023 to assess the reform’s impacts with a fixed-effects model, taking the college major group as the unit of analysis. The empirical study examines changes in admission quality among both advantaged and disadvantaged college major groups, across various subject categories, before and after the reform’s implementation. Findings indicate that the reform has significantly improved admission quality in advantaged college major groups, whereas disadvantaged groups have experienced a decline. The trend analysis suggests that the new CEE’s "major-oriented" mechanism marks a partial breakthrough from the traditional "score-centric" admissions system. It shows immediate improvements in admission quality in advantaged groups, but no sustained moderating effect over time. Conversely, the ongoing decline in admission quality within disadvantaged groups after the reform points to a significant moderating effect of the new CEE reform, underscoring the need for in-depth studies to track and clarify these disparities. This study sheds light on the differentiated effects of the new CEE reform, suggesting directions for future educational policy and reform efforts.