
When CNN meet with ViT: decision-level feature fusion for camouflaged object detection

Publication type:
Journal article
Authors:
Yue, Guowen;Jiao, Ge;Li, Chen;Xiang, Jiahao
Corresponding author:
Jiao, G
Author affiliations:
[Li, Chen; Jiao, Ge; Yue, Guowen; Xiang, Jiahao] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China.
[Jiao, Ge; Xiang, Jiahao] Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China.
Corresponding author's affiliation:
[Jiao, G]
Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China.
Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China.
Language:
English
Keywords:
Convolutional neural network;Vision transformer;Camouflaged object detection;Feature fusion
Journal:
VISUAL COMPUTER
ISSN:
0178-2789
Year:
2024
Pages:
1-16
Funding:
Hunan Provincial Natural Science Foundation of China [2021JJ50074, 2022JJ50016]; Science and Technology Plan Project of Hunan Province [2016TP1020]; The 14th Five-Year Plan Key Disciplines and Application-oriented Special Disciplines of Hunan Province [351]
Institutional attribution:
This university is the first and corresponding institution.
Affiliated departments:
College of Computer Science and Technology
College of Physics and Electronic Engineering
Abstract:
Despite the significant advancements in camouflaged object detection achieved by convolutional neural network (CNN) methods and vision transformer (ViT) methods, both have limitations. CNN-based methods fail to explore long-range dependencies due to their limited receptive fields, while ViT-based methods lose detailed information due to large-span aggregation. To address these issues, we introduce a novel model, the double-extraction and triple-fusion network (DTNet), which leverages the global context modeling capabilities of ViT-based encoders and the detail capture capabilities of CNN-based...
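To make the abstract's idea of combining a detail-oriented CNN branch with a global-context ViT branch more concrete, the following is a minimal PyTorch sketch of decision-level feature fusion: each branch produces its own coarse prediction map, and a small fusion head merges the two decisions. This is not DTNet itself; the backbones (a ResNet-50 CNN branch and a small transformer-encoder "ViT-style" branch), channel sizes, and fusion head are illustrative assumptions, and the paper's double-extraction and triple-fusion modules are not reproduced.

```python
# Minimal sketch of decision-level CNN/ViT fusion (illustrative, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class TinyViTBranch(nn.Module):
    """Illustrative ViT-style branch: patch embedding + transformer encoder.
    Positional embeddings are omitted for brevity."""
    def __init__(self, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, x):
        tokens = self.embed(x)                      # B x dim x H/16 x W/16
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)     # B x N x dim, global self-attention
        seq = self.encoder(seq)
        fmap = seq.transpose(1, 2).reshape(b, c, h, w)
        return self.head(fmap)                      # ViT-branch decision map


class DecisionLevelFusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        # CNN branch: ResNet-50 features capture local detail (assumed backbone).
        resnet = models.resnet50(weights=None)
        self.cnn_encoder = nn.Sequential(*list(resnet.children())[:-2])  # B x 2048 x H/32 x W/32
        self.cnn_head = nn.Conv2d(2048, 1, kernel_size=1)                # CNN-branch decision map

        # ViT-style branch: models long-range dependencies over patch tokens.
        self.vit_branch = TinyViTBranch()

        # Fusion head: merge the two single-channel decisions into one mask.
        self.fuse = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        size = x.shape[-2:]
        cnn_dec = self.cnn_head(self.cnn_encoder(x))
        vit_dec = self.vit_branch(x)
        cnn_dec = F.interpolate(cnn_dec, size=size, mode="bilinear", align_corners=False)
        vit_dec = F.interpolate(vit_dec, size=size, mode="bilinear", align_corners=False)
        fused = self.fuse(torch.cat([cnn_dec, vit_dec], dim=1))
        return torch.sigmoid(fused)                 # camouflage probability map


if __name__ == "__main__":
    model = DecisionLevelFusionNet()
    out = model(torch.randn(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 1, 224, 224])
```

Fusing at the decision level keeps the two encoders independent, so each branch's strengths (local detail from the CNN, long-range context from the transformer) are preserved until the final merge; the published model refines this idea with its double-extraction and triple-fusion design.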
