Spatial correlation and local feature transformer based occluded person re-identification
  
DOI:
Keywords: occluded person re-identification; local feature; patch sequence; vision transformer
Funding: Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY221077)
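Authors and affiliations:
Zhu Songhao (朱松豪), College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023
Zhao Yunbin (赵云斌), College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023
Jiao Miao (焦淼), Shandong Luneng Taishan Cable Co., Ltd., TBEA, Xintai, Shandong 271219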
Abstract:
Occluded person re-identification is a challenging task in computer vision that suffers from inefficient feature representation and low recognition accuracy. Because convolutional neural networks focus on extracting local features, they struggle to extract features of occluded pedestrians and give unsatisfactory results. Recently, the vision transformer has been introduced into re-identification and achieves state-of-the-art results by modeling global feature relationships between patch sequences. However, its local feature extraction is inferior to that of convolutional neural networks. We therefore design a person re-identification network based on spatial correlation and local feature sequences. The proposed network uses three modules to enhance the vision transformer. (1) Patch full dimension enhancement module: a learnable tensor with the same size as the patch sequence is designed; it is full-dimensional and can be fully embedded into the patch sequence to enrich the diversity of training samples. (2) Patch sequence fusion and reconstruction module: the less important parts of the obtained patch sequences are extracted and fused with the original patch sequences to reconstruct them. (3) Spatial slicing module: patch sequences are sliced and grouped along the spatial direction and an identity loss is introduced, which effectively improves the short-range correlation of patch sequences. Experiments on occluded and holistic re-identification datasets show that the proposed network outperforms other state-of-the-art methods.
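As a rough illustration of modules (1) and (3), the sketch below shows one way a same-sized tensor could be combined with a patch sequence and one way patch tokens could be grouped spatially. It is not the authors' implementation. The function names, the element-wise addition used as the embedding rule, the row-major patch grid, and the choice of horizontal stripes are all assumptions made for this example; the abstract does not specify them. The per-group identity loss is not shown.

```python
def add_full_dimension_tensor(patches, learnable):
    """Combine a patch sequence with a tensor of identical shape.

    ASSUMPTION: the combination is element-wise addition; the abstract only
    says the tensor has the same size as the patch sequence.
    Both arguments are lists of N token vectors of equal length D.
    """
    same_shape = len(patches) == len(learnable) and all(
        len(p) == len(w) for p, w in zip(patches, learnable)
    )
    if not same_shape:
        raise ValueError("learnable tensor must match the patch sequence shape")
    return [[x + y for x, y in zip(p, w)] for p, w in zip(patches, learnable)]


def spatial_slice_groups(grid_h, grid_w, num_groups):
    """Return token indices grouped into horizontal stripes.

    ASSUMPTION: tokens are stored row-major over a grid_h x grid_w patch
    grid (class token excluded) and grid_h is divisible by num_groups.
    """
    if grid_h % num_groups != 0:
        raise ValueError("grid_h must be divisible by num_groups")
    rows_per_group = grid_h // num_groups
    groups = []
    for g in range(num_groups):
        first_row = g * rows_per_group
        start = first_row * grid_w
        stop = (first_row + rows_per_group) * grid_w
        groups.append(list(range(start, stop)))
    return groups


# Example: a 4 x 2 patch grid split into 2 stripes of 2 rows each.
print(spatial_slice_groups(grid_h=4, grid_w=2, num_groups=2))
# [[0, 1, 2, 3], [4, 5, 6, 7]]
```

In a setup like this, each group of tokens would feed its own classification head, so that neighbouring patches are supervised together. That is one plausible reading of how slicing could strengthen short-range correlation.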
Copyright: Editorial Office of Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition)
Tel:86-25-85866913 E-mail:xb@njupt.edu.cn