TY - GEN
T1 - Doubly Deformable Aggregation of Covariance Matrices for Few-Shot Segmentation
AU - Xiong, Zhitong
AU - Li, Haopeng
AU - Zhu, Xiao Xiang
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Training semantic segmentation models with few annotated samples has great potential in various real-world applications. For the few-shot segmentation task, the main challenge is how to accurately measure the semantic correspondence between the support and query samples with limited training data. To address this problem, we propose to aggregate the learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map. Specifically, in this work, we first devise a novel hard example mining mechanism to learn covariance kernels for the Gaussian process. The learned covariance kernel functions have great advantages over existing cosine similarity-based methods in correspondence measurement. Based on the learned covariance kernels, an efficient doubly deformable 4D Transformer module is designed to adaptively aggregate feature similarity maps into segmentation results. By combining these two designs, the proposed method can not only set new state-of-the-art performance on public benchmarks, but also converge extremely faster than existing methods. Experiments on three public datasets have demonstrated the effectiveness of our method. (Code: https://github.com/ShadowXZT/DACM-Few-shot.pytorch
AB - Training semantic segmentation models with few annotated samples has great potential in various real-world applications. For the few-shot segmentation task, the main challenge is how to accurately measure the semantic correspondence between the support and query samples with limited training data. To address this problem, we propose to aggregate the learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map. Specifically, in this work, we first devise a novel hard example mining mechanism to learn covariance kernels for the Gaussian process. The learned covariance kernel functions have great advantages over existing cosine similarity-based methods in correspondence measurement. Based on the learned covariance kernels, an efficient doubly deformable 4D Transformer module is designed to adaptively aggregate feature similarity maps into segmentation results. By combining these two designs, the proposed method can not only set new state-of-the-art performance on public benchmarks, but also converge extremely faster than existing methods. Experiments on three public datasets have demonstrated the effectiveness of our method. (Code: https://github.com/ShadowXZT/DACM-Few-shot.pytorch
KW - Deep kernel learning
KW - Few-shot segmentation
KW - Gaussian process
KW - Similarity measurement
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85144496818&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-20044-1_8
DO - 10.1007/978-3-031-20044-1_8
M3 - Conference contribution
AN - SCOPUS:85144496818
SN - 9783031200434
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 133
EP - 150
BT - Computer Vision – ECCV 2022 - 17th European Conference, Proceedings
A2 - Avidan, Shai
A2 - Brostow, Gabriel
A2 - Cissé, Moustapha
A2 - Farinella, Giovanni Maria
A2 - Hassner, Tal
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th European Conference on Computer Vision, ECCV 2022
Y2 - 23 October 2022 through 27 October 2022
ER -