Metric Learning for Projections Bias of Generalized Zero-shot Learning


arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2023-09-04, DOI: arxiv-2309.01390
Chong Zhang, Mingyu Jin, Qinkai Yu, Haochen Xue, Xiaobo Jin

Generalized zero-shot learning (GZSL) models aim to recognize samples from both seen and unseen classes while using only samples from seen classes as training data. During inference, GZSL methods are often biased towards seen classes because only seen-class samples are available during training. Most current GZSL methods try to learn an accurate projection function (from visual space to semantic space) to avoid this bias and ensure the effectiveness of GZSL methods. However, because the learned projection function may itself be biased, the computation of distance also matters during inference, when the projection of a sample is assigned to its nearest class. In this work, we attempt to learn a parameterized Mahalanobis distance within the framework of VAEGAN (Variational Autoencoder & Generative Adversarial Networks), where the weight matrix depends on the network's output. In particular, we improve the network structure of VAEGAN to leverage the discriminative models of two branches, which separately predict the seen samples and the unseen samples generated from the seen ones. We propose a new loss function with two branches to help learn the optimized Mahalanobis distance representation. Comprehensive evaluations on four datasets demonstrate the superiority of our method over state-of-the-art counterparts. Our code is available at https://anonymous.4open.science/r/111hxr.
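To illustrate why the distance matters at inference, here is a minimal sketch of nearest-class assignment under a Mahalanobis distance, d(x, c) = (x - c)^T M (x - c) with M = L^T L. It is not the authors' implementation: the function names, the toy class prototypes, and the fixed matrix L are all assumptions made for illustration. In the paper the weight matrix depends on the network's output, whereas here it is simply hard-coded.

```python
import numpy as np


def mahalanobis_sq(x, prototypes, L):
    """Squared Mahalanobis distance from x to each class prototype.

    Uses M = L^T L, so (x - c)^T M (x - c) = ||L (x - c)||^2.
    L is a hypothetical, hand-picked matrix here; it is NOT learned.
    """
    diff = prototypes - x            # shape: (num_classes, dim)
    projected = diff @ L.T           # row i is L @ (c_i - x)
    return np.sum(projected ** 2, axis=1)


def nearest_class(x, prototypes, L):
    """Index of the class prototype closest to x under the metric M = L^T L."""
    return int(np.argmin(mahalanobis_sq(x, prototypes, L)))


# Toy setup (assumed values): one projected sample and two class prototypes
# in a 2-D "semantic space".
x = np.array([0.0, 0.0])
prototypes = np.array([[0.0, 2.0],   # class 0
                       [3.0, 0.0]])  # class 1

euclidean = np.eye(2)                # M = I recovers squared Euclidean distance
reweighted = np.diag([0.5, 3.0])     # down-weights axis 0, up-weights axis 1

print(nearest_class(x, prototypes, euclidean))    # class 0 (distances 4 and 9)
print(nearest_class(x, prototypes, reweighted))   # class 1 (distances 36 and 2.25)
```

The same projected sample is assigned to a different class once the metric changes, which is the effect a learned distance can exploit when the projection itself is biased towards seen classes.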

Updated: 2023-09-06
