6.1 The Performance of the Attack Method
Metric learning models. In this experiment, we adopt the following state-of-the-art metric learning models to evaluate the performance of AckMetric (a brief training sketch is given after the list).
– LMNN [57] aims to make the \(k\)-nearest neighbors of each instance belong to the same class, while instances from different classes are separated by a large margin.
– GMML [71] formulates the metric learning process as a smooth, strongly convex optimization problem over pairs of similar and dissimilar points.
– ITML [6] models the metric learning problem in an information-theoretic setting by leveraging the relationship between the multivariate Gaussian distribution and the set of Mahalanobis distances.
– LowRank [72] presents a similarity learning algorithm that encodes low-rank structure into the learning process to perform sparse feature selection.
– SCML [48] learns a sparse combination of locally discriminative metrics that are inexpensive to generate.
– AML [5] first generates synthetic hard samples based on GANs, and then uses these hard samples to boost the discriminability of the learned metric.
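Several of these models have open-source implementations. As a minimal, illustrative sketch (assuming the Python metric-learn package, which ships LMNN, ITML, and SCML; GMML, LowRank, and AML would require their authors' original code), training looks like this:

```python
# Minimal sketch of training three of the adopted metric learning models,
# assuming the open-source `metric-learn` package. GMML, LowRank, and AML
# are not part of this package and need their authors' implementations.
import numpy as np
from metric_learn import LMNN, ITML_Supervised, SCML_Supervised

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22))      # e.g., 22 features as in Parkinson's
y = rng.integers(0, 2, size=200)    # synthetic binary labels for illustration

models = {
    "LMNN": LMNN(),                 # large-margin k-NN metric
    "ITML": ITML_Supervised(),      # information-theoretic metric learning
    "SCML": SCML_Supervised(),      # sparse combination of local metrics
}
for name, model in models.items():
    model.fit(X, y)                 # learns a Mahalanobis-type metric
    M = model.get_mahalanobis_matrix()  # the learned PSD matrix M
    print(name, M.shape)
```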
Datasets. The details of the adopted real-world datasets are as follows. The Parkinson's disease dataset [35] contains 22 features and 195 biomedical voice samples collected from 31 people, 23 of whom were diagnosed with Parkinson's disease. The Heart dataset and the Ionosphere dataset are two binary classification datasets from the UCI machine learning repository. The MNIST 8v9 dataset [33] is a subset of the 784-dimensional MNIST set and contains 2,016 images. The AT&T face recognition dataset contains 400 grayscale images of 40 individuals in 10 different poses; the task on this face dataset is to determine whether two face images belong to the same identity. Additionally, we adopt three UCI regression datasets (i.e., Energy, Housing, and Concrete). For each of them, we first normalize the real-valued output of each instance to [0,1], and then label the top \(30\%\) of the instances as the positive category and the remaining instances as the negative category (a small sketch of this conversion is given below). The statistical information of the adopted datasets is summarized in Table 1.
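The regression-to-classification conversion above is a simple min-max normalization followed by a top-30% threshold. A self-contained sketch with synthetic targets (standing in for the actual UCI outputs) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical real-valued targets standing in for the UCI regression outputs.
y_raw = rng.uniform(0.0, 100.0, size=500)

# Min-max normalize the real-valued outputs to [0, 1].
y_norm = (y_raw - y_raw.min()) / (y_raw.max() - y_raw.min())

# Label the top 30% of instances as positive and the rest as negative.
threshold = np.quantile(y_norm, 0.70)
y = np.where(y_norm >= threshold, 1, -1)
print((y == 1).mean())  # roughly 0.30
```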
Performance. We evaluate the performance of AckMetric by measuring the percentage of successfully generated adversarial pairs that fool the target metric model on the above eight real-world datasets. For each adopted dataset, we first randomly select a subset of instances as the training set, and then randomly sample the testing instance pairs from the remaining instances. The number of training instances and testing instance pairs for each dataset is provided in Table 4 (in the supplementary Appendix). In this experiment, the parameters of each adopted metric learning model are the same as those in its original work. Figure 1 reports the results for the six models on the Parkinson's disease, Heart, Ionosphere, Energy, Housing, and Concrete datasets. Here, we vary \(\epsilon\) from 0 to 0.7. From this figure, we can see that the adopted models are vulnerable to adversarial perturbations and that the proposed AckMetric can easily fool all six models. For example, when \(\epsilon\) is set to 0.6, the attacker can craft adversarial pairs against the adopted metric models with almost \(100\%\) success on the Parkinson's disease dataset. As for the two image datasets (i.e., MNIST 8v9 and AT&T), we vary \(\epsilon\) from 0.05 to 0.15 and report the results for GMML, LMNN, and SCML in Table 2, from which we can see that AckMetric still performs well. By setting \(\epsilon\) to 0.15, the attacker can successfully generate adversarial pairs on 63% of the MNIST 8v9 testing data when the model is SCML, and can make GMML misclassify 54% of the AT&T testing data. All these results show that the models learned by the given metric learning methods are vulnerable to adversarial perturbations, and that the proposed AckMetric can effectively generate adversarial pairs to fool well-learned metric models.
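AckMetric's exact crafting procedure is given in Section 5. Purely to illustrate how such a success rate is computed, the sketch below uses a generic FGSM-style one-step perturbation of a similar pair under a learned Mahalanobis matrix \(\mathbf{M}\) (our own stand-in, not the AckMetric optimization), with \(\epsilon\) bounding the \(\ell_\infty\) norm of each perturbation:

```python
import numpy as np

def pair_dist(M, x1, x2):
    """Squared Mahalanobis distance between x1 and x2 under the PSD matrix M."""
    d = x1 - x2
    return d @ M @ d

def fgsm_pair_attack(M, x1, x2, eps):
    """One signed-gradient step that pushes a similar pair apart, with each
    perturbation bounded by eps in the l-infinity norm. A generic FGSM-style
    baseline for illustration only, not the AckMetric optimization."""
    g = 2.0 * M @ (x1 - x2)          # gradient of pair_dist w.r.t. x1
    return x1 + eps * np.sign(g), x2 - eps * np.sign(g)

def attack_success_rate(M, similar_pairs, eps, tau):
    """Fraction of clean similar pairs (distance <= tau) whose perturbed
    versions exceed the decision threshold tau (predicted dissimilar)."""
    fooled = 0
    for x1, x2 in similar_pairs:
        ax1, ax2 = fgsm_pair_attack(M, x1, x2, eps)
        if pair_dist(M, x1, x2) <= tau < pair_dist(M, ax1, ax2):
            fooled += 1
    return fooled / len(similar_pairs)
```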
Inspection of adversarial dissimilar pairs. Here, we visualize the adversarial dissimilar pairs crafted by the proposed AckMetric. Specifically, for each original (clean) similar instance pair, we aim to generate adversarial perturbations that, when added to the original similar pair, cause a metric learning model to make an incorrect similarity prediction (i.e., to produce an adversarial dissimilar pair). We conduct these experiments on the two adopted image datasets (i.e., the MNIST 8v9 and AT&T datasets), and the parameter \(\epsilon\) is set to 0.01. The reason is that setting \(\epsilon =0.01\) ensures the pairwise adversarial perturbations are imperceptible to humans: the slight perturbations cannot be recognized by human eyes, so the attacker can avoid detection, while the perturbed instance pairs still largely change the pairwise predictions given by the metric learning models. In this way, we can evaluate whether the proposed AckMetric can easily generate adversarial instance pairs with imperceptible changes that fool these models. Figure 2 shows some examples of the adversarial dissimilar pairs generated by AckMetric when LMNN is adopted on the MNIST 8v9 dataset. For simplicity, we take Figure 2(a) as an example to give an intuitive understanding of the generated dissimilar pairs. The bottom row of Figure 2(a) shows the original similar pair of two digit images, which is also treated as a similar pair by LMNN when there are no adversarial perturbations. The middle row shows the adversarial perturbations that are added to the original similar pair to craft the adversarial instance pair. The top row shows the generated adversarial image pair, crafted by adding the perturbations in the middle row to the original similar pair in the bottom row. We can observe that the adversarial dissimilar image pair in the top row is almost identical to the original similar image pair in the bottom row and the added perturbations are imperceptible to humans, yet the crafted adversarial pair successfully misleads LMNN into a wrong similarity prediction. We also visualize the adversarial dissimilar pairs crafted by AckMetric when SCML is adopted on the AT&T dataset; some examples are provided in Figure 3, from which we make similar observations. The visualization results in Figures 2 and 3 further verify that current metric learning models are not robust enough to adversarial perturbations, and that the proposed AckMetric can easily generate adversarial pairs with imperceptible changes to fool these models.
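The three-row layout of these figures (adversarial pair on top, perturbations in the middle, clean pair on the bottom) can be reproduced with a few lines of matplotlib; the helper below is our own illustrative sketch, assuming flattened grayscale images:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_pair_panel(orig_pair, adv_pair, side=28):
    """Render one panel in the style of Figure 2(a): adversarial pair (top),
    pixel-wise perturbations (middle), original clean pair (bottom).
    `orig_pair` and `adv_pair` are length-2 sequences of flattened images."""
    pert_pair = [a - o for a, o in zip(adv_pair, orig_pair)]
    rows = [("adversarial", adv_pair),
            ("perturbation", pert_pair),
            ("original", orig_pair)]
    fig, axes = plt.subplots(3, 2, figsize=(4, 6))
    for r, (title, imgs) in enumerate(rows):
        for c, img in enumerate(imgs):
            axes[r, c].imshow(np.asarray(img).reshape(side, side), cmap="gray")
            axes[r, c].set_axis_off()
        axes[r, 0].set_title(title, loc="left", fontsize=9)
    plt.tight_layout()
    plt.show()
```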
Inspection of adversarial similar pairs. In addition to the crafted dissimilar pairs, we also visualize the adversarial similar image pairs crafted by AckMetric (with the optimization framework described in Section 5.1) when LMNN is adopted on the MNIST 8v9 dataset. Here, \(\epsilon\) is still set to 0.01. Figure 4 shows the crafted adversarial similar instance pairs. For simplicity, we take Figure 4(a) as an example to give an intuitive understanding of the generated similar instance pairs. In Figure 4(a), the image pair in the bottom row is the original clean dissimilar pair, which is also correctly classified by LMNN as dissimilar. The image pair in the top row is the generated adversarial similar pair, which is crafted by the proposed AckMetric and misclassified by LMNN as similar. The perturbations added to the original dissimilar pair to generate the adversarial similar pair are shown in the middle row. From this figure, we can observe that the crafted adversarial similar pair (top row) is visually indistinguishable from the original clean dissimilar pair (bottom row). Importantly, the crafted adversarial similar pair successfully fools LMNN, which further verifies the effectiveness of the proposed AckMetric and shows that current metric learning models are not robust enough to adversarial perturbations.
Crafting adversarial triplets. In addition to generating adversarial pairs, we also evaluate the robustness of current metric learning models by crafting adversarial triplets based on the proposed AckMetric. The details of how to generate adversarial triplets are described in Section C of the supplementary Appendix. Figure 5 reports some examples of the adversarial triplets crafted by AckMetric when LMNN is adopted on the MNIST 8v9 dataset. In this experiment, we randomly select half of the instances in the dataset to train LMNN, and the parameter \(\epsilon\) is set to 0.01. Take Figure 5(a) as an example. The image triplet in the bottom row is the original triplet: based on LMNN, the left image is more similar to the middle image than to the right image. The triplet in the top row is the crafted adversarial triplet, which reverses this relationship under LMNN (i.e., the left image becomes more similar to the right image than to the middle image). The perturbations added to the original triplet to generate the adversarial triplet are shown in the middle row. As we can see, the crafted adversarial triplet is almost the same as the original image triplet and the changes are imperceptible to humans, yet the crafted adversarial triplets can successfully fool LMNN.
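Whether a crafted triplet fools the model reduces to checking that the learned metric's distance ordering is reversed. A small sketch of this check (our own helper, assuming a learned PSD matrix \(\mathbf{M}\)):

```python
import numpy as np

def mahalanobis(M, x, y):
    """Mahalanobis distance under the learned PSD matrix M."""
    d = x - y
    return np.sqrt(d @ M @ d)

def triplet_reversed(M, anchor, positive, negative):
    """True if the original ordering d(anchor, positive) < d(anchor, negative)
    has been reversed, i.e., the model now sees the anchor as more similar
    to `negative` than to `positive`."""
    return mahalanobis(M, anchor, positive) > mahalanobis(M, anchor, negative)
```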
6.2 Evaluating the Proposed Training Objective
In this section, we evaluate the effectiveness of the proposed defense mechanism. Here, we still take the widely adopted pairwise constrained metric learning loss (i.e., Equation (
20)) as an example and evaluate whether the proposed training objective (i.e., Equation (
21)) can improve its robustness to adversarial perturbations.
Baselines. Note that the proposed training objective (i.e., Equation (21)) is derived by adding the derived upper bound to the pairwise constrained loss function (i.e., Equation (20)). In experiments, we compare the proposed training objective with the following three objectives (a numerical sketch of the two regularizers follows the list):
– Normal training (NT-ML): the pairwise constrained loss \(\mathcal {L}_{1}\) with no explicit regularization.
– Spectral norm regularization (Spe-ML): the pairwise constrained loss \(\mathcal {L}_{1}\) plus the regularizer \(\beta _{1} \Vert \mathbf {W}^{T}\mathbf {W} \Vert _{2}\) with \(\beta _{1}=0.2\) (i.e., \(\mathcal {L}_{1}+\beta _{1} \Vert \mathbf {W}^{T}\mathbf {W} \Vert _{2}\)).
– Frobenius norm regularization (Fro-ML): the pairwise constrained loss \(\mathcal {L}_{1}\) plus the regularizer \(\beta _{2} \Vert \mathbf {W}^{T}\mathbf {W} \Vert _{F}\) with \(\beta _{2}=0.2\) (i.e., \(\mathcal {L}_{1}+\beta _{2} \Vert \mathbf {W}^{T}\mathbf {W} \Vert _{F}\)).
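Both baseline regularizers act on the same matrix \(\mathbf {W}^{T}\mathbf {W}\); a quick numpy sketch of the two terms (the spectral norm is the largest singular value, the Frobenius norm the root of the sum of squared entries):

```python
import numpy as np

def spe_regularizer(W, beta1=0.2):
    """Spe-ML term: beta_1 * ||W^T W||_2, the largest singular value of W^T W."""
    return beta1 * np.linalg.norm(W.T @ W, ord=2)

def fro_regularizer(W, beta2=0.2):
    """Fro-ML term: beta_2 * ||W^T W||_F, the Frobenius norm of W^T W."""
    return beta2 * np.linalg.norm(W.T @ W, ord="fro")
```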
The robustness of the proposed training objective. In this experiment, we evaluate the effectiveness of the proposed training objective in the metric learning based classification task. Specifically, for each training objective, we first use the training data to learn a distance metric. Then, we generate an adversarial instance for each instance in the test set based on the attack strategy described in Section 5.2. Finally, we calculate the classification accuracy on the adversarial instances, where the class labels of the adversarial instances are assigned by a kNN classifier. The higher the classification accuracy, the more robust the metric learning model, and hence the more effective the corresponding defense objective.
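This learn-attack-classify pipeline can be sketched as follows, assuming metric-learn's LMNN as the learner, scikit-learn's kNN classifier, and a hypothetical single-instance attack function perturb standing in for the Section 5.2 strategy:

```python
import numpy as np
from metric_learn import LMNN
from sklearn.neighbors import KNeighborsClassifier

def accuracy_under_attack(X_train, y_train, X_test, y_test, perturb, eps):
    """Learn a metric, attack every test instance, and report the kNN
    accuracy in the learned embedding space. `perturb(model, x, eps)` is a
    hypothetical stand-in for the attack strategy of Section 5.2."""
    model = LMNN().fit(X_train, y_train)
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(model.transform(X_train), y_train)   # kNN in the learned space
    X_adv = np.array([perturb(model, x, eps) for x in X_test])
    return knn.score(model.transform(X_adv), y_test)
```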
The parameter settings are detailed in Table 5 (in the supplementary Appendix). We tune the number of eigenvalues used in the certified loss (i.e., Equation (21)), and Table 3 shows the experimental results on the Parkinson's disease, Ionosphere, and Heart datasets. From this table, we can see that the metric learning model without any defense mechanism (i.e., NT-ML) achieves the worst performance under attack: its testing accuracy on the Ionosphere dataset is only 0.26 under attack, which further verifies the vulnerability of the learned metric models. The results also show that the performance of the learned metric models under attack improves once the derived upper bounds are taken into account. Although the proposed training objective (i.e., Equation (21)) performs slightly worse than Spe-ML when there is no attack, it achieves much better performance under attack. When we incorporate the top three eigenvalues, the testing accuracy of the proposed training objective on the Heart dataset is 0.64, while those of Spe-ML and Fro-ML are only 0.57 and 0.52, respectively. These results demonstrate that the proposed training objective (i.e., Equation (21)) makes metric learning more robust against adversarial perturbations.