Abstract
When deploying classifiers in the real world, users expect them to respond to inputs appropriately. However, traditional classifiers are not equipped to handle inputs which lie far from the distribution they were trained on. Malicious actors can exploit this defect by crafting adversarial perturbations designed to cause the classifier to give an incorrect output. Classification-with-rejection methods attempt to solve this problem by allowing networks to refuse to classify inputs for which they have low confidence. This works well for strongly adversarial examples, but it also leads to the rejection of weakly perturbed images, which intuitively could be correctly classified. To address these issues, we propose Reed-Muller Aggregation Networks (RMAggNet), a classifier inspired by Reed-Muller error-correcting codes which can both correct errors and reject inputs. This paper shows that, by leveraging the ability to correct errors in the classification process, RMAggNet can minimise incorrectness while maintaining good correctness over multiple adversarial attacks at different perturbation budgets. This provides an alternative classification-with-rejection method which can reduce the amount of additional processing in situations where a small number of incorrect classifications is permissible.
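To make the correct-or-reject idea concrete, the sketch below shows an error-correcting-output-code style decoder in the spirit of RMAggNet: each class is assigned a binary codeword, an ensemble of binary classifiers contributes one bit each, and the aggregated bit-string is either corrected to the nearest codeword (within a small Hamming-distance budget) or rejected. This is a minimal illustration, not the paper's actual construction; the codebook, code length, and correction threshold here are hypothetical placeholders rather than a genuine Reed-Muller code.

```python
# Illustrative ECOC-style decoding with correction and rejection.
# The codebook and threshold are assumptions chosen for clarity; RMAggNet
# derives its codewords from a Reed-Muller code and trains one
# set-membership network per bit.
import numpy as np

# Hypothetical codebook: one binary codeword per class (minimum Hamming
# distance 4, so a single bit error can be corrected).
CODEBOOK = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
])  # shape: (num_classes, num_bits)

def decode(bits: np.ndarray, max_correctable: int = 1):
    """Map an ensemble's thresholded bit-string to a class label.

    bits            -- binary vector produced by the ensemble, one bit per network.
    max_correctable -- bit errors we are willing to correct; beyond this the
                       input is rejected (returns None).
    """
    distances = np.sum(CODEBOOK != bits, axis=1)  # Hamming distance to each codeword
    best = int(np.argmin(distances))
    if distances[best] <= max_correctable:
        return best   # exact match, or recoverable within the error budget
    return None       # too far from every codeword: reject the input

# One flipped bit is corrected; a heavily corrupted bit-string is rejected.
print(decode(np.array([1, 1, 1, 0, 0, 0, 0, 0])))  # -> 1 (corrected)
print(decode(np.array([1, 0, 0, 1, 0, 1, 0, 0])))  # -> None (rejected)
```

Under this view, rejection corresponds to a bit-string that lies outside every codeword's correction radius, while weakly perturbed inputs that flip only a few bits can still be recovered rather than discarded.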
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fentham, D., Parker, D., Ryan, M. (2024). Using Reed-Muller Codes for Classification with Rejection and Recovery. In: Mosbah, M., Sèdes, F., Tawbi, N., Ahmed, T., Boulahia-Cuppens, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2023. Lecture Notes in Computer Science, vol 14551. Springer, Cham. https://doi.org/10.1007/978-3-031-57537-2_3
DOI: https://doi.org/10.1007/978-3-031-57537-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57536-5
Online ISBN: 978-3-031-57537-2
eBook Packages: Computer Science, Computer Science (R0)