Abstract
We introduce a deep learning ensemble (NNBits) as a tool for bit-profiling and evaluation of cryptographic (pseudo) random bit sequences. On the one hand, we show how to use the NNBits ensemble to explain parts of the seminal work of Gohr [16]: Gohr’s depth-1 neural distinguisher reaches a test accuracy of 78.3% in round 6 for SPECK32/64 [3]. Using the bit-level information provided by NNBits, we can partially explain this accuracy (78.1% vs. 78.3%): we construct a distinguisher which uses only the information about correct or incorrect predictions on the single-bit level and which achieves 78.1% accuracy. We also generalize two heuristic aspects in the construction of Gohr’s network: i) the particular input structure, which reflects expert knowledge of SPECK32/64, and ii) the cyclic learning rate.
On the other hand, we extend Gohr’s work into a statistical test on avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128, and AES-128. Using the NNBits ensemble together with the extended version of Gohr’s neural network, we draw a comparison with the NIST Statistical Test Suite (NIST STS) on the previously mentioned avalanche datasets and conclude that the NNBits ensemble performs at least as well as the NIST STS. Furthermore, we demonstrate cryptanalytic insights that result from bit-level profiling with NNBits: for example, we show how to infer the strong input difference \((0x0040, 0x0000)\) for SPECK32/64 or a signature of the multiplication in the Galois field of AES-128.
Notes
- 1.
Note that this statement has limited practical implications: even if enough data, sufficient representational power of the network, and sufficient computational resources for training are given, the training itself may be an NP-hard problem [26].
- 2.
This limit has recently been surpassed by human cryptanalysis in [8], giving machine learning a new threshold to overcome.
References
Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)
Géron, A.: Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media (2019). https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/
Bacuieti, N.N., Batina, L., Picek, S.: Deep neural networks aiding cryptanalysis: a case study of the Speck distinguisher. Cryptology ePrint Archive, Report 2022/341, pp. 1–24 (2022). https://eprint.iacr.org/2022/341
Baksi, A., Breier, J., Chen, Y., Dong, X.: Machine learning assisted differential distinguishers for lightweight ciphers. In: 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 176–181 (2021). https://doi.org/10.23919/DATE51398.2021.9474092
Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The SIMON and SPECK families of lightweight block ciphers. National Security Agency (NSA), 9800 Savage Road, Fort Meade, MD 20755, USA (2013)
Bellini, E., Rossi, M.: Performance comparison between deep learning-based and conventional cryptographic distinguishers. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 285, pp. 681–701. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80129-8_48
Benamira, A., Gerault, D., Peyrin, T., Tan, Q.Q.: A deeper look at machine learning-based cryptanalysis. In: Canteaut, A., Standaert, F.-X. (eds.) EUROCRYPT 2021. LNCS, vol. 12696, pp. 805–835. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77870-5_28
Biryukov, A., dos Santos, L.C., Teh, J.S., Udovenko, A., Velichkov, V.: Meet-in-the-filter and dynamic counting with applications to Speck. Cryptology ePrint Archive (2022)
Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996). https://doi.org/10.1023/A:1018054314350
Brown, R.G.: DieHarder: a GNU public license random number tester. Duke University Physics Department, Durham, NC 27708-0305 (2006). http://www.phy.duke.edu/~rgb/General/dieharder.php
Castro, J.C.H., Sierra, J.M., Seznec, A., Izquierdo, A., Ribagorda, A.: The strict avalanche criterion randomness test. Math. Comput. Simul. 68(1), 1–7 (2005). https://doi.org/10.1016/j.matcom.2004.09.001
Daemen, J., Hoffert, S., Van Assche, G., Van Keer, R.: The design of Xoodoo and Xoofff. IACR Trans. Symmetric Cryptol. 2018(4), 1–38 (2018). https://doi.org/10.13154/tosc.v2018.i4.1-38
Daemen, J., Rijmen, V.: AES proposal: Rijndael (1999). https://www.cs.miami.edu/home/burt/learning/Csc688.012/rijndael/rijndael_doc_V2.pdf
Feistel, H.: Cryptography and computer privacy. Sci. Am. 228(5), 15–23 (1973)
Gohr, A.: Deep Speck (2019). https://github.com/agohr/deep_speck
Gohr, A.: Improving attacks on round-reduced Speck32/64 using deep learning. In: Boldyreva, A., Micciancio, D. (eds.) CRYPTO 2019. LNCS, vol. 11693, pp. 150–179. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26951-7_6
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 19. The MIT Press (2017). https://mitpress.mit.edu/books/deep-learning
Gunning, D., Vorm, E., Wang, J.Y., Turek, M.: Darpa’s explainable AI (XAI) program: a retrospective. Appl. AI Lett. 2, e61 (2021). https://doi.org/10.1002/AIL2.61
Gustafson, H., Dawson, E., Golić, J.D.: Automated statistical methods for measuring the strength of block ciphers. Stat. Comput. 7(2), 125–135 (1997)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991). https://doi.org/10.1016/0893-6080(91)90009-T
Hou, Z., Ren, J., Chen, S., Fu, A.: Improve neural distinguishers of SIMON and SPECK. Secur. Commun. Netw. 2021 (2021). https://doi.org/10.1155/2021/9288229
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018). https://doi.org/10.1109/CVPR.2018.00745
Langford, S.K., Hellman, M.E.: Differential-linear cryptanalysis. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 17–25. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48658-5_3
L’Ecuyer, P., Simard, R.: TestU01: a software library in ANSI C for empirical testing of random number generators, software user’s guide. Département d’Informatique et Recherche opérationnelle, Université de Montréal, Montréal, Québec, Canada (2001). http://www.iro.umontreal.ca/~simardr/TestU01.zip
Livni, R., Shalev-Shwartz, S., Shamir, O.: On the computational efficiency of training neural networks. Adv. Neural Inf. Process. Syst. 27, 855–863 (2014). https://papers.nips.cc/paper/2014/hash/3a0772443a0739141292a5429b952fe6-Abstract.html
Makridakis, S., Spiliotis, E., Assimakopoulos, V.: The M4 competition: 100,000 time series and 61 forecasting methods. Int. J. Forecast. 36(1), 54–74 (2020). https://doi.org/10.1016/j.ijforecast.2019.04.014
Moritz, P., et al.: Ray: a distributed framework for emerging AI applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2018), pp. 561–577 (2018)
Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y.: N-BEATS: neural basis expansion analysis for interpretable time series forecasting. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. OpenReview.net (2020). https://openreview.net/forum?id=r1ecqn4YwB
Reddi, S.J., Kale, S., Kumar, S.: On the convergence of Adam and beyond. arXiv preprint arXiv:1904.09237 (2019)
Rukhin, A., et al.: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. NIST (2010)
Schrittwieser, J., et al.: Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588(7839), 604–609 (2020). https://doi.org/10.1038/s41586-020-03051-4
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings (2015)
Soto, J.: Randomness testing of the advanced encryption standard candidate algorithms. NIST Interagency/Internal Report (NISTIR) (1999). http://www.nist.gov/customcf/get_pdf.cfm?pub_id=151193
Soto, J., Bassham, L.: Randomness testing of the advanced encryption standard finalist candidates. NIST Interagency/Internal Report (NISTIR) (2000). https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151216
Švenda, P., Ukrop, M., Matyáš, V.: Determining cryptographic distinguishers for eStream and SHA-3 candidate functions with evolutionary circuits. In: Obaidat, M.S., Filipe, J. (eds.) ICETE 2013. CCIS, vol. 456, pp. 290–305. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44788-8_17
Team, P.: PyTorch ResNet implementation (2022). https://pytorch.org/hub/pytorch_vision_resnet/
Team, R.: Ray (2022). https://github.com/ray-project/ray
Technology Innovation Institute (TII): Crypto-TII NNBits (2022). https://github.com/Crypto-TII/nnbits
Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2
Walker, J.: ENT: a pseudorandom number sequence test program. Web site (2008). http://www.fourmilab.ch/random/
Yadav, T., Kumar, M.: Differential-ML distinguisher: machine learning based generic extension for differential cryptanalysis. In: Longa, P., Ràfols, C. (eds.) LATINCRYPT 2021. LNCS, vol. 12912, pp. 191–212. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88238-9_10
Appendices
A Details for the Generalized Network Experiment
We apply the Generalized network as a distinguisher to the avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128 and AES-128 in settings which are advantageous for machine learning. Table 6 summarizes the experimental settings for each cipher. We generate X bit sequences of the length of the avalanche unit of the respective cipher. A randomly chosen half of the inputs X has the label \(Y=0\) and contains random data; the other half has the label \(Y=1\) and contains avalanche units of the cipher, that is, non-random data. A Generalized network, as presented in Sect. 4.2, is trained on a subset \(X_\textrm{train}\) to predict the labels \(Y_\textrm{train}\) for 10 epochs. Subsequently, previously unseen data \(X_\textrm{test}\) is used to evaluate the accuracy A of the distinguisher.
Table 6 summarizes the avalanche unit bit sizes, the number of avalanche units for training \(X_\textrm{train}\) and testing \(X_\textrm{test}\), as well as the distinguisher’s accuracy A for relevant rounds. The accuracy is given as the mean and standard deviation over four runs of the previously described experiment.
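To make the procedure concrete, the following Python fragment sketches the dataset construction and the train/evaluate loop described above. It is a minimal illustration under stated assumptions, not the paper’s pipeline: placeholder_model is a hypothetical stand-in for the Generalized network of Sect. 4.2, and the demo input is random rather than real avalanche data (the actual implementation is available in the Crypto-TII nnbits repository).

```python
import numpy as np
import tensorflow as tf

def make_dataset(avalanche_units: np.ndarray, seed: int = 0):
    """Balanced dataset: label 1 = cipher avalanche units,
    label 0 = uniformly random bit sequences of the same length."""
    rng = np.random.default_rng(seed)
    n, n_bits = avalanche_units.shape
    random_units = rng.integers(0, 2, size=(n, n_bits), dtype=np.uint8)
    X = np.concatenate([avalanche_units, random_units]).astype(np.float32)
    Y = np.concatenate([np.ones(n), np.zeros(n)]).astype(np.float32)
    perm = rng.permutation(len(X))  # shuffle samples and labels together
    return X[perm], Y[perm]

def placeholder_model(n_bits: int) -> tf.keras.Model:
    """Hypothetical MLP stand-in -- NOT the paper's Generalized network."""
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(n_bits,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Demo with random "avalanche units" in place of real cipher data, so the
# expected accuracy here is ~0.5; real units come from encrypting inputs
# and their single-bit-flipped variants and XORing the ciphertexts.
units = np.random.default_rng(1).integers(0, 2, size=(10_000, 1024), dtype=np.uint8)
X, Y = make_dataset(units)
split = int(0.9 * len(X))
model = placeholder_model(X.shape[1])
model.fit(X[:split], Y[:split], epochs=10, batch_size=500, verbose=0)
_, acc = model.evaluate(X[split:], Y[split:], verbose=0)  # accuracy A
print(f"distinguisher accuracy A = {acc:.4f}")
```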
B Details of NIST Results
C Details of NNBits Results
D Bit Profiles of AES-128
D.1 AES Round 1/10 Bit Pattern
The previous analysis of SPECK32/64 has shown a particular region of weak bits. In AES-128, however, we find repeating patterns of weak and strong bits in rounds 1 and 2 in the 128-bit sub-blocks of the avalanche unit.
Figure 13 shows details of the patterns observed after one round of AES-128. The complete avalanche unit of AES-128 consists of \(128 \times 128=16,384\) bits. We analyze the complete avalanche unit in blocks of 128 bits (Fig. 13a) and can identify four recurring patterns of weak and strong bits that occur throughout the avalanche unit. For example, one of these patterns occurs in the avalanche blocks \(s=0\) to \(s=7\) (Fig. 13b). Exemplary sections of the distributions of weak and strong bit patterns are shown in Fig. 13c.
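As a sketch of this block-wise analysis, the following fragment groups the 128 sub-blocks by their weak/strong bit pattern. The input array bit_acc (per-bit ensemble accuracies) and its file name are assumptions for illustration.

```python
import numpy as np

# bit_acc: per-bit prediction accuracies from the NNBits ensemble over the
# 128 x 128 = 16,384-bit avalanche unit (hypothetical file name).
bit_acc = np.load("aes128_round1_bit_accuracies.npy")

blocks = bit_acc.reshape(128, 128)  # row s = 128-bit sub-block s

# Binarize into weak (< 100% accuracy) vs. strong (= 100%) bits, then group
# the sub-blocks by their weak/strong pattern.
weak = blocks < 1.0
patterns, inverse = np.unique(weak, axis=0, return_inverse=True)
print(f"{len(patterns)} distinct patterns")  # 4 after round 1 (Fig. 13)
for p in range(len(patterns)):
    print(f"pattern {p}: sub-blocks s =", np.flatnonzero(inverse == p))
```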
After one round of AES-128, 96 consecutive bits of the 128 bits in each sub-block can be predicted with 100% accuracy. The remaining 32 bits (4 bytes) can be predicted with less than 100% accuracy, which can be understood as follows. The round function of the AES is such that changing one byte in the input results in differences in one column of the output after one round (with the other columns remaining undisturbed). This is a well-known property of the AES, due in particular to the MDS property of the MixColumns operation. For the avalanche dataset, this implies that for each sub-block of 128 bits (corresponding to one input difference bit), 4 bytes (one column) are nonzero, while the remaining bytes are all zero.
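This diffusion behavior can be checked with a small truncated-difference sketch that tracks only which state bytes are active (i.e., may differ between the two encryptions) through one AES round: SubBytes and AddRoundKey never change which bytes are active, ShiftRows permutes them, and MixColumns activates a whole column if any of its bytes is active.

```python
# Truncated-difference propagation through one AES round: a byte is
# "active" (True) if it can differ between the two encryptions.
def sub_bytes(state):   # the S-box keeps a byte active iff it was active
    return state

def shift_rows(state):  # rotate row r left by r positions
    return [state[r][r:] + state[r][:r] for r in range(4)]

def mix_columns(state):  # a column becomes fully active if any byte is active
    out = [[False] * 4 for _ in range(4)]
    for c in range(4):
        col_active = any(state[r][c] for r in range(4))
        for r in range(4):
            out[r][c] = col_active
    return out

def one_round(state):    # AddRoundKey never changes activity
    return mix_columns(shift_rows(sub_bytes(state)))

# Flip one input byte: exactly one column is active after one round.
state = [[False] * 4 for _ in range(4)]
state[1][2] = True       # single active byte at row 1, column 2
for row in one_round(state):
    print(["X" if a else "." for a in row])   # one fully active column
```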
The distribution of patterns of Fig. 13a and Fig. 13b is still observable after two rounds of AES (we show the equivalent 2-round patterns in the appendix, Fig. 14c). When encrypting for two rounds, each of the nonzero bytes of round 1 is sent to a different column through the ShiftRows operation, and then propagated to a whole column through MixColumns, so that after two rounds, all the bytes of the dataset are nonzero. Furthermore, there are relations between the bytes of each column: MixColumns applies a linear transformation to a 4-byte column, and by construction, only one byte is nonzero in each column. Therefore, the resulting values are multiples (in the Galois field of AES) of a single variable, with the coefficients (2, 3, 1, 1), in an order that depends on the position of the 128-bit block in the avalanche dataset. The bytes with coefficient 1 are consistently predicted, whereas only some bits of the bytes with coefficients 2 and 3 are reliably predicted. This explains the peculiar pattern observed in the prediction, where for each group of 4 bytes, there are peaks for 2 bytes and for some of the bits among the remaining 2 bytes.
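The (2, 3, 1, 1) signature can be reproduced with the standard AES field arithmetic: applying MixColumns to a column whose only nonzero byte is x yields the multiples 2x, 3x, x, x in GF(2^8).

```python
def xtime(b):
    """Multiply by x (i.e., by 2) in GF(2^8), reducing modulo the AES
    polynomial x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b & 0x100 else b

def gmul(a, b):
    """Multiply a and b in the AES Galois field (Russian-peasant style)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

# MixColumns applied to a column with a single nonzero byte x yields the
# multiples (2x, 3x, x, x), in a rotation that depends on the byte's row.
x = 0x57
column = [gmul(c, x) for c in (2, 3, 1, 1)]
print([hex(v) for v in column])   # 0x57 -> ['0xae', '0xf9', '0x57', '0x57']
```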
D.2 AES Round 2/10 Bit Pattern
Please see Appendix D for the context of the analysis shown in Fig. 14.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hambitzer, A., Gerault, D., Huang, Y.J., Aaraj, N., Bellini, E. (2023). NNBits: Bit Profiling with a Deep Learning Ensemble Based Distinguisher. In: Rosulek, M. (ed.) Topics in Cryptology – CT-RSA 2023. CT-RSA 2023. Lecture Notes in Computer Science, vol 13871. Springer, Cham. https://doi.org/10.1007/978-3-031-30872-7_19
DOI: https://doi.org/10.1007/978-3-031-30872-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30871-0
Online ISBN: 978-3-031-30872-7