Abstract
We introduce a deep learning ensemble (NNBits) as a tool for bit-profiling and evaluation of cryptographic (pseudo) random bit sequences. On the one hand, we show how to use the NNBits ensemble to explain parts of the seminal work of Gohr [16]: Gohr’s depth-1 neural distinguisher reaches a test accuracy of 78.3% in round 6 for SPECK32/64 [3]. Using the bit-level information provided by NNBits, we can partially explain this accuracy (78.1% vs. 78.3%): we construct a distinguisher which uses only the information about correct or incorrect predictions on the single-bit level and which achieves 78.1% accuracy. We also generalize two heuristic aspects in the construction of Gohr’s network: i) the particular input structure, which reflects expert knowledge of SPECK32/64, and ii) the cyclic learning rate.
On the other hand, we extend Gohr’s work into a statistical test on avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128, and AES-128. Using the NNBits ensemble together with the extended version of Gohr’s neural network, we draw a comparison with the NIST Statistical Test Suite (NIST STS) on the previously mentioned avalanche datasets and conclude that the NNBits ensemble performs at least as well as the NIST STS. Furthermore, we demonstrate cryptanalytic insights that result from bit-level profiling with NNBits: for example, we show how to infer the strong input difference \((0x0040, 0x0000)\) for SPECK32/64 or a signature of the multiplication in the Galois field of AES-128.
Notes
- 1.
Note that this statement has limited practical implications: even if enough data, sufficient representational power of the network, and sufficient computational resources for training are given, the training itself may be an NP-hard problem [26].
- 2.
This limit has recently been surpassed by human cryptanalysis in [8], giving machine learning a new threshold to overcome.
References
Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)
Géron, A.: Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media (2019). https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/
Bacuieti, N.N., Batina, L., Picek, S.: Deep neural networks aiding cryptanalysis: a case study of the Speck distinguisher. Cryptology ePrint Archive, Report 2022/341, pp. 1–24 (2022). https://eprint.iacr.org/2022/341
Baksi, A., Breier, J., Chen, Y., Dong, X.: Machine learning assisted differential distinguishers for lightweight ciphers. In: 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 176–181 (2021). https://doi.org/10.23919/DATE51398.2021.9474092
Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The SIMON and SPECK families of lightweight block ciphers. National Security Agency (NSA), 9800 Savage Road, Fort Meade, MD 20755, USA (2013)
Bellini, E., Rossi, M.: Performance comparison between deep learning-based and conventional cryptographic distinguishers. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 285, pp. 681–701. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80129-8_48
Benamira, A., Gerault, D., Peyrin, T., Tan, Q.Q.: A deeper look at machine learning-based cryptanalysis. In: Canteaut, A., Standaert, F.-X. (eds.) EUROCRYPT 2021. LNCS, vol. 12696, pp. 805–835. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77870-5_28
Biryukov, A., dos Santos, L.C., Teh, J.S., Udovenko, A., Velichkov, V.: Meet-in-the-filter and dynamic counting with applications to Speck. Cryptology ePrint Archive (2022)
Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996). https://doi.org/10.1023/A:1018054314350
Brown, R.G.: DieHarder: a GNU public license random number tester. Duke University Physics Department, Durham, NC 27708-0305 (2006). http://www.phy.duke.edu/~rgb/General/dieharder.php
Castro, J.C.H., Sierra, J.M., Seznec, A., Izquierdo, A., Ribagorda, A.: The strict avalanche criterion randomness test. Math. Comput. Simul. 68(1), 1–7 (2005). https://doi.org/10.1016/j.matcom.2004.09.001
Daemen, J., Hoffert, S., Van Assche, G., Van Keer, R.: The design of Xoodoo and Xoofff. IACR Trans. Symmetric Cryptol. 2018(4), 1–38 (2018). https://doi.org/10.13154/tosc.v2018.i4.1-38
Daemen, J., Rijmen, V.: AES proposal: Rijndael (1999). https://www.cs.miami.edu/home/burt/learning/Csc688.012/rijndael/rijndael_doc_V2.pdf
Feistel, H.: Cryptography and computer privacy. Sci. Am. 228(5), 15–23 (1973)
Gohr, A.: Deep Speck (2019). https://github.com/agohr/deep_speck
Gohr, A.: Improving attacks on round-reduced Speck32/64 using deep learning. In: Boldyreva, A., Micciancio, D. (eds.) CRYPTO 2019. LNCS, vol. 11693, pp. 150–179. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26951-7_6
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 19. The MIT Press (2017). https://mitpress.mit.edu/books/deep-learning
Gunning, D., Vorm, E., Wang, J.Y., Turek, M.: Darpa’s explainable AI (XAI) program: a retrospective. Appl. AI Lett. 2, e61 (2021). https://doi.org/10.1002/AIL2.61
Gustafson, H., Dawson, E., Golić, J.D.: Automated statistical methods for measuring the strength of block ciphers. Stat. Comput. 7(2), 125–135 (1997)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991). https://doi.org/10.1016/0893-6080(91)90009-T
Hou, Z., Ren, J., Chen, S., Fu, A.: Improve neural distinguishers of SIMON and SPECK. Secur. Commun. Netw. 2021 (2021). https://doi.org/10.1155/2021/9288229
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018). https://doi.org/10.1109/CVPR.2018.00745
Langford, S.K., Hellman, M.E.: Differential-linear cryptanalysis. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 17–25. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48658-5_3
L’Ecuyer, P., Simard, R.: TestU01: a software library in ANSI C for empirical testing of random number generators, software user’s guide. Département d’Informatique et Recherche opérationnelle, Université de Montréal, Montréal, Québec, Canada (2001). http://www.iro.umontreal.ca/~simardr/TestU01.zip
Livni, R., Shalev-Shwartz, S., Shamir, O.: On the computational efficiency of training neural networks. Adv. Neural Inf. Process. Syst. 27, 855–863 (2014). https://papers.nips.cc/paper/2014/hash/3a0772443a0739141292a5429b952fe6-Abstract.html
Makridakis, S., Spiliotis, E., Assimakopoulos, V.: The M4 competition: 100,000 time series and 61 forecasting methods. Int. J. Forecast. 36(1), 54–74 (2020). https://doi.org/10.1016/j.ijforecast.2019.04.014
Moritz, P., et al.: Ray: a distributed framework for emerging AI applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2018), pp. 561–577 (2018)
Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y.: N-BEATS: neural basis expansion analysis for interpretable time series forecasting. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. OpenReview.net (2020). https://openreview.net/forum?id=r1ecqn4YwB
Reddi, S.J., Kale, S., Kumar, S.: On the convergence of Adam and beyond. arXiv preprint arXiv:1904.09237 (2019)
Rukhin, A., et al.: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. NIST (2010)
Schrittwieser, J., et al.: Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588(7839), 604–609 (2020). https://doi.org/10.1038/s41586-020-03051-4
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings (2015)
Soto, J.: Randomness testing of the advanced encryption standard candidate algorithms. NIST Interagency/Internal Report (NISTIR) (1999). http://www.nist.gov/customcf/get_pdf.cfm?pub_id=151193
Soto, J., Bassham, L.: Randomness testing of the advanced encryption standard finalist candidates. NIST Interagency/Internal Report (NISTIR) (2000). https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151216
Švenda, P., Ukrop, M., Matyáš, V.: Determining cryptographic distinguishers for eStream and SHA-3 candidate functions with evolutionary circuits. In: Obaidat, M.S., Filipe, J. (eds.) ICETE 2013. CCIS, vol. 456, pp. 290–305. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44788-8_17
Team, P.: PyTorch ResNet implementation (2022). https://pytorch.org/hub/pytorch_vision_resnet/
Team, R.: Ray (2022). https://github.com/ray-project/ray
Technology Innovation Institute (TII): Crypto-TII NNBits (2022). https://github.com/Crypto-TII/nnbits
Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2
Walker, J.: ENT: a pseudorandom number sequence test program. Web site (2008). http://www.fourmilab.ch/random/
Yadav, T., Kumar, M.: Differential-ML distinguisher: machine learning based generic extension for differential cryptanalysis. In: Longa, P., Ràfols, C. (eds.) LATINCRYPT 2021. LNCS, vol. 12912, pp. 191–212. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88238-9_10
Appendices
A Details for the Generalized Network Experiment
We apply the Generalized network as a distinguisher to the avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128 and AES-128 in settings which are advantageous for machine learning. Table 6 summarizes the experimental settings for each cipher. We generate X bit sequences of the length of the avalanche unit of the respective cipher. A randomly chosen half of the inputs X has the label \(Y=0\) and contains random data; the other half has the label \(Y=1\) and contains avalanche units of the cipher, that is, non-random data. A Generalized network, as presented in Sect. 4.2, is trained on a subset \(X_\textrm{train}\) to predict the labels \(Y_\textrm{train}\) for 10 epochs. Subsequently, previously unseen data \(X_\textrm{test}\) is used to evaluate the accuracy A of the distinguisher.
Table 6 summarizes the avalanche unit bit sizes, the number of avalanche units for training \(X_\textrm{train}\) and testing \(X_\textrm{test}\), as well as the distinguisher’s accuracy A for relevant rounds. The accuracy is given as the mean and standard deviation over four runs of the previously described experiment.
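To make the procedure concrete, the following Python fragment sketches the dataset construction and the train/evaluate loop described above. It is a minimal illustration under stated assumptions, not the paper’s pipeline: placeholder_model is a hypothetical stand-in for the Generalized network of Sect. 4.2, and the demo input is random rather than real avalanche data (the actual implementation is available in the Crypto-TII nnbits repository).

```python
import numpy as np
import tensorflow as tf

def make_dataset(avalanche_units: np.ndarray, seed: int = 0):
    """Balanced dataset: label 1 = cipher avalanche units,
    label 0 = uniformly random bit sequences of the same length."""
    rng = np.random.default_rng(seed)
    n, n_bits = avalanche_units.shape
    random_units = rng.integers(0, 2, size=(n, n_bits), dtype=np.uint8)
    X = np.concatenate([avalanche_units, random_units]).astype(np.float32)
    Y = np.concatenate([np.ones(n), np.zeros(n)]).astype(np.float32)
    perm = rng.permutation(len(X))  # shuffle samples and labels together
    return X[perm], Y[perm]

def placeholder_model(n_bits: int) -> tf.keras.Model:
    """Hypothetical MLP stand-in -- NOT the paper's Generalized network."""
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(n_bits,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Demo with random "avalanche units" in place of real cipher data, so the
# expected accuracy here is ~0.5; real units come from encrypting inputs
# and their single-bit-flipped variants and XORing the ciphertexts.
units = np.random.default_rng(1).integers(0, 2, size=(10_000, 1024), dtype=np.uint8)
X, Y = make_dataset(units)
split = int(0.9 * len(X))
model = placeholder_model(X.shape[1])
model.fit(X[:split], Y[:split], epochs=10, batch_size=500, verbose=0)
_, acc = model.evaluate(X[split:], Y[split:], verbose=0)  # accuracy A
print(f"distinguisher accuracy A = {acc:.4f}")
```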
B Details of NIST Results
C Details of NNBits Results
D Bit Profiles of AES-128
D.1 AES Round 1/10 Bit Pattern
The previous analysis of SPECK32/64 has shown a particular region of weak bits. In AES-128, however, we find repeating patterns of weak and strong bits in rounds 1 and 2 in the 128-bit sub-blocks of the avalanche unit.
Figure 13 shows details of the patterns observed after one round of AES-128. The complete avalanche unit of AES-128 consists of \(128 \times 128=16,384\) bits. We analyze the complete avalanche unit in blocks of 128 bits (Fig. 13a) and can identify four recurring patterns of weak and strong bits that occur throughout the avalanche unit. For example, one of these patterns occurs in the avalanche blocks \(s=0\) to \(s=7\) (Fig. 13b). Exemplary sections of the distributions of weak and strong bit patterns are shown in Fig. 13c.
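As a sketch of this block-wise analysis, the following fragment groups the 128 sub-blocks by their weak/strong bit pattern. The input array bit_acc (per-bit ensemble accuracies) and its file name are assumptions for illustration.

```python
import numpy as np

# bit_acc: per-bit prediction accuracies from the NNBits ensemble over the
# 128 x 128 = 16,384-bit avalanche unit (hypothetical file name).
bit_acc = np.load("aes128_round1_bit_accuracies.npy")

blocks = bit_acc.reshape(128, 128)  # row s = 128-bit sub-block s

# Binarize into weak (< 100% accuracy) vs. strong (= 100%) bits, then group
# the sub-blocks by their weak/strong pattern.
weak = blocks < 1.0
patterns, inverse = np.unique(weak, axis=0, return_inverse=True)
print(f"{len(patterns)} distinct patterns")  # 4 after round 1 (Fig. 13)
for p in range(len(patterns)):
    print(f"pattern {p}: sub-blocks s =", np.flatnonzero(inverse == p))
```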
After one round of AES-128, 96 consecutive bits of the 128 bits in each sub-block can be predicted with 100% accuracy. The remaining 32 bits (4 bytes) can be predicted with less than 100% accuracy, which can be understood as follows. The round function of the AES is such that changing one byte in the input results in differences in one column of the output after one round (with the other columns remaining undisturbed). This is a well-known property of the AES, due in particular to the MDS property of the MixColumns operation. For the avalanche dataset, this implies that for each sub-block of 128 bits (corresponding to one input difference bit), 4 bytes (one column) are nonzero, while the remaining bytes are all zero.
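This diffusion behavior can be checked with a small truncated-difference sketch that tracks only which state bytes are active (i.e., may differ between the two encryptions) through one AES round: SubBytes and AddRoundKey never change which bytes are active, ShiftRows permutes them, and MixColumns activates a whole column if any of its bytes is active.

```python
# Truncated-difference propagation through one AES round: a byte is
# "active" (True) if it can differ between the two encryptions.
def sub_bytes(state):   # the S-box keeps a byte active iff it was active
    return state

def shift_rows(state):  # rotate row r left by r positions
    return [state[r][r:] + state[r][:r] for r in range(4)]

def mix_columns(state):  # a column becomes fully active if any byte is active
    out = [[False] * 4 for _ in range(4)]
    for c in range(4):
        col_active = any(state[r][c] for r in range(4))
        for r in range(4):
            out[r][c] = col_active
    return out

def one_round(state):    # AddRoundKey never changes activity
    return mix_columns(shift_rows(sub_bytes(state)))

# Flip one input byte: exactly one column is active after one round.
state = [[False] * 4 for _ in range(4)]
state[1][2] = True       # single active byte at row 1, column 2
for row in one_round(state):
    print(["X" if a else "." for a in row])   # one fully active column
```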
The distribution of patterns of Fig. 13a and Fig. 13b is still observable after two rounds of AES (we show the equivalent 2-round patterns in the appendix, Fig. 14c). When encrypting for two rounds, each of the nonzero bytes of round 1 is sent to a different column through the ShiftRows operation, and then propagated to a whole column through MixColumns, so that after two rounds, all the bytes of the dataset are nonzero. Furthermore, there are relations between the bytes of each column: MixColumns applies a linear transformation to a 4-byte column, and by construction, only one byte is nonzero in each column. Therefore, the resulting values are multiples (in the Galois field of AES) of a single variable, with the coefficients (2, 3, 1, 1), in an order that depends on the position of the 128-bit block in the avalanche dataset. The bytes with coefficient 1 are consistently predicted, whereas only some bits of the bytes with coefficients 2 and 3 are reliably predicted. This explains the peculiar pattern observed in the prediction, where for each group of 4 bytes, there are peaks for 2 bytes and for some of the bits among the remaining 2 bytes.
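The (2, 3, 1, 1) signature can be reproduced with the standard AES field arithmetic: applying MixColumns to a column whose only nonzero byte is x yields the multiples 2x, 3x, x, x in GF(2^8).

```python
def xtime(b):
    """Multiply by x (i.e., by 2) in GF(2^8), reducing modulo the AES
    polynomial x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b & 0x100 else b

def gmul(a, b):
    """Multiply a and b in the AES Galois field (Russian-peasant style)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

# MixColumns applied to a column with a single nonzero byte x yields the
# multiples (2x, 3x, x, x), in a rotation that depends on the byte's row.
x = 0x57
column = [gmul(c, x) for c in (2, 3, 1, 1)]
print([hex(v) for v in column])   # 0x57 -> ['0xae', '0xf9', '0x57', '0x57']
```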
D.2 AES Round 2/10 Bit Pattern
Please see Appendix D for the context of the analysis shown in Fig. 14.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hambitzer, A., Gerault, D., Huang, Y.J., Aaraj, N., Bellini, E. (2023). NNBits: Bit Profiling with a Deep Learning Ensemble Based Distinguisher. In: Rosulek, M. (ed.) Topics in Cryptology – CT-RSA 2023. CT-RSA 2023. Lecture Notes in Computer Science, vol 13871. Springer, Cham. https://doi.org/10.1007/978-3-031-30872-7_19
DOI: https://doi.org/10.1007/978-3-031-30872-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30871-0
Online ISBN: 978-3-031-30872-7