Quantum autoencoders for communication-efficient cloud computing

Zhu, Yan; Bai, Ge; Wang, Yuexuan; Li, Tongyang; Chiribella, Giulio

doi:10.1007/s42484-023-00112-5

Quantum autoencoders for communication-efficient cloud computing

Research
Published: 10 July 2023

Volume 5, article number 27, (2023)
Cite this article

Quantum Machine Intelligence Aims and scope Submit manuscript

Yan Zhu¹,
Ge Bai¹,
Yuexuan Wang^2,3,
Tongyang Li^4,5,6 &
…
Giulio Chiribella^1,7,8,9

299 Accesses
4 Citations
Explore all metrics

Abstract

In the model of quantum cloud computing, the server executes a computation on the quantum data provided by the client. In this scenario, it is important to reduce the amount of quantum communication between the client and the server. A possible approach is to transform the desired computation into a compressed version that acts on a smaller number of qubits, thereby reducing the amount of data exchanged between the client and the server. Here we propose quantum autoencoders for quantum gates (QAEGate) as a method for compressing quantum computations. We illustrate it in concrete scenarios of single-round and multi-round communication and validate it through numerical experiments. A bonus of our method is it does not reveal any information about the server’s computation other than the information present in the output.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 7

Fig. 10

Quantum image compression with autoencoders based on parameterized quantum circuits

Article 30 January 2024

Denoising quantum mixed states using quantum autoencoders

Article 25 January 2024

Practical distributed quantum information processing with LOCCNet

Article Open access 04 November 2021

Discover the latest articles, news and stories from top researchers in related subjects.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

Preskill J (2018) Quantum computing in the NISQ era and beyond. Quantum 2:79
Article Google Scholar
Arute F et al (2019) Quantum supremacy using a programmable superconducting processor. Nature 574(7779):505–510
Article Google Scholar
Zhong H-S et al (2020) Quantum computational advantage using photons. Science 370(6523):1460–1463
Article Google Scholar
Broadbent A, Fitzsimons J, Kashefi E (2009) Universal blind quantum computation. In: 2009 50th Annual IEEE Symposium on Foundations of Computer Science. IEEE, pp 517–526
Arrighi P, Salvail L (2006) Blind quantum computation. Int J Quantum Inf 4(05):883–898
Article Google Scholar
Morimae T, Fujii K (2013) Blind quantum computation protocol in which alice only makes measurements. Phys Rev A 87(5):050301
Article Google Scholar
Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Netw 61:85–117
Article Google Scholar
Romero J, Olson JP, Aspuru-Guzik A (2017) Quantum autoencoders for efficient compression of quantum data. Quantum Sci Technol 2(4):045001
Article Google Scholar
Bottou L (2012) Stochastic gradient descent tricks. In: Neural Networks: Tricks of the Trade. Springer, pp 421–436
Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G, Patterson D, Rabkin A, Stoica I et al (2010) A view of cloud computing. Commun ACM 53(4):50–58
Article Google Scholar
Barz S, Kashefi E, Broadbent A, Fitzsimons JF, Zeilinger A, Walther P (2012) Demonstration of blind quantum computing. Science 335(6066):303–308
Article MathSciNet Google Scholar
Yang Y, Chiribella G, Hayashi M (2020) Communication cost of quantum processes. IEEE J Sel Areas Inf Theory 1(2):387–400
Article Google Scholar
Sheng Y-B, Zhou L (2017) Distributed secure quantum machine learning. Sci Bull 62(14):1025–1029
Article Google Scholar
Bondarenko D, Feldmann P (2020) Quantum autoencoders to denoise quantum data. Phys Rev Lett 124(13):130502
Article Google Scholar
Achache T, Horesh L, Smolin J (2020) Denoising quantum states with Quantum Autoencoders–Theory and Applications . arXiv preprint arXiv:2012.14714
Nielsen MA, Chuang IL (1997) Programmable quantum gate arrays. Phys Rev Lett 79(2):321
Article MathSciNet Google Scholar
Yang Y, Renner R, Chiribella G (2020) Optimal universal programming of unitary gates. Phys Rev Lett 125(21):210501
Article MathSciNet Google Scholar
Choi M-D (1975) Completely positive linear maps on complex matrices. Linear Algebra Appl 10(3):285–290
Article MathSciNet Google Scholar
Jamiołkowski A (1972) Linear transformations which preserve trace and positive semidefiniteness of operators. Rep Math Phys 3(4):275–278
Article MathSciNet Google Scholar
Nielsen MA, Chuang I (2002) Quantum computation and quantum information. American Association of Physics Teachers
Chiribella G, D’Ariano GM, Perinotti P (2008) Transforming quantum operations: Quantum supermaps. EPL Europhys Lett 83(3):30004
Article Google Scholar
Chiribella G, D’Ariano GM, Perinotti P (2009) Theoretical framework for quantum networks. Phys Rev A 80(2):022339
Cong I, Choi S, Lukin MD (2019) Quantum convolutional neural networks. Nat Phys 15(12):1273–1278
Farhi E, Goldstone J, Gutmann S (2014) A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028
Debnath S, Linke NM, Figgatt C, Landsman KA, Wright K, Monroe C (2016) Demonstration of a small programmable quantum computer with atomic qubits. Nature 536(7614):63–66
Article Google Scholar
Kiefer J, Wolfowitz J et al (1952) Stochastic estimation of the maximum of a regression function. Ann Math Stat 23(3):462–466
Article MathSciNet Google Scholar
McClean JR, Boixo S, Smelyanskiy VN, Babbush R, Neven H (2018) Barren plateaus in quantum neural network training landscapes. Nat Commun 9(1):4812
Article Google Scholar
Broughton M, Verdon G, McCourt T, Martinez AJ, Yoo JH, Isakov SV, Massey P, Halavati R, Niu MY, Zlokapa A et al (2020) Tensorflow quantum: A software framework for quantum machine learning. arXiv preprint arXiv:2003.02989
Baxter RJ (2007) Exactly Solved Models in Statistical Mechanics. Courier Corporation
Williamson DF, Parker RA, Kendrick JS (1989) The box plot: a simple visual method to interpret data. Ann Intern Med 110(11):916–921
Article Google Scholar
Bisio A, Chiribella G, D’Ariano GM, Facchini S, Perinotti P (2010) Optimal quantum learning of a unitary transformation. Phys Rev A 81(3):032324
Article Google Scholar
Mo Y, Chiribella G (2019) Quantum-enhanced learning of rotations about an unknown direction. New J Phys 21(11):113003
Article MathSciNet Google Scholar
Sedlák M, Ziman M (2020) Probabilistic storage and retrieval of qubit phase gates. Phys Rev A 102(3):032618
Article Google Scholar
Bishop LS, Bravyi S, Cross A, Gambetta JM, Smolin J (2017) Quantum volume. Quantum Volume, Technical Report
Google Scholar
Moll N, Barkoutsos P, Bishop LS, Chow JM, Cross A, Egger DJ, Filipp S, Fuhrer A, Gambetta JM, Ganzhorn M et al (2018) Quantum optimization using variational algorithms on near-term quantum devices. Quantum Sci Technol 3(3):030503
Article Google Scholar
Allen-Zhu Z (2017) Natasha 2: Faster non-convex optimization than sgd. arXiv preprint arXiv:1708.08694
Sweke R, Wilde F, Meyer JJ, Schuld M, Fährmann PK, Meynard-Piganeau B, Eisert J (2020) Stochastic gradient descent for hybrid quantum-classical optimization. Quantum 4:314
Article Google Scholar
Developers Cirq (2022). Cirq Zenodo. https://doi.org/10.5281/ZENODO.7465577
Mitarai K, Negoro M, Kitagawa M, Fujii K (2018) Quantum circuit learning. Physical Review A 98(3):032309
Google Scholar

Download references

Funding

This work was supported by funding from the Hong Kong Research Grant Council through grants no. 17300918 and no. 17307520, through the Senior Research Fellowship Scheme SRFS2021-7S02, the Croucher Foundation, and by the John Templeton Foundation through grant 62312, The Quantum Information Structure of Spacetime (qiss.fr). YXW acknowledges funding from the National Natural Science Foundation of China through grants no. 61872318. Research at the Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

Author information

Authors and Affiliations

QICI Quantum Information and Computation Initiative, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China
Yan Zhu, Ge Bai & Giulio Chiribella
AI Technology Lab Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China
Yuexuan Wang
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Yuexuan Wang
Center on Frontiers of Computing Studies, Peking University, Beijing, China
Tongyang Li
School of Computer Science, Peking University, Beijing, China
Tongyang Li
Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, USA
Tongyang Li
Department of Computer Science, University of Oxford, Parks Road, Oxford, OX1 3QD, UK
Giulio Chiribella
Perimeter Institute For Theoretical Physics, 31 Caroline Street North, Waterloo, N2L 2Y5, Ontario, Canada
Giulio Chiribella
The University of Hong Kong Shenzhen Institute of Research and Innovation, Yuexing 2nd Rd Nanshan, Shenzhen, China
Giulio Chiribella

Authors

Yan Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Ge Bai
View author publications
You can also search for this author in PubMed Google Scholar
Yuexuan Wang
View author publications
You can also search for this author in PubMed Google Scholar
Tongyang Li
View author publications
You can also search for this author in PubMed Google Scholar
Giulio Chiribella
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Zhu and Li provided the main ideas for the work, while Zhu, Bai, and Wang designed the numerical experiments. Zhu and Chiribella drafted the main manuscript text, and all authors reviewed the manuscript.

Corresponding authors

Correspondence to Tongyang Li or Giulio Chiribella.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 A. Proof of Theorem 1

According to Allen-Zhu (2017), we have the following lemma about convergence of SGD for nonconvex functions:

Lemma 1

If a function f(x) is $L_1$-smooth and we perform SGD update $x_{t+1}\leftarrow x_t - \eta \nabla f_i(x_t)$ each time for a random $i\in [n]$, then the convergence rate is $T = \mathcal {O}(\frac{L_1}{\epsilon ^4})$ for finding $\mathbb {E}[\Vert \nabla f(x) \Vert ^2]\le \epsilon ^2$.

Therefore, we first discuss the smoothness property of the loss function $\mathcal {L}(\varvec{\theta })$ in the training of QAEGate model. Note that the first- and second-order Lipschitz smoothness are equivalent to Lipschitz continuity of the first- and second-order derivatives. So we start with a lemma about the Lipschitz continuity of multivariate functions (Sweke et al. 2020).

Lemma 2

Given some function $f :\textbf{R}^M \rightarrow \textbf{R}$, if all partial derivatives of f are continuous, then for any $a, b \in \textbf{R}$ the function $f :[a, b]^M \rightarrow \textbf{R}$ is L-Lipschitz continuous with

$$\begin{aligned} L = \sqrt{M}\underset{j \in \{1,\dots ,M\}}{\textrm{max}} \sup _{\varvec{x}\in [a, b]^M} \left| \frac{\partial f(\varvec{x})}{\partial x_j}\right| . \end{aligned}$$

(3)

Next we prove a more general result about the smoothness of measurement probability of quantum circuits, and then show that $\mathcal {L}(\varvec{\theta })$ fits into this result.

Lemma 3

Consider a quantum circuit consisting of any number of fixed unitary gates and M variable unitary gates $U_1(\theta _1),\dots ,U_M(\theta _M)$, where $U_j(\theta _j):= \textrm{exp}(i\theta _j H_j)$ for some Hermitian operator $H_j$. Then the probability of any measurement outcome of the output of this circuit is $L_1$-smooth and $L_2$-second-order smooth with respect to $\varvec{\theta }=(\theta _1,\dots ,\theta _M)$, where

$$\begin{aligned} L_1= & {} 4MH_{\textrm{max}}^2\end{aligned}$$

(4)

$$\begin{aligned} L_2= & {} 8M^{3/2}H_{\textrm{max}}^3 \end{aligned}$$

(5)

where $H_{\textrm{max}}:=\textrm{max}_{j} \Vert H_j\Vert _2$ and $\Vert \cdot \Vert _2$ is the induced operator norm defined as $\Vert A\Vert _2:= \sup \frac{\Vert Ax\Vert _2}{\Vert x\Vert _2}$.

Proof

Let $\vert \psi _0\rangle $ be the initial state of the circuit. Without loss of generality, we assume the variable unitary gates are labeled as $U_1(\theta _1)$ to $U_M(\theta _M)$ according to the order they are applied. Then the output of this circuit can be written as $\vert \psi _{\varvec{\theta }}\rangle := V_M U_M(\theta _M) \dots V_1 U_1(\theta _1) V_0 \vert \psi _0\rangle $, where $V_1,\dots ,V_M$ denotes the fixed unitary gates. A measurement outcome can be described by a positive operator-valued measure (POVM) element P with the property that P and $I-P$ are both positive semidefinite, where I is the identity operator of the output. The probability of this outcome is

$$\begin{aligned} f(\varvec{\theta }):= \textrm{Tr}[P \vert \psi _{\varvec{\theta }}\rangle \langle \psi _{\varvec{\theta }}\vert ] = \langle \psi _{\varvec{\theta }}\vert P \vert \psi _{\varvec{\theta }}\rangle . \end{aligned}$$

(6)

To study the smoothness of $f(\varvec{\theta })$, we take the partial derivatives:

$$\begin{aligned} \frac{\partial f(\varvec{\theta })}{\partial \theta _j}= & {} \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \vert \psi _{\varvec{\theta }}\rangle + \langle \psi _{\varvec{\theta }}\vert P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _j}\\= & {} 2 \Re \left( \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \vert \psi _{\varvec{\theta }}\rangle \right) \\ \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}= & {} 2 \Re \left( \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} \right) \\ \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l}= & {} 2 \Re \left( \frac{\partial ^3\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k \partial \theta _l} P \vert \psi _{\varvec{\theta }}\rangle +\frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _l}\right. \\{} & {} \left. + \frac{\partial ^2 \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _l} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k \partial \theta _l} \right) \end{aligned}$$

These partial derivatives can be bounded with vector and operator norms. For a unitary operator U, $\Vert U\Vert _2=1$. For the POVM element P, $\Vert P\Vert _2\le 1$. The second-order partial derivatives can be bounded as:

$$\begin{aligned} \left| \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\right|\le & {} 2 \left| \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} \right| \nonumber \\\le & {} 2\left\| \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k}\right\| _2 \Vert P\Vert _2 \Vert \vert \psi _{\varvec{\theta }}\rangle \Vert _2 + 2\left\| \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j}\right\| _2\nonumber \\{} & {} \times \Vert P\Vert _2 \left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k}\right\| _2 \nonumber \\\le & {} 2\underset{a,b}{\textrm{max}}\left\| \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2 + 2 \underset{a}{\textrm{max}} \left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a}\right\| _2^2 \end{aligned}$$

(7)

Similarly, for the third-order partial derivatives,

$$\begin{aligned} \left| \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l} \right|\le & {} 2 \left| \frac{\partial ^3\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k \partial \theta _l} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _l}\right. \nonumber \\{} & {} \left. + \frac{\partial ^2 \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _l} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k \partial \theta _l} \right| \nonumber \\\le & {} \!2\underset{a,b,c}{\textrm{max}}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2 \!+\! 6\! \left( \underset{a,b}{\textrm{max}}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2 \right) \nonumber \\{} & {} \! \times \!\left( \underset{a}{\textrm{max}}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2 \right) \end{aligned}$$

(8)

Therefore, to show that the second- and third-order partial derivatives are bounded, it suffices to show that $\textrm{max}_{a}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2, \textrm{max}_{a,b}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2$ and $\textrm{max}_{a,b,c}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2$ are all bounded. We observe that $\frac{\partial U_j(\theta _j)}{\partial j} = \frac{\partial \textrm{exp}(i\theta _j H_j)}{\partial j} = iH_j U_j(\theta _j)$, and we have

$$\begin{aligned} \underset{a}{\textrm{max}}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2= & {} \underset{a}{\textrm{max}}\left\| V_M U_M(\theta _M)\dots V_a i H_a U_a(\theta _a)\right. \nonumber \\{} & {} \left. \dots U_1(\theta _1)V_0 \vert \psi _0\rangle \right\| _2 \nonumber \\\le & {} \!\underset{a}{\textrm{max}} \left\| V_M U_M(\theta _M)\! \dots \!V_a \right\| _2 \!\Vert H_a\Vert _2 \left\| U_a(\theta _a)\!\right. \nonumber \\{} & {} \left. \dots U_1(\theta _1)V_0 \vert \psi _0\rangle \right\| _2 \nonumber \\= & {} \underset{a}{\textrm{max}} \Vert H_a\Vert _2 = H_{\textrm{max}} \end{aligned}$$

(9)

With similar derivations, we can obtain $\textrm{max}_{a,b}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2^2 \le H_{\textrm{max}}^2$ and $\textrm{max}_{a,b,c}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2 \le H_{\textrm{max}}^3$. According to Eqs. (7) and (8), we obtain

$$\begin{aligned} \left| \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\right|\le & {} 4 H_{\textrm{max}}^2 \end{aligned}$$

(10)

$$\begin{aligned} \left| \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l} \right|\le & {} 8 H_{\textrm{max}}^3 \end{aligned}$$

(11)

Consider $\frac{\partial f(\varvec{\theta })}{\partial \theta _j}$ as a function of $\varvec{\theta }$. Since its partial derivatives $\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}$ are bounded by $4 H_{\textrm{max}}^2$, according to Lemma 2, $\frac{\partial f(\varvec{\theta })}{\partial \theta _j}$ is $L_1'$-continuous with $L_1'=4 \sqrt{M} H_{\textrm{max}}^2$. Then

$$\begin{aligned} \Vert \nabla f(\varvec{\alpha }) - \nabla f(\varvec{\beta }) \Vert _2\le & {} \sqrt{M}\underset{j}{\textrm{max}} \left| \frac{\partial f(\varvec{\alpha })}{\partial \theta _j} - \frac{\partial f(\varvec{\beta })}{\partial \theta _j} \right| \nonumber \\\le & {} \sqrt{M} L_1' \Vert \alpha - \beta \Vert _2 = 4 M H_{\textrm{max}}^2 \Vert \alpha - \beta \Vert _2 \end{aligned}$$

(12)

and therefore f is $L_1$-smooth with $L_1 = 4 M H_{\textrm{max}}^2$. Similarly, consider $\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}$ as a function of $\varvec{\theta }$. Since its partial derivatives $\frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l}$ are bounded by $8 H_{\textrm{max}}^3$, according to Lemma 2, $\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}$ is $L_2'$-continuous with $L_2'=8\sqrt{M} H_{\textrm{max}}^3$. Then

$$\begin{aligned} \Vert \nabla ^2 f(\varvec{\alpha }) - \nabla ^2 f(\varvec{\beta }) \Vert _2\le & {} M \underset{j,k}{\textrm{max}} \left| \frac{\partial ^2 f(\varvec{\alpha })}{\partial \theta _j \partial \theta _k} - \frac{\partial ^2 f(\varvec{\beta })}{\partial \theta _j \partial \theta _k} \right| \nonumber \\\le & {} M L_2' \Vert \alpha \!-\! \beta \Vert _2 \!=\! 8 M^{3/2} H_{\textrm{max}}^3 \Vert \alpha \!-\! \beta \Vert _2 \end{aligned}$$

(13)

and therefore f is $L_2$-second-order smooth with $L_2 = 8 M^{3/2} H_{\textrm{max}}^3$. $\square $

Based on above lemma, we obtain the following theorem:

Theorem 3

(Smoothness) The loss function $\mathcal {L}(\varvec{\theta })$ in the training of QAEGate model is $L_1$-smooth and $L_2$-second-order smooth.

Proof

Let $p(\varvec{\theta })$ be the probability of outcome $\vert +\rangle $ in the SWAP test when the gates are parameterized with $\varvec{\theta =(\theta _{\textrm{le}},\theta _{\textrm{re}},\theta _{\textrm{ld}},\theta _{\textrm{rd}})}$. Because of $\mathcal {L}(\varvec{\theta }) = 1-f(\varvec{\theta })$ and $f(\varvec{\theta }) = 2p(\varvec{\theta })-1$, we just need to discuss the smoothness of $p(\varvec{\theta })$. Since $p(\varvec{\theta })$ is the probability of a measurement outcome of a unitary circuit, and in this circuit, every variable gate has the form of $U_\theta = \textrm{exp}(i\theta H)$, we can apply Lemma 3. Thus, $p(\varvec{\theta })$ is $L_1$-smooth and $L_2$-second-order smooth where $L_1 = 4MH_{\textrm{max}}^2, L_2 = 8M^{3/2}H_{\textrm{max}}^3$ and M is the number of variable gates in the circuit. To prove Theorem 3, we only need to show that $H_{\textrm{max}}$ is bounded.

The variable gates in our circuit only includes the following single- and two-qubit gates:

$$\begin{aligned} R_X(\theta ):= & {} \textrm{exp}(i\theta \sigma _x),\ R_Y(\theta ):= \textrm{exp}(i\theta \sigma _y),\ R_Z(\theta )\nonumber \\{} & {} := \textrm{exp}(i\theta \sigma _z) \end{aligned}$$

(14)

$$\begin{aligned} XX(\theta ):= & {} \textrm{exp}(i\theta \sigma _x\otimes \sigma _x),\ YY(\theta ):=\textrm{exp}(i\theta \sigma _y\otimes \sigma _y),\nonumber \\{} & {} ZZ(\theta ):=\textrm{exp}(i\theta \sigma _z\otimes \sigma _z) \end{aligned}$$

(15)

Therefore, $H_{\textrm{max}} = \textrm{max}\{\Vert \sigma _x\Vert _2,\Vert \sigma _y\Vert _2,\Vert \sigma _z\Vert _2, \Vert \sigma _x\otimes \sigma _x\Vert _2, \Vert \sigma _y\otimes \sigma _y\Vert _2,\Vert \sigma _z\otimes \sigma _z\Vert _2\} = 1$. $\square $

Proof

(Proof of Theorem 1) Based on Theorem 3, we have proved that $\mathcal {L}(\varvec{\theta })$ is $L_1$-smooth, where $L_1 = 4M$. In our implementation, there is only one variable for each variable unitary gate, so $M=\dim {\varvec{\theta }} = \mathcal {O}(n^2)$ here. Then we can apply Lemma 1 and obtain that the convergence rate is $T=\mathcal {O}(\frac{n^2}{\epsilon ^4})$ for finding $\mathbb {E}[\Vert \nabla _{\varvec{\theta }} \mathcal {L}(\varvec{\theta }) \Vert ^2]\le \epsilon ^2$. In order to estimate the gradients, we have to compute the loss function for $\mathcal {O}(\dim {\varvec{\theta }}) = \mathcal {O}(n^2)$ times. The total time complexity of the algorithm is $\mathcal {O}(\frac{n^4}{\epsilon ^4})$. $\square $

1.2 B. Details of numerical experiments

Data generation

For each class of gates U(t) we considered in different scenarios, we randomly generate 60 different $U(t_i), t_i \in [0,2]$ and split them into the training set and the test set with ratio 50 : 10.

Hyperparameters

We set the learning rate $\eta $ as 0.005, the maximum number of iterations K as 150 and the threshold $\delta $ as 0.999 in all experiments.

Impelementation details

We implement our proposed model QAEGate based on Tensorflow Quantum (Broughton et al. 2020). The parameterized quantum circuits are constructed by the tools provided in Cirq (2022) and the gradients are estimated by the differentiators based on parameter shift rule (Mitarai et al. 2018).

Training details

We employed stochastic gradient descent to train all the models in our experiments, using a single GPU. The training time for all the experiments presented in this paper is less than one hour.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Zhu, Y., Bai, G., Wang, Y. et al. Quantum autoencoders for communication-efficient cloud computing. Quantum Mach. Intell. 5, 27 (2023). https://doi.org/10.1007/s42484-023-00112-5

Download citation

Received: 14 February 2023
Accepted: 19 April 2023
Published: 10 July 2023
DOI: https://doi.org/10.1007/s42484-023-00112-5

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Quantum autoencoders for communication-efficient cloud computing

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Quantum image compression with autoencoders based on parameterized quantum circuits

Denoising quantum mixed states using quantum autoencoders

Practical distributed quantum information processing with LOCCNet

Explore related subjects

Availability of data and materials

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Publisher's Note

Appendix

Appendix

1.1 A. Proof of Theorem 1

Lemma 1

Lemma 2

Lemma 3

Proof

Theorem 3

Proof

Proof

1.2 B. Details of numerical experiments

Data generation

Hyperparameters

Impelementation details

Training details

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation