See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/348409470

Markovian Quantum Neuroevolution for Machine Learning

Preprint · December 2020


All content following this page was uploaded by Zhide Lu on 12 January 2021.




Zhide Lu,1,∗ Pei-Xin Shen,1,∗ and Dong-Ling Deng1,2,†

1 Center for Quantum Information, IIIS, Tsinghua University, Beijing 100084, People’s Republic of China
2 Shanghai Qi Zhi Institute, 41th Floor, AI Tower, No. 701 Yunjin Road, Xuhui District, Shanghai 200232, China
(Dated: December 30, 2020)
Neuroevolution, a field that draws inspiration from the evolution of brains in nature, harnesses evolutionary
algorithms to construct artificial neural networks. It bears a number of intriguing capabilities that are typically
inaccessible to gradient-based approaches, including optimizing neural-network architectures, hyperparameters,
and even learning the training rules. In this paper, we introduce a quantum neuroevolution algorithm that
autonomously finds near-optimal quantum neural networks for different machine learning tasks. In particular,
we establish a one-to-one mapping between quantum circuits and directed graphs, and reduce the problem of
finding the appropriate gate sequences to a task of searching suitable paths in the corresponding graph as a
Markovian process. We benchmark the effectiveness of the introduced algorithm through concrete examples
including classifications of real-life images and symmetry-protected topological states. Our results showcase
the vast potential of neuroevolution algorithms in quantum machine learning, which would boost the exploration
towards quantum learning supremacy with noisy intermediate-scale quantum devices.

Quantum machine learning studies the interplay between machine learning and quantum physics [1–4]. On the one hand, machine learning has achieved dramatic success over the past two decades [5, 6], and many problems that were notoriously challenging for artificial intelligence, such as playing the game of Go [7, 8] or predicting protein structures [9], have been cracked recently. This gives rise to new opportunities for using machine learning techniques to solve difficult problems in quantum science. Indeed, machine learning ideas and tools have been invoked in various applications in quantum physics, including representing quantum many-body states [10, 11], quantum state tomography [12, 13], non-locality detection [14], topological quantum compiling [15], and learning phases of matter [16–26]. On the other hand, the idea of quantum computing has revolutionized the theories and implementations of computation [27]. New quantum algorithms may offer unprecedented prospects to enhance, speed up, or innovate machine learning as well [28–35]. Without a doubt, the study of the interplay between machine learning and quantum physics will benefit both fields, and the emergent research frontier of quantum machine learning has become one of today’s most rapidly growing interdisciplinary fields [1–4].

[Table I graphic not reproduced. Recoverable labels: unknown system architecture; depth-1 circuit (gate-block); the relation between blocks (prior block, next block, shown as a 0/1 matrix); circuit sequence (Block-1, ..., Block-n).]

TABLE I. Illustration of the graph-encoding method, based on which the problem of searching for the optimal quantum circuits reduces to a task of finding paths (such as the one shown in red) in the corresponding directed graphs.
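
As a rough illustration of the graph encoding summarized in Table I, the following Python sketch stores a small directed graph as an adjacency matrix and samples a path by appending one block at a time, so that each step depends only on the last block of the path. The three block labels, the matrix `ADJ`, and the helper names are hypothetical placeholders chosen for this sketch; they are not the gate-block library or the connection rules used in the paper.

```python
import random

# Hypothetical toy library of three gate-blocks (labels only, for illustration).
BLOCKS = ["A", "B", "C"]

# ADJ[x][y] == 1 means block y may directly follow block x (assumed toy rules).
ADJ = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]


def allowed_next(x):
    """Return the indices of blocks that may be placed right after block x."""
    return [y for y, ok in enumerate(ADJ[x]) if ok]


def sample_path(length, rng):
    """Sample a directed path with `length` nodes.

    Each new node is drawn using only the last node of the path,
    which is the Markovian aspect of the search.
    """
    path = [rng.randrange(len(BLOCKS))]
    while len(path) < length:
        path.append(rng.choice(allowed_next(path[-1])))
    return path


def is_valid(path):
    """Check that every consecutive pair of nodes is an edge of the graph."""
    return all(ADJ[a][b] == 1 for a, b in zip(path, path[1:]))


rng = random.Random(0)
path = sample_path(5, rng)
# The circuit is simply the sequence of blocks read along the path.
circuit = [BLOCKS[i] for i in path]
```

In this toy setting a circuit is recovered by reading the block labels along the sampled path, and `is_valid` rejects any sequence that uses a pair of blocks not joined by an edge.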
An intriguing approach widely studied in quantum machine learning is to exploit the hybrid quantum-classical scheme, where parameterized quantum circuits are optimized with classical methods (such as stochastic gradient descent) to satisfy certain objective functions. Notable examples in this category include various quantum classifiers [35–49], variational quantum eigensolvers [50–53], quantum Born machines [54, 55], and quantum approximate optimization algorithms [56–58]. In this scenario, one typically chooses a variational ansatz circuit with a fixed structure and then optimizes its tunable parameters to tackle the given problem. Yet, different families of parameterized quantum circuits may bear distinct entangling capabilities and representation power, and thus are suitable for different tasks. For a given learning task, how to obtain a well-performing ansatz circuit that is as short as possible is of crucial importance, especially for quantum learning with noisy intermediate-scale quantum (NISQ) devices [59], where the depth of the quantum circuits is limited by the undesirable noise carried by such devices. In the classical machine learning literature, several renowned algorithms have been proposed to search for appropriate neural network architectures [60–72], including evolutionary or genetic algorithms (such as NeuroEvolution of Augmenting Topologies, NEAT) [62], greedy algorithms [64], reinforcement-learning-based algorithms [65–68], and differentiable architecture search [69–72]. Inspired by these algorithms, analogous quantum architecture search algorithms have also been introduced [73–82]. Each of these algorithms carries its own pros and cons, and the choice depends on the specific problem.

In this paper, we introduce a quantum neuroevolution algorithm, which we call the Markovian quantum neuroevolution (MQNE) algorithm, to search for optimal ansatz quantum circuits for different machine learning tasks. We propose a graph-encoding method (see Table I), where the nodes of the graph correspond to the elementary gate-blocks and the directed edges represent the allowed connections between gate-blocks, to injectively map quantum circuits to directed graphs. Consequently, we recast the problem of designing a proper quantum circuit as a task of searching for an appropriate directed path in the graph in a Markovian fashion. To illustrate the effectiveness of the MQNE algorithm, we apply it to a variety of quantum learning tasks, including classifications of real-life images (such as handwritten-digit images in the MNIST dataset [83] and the Wisconsin Diagnostic Breast Cancer dataset [84]) and symmetry-protected topological (SPT) states. We find that our algorithm yields ansatz quantum circuits with notably smaller depths, while maintaining a comparable classification accuracy.

[Table II graphic not reproduced. Recoverable labels: a "Forbidden" row and a "Permitted" row of gate-block configurations built from $R_x$ (controlled) and $R$ gates.]

TABLE II. The connection rules for gate-blocks. The first row shows the forbidden connection configurations, which should be replaced by the corresponding permitted ones in the second row. Here, only the two-qubit controlled-$R_x$ gate and the single-qubit rotation gate $R$ are used in constructing various quantum circuits.

[Fig. 1 graphic not reproduced. Recoverable labels: QML problem, gate set, block library, directed graph, paths, fitness, circuit sequence-1 to sequence-n; arrows labeled determine, build, map, find, evolve; stopping conditions fitness > $f_c$ and generations > $g_c$.]

FIG. 1. A schematic illustration of the MQNE algorithm. For a given quantum machine learning (QML) problem and experimental setup, we first determine the allowed gate-blocks and the corresponding directed graph, and iteratively generate quantum circuits in a Markovian fashion. The algorithm terminates when the highest fitness of the generated circuits becomes larger than a certain threshold value $f_c$ or the number of generations exceeds a given number $g_c$.

The algorithm. In designing classical neural networks, a renowned neuroevolution algorithm is the NEAT algorithm [62], which exploits concepts from biology (e.g., genome, crossover, speciation, and mutation) to evolve neural network topologies along with weights. Inspired by this, a natural idea is to extend NEAT to the quantum scenario. However, a straightforward adoption of NEAT in the quantum domain does not work, since quantum neural networks differ substantially from classical ones. For instance, in the quantum scenario the quantum neurons (qubits) are connected by multi-qubit unitaries rather than weight parameters. As a result, certain techniques used in NEAT, such as explicit fitness sharing and matching up genomes [62], become invalid or ambiguous in the quantum scenario. Indeed, as shown in the Supplementary Material [85], the simple genetic algorithm for designing quantum classifiers, which uses crossover and mutation directly, performs poorly in classifying images. The ineffectiveness of this algorithm is due to two facts: i) the encoding of the quantum circuits into bit strings is not a bijection, which increases the search space and slows down the searching process; ii) the performance of the offspring generated from crossover and mutation is not guaranteed to be better than that of their parents, since crossover and mutation of unitaries may result in meaningless structures. Now, let us introduce our MQNE algorithm, which overcomes these shortcomings.

First, we introduce a graph-encoding method, which maps quantum circuits to directed paths in the corresponding graph. Suppose we need to design a $k$-qubit quantum circuit to solve a given quantum machine learning problem, and for simplicity we restrict our discussion to the case where the circuits are composed of only single-qubit rotations and two-qubit controlled-$R_x$ gates. We mention that this choice is already general enough, since these gates form a universal gate set for quantum computing [27]. We choose the controlled-$R_x$ gate, rather than the controlled-NOT gate typically used in designing quantum neural networks, to guarantee that the circuits from later generations cover those from earlier generations, so as to ensure improved performance of the offspring. This can be easily deduced from the fact that the controlled-$R_x$ gate reduces to the identity when the controlled rotation angle is set to 0. To avoid ambiguity and duplication of successive rotations, we invoke some connection rules for arranging gate-blocks (a gate-block is a depth-1 quantum circuit) in sequential order to form the desired circuits (see Table II): (a) the latter gate-block should not include any gate which can be operated in parallel with the former gate-block; (b) the latter gate-block should not include the same gates as the former gate-block on the same qubits. We suppose that the qubits are arranged in a one-dimensional geometry and the controlled-$R_x$ gates only act on adjacent qubits. We use a length-$(k + 2\lfloor k/2 \rfloor)$ vector to represent a quantum gate-block. The first $2\lfloor k/2 \rfloor$ numbers encode the controlled-$R_x$ gates in a gate-block. Here, two adjacent nonzero numbers represent a controlled-$R_x$ gate acting on the two qubits labeled by them, and two adjacent 0 numbers mean that there is no controlled-$R_x$ gate acting on the remaining qubits. The next $k$ numbers encode the single-qubit rotation gates in a gate-block, where we use 0 to denote the absence of a rotation for the corresponding qubit (see [85] for more details).

Without further restrictions, it is straightforward to obtain that the number of possible gate-blocks is $f_1(k) =$
$\frac{(1+\sqrt{3})^{k+1} - (1-\sqrt{3})^{k+1}}{2\sqrt{3}}$ [85]. These gate-blocks form a gate-block library, and we use a directed graph to represent this library. Each node of the graph corresponds to a gate-block, and each directed edge represents a legitimate connection of gate-blocks according to the connection rules: there is an edge pointing from node $x$ to node $y$ if and only if the gate-block $y$ is allowed to be put next to gate-block $x$. For convenience, we use an adjacency matrix to denote the directed graph, as in graph theory [86]. Since a quantum circuit is just a sequence of gate-blocks in the corresponding library, the task of designing a well-performing ansatz variational quantum circuit reduces to the problem of finding an optimal path in the directed graph. The latter problem can be solved with the following procedure: 1) Initialization. We randomly generate $n_1$ paths with length $l$ based on the directed graph, and compute the fitness of the corresponding variational quantum circuits; 2) Iteration in a Markovian fashion. From the $i$-th generation, we choose the $t_i$ paths with the largest fitness and, for each of them, randomly add to its end $n_{i+1}$ segments of length $l_0$. Due to the connection rules, not all possible segments can be added at will to the existing paths. Whether a new segment is allowed to be added depends on the last gate-block of the given path, which is similar to a Markovian process. In this way, we obtain the $(i+1)$-th generation of paths. We then compute the fitness of all $(i+1)$-th generation quantum circuits. If the fitness of a circuit is larger than a certain given threshold value $f_c$ (or the number of iterations exceeds a given number $g_c$), we terminate the iteration and output the corresponding path and quantum circuit. If none of the circuits has a fitness larger than $f_c$, we repeat this step to generate paths and circuits for the next generation. A schematic illustration and the pseudocode for our MQNE algorithm are given in Fig. 1 and Algorithm 1, respectively.

Algorithm 1 Markovian quantum neuroevolution algorithm
Input: The directed graph, hyperparameters $n_i$, $t_i$, $l$, $l_0$, $f_c$, and $g_c$
Output: The optimal quantum circuit architecture
Initialization: randomly generate $n_1$ length-$l$ paths, and compute the fitness of their corresponding circuits
for $i = 1$ to $g_c$ do
    Choose the $t_i$ ($t_i < n_i$) paths with highest fitness
    Evolution: produce paths in the $(i+1)$-th generation by concatenating $n_{i+1}$ segments of length $l_0$ to each of the $t_i$ paths; compute the fitness for the paths in the $(i+1)$-th generation
    if max[fitness(paths)] ≥ $f_c$ then
        Terminate the iteration
    end if
end for
Output the optimal quantum circuit

[Fig. 2 graphic not reproduced. Recoverable labels: panels (a) and (c) plot fitness versus generations, with legend entries "the highest fitness" and "all populations"; panels (b) and (d) plot accuracy and loss versus epochs (0 to 200), with legend entries for training and validation accuracy and loss.]

FIG. 2. The numerical results for classification of handwritten-digit images in the MNIST dataset. In (a), the initial variational parameters are randomly chosen during the training process. In (b), we plot the loss and accuracy as a function of training epochs for the sixth-generation quantum circuit with the highest fitness. (c) shows the fitness of generated circuits for each generation with fixed initial variational parameters, and (d) plots the loss and accuracy for the eighth-generation circuit that has the largest fitness [85].

We note that the number of nodes of the directed graph scales exponentially with the number of qubits involved, $f_1(k) = \Theta\big((1+\sqrt{3})^{k}\big)$. For large $k$, the size of the graph might exceed the capacity of any classical computer, rendering our MQNE algorithm infeasible in practice. To reduce the size of the graph, we can impose some further restrictions on building possible gate-blocks. For instance, we may require that for each gate-block there are at most $c$ (a cut-off constant) controlled-$R_x$ gates and that the remaining qubits all undergo single-qubit rotations. With these restrictions, the number of possible gate-blocks reduces to a polynomial function of $k$ [85]. Accordingly, the size of the directed graph is also reduced. However, it is worthwhile to mention that the reduction of the graph may also bring up a problem: we may not be able to find the optimal ansatz circuits, since the search space is reduced too much by the restrictions. In the following, we give a couple of concrete examples to benchmark the effectiveness of our MQNE algorithm.

Classification of handwritten-digit images. The first example we consider is the classification of handwritten-digit images in the MNIST dataset. This is a prototypical machine learning task for benchmarking the effectiveness of various learning approaches. The MNIST dataset consists of gray-scale images of handwritten digits from 0 through 9. Each image is two dimensional and contains 28×28 pixels. For our purpose, we choose only a subset of MNIST consisting of images of digits 1 and 9 and reduce the size of the images from 28×28 pixels to 16×16 pixels, so that we can run our MQNE algorithm and simulate the generated quantum classifiers with moderate classical computational resources. In addition, we use amplitude encoding to map the input images into quantum states and define the following loss function based on cross-entropy for a single data sample encoded as $|\psi\rangle_{\mathrm{in}}$ (see [87] for more details):

$$\mathcal{L}\left(h(|\psi\rangle_{\mathrm{in}}; \Theta), \mathbf{a}\right) = -a_1 \log g_1 - a_2 \log g_2, \tag{1}$$

where $\mathbf{a} = (a_1, a_2) = (1, 0)$ or $(0, 1)$ denotes the one-hot encoding [88] of the label of $|\psi\rangle_{\mathrm{in}}$, $h(|\psi\rangle_{\mathrm{in}}; \Theta)$ represents the
output of the quantum classifier with its parameters collectively denoted by $\Theta$, and $g_{1,2}$ denote the output probabilities of digits 1 and 9. For training the quantum classifier, we use a classical optimizer to search for the optimal parameters $\Theta^*$ that minimize the averaged loss over the training dataset.

For images with 16×16 pixels, we need eight qubits to encode each input sample, and for convenience we also use an additional qubit to output the results of the binary classification. Thus, the ansatz circuit we aim to design is a nine-qubit variational circuit. Applying the graph-encoding method for nine-qubit circuits and supposing that controlled-$R_x$ gates act only on adjacent qubits, we obtain 6688 gate-blocks, and the corresponding directed graph has 6688 nodes. Based on the connection rules, we compute the adjacency matrix and apply the MQNE algorithm with hyperparameters set as $(n_i, t_i, l, l_0) = (5, 1, 5, 2)$. Our results are summarized in Fig. 2. In Fig. 2(a), we randomly choose the initial variational parameters when training the generated quantum classifiers at each generation. The MQNE algorithm outputs a quantum circuit with fitness (accuracy) 96% at the sixth generation. The corresponding path for this circuit on the directed graph reads [4257 → 6687 → 1345 → 5029 → 1859 → 914 → 3777 → 6244 → 2433 → 4797 → 569 → 2054 → 4261 → 3681 → 4769 → 6200], where the numbers denote the labels of the nodes of the graph. In Fig. 2(b), we plot the average accuracy and loss for both the training and validation datasets as a function of the number of epochs during the training process. After training, the performance of this quantum classifier is also tested on the testing dataset, and an accuracy of 96% is obtained. Figures 2(c) and 2(d) are analogous to Figs. 2(a) and 2(b), respectively, but with fixed initial parameters during the training process. We find that fixing the initial parameters leads to a more stable improvement of the performance for next-generation classifiers.

Classification of SPT states. Unlike classical classifiers, quantum classifiers may also be used to directly classify quantum states produced by quantum devices. To show the power of our MQNE algorithm in this scenario, we consider a quantum machine learning task of classifying SPT states. For simplicity and concreteness, we consider the following cluster-Ising model, whose Hamiltonian reads [89]

$$H(\lambda) = -\sum_{j=1}^{N} \sigma^{x}_{j-1}\sigma^{z}_{j}\sigma^{x}_{j+1} + \lambda \sum_{j=1}^{N} \sigma^{y}_{j}\sigma^{y}_{j+1}, \tag{2}$$

where $\sigma_i^{\alpha}$, $\alpha = x, y, z$, are Pauli matrices acting on the $i$-th spin, $\lambda$ is a parameter describing the strength of the nearest-neighbour interaction, and $N$ denotes the number of spins. This model is exactly solvable and features a well-understood quantum phase transition at $\lambda = 1$, between a $Z_2 \times Z_2$ SPT cluster phase characterized by a string order for $\lambda < 1$ and an antiferromagnetic phase with long-range order for $\lambda > 1$. Here, we apply the MQNE algorithm to obtain an optimal ansatz variational circuit, which serves as a quantum classifier for classifying these two distinct phases. To this end, we set $N = 8$ and uniformly sample 2000 Hamiltonians with $\lambda$ varying from 0 to 2 under the periodic boundary condition. We compute their corresponding ground states, which are the input data to the classifier, and randomly choose 1600 of them for training and the remaining ones for testing. Our results are plotted in Fig. 3, from which it is evident that the largest fitness increases during the first several generations and then saturates. We find a circuit at the ninth generation, which involves only 36 single-qubit and 31 two-qubit gates but has a fitness of 100% [85].

[Fig. 3 graphic not reproduced. Recoverable labels: generations 1 to 10; fitness axis from 0 to 1.]

FIG. 3. The performance of the MQNE algorithm in the task of classifying symmetry-protected topological states [85].

We stress that, in comparison with the typical variational circuits used in previous works [87], the ansatz circuits found by the MQNE algorithm involve far fewer gates and variational parameters, while maintaining a comparable classification accuracy. For instance, for the example of classification of handwritten-digit images, the classifier used in Ref. [87] uses more than 90 single-qubit and 80 two-qubit gates, with circuit depth larger than 30 and more than 270 variational parameters, whereas the circuit found by the MQNE algorithm at the fifth generation contains only 22 single-qubit and 24 two-qubit gates, with 90 variational parameters and circuit depth 16. This significant reduction of the circuit depth and number of gates would be crucial for experimental demonstrations of quantum learning with NISQ devices. It not only simplifies the implementation of quantum classifiers substantially from the practical perspective, but would also mitigate the possible barren plateau problem (i.e., vanishing gradients) [90–92] in training deep networks. We also mention that the performance of the MQNE algorithm may be improved further by choosing the hyperparameters judiciously according to different learning problems and experimental setups. In the Supplementary Material, we also test the MQNE algorithm in the task of classifying images from the Wisconsin Diagnostic Breast Cancer dataset, which may have important applications in medical machine learning [93].

Discussion and conclusion. Recent advances in quantum machine learning have revealed that quantum classifiers are highly vulnerable to adversarial attacks: adding a tiny amount of carefully crafted perturbation to the original legitimate data will cause the quantum classifiers to make incorrect predictions [87, 94]. Thus, how to enhance the robustness of quantum classifiers to adversarial perturbations is a problem of vital importance for practical applications of quantum learning in the future. With the MQNE algorithm, a possible solution to this problem is to design ansatz circuits that are more robust to the given type of adversarial attack. This could be achieved by replacing the original loss function [e.g., Eq. (1)] with a modified one that incorporates the adversarial perturbations [95]. In addition, the graph-encoding method could also be combined with other evolutionary or genetic algorithms to construct optimal circuit structures for different quantum learning problems.

In summary, we have introduced a quantum neuroevolution algorithm, named the MQNE algorithm, to design optimal variational ansatz quantum circuits for different quantum learning tasks. Through concrete examples involving classifications of real-life images and SPT quantum states, we demonstrate that the MQNE algorithm performs excellently in searching for appropriate quantum classifiers. It finds ansatz circuits with notably smaller depths and numbers of gates, while maintaining a comparable classification accuracy. Our results provide a valuable guide for experimental implementations of quantum machine learning with NISQ devices.

We acknowledge helpful discussions with Weikang Li, Wenjie Jiang, and Sirui Lu. This work is supported by the start-up fund from Tsinghua University (Grant No. 53330300320), the National Natural Science Foundation of China (Grant No. 12075128), and the Shanghai Qi Zhi Institute.

∗ These authors contributed equally to this work.
† dldeng@tsinghua.edu.cn

[1] S. Das Sarma, D.-L. Deng, and L.-M. Duan, Machine learning meets quantum physics, Phys. Today 72, 48 (2019).
[2] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Quantum machine learning, Nature 549, 195 (2017).
[3] V. Dunjko and H. J. Briegel, Machine learning & artificial intelligence in the quantum domain: A review of recent progress, Rep. Prog. Phys. 81, 074001 (2018).
[4] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Rev. Mod. Phys. 91, 045002 (2019).
[5] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
[6] M. I. Jordan and T. M. Mitchell, Machine learning: Trends, perspectives, and prospects, Science 349, 255 (2015).
[7] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, Mastering the game of Go with deep neural networks and tree search, Nature 529, 484 (2016).
[8] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. van den Driessche, T. Graepel, and D. Hassabis, Mastering the game of Go without human knowledge, Nature 550, 354 (2017).
[9] A. W. Senior, R. Evans, J. Jumper, J. Kirkpatrick, L. Sifre, T. Green, C. Qin, A. Žídek, A. W. R. Nelson, A. Bridgland, H. Penedones, S. Petersen, K. Simonyan, S. Crossan, P. Kohli, D. T. Jones, D. Silver, K. Kavukcuoglu, and D. Hassabis, Improved protein structure prediction using potentials from deep learning, Nature 577, 706 (2020).
[10] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
[11] X. Gao, Z. Zhang, and L. Duan, An efficient quantum algorithm for generative machine learning, arXiv:1711.02038.
[12] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Neural-network quantum state tomography, Nat. Phys. 14, 447 (2018).
[13] J. Carrasquilla, G. Torlai, R. G. Melko, and L. Aolita, Reconstructing quantum states with generative models, Nat. Mach. Intell. 1, 155 (2019).
[14] D.-L. Deng, Machine Learning Detection of Bell Nonlocality in Quantum Many-Body Systems, Phys. Rev. Lett. 120, 240402 (2018).
[15] Y.-H. Zhang, P.-L. Zheng, Y. Zhang, and D.-L. Deng, Topological Quantum Compiling with Reinforcement Learning, Phys. Rev. Lett. 125, 170501 (2020).
[16] Y. Zhang and E.-A. Kim, Quantum Loop Topography for Machine Learning, Phys. Rev. Lett. 118, 216401 (2017).
[17] J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nat. Phys. 13, 431 (2017).
[18] E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Learning phase transitions by confusion, Nat. Phys. 13, 435 (2017).
[19] L. Wang, Discovering phase transitions with unsupervised learning, Phys. Rev. B 94, 195105 (2016).
[20] P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, Machine learning quantum phases of matter beyond the fermion sign problem, Sci. Rep. 7, 8823 (2017).
[21] K. Ch’ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Machine Learning Phases of Strongly Correlated Fermions, Phys. Rev. X 7, 031038 (2017).
[22] Y. Zhang, R. G. Melko, and E.-A. Kim, Machine learning Z2 quantum spin liquids with quasiparticle statistics, Phys. Rev. B 96, 245119 (2017).
[23] S. J. Wetzel, Unsupervised learning of phase transitions: From principal component analysis to variational autoencoders, Phys. Rev. E 96, 022140 (2017).
[24] W. Hu, R. R. P. Singh, and R. T. Scalettar, Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination, Phys. Rev. E 95, 062122 (2017).
[25] Y. Zhang, A. Mesaros, K. Fujita, S. D. Edkins, M. H. Hamidian, K. Ch’ng, H. Eisaki, S. Uchida, J. C. S. Davis, E. Khatami, and E.-A. Kim, Machine learning in electronic-quantum-matter imaging experiments, Nature 570, 484 (2019).
[26] W. Lian, S.-T. Wang, S. Lu, Y. Huang, F. Wang, X. Yuan, W. Zhang, X. Ouyang, X. Wang, X. Huang, L. He, X. Chang, D.-L. Deng, and L. Duan, Machine Learning Topological Phases with a Solid-State Quantum Simulator, Phys. Rev. Lett. 122, 210503 (2019).
[27] M. A. Nielsen and I. L. Chuang, Quantum Computation
and Quantum Information (Cambridge University Press, Cambridge, 2010).
[28] A. W. Harrow, A. Hassidim, and S. Lloyd, Quantum Algorithm for Linear Systems of Equations, Phys. Rev. Lett. 103, 150502 (2009).
[29] S. Lloyd, M. Mohseni, and P. Rebentrost, Quantum principal component analysis, Nat. Phys. 10, 631 (2014).
[30] V. Dunjko, J. M. Taylor, and H. J. Briegel, Quantum-Enhanced Machine Learning, Phys. Rev. Lett. 117, 130501 (2016).
[31] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Quantum Boltzmann Machine, Phys. Rev. X 8, 021050 (2018).
[32] X. Gao, Z.-Y. Zhang, and L.-M. Duan, A quantum machine learning algorithm based on generative models, Sci. Adv. 4, eaat9004 (2018).
[33] S. Lloyd and C. Weedbrook, Quantum Generative Adversarial Learning, Phys. Rev. Lett. 121, 040502 (2018).
[34] L. Hu, S.-H. Wu, W. Cai, Y. Ma, X. Mu, Y. Xu, H. Wang, Y. Song, D.-L. Deng, C.-L. Zou, and L. Sun, Quantum generative adversarial learning in a superconducting quantum circuit, Sci. Adv. 5, eaav2761 (2019).
[35] M. Schuld and N. Killoran, Quantum Machine Learning in Feature Hilbert Spaces, Phys. Rev. Lett. 122, 040504 (2019).
[36] M. Schuld, A. Bocharov, K. M. Svore, and N. Wiebe, Circuit-centric quantum classifiers, Phys. Rev. A 101, 032308 (2020).
[37] E. Farhi and H. Neven, Classification with Quantum Neural Networks on Near Term Processors, arXiv:1802.06002.
[38] M. Schuld, M. Fingerhuth, and F. Petruccione, Implementing a distance-based classifier with a quantum interference circuit, EPL Europhys. Lett. 119, 60002 (2017).
[39] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Quantum circuit learning, Phys. Rev. A 98, 032309 (2018).
[40] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Nature 567, 209 (2019).
[41] D. Zhu, N. M. Linke, M. Benedetti, K. A. Landsman, N. H. Nguyen, C. H. Alderete, A. Perdomo-Ortiz, N. Korda, A. Garfoot, C. Brecque, L. Egan, O. Perdomo, and C. Monroe, Training of quantum circuits on a hybrid quantum computer, Sci. Adv. 5, eaaw9918 (2019).
[42] I. Cong, S. Choi, and M. D. Lukin, Quantum convolutional neural networks, Nat. Phys. 15, 1273 (2019).
[43] K. H. Wan, O. Dahlsten, H. Kristjánsson, R. Gardner, and M. S. Kim, Quantum generalisation of feedforward neural networks, Npj Quantum Inf. 3, 1 (2017).
mun. 5, 4213 (2014).
[51] C. Kokail, C. Maier, R. van Bijnen, T. Brydges, M. K. Joshi, P. Jurcevic, C. A. Muschik, P. Silvi, R. Blatt, C. F. Roos, and P. Zoller, Self-verifying variational quantum simulation of lattice models, Nature 569, 355 (2019).
[52] J.-G. Liu, Y.-H. Zhang, Y. Wan, and L. Wang, Variational quantum eigensolver with fewer qubits, Phys. Rev. Research 1, 023025 (2019).
[53] D. Wang, O. Higgott, and S. Brierley, Accelerated Variational Quantum Eigensolver, Phys. Rev. Lett. 122, 140504 (2019).
[54] J.-G. Liu and L. Wang, Differentiable learning of quantum circuit Born machines, Phys. Rev. A 98, 062324 (2018).
[55] B. Coyle, D. Mills, V. Danos, and E. Kashefi, The Born supremacy: Quantum advantage and training of an Ising Born machine, Npj Quantum Inf. 6, 1 (2020).
[56] E. Farhi, J. Goldstone, and S. Gutmann, A Quantum Approximate Optimization Algorithm, arXiv:1411.4028.
[57] L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, Quantum Approximate Optimization Algorithm: Performance, Mechanism, and Implementation on Near-Term Devices, Phys. Rev. X 10, 021067 (2020).
[58] N. Moll, P. Barkoutsos, L. S. Bishop, J. M. Chow, A. Cross, D. J. Egger, S. Filipp, A. Fuhrer, J. M. Gambetta, M. Ganzhorn, A. Kandala, A. Mezzacapo, P. Müller, W. Riess, G. Salis, J. Smolin, I. Tavernelli, and K. Temme, Quantum optimization using variational algorithms on near-term quantum devices, Quantum Sci. Technol. 3, 030503 (2018).
[59] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2, 79 (2018).
[60] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, Large-Scale Evolution of Image Classifiers, arXiv:1703.01041.
[61] E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, Regularized evolution for image classifier architecture search, in Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33 (2019) pp. 4780–4789.
[62] K. O. Stanley and R. Miikkulainen, Evolving neural networks through augmenting topologies, Evol. Comput. 10, 99 (2002).
[63] K. O. Stanley, J. Clune, J. Lehman, and R. Miikkulainen, Designing neural networks through neuroevolution, Nat. Mach. Intell. 1, 24 (2019).
[64] S. Huang, X. Li, Z.-Q. Cheng, Z. Zhang, and A. Hauptmann, GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning, in Proceedings of the 26th ACM International Conference on Multimedia, MM ’18 (Associa-
[44] E. Grant, M. Benedetti, S. Cao, A. Hallam, J. Lockhart, V. Sto- tion for Computing Machinery, New York, NY, USA, 2018) pp.
jevic, A. G. Green, and S. Severini, Hierarchical quantum clas- 2049–2057.
sifiers, Npj Quantum Inf. 4, 1 (2018). [65] B. Zoph and Q. V. Le, Neural Architecture Search with Rein-
[45] Y. Du, M.-H. Hsieh, T. Liu, and D. Tao, Implementable Quan- forcement Learning, arXiv:1611.01578.
tum Classifier for Nonlinear Data, arXiv:1809.06056. [66] B. Baker, O. Gupta, N. Naik, and R. Raskar, Designing
[46] A. Uvarov, A. Kardashin, and J. Biamonte, Machine Neural Network Architectures using Reinforcement Learning,
Learning Phase Transitions with a Quantum Processor, arXiv:1611.02167.
arXiv:1906.10155. [67] H. Cai, T. Chen, W. Zhang, Y. Yu, and J. Wang, Efficient Archi-
[47] C. Blank, D. K. Park, J.-K. K. Rhee, and F. Petruccione, Quan- tecture Search by Network Transformation, arXiv:1707.04873.
tum classifier with tailored quantum kernel, arXiv:1909.02611. [68] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, Learning
[48] P. Rebentrost, M. Mohseni, and S. Lloyd, Quantum Support Transferable Architectures for Scalable Image Recognition,
Vector Machine for Big Data Classification, Phys. Rev. Lett. arXiv:1707.07012.
113, 130503 (2014). [69] H. Liu, K. Simonyan, and Y. Yang, DARTS: Differentiable Ar-
[49] F. Tacchino, C. Macchiavello, D. Gerace, and D. Bajoni, An chitecture Search, arXiv:1806.09055.
artificial neuron implemented on an actual quantum processor, [70] S. Xie, H. Zheng, C. Liu, and L. Lin, SNAS: Stochastic Neural
Npj Quantum Inf. 5, 1 (2019). Architecture Search, arXiv:1812.09926.
[50] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, [71] A. Zela, T. Elsken, T. Saikia, Y. Marrakchi, T. Brox, and F. Hut-
P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A variational ter, Understanding and Robustifying Differentiable Architec-
eigenvalue solver on a photonic quantum processor, Nat. Com- ture Search, arXiv:1909.09656.
[72] H. Liang, S. Zhang, J. Sun, X. He, W. Huang, K. Zhuang, and Z. Li, DARTS+: Improved Differentiable Architecture Search with Early Stopping, arXiv:1909.06035.
[73] R. Li, U. Alvarez-Rodriguez, L. Lamata, and E. Solano, Approximate Quantum Adders with Genetic Algorithms: An IBM Quantum Experience, Quantum Meas. Quantum Metrol. 4, 1 (2017).
[74] L. Cincio, Y. Subaşı, A. T. Sornborger, and P. J. Coles, Learning the quantum algorithm for state overlap, New J. Phys. 20, 113022 (2018).
[75] T. Fösel, P. Tighineanu, T. Weiss, and F. Marquardt, Reinforcement Learning with Neural Networks for Quantum Feedback, Phys. Rev. X 8, 031084 (2018).
[76] A. G. Rattew, S. Hu, M. Pistoia, R. Chen, and S. Wood, A Domain-agnostic, Noise-resistant, Hardware-efficient Evolutionary Variational Quantum Eigensolver, arXiv:1910.09694.
[77] D. Chivilikhin, A. Samarin, V. Ulyantsev, I. Iorsh, A. R. Oganov, and O. Kyriienko, MoG-VQE: Multiobjective genetic variational quantum eigensolver, arXiv:2007.04424.
[78] L. Cincio, K. Rudinger, M. Sarovar, and P. J. Coles, Machine learning of noise-resilient quantum circuits, arXiv:2007.01210.
[79] M. Ostaszewski, E. Grant, and M. Benedetti, Quantum circuit structure learning, arXiv:1905.09692.
[80] L. Li, M. Fan, M. Coram, P. Riley, and S. Leichenauer, Quantum optimization with a novel Gibbs objective function and ansatz architecture search, Phys. Rev. Research 2, 023074 (2020).
[81] S.-X. Zhang, C.-Y. Hsieh, S. Zhang, and H. Yao, Differentiable Quantum Architecture Search, arXiv:2010.08561.
[82] M. Pirhooshyaran and T. Terlaky, Quantum Circuit Design Search, arXiv:2012.04046.
[83] Y. LeCun, C. Cortes, and C. Burges, MNIST handwritten digit database (1998).
[84] W. H. Wolberg, N. Street, and O. L. Mangasarian, UCI Machine Learning Repository: Breast Cancer Wisconsin (Diagnostic) Data Set (1992).
[85] See Supplemental Material at [URL will be inserted by publisher] for details on the graph-encoding method and the MQNE algorithm, and more numerical results to demonstrate the performance of the proposed scheme.
[86] N. Deo, Graph Theory with Applications to Engineering and Computer Science, first ed. (Dover Publications, Mineola, New York, 2016).
[87] S. Lu, L.-M. Duan, and D.-L. Deng, Quantum adversarial machine learning, Phys. Rev. Research 2, 033212 (2020).
[88] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).
[89] P. Smacchia, L. Amico, P. Facchi, R. Fazio, G. Florio, S. Pascazio, and V. Vedral, Statistical mechanics of the cluster Ising model, Phys. Rev. A 84, 022304 (2011).
[90] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nat. Commun. 9, 4812 (2018).
[91] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost-Function-Dependent Barren Plateaus in Shallow Quantum Neural Networks, arXiv:2001.00550.
[92] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, An initialization strategy for addressing barren plateaus in parametrized quantum circuits, Quantum 3, 214 (2019).
[93] B. J. Erickson, P. Korfiatis, Z. Akkus, and T. L. Kline, Machine Learning for Medical Imaging, RadioGraphics 37, 505 (2017).
[94] N. Liu and P. Wittek, Vulnerability of quantum classification to adversarial perturbations, Phys. Rev. A 101, 062331 (2020).
[95] A. Chakraborty, M. Alam, V. Dey, A. Chattopadhyay, and D. Mukhopadhyay, Adversarial Attacks and Defences: A Survey, arXiv:1810.00069.
[96] J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah, Julia: A Fresh Approach to Numerical Computing, SIAM Rev. 59, 65 (2017).
[97] X.-Z. Luo, J.-G. Liu, P. Zhang, and L. Wang, Yao.jl: Extensible, Efficient Framework for Quantum Algorithm Design, Quantum 4, 341 (2020).

Supplementary Material: Markovian Quantum Neuroevolution for Machine Learning

In this Supplementary Material, we show how to narrow (enlarge) the gate-block library by imposing (lifting) restrictions on how gate-blocks can be built. We also describe how to extend the Markovian process in the MQNE algorithm to a high-order Markovian process. In addition, we present more details on the graph-encoding method and the MQNE algorithm, together with more numerical results that demonstrate the performance of the proposed scheme.

I. ENCODING VECTORS

In the main text, we restrict our discussion to the case where the circuits are composed of only single-qubit rotation gates $R$ (each composed of the sequence $R_z$-$R_x$-$R_z$, where $R_x$ and $R_z$ denote rotations about the $x$ and $z$ axes, respectively) and two-qubit controlled-$R_x$ gates, with the controlled-$R_x$ gates acting only on adjacent qubits. We use a vector of length $k + 2\lfloor k/2 \rfloor$ to represent a quantum gate-block. As an example, we list two encoding vectors and their corresponding quantum gate-blocks in Table S3.

TABLE S3. Examples of encoding vectors and their corresponding gate-blocks. Here, we assume that the quantum classifier we want to construct involves seven qubits.
Encoding vector 1: [1, 2, 5, 4, 0, 0; 0, 0, 3, 0, 0, 6, 0]
Encoding vector 2: [0, 0, 0, 0, 0, 0; 1, 0, 3, 4, 0, 6, 7]
[Circuit diagrams of the two corresponding gate-blocks, built from $R$ and controlled-$R_x$ gates, are not reproduced here.]

II. THE GATE-BLOCK LIBRARY

For the $k$-qubit gate-block library, we can sort all quantum gate-blocks by the number of controlled-$R_x$ gates they contain: 0, 1, and so on up to $\lfloor k/2 \rfloor$. For quantum gate-blocks containing no controlled-$R_x$ gate, each of the $k$ qubits can be acted on by either the single-qubit rotation gate $R$ or the identity gate, which gives $2^k$ quantum gate-blocks of this type in total. Likewise, for quantum gate-blocks containing one controlled-$R_x$ gate, each of the remaining $k - 2$ qubits can be acted on by the rotation gate $R$ or the identity gate, which gives $2^{k-2}$ possibilities. A combinatorial count then yields the total number of gate-blocks with one controlled-$R_x$ gate:

$$N_1(1) = 2^{k-2} \times \binom{k-1}{1} \times 2^{1}.$$

Similarly, for quantum gate-blocks containing $i$ controlled-$R_x$ gates, where $i \le \lfloor k/2 \rfloor$ and each of the remaining $k - 2i$ qubits can be acted on by the gate $R$ or the identity gate, we obtain

$$N_1(i) = 2^{k-2i} \times \binom{k-i}{i} \times 2^{i}.$$

Then the total number of quantum gate-blocks is

$$f_1(k) = \sum_{i=0}^{\lfloor k/2 \rfloor} N_1(i) = \frac{(1+\sqrt{3})^{k+1} - (1-\sqrt{3})^{k+1}}{2\sqrt{3}}.$$

In this way, we find that there are $f_1(k) = \Theta[(1+\sqrt{3})^k]$ quantum gate-blocks in the gate-block library used in the main text.

However, due to limited classical computational power, we may need to impose further restrictions to reduce the size of the library from the exponential scaling $f_1(k)$ to a polynomial scaling $f_2(k)$, as follows:

1. For quantum gate-blocks containing $i$ controlled-$R_x$ gates, the remaining $k - 2i$ qubits are acted on simultaneously by either the gate $R$ or the identity gate.

2. For each controlled-$R_x$ gate acting on two neighboring qubits $i$ and $i + 1$, we assume that the $i$-th qubit is the controlling qubit.

3. There are at most $c$ (a cut-off constant independent of $k$) controlled-$R_x$ gates in each quantum gate-block.

With these further restrictions, there are $N_2(i) = \binom{k-i}{i}$ gate-blocks containing $i$ controlled-$R_x$ gates, and consequently the total number of gate-blocks $f_2(k)$ in the library is given by

$$f_2(k) = \sum_{i=0}^{c} N_2(i) = \sum_{i=0}^{c} \binom{k-i}{i} = O(k^c).$$

Similarly, adding more restrictions can even reduce the size of the library to a constant $f_3$, independent of $k$. For instance, we may use only three gate-blocks to construct a library for $k$-qubit quantum circuits, with encoding vectors as follows:

• gate-block 1:
[0, 0, 0, . . . , 0; 1, 2, 3, . . . , k],

• gate-block 2:
[1, 2, 3, 4, . . . , 2⌊k/2⌋ − 1, 2⌊k/2⌋; 0, 0, 0, . . . , 0],

• gate-block 3:
[2, 3, . . . , 2⌊k/2⌋, 2⌊k/2⌋ + 1; 0, 0, 0, . . . , 0] when k is odd,
[2, 3, . . . , 2⌊k/2⌋ − 2, 2⌊k/2⌋ − 1, 0, 0; 0, 0, 0, . . . , 0] when k is even.
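As a sanity check on the counting in this section, the following minimal Python sketch evaluates $N_1(i)$ term by term and compares the sum with the closed form for $f_1(k)$. The function names (`n1`, `f1_sum`, `f1_closed`, `f2`) are illustrative choices and are not part of the MQNE implementation; `f2` simply evaluates the formula for $f_2(k)$ as stated above.

```python
from math import comb, sqrt


def n1(k: int, i: int) -> int:
    """N_1(i): number of k-qubit gate-blocks with exactly i controlled-Rx gates."""
    return 2 ** (k - 2 * i) * comb(k - i, i) * 2 ** i


def f1_sum(k: int) -> int:
    """f_1(k) as the explicit sum over i = 0, ..., floor(k/2)."""
    return sum(n1(k, i) for i in range(k // 2 + 1))


def f1_closed(k: int) -> int:
    """f_1(k) from the closed form, rounded to the nearest integer."""
    s = sqrt(3.0)
    return round(((1 + s) ** (k + 1) - (1 - s) ** (k + 1)) / (2 * s))


def f2(k: int, c: int) -> int:
    """f_2(k) with cut-off c, using N_2(i) = C(k - i, i) as stated in the text."""
    # The upper limit is capped at floor(k/2) so that C(k - i, i) stays well defined.
    return sum(comb(k - i, i) for i in range(min(c, k // 2) + 1))


if __name__ == "__main__":
    for k in range(2, 10):
        print(k, f1_sum(k), f1_closed(k), f2(k, 2))
```

For seven qubits the explicit sum is 128 + 384 + 320 + 64 = 896. The closed form is evaluated in floating point here, so the comparison should only be trusted for moderate $k$.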
We mention that gate-block 1 and gate-blocks 2 and 3 correspond to the rotation and entangler layers, respectively, in Fig. 2 of Ref. [87].

Conversely, in order to improve the chance of finding optimal quantum circuits, we can enlarge the search space by lifting some of the restrictions mentioned in the main text and above. For example, we may remove the restriction that the controlled-$R_x$ gates only act on adjacent qubits. The total number of quantum gate-blocks containing one controlled-$R_x$ gate then becomes

$$N_0(1) = 2^{k-2} \times A_k^{2},$$

where $A_n^m = n!/(n-m)!$ denotes the number of ordered selections of $m$ items out of $n$. Likewise, for quantum gate-blocks containing $i$ controlled-$R_x$ gates, where $i \le \lfloor k/2 \rfloor$ and each of the remaining $k - 2i$ qubits is acted on by the gate $R$ or the identity gate, we obtain

$$N_0(i) = 2^{k-2i} \times A_k^{2i} / A_i^{i}.$$

Then the total number of gate-blocks is

$$f_0(k) = \sum_{i=0}^{\lfloor k/2 \rfloor} N_0(i).$$

In this way, the size of the quantum gate-block library is enlarged to $\Omega(\lfloor k/2 \rfloor !)$.

In practical applications, we can directly construct the gate-block library for $(k + 1)$-qubit circuits from existing libraries for $(k - 1)$- and $k$-qubit circuits. First, the operations on the top $k$ qubits can be adopted entirely from the gate-block library of $k$-qubit circuits when the last qubit is acted on by a gate $R$ or an identity gate. Second, the operations on the top $(k - 1)$ qubits can be adopted entirely from the gate-block library of $(k - 1)$-qubit circuits when the last two qubits are acted on by a controlled-$R_x$ gate. In this way, we can readily scale up the gate-block libraries of $(k - 1)$-qubit and $k$-qubit circuits to the gate-block library of $(k + 1)$-qubit circuits.

[Figure S4: four panels. (a), (c): Fitness versus Generations, showing the best fitness and all populations. (b), (d): Accuracy and Loss versus Epochs, showing training and validation curves.]
FIG. S4. The numerical results for applying the MQNE algorithm to the task of classifying samples in the cancer dataset. In (a), we randomly choose the initial variational parameters at the beginning of the training process. In (b), the loss and accuracy are plotted as functions of training epochs for the tenth-generation quantum circuit with the highest fitness. (c) and (d) are similar to (a) and (b), but with fixed initial variational parameters.

[Figure S5: Accuracy and Loss versus Epochs (0 to 100), showing training and validation curves.]
FIG. S5. The numerical results for classifying symmetry-protected topological states of the cluster-Ising model. With the initial variational parameters randomly chosen during the training process, we plot the loss and accuracy as a function of training epochs for the tenth-generation quantum circuit with the highest fitness.

[Figure S6: Fitness (0 to 1) plotted for Generations 1 to 8.]
FIG. S6. The performance of the MQNE algorithm in the task of classifying symmetry-protected topological states, with the initial variational parameters fixed at the beginning of the training process.

III. THE MQNE ALGORITHM WITH HIGH-ORDER MARKOVIAN PROCESS

In the main text, we consider the rules between adjacent gate-blocks with the help of the single-qubit rotation gate $R$. We can regard finding paths in the directed graph as a Markovian process, in which the future state depends only on the current state of the system, and not on the previous ones. All paths of the graph correspond to the discrete state space, and the adjacency matrix of the graph corresponds to the transition matrix describing the probabilities of transitions between states. In the main text, we consider a simple case where the transition probability is uniformly distributed over all states that the present state can transit to.
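The uniform first-order Markovian path search described above can be illustrated with a minimal Python sketch. It is not the MQNE implementation: the function name `sample_path`, the zero-based node labels, the convention that `adjacency[i][j] == 1` marks an allowed transition from node `i` to node `j`, and the four-node toy graph are all assumptions made for this example.

```python
import random
from typing import List, Optional, Sequence


def sample_path(
    adjacency: Sequence[Sequence[int]],
    start: int,
    length: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Sample a path of at most `length` nodes by a uniform random walk.

    Assumed convention: adjacency[i][j] == 1 means gate-block j may follow
    gate-block i. The next node depends only on the current node.
    """
    rng = rng or random.Random()
    path = [start]
    current = start
    for _ in range(length - 1):
        # All nodes reachable from the current node in one step.
        successors = [j for j, allowed in enumerate(adjacency[current]) if allowed]
        if not successors:
            break  # dead end: no allowed transition from this node
        # Uniform transition probability over the allowed successors.
        current = rng.choice(successors)
        path.append(current)
    return path


if __name__ == "__main__":
    # Hypothetical toy graph with four nodes, for illustration only.
    toy_adjacency = [
        [0, 1, 1, 0],
        [0, 0, 1, 1],
        [1, 0, 0, 1],
        [1, 1, 0, 0],
    ]
    print(sample_path(toy_adjacency, start=0, length=6, rng=random.Random(1)))
```

Each row of the adjacency matrix, once normalized by its number of nonzero entries, is the corresponding row of the transition matrix for this uniform case.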

For the high-order Markovian process, we can directly use $R_z$ and $R_x$ gates, instead of the composite $R$ gate, to construct quantum gate-blocks. Consequently, we need to invoke new rules among four adjacent gate-blocks to avoid ambiguity and duplication of successive rotations. For example, when three consecutive quantum gate-blocks contain $R_z$, $R_x$, $R_z$ on the same qubit, respectively, the fourth gate-block should not have $R_z$ or $R_x$ on this qubit. The future state therefore depends on the previous three states of the system, which is a third-order Markovian process.

[Figure S7: Accuracy and Loss versus Epochs (0 to 80), showing training and validation curves.]
FIG. S7. The loss and accuracy as a function of training epochs for the eighth-generation circuit with the highest fitness (see Fig. S6), for the task of classifying symmetry-protected topological states of the cluster-Ising model.

IV. NUMERICAL RESULTS FOR THE CANCER DATASET

Another example we consider is the classification of samples in the cancer dataset [84]. We need six qubits to encode each input sample and use an additional qubit to output the result of the binary classification (whether the label is cancer or not). Thus, the ansatz circuit we aim to design is a seven-qubit variational circuit. Applying the graph-encoding method to seven-qubit circuits and supposing that controlled-$R_x$ gates act only on adjacent qubits, we obtain 896 gate-blocks, and the corresponding directed graph has 896 nodes. Based on the connection rules, we compute the adjacency matrix and apply the MQNE algorithm with hyperparameters set as (ni, ti, l, l0) = (5, 1, 3, 2).

Our results are summarized in Fig. S4. In Fig. S4(a), we randomly choose the initial variational parameters when training the generated quantum classifiers at each generation. The MQNE algorithm outputs a quantum circuit (with 88 variational parameters) with fitness 94.6% at the tenth generation. The corresponding path for this circuit on the directed graph reads [737 → 895 → 105 → 417 → 737 → 675 → 429 → 705 → 417 → 673 → 449 → 713 → 800 → 705 → 745 → 644 → 625 → 314 → 377 → 519 → 377 → 650], where the numbers denote the labels of the nodes of the graph. We note that, although the path seems longer than that for the MNIST example in the main text, the number of gates involved in the corresponding circuit is in fact on the same order (46 versus 44 gates). In Fig. S4(b), we plot the average accuracy and loss for both the training and validation datasets as a function of the number of epochs during the training process. After training, the performance of this quantum classifier is also tested on the testing dataset, and an accuracy of 94.6% is obtained. We mention that the numerical simulations of training the quantum classifiers are based on the Julia language [96] and the Yao.jl [97] framework throughout the paper. Fig. S4 (c) and (d) are analogous to Fig. S4 (a) and (b), respectively, but with fixed initial parameters during the training process.

V. MORE NUMERICAL RESULTS FOR CLASSIFYING SPT STATES

In the main text, we also consider a quantum machine learning task of classifying SPT states. We use two strategies to train the generated quantum classifiers at each generation. First, we randomly choose the initial variational parameters when training the generated quantum classifiers at each generation; the results are plotted in the main text. The MQNE algorithm outputs a quantum circuit (with 67 variational parameters) with fitness 100% at the tenth generation. The corresponding path for this circuit on the directed graph reads [4257 → 3329 → 1474... → 6180 → 4257 → 1441]. In Fig. S5, we plot the average accuracy and loss for both the training and validation datasets as a function of the number of epochs during the training process.

Second, we fix the initial parameters during the training process; the results are shown in Fig. S6. The MQNE algorithm outputs a quantum circuit (bearing 77 variational parameters) with fitness 99% at the eighth generation. The corresponding path for this circuit on the directed graph reads [4257 → ... → 1372 → 1608 → 3945 → 1034]. In Fig. S7, we plot the average accuracy and loss for both the training and validation datasets as a function of the number of epochs during the training process.

VI. PARAMETER SETTINGS IN TRAINING QUANTUM CLASSIFIERS

In the task of classifying handwritten-digit images in the MNIST dataset (see Fig. 2 in the main text), we use the Adam optimizer with a batch size of 30 and a learning rate of 0.0015 to minimize the loss function. The number of training steps is 200. The fitness is averaged over 2000 training samples and 500 validation samples which are not contained in the training dataset.

In the task of classifying samples in the cancer dataset (see Fig. S4), we use the Adam optimizer with a batch size of 20 and a learning rate of 0.0015 to minimize the loss function. The number of training steps is 100. The fitness is averaged over 400 training samples and 169 validation samples which are

not contained in the training dataset.

In the task of classifying symmetry-protected topological states (see Fig. 3 in the main text, and Figs. S5, S6, and S7 in this Supplementary Material), we use the Adam optimizer with a batch size of 20 and a learning rate of 0.0015 to minimize the loss function. The number of training steps is 100. The fitness is averaged over 1600 training samples and 400 validation samples which are not contained in the training dataset.

VII. THE DIRECT GENERALIZATION OF THE NEAT ALGORITHM IN SEARCHING QUANTUM CIRCUITS

In this part, we show the performance of the genetic algorithm, which could be regarded as a naive generalization of the NEAT algorithm [62], in searching quantum circuits.

In conventional evolutionary algorithms, we evolve the population to create the next generation by applying genetic operators to individuals to generate offspring. Two important genetic operators are mutation and crossover [62, 63].

When the genetic algorithm is applied to searching quantum circuits, these two operators are defined as follows:

1. Crossover: produce new offspring by recombining two selected individuals (parents). For two parent quantum circuit sequences, we divide each of them into two parts and exchange the divided parts to create two offspring.

2. Mutation: generate new offspring by randomly mutating a selected individual. We randomly select some positions in a quantum circuit with a given probability, then replace the quantum gates at these positions with other, different gates.

We design the genetic algorithm as shown in Algorithm S2 to search for optimal quantum classifier architectures (9-qubit circuits) for the MNIST handwritten digit dataset. The major hyperparameters (ni, ti) are set as (9, 3).

Fig. S8 displays the fitness of the 86 quantum circuits evaluated in total when running the genetic algorithm. We find a circuit structure with fitness 92% at the third generation, whose number of quantum gates is more than 100. In Fig. S9, we plot the average accuracy and loss for both the training and validation datasets as a function of the number of epochs during the training process. We see from Fig. S8 that local convergence appears at the third generation and the fitness does not increase in later generations. The genetic algorithm is therefore ineffective when these two operators are applied directly to the design of quantum classifiers.

[Figure S8: Fitness (0.3 to 1.0) versus Generations (1 to 10).]
FIG. S8. The performance of the genetic algorithm in the task of classifying handwritten-digit images in the MNIST dataset.

[Figure S9: Accuracy and Loss versus Epochs (0 to 100), showing training and validation curves.]
FIG. S9. The loss and accuracy as a function of training epochs for the tenth-generation quantum classifier with the highest fitness, in classifying handwritten-digit images in the MNIST dataset. Here, we use the genetic algorithm to construct quantum circuits.

Algorithm S2 The genetic algorithm
Input: Hyperparameters ni, ti, gc, fc, etc.
Output: The optimal quantum circuit architecture
Initialization: randomly generate n1 quantum circuits, and compute their fitness
for i = 1 to gc do
    Choose the best ti (ti < ni) circuits with highest fitness
    Apply the crossover operator
    Apply the mutation operator
    Compute the fitness of individuals in the (i + 1)-th generation
    if max[fitness(circuits)] ≥ fc then
        Terminate the iteration
    end if
end for
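To make the structure of Algorithm S2 concrete, the following is a minimal Python sketch of its loop, with the crossover and mutation operators as described in this section. It is a toy model under stated assumptions, not the implementation used for Fig. S8: a circuit is represented as a plain list of gate labels, the gate alphabet `GATES` and the `TARGET` pattern are hypothetical, and `toy_fitness` is a placeholder for the validation accuracy that would be obtained by training a quantum classifier. The population size is kept the same in every generation.

```python
import random
from typing import Callable, List, Sequence, Tuple

Circuit = List[str]

GATES = ["R", "CRx", "I"]      # hypothetical gate alphabet
TARGET = ["R", "CRx"] * 4      # hypothetical pattern used by the toy fitness


def toy_fitness(circuit: Circuit) -> float:
    """Placeholder fitness in [0, 1]; stands in for a trained accuracy."""
    hits = sum(a == b for a, b in zip(circuit, TARGET))
    return hits / max(len(circuit), len(TARGET))


def crossover(p1: Circuit, p2: Circuit, rng: random.Random) -> Tuple[Circuit, Circuit]:
    """Cut each parent into two parts and exchange the second parts."""
    c1 = rng.randint(1, len(p1) - 1)
    c2 = rng.randint(1, len(p2) - 1)
    return p1[:c1] + p2[c2:], p2[:c2] + p1[c1:]


def mutate(circuit: Circuit, gates: Sequence[str], rate: float, rng: random.Random) -> Circuit:
    """With probability `rate` per position, replace the gate by a different one."""
    out = list(circuit)
    for pos, gate in enumerate(out):
        if rng.random() < rate:
            out[pos] = rng.choice([g for g in gates if g != gate])
    return out


def genetic_search(
    fitness: Callable[[Circuit], float],
    gates: Sequence[str],
    n_pop: int,      # circuits per generation (n_i in Algorithm S2)
    n_best: int,     # parents kept per generation (t_i), needs 2 <= n_best < n_pop
    n_gen: int,      # maximum number of generations (g_c)
    f_target: float, # termination threshold (f_c)
    length: int = 8,
    rate: float = 0.1,
    seed: int = 0,
) -> Tuple[Circuit, float]:
    rng = random.Random(seed)
    # Initialization: random circuits of a fixed starting length.
    population = [[rng.choice(gates) for _ in range(length)] for _ in range(n_pop)]
    best = max(population, key=fitness)
    for _ in range(n_gen):
        # Choose the n_best circuits with the highest fitness.
        parents = sorted(population, key=fitness, reverse=True)[:n_best]
        children: List[Circuit] = []
        while len(children) < n_pop:
            a, b = rng.sample(parents, 2)
            for child in crossover(a, b, rng):
                children.append(mutate(child, gates, rate, rng))
        population = children[:n_pop]
        # Track the best circuit seen so far (it is not re-inserted into the population).
        best = max(population + [best], key=fitness)
        if fitness(best) >= f_target:
            break
    return best, fitness(best)


if __name__ == "__main__":
    circuit, score = genetic_search(toy_fitness, GATES, n_pop=9, n_best=3, n_gen=10, f_target=0.95)
    print(circuit, score)
```

Because the two cut points are drawn independently, offspring can differ in length from their parents, which is one way a circuit found in this manner may grow large.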
