Quantum AI simulator using a hybrid CPU–FPGA approach

Suzuki, Teppei; Miyazaki, Tsubasa; Inaritai, Toshiki; Otsuka, Takahiro

doi:10.1038/s41598-023-34600-2


Article
Open access
Published: 12 May 2023


Scientific Reports volume 13, Article number: 7735 (2023)


Abstract

The quantum kernel method has attracted considerable attention in the field of quantum machine learning. However, exploring the applicability of quantum kernels in more realistic settings has been hindered by the limited number of physical qubits on current noisy quantum computers, which restricts the number of features that can be encoded for quantum kernels. Hence, there is a need for an efficient, application-specific simulator for quantum computing using classical technology. Here we focus on quantum kernels empirically designed for image classification and demonstrate a field-programmable gate array (FPGA) implementation. We show that quantum kernel estimation by our heterogeneous CPU–FPGA computing is 470 times faster than that by a conventional CPU implementation. The co-design of our application-specific quantum kernel and its efficient FPGA implementation enabled us to perform one of the largest numerical simulations of a gate-based quantum kernel in terms of features, up to 780-dimensional features. We apply our quantum kernel to classification tasks using the Fashion-MNIST dataset and show that our quantum kernel is comparable to Gaussian kernels with the optimized hyperparameter.


Introduction

Quantum computing¹ is an emerging technology that could transform many areas of industry and scientific research, including finance², chemistry³, and machine learning (ML)^{4,5}. In particular, quantum machine learning (QML)^{4,5,6,7,8,9,10,11,12,13,14,15,16,17} has rapidly attracted considerable attention, indicating that QML is a plausible candidate for the practical application of near-term quantum devices. While early fault-tolerant quantum computing has been demonstrated recently¹⁸, noisy intermediate-scale quantum (NISQ) processors are currently available through various hardware platforms with $\sim$ 10–100 physical qubits. However, the number of physical qubits on today's NISQ computers is generally insufficient to explore practical applications of QML. Therefore, there is a need for application-specific quantum computing simulators to explore and validate the practicality of QML in real-world settings.

The quantum kernel method is a NISQ algorithm in the framework of the hybrid quantum–classical approach^{19,20} and can also be feasible on current NISQ computers with shallow quantum circuits^{9,12,13,16,17}. In the quantum kernel method, a quantum feature map can be described explicitly by a quantum circuit and the quantum kernel entry can be estimated by measuring the inner product of the quantum feature map^{8,9}. The calculation of quantum kernels when using real devices or general-purpose simulators based on quantum assembly language (QASM) requires a number of measurements to obtain the quantum kernel entries (note that measurements are a key part of the QASM simulator, which handles measurements by collapsing the state of the qubit according to the probabilities predicted by quantum mechanics). Commonly used quantum kernels inspired by instantaneous quantum polynomial-time (IQP) circuits⁹ can be computationally prohibitive on classical computers as the number of qubits increases; for instance, the number of entangled qubits in the simulation of quantum kernels using state-of-the-art classical platforms is 30¹¹. On the other hand, it becomes challenging to reliably estimate such quantum kernels using near-term quantum devices as circuit size increases, owing to expensive gate costs, low gate fidelities, and different qubit connectivities. These points can be a drawback in exploring practical applications of quantum kernels, since machine learning models typically improve performance by increasing training data samples or expanding the number of input features. There is still a gap between theoretical developments and practical applications in the quantum kernel method.

To bridge the gap between theory and practice in the quantum kernel method, in this paper we focus on an application-specific quantum kernel that can be applied to image data with a large number of features. To this end, we demonstrate an implementation of an efficient quantum AI simulator using a heterogeneous classical computing platform. Our approach is highly customized for our specific tasks at the hardware level, and the main objective of our simulator differs from that of a general-purpose quantum simulator, which is designed to be versatile and to perform a range of quantum algorithms. Until now, there have been considerable efforts to develop quantum computing simulators^{21,22,23,24,25,26,27}. Among hardware implementations, field-programmable gate arrays (FPGAs) are one of the desirable platforms, because FPGAs offer efficient parallelism, low latency, and customizability. FPGAs comprise programmable logic blocks that can be interconnected to perform parallel processing, allowing each logic block to perform a specific task simultaneously. FPGAs can also be customized to perform specific tasks using hardware description languages such as Verilog. Herein, we co-design application-specific quantum kernels and our FPGA architecture, which allows efficient numerical simulations. FPGAs have been successfully applied to fault-tolerant quantum algorithms such as Grover's algorithm^{28,29,30}, quantum Fourier transform^{28,29,30,31}, and Deutsch's algorithm³². However, an FPGA implementation of quantum kernels has been unexplored, and the present study is the first demonstration of a gate-based quantum kernel simulator using an FPGA platform.

The rest of the paper is organized as follows. We provide a brief introduction to the support vector machine (SVM) and describe our quantum feature map that is useful for image classification. Then we give an overview of our quantum AI simulator using heterogeneous CPU–FPGA computing. From an algorithmic point of view, the quantum kernel method can be divided into the quantum kernel estimation and the rest of the tasks. The simulation of the quantum kernel can be computationally demanding; hence, the workload can be accelerated by FPGA hardware. On the other hand, the rest of the tasks, such as dimensionality reduction and the optimization of machine learning parameters, can be efficiently performed on the CPU using existing classical libraries. The FPGA implementation of the quantum kernel is checked in terms of both numerical precision and hardware acceleration. We apply our quantum kernel simulator to binary and multiclass classification for a range of input features using the Fashion-MNIST dataset. Then we summarize our conclusions.

Results

Quantum support vector machine

The quantum kernel method is one of the most important algorithms in QML and many studies have been reported^{4,5,8,9,11,12,13,14,15,16,17}. In the classical kernel method^{33,34}, the inner product of the feature map is represented by kernel functions, which implicitly use the Hilbert space; on the other hand, the quantum kernel explicitly defines a quantum feature map by means of a quantum state $\left|\phi ({\varvec{x}})\right\rangle$ for $d$-dimensional input vectors ${\varvec{x}}\in {\mathbb{R}}^{d}$. The quantum kernel entry $K\left({\varvec{x}},{\varvec{x}}^{^{\prime}}\right)$ can be estimated by calculating the inner product of the quantum feature map^{8,9}:

$$K\left({\varvec{x}},{{\varvec{x}}}^{\boldsymbol{^{\prime}}}\right)={\left|\left\langle \phi ({\varvec{x}})|\phi ({\varvec{x}}^{^{\prime}})\right\rangle \right|}^{2}.$$

(1)

For binary classification in the framework of SVM, one can obtain a support vector classifier that estimates the label for a new datum ${\varvec{x}}$:

$$y = {\text{sgn}}\left( {\sum\limits_{i} {y_{i} \alpha _{i}^{*} K\left( {\varvec{x}^{{\left( i \right)}} ,\varvec{x}} \right) + b^{*} } } \right),$$

(2)

where ${y}_{i}\in \left\{+1, -1\right\}$, and $\{{\alpha }_{i}^{*}\}$ and ${b}^{*}$ are the optimal parameters obtained in the training phase³⁴. In the hybrid quantum–classical algorithm, the training phase can be performed on classical computers, whereas the quantum kernel entries can be computed by NISQ computers or quantum computing simulators; such methodology is called the quantum SVM (QSVM). The NISQ computation of the quantum kernel requires many quantum measurements to obtain a quantum kernel entry with statistically reliable accuracy, because each computational-basis measurement returns only zero or one and an entry must be estimated as an average over many shots. For each quantum kernel entry, $\mathcal{O}\left({N}^{2}\right)$ shots are required with respect to the number of data samples $N$, resulting in a computational complexity of $\mathcal{O}\left({N}^{4}/{\varepsilon }^{2}\right)$ operations with the maximum error $\varepsilon$ in order to obtain all the quantum kernel entries⁹. Such computational complexity prohibits us from developing and validating quantum kernels as the number of data samples grows. Also, the number of entangling qubits with different connectivities in the previously proposed quantum kernels increases with qubit count⁹, which requires additional computational resources.
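As a minimal sketch of the two ingredients discussed above, the following evaluates the decision rule of Eq. (2) from a precomputed kernel row and gives the binomial standard error of a kernel entry estimated from 0/1 measurement outcomes. The function names and toy values are hypothetical and not taken from the paper; mapping a score of exactly zero to $+1$ is an arbitrary choice.

```python
import math

def svm_decision(kernel_row, labels, alphas, bias):
    """Evaluate Eq. (2) for one new datum.

    kernel_row[i] holds K(x_i, x) for training sample i.
    A score of exactly zero is mapped to +1 (arbitrary convention).
    """
    score = sum(y * a * k for y, a, k in zip(labels, alphas, kernel_row))
    return 1 if score + bias >= 0 else -1

def shot_standard_error(p, shots):
    """Standard error of a kernel entry p estimated from `shots` 0/1 outcomes."""
    return math.sqrt(p * (1.0 - p) / shots)

# Hypothetical toy values, for illustration only.
labels = [+1, -1]
alphas = [1.0, 1.0]
print(svm_decision([0.9, 0.1], labels, alphas, bias=0.0))
print(shot_standard_error(0.5, 10_000))
```

The standard error falls only as the inverse square root of the number of shots, which is why shot-based estimation becomes expensive when every kernel entry needs a small error.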

To address these issues, here we introduce a shallow, fixed-depth quantum circuit that can be applied to a quantum kernel for a larger number of input features. In the previously proposed quantum kernels based on IQP circuits⁹, the number of input features is typically set to the number of entangled qubits^{9,11,14,15}. IQP circuits are a subclass of quantum circuits that cannot be efficiently simulated classically unless the polynomial-time hierarchy collapses to the third level³⁵. Here, an IQP circuit is a circuit where a Hadamard gate is applied to each qubit at the beginning and end of the computation, but the rest of the gates are diagonal. In the context of the quantum kernel method, researchers have typically used a more specific type of IQP, called the ZZ feature map⁹. In the ZZ feature map, the connectivity of qubits is achieved in a pair-wise manner, resulting in $n(n-1)/2$ interactions, where $n$ is the number of qubits. This leads to a rapid expansion of expressibility and results in a deterioration of generalization performance as qubit count increases^{11,14,15}. Our approach aims to simplify the quantum feature map, limit the extent to which qubits are entangled, and control the capacity of our QML model, while increasing the number of input features. This framework can handle several hundreds of input features in QSVM. For the $mn$-dimensional input vector ${\varvec{x}}={\left[{{\varvec{s}}}_{1},{{\varvec{s}}}_{2},\cdots ,{{\varvec{s}}}_{m}\right]}^{\mathrm{T}}\in {\mathbb{R}}^{mn}$, where ${{\varvec{s}}}_{b}$ is the $n$-dimensional vector ${{\varvec{s}}}_{b}={\left[{s}_{b, 1},{s}_{b,2},\cdots ,{s}_{b,n}\right]}^{\mathrm{T}}$, we consider a block product state (BPS) wavefunction³⁶:

$$\left| {{\Psi }^{{{\text{BPS}}}} \left( {\varvec{x}} \right)\rangle} \right. = \left| {\psi_{1} \left( {{\varvec{s}}_{1} } \right)\rangle} \right. \otimes \left| {\psi_{2} \left( {{\varvec{s}}_{2} } \right)\rangle} \right. \otimes \cdots \otimes \left| {\psi_{m} \left( {{\varvec{s}}_{m} } \right)\rangle} \right.,$$

(3)

where

$$\left|{\psi }_{b}\left({{\varvec{s}}}_{b}\right)\rangle \right.=\left({\otimes }_{q=1}^{n}{R}_{z}\left({s}_{b,q}\right)\right){U}_{{2}^{n}}^{\mathrm{ent}}\left({\otimes }_{q=1}^{n}\left({R}_{y}\left({s}_{b,q}\right){R}_{z}({s}_{b,q})H\right)\right)\left|{0}^{\otimes n}\rangle \right.,$$

(4)

and

$${U}_{{2}^{n}}^{\mathrm{ent}}:=\prod_{q=1}^{n-1}{\mathbf{C}\mathbf{N}\mathbf{O}\mathbf{T}}_{q,q+1}.$$

(5)

In the BPS wavefunction, a modest number of qubits can be entangled within each block (in our numerical simulations, $n$ was set to 2, 3, or 6); and for the wavefunction $\left|{\psi }_{b}\left({{\varvec{s}}}_{b}\right)\rangle \right.$, each component ${s}_{b,q}$ is encoded three times as the input angle for the rotation operator gates (i.e., ${s}_{b,q}$ is encoded in the ${R}_{z}$ gate, in the ${R}_{y}$ gate, and again in the ${R}_{z}$ gate in Eq. (4)). This kind of redundant encoding leads to better performance of QML models based on angle encoding³⁷. The state $\left|{\psi }_{b}\left({{\varvec{s}}}_{b}\right)\rangle \right.$ is related to matrix product states, a class of tensor networks that have been used for the study of ground states of quantum systems and recently for machine learning. The connectivity of qubits in Eq. (4) is limited to their nearest neighbors, resulting in $(n-1)$ interactions. The idea of BPS was originally used for ML models based on tensor networks³⁶; yet, to our knowledge, this kind of BPS-based quantum feature map has not been applied to quantum kernels. In this work, we will show that such a feature map can be used for QSVM. The kernel associated with the quantum feature map defined by Eq. (3) can be given by

$$K\left({{\varvec{x}}}^{(i)},{{\varvec{x}}}^{(j)}\right)={\left|\left\langle {\Psi }^{\mathrm{BPS}}\left({{\varvec{x}}}^{(i)}\right)|{\Psi }^{\mathrm{BPS}}\left({{\varvec{x}}}^{(j)}\right)\right\rangle \right|}^{2}=\prod_{b=1}^{m}{\left|\left\langle {\psi }_{b}\left({{\varvec{s}}}_{b}^{(i)}\right)|{\psi }_{b}\left({{\varvec{s}}}_{b}^{(j)}\right)\right\rangle \right|}^{2}.$$

(6)

The number of blocks $m$ can be varied in order to allow a larger number of input features depending on different datasets. Another interesting aspect is that the quantum kernel is not translation invariant, which means that the quantum kernel does not depend solely on the distance between input vectors, in contrast with Gaussian kernels. A computational benefit of our approach is that the calculation of the quantum kernel can be divided into $m$ computational tasks, allowing an efficient computation on classical computers. In particular, ${\left|\langle {\psi }_{b}^{i}|{\psi }_{b}^{j}\rangle \right|}^{2}$ in Eq. (6) can be computed separately; hence, each task can now be efficiently simulated through FPGA acceleration and the multiplication can then be performed on the CPU.
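The block structure of Eqs. (3)–(6) can be sketched as a small state-vector simulation in plain Python. This is an illustrative sketch and not the authors' implementation. It assumes the common conventions $R_z(\theta)=\mathrm{diag}(e^{-i\theta/2},e^{i\theta/2})$ and $R_y(\theta)=\begin{pmatrix}\cos\frac{\theta}{2}&-\sin\frac{\theta}{2}\\ \sin\frac{\theta}{2}&\cos\frac{\theta}{2}\end{pmatrix}$, treats qubit 1 as the most significant bit, and applies the CNOT chain of Eq. (5) in the order $q=1,\dots ,n-1$; none of these choices is stated in the text, and all function names are hypothetical.

```python
import cmath
import math

def single_qubit_state(s):
    """Amplitudes of Ry(s) Rz(s) H |0> under the assumed gate conventions."""
    a0 = a1 = 1.0 / math.sqrt(2.0)                                # H|0>
    a0, a1 = a0 * cmath.exp(-0.5j * s), a1 * cmath.exp(0.5j * s)  # Rz(s)
    c, sn = math.cos(s / 2.0), math.sin(s / 2.0)                  # Ry(s)
    return [c * a0 - sn * a1, sn * a0 + c * a1]

def block_state(s_block):
    """State vector of one block, Eq. (4); qubit 1 is the most significant bit."""
    n = len(s_block)
    state = [1.0 + 0.0j]
    for s in s_block:                          # first layer: product state
        q_amps = single_qubit_state(s)
        state = [amp * x for amp in state for x in q_amps]
    for q in range(n - 1):                     # CNOT chain of Eq. (5)
        control = 1 << (n - 1 - q)
        target = 1 << (n - 2 - q)
        swapped = list(state)
        for idx in range(len(state)):
            if idx & control:
                swapped[idx ^ target] = state[idx]
        state = swapped
    for idx in range(len(state)):              # final Rz layer is diagonal
        phase = sum((0.5 if (idx >> (n - 1 - q)) & 1 else -0.5) * s
                    for q, s in enumerate(s_block))
        state[idx] *= cmath.exp(1j * phase)
    return state

def block_overlap_sq(s_i, s_j):
    """|<psi_b(s_i)|psi_b(s_j)>|^2 for one block."""
    psi_i, psi_j = block_state(s_i), block_state(s_j)
    inner = sum(a.conjugate() * b for a, b in zip(psi_i, psi_j))
    return abs(inner) ** 2

def bps_kernel(x_i, x_j, n):
    """Eq. (6): product of per-block squared overlaps, n features per block."""
    if len(x_i) != len(x_j) or len(x_i) % n != 0:
        raise ValueError("inputs must have equal length divisible by n")
    k = 1.0
    for start in range(0, len(x_i), n):
        k *= block_overlap_sq(x_i[start:start + n], x_j[start:start + n])
    return k

# Hypothetical 4-feature inputs split into m = 2 blocks of n = 2 qubits.
x_i = [0.1, 0.2, 0.3, 0.4]
x_j = [0.5, -0.3, 0.2, 0.9]
print(bps_kernel(x_i, x_i, n=2))
print(bps_kernel(x_i, x_j, n=2))
```

Each call to `block_overlap_sq` is independent of the others, which mirrors the division of labour described in the text: the per-block overlaps are the part offloaded to the FPGA, and the final product is formed on the CPU.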

Quantum AI simulator using a hybrid CPU–FPGA approach

By co-designing the FPGA architecture and a quantum kernel given by a shallow quantum circuit, we implemented a fast and efficient quantum AI simulator using a heterogeneous computing approach (Fig. 1a). Details of computational resources in the cloud system (FPGA and CPU) are given in Methods. To begin with, using the principal component analysis (PCA) method³⁸ we conducted the dimensionality reduction of the $28\times 28$ image data from the Fashion-MNIST dataset³⁹; the number of input features can then be varied from $d=4$ to $d=780$. After obtaining PCA-reduced input vectors ${{\varvec{x}}}^{(i)}\in {\mathbb{R}}^{d}$, the input data are sent from the CPU to the internal memory of the FPGA hardware via PCI Express. Then, for each block wavefunction $\left|{\psi }_{b}\left({{\varvec{s}}}_{b}\right)\right.\rangle (b=1,\cdots ,m)$ of the quantum feature map, we calculate the squared norm of the inner product ${\left|\langle {\psi }_{b}^{i}|{\psi }_{b}^{j}\rangle \right|}^{2}$ (which is depicted in Fig. 1b) on our FPGA architecture in the following procedure: First, the sine and cosine of the input angles for quantum gates are computed using the COordinate Rotational DIgital Computer (CORDIC) algorithm⁴⁰. Second, the squared norm of the inner product is calculated using the unitary matrices in Eq. (4), together with an efficient implementation of $n$-qubit entanglement. (The procedure is described in detail in Methods and Supplementary Notes 1 and 2.) This process is repeated for all the pairs of data samples, namely, for ${N}^{2}/2$ cycles. The processed, real-valued data are sent back to the CPU. The kernel matrix element is then calculated by the multiplication of the $m$ block values. After all the kernel entries are obtained, the training phase of the SVM can be performed on the CPU platform. In the test process (Fig. 1c), prediction can be done using the same FPGA acceleration with $\mathcal{O}(ND)$ operations, where $D$ is the number of test data.
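The first step of the procedure, computing sines and cosines with CORDIC, can be illustrated with a floating-point emulation of the rotation-mode algorithm. This is only a sketch of the general technique: the iteration count of 16 is an assumption, the hardware version would use fixed-point shifts and adds instead of floating-point multiplications, and the actual FPGA design is described in the Supplementary Notes rather than here.

```python
import math

def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC for theta in [-pi/2, pi/2]; returns (sin, cos).

    Floating-point emulation: multiplying by 2**-i stands in for the
    bit shift that a hardware implementation would use.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Start from the pre-scaled vector (gain, 0) and rotate towards theta.
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x

s, c = cordic_sin_cos(0.5)
print(s, math.sin(0.5))
print(c, math.cos(0.5))
```

Angles outside $[-\pi /2,\pi /2]$ would need a range-reduction step before calling this function, which is omitted here.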

FPGA implementation: numerical precision and acceleration

Herein we validate our FPGA implementation in terms of numerical precision and acceleration. We begin by comparing the quantum kernel values obtained by the FPGA platform and those obtained by the CPU platform (Fig. 2a–c). The squared norms of the inner products ${\left|\langle {\psi }_{b}^{i}|{\psi }_{b}^{j}\rangle \right|}^{2}$ have values between 0 and 1. This property, along with a shallow circuit depth, is amenable to the use of 16-bit fixed-point arithmetic in our FPGA architecture, which in turn makes the calculation faster with efficient hardware utilization. We also employed 64-bit floating-point arithmetic in the CPU platform to validate our FPGA implementation. The parity plot suggests the success of our FPGA implementation of the quantum kernel (Fig. 2c). The numerical deviation between the two hardware platforms was approximately ±0.095%, indicating that there was a negligible loss of numerical accuracy.
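To illustrate why values bounded between 0 and 1 suit a narrow fixed-point format, the sketch below quantizes numbers to 15 fractional bits and multiplies them with rounding. The bit layout is an assumption made for illustration: the text states only that 16-bit fixed-point arithmetic was used, and the helper names are hypothetical.

```python
FRAC_BITS = 15  # assumed number of fractional bits; not specified in the text

def to_fixed(value):
    """Quantize a real number to an integer with FRAC_BITS fractional bits."""
    return int(round(value * (1 << FRAC_BITS)))

def from_fixed(raw):
    """Convert the integer representation back to a float."""
    return raw / (1 << FRAC_BITS)

def fixed_mul(a, b):
    """Multiply two fixed-point integers, rounding to nearest."""
    return (a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS

value = 0.3
print(abs(from_fixed(to_fixed(value)) - value))   # rounding error of one value
print(from_fixed(fixed_mul(to_fixed(0.5), to_fixed(0.5))))
```

With 15 fractional bits the rounding error of a single stored value is at most $2^{-16}$, so a short chain of operations, as in a shallow circuit, accumulates little error.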

Next, we compare the execution time for computing a kernel matrix (in the case of 6 entangled qubits) using the FPGA platform with the one obtained by our CPU implementation, as well as the one obtained by Qiskit Aer²¹, a QASM quantum computing simulator (Fig. 2d). Measurement is a vital aspect of the simulation process in the QASM simulator, which handles measurements by collapsing the state of the qubit according to the probabilities determined by the state of the qubit. Therefore, in the QASM simulator, a number of shots are required to obtain the expectation value. In our CPU implementation, the kernel matrix entry is obtained directly by calculating the inner product of the state vectors. In particular, we used NumPy⁴¹, which is a popular library for scientific computing and data analysis (note that the core of NumPy is implemented in C). For our particular tasks (in the case of 6 entangled qubits), the execution time of our CPU implementation is likely to be somewhat shorter than that of the state-vector simulator; this is because the state-vector simulator tracks the quantum state of the system as it evolves through the circuit, which slows down execution. Thus, the curve for the state-vector simulation would lie above the orange curve in Fig. 2d.

In our FPGA architecture, once the data are sent to the FPGA, we used only the internal memory of the FPGA hardware without accessing the external (off-chip) memory, which circumvents the associated communication overhead (for more details of our FPGA architecture, see Supplementary Note 2). In addition, two more factors are responsible for the FPGA acceleration. First, an FPGA allows each programmable logic block to perform a specific task simultaneously in an efficient manner. Second, an FPGA can be customized to perform specific tasks using a hardware description language, resulting in faster performance in comparison with CPUs.

In our FPGA implementation, all the kernel entries were computed in 4.1 ms at $N=1000$, and the execution time including CPU–FPGA communication overhead was 15.4 ms at $N=1000$. In other words, our FPGA implementation achieved a 1784× improvement in comparison with the CPU counterpart. Also, the execution time including the communication overhead was 472 times faster (Fig. 2d); moreover, in comparison with the execution by a QASM simulator (assuming that the computation cost grows as $\mathcal{O}\left({N}^{4}/{\varepsilon }^{2}\right)$ operations), a 10 million times speedup was accomplished at $N=400$ (Fig. 2d). The results show that our FPGA implementation is highly efficient in terms of the number of data samples, with a modest number of entangling qubits (up to 6 qubits) being used in our quantum feature map. Owing to the limitation of the internal memory and digital signal processors within an FPGA, however, our implementation technique will be prohibitive for $n$ larger than 8. Nonetheless, for our machine-learning tasks, this can be overcome by dividing input features into a number of blocks, so that each block's quantum kernel can be efficiently computed in the FPGA. Thus, the FPGA-based simulator accelerates the numerical simulations of QSVM using our quantum kernel and allows us to validate its applicability to much larger features in quantum kernel methods.
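As a consistency check on these figures, both speedup factors imply roughly the same CPU execution time at $N=1000$. The CPU time itself is inferred here from the reported ratios and is not quoted directly in the text:

$$t_{\mathrm{CPU}}\approx 1784\times 4.1\ \mathrm{ms}\approx 7.3\ \mathrm{s},\qquad t_{\mathrm{CPU}}\approx 472\times 15.4\ \mathrm{ms}\approx 7.3\ \mathrm{s}.$$

The communication overhead and the average compute time per pair of samples, taking the ${N}^{2}/2$ pairs stated in the procedure, follow in the same way:

$$t_{\mathrm{comm}}\approx 15.4\ \mathrm{ms}-4.1\ \mathrm{ms}=11.3\ \mathrm{ms},\qquad \frac{4.1\ \mathrm{ms}}{{N}^{2}/2}=\frac{4.1\times {10}^{-3}\ \mathrm{s}}{5\times {10}^{5}}\approx 8.2\ \mathrm{ns}.$$

On these numbers, data transfer rather than kernel computation accounts for most of the 15.4 ms end-to-end time.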

Binary classification on Fashion-MNIST dataset

Having shown the accuracy and efficiency of our FPGA-based quantum kernel estimation, we now turn to the performance of our quantum kernel. To begin with, we trained classical and quantum SVMs on Fashion-MNIST and obtained 45 binary classifiers. Among 45 pairs of binary classification tasks from 10 categories of Fashion-MNIST³⁹ (0, t-shirt/top; 1, trouser; 2, pullover; 3, dress; 4, coat; 5, sandals; 6, shirt; 7, sneaker; 8, bag; 9, ankle boots), about half the pairs were relatively easy to distinguish. On the other hand, pairs such as pullover versus shirt (2 vs. 6), pullover versus coat (2 vs. 4), and coat versus shirt (4 vs. 6) were more difficult to distinguish (e.g., the images of pullovers are more similar to those of coats than to those of trousers). Hence, we focused on these three binary classification tasks and investigated the performance in detail (Fig. 3).

The performance of our quantum kernel without introducing any hyperparameter was comparable to that of the Gaussian kernel $\mathrm{exp}\left(-\gamma {\Vert {{\varvec{x}}}^{(i)}-{{\varvec{x}}}^{(j)}\Vert }^{2}\right)$ with the optimized bandwidth $\gamma$, for dimensions smaller than $\sim$ 300 (Fig. 3a). Here, a key hyperparameter in the Gaussian kernel is the kernel bandwidth $\gamma$, which is known to affect the performance of kernel-based methods such as SVMs and is routinely optimized when SVMs are used in practice. The hyperparameter $\gamma$ controls the smoothness of the decision boundary in the SVM. Analogously, we introduced a scaling hyperparameter $\uplambda$ (i.e., ${{\varvec{x}}}^{(i)}\leftarrow\uplambda {{\varvec{x}}}^{(i)}$ in the quantum circuit) to improve the performance of QSVM. The role of $\uplambda$ appears to be similar to that of its classical counterpart. The hyperparameter $\uplambda$ calibrates the angles of the rotation gates and directly affects the quantum feature map in the Hilbert space. From a physical point of view, changing the hyperparameter $\uplambda$ in the quantum kernel is related to changing the total evolution time in the Hamiltonian evolution¹⁵. The best test accuracy for the quantum kernel was 0.87 at $d=180, 190, 200$, whereas that for the classical kernel with the optimal bandwidth was 0.88 at $d=190$. We found that introducing the scaling parameter $\uplambda$ improved the performance of our quantum kernel for larger dimensions ($d\gtrsim 300$), maintaining its comparable performance to the classical kernel, which is indicated by the blue dotted line in Fig. 3a (for the grid search over the hyperparameters of the classical and quantum kernels, see Supplementary Note 3).
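For reference, the classical baseline and the scaling step can be sketched in a few lines of plain Python. The function names are hypothetical and the numerical values are illustrative only; in particular, the values of $\gamma$ and $\uplambda$ below are not the optimized ones.

```python
import math

def gaussian_kernel(x_i, x_j, gamma):
    """exp(-gamma * ||x_i - x_j||^2), the classical baseline kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

def scale_features(x, lam):
    """Apply x <- lam * x before the features are used as rotation angles."""
    return [lam * v for v in x]

x_i, x_j = [0.0, 0.0], [1.0, 1.0]
print(gaussian_kernel(x_i, x_j, gamma=0.5))
# Shifting both inputs by the same vector leaves the Gaussian kernel unchanged.
print(gaussian_kernel([1.0, 2.0], [2.0, 3.0], gamma=0.5))
print(scale_features(x_j, lam=0.6))
```

Because the Gaussian kernel depends only on the difference of its inputs, rescaling the inputs by $\uplambda$ is equivalent to replacing $\gamma$ by ${\uplambda }^{2}\gamma$, which is one way to see why the two hyperparameters play comparable roles.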

The test accuracy obtained by our quantum kernel was improved by increasing the number of data samples $N$ (Fig. 3b). In particular, as $N$ was increased, the test accuracies for higher dimensional vectors tended to improve gradually (Fig. 3b). However, the relatively sharp drop for dimensions higher than $\sim$ 300 was difficult to overcome just by increasing $N$; nonetheless, the dimension $d$ that gave the best test accuracy was typically in the range between 100 and 200 for this particular application. We note that the drop in the test accuracy for higher dimensions can be overcome by optimizing the aforementioned scaling parameter (which will be discussed in multiclass classification).

The performance of our hyperparameter-free quantum kernel was competitive with the Gaussian kernel with the optimized bandwidth at $N>1500$ (Fig. 3c), which might be beneficial for practical applications. The best test accuracies at $N=2000$ and $N=3000$ were 0.89 and 0.90, respectively, for both kernels. For smaller numbers of data samples ($N<1000$), the performance of our quantum kernel was slightly lower than the best classical counterpart. To understand the role of quantum entanglement, we investigated the effects of enlarging the number of entangled qubits. Increasing the number of entangled qubits (from 2 to 6 qubits per block) did not significantly change the performance for PCA-reduced input vectors (Fig. 3d); this kind of insensitivity to quantum entanglement has been previously reported in an ML model based on tensor networks using BPS³⁶. Our results probably indicate that the capacity of our quantum feature map is already sufficiently high even in the case of $n=2$. However, this may not necessarily mean that quantum entanglement is unimportant; the CNOT entangling gate can make the quantum feature map more complex in comparison with no quantum entanglement. Overall, the behavior of our quantum kernel is quite different from the previously used quantum kernels^{9,11,12,13,14,15}. The results suggest that our quantum kernel is comparable to the best classical kernel with good generalization performance for a range of features.

Multiclass classification on Fashion-MNIST dataset

We also show the numerical results for 10-class classification on Fashion-MNIST. We trained our multiclass QSVMs using a one-versus-rest strategy. As was found in binary classification tasks, our quantum kernel was comparable to the best classical kernel (Fig. 4a). For multiclass classification using the quantum kernel, we found that it was important to introduce the scaling parameter. Hence, we performed a grid search for the scaling parameter $\uplambda$ for a range of features ($4<d<340$) (Fig. 4b). The optimal value for $\uplambda$ was 0.6 at $d=330$. On the other hand, the optimal value for $\upgamma$ of the Gaussian kernel was 2.5 at $d=330$. The confusion matrices for QSVM and SVM were similar to each other (Fig. 4c,d). The performance metrics for the quantum (classical) kernel were the following: accuracy, 0.855 (0.855); precision, 0.850 (0.853); recall, 0.855 (0.855); F-measure, 0.848 (0.851). We note that, among 45 pairs generated by 10 categories of Fashion-MNIST, about half the pairs of classification tasks were relatively easy to distinguish; hence, the difference in the test accuracy between the classical and the quantum kernels tended to decrease. The results suggest that our quantum kernel performed competitively with the best classical kernel in the multiclass classification task.

Discussion

In this study, we have implemented an application-specific quantum AI simulator using heterogeneous CPU–FPGA computing, which was achieved by co-designing the FPGA architecture and our quantum kernel. To this end, we have introduced a BPS structure as a quantum feature map for QSVM, where a small number of qubits are entangled in each block. This is the first demonstration of the FPGA implementation of a gate-based quantum kernel. The co-design of the quantum kernel and its efficient FPGA implementation have enabled us to perform one of the largest numerical simulations of QSVM in terms of input features, up to 780-dimensional data. In the literature, one of the largest simulations of quantum kernels in terms of qubit count was performed by Huang et al.¹¹, who used 30 qubits. For our particular study, increasing the number of entangled qubits is not a practical direction. Instead, our strategy is to divide input features into a number of blocks, so that each block's quantum kernel can be efficiently computed in the FPGA. By doing this, hundreds of features can be handled. Our approach is highly customized for our specific tasks at the hardware level; the focus of our simulator differs from that of a general-purpose quantum simulator, which is designed to be flexible and to perform various quantum algorithms. An application of our quantum kernel to feature dimensions larger than $\sim$ 1000 would be more challenging because off-diagonal kernel values could become much smaller. This limitation is related to our formalism of the quantum kernel, owing to the multiplication of many values that are less than one in Eq. (6). Nevertheless, the FPGA-based quantum kernel simulator has significantly accelerated our numerical simulations and allowed us to validate the applicability to QSVM with hundreds of input features. The quantum circuit presented in this work might have implications for co-designing quantum software and hardware and for developing application-specific quantum computers^{42,43}.
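The shrinking of off-diagonal kernel values can be illustrated numerically. In the sketch below, every block overlap is set to the same hypothetical value of 0.9, and $m=130$ blocks corresponds to 780 features split into blocks of 6; real per-block values vary with the data, so this indicates only the trend.

```python
import math

def kernel_from_blocks(block_values):
    """Eq. (6): the kernel entry is the product of the per-block overlaps."""
    return math.prod(block_values)

m = 780 // 6            # 130 blocks of n = 6 features (assumed split)
per_block = 0.9         # hypothetical uniform block overlap, for illustration
for blocks in (10, 50, m):
    print(blocks, kernel_from_blocks([per_block] * blocks))
```

Even with each block overlap as high as 0.9, the product falls by several orders of magnitude as the number of blocks grows, so the kernel matrix approaches the identity and carries less information about similarity between samples.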

We have demonstrated that the FPGA-based quantum kernel simulator was 470 times faster than the CPU implementation, without loss of accuracy. The numerical simulations show that our FPGA implementation is highly efficient in terms of the number of data samples (up to 4000), with a modest number of entangling qubits being used in the quantum feature map. We have applied our quantum kernel to image classification using Fashion-MNIST for a wide range of PCA-reduced features. The results suggest that our quantum kernel is comparable to the best classical kernel, with similar generalization performance for binary and multiclass classification tasks. In binary classification, our hyperparameter-free quantum kernel was comparable to the Gaussian kernels; whereas, in multiclass classification, the scaling parameter played a significant role in improving the performance of our quantum kernel, in line with recent studies^{15,44}.

Whether quantum kernels could perform better than classical kernels or have a practical advantage in real-world settings is still an open question. Our quantum kernel may be helpful for understanding the applicability of quantum kernels as well as their limitations. While our quantum kernel was applied to classification, the quantum kernel could be used for other kernel-based ML tasks, such as regression, spectral clustering, Gaussian process¹⁷, and causal discovery⁴⁵. With hundreds of input features being handled in our quantum kernel, other possible applications might include financial data, cheminformatics, and medical data. There is room for improvement in our quantum feature map. For instance, a recent approach based on the automatic design of quantum feature maps⁴⁶ may possibly improve our quantum feature map or reduce the number of quantum gates required. Nonetheless, our results might have implications for developing quantum-inspired algorithms and designing practical quantum kernels in the NISQ era.

Methods

FPGA implementation of the quantum kernel

We describe an approach for efficient simulation of our quantum kernel, which is particularly designed for our FPGA architecture. The quantum kernel is given by the inner product of the quantum feature map, which in principle requires $\mathcal{O}\left({2}^{3n}\right)$ operations, owing to the multiplication of ${2}^{n}\times {2}^{n}$ matrices to generate the quantum feature map. Such computational complexity becomes prohibitive for efficient FPGA implementation of quantum kernels, because FPGA architecture is memory-bound and the number of complex multipliers is limited. For that reason, efficient resource utilization of FPGA was crucial for calculating our quantum kernel. In this work, we employed a shallow quantum circuit so that we were able to calculate the quantum kernel with $\mathcal{O}({2}^{n})$ operations, as we will see below. This enabled efficient parallelization and the use of internal memory in FPGA.
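To give a sense of scale for the two operation counts, consider the largest block size used in the simulations, $n=6$. Constant factors are ignored, so this is an order-of-magnitude comparison only:

$${2}^{3n}={2}^{18}=262{,}144,\qquad {2}^{n}={2}^{6}=64,\qquad \frac{{2}^{3n}}{{2}^{n}}={2}^{2n}=4096.$$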

We consider the following quantum state:

$$|\psi\rangle = \left( V_{1} \otimes V_{2} \otimes \cdots \otimes V_{n} \right) U_{2^{n}}^{\mathrm{ent}} \left( U_{1} \otimes U_{2} \otimes \cdots \otimes U_{n} \right) |0^{\otimes n}\rangle,$$

(7)

where ${U}_{1}, {U}_{2},\cdots ,{U}_{n}$ and ${V}_{1},{V}_{2},\cdots ,{V}_{n}$ are single-qubit gates and ${U}_{{2}^{n}}^{\mathrm{ent}}:={\prod }_{q=1}^{n-1}{\mathbf{C}\mathbf{N}\mathbf{O}\mathbf{T}}_{q,q+1}$ represents an $n$-qubit entangling operation. For the sake of our discussion, it is convenient to rewrite $|\psi \rangle$ as ${\varvec{f}}=V{U}_{{2}^{n}}^{\mathrm{ent}}U{{\varvec{f}}}_{0}$ with ${{\varvec{f}}}_{0}$ being the vector ${\left[1,0,\cdots ,0\right]}^{\mathrm{T}}$, where $U: = U_{1} \otimes \cdots \otimes U_{n}$ and $V: = V_{1} \otimes \cdots \otimes V_{n}$. First, we note that, in the calculation of $U{{\varvec{f}}}_{0}$, only the first column of $U$ is needed; hence, $U{{\varvec{f}}}_{0}$ can be obtained without carrying out the full tensor product. Denoting the first column vector of each $2\times 2$ unitary matrix ${U}_{q}$ as ${\left[{\chi }_{1}^{(q)},{\chi }_{2}^{(q)}\right]}^{\mathrm{T}}$ and the first column vector of $U$ as ${\varvec{u}}={\left[{u}_{1},{u}_{2},\cdots ,{u}_{{2}^{n}}\right]}^{\mathrm{T}}\in {\mathbb{C}}^{{2}^{n}}$, we have

$$U{{\varvec{f}}}_{0}={\varvec{u}}=\left[\begin{array}{c}{\chi }_{1}^{(1)}\cdot \cdots \cdot {\chi }_{1}^{(n-2)}\cdot {\chi }_{1}^{(n-1)}\cdot {\chi }_{1}^{(n)}\\ {\chi }_{1}^{(1)}\cdot \cdots \cdot {\chi }_{1}^{(n-2)}\cdot {\chi }_{1}^{(n-1)}\cdot {\chi }_{2}^{(n)}\\ {\chi }_{1}^{(1)}\cdot \cdots \cdot {\chi }_{1}^{(n-2)}\cdot {\chi }_{2}^{(n-1)}\cdot {\chi }_{1}^{(n)}\\ {\chi }_{1}^{(1)}\cdot \cdots \cdot {\chi }_{1}^{(n-2)}\cdot {\chi }_{2}^{(n-1)}\cdot {\chi }_{2}^{(n)}\\ \vdots \\ {\chi }_{2}^{(1)}\cdot \cdots \cdot {\chi }_{2}^{(n-2)}\cdot {\chi }_{1}^{(n-1)}\cdot {\chi }_{1}^{(n)}\\ {\chi }_{2}^{(1)}\cdot \cdots \cdot {\chi }_{2}^{(n-2)}\cdot {\chi }_{1}^{(n-1)}\cdot {\chi }_{2}^{(n)}\\ {\chi }_{2}^{(1)}\cdot \cdots \cdot {\chi }_{2}^{(n-2)}\cdot {\chi }_{2}^{(n-1)}\cdot {\chi }_{1}^{(n)}\\ {\chi }_{2}^{(1)}\cdot \cdots \cdot {\chi }_{2}^{(n-2)}\cdot {\chi }_{2}^{(n-1)}\cdot {\chi }_{2}^{(n)}\end{array}\right].$$

(8)
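As a rough software illustration of Eq. (8), the vector ${\varvec{u}}$ can be assembled one qubit at a time from the first columns of the single-qubit gates, so that no ${2}^{n}\times {2}^{n}$ matrix is ever formed. The following NumPy sketch is not the FPGA design; the function names and the random test gates are hypothetical. The counter tallies one complex multiplication per newly formed amplitude, which gives $4+8+\cdots +{2}^{n}=4\cdot ({2}^{n-1}-1)$ in total.

```python
import numpy as np

def first_column_of_tensor_product(chis):
    """Return (u, n_mults), where u is the first column of U_1 (x) ... (x) U_n.

    chis[q] is the length-2 first column of the (q+1)-th single-qubit gate.
    """
    u = np.asarray(chis[0], dtype=complex)
    n_mults = 0
    for chi in chis[1:]:
        # Each existing amplitude is multiplied by both entries of the next column,
        # so the last qubit becomes the fastest-varying index, as in Eq. (8).
        u = np.kron(u, np.asarray(chi, dtype=complex))
        n_mults += u.size
    return u, n_mults

def random_unitary_2x2(rng):
    # Hypothetical test gate: the Q factor of a random complex matrix is unitary.
    a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, _ = np.linalg.qr(a)
    return q

rng = np.random.default_rng(0)
n = 4
gates = [random_unitary_2x2(rng) for _ in range(n)]
u, n_mults = first_column_of_tensor_product([g[:, 0] for g in gates])

# Reference for checking only: build the full 2^n x 2^n matrix U.
full = gates[0]
for g in gates[1:]:
    full = np.kron(full, g)
```

Comparing `u` with `full[:, 0]` confirms that the column-only construction agrees with the dense tensor product while touching only ${2}^{n}$ amplitudes.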

This calculation can be performed with $4\cdot ({2}^{n-1}-1)$ operations using complex multipliers in the FPGA (more details are given in Supplementary Fig. 5). The feature map can thus be rewritten as ${\varvec{f}}=V{U}_{{2}^{n}}^{\mathrm{ent}}{\varvec{u}}$. Next, we note that $V$ is a diagonal matrix in our quantum circuit and that ${U}_{{2}^{n}}^{\mathrm{ent}}$ is a sparse matrix, in which each row vector contains only one non-zero entry. By denoting the diagonal elements $\left\{{V}_{kk}\right\}$ as ${\varvec{v}}={\left[{v}_{1},{v}_{2},\cdots ,{v}_{{2}^{n}}\right]}^{\mathrm{T}}\in {\mathbb{C}}^{{2}^{n}}$, we can calculate ${\varvec{f}}$ as

$${f}_{k}={v}_{k}{u}_{{\upxi }_{k}}.$$

(9)

Here ${\upxi }_{k}$ is the index of the non-zero element in the $k$th row of ${U}_{{2}^{n}}^{\mathrm{ent}}$ (e.g., for $n=2$, ${\upxi }_{1}=1$, ${\upxi }_{2}=2$, ${\upxi }_{3}=4$, and ${\upxi }_{4}=3$). In general, ${U}_{{2}^{n}}^{\mathrm{ent}}$ can be calculated recursively by

$${U}_{{2}^{n+1}}^{\mathrm{ent}}=\left[\begin{array}{cc}{U}_{{2}^{n}}^{\mathrm{ent}}& {O}_{{2}^{n}}\\ {O}_{{2}^{n}}& {Y}_{{2}^{n}}\end{array}\right];\begin{array}{cc}{Y}_{{2}^{n+1}}=\left[\begin{array}{cc}{O}_{{2}^{n}}& {U}_{{2}^{n}}^{\mathrm{ent}}\\ {Y}_{{2}^{n}}& {O}_{{2}^{n}}\end{array}\right]& (n\ge 1)\end{array}.$$

(10)

where ${U}_{2}^{\mathrm{ent}}$ and ${Y}_{2}$ denote the $2\times 2$ identity matrix and the Pauli X matrix, respectively, and ${O}_{{2}^{n}}$ denotes the ${2}^{n}\times {2}^{n}$ zero matrix. The proof of the recurrence relation is given in Supplementary Note 1. The indices $\left\{{\upxi }_{k}\right\}$ in Eq. (9) can be determined once ${U}_{{2}^{n}}^{\mathrm{ent}}$ is obtained. Finally, the inner product $\langle {\psi }^{i}|{\psi }^{j}\rangle$ can be calculated by ${\sum }_{k}{f}_{k}^{*}\left({{\varvec{s}}}^{(i)}\right){f}_{k}\left({{\varvec{s}}}^{(j)}\right)$.
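The steps in Eqs. (9) and (10) can be sketched in NumPy as follows. This is an illustrative software analogue under stated assumptions, not the FPGA implementation: indices are 0-based (so the $n=2$ example reads 0, 1, 3, 2), the dense matrix is built only to read off the indices and to check the result, and the random vectors `u` and `v` are hypothetical stand-ins for the actual encoded amplitudes and diagonal phases.

```python
import numpy as np

def entangler(n):
    """Dense U^ent_{2^n} built from the recurrence in Eq. (10) (for checking only)."""
    U = np.eye(2, dtype=int)           # U^ent_2: 2x2 identity
    Y = np.array([[0, 1], [1, 0]])     # Y_2: Pauli X
    for _ in range(n - 1):
        O = np.zeros_like(U)
        # The right-hand side is evaluated with the old U and Y before assignment.
        U, Y = np.block([[U, O], [O, Y]]), np.block([[O, U], [Y, O]])
    return U

def xi_indices(n):
    """0-based column of the single non-zero entry in each row of U^ent_{2^n}."""
    return np.argmax(entangler(n), axis=1)

def feature_vector(u, v, xi):
    """Eq. (9): f_k = v_k * u_{xi_k}, i.e. one gather and 2^n multiplications."""
    return v * u[xi]

def overlap(f_i, f_j):
    """Inner product sum_k conj(f_i[k]) * f_j[k]."""
    return np.vdot(f_i, f_j)

n = 3
xi = xi_indices(n)
rng = np.random.default_rng(1)
# Hypothetical inputs: a normalized amplitude vector and unit-modulus phases.
u = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
u /= np.linalg.norm(u)
v = np.exp(1j * rng.uniform(0, 2 * np.pi, size=2**n))

f = feature_vector(u, v, xi)
f_dense = np.diag(v) @ entangler(n) @ u   # reference via explicit matrices
```

For the recurrence as written, the 0-based indices coincide with the Gray-code pattern `k ^ (k >> 1)`, so in a setting like this one the permutation could in principle be generated without forming any matrix; this is an observation about the sketch rather than a description of the authors' hardware.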

Details of computational resources

Our quantum AI simulator based on a hybrid CPU–FPGA system is implemented on the Amazon Web Services (AWS) Elastic Compute Cloud (EC2) platform, in which AWS EC2 F1 instances with AMD Xilinx FPGA hardware are accessible. In particular, we used the f1.2xlarge instance size, which has 1 FPGA, 8 vCPUs, and 122 GB of memory. More specifically, we used an AMD Xilinx Virtex™ UltraScale+™ VU9P FPGA and an Intel Xeon™ E5-2686 v4 with a base clock speed of 2.3 GHz as the vCPU. The details of our FPGA architecture and block diagrams are provided in Supplementary Note 2.

Machine learning

Here we provide the details of our ML models. Preprocessing was applied to the original data to make them suitable for quantum angle encoding: PCA was used to reduce the dimension of the $28\times 28$ original image data to $d$-dimensional input vectors ${{\varvec{x}}}^{(i)}\in {\mathbb{R}}^{d}$ (where $d$ was varied from 4 to 780), which were then transformed such that ${{\varvec{x}}}^{(i)}\in {[-1,1]}^{d}$. In the training of support vector classifiers, hinge loss was used as the loss function. Throughout the paper, the regularization parameter $C$ for soft margin SVM⁴⁷ was set to 1.0 for both classical and quantum ML models. For the multiclass classification task shown in Fig. 4, a one-versus-rest strategy was employed.
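The preprocessing described above can be sketched in plain NumPy. The text does not specify the exact transform used to map the PCA outputs into $[-1,1]$, so the per-feature min-max scaling below is an assumption, as are the function names; random numbers stand in for the flattened Fashion-MNIST images.

```python
import numpy as np

def pca_fit(X, d):
    """Return (mean, components) for a d-dimensional PCA of the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:d]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def minmax_fit(Z):
    return Z.min(axis=0), Z.max(axis=0)

def minmax_transform(Z, lo, hi):
    """Map each feature linearly so that the fitted range becomes [-1, 1]."""
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant features
    return 2.0 * (Z - lo) / span - 1.0

rng = np.random.default_rng(0)
X_train = rng.random((200, 28 * 28))   # placeholder for flattened 28x28 images
d = 4
mean, comps = pca_fit(X_train, d)
Z_train = pca_transform(X_train, mean, comps)
lo, hi = minmax_fit(Z_train)
x_train = minmax_transform(Z_train, lo, hi)
```

In this sketch the PCA and scaling parameters are fitted on the training set only and would then be reused for test data, so test features may fall slightly outside $[-1,1]$.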

To compare the performance of our quantum kernel with its classical counterpart, we used the Gaussian kernel, which is given by $\mathrm{exp}\left(-\gamma {\Vert {{\varvec{x}}}^{(i)}-{{\varvec{x}}}^{(j)}\Vert }^{2}\right)$, with $\gamma$ being a hyperparameter. To obtain the optimal test accuracy, we performed a grid search over the bandwidth parameter $\gamma$:

$$\gamma \in \left\{0.001, 0.1, 0.25, 0.5, 0.75, 1, 1.25, 2.5, 3.75, 5, 6.25, 7.5, 8.75, 10, 50, 100, 1000\right\}.$$

(11)
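For reference, the Gaussian kernel and the grid in Eq. (11) can be written down in a few lines of NumPy. This is a minimal sketch: the function name and the random placeholder inputs are hypothetical, and the SVM training and accuracy evaluation that would select $\gamma$ are omitted.

```python
import numpy as np

GAMMA_GRID = [0.001, 0.1, 0.25, 0.5, 0.75, 1, 1.25, 2.5, 3.75, 5,
              6.25, 7.5, 8.75, 10, 50, 100, 1000]

def gaussian_gram(XA, XB, gamma):
    """K[i, j] = exp(-gamma * ||XA[i] - XB[j]||^2)."""
    sq = (np.sum(XA**2, axis=1)[:, None]
          + np.sum(XB**2, axis=1)[None, :]
          - 2.0 * XA @ XB.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5, 4))   # placeholder inputs in [-1, 1]^d
grams = {g: gaussian_gram(X, X, g) for g in GAMMA_GRID}
```

Each Gram matrix could then be passed to an SVM solver that accepts a precomputed kernel, with the test accuracy deciding which $\gamma$ is kept.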

It is also possible to introduce a hyperparameter in our quantum feature map $\left|{\Psi }^{\mathrm{BPS}}\left({\varvec{x}}\right)\rangle \right.$. In this work, we consider that the input vector ${\varvec{x}}$ can be scaled by $\uplambda$ (i.e., ${{\varvec{x}}}^{(i)}\leftarrow\uplambda {{\varvec{x}}}^{(i)}$), which is similar to the approach taken in recent work¹⁵. Thus, we performed a grid search over the scaling parameter:

$$\uplambda \in \left\{0.001, 0.1, 0.25, 0.5, 0.75, 1, 1.25, 2.5, 3.75, 5, 6.25, 7.5, 8.75, 10, 50, 100, 1000\right\}.$$

(12)

The effect of the scaling parameter $\uplambda$ was somewhat different from that of $\gamma$. In particular, we found that, for binary classification, the case of $\uplambda =1$ typically gave near-optimal performance (see also Supplementary Note 3), implying that our quantum kernel gave a reasonable performance without introducing any hyperparameter. Nonetheless, to further optimize the value of $\uplambda$, we narrowed the range for $\uplambda$ and performed another grid search over the scaling parameter:

$$\uplambda \in \left\{0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1, 1.05, 1.1, 1.15, 1.2, 1.25, 1.3, 1.35, 1.4, 1.45, 1.5, 1.55, 1.6\right\}.$$

(13)
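The refined grid in Eq. (13) is a uniform sweep from 0.5 to 1.6 in steps of 0.05, and the search itself is a plain argmax over candidate values. The sketch below is generic: `grid_search` and `toy_score` are hypothetical names, and in an actual run the score function would rescale the inputs by $\uplambda$, evaluate the quantum kernel, train the SVM, and return the test accuracy.

```python
def fine_lambda_grid(start=0.5, stop=1.6, step=0.05):
    """Values start, start + step, ..., stop, rounded to avoid float drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(count)]

def grid_search(values, score_fn):
    """Return (best_value, best_score) maximizing score_fn over values."""
    scored = [(score_fn(v), v) for v in values]
    best_score, best_value = max(scored)
    return best_value, best_score

def toy_score(lam):
    # Hypothetical stand-in for a test-accuracy evaluation; peaks at lam = 1.
    return -(lam - 1.0) ** 2

lambdas = fine_lambda_grid()
best_lambda, best_score = grid_search(lambdas, toy_score)
```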

We found that the test accuracy was slightly improved from 0.870 to 0.875 in binary classification (see also Supplementary Note 4) and that the use of the scaling parameter $\uplambda$ played an important role in multiclass classification.

Data availability

All the datasets used in the present study are publicly available at https://github.com/zalandoresearch/fashion-mnist. We cited the reference of the data in the manuscript.

References

1. Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information, 10th Anniversary Edition (Cambridge University Press, 2010).
2. Woerner, S. & Egger, D. J. Quantum risk analysis. NPJ Quantum Inf. 5, 15. https://doi.org/10.1038/s41534-019-0130-6 (2019).
3. Cao, Y. et al. Quantum chemistry in the age of quantum computing. Chem. Rev. 119, 10856–10915 (2019).
4. Rebentrost, P., Mohseni, M. & Lloyd, S. Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014).
5. Liu, Y., Arunachalam, S. & Temme, K. A rigorous and robust quantum speed-up in supervised machine learning. Nat. Phys. 17, 1013–1017 (2021).
6. Biamonte, J. et al. Quantum machine learning. Nature 549, 195–202 (2017).
7. Mitarai, K., Negoro, M., Kitagawa, M. & Fujii, K. Quantum circuit learning. Phys. Rev. A 98, 032309 (2018).
8. Schuld, M. & Killoran, N. Quantum machine learning in feature Hilbert spaces. Phys. Rev. Lett. 122, 040504 (2019).
9. Havlíček, V. et al. Supervised learning with quantum-enhanced feature spaces. Nature 567, 209–212 (2019).
10. Benedetti, M., Lloyd, E., Sack, S. & Fiorentini, M. Parameterized quantum circuits as machine learning models. Quantum Sci. Technol. 4, 043001 (2019).
11. Huang, H.-Y. et al. Power of data in quantum machine learning. Nat. Commun. 12, 2631. https://doi.org/10.1038/s41467-021-22539-9 (2021).
12. Peters, E. et al. Machine learning of high dimensional data on a noisy quantum processor. NPJ Quantum Inf. 7, 161. https://doi.org/10.1038/s41534-021-00498-9 (2021).
13. Hubregtsen, T. et al. Training quantum embedding kernels on near-term quantum computers. Preprint at https://arxiv.org/abs/2105.02276 (2021).
14. Jerbi, S. et al. Quantum machine learning beyond kernel methods. Preprint at https://arxiv.org/abs/2110.13162 (2021).
15. Shaydulin, R. & Wild, S. M. Importance of kernel bandwidth in quantum machine learning. Preprint at https://arxiv.org/abs/2111.05451 (2021).
16. Kusumoto, T., Mitarai, K., Fujii, K., Kitagawa, M. & Negoro, M. Experimental quantum kernel trick with nuclear spins in a solid. NPJ Quantum Inf. 7, 94. https://doi.org/10.1038/s41534-021-00423-0 (2021).
17. Moradi, S. et al. Error mitigation for quantum kernel based machine learning methods on IonQ and IBM quantum computers. Preprint at https://arxiv.org/abs/2206.01573 (2022).
18. Postler, L. et al. Demonstration of fault-tolerant universal quantum gate operations. Nature 605, 675–680 (2022).
19. Preskill, J. Quantum computing in the NISQ era and beyond. Quantum 2, 79. https://doi.org/10.22331/q-2018-08-06-79 (2018).
20. Bharti, K. et al. Noisy intermediate-scale quantum algorithms. Rev. Mod. Phys. 94, 015004 (2022).
21. Aleksandrowicz, G. et al. Qiskit: An open-source framework for quantum computing. Qiskit. https://doi.org/10.5281/ZENODO.2562111 (2019).
22. Guo, C. et al. General-purpose quantum circuit simulator with projected entangled-pair states and the quantum supremacy frontier. Phys. Rev. Lett. 123, 190501 (2019).
23. Wang, Z. et al. A quantum circuit simulator and its applications on Sunway TaihuLight supercomputer. Sci. Rep. 11, 355. https://doi.org/10.1038/s41598-020-79777-y (2021).
24. Suzuki, Y. et al. Qulacs: A fast and versatile quantum circuit simulator for research purpose. Quantum 5, 559. https://doi.org/10.22331/q-2021-10-06-559 (2021).
25. Efthymiou, S. et al. Qibo: A framework for quantum simulation with hardware acceleration. Quantum Sci. Technol. 7, 015018 (2022).
26. Vincent, T. et al. Jet: Fast quantum circuit simulations with parallel task-based tensor-network contraction. Quantum 6, 709. https://doi.org/10.22331/q-2022-05-09-709 (2022).
27. Nguyen, T. et al. Tensor network quantum virtual machine for simulating quantum circuits at exascale. Preprint at https://arxiv.org/abs/2104.10523 (2021).
28. Khalid, A. U., Zilic, Z. & Radecka, K. FPGA emulation of quantum circuits. In IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD) (2004).
29. Lee, Y. H., Khalil-Hani, M. & Marsono, M. N. An FPGA-based quantum computing emulation framework based on serial-parallel architecture. Int. J. Reconfigurable Comput. 2016, 5718124 (2016).
30. Mahmud, N. & El-Araby, E. A scalable high-precision and high-throughput architecture for emulation of quantum algorithms. In 2018 31st IEEE International System-on-Chip Conference (SOCC) (2018).
31. Mahmud, N., El-Araby, E. & Caliga, D. Scaling reconfigurable emulation of quantum algorithms at high precision and high throughput. Quantum Eng. 1, e19. https://doi.org/10.1002/que2.19 (2019).
32. Pilch, J. & Długopolski, J. An FPGA-based real quantum computer emulator. J. Comput. Electron. 18, 329–342 (2019).
33. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
34. Schölkopf, B. & Smola, A. J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (MIT Press, 2002).
35. Bremner, M. J., Jozsa, R. & Shepherd, D. J. Classical simulation of commuting quantum computations implies collapse of the polynomial hierarchy. Proc. R. Soc. A 467, 459 (2011).
36. Martyn, J., Vidal, G., Roberts, C. & Leichenauer, S. Entanglement and tensor networks for supervised image classification. Preprint at https://arxiv.org/abs/2007.06082 (2020).
37. Suzuki, T. & Katouda, M. Predicting toxicity by quantum machine learning. J. Phys. Commun. 4, 125012. https://doi.org/10.1088/2399-6528/abd3d8 (2020).
38. Subasi, A. & Gursoy, M. I. EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl. 37, 8659–8666 (2010).
39. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at https://arxiv.org/abs/1708.07747 (2017).
40. Volder, J. E. The CORDIC trigonometric computing technique. IRE Trans. Electron. Comput. 3, 330–334 (1959).
41. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
42. Li, G. et al. On the co-design of quantum software and hardware. In Proceedings of the Eight Annual ACM International Conference on Nanoscale Computing and Communication (2021).
43. Tomesh, T. & Martonosi, M. Quantum codesign. IEEE Micro 41, 33–40 (2021).
44. Canatar, A., Peters, E., Pehlevan, C., Wild, S. M. & Shaydulin, R. Bandwidth enables generalization in quantum kernel models. Preprint at https://arxiv.org/abs/2206.06686 (2022).
45. Kawaguchi, H. Application of quantum computing to a linear non-Gaussian acyclic model for novel medical knowledge discovery. Preprint at https://arxiv.org/abs/2110.04485 (2021).
46. Altares-López, S., Ribeiro, A. & García-Ripoll, J. J. Automatic design of quantum feature maps. Quantum Sci. Technol. 6, 045015 (2021).
47. Chang, C. C. & Lin, C. J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011).

Acknowledgements

We thank Hideki Asoh (National Institute of Advanced Industrial Science and Technology) for useful discussions.

Author information

Authors and Affiliations

Research and Development Center, SCSK Corporation, Toyosu Front, 3-2-20 Toyosu, Koto-ku, Tokyo, 135-8110, Japan
Teppei Suzuki, Tsubasa Miyazaki, Toshiki Inaritai & Takahiro Otsuka

Contributions

T.S. and T.O. conceived the concept of co-designing the quantum kernel and the FPGA implementation. T.S. conceived the idea of the quantum kernel for image classification in this work. T.M. developed the computer code for the quantum kernel simulator. T.I. executed the FPGA implementation and performed the numerical simulations. T.S., T.I., and T.M. analyzed the results of the quantum support vector machines. T.S. wrote the manuscript. All the authors commented on the manuscript.

Corresponding author

Correspondence to Teppei Suzuki.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Suzuki, T., Miyazaki, T., Inaritai, T. et al. Quantum AI simulator using a hybrid CPU–FPGA approach. Sci Rep 13, 7735 (2023). https://doi.org/10.1038/s41598-023-34600-2

Received: 21 September 2022
Accepted: 04 May 2023
Published: 12 May 2023
DOI: https://doi.org/10.1038/s41598-023-34600-2
