
Short quantum circuits in reinforcement learning policies for the vehicle routing problem

Fabio Sanches, Sean Weinberg, Takanori Ide, and Kazumitsu Kamiya
Phys. Rev. A 105, 062403 – Published 3 June 2022

Abstract

Quantum computing and machine learning have potential for symbiosis. However, in addition to the hardware limitations of current devices, there are still basic issues that must be addressed before quantum circuits can be usefully incorporated into current machine learning tasks. We report a strategy for such an integration in the context of attention models used for reinforcement learning. Agents that implement attention mechanisms have successfully been applied to certain cases of combinatorial routing problems by first encoding nodes on a graph and then sequentially decoding nodes until a route is selected. We demonstrate that simple quantum circuits can be used in place of classical attention head layers while maintaining performance. Our method modifies attention mechanisms by replacing the key and query vectors for every node with quantum states that are entangled before being measured. The resulting hybrid classical-quantum agent is tested in the context of vehicle routing problems, where its performance is competitive with the original classical approach. We regard our model as a prototype that can be scaled up and as an avenue for further study of the role of quantum computing in reinforcement learning.
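The article text itself is subscription-gated, but the core idea the abstract describes can be illustrated in miniature: a classical attention head scores each node by a key-query dot product, and the quantum variant replaces that score with a measured overlap of quantum states prepared from the key and query. The sketch below is purely our own toy construction, not the authors' circuit: it encodes a key and a query as single-qubit RY states and uses their squared overlap as the compatibility (the paper's actual circuits additionally entangle the key and query registers before measurement, per Fig. 2).

```python
import math

def ry_state(theta):
    """Single-qubit state RY(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def compatibility(theta_k, theta_q):
    """Toy quantum key-query compatibility: the squared overlap
    |<psi_k|psi_q>|^2 between the RY-encoded key and query states.
    This is what a measurement-based circuit would estimate by
    sampling; here we compute it exactly from the statevectors."""
    psi_k = ry_state(theta_k)
    psi_q = ry_state(theta_q)
    overlap = psi_k[0] * psi_q[0] + psi_k[1] * psi_q[1]
    return overlap ** 2

# Sanity check against the closed form cos^2((theta_k - theta_q)/2):
theta_k, theta_q = 0.7, 1.9
p = compatibility(theta_k, theta_q)
assert abs(p - math.cos((theta_k - theta_q) / 2) ** 2) < 1e-12
```

In an attention layer, such compatibilities would be computed for every key-query pair and normalized (e.g., by a softmax) into attention weights; on hardware the overlap is estimated from repeated measurements rather than computed exactly.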

  • Received 18 December 2021
  • Accepted 5 May 2022

DOI: https://doi.org/10.1103/PhysRevA.105.062403

©2022 American Physical Society

Physics Subject Headings (PhySH)

Quantum Information, Science & Technology
Interdisciplinary Physics

Authors & Affiliations

Fabio Sanches1,*, Sean Weinberg1,*, Takanori Ide2, and Kazumitsu Kamiya3

  • 1QC Ware Corporation, Palo Alto, California 94306, USA
  • 2Aisin Corporation, Tokyo Research Center, Chiyoda-ku 101-0021, Tokyo, Japan
  • 3Aisin Technical Center of America, San Jose, California 95110, USA

  • *These authors contributed equally to this work.

Issue

Vol. 105, Iss. 6 — June 2022


Images

  • Figure 1

    Structure of the encoder in our model. This structure differs from [6] only in the heads H1–H6, which are quantum rather than classical attention heads.
  • Figure 2

    The short quantum circuit used to compute a quantum key-query compatibility.
  • Figure 3

    Comparison of training results for quantum and classical attention heads with synthetic data. Quantum circuits are simulated during the training process. The lower plot averages the four quantum and the four classical training runs. Note that every data point in both plots is an epoch average over episodes.
  • Figure 4

    Locations of the 288 suppliers in the Aisin supply chain data. These suppliers are scattered across Japan. Note that the largest cluster is in the vicinity of Anjō, where the depot is located.
  • Figure 5

    Performance of our quantum attention head model on Aisin Group powertrain supply chain data. The performance is less consistent than on synthetic data due to the awkward distribution of suppliers.
  • Figure 6

    The connectivity of the Rigetti Aspen-8 hardware at the time of our experimentation.
  • Figure 7

    Our qubit mapping methodology for implementing the short quantum circuit of the attention head (left) on physical qubits on a line. By taking advantage of the larger number of qubits, we are able to remove the error associated with swap gates by doubling the number of query qubits. In this figure, blue qubits are keys and yellow qubits are (redundantly encoded) queries.
  • Figure 8

    Selection of qubits on the Rigetti Aspen-8 chip for five attention heads in parallel. The groups of qubits indicated with black curves correspond directly to the construction shown in Fig. 7.
  • Figure 9

    Single VRP episode performed on Rigetti Aspen-8. The green diamond is the depot and red hexagons are customer nodes. Nodes lie in the unit square (the thick black square around the figure) and distances are Euclidean. The numbers near nodes are initial demands. The truck has capacity 1 and follows the route indicated by the arrows. We emphasize that this single episode is only meant as a proof of concept for the feasibility of running our model on hardware. The fact that the agent selected the optimal route is not statistically meaningful, since cost constraints limited our ability to repeat the experiment.