
A graph placement methodology for fast chip design

20 September 2023 Editor’s Note: Readers are alerted that the performance claims in this article have been called into question. The Editors are investigating these concerns, and, if appropriate, editorial action will be taken once this investigation is complete.

An Author Correction to this article was published on 31 March 2022

This article has been updated


Chip floorplanning is the engineering task of designing the physical layout of a computer chip. Despite five decades of research [1], chip floorplanning has defied automation, requiring months of intense effort by physical design engineers to produce manufacturable layouts. Here we present a deep reinforcement learning approach to chip floorplanning. In under six hours, our method automatically generates chip floorplans that are superior or comparable to those produced by humans in all key metrics, including power consumption, performance and chip area. To achieve this, we pose chip floorplanning as a reinforcement learning problem, and develop an edge-based graph convolutional neural network architecture capable of learning rich and transferable representations of the chip. As a result, our method utilizes past experience to become better and faster at solving new instances of the problem, allowing chip design to be performed by artificial agents with more experience than any human designer. Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators, and has the potential to save thousands of hours of human effort for each new generation. Finally, we believe that more powerful AI-designed hardware will fuel advances in AI, creating a symbiotic relationship between the two fields.

Fig. 1: Overview of our method and training regimen.
Fig. 2: Policy and value network architecture.
Fig. 3: Training from scratch versus fine-tuning for varying amounts of time.
Fig. 4: Convergence plots on Ariane RISC-V CPU.
Fig. 5: Effect of pre-training dataset size.

Data availability

The data supporting the findings of this study are available within the paper and the Extended Data.

Code availability

The code used to generate these data is available in the following GitHub repository: https://github.com/google-research/circuit_training.

Change history

  • 26 October 2021

    The editors of Nature have been informed that the code behind this paper is currently unavailable. The authors are currently working to migrate the code to an open-source platform, and we will update the paper with access instructions once this process is completed.

  • 01 April 2022

    The earlier issue of code availability has now been resolved and a correction notice was published on 31 March 2022 with the link to the GitHub repository.

  • 22 April 2022

    The Peer Review File for this article was included as a Supplementary Information file.

  • 20 September 2023

    Editor’s Note: Readers are alerted that the performance claims in this article have been called into question. The Editors are investigating these concerns, and, if appropriate, editorial action will be taken once this investigation is complete.

  • 31 March 2022

    A Correction to this paper has been published: https://doi.org/10.1038/s41586-022-04657-6


Acknowledgements

This project was a collaboration between Google Brain and the Google Chip Implementation and Infrastructure (CI2) Team. We thank M. Bellemare, C. Young, E. Chi, C. Stratakos, S. Roy, A. Yazdanbakhsh, N. Myung-Chul Kim, S. Agarwal, B. Li, S. Bae, A. Babu, M. Abadi, A. Salek, S. Bengio and D. Patterson for their help and support.

Author contributions

A.G. and A.M. are co-first authors and the order of the names was determined by coin flip. M.Y., J.W.J., E.S., S.W. and Y.-J.L. were major contributors to this work. The following authors contributed to the overall evaluation and provided insights on physical design: E.J., O.P., A.N., J.P., A.T., K.S., W.H. and E.T. The following authors managed and advised on the project: Q.V.L., J.L., R.H., R.C. and J.D.

Correspondence to Azalia Mirhoseini or Anna Goldie.

Competing interests

The following US patents are related to this work: ‘Generating integrated circuit floorplans using neural networks’ (granted as US10699043) and ‘Domain adaptive reinforcement learning approach to macro placement’ (filed).

Peer review information Nature thanks Jakob Foerster and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Evaluation workflow for producing the results in Table 1.

We allow each method access to the same clustered netlist hypergraph. We use the same hyperparameters (to the extent possible) in all the methods. Once the placement is completed by each method (this includes the legalization step for RePlAce), we snap the macros to the power grids, freeze the macro locations and use a commercial EDA tool to place the standard cells and report the final results.
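The snap-and-freeze step in this workflow can be illustrated with a minimal sketch. It assumes a uniform grid described by a horizontal and a vertical pitch, which the caption does not specify; `Macro`, `snap` and `snap_macros` are hypothetical names chosen for illustration and are not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a snapped macro location cannot be modified afterwards
class Macro:
    name: str
    x: float  # lower-left x coordinate, arbitrary units (assumption)
    y: float  # lower-left y coordinate, arbitrary units (assumption)


def snap(value: float, pitch: float, origin: float = 0.0) -> float:
    """Move a coordinate to the nearest grid line of the given pitch."""
    return origin + round((value - origin) / pitch) * pitch


def snap_macros(macros, pitch_x: float, pitch_y: float):
    """Return new, immutable macros whose corners lie on the grid."""
    return [Macro(m.name, snap(m.x, pitch_x), snap(m.y, pitch_y)) for m in macros]


placed = [Macro("m0", 10.3, 11.2)]
snapped = snap_macros(placed, pitch_x=2.0, pitch_y=2.0)
print(snapped)
```

In this sketch the frozen dataclass stands in for fixing the macro locations before a separate tool places the standard cells; a real flow would also have to check for overlaps after snapping.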

Extended Data Fig. 2 Zero-shot performance of Edge-GNN versus GCN (graph convolutional neural network) [77].

The agent with an Edge-GNN architecture is more robust to over-fitting and yields higher-quality results, as measured by average zero-shot performance on the test blocks shown in Extended Data Fig. 1.

Extended Data Fig. 3 Generalization performance as a function of pre-training dataset size.

We pre-train the policy network on three different training datasets (the small dataset with 2 blocks is a subset of the medium one with 5 blocks, and the medium dataset is a subset of the large one with 20 blocks). For each policy, at various snapshots during pre-training we report its inference performance on an unseen test block. As the dataset size increases, both the quality of generated placements on the test block and the generalization performance of the policy improve. The policy trained on the largest dataset is most robust to over-fitting.

Extended Data Fig. 4 Visualization of Ariane placements.

Left, zero-shot placements from the pre-trained policy; right, placements from the fine-tuned policy. The zero-shot placements are generated at inference time on a previously unseen chip. The pre-trained policy network (with no fine-tuning) reserves a convex hull in the centre of the canvas in which standard cells can be placed, a behaviour that reduces wirelength and that emerges only after many hours of fine-tuning in the policy trained from scratch.

Extended Data Fig. 5 Visualization of a real TPU chip.

Human expert placements are shown on the left and results from our approach are shown on the right. The white area represents macros and the green area represents standard cells. The figures are intentionally blurred because the designs are proprietary. The wirelength for the human expert design is 57.07 m, whereas ours is 55.42 m. Furthermore, our method achieves these results in 6 h, whereas the manual baseline took several weeks.
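The size of the wirelength difference quoted in this caption can be checked with a few lines of arithmetic. The snippet uses only the two figures given above; the variable names are ours.

```python
# Wirelengths quoted in the caption, in metres.
human_wirelength_m = 57.07
rl_wirelength_m = 55.42

# Absolute and relative reduction of the RL result versus the human baseline.
absolute_reduction_m = human_wirelength_m - rl_wirelength_m
relative_reduction = absolute_reduction_m / human_wirelength_m

print(f"{absolute_reduction_m:.2f} m shorter ({relative_reduction:.1%})")
```

On these figures the reported placement is 1.65 m shorter, a reduction of roughly 2.9% relative to the human expert design.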

Extended Data Table 1 Hyperparameters used for fine-tuning the RL agent
Extended Data Table 2 Hyperparameters used for the FD algorithm that places standard cell clusters
Extended Data Table 3 Hyperparameters used to generate standard cell clusters with hMETIS [32]
Extended Data Table 4 Effect of different cost trade-offs on the post-PlaceOpt performance of Block 1 in Table 1
Extended Data Table 5 Sensitivity of results to the choice of random seed, as measured on an Ariane RISC-V block
Extended Data Table 6 Performance of our method compared to SA

Mirhoseini, A., Goldie, A., Yazgan, M. et al. A graph placement methodology for fast chip design. Nature 594, 207–212 (2021). https://doi.org/10.1038/s41586-021-03544-w

