Search | arXiv e-print repository

Kubric: A scalable dataset generator

Authors: Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, et al. (10 additional authors not shown)

Abstract: Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential to address these shortcomings: 1) it is cheap 2) supports rich ground-truth annotations 3) offers full control over data and 4) can circumvent or mitigate problems regarding bias, privacy and licensing. Unfortunately, software tools for effective data generation are less mature than those for architecture design and training, which leads to fragmented generation efforts. To address these problems we introduce Kubric, an open-source Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines, and generating TBs of data. We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation. We release Kubric, the used assets, all of the generation code, as well as the rendered datasets for reuse and modification.

Submitted 7 March, 2022; originally announced March 2022.

Comments: 21 pages, CVPR2022

arXiv:2002.02481 [pdf, other]

Sensitivity Analysis in the Dupire Local Volatility Model with Tensorflow

Authors: Francois Belletti, Davis King, James Lottes, Yi-Fan Chen, John Anderson

Abstract: In a recent paper, we have demonstrated how the affinity between TPUs and multi-dimensional financial simulation resulted in fast Monte Carlo simulations that could be set up in a few lines of python Tensorflow code. We also presented a major benefit from writing high performance simulations in an automated differentiation language such as Tensorflow: a single line of code enabled us to estimate sensitivities, i.e. the rate of change in price of a financial instrument with respect to another input such as the interest rate, the current price of the underlying, or volatility. Such sensitivities (otherwise known as the famous financial "Greeks") are fundamental for risk assessment and risk mitigation. In the present follow-up short paper, we extend the developments exposed in our previous work about the use of Tensor Processing Units and Tensorflow for TPUs.

Submitted 6 February, 2020; originally announced February 2020.

arXiv:1906.02818 [pdf, other]

Tensor Processing Units for Financial Monte Carlo

Authors: Francois Belletti, Davis King, Kun Yang, Roland Nelet, Yusef Shafi, Yi-Fan Chen, John Anderson

Abstract: Monte Carlo methods are critical to many routines in quantitative finance such as derivatives pricing, hedging and risk metrics. Unfortunately, Monte Carlo methods are very computationally expensive when it comes to running simulations in high-dimensional state spaces where they are still a method of choice in the financial industry. Recently, Tensor Processing Units (TPUs) have provided considerable speedups and decreased the cost of running Stochastic Gradient Descent (SGD) in Deep Learning. After highlighting computational similarities between training neural networks with SGD and simulating stochastic processes, we ask in the present paper whether TPUs are accurate, fast and simple enough to use for financial Monte Carlo. Through a theoretical reminder of the key properties of such methods and thorough empirical experiments we examine the fitness of TPUs for option pricing, hedging and risk metrics computation. In particular we demonstrate that, in spite of the use of mixed precision, TPUs still provide accurate estimators which are fast to compute when compared to GPUs. We also show that the Tensorflow programming model for TPUs is elegant, expressive and simplifies automated differentiation.

Submitted 27 January, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

arXiv:1905.09874 [pdf, other]

Scaling Up Collaborative Filtering Data Sets through Randomized Fractal Expansions

Authors: Francois Belletti, Karthik Lakshmanan, Walid Krichene, Nicolas Mayoraz, Yi-Fan Chen, John Anderson, Taylor Robie, Tayo Oguntebi, Dan Shirron, Amit Bleiwess

Abstract: Recommender system research suffers from a disconnect between the size of academic data sets and the scale of industrial production systems. In order to bridge that gap, we propose to generate large-scale user/item interaction data sets by expanding pre-existing public data sets. Our key contribution is a technique that expands user/item incidence matrices to large numbers of rows (users), columns (items), and non-zero values (interactions). The proposed method adapts Kronecker Graph Theory to preserve key higher order statistical properties such as the fat-tailed distribution of user engagements, item popularity, and singular value spectra of user/item interaction matrices. Preserving such properties is key to building large realistic synthetic data sets which in turn can be employed reliably to benchmark recommender systems and the systems employed to train them. We further apply our stochastic expansion algorithm to the binarized MovieLens 20M data set, which comprises 20M interactions between 27K movies and 138K users. The resulting expanded data set has 1.2B ratings, 2.2M users, and 855K items, which can be scaled up or down.

Submitted 8 April, 2019; originally announced May 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1901.08910

ACM Class: I.2.6; H.3.3

arXiv:1905.09414 [pdf, other]

Quantifying Long Range Dependence in Language and User Behavior to improve RNNs

Authors: Francois Belletti, Minmin Chen, Ed H. Chi

Abstract: Characterizing temporal dependence patterns is a critical step in understanding the statistical properties of sequential data. Long Range Dependence (LRD) --- referring to long-range correlations decaying as a power law rather than exponentially w.r.t. distance --- demands a different set of tools for modeling the underlying dynamics of the sequential data. While it has been widely conjectured that LRD is present in language modeling and sequential recommendation, the amount of LRD in the corresponding sequential datasets has not yet been quantified in a scalable and model-independent manner. We propose a principled estimation procedure of LRD in sequential datasets based on established LRD theory for real-valued time series and apply it to sequences of symbols with million-item-scale dictionaries. In our measurements, the procedure estimates reliably the LRD in the behavior of users as they write Wikipedia articles and as they interact with YouTube. We further show that measuring LRD better informs modeling decisions in particular for RNNs whose ability to capture LRD is still an active area of research. The quantitative measure informs new Evolutive Recurrent Neural Networks (EvolutiveRNNs) designs, leading to state-of-the-art results on language understanding and sequential recommendation tasks at a fraction of the computational cost.

Submitted 22 May, 2019; originally announced May 2019.

arXiv:1902.08588 [pdf, other]

doi 10.1145/3308558.3313650

Towards Neural Mixture Recommender for Long Range Dependent User Sequences

Authors: Jiaxi Tang, Francois Belletti, Sagar Jain, Minmin Chen, Alex Beutel, Can Xu, Ed H. Chi

Abstract: Understanding temporal dynamics has proved to be highly valuable for accurate recommendation. Sequential recommenders have been successful in modeling the dynamics of users and items over time. However, while different model architectures excel at capturing various temporal ranges or dynamics, distinct application contexts require adapting to diverse behaviors. In this paper we examine how to build a model that can make use of different temporal ranges and dynamics depending on the request context. We begin with the analysis of an anonymized Youtube dataset comprising millions of user sequences. We quantify the degree of long-range dependence in these sequences and demonstrate that both short-term and long-term dependent behavioral patterns co-exist. We then propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution to deal with both short-term and long-term dependencies. Our approach employs a mixture of models, each with a different temporal range. These models are combined by a learned gating mechanism capable of exerting different model combinations given different contextual information. In empirical evaluations on a public dataset and our own anonymized YouTube dataset, M3 consistently outperforms state-of-the-art sequential recommendation methods.

Submitted 22 February, 2019; originally announced February 2019.

Comments: Accepted at WWW 2019

arXiv:1901.08910 [pdf, other]

Scalable Realistic Recommendation Datasets through Fractal Expansions

Authors: Francois Belletti, Karthik Lakshmanan, Walid Krichene, Yi-Fan Chen, John Anderson

Abstract: Recommender System research suffers currently from a disconnect between the size of academic data sets and the scale of industrial production systems. In order to bridge that gap we propose to generate more massive user/item interaction data sets by expanding pre-existing public data sets. User/item incidence matrices record interactions between users and items on a given platform as a large sparse matrix whose rows correspond to users and whose columns correspond to items. Our technique expands such matrices to larger numbers of rows (users), columns (items) and non zero values (interactions) while preserving key higher order statistical properties. We adapt the Kronecker Graph Theory to user/item incidence matrices and show that the corresponding fractal expansions preserve the fat-tailed distributions of user engagements, item popularity and singular value spectra of user/item interaction matrices. Preserving such properties is key to building large realistic synthetic data sets which in turn can be employed reliably to benchmark Recommender Systems and the systems employed to train them. We provide algorithms to produce such expansions and apply them to the MovieLens 20 million data set comprising 20 million ratings of 27K movies by 138K users. The resulting expanded data set has 10 billion ratings, 864K items and 2 million users in its smaller version and can be scaled up or down. A larger version features 655 billion ratings, 7 million items and 17 million users.

Submitted 20 February, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

ACM Class: I.2.6; H.3.3

arXiv:1812.02353 [pdf, other]

Top-K Off-Policy Correction for a REINFORCE Recommender System

Authors: Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, Ed Chi

Abstract: Industrial recommender systems deal with extremely large action spaces -- many millions of items to recommend. Moreover, they need to serve billions of users, who are unique at any point in time, making a complex user state space. Luckily, huge quantities of logged implicit feedback (e.g., user clicks, dwell time) are available for learning. Learning from the logged feedback is however subject to biases caused by only observing feedback on recommendations selected by the previous versions of the recommender. In this work, we present a general recipe of addressing such biases in a production top-K recommender system at Youtube, built with a policy-gradient-based algorithm, i.e. REINFORCE. The contributions of the paper are: (1) scaling REINFORCE to a production recommender system with an action space on the orders of millions; (2) applying off-policy correction to address data biases in learning from logged feedback collected from multiple behavior policies; (3) proposing a novel top-K off-policy correction to account for our policy recommending multiple items at a time; (4) showcasing the value of exploration. We demonstrate the efficacy of our approaches through a series of simulations and multiple live experiments on Youtube.

Submitted 14 December, 2021; v1 submitted 6 December, 2018; originally announced December 2018.

arXiv:1701.08832 [pdf, other]

Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning

Authors: Francois Belletti, Daniel Haziza, Gabriel Gomes, Alexandre M. Bayen

Abstract: This article shows how the recent breakthroughs in Reinforcement Learning (RL) that have enabled robots to learn to play arcade video games, walk or assemble colored bricks, can be used to perform other tasks that are currently at the core of engineering cyberphysical systems. We present the first use of RL for the control of systems modeled by discretized non-linear Partial Differential Equations (PDEs) and devise a novel algorithm to use non-parametric control techniques for large multi-agent systems. We show how neural network based RL enables the control of discretized PDEs whose parameters are unknown, random, and time-varying. We introduce an algorithm of Mutual Weight Regularization (MWR) which alleviates the curse of dimensionality of multi-agent control schemes by sharing experience between agents while giving each agent the opportunity to specialize its action policy so as to tailor it to the local parameters of the part of the system it is located in.

Submitted 30 January, 2017; originally announced January 2017.

arXiv:1603.03336 [pdf, other]

Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Authors: Francois W. Belletti, Evan R. Sparks, Michael J. Franklin, Alexandre M. Bayen, Joseph E. Gonzalez

Abstract: Linear causal analysis is central to a wide range of important applications spanning finance, the physical sciences, and engineering. Much of the existing literature in linear causal analysis operates in the time domain. Unfortunately, the direct application of time domain linear causal analysis to many real-world time series presents three critical challenges: irregular temporal sampling, long range dependencies, and scale. Moreover, real-world data is often collected at irregular time intervals across vast arrays of decentralized sensors and with long range dependencies which make naive time domain correlation estimators spurious. In this paper we present a frequency domain based estimation framework which naturally handles irregularly sampled data and long range dependencies while enabling memory and communication efficient distributed processing of time series data. By operating in the frequency domain we eliminate the need to interpolate and help mitigate the effects of long range dependencies. We implement and evaluate our new work-flow in the distributed setting using Apache Spark and demonstrate on both Monte Carlo simulations and high-frequency financial trading that we can accurately recover causal structure at scale.

Submitted 10 March, 2016; originally announced March 2016.

arXiv:1511.06493 [pdf, other]

Embarrassingly Parallel Time Series Analysis for Large Scale Weak Memory Systems

Authors: Francois Belletti, Evan Sparks, Michael Franklin, Alexandre M. Bayen

Abstract: Second order stationary models in time series analysis are based on the analysis of essential statistics whose computations follow a common pattern. In particular, with a map-reduce nomenclature, most of these operations can be modeled as mapping a kernel that only depends on short windows of consecutive data and reducing the results produced by each computation. This computational pattern stems from the ergodicity of the model under consideration and is often referred to as weak or short memory when it comes to data indexed with respect to time. In the following we will show how studying weak memory systems can be done in a scalable manner thanks to a framework relying on specifically designed overlapping distributed data structures that enable fragmentation and replication of the data across many machines as well as parallelism in computations. This scheme has been implemented for Apache Spark but is certainly not system specific. Indeed we prove it is also adapted to leveraging high bandwidth fragmented memory blocks on GPUs.

Submitted 20 November, 2015; originally announced November 2015.

MSC Class: 68M14; 37M10; 62M10

arXiv:0811.2864 [pdf, ps, other]

doi 10.1007/s10955-009-9727-z

An in-depth view of the microscopic dynamics of Ising spin glasses at fixed temperature

Authors: Janus Collaboration, F. Belletti, A. Cruz, L. A. Fernandez, A. Gordillo-Guerrero, M. Guidetti, A. Maiorano, F. Mantovani, E. Marinari, V. Martin-Mayor, J. Monforte, A. Muñoz Sudupe, D. Navarro, G. Parisi, S. Perez-Gaviro, J. J. Ruiz-Lorenzo, S. F. Schifano, D. Sciretti, A. Tarancon, R. Tripiccione, D. Yllanes

Abstract: Using the dedicated computer Janus, we follow the nonequilibrium dynamics of the Ising spin glass in three dimensions for eleven orders of magnitude. The use of integral estimators for the coherence and correlation lengths allows us to study dynamic heterogeneities and the presence of a replicon mode and to obtain safe bounds on the Edwards-Anderson order parameter below the critical temperature. We obtain good agreement with experimental determinations of the temperature-dependent decay exponents for the thermoremanent magnetization. This magnitude is observed to scale with the much harder to measure coherence length, a potentially useful result for experimentalists. The exponents for energy relaxation display a linear dependence on temperature and reasonable extrapolations to the critical point. We conclude examining the time growth of the coherence length, with a comparison of critical and activated dynamics.

Submitted 18 November, 2008; originally announced November 2008.

Comments: 38 pages, 26 figures

Journal ref: J. Stat. Phys. 135, 1121-1158 (2009)

arXiv:0804.1471 [pdf, ps, other]

doi 10.1103/PhysRevLett.101.157201

Nonequilibrium spin glass dynamics from picoseconds to 0.1 seconds

Authors: F. Belletti, M. Cotallo, A. Cruz, L. A. Fernandez, A. Gordillo-Guerrero, M. Guidetti, A. Maiorano, F. Mantovani, E. Marinari, V. Martin-Mayor, A. Munoz Sudupe, D. Navarro, G. Parisi, S. Perez-Gaviro, J. J. Ruiz-Lorenzo, S. F. Schifano, D. Sciretti, A. Tarancon, R. Tripiccione, J. L. Velasco, D. Yllanes

Abstract: We study numerically the nonequilibrium dynamics of the Ising Spin Glass, for a time that spans eleven orders of magnitude, thus approaching the experimentally relevant scale (i.e. seconds). We introduce novel analysis techniques that allow us to compute the coherence length in a model-independent way. Besides, we present strong evidence for a replicon correlator and for overlap equivalence. The emerging picture is compatible with non-coarsening behavior.

Submitted 5 September, 2008; v1 submitted 9 April, 2008; originally announced April 2008.

Comments: 4 pages, 4 eps color figures. Version accepted for publication in Phys. Rev. Lett

Journal ref: Phys. Rev. Lett. 101 (2008) 157201

arXiv:0710.3535 [pdf, other]

doi 10.1109/MCSE.2009.11

JANUS: an FPGA-based System for High Performance Scientific Computing

Authors: F. Belletti, M. Cotallo, A. Cruz, L. A. Fernández, A. Gordillo, M. Guidetti, A. Maiorano, F. Mantovani, E. Marinari, V. Martín-Mayor, A. Muñoz-Sudupe, D. Navarro, G. Parisi, S. Pérez-Gaviro, M. Rossi, J. J. Ruiz-Lorenzo, S. F. Schifano, D. Sciretti, A. Tarancón, R. Tripiccione, J. L. Velasco

Abstract: This paper describes JANUS, a modular massively parallel and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neighbor data links. Processors are also directly connected to an I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for, but not limited to, the requirements of a class of hard scientific applications characterized by regular code structure, unconventional data manipulation instructions and not too large data-base size. We discuss the architecture of this configurable machine, and focus on its use on Monte Carlo simulations of statistical mechanics. On this class of application JANUS achieves impressive performances: in some cases one JANUS processing element outperforms high-end PCs by a factor ~ 1000. We also discuss the role of JANUS on other classes of scientific applications.

Submitted 8 April, 2008; v1 submitted 18 October, 2007; originally announced October 2007.

Comments: 11 pages, 6 figures. Improved version, largely rewritten, submitted to Computing in Science & Engineering

Journal ref: Computing in Science & Engineering 11 (2009) 48-58

arXiv:0710.2442 [pdf, ps, other]

QCD on the Cell Broadband Engine

Authors: F. Belletti, G. Bilardi, M. Drochner, N. Eicker, Z. Fodor, D. Hierl, H. Kaldass, T. Lippert, T. Maurer, N. Meyer, A. Nobile, D. Pleiter, A. Schaefer, F. Schifano, H. Simma, S. Solbrig, T. Streuer, R. Tripiccione, T. Wettig

Abstract: We evaluate IBM's Enhanced Cell Broadband Engine (BE) as a possible building block of a new generation of lattice QCD machines. The Enhanced Cell BE will provide full support of double-precision floating-point arithmetics, including IEEE-compliant rounding. We have developed a performance model and applied it to relevant lattice QCD kernels. The performance estimates are supported by micro- and application-benchmarks that have been obtained on currently available Cell BE-based computers, such as IBM QS20 blades and PlayStation 3. The results are encouraging and show that this processor is an interesting option for lattice QCD applications. For a massively parallel machine on the basis of the Cell BE, an application-optimized network needs to be developed.

Submitted 12 October, 2007; originally announced October 2007.

Comments: 7 pages, 3 figures, contribution to Lattice 2007 (Regensburg, Germany)

Journal ref: PoS LAT2007:039, 2007

arXiv:0704.3573 [pdf, ps, other]

doi 10.1016/j.cpc.2007.09.006

Simulating spin systems on IANUS, an FPGA-based computer

Authors: F. Belletti, M. Cotallo, A. Cruz, L. A. Fernández, A. Gordillo, A. Maiorano, F. Mantovani, E. Marinari, V. Martín-Mayor, A. Muñoz-Sudupe, D. Navarro, S. Pérez-Gaviro, J. J. Ruiz-Lorenzo, S. F. Schifano, D. Sciretti, A. Tarancón, R. Tripiccione, J. L. Velasco

Abstract: We describe the hardwired implementation of algorithms for Monte Carlo simulations of a large class of spin models. We have implemented these algorithms as VHDL codes and we have mapped them onto a dedicated processor based on a large FPGA device. The measured performance on one such processor is comparable to O(100) carefully programmed high-end PCs: it turns out to be even better for some selected spin models. We describe here codes that we are currently executing on the IANUS massively parallel FPGA-based system.

Submitted 26 April, 2007; originally announced April 2007.

Comments: 19 pages, 8 figures; submitted to Computer Physics Communications

Journal ref: Computer Physics Communications, 178 (3), p.208-216, (2008)

arXiv:cond-mat/0507270 [pdf, ps, other]

Ianus: an Adaptive FPGA Computer

Authors: Ianus Collaboration, F. Belletti, I. Campos, A. Cruz, L. A. Fernandez, S. Jimenez, A. Maiorano, F. Mantovani, E. Marinari, V. Martin-Mayor, D. Navarro, A. Munoz-Sudupe, S. Perez Gaviro, G. Poli, J. J. Ruiz-Lorenzo, F. Schifano, D. Sciretti, A. Tarancon, P. Tellez, R. Tripiccione, J. L. Velasco

Abstract: Dedicated machines designed for specific computational algorithms can outperform conventional computers by several orders of magnitude. In this note we describe Ianus, a new generation FPGA based machine and its basic features: hardware integration and wide reprogrammability. Our goal is to build a machine that can fully exploit the performance potential of new generation FPGA devices. We also plan a software platform which simplifies its programming, in order to extend its intended range of application to a wide class of interesting and computationally demanding problems. The decision to develop a dedicated processor is a complex one, involving careful assessment of its performance lead, during its expected lifetime, over traditional computers, taking into account their performance increase, as predicted by Moore's law. We discuss this point in detail.

Submitted 12 July, 2005; originally announced July 2005.

Journal ref: Computing in Science & Engineering 8, 41-49 (2006)

Showing 1–17 of 17 results for author: Belletti, F