Parallel Monte Carlo

CSC 5551 Parallel and Distributed Systems


William Roche
April 10, 2006

http://ouray.cudenver.edu/~wrroche/Parellel_Processing/ClassProject/Presentation.html
What is Monte Carlo?
• "Monte Carlo method" is a very general
term. It covers any program that uses
random numbers to approximate the
solution to a problem.
The Idea’s Spark
• First proposed as a computer algorithm by
Nicholas Metropolis and Stanislaw Ulam,
1949, Los Alamos, while working on the
Manhattan Project.
– While ill, Ulam wondered about the probability
of winning at solitaire. It seemed computers of
the time might be able to simulate 100-1000
games and give an accurate estimate.
– He then applied this idea to neutron diffusion
and other complex problems in mathematics
and physics.
Buffon's needle
• Even earlier, in 1777, Mathematician Comte de
Buffon wondered, ‘If a needle were dropped
randomly on a table with parallel lines, what is the
probability that the needle will cross one of the
lines?’

• That outcome will occur only if $A < l \sin\theta$

• Probability $= \dfrac{2l}{\pi d}$
(l = needle length, d = distance between lines)
The Experiment
n     m     l   d   surface      π estimate
500   236   3   4   stationary   3.1780
530   253   3   4   rotated      3.1423
590   939   5   2   rotated      3.1416
(n = needles dropped, m = line crossings, l = needle length, d = distance between lines)

• Rotating the surface while dropping the needles eliminated
biases in position. This improved the
randomness of the experiment.
• Increasing the needle length and decreasing the distance
between lines takes advantage of variance reduction,
further improving the estimate for a given experiment.
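
The needle experiment can be repeated in software. The following is a minimal sketch, not the procedure used to produce the table: it assumes the needle centre and angle are uniformly distributed, counts every line crossing (so it also covers needles longer than the line spacing), and inverts the expected crossing rate 2l/(πd). The function name `buffon_pi` and the seed are illustrative choices.

```python
import math
import random

def buffon_pi(n, l, d, seed=0):
    """Drop n needles of length l on lines spaced d apart.

    Returns (m, estimate): m is the total number of line crossings and
    estimate is the resulting approximation of pi.
    """
    rng = random.Random(seed)
    m = 0
    for _ in range(n):
        y = rng.uniform(0.0, d)            # position of the needle centre between lines
        theta = rng.uniform(0.0, math.pi)  # angle to the lines (uses pi, fine for a demo)
        half = 0.5 * l * math.sin(theta)   # half-extent of the needle across the lines
        # Lines sit at multiples of d; count how many the needle spans.
        m += math.floor((y + half) / d) - math.floor((y - half) / d)
    # Expected crossings per drop is 2l / (pi d), so pi is about 2 l n / (m d).
    return m, 2.0 * l * n / (m * d)

if __name__ == "__main__":
    for l, d in [(3, 4), (5, 2)]:
        m, estimate = buffon_pi(200_000, l, d)
        print(f"l={l} d={d} crossings={m} pi~{estimate:.4f}")
```

A physical experiment does not need to know π to pick an angle; the simulation does, which is why this is only an illustration of the estimator and not a way to compute π.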
Monte Carlo Applications
• Random Processes
– Nuclear Physics, Astrophysics…
– Radiation treatments and chemotherapy
– Weather Prediction / Environmental Research
– Many companies use MC simulation to estimate both
the average return and the risk of new products
– Eli Lilly uses simulation to determine the optimal plant
capacity which should be built for each drug
– Financial analysis.
– Semiconductor Device research
– Computer Graphics
– …
Monte Carlo for Non-Random Calculations

$$\int_0^1 \sqrt{r^2 - x^2}\,dx = \frac{\pi r^2}{4} = \frac{\pi}{4} \approx 0.7854 \qquad (r = 1)$$

Ratio of random points inside the circle to all random points in the enclosing square ≈ π/4
Same method can be used to integrate very complex functions
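
A hit-or-miss estimate of this area can be sketched in a few lines. The sketch assumes a unit quarter circle inside the unit square; the function name and seed are illustrative.

```python
import random

def quarter_circle_fraction(n, seed=0):
    """Fraction of n uniform points in the unit square that fall inside
    the quarter circle x^2 + y^2 <= 1. The exact area is pi/4."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside / n

if __name__ == "__main__":
    print(quarter_circle_fraction(200_000))
```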
Monte Carlo Applications
• Non Random Processes.
– Evaluation of complex integrals.
– Solutions to inverse/non-inverse simultaneous
equations.
– Differential Equations
–…
Components of Monte Carlo Routines
• Probability Distribution Functions
• Random Number Generator
• Sampling Rule
• Evaluating
• Error Estimation
• Variance Reduction Techniques
• Parallelization and vectorization
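
Several of these components fit in one short routine. The sketch below is an illustration under simple assumptions (uniform sampling on [a, b], plain averaging, standard error from the sample variance); the name `mc_integrate` is hypothetical.

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform samples.

    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)          # random number generator
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        fx = f(rng.uniform(a, b))      # sampling rule: uniform on [a, b]
        total += fx                    # evaluating / tallying
        total_sq += fx * fx
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)   # sample variance of f
    width = b - a
    return width * mean, width * math.sqrt(var / n)  # error estimation

if __name__ == "__main__":
    est, err = mc_integrate(math.exp, 0.0, 1.0, 100_000)
    print(f"integral of exp on [0,1] ~ {est:.4f} +/- {err:.4f}")
```

The standard error shrinks like 1/sqrt(n), which is what variance reduction techniques and parallel execution both try to improve on for a fixed time budget.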
Random Number Generator
Properties of good RNGs
• Reproducible
– To debug programs
– To debug simulation models
– For documentation purposes.
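
Reproducibility in practice means that the same seed yields the same stream. A minimal sketch using Python's standard `random.Random` (the seed values are arbitrary):

```python
import random

def draw(seed, count):
    """Return `count` uniform numbers from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

if __name__ == "__main__":
    # Same seed: identical sequence, so a failing run can be replayed exactly.
    print(draw(12345, 3) == draw(12345, 3))
    # Different seed: a different sequence.
    print(draw(12345, 3) == draw(54321, 3))
```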
Random Number Generator
Properties of good RNGs
• Uncorrelated / Unpredictable
[Figure: two scatter plots of generated random numbers. Left panel: 2^15 points, both axes 0.46 to 0.54. Right panel: 2^25 points, both axes 0.496 to 0.504.]
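
Correlation in a poor generator can be checked directly. The slides do not say which generator produced the plots; as a stand-in, the sketch below uses RANDU, a well-known flawed linear congruential generator, whose consecutive triples satisfy an exact linear relation and therefore fall on a small number of planes.

```python
def randu(seed, count):
    """RANDU: x_{k+1} = 65539 * x_k mod 2^31 (the seed should be odd)."""
    xs = []
    x = seed
    for _ in range(count):
        x = (65539 * x) % 2**31
        xs.append(x)
    return xs

def triples_are_linearly_related(xs):
    """True if every consecutive triple satisfies
    x_{k+2} = 6*x_{k+1} - 9*x_k (mod 2^31)."""
    return all(
        (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
        for k in range(len(xs) - 2)
    )

if __name__ == "__main__":
    print(triples_are_linearly_related(randu(1, 1000)))
```

The relation follows from 65539 = 2^16 + 3, whose square is congruent to 6 * 65539 - 9 modulo 2^31.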


Random Number Generator
Properties of good RNGs
• Long period length
– Reduces the effects of correlations in RNGs.
– Short period lengths create problems when massively
parallel processes generate many random numbers.
– Short period lengths can lead to unexpected overlap
when starting with different seed values.
– Any overlap can result in tighter than expected
distributions → incorrect results.
• The RNG period length should be several orders
of magnitude larger than the number of values that will be used.
Random Number Generator
Properties of good RNGs
Name                                           Period
Ranrot                                         432099 - 184256986
Prime Modulus Linear Congruential Generator    2^64
Multiplicative Lagged Fibonacci                2^81
Marsaglia Generator                            2^110
R250                                           2^250
Combined Multiple Recursive Generator          2^219
Mersenne Twister                               2^19937
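
The "several orders of magnitude" rule can be turned into a quick check. In this sketch the process count and draws per process are hypothetical numbers chosen only for illustration; the periods are taken from the table.

```python
import math

def period_headroom(period, n_processes, draws_per_process):
    """Orders of magnitude (base 10) by which the generator period
    exceeds the total number of values the run will consume."""
    used = n_processes * draws_per_process
    return math.log10(period) - math.log10(used)

if __name__ == "__main__":
    # Hypothetical run: 1024 processes, 10^9 draws each.
    for name, period in [("2^64 LCG", 2**64), ("Mersenne Twister", 2**19937)]:
        print(name, round(period_headroom(period, 1024, 10**9), 1))
```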
Random Number Generator
Properties of good RNGs
• Parallel streams produced on different
processors should be uncorrelated
– Use different RNGs on different processors.
– Assign different sub-streams of one large
RNG to different processors.
– Use the leapfrog approach.
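
The last two options can be sketched with a toy generator. This is an illustration only: the base generator is a small linear congruential generator with commonly quoted textbook constants, and each process here walks the base stream from the start, whereas a real implementation would jump ahead without generating the skipped values. The function names are hypothetical.

```python
from itertools import islice

def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Toy base generator (illustrative constants, not a recommendation)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def leapfrog(seed, rank, size, count):
    """Process `rank` of `size` takes elements rank, rank+size, rank+2*size, ..."""
    return list(islice(lcg(seed), rank, rank + size * count, size))

def block(seed, rank, block_len):
    """Process `rank` takes the rank-th contiguous sub-stream of length block_len."""
    return list(islice(lcg(seed), rank * block_len, (rank + 1) * block_len))

if __name__ == "__main__":
    for rank in range(3):
        print(rank, leapfrog(7, rank, 3, 4), block(7, rank, 4))
```

Both schemes hand out disjoint parts of one sequence, so no value is used twice as long as the total demand stays below the period.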
Random Number Generator
Properties of good RNGs
• Reproducible
• Uncorrelated / Unpredictable
• Long Period Length.
• Computationally Efficient
• Portable
• Require Limited Memory
• Parallel streams produced on different
processors should be uncorrelated
Examples
Managing Portfolios

• Input investments, assets, retirement accounts &
duration.
• Assign nominal return rates, minimum,
maximum and standard deviations.
• Estimate inflation rates.
• Run 100 or 1000 examples and see what
happens.
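
The steps above can be sketched as a small simulation. Everything numeric here is a hypothetical input, and the model (normally distributed yearly returns clipped to the minimum and maximum, constant inflation) is an assumption made for illustration, not a description of any particular planning tool.

```python
import random

def simulate_portfolio(start_value, years, mean_return, std_return,
                       min_return, max_return, inflation, trials, seed=0):
    """Return the sorted inflation-adjusted final values of `trials` runs."""
    rng = random.Random(seed)
    finals = []
    for _ in range(trials):
        value = start_value
        for _ in range(years):
            r = rng.gauss(mean_return, std_return)
            r = max(min_return, min(max_return, r))   # clip to the stated range
            value *= (1.0 + r) / (1.0 + inflation)    # convert to real terms
        finals.append(value)
    finals.sort()
    return finals

def percentile(sorted_values, q):
    """Value at quantile q (0 <= q <= 1) of an already sorted list."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]

if __name__ == "__main__":
    # Hypothetical inputs: 100,000 invested for 30 years.
    finals = simulate_portfolio(100_000, 30, 0.07, 0.15, -0.40, 0.40, 0.03, 1000)
    for q in (0.1, 0.5, 0.9):
        print(f"{int(q * 100)}th percentile: {percentile(finals, q):,.0f}")
```

Looking at the 10th and 90th percentiles, not only the median, is what "see what happens" amounts to: the spread is the risk estimate.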
Derivative Security Pricing

• Reducing the standard deviation of the estimate Ĉ by a factor of 10
requires a 100-fold increase in the number of samples N.
• When dealing with large funds with options depending on several
underlying assets, the problems can become extremely large.
• Vectorizing this problem is easy
– Problems in finance often require the solution of integrals with
many degrees of freedom.
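
The 1/sqrt(N) behaviour is easy to observe numerically. The sketch below prices a single European call under the usual lognormal (Black-Scholes) model; that model, the parameter values and the function name are assumptions for illustration and are not taken from the slides.

```python
import math
import random

def mc_call_price(s0, k, r, sigma, t, n, seed=0):
    """Monte Carlo price of a European call; returns (price, standard_error)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    discount = math.exp(-r * t)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))  # terminal asset price
        payoff = discount * max(s_t - k, 0.0)                   # discounted payoff
        total += payoff
        total_sq += payoff * payoff
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(var / n)

if __name__ == "__main__":
    for n in (2_000, 200_000):   # a 100-fold increase in N
        price, err = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, n)
        print(f"N={n}: price~{price:.3f}, std error~{err:.3f}")
```

Each path is independent of the others, which is why the loop vectorizes and parallelizes so readily.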
Semiconductor Device Research
• Semiconductor devices are built at the
atomic level. Simulating the transistors
and processes that make them is complex.
• Two topics are reviewed:
– Ion implantation.
– Device simulation: charge transport through
a transistor.
Ion Implant
2-dimensional simulation of particles entering at one point.

20 keV B+ into 900 Å SiO2 on Si, simulated on SRIM2003, N = 99,999

• To form a transistor, different atoms are accelerated through
electric fields and allowed to collide with certain regions in silicon.
• Atoms with different energies are thus
‘implanted’ at different depths.
Ion Implant
• Parallel Monte Carlo simulations make three-dimensional
simulations possible.
– Simulations have been run by distributing individual trajectories
to each processor. (Not ideal: high communication needed.)
– Alternatively, break the surface into equally spaced grids and run
simulations in each grid. (Communication is needed only if an ion
crosses a boundary.)
Domain Choice and Utilization

• Average idle times of the slaves for a simulation with
five identical CPUs, using an arbitrary (dark blocks) and
an optimized (light blocks) subdomain distribution.
Suggested Improvement…
Speed Up

• The difference between ideal and measured speedup is due
to communication between processors.
Sample Results

Cross Section
Charge Transport in Transistors
• Accurate knowledge of current flow
through transistors is required.
• Given the complex structure of transistors, this
can only be obtained by simulation.
High Level Algorithm
[Flowchart, top to bottom:]
Device Structure Input
→ Spatial Computation
→ Device Data Broadcast
→ Independent Carrier flight history simulations
→ Statistics Collection
→ Monte Carlo Statistics
High Level Algorithm
• Breaks the device into grids and determines
electric fields and material scattering
properties.
• No communication required.
– Only constraint on partitioning is load
balance.
High Level Algorithm
• The results of the spatial computations are
globally broadcast.
High Level Algorithm
[Flowchart: steps of a single carrier flight history, shown beside the main flow. Steps marked (RN) take a random number as input.]
Inject Carrier (RN)
→ Generate Flight Time (RN)
→ Move Carrier
→ Update Statistics
→ Perform Scattering (RN)
→ Carrier Exit Device
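
The flight-history loop can be illustrated with a deliberately simplified toy. Nothing below models a real device: it assumes a one-dimensional region, a constant carrier speed, exponentially distributed free-flight times and direction-randomizing scattering, and injection is fixed where the flowchart draws a random number. All names and parameter values are hypothetical.

```python
import random

def simulate_carrier(rng, length=1.0, speed=1.0, scatter_rate=5.0,
                     max_events=100_000):
    """Follow one carrier from injection at x = 0 until it leaves [0, length]."""
    x = 0.0
    direction = 1.0          # inject carrier at the left contact, moving right
    elapsed = 0.0
    scatterings = 0
    for _ in range(max_events):
        flight = rng.expovariate(scatter_rate)       # generate flight time
        x_new = x + direction * speed * flight       # move carrier
        if x_new >= length or x_new <= 0.0:          # carrier exits device
            edge = length if x_new >= length else 0.0
            elapsed += abs(edge - x) / speed         # time to reach the contact
            side = "right" if edge == length else "left"
            return {"exit": side, "time": elapsed, "scatterings": scatterings}
        x = x_new
        elapsed += flight                            # update statistics
        scatterings += 1
        direction = rng.choice((-1.0, 1.0))          # perform scattering
    raise RuntimeError("carrier did not exit within max_events")

def run_histories(n, seed=0):
    """Run n independent flight histories and collect simple statistics."""
    rng = random.Random(seed)
    results = [simulate_carrier(rng) for _ in range(n)]
    transmitted = sum(1 for r in results if r["exit"] == "right")
    return {
        "transmitted_fraction": transmitted / n,
        "mean_time": sum(r["time"] for r in results) / n,
    }

if __name__ == "__main__":
    print(run_histories(2000, seed=1))
```

Because each history depends only on the broadcast device data and its own random numbers, histories can be handed to processors in any order.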
Load Balancing - Static
• Originally attempted static load balancing.
– Run m flight histories on each processor.
– OK if the CPU time per flight history is comparable.
• This is not the case for many devices.
• Poorly suited for workstation clusters where
processor loads are unpredictable.
Load Balancing - Dynamic
• One process (the particle manager) maintains
an event queue.
• Each processor sends requests to the PM when
it is ready for another simulation.
• For workstations the PM is spawned as a
separate process.
• On the CM-5 the PM simulates its own histories
in addition to assigning tasks.
– Latencies were reduced by using interrupt-driven
active messages.
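
The difference between the two schemes can be seen in a small scheduling model. This is a sketch, not the particle manager described above: it ignores communication latency, treats the cost of each flight history as known, and models the dynamic scheme as "the next history goes to whichever processor becomes free first". The cost values are made up.

```python
import heapq

def static_makespan(costs, n_procs):
    """Each processor gets an equal-sized contiguous block of histories up front."""
    chunk = -(-len(costs) // n_procs)   # ceiling division
    return max(sum(costs[p * chunk:(p + 1) * chunk]) for p in range(n_procs))

def dynamic_makespan(costs, n_procs):
    """Histories are handed out one at a time to the first idle processor."""
    free_at = [(0.0, p) for p in range(n_procs)]   # (time processor is free, id)
    heapq.heapify(free_at)
    finish = 0.0
    for cost in costs:
        t, p = heapq.heappop(free_at)   # processor that requests work next
        t += cost
        finish = max(finish, t)
        heapq.heappush(free_at, (t, p))
    return finish

if __name__ == "__main__":
    costs = [8, 8, 8, 8, 1, 1, 1, 1]    # hypothetical CPU time per flight history
    print("static :", static_makespan(costs, 2))
    print("dynamic:", dynamic_makespan(costs, 2))
```

When all histories cost about the same, the two schemes finish at nearly the same time; the gap opens up as the costs become uneven.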
Load Balancing - Comparison
Device                 Architecture      Time (sec.)
n+nn+ (2x128x2)        Workstations      406.2
                       CM-5 (dynamic)    594.7
                       CM-5 (static)     728.9
2D MOSFET (64x64x2)    Workstations      469.4
                       CM-5 (dynamic)    735.2
                       CM-5 (static)     807.7
3D MOSFET (64x26x4)    Workstations      502.6
                       CM-5 (dynamic)    759.7
                       CM-5 (static)     936.3

• Workstation – 60 MIPS based DECstations
• Each simulation consisted of 32,000 flight histories.
• Workstations outperformed the CM-5 by an average of 34%.
• On the CM-5, dynamic load balancing outperformed static by
an average of 15%.
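
The two averages can be checked against the timing table. The sketch assumes each percentage is the time saved relative to the slower configuration, and that the workstation figure is measured against the CM-5 with dynamic balancing; the slides do not state the baseline, so both are assumptions.

```python
# Times in seconds, copied from the comparison table.
TIMES = {
    "n+nn+":     {"workstations": 406.2, "cm5_dynamic": 594.7, "cm5_static": 728.9},
    "2D MOSFET": {"workstations": 469.4, "cm5_dynamic": 735.2, "cm5_static": 807.7},
    "3D MOSFET": {"workstations": 502.6, "cm5_dynamic": 759.7, "cm5_static": 936.3},
}

def average_saving(fast, slow):
    """Mean over devices of (slow - fast) / slow, as a percentage."""
    savings = [(row[slow] - row[fast]) / row[slow] for row in TIMES.values()]
    return 100.0 * sum(savings) / len(savings)

if __name__ == "__main__":
    print(f"workstations vs CM-5 (dynamic): {average_saving('workstations', 'cm5_dynamic'):.1f}%")
    print(f"CM-5 dynamic vs CM-5 static:    {average_saving('cm5_dynamic', 'cm5_static'):.1f}%")
```

Under these assumptions the results round to the 34% and 15% quoted on the slide.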
Scalability
• Scaling is virtually linear on the CM-5 for static and
dynamic loading.
• The high cost of communications causes some problems
on workstations with constant problem size per processor.
Computer Graphics

• Handles randomness of texture and transmission very well.
• Can also assign processes individual random
rays to trace.
Conclusions
• Monte Carlo Algorithms are very easy to
convert to parallel algorithms.
• Care must be taken in choosing random
number algorithms.
• While vectorization is easy, improvements
in load balancing can be had by
considering different ways to partition the
problem.