
Probabilistic/Randomized Algorithms
Ming-Hwa Wang, Ph.D.
COEN 279/AMTH 377 Design and Analysis of Algorithms
Department of Computer Engineering
Santa Clara University

Probabilistic or Randomized Algorithms
At least once during the algorithm, a random number is used to make a
decision instead of spending time working out which alternative is best.
The worst-case running time of a randomized algorithm is almost always the
same as the worst-case running time of the non-randomized algorithm.
A good randomized algorithm has no bad inputs, only bad random numbers.
The random numbers are important: we can derive an expected running time,
where we now average over all possible random numbers instead of over all
possible inputs, i.e., the mean time it would take to solve the same
instance over and over again.
A randomized algorithm may run quickly but occasionally make an error. The
probability of error can, however, be made negligibly small, and any
purported solution can be verified efficiently for correctness.
A randomized algorithm may give probabilistic answers which are not
necessarily exact.
The same algorithm may behave differently when it is applied twice to the
same instance: its execution time, and even the result obtained, may vary
considerably from one use to the next. If the algorithm gets stuck (e.g.,
core dump), simply restart it on the same instance for a fresh chance of
success. If there is more than one correct answer, several different ones
may be obtained by running the probabilistic algorithm more than once.
An expected running-time bound is somewhat stronger than an average-case
bound, but weaker than the corresponding worst-case bound.

Random Number Generators
True randomness is virtually impossible to achieve on a computer. What is
really needed is a sequence of pseudorandom numbers that appear to be
independent.
The linear congruential generator: x_{i+1} = A x_i % M, where x_0 is the
seed and 1 ≤ x_0 < M. If M is prime, x_i is never 0; after M-1 numbers the
sequence repeats (period M-1). Some choices of A give a period shorter than
M-1, but if M is chosen to be a large 31-bit prime, the period should be
significantly large for most applications. Common choices are
M = 2^31 - 1 = 2,147,483,647 and A = 48,271.
The same sequence occurs every time, which makes debugging easy; for real
runs, supply an input seed (e.g., from the system clock).
Usually a random real number in the open interval (0,1) is wanted, which
can be obtained by dividing x_i by M.
Multiplication overflow prevention: let Q = M / A = 44,488 and
R = M % A = 3,399 (all divisions here are integer divisions). Then

  x_{i+1} = A x_i % M = A(x_i % Q) - R(x_i / Q) + M δ(x_i),

where δ(x_i) = x_i / Q - A x_i / M, which equals 1 if
A(x_i % Q) - R(x_i / Q) is negative, and 0 otherwise. Derivation:

  x_{i+1} = A x_i % M
          = A x_i - M(A x_i / M)
          = A x_i - M(x_i / Q) + M(x_i / Q) - M(A x_i / M)
          = A x_i - M(x_i / Q) + M(x_i / Q - A x_i / M)
          = A(Q(x_i / Q) + x_i % Q) - M(x_i / Q) + M(x_i / Q - A x_i / M)
          = (AQ - M)(x_i / Q) + A(x_i % Q) + M(x_i / Q - A x_i / M)
          = A(x_i % Q) - R(x_i / Q) + M δ(x_i),

since M = AQ + R, so AQ - M = -R.

Numerical Probabilistic Algorithms
For certain real-life problems, computation of an exact solution is not
possible even in principle: there are uncertainties in the experimental
data, digital computers handle only binary values, etc. For other problems
a precise answer exists, but it would take too long to compute exactly.
Numerical probabilistic algorithms yield a confidence interval, and the
expected precision improves as the time available to the algorithm
increases. The error is usually inversely proportional to the square root
of the amount of work performed.
Buffon's needle: throw a needle at random on a floor made of planks of
constant width. If the needle is exactly half as long as the planks are
wide, and the cracks between the planks have zero width, the probability
that the needle falls across a crack is 1/π. In general, the probability
that a randomly thrown needle falls across a crack is 2l/(πw), where l is
the needle length and w is the plank width. The resulting estimate lies
within the desired error bound of the true value with probability at least
the desired reliability.
Numerical integration (Monte Carlo integration): deterministic integration
algorithms are easy to fool, and very expensive when evaluating a multiple
integral. Hybrid techniques that are partly systematic and partly
probabilistic are called quasi-Monte Carlo integration.
Probabilistic counting:
  Counting twice as far, up to 2^(n+1) - 2, with an n-bit register:
  initialize the register to 0; each time tick is called, flip a fair
  coin, and if it comes up heads, add 1 to the register, otherwise do
  nothing. When count is called, return twice the value stored in the
  register.
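The doubling counter just described can be sketched in a few lines of
Python (the class and method names are mine; the notes only name the tick
and count operations):

```python
import random

class DoublingCounter:
    """Counts up to 2^(n+1) - 2 using an n-bit register: each tick
    increments the register with probability 1/2, and count() returns
    twice the stored value, an unbiased estimate of the tick total."""

    def __init__(self, rng=None):
        self.register = 0
        self.rng = rng or random.Random()

    def tick(self):
        # Flip a fair coin; on heads, add 1 to the register.
        if self.rng.random() < 0.5:
            self.register += 1

    def count(self):
        # Return twice the value stored in the register.
        return 2 * self.register

c = DoublingCounter(random.Random(1))
for _ in range(10000):
    c.tick()
# c.count() is close to the true number of ticks (10000) on average
```

The register stays roughly half the size of the true count, which is why
an n-bit register reaches about twice as far as direct counting.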
  Counting exponentially farther, from 0 to 2^(2^n - 1) - 1: keep in the
  register an estimate of the logarithm of the actual number of ticks, and
  let count(c) return 2^c - 1. This keeps the relative error under
  control instead of the absolute error.

Monte Carlo Algorithms
Monte Carlo algorithms give an exact answer with high probability whatever
the instance considered, although sometimes they provide a wrong answer.
Generally you cannot tell whether the answer is correct, but you can reduce
the error probability arbitrarily by allowing the algorithm more time
(amplifying the stochastic advantage). A Monte Carlo algorithm is p-correct
if it returns a correct answer with probability at least p (0 < p < 1),
whatever the instance considered; p may depend on the instance size but not
on the instance itself.
Verifying matrix multiplication:
  The straightforward matrix multiplication algorithm takes Θ(n^3) time;
  Strassen's algorithm takes Θ(n^2.81).
  Let D = AB - C, let S ⊆ {1,2,...,n}, and let S(D) denote the vector of
  length n obtained by adding pointwise the rows of D indexed by the
  elements of S. S(D) is always 0 if AB equals C; otherwise, let i be an
  integer such that the i-th row of D contains at least one nonzero
  element. Then the probability that S(D) ≠ 0 is at least one-half. Let X
  be a binary vector of length n such that X_j = 1 if j ∈ S and X_j = 0
  otherwise. Then S(D) = XD, and we want to verify whether XAB = XC,
  where computing (XA)B needs only Θ(n^2) time. Getting the answer false
  just once allows you to conclude that AB ≠ C. The probability that k
  successive calls each return the wrong answer is at most 2^(-k), so the
  test is (1 - 2^(-k))-correct. Alternatively, the algorithm can be given
  an explicit upper bound ε on the tolerable error probability, taking
  Θ(n^2 lg(1/ε)) time.
Primality testing:
  It takes O(2^(d/2)) time to test whether a d-digit number is prime by
  trial division.
  Randomized polynomial-time algorithm: if the algorithm declares that
  the number is not prime, then it is certainly not prime. If the
  algorithm declares that the number is prime, then with high
  probability, but not 100% certainty, the number is prime.
  Fermat's Little Theorem: if P is prime and 0 < A < P, then
  A^(P-1) ≡ 1 (mod P).
  Test: pick 1 < A < N-1 at random. If A^(N-1) ≡ 1 (mod N), declare that
  N is probably prime; otherwise declare that N is definitely not prime.
  False witnesses of primality: Carmichael numbers are not prime but
  satisfy A^(N-1) ≡ 1 (mod N) for all 0 < A < N that are relatively
  prime to N.
  If P is prime and 0 < X < P, the only solutions to X^2 ≡ 1 (mod P) are
  X = 1 and X = P-1.
Skip lists:
  Every 2^i-th node has a pointer to the node 2^i ahead of it. The total
  number of pointers has only doubled, but now at most lg N nodes are
  examined during a search. The search consists of either advancing to a
  new node or dropping to a lower pointer in the same node.
  A level-k node is a node that has k pointers; the i-th pointer in any
  level-k node (k ≥ i) points to the next node with at least i levels.
  Roughly half the nodes are level 1, roughly a quarter are level 2, and,
  in general, approximately 1/2^i of the nodes are level i. We choose the
  level randomly.
  Find: start at the highest pointer of the header and traverse along
  this level until we find that the next node is larger than the one we
  are looking for (or nil). When this occurs, drop to the next lower
  level and continue the strategy. When progress is stopped at level 1,
  either we are in front of the node we are looking for, or it is not in
  the list.
  Insert: proceed as in a Find, and keep track of each point where we
  drop to a lower level. The new node, whose level is determined
  randomly, is then spliced into the list.
  The expected cost of these operations is O(lg N).
  Skip lists need an estimate of the number of elements that will be in
  the list in order to determine the number of levels. Different levels
  of nodes need different type declarations.

Las Vegas Algorithms
Las Vegas algorithms make probabilistic choices to help guide them more
quickly to a correct solution; they never return a wrong answer. There are
two main categories of Las Vegas algorithms: those that take longer to
solve a problem when unfortunate choices are made (e.g., Quicksort), and
those that may run into a dead end and admit that they cannot find a
solution in this run of the algorithm. A Las Vegas algorithm has the Robin
Hood effect: with high probability, instances that took a long time
deterministically are now solved much faster, but instances on which the
deterministic algorithm was particularly good are slowed down to average.
Let p(x) be the probability of success of the algorithm on instance x;
then the expected number of runs before success is 1/p(x). However, a
correct analysis must consider separately the expected time taken by LV(x)
in case of success, s(x), and in case of failure, f(x):

  t(x) = s(x) + ((1 - p(x))/p(x)) f(x).

The Eight Queens Problem
  Combine backtracking with a probabilistic algorithm: first place a
  number of queens on the board at random, then use backtracking to try
  to add the remaining queens without reconsidering the positions of the
  queens that were placed randomly. The more queens we place randomly,
  the smaller the average time needed by the subsequent backtracking
  stage, whether it fails or succeeds, but the greater the probability of
  failure. This is the fine-tuning knob.
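A minimal Python sketch of this scheme (the helper names are mine; k, the
number of randomly placed queens, is the tuning knob described above, and
a failed run is simply restarted):

```python
import random

def safe(board, row, col):
    # No shared row or diagonal with any queen already in columns 0..col-1.
    return all(r != row and abs(r - row) != abs(c - col)
               for c, r in enumerate(board))

def place_queens(board, col, n):
    """Backtracking stage: try to fill columns col..n-1."""
    if col == n:
        return True
    for row in range(n):
        if safe(board, row, col):
            board.append(row)
            if place_queens(board, col + 1, n):
                return True
            board.pop()
    return False

def queens_las_vegas(n=8, k=2, rng=random):
    """Place the first k queens at random, then backtrack for the rest
    without reconsidering them. Returns a list of row indices per column,
    or None when this run reaches a dead end."""
    board = []
    for col in range(k):
        candidates = [r for r in range(n) if safe(board, r, col)]
        if not candidates:
            return None            # dead end: admit failure for this run
        board.append(rng.choice(candidates))
    return board if place_queens(board, k, n) else None

# Restart on failure, as a Las Vegas algorithm allows.
rng = random.Random(0)
solution = None
while solution is None:
    solution = queens_las_vegas(8, 2, rng)
```

Increasing k speeds up the backtracking stage on average but raises the
chance that a run ends in a dead end and must be restarted.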
Probabilistic Quickselect and Quicksort
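The notes leave this topic as a heading; as a reminder of the idea, here
is a minimal random-pivot quickselect sketch (the three-way partition
formulation is my choice, not necessarily the one used in lecture):

```python
import random

def quickselect(items, k, rng=random):
    """Return the k-th smallest element (k >= 1) of items.
    A random pivot makes every input behave like an average one:
    expected O(n) time, with no bad inputs, only bad random numbers."""
    pivot = rng.choice(items)
    smaller = [x for x in items if x < pivot]
    equal   = [x for x in items if x == pivot]
    larger  = [x for x in items if x > pivot]
    if k <= len(smaller):
        return quickselect(smaller, k, rng)
    if k <= len(smaller) + len(equal):
        return pivot
    return quickselect(larger, k - len(smaller) - len(equal), rng)

vals = list(range(100))
random.Random(7).shuffle(vals)
median = quickselect(vals, 50)   # 50th smallest of 0..99 is 49
```

Randomized quicksort uses the same random-pivot idea; the algorithm never
returns a wrong answer, only a longer running time on unlucky choices,
which is what makes it Las Vegas.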
Universal Hashing
  Las Vegas hashing allows us to retain the efficiency of hashing on
  average, without arbitrarily favoring some programs at the expense of
  others. Choosing the hash function randomly at the beginning of each
  compilation, and again whenever rehashing becomes necessary, ensures
  that the collision lists remain reasonably well balanced with high
  probability.
  Universal hashing: let U = {1,2,...,a-1} be the universe of potential
  indexes for the associative table, and let B = {1,2,...,N-1} be the set
  of indexes in the hash table. For two distinct x and y in U, a set H of
  functions from U to B, and a function h: U → B chosen randomly from H,
  H is a universal_2 class of hash functions if the probability that
  h(x) = h(y) is at most 1/N. Let p be a prime number at least as large
  as a, and let i and j be two integers with 1 ≤ i < p and 0 ≤ j < p;
  then with h_ij(x) = ((ix + j) % p) % N, the class H = {h_ij} is
  universal_2.
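A sketch of drawing from this family and spot-checking the collision bound
for one fixed pair of keys (the constants p = 10007 and N = 100 are my
example choices, not from the notes):

```python
import random

def make_hash(p, N, rng=random):
    """Draw h_ij(x) = ((i*x + j) % p) % N at random from the universal_2
    family, with p a prime at least as large as the universe size,
    1 <= i < p and 0 <= j < p."""
    i = rng.randrange(1, p)
    j = rng.randrange(0, p)
    return lambda x: ((i * x + j) % p) % N

# Estimate the collision probability for one fixed pair x != y.
p, N = 10007, 100                # 10007 is prime
rng = random.Random(42)
trials = 20000
collisions = sum(1 for _ in range(trials)
                 if (h := make_hash(p, N, rng))(123) == h(456))
# collisions / trials should be at most about 1/N = 0.01
```

The universal_2 guarantee is over the random choice of h, not over the
keys, which is why no fixed set of keys (no "bad program") can force
collisions.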
Factorizing Large Integers
  The factorization problem consists of finding the unique decomposition
  of n into a product of prime factors. The splitting problem consists of
  finding one nontrivial divisor of n, provided n is composite.
  Factorizing reduces to splitting plus primality testing. An integer is
  k-smooth if all its prime divisors are among the k smallest prime
  numbers; k-smooth integers can be factorized efficiently by trial
  division if k is small. A hard composite number is the product of two
  primes of roughly equal size.
  Let n be a composite integer, and let a and b be distinct integers
  between 1 and n-1 such that a + b ≠ n. If a^2 % n = b^2 % n, then
  gcd(a+b, n) is a nontrivial divisor of n.
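The splitting fact can be checked directly in Python (the instance n = 77
with a = 9, b = 2 is my example):

```python
from math import gcd

def split(n, a, b):
    """Given distinct a, b in [1, n-1] with a + b != n and
    a*a % n == b*b % n, return a nontrivial divisor of n:
    n divides (a - b)(a + b) but divides neither factor alone,
    so gcd(a + b, n) lies strictly between 1 and n."""
    assert a != b and a + b != n and (a * a) % n == (b * b) % n
    d = gcd(a + b, n)
    assert 1 < d < n
    return d

# Example: 9^2 = 81 ≡ 4 (mod 77) and 2^2 = 4, with 9 + 2 != 77,
# so gcd(9 + 2, 77) = 11 splits 77.
d = split(77, 9, 2)   # d == 11
```

Finding such a pair (a, b) is the hard part; randomized factoring methods
spend their effort constructing congruent squares like these.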
