Module 8 ADGE
Logic: Deduction and Inductive Reasoning
JOE DEVIN P CALIBAR
POLICE CORPORAL
INSTRUCTOR
Module Content:
INDUCTIVE PROBABILITY
Inductive probability attempts to give the probability of future events based on past events. It is the basis
for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a
source of knowledge about the world.
There are three sources of knowledge: inference, communication, and deduction. Communication relays
information found using other methods. Deduction establishes new facts based on existing facts. Inference
establishes new facts from data; its basis is Bayes' theorem.
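As a concrete illustration of inference from data, the sketch below applies Bayes' theorem to a made-up diagnostic test; the prior, true-positive rate, and false-positive rate are all assumed numbers, chosen only to show the update.

```python
# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
def bayes_posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Posterior P(H|E) for a hypothesis H after positive evidence E."""
    # P(E) by the law of total probability over H and not-H
    p_evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / p_evidence

# Hypothetical numbers: 1% prior, 95% true-positive rate, 5% false-positive rate
print(bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05))
# -> ~0.161; the evidence raises the probability, but far from certainty
```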
Information describing the world is written in a language. For example, a simple mathematical language of
propositions may be chosen. Sentences may be written down in this language as strings of characters. In a
computer, these sentences can be encoded as strings of bits (1s and 0s). The language may then be
encoded so that the most commonly used sentences are the shortest. This internal language implicitly
represents probabilities of statements.
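One way to make that implicit representation concrete: in an optimal prefix code, a sentence of probability p receives a code of about -log2(p) bits, so short codes correspond to probable sentences. A minimal sketch, with invented sentence probabilities:

```python
import math

# Invented probabilities for four sentences of an internal language.
sentences = {"it rains": 0.5, "it snows": 0.25, "it hails": 0.125, "frogs fall": 0.125}

for sentence, p in sentences.items():
    bits = -math.log2(p)  # optimal prefix-code length in bits (Shannon)
    print(f"{sentence!r}: p = {p} -> {bits:.0f}-bit code")

# Going the other way, a code of length L implies p = 2 ** -L, which is
# how the internal language implicitly represents probabilities.
```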
Occam's razor says the "simplest theory, consistent with the data, is most likely to be correct". The "simplest
theory" is interpreted as the representation of the theory written in this internal language. The theory with the
shortest encoding in this internal language is most likely to be correct.
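A hedged sketch of this idea in the style of minimum description length (discussed under History below): each candidate theory is charged bits for stating itself plus bits for encoding the data given the theory, and the theory with the smallest total is preferred. The theory costs and coin data here are invented for illustration:

```python
import math

# Toy MDL comparison for a run of coin flips (1 = heads): 17 heads, 3 tails.
data = [1] * 17 + [0] * 3

def data_bits(p_heads: float) -> float:
    # -log2 likelihood: bits needed to encode the data under the model
    return sum(-math.log2(p_heads if x else 1.0 - p_heads) for x in data)

# Each candidate theory is charged (assumed) bits for stating itself,
# plus the bits needed to encode the data given the theory.
theories = {
    "fair coin, p = 0.5":   (0.0, data_bits(0.5)),
    "biased coin, p = 0.8": (6.0, data_bits(0.8)),  # ~6 bits to state p (assumed)
}
for name, (theory_bits, fit_bits) in theories.items():
    print(f"{name}: {theory_bits + fit_bits:.1f} total bits")
# The biased theory wins (~18.4 vs 20.0 bits): the shortest total encoding
# is "most likely to be correct" in this reading of Occam's razor.
```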
History
Probability and statistics were originally focused on probability distributions and tests of significance. Probability
was formal, well defined, but limited in scope. In particular, its application was limited to situations that could be
defined as an experiment or trial, with a well-defined population.
Bayes' theorem is named after Rev. Thomas Bayes (1701–1761). Bayesian inference broadened the application
of probability to many situations where a population was not well defined. But Bayes' theorem always
depended on prior probabilities to generate new probabilities. It was unclear where these prior probabilities
should come from.
Circa 1964, Ray Solomonoff developed algorithmic probability, which gave an explanation for what randomness
is and how patterns in the data may be represented by computer programs that give shorter representations of
the data.
Chris Wallace and D. M. Boulton developed minimum message length circa 1968. Later, Jorma
Rissanen developed minimum description length circa 1978. These methods relate information theory to
probability in a way that can be compared to the application of Bayes' theorem, but which gives a
source and explanation for the role of prior probabilities.
Circa 1998, Marcus Hutter combined decision theory with the work of Ray Solomonoff and Andrey Kolmogorov
to give a theory of Pareto optimal behavior for an intelligent agent.
Probability
Probability is the representation of uncertain or partial knowledge about the truth of statements. Probabilities
are subjective and personal estimates of likely outcomes based on past experience and inferences made from
the data.
This description of probability may seem strange at first. In natural language we refer to "the probability" that
the sun will rise tomorrow; we do not refer to "your probability" that the sun will rise. But in order for
inference to be correctly modeled, probability must be personal, and the act of inference generates new
posterior probabilities from prior probabilities.
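The sunrise case has a classical treatment, Laplace's rule of succession, which makes the prior-to-posterior step concrete; the uniform prior it starts from is an assumption of the rule, not a fact about the world. A minimal sketch:

```python
def rule_of_succession(successes: int, trials: int) -> float:
    """Laplace's rule of succession: the posterior probability of one more
    success, starting from a uniform prior over the unknown success rate."""
    return (successes + 1) / (trials + 2)

# Two agents with different histories assign different probabilities to
# the same statement; each number is personal to its holder's evidence.
print(rule_of_succession(successes=1000, trials=1000))  # ~0.999
print(rule_of_succession(successes=10, trials=10))      # ~0.917
```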
Probabilities are personal because they are conditional on the knowledge of the individual. Probabilities are
subjective because they always depend, to some extent, on prior probabilities assigned by the individual.
Subjective should not be taken here to mean vague or undefined.
The term intelligent agent is used to refer to the holder of the probabilities. The intelligent agent may be a
human or a machine. If the intelligent agent does not interact with the environment, then the probability will
converge over time to the frequency of the event.
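A small simulation of that convergence claim, with an arbitrary event frequency of 0.3 and a Laplace estimator standing in for the passive agent's inference:

```python
import random

random.seed(0)
true_frequency = 0.3   # hypothetical event frequency, unknown to the agent
successes = 0
for n in range(1, 10_001):
    successes += random.random() < true_frequency   # observe one event
    if n in (10, 100, 1_000, 10_000):
        estimate = (successes + 1) / (n + 2)        # Laplace estimate
        print(f"after {n:>6} observations: estimate = {estimate:.3f}")
# The passive agent's estimate drifts toward the 0.3 frequency.
```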
If, however, the agent uses the probability to interact with the environment, there may be feedback, so that
two agents in an identical environment, starting with only slightly different priors, end up with completely
different probabilities. In this case, optimal decision theory, as in Marcus Hutter's Universal Artificial Intelligence,
will give Pareto optimal performance for the agent. This means that no other intelligent agent could do better
in one environment without doing worse in another environment.
Comparison to deductive probability
In deductive probability theories, probabilities are absolutes, independent of the individual making the
assessment. But deductive probabilities are based on:
Shared knowledge.
Assumed facts that should be inferred from the data.
For example, in a trial the participants are aware of the outcomes of all previous trials. They also
assume that each outcome is equally probable. Together this allows a single unconditional value of probability
to be defined.
But in reality, each individual does not have the same information. And in general, the probability of each
outcome is not equal. The dice may be loaded, and this loading needs to be inferred from the data.
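A hedged sketch of inferring such loading from data, assuming a symmetric Dirichlet prior over the six faces and invented roll counts:

```python
# Estimating each face's probability for a possibly loaded die, using a
# symmetric Dirichlet(1) prior (an assumed choice of prior).
counts = [31, 18, 20, 22, 19, 70]    # hypothetical observed rolls per face
alpha = 1.0                          # one pseudo-count per face
total = sum(counts) + 6 * alpha

for face, c in enumerate(counts, start=1):
    print(f"face {face}: estimated p = {(c + alpha) / total:.3f}")
# Face 6 comes out near 0.38, far above 1/6: the loading is inferred
# from the data rather than assumed away by "equally probable" outcomes.
```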
Probability as estimation
The principle of indifference has played a key role in probability theory. It says that if N statements are
symmetric, so that one condition cannot be preferred over another, then all statements are equally probable.
Taken seriously, in evaluating probability this principle leads to contradictions. Suppose there are 3 bags of
gold in the distance and you are asked to select one. Because of the distance you cannot see the bag
sizes, so using the principle of indifference you estimate that each bag has an equal amount of gold: one third
of the gold each.
Now, while you are not looking, someone takes one of the bags and divides it into 3 bags. Now there are 5
bags of gold. The principle of indifference now says each bag has one fifth of the gold. A bag that was
estimated to have one third of the gold is now estimated to have one fifth of the gold.
Taken as values associated with the bags, the two numbers are different and therefore contradictory. But taken
as estimates given under particular scenarios, both values are separate estimates made under different
circumstances, and there is no reason to believe they are equal.
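Writing the two scenarios out makes the resolution concrete; this is only the arithmetic of the story above, with the total gold normalized to 1:

```python
from fractions import Fraction

gold = Fraction(1)                    # total gold, normalized

# An agent who only ever sees five indistinguishable bags:
indifference_estimate = gold / 5      # 1/5 per bag

# An agent who watched one of the three bags being split into three:
unsplit_bag = gold / 3                # 1/3 for each untouched bag
split_portion = gold / 3 / 3          # 1/9 for each piece of the split bag

print(indifference_estimate, unsplit_bag, split_portion)
# 1/5 vs 1/3 and 1/9: the estimates differ because they are conditional on
# different knowledge, not because either agent computed incorrectly.
```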
Estimates of prior probabilities are particularly suspect. Estimates will be constructed that do not follow any
consistent frequency distribution. For this reason, prior probabilities are considered as estimates of
probabilities rather than probabilities.
A full theoretical treatment would associate with each probability (sketched in code after this list):
The statement
Prior knowledge
Prior probabilities
The estimation procedure used to give the probability.
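A minimal sketch of such a record as a data structure; the field names and example values are illustrative assumptions, since the source only lists the four components:

```python
from dataclasses import dataclass

@dataclass
class ProbabilityEstimate:
    """One probability together with everything it was conditioned on."""
    statement: str                          # the statement being assessed
    prior_knowledge: list[str]              # facts the agent already held
    prior_probabilities: dict[str, float]   # priors the estimate depends on
    estimation_procedure: str               # how the number was produced
    value: float                            # the resulting probability

estimate = ProbabilityEstimate(
    statement="the die shows face 6 on the next roll",
    prior_knowledge=["180 rolls observed"],
    prior_probabilities={"Dirichlet(1) prior per face": 1.0},
    estimation_procedure="posterior mean",
    value=0.382,
)
print(estimate.value)
```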
We can't prove that an inductive conclusion is 100% right or wrong, but we can tell whether it is sound or
unsound. Here are some classic missteps one can make when putting together inductive arguments, each of
which can lead to an unsound conclusion.
A reasonable inference covers the available information, and doesn't require us to invent a lot of new
hypothetical information to make it stick. An unreasonable conclusion may fit the existing evidence, but
it also requires us to accept a lot of other ideas!
All of the following generalizations are drawn from one or two examples. You would be surprised to
find out how many of your opinions are based on one observation. Indeed, many of our opinions are
not even based on observation at all--we merely say what everyone else around us is saying, or repeat
what we heard on television last night. Sad, but true.
I spent five minutes looking for a restaurant in San Mateo that was open at 2 p.m. but didn't find one.
Therefore, there are NO restaurants open past 2 p.m. in San Mateo, ever.
So: good induction relies on wide observation, including several examples, and preferably examples
from other people's experience as well as your own.
I have twenty-five Irish friends from my local Alcoholics Anonymous group, and all are recovering
drunks. Therefore, all Irishmen are recovering drunks.
100% of students in the ENGL 165 online class use their computers to study.
Therefore, all CSM students use computers to study.
When I lived in San Francisco, no one I met through work was in trouble with the law.
Since I moved to Eureka, everyone I've met through work has been in trouble with the law.
Therefore, there is a lot more lawbreaking in Eureka than in San Francisco.
References:
https://en.wikipedia.org/wiki/Inductive_probability
https://collegeofsanmateo.edu/writing/tutorials/Logical%20Method_Induction%20Deduction.pdf
1. __________________________________________
2. __________________________________________
3. __________________________________________
Course No. ADGE Descriptive Title: Logic: Deduction and Inductive Reasoning
Checked by:
Recommending Approval:
Approved: