Semantic
networks became popular in artificial intelligence and natural language processing because they represent knowledge
and support reasoning. They serve as an alternative to predicate logic as a form of knowledge representation. The
structural idea is that knowledge can be stored in the form of graphs, with nodes representing objects in the world, and
arcs representing relationships between those objects.
● Semantic nets consist of nodes, links, and link labels. In network diagrams, nodes appear as circles,
ellipses, or rectangles and represent objects such as physical objects, concepts, or situations.
● Links appear as arrows that express the relationships between objects, and link labels specify those relations.
● Relationships provide the basic structure needed for organizing the knowledge, so the objects and
relations involved need not be concrete.
● Semantic nets are also referred to as associative nets, because each node is associated with other nodes.
● Assertion Networks – Designed to assert propositions. Most information in an assertion network is
assumed to be true unless it is marked with a modal operator. Some assertion networks are
even considered models of the conceptual structures underlying natural language semantics.
● Implicational Networks – Use implication as the primary connective for linking nodes. These networks
are often used to represent patterns of belief, causality, and deduction.
● Executable Networks – Contain mechanisms that can cause changes to the
network itself by incorporating techniques such as attached
procedures or marker passing, which can pass messages along paths, follow associations, and
search for patterns.
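As a rough illustration, a semantic net can be sketched as a set of labeled (node, relation, node) triples; the node and relation names below are invented examples, not a standard vocabulary:

```python
# A minimal semantic-net sketch: nodes are strings, and each labeled
# link is a (source, label, target) triple.
links = [
    ("Tom", "is_a", "Cat"),
    ("Cat", "is_a", "Mammal"),
    ("Cat", "has", "Fur"),
    ("Tom", "owned_by", "John"),
]

def related(node, label):
    """Return all targets reached from `node` by links with the given label."""
    return [t for (s, l, t) in links if s == node and l == label]

print(related("Cat", "is_a"))      # ['Mammal']
print(related("Tom", "owned_by"))  # ['John']
```

Following a chain of "is_a" links from node to node is exactly the associative traversal the notes describe.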
First-Order Logic in Artificial Intelligence
In the topic of propositional logic, we saw how to represent statements using propositional logic. But unfortunately, in
propositional logic we can only represent facts that are either true or false. PL is not sufficient to represent complex
sentences or natural language statements; it has very limited expressive power. Sentences that quantify over objects,
such as those involving "all" or "some", cannot be represented using PL.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
○ First-order logic is another way of knowledge representation in artificial intelligence. It is an extension to propositional logic.
○ FOL is sufficiently expressive to represent the natural language statements in a concise way.
○ First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses
information about objects in a more natural way and can also express the relationships between those objects.
○ First-order logic (like natural language) does not assume, as propositional logic does, that the world contains only facts; it also assumes
the following things exist in the world:
a. Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
b. Relations: These can be unary relations such as: red, round, is adjacent; or n-ary relations such as: the sister of, brother of, has
color, comes between
c. Function: Father of, best friend, third inning of, end of, ......
First-order logic has two main parts:
a. Syntax
b. Semantics
The syntax of FOL determines which collections of symbols are logical expressions in first-order logic. The basic syntactic elements of first-order
logic are symbols. We write statements in short-hand notation in FOL.
Variables: x, y, z, a, b, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: =
Quantifiers: ∀, ∃
Atomic sentences:
○ Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a predicate symbol followed by
a parenthesized sequence of terms.
○ We can represent atomic sentences as Predicate (term1, term2, ......, term n).
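As a sketch, atomic sentences of the form Predicate(term1, ..., termn) can be represented as plain tuples in code; the predicate and constant names below are invented examples:

```python
# Atomic sentences as tuples: ("Predicate", term1, ..., termN).
# A toy knowledge base is simply a set of asserted atomic sentences.
kb = {
    ("Brother", "Ravi", "Ajay"),
    ("Integer", "7"),
}

def holds(sentence):
    """An atomic sentence is true in this toy KB iff it is asserted."""
    return sentence in kb

print(holds(("Brother", "Ravi", "Ajay")))  # True
print(holds(("Brother", "Ajay", "Ravi")))  # False
```

Note that argument order matters: ("Brother", "Ravi", "Ajay") and ("Brother", "Ajay", "Ravi") are distinct atomic sentences.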
Complex Sentences:
○ Predicate: A predicate can be defined as a relation that binds two or more terms together in a statement.
Consider the statement "x is an integer." It consists of two parts: the first part, x, is the subject of the statement, and the second part, "is an
integer," is known as the predicate.
○ A quantifier is a language element that generates quantification, and quantification specifies the quantity of specimens in the
universe of discourse.
○ Quantifiers are the symbols that permit us to determine or identify the range and scope of a variable in a logical expression. There are two
types of quantifier:
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the statement within its range is true for everything or every
instance of a particular thing.
Note: With the universal quantifier we use implication "→".
If x is a variable, then ∀x is read as:
○ For all x
○ For each x
○ For every x.
Example: "All men drink coffee."
Let the variable x range over men, so the statement can be represented in the UOD as ∀x man(x) → drink(x, coffee).
It will be read as: for all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers which express that the statement within their scope is true for at least one instance of something.
The existential quantifier is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate variable, it is
called an existential quantifier.
If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
○ There exists an x
○ For some x
○ For at least one x.
Example: "Some boys are intelligent": ∃x boy(x) ∧ intelligent(x).
It will be read as: there are some x where x is a boy who is intelligent.
Points to remember:
○ The main connective with the universal quantifier ∀ is implication →.
○ The main connective with the existential quantifier ∃ is conjunction ∧.
Properties of Quantifiers:
Quantifiers interact with the variables that appear within them. There are two types of variables in first-order logic, given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope of the quantifier.
Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the scope of the quantifier.
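Over a finite universe of discourse, these quantifiers can be sketched directly in code: ∀ corresponds to Python's all() and ∃ to any(). The universe and predicates below are illustrative assumptions:

```python
# Evaluating quantified sentences over a small finite domain.
universe = [1, 2, 3, 4, 5]

def forall(pred, domain):
    """∀x pred(x): true iff the predicate holds for every element."""
    return all(pred(x) for x in domain)

def exists(pred, domain):
    """∃x pred(x): true iff the predicate holds for at least one element."""
    return any(pred(x) for x in domain)

print(forall(lambda x: x > 0, universe))        # ∀x (x > 0) -> True
print(exists(lambda x: x % 2 == 0, universe))   # ∃x (x is even) -> True
# ∀ pairs naturally with implication: "every even x is less than 6",
# encoded as ¬even(x) ∨ (x < 6).
print(forall(lambda x: x % 2 != 0 or x < 6, universe))  # True
```

Here x is a bound variable in each call: it occurs only within the scope of the quantifier that introduces it.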
This section examines constraint satisfaction, another form of problem-solving method. As its name
implies, constraint satisfaction means that a problem must be solved while adhering to a set of restrictions or guidelines.
Whenever a problem's variables must comply with strict conditions or principles, it is said to have been addressed using
the constraint satisfaction method. Such a method leads to a deeper study of the complexity and structure
of the problem.
A constraint satisfaction problem consists of three components:
○ V: a set of variables.
○ D: a set of domains, one for each variable. Every variable has its own domain.
○ C: a set of constraints.
In constraint satisfaction, domains are the sets from which the variables take their values, subject to the restrictions that are particular to
the task. These three components make up a constraint satisfaction problem in its entirety. Each constraint is a pair
⟨scope, rel⟩: the scope is a tuple of the variables that participate in the constraint, and rel is
a relation that lists the combinations of values the variables may assume in order to satisfy the
constraints of the problem.
For a constraint satisfaction problem (CSP), the following conditions must be met:
○ State space
A state in the state space is defined by assigning values to some or all of the variables.
1. Consistent or Legal Assignment: an assignment is referred to as consistent or legal if it complies with all of the
constraints.
2. Complete Assignment: an assignment in which every variable has a value; if it is also consistent, it is a solution
to the CSP.
3. Partial Assignment: an assignment that gives values to only some of the variables. Assignments of this nature are referred to as
incomplete assignments.
The variables use one of the two types of domains listed below:
○ Discrete Domain: an infinite domain in which a single state can involve numerous variables. For instance,
each variable may receive an unlimited number of starting states.
○ Finite Domain: a finite domain with continuous states that can describe one range for one particular variable.
It is also called a continuous domain.
Basically, there are three different categories of constraints with regard to the variables:
○ Unary constraints: the easiest kind of constraint, because they only limit the value of one variable.
○ Binary constraints: these constraints connect two variables; for example, a variable x2 may be required to take a
value between x1 and x3.
○ Global constraints: this kind of constraint involves an arbitrary number of variables.
The main kinds of constraints are resolved using certain resolution methodologies:
○ Linear Constraints: frequently used in linear programming, where every variable carrying an integer value
appears only in linear form.
○ Non-linear Constraints: used in non-linear programming, where each variable (an integer value) appears in
non-linear form.
Think of a Sudoku puzzle where some of the squares have initial fills of certain integers.
You must complete the empty squares with numbers between 1 and 9, making sure that no row, column, or block contains a
repeated integer of any kind. This constraint satisfaction problem is pretty elementary: a problem must be solved while
certain limitations are taken into consideration.
The empty squares are the variables, and the integer range (1-9) that can occupy them is the domain. The values of the
variables are drawn from the domain, and the constraints are the rules that determine which values a variable may
select.
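The variables/domains/constraints decomposition can be sketched with a minimal backtracking solver. The example below uses a small map-colouring problem rather than full Sudoku to keep it short; the region names and adjacencies are illustrative assumptions, not part of the text above:

```python
# A minimal backtracking CSP solver: variables, per-variable domains,
# and binary "neighbours must differ" constraints.
variables = ["WA", "NT", "SA", "Q"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbours = {("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"), ("SA", "Q")}

def consistent(var, value, assignment):
    """Legal (consistent) assignment: no assigned neighbour shares the colour."""
    for a, b in neighbours:
        if a == var and assignment.get(b) == value:
            return False
        if b == var and assignment.get(a) == value:
            return False
    return True

def backtrack(assignment):
    if len(assignment) == len(variables):       # complete assignment -> solution
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:                  # values drawn from the domain
        if consistent(var, value, assignment):
            result = backtrack({**assignment, var: value})
            if result is not None:
                return result
    return None                                 # dead end: backtrack

print(backtrack({}))
```

The partial assignments built along the way are the "incomplete assignments" above; the solver returns only a complete, consistent one.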
Mycin was an expert system developed at Stanford in the 1970s. Its job was to diagnose and recommend
treatment for certain blood infections. To do the diagnosis ``properly'' involves growing cultures of the infecting
organism. Unfortunately this takes around 48 hours, and if doctors waited until this was complete their patient
might be dead! So, doctors have to come up with quick guesses about likely problems from the available data,
and use these guesses to provide a ``covering'' treatment where drugs are given which should deal with any
possible problem.
Mycin was developed partly in order to explore how human experts make these rough (but important) guesses
based on partial information. However, the problem is also a potentially important one in practical terms - there
are lots of junior or non-specialised doctors who sometimes have to make such a rough diagnosis, and if there is
an expert tool available to help them then this might allow more effective treatment to be given. In fact, Mycin
was never actually used in practice. This wasn't because of any weakness in its performance - in tests it
outperformed members of the Stanford medical school. It was as much because of ethical and legal issues
related to the use of computers in medicine - if it gives the wrong diagnosis, who do you sue?
Anyway Mycin represented its knowledge as a set of IF-THEN rules with certainty factors. The following is an
English version of one of Mycin's rules:
The 0.7 is roughly the certainty that the conclusion will be true given the evidence. If the evidence is uncertain the
certainties of the bits of evidence will be combined with the certainty of the rule to give the certainty of the
conclusion.
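This combination of certainties can be sketched as follows, using the commonly described MYCIN-style combination functions for positive certainty factors (a textbook simplification, not Mycin's actual Lisp code):

```python
# MYCIN-style certainty-factor arithmetic, positive CFs only.
def conclusion_cf(rule_cf, evidence_cfs):
    """Conjunctive premises take the minimum evidence CF;
    the conclusion CF is that minimum scaled by the rule's CF."""
    return rule_cf * max(0.0, min(evidence_cfs))

def combine(cf1, cf2):
    """Two rules supporting the same conclusion (both CFs positive)."""
    return cf1 + cf2 * (1 - cf1)

cf = conclusion_cf(0.7, [0.9, 0.8])  # 0.7 * 0.8 = 0.56
print(round(cf, 2))                  # 0.56
print(round(combine(0.56, 0.5), 2))  # 0.56 + 0.5 * 0.44 = 0.78
```

So a rule with certainty 0.7 applied to evidence known with certainty 0.8 yields a conclusion with certainty 0.56, and a second supporting rule raises it further without ever exceeding 1.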
Mycin was written in Lisp, and its rules were formally represented as Lisp expressions. The action part of a rule
could just be a conclusion about the problem being solved, or it could be an arbitrary Lisp expression. This allowed
great flexibility, but removed some of the modularity and clarity of rule-based systems, so the facility had to
be used with care.
Anyway, Mycin is a (primarily) goal-directed system, using the basic backward chaining reasoning strategy that
we described above. However, Mycin used various heuristics to control the search for a solution (or proof of
some hypothesis). These were needed both to make the reasoning efficient and to prevent the user being asked
too many unnecessary questions.
One strategy is to first ask the user a number of more or less preset questions that are always required and which
allow the system to rule out totally unlikely diagnoses. Once these questions have been asked the system can
then focus on particular, more specific possible blood disorders, and go into full backward chaining mode to try
and prove each one. This rules out a lot of unnecessary search, and also follows the pattern of human
patient-doctor interviews.
The other strategies relate to the way in which rules are invoked. The first one is simple: given a possible rule to
use, Mycin first checks all the premises of the rule to see if any are known to be false. If so there's not much point
using the rule. The other strategies relate more to the certainty factors. Mycin will first look at rules that have
more certain conclusions, and will abandon a search once the certainties involved get below 0.2.
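These invocation strategies can be sketched in a toy goal-directed rule interpreter; the rules, facts, and threshold below are illustrative stand-ins, not Mycin's actual knowledge base:

```python
# A toy backward-chaining step with Mycin-like control heuristics.
rules = [
    {"if": ["fever", "stiff_neck"], "then": "meningitis", "cf": 0.7},
    {"if": ["fever", "rash"], "then": "measles", "cf": 0.5},
]
facts = {"fever": True, "stiff_neck": True, "rash": False}

def prove(goal, threshold=0.2):
    best = 0.0
    # Heuristic: try rules with more certain conclusions first.
    for rule in sorted(rules, key=lambda r: -r["cf"]):
        if rule["then"] != goal or rule["cf"] < threshold:
            continue  # abandon the search once certainties drop too low
        # Heuristic: skip a rule if any premise is already known false.
        if any(facts.get(p) is False for p in rule["if"]):
            continue
        if all(facts.get(p) for p in rule["if"]):
            best = max(best, rule["cf"])
    return best

print(prove("meningitis"))  # 0.7
print(prove("measles"))     # 0.0 (rash is known false, so the rule is skipped)
```

The "known false premise" check and the 0.2 cut-off are exactly the two pruning strategies described above, applied before any expensive sub-goal expansion.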
A dialogue with Mycin is somewhat like the mini dialogue we gave in section 5.3, but of course longer and
somewhat more complex. There are three main stages to the dialogue. In the first stage, initial data about the
case is gathered so the system can come up with a very broad diagnosis. In the second more directed questions
are asked to test specific hypotheses. At the end of this section a diagnosis is proposed. In the third section
questions are asked to determine an appropriate treatment, given the diagnosis and facts about the patient. This
obviously concludes with a treatment recommendation. At any stage the user can ask why a question was asked
or how a conclusion was reached, and when treatment is recommended the user can ask for alternative
treatments if the first is not viewed as satisfactory.
Mycin, though pioneering much expert system research, also had a number of problems which were remedied in
later, more sophisticated architectures. One of these was that the rules often mixed domain knowledge, problem
solving knowledge and ``screening conditions'' (conditions to avoid asking the user silly or awkward questions -
e.g., checking the patient is not a child before asking about alcoholism). A later version called NEOMYCIN attempted to
deal with these by having an explicit disease taxonomy (represented as a frame system) to represent facts about
different kinds of diseases. The basic problem solving strategy was to go down the disease tree, from general
classes of diseases to very specific ones, gathering information to differentiate between two disease subclasses
(i.e., if disease1 has subtypes disease2 and disease3, and you know that the patient has disease1, and subtype
disease2 has symptom1 but not disease3, then ask about symptom1.)
There were many other developments from the MYCIN project. For example, EMYCIN was really the first expert
shell developed from Mycin. A new expert system called PUFF was developed using EMYCIN in the new domain
of pulmonary (lung) disorders. And a system called GUIDON was developed for training doctors, which would take them
through various example cases, checking their conclusions and explaining where they went wrong.
We should make it clear at this point that not all expert systems are Mycin-like. Many use different approaches to
both problem solving and knowledge representation. A full course on expert systems would consider the
different approaches used, and when each is appropriate. Come to AI4 for more details!
Mycin, developed at Stanford University by E. H. Shortliffe (Buchanan and Shortliffe, 1984) in the mid 1970s, was
the first expert system to demonstrate impressive levels of performance in a medical domain. Mycin's task was
to recommend therapy for blood and meningitis infections. Because the cause of an infection was often
unknown, Mycin would first diagnose the cause and then prescribe therapy. Typically, Mycin would ask a bunch of
questions about a particular case and then suggest one or more therapies to cover all the likely causes of the
infection. Sometimes, cases involved more than one infection.
Before we review how Mycin was actually evaluated, let's consider how we might do it. Stanford University is
home to some of the world's experts on blood and meningitis infections, so perhaps we should show them a
reasonable number of problems that Mycin had solved and ask them what they think. Good idea, bad execution.
Perhaps half the experts would say, ``I think Mycin is great!'' and the other half would say, ``I think Mycin is the
thin end of a very dangerous wedge; computer diagnosticians, over my dead body!'' A more likely scenario--we
have seen it often--is the experts would be very enthusiastic and give glowing reviews, much the way that parents
gush over their children. What we need is not opinions or impressions, but relatively objective measures of performance.
We might ask the experts to assess whether Mycin offered the correct therapy recommendation in each of, say, 10 cases. The
experts would be told how to grade Mycin when it offered one of several possible recommendations, and how to assign points
to adequate but suboptimal therapy recommendations. This approach is more objective but still flawed. Why? It doesn't control
for the prejudices of the experts. Enthusiastic experts might give Mycin ``partial credit'' on problems that anti-Mycin experts
would say it failed.
A standard mechanism for controlling for judges' biases is blinding. In a single-blind study, the judges don't
know whether they are judging a computer program or a human. Mycin was evaluated with a single-blind study. Shortliffe asked
each of eight humans to solve ten therapy recommendation problems. These were real, representative problems from a case
library at Stanford Medical School. Shortliffe collected ten recommendations from each of the eight humans, ten from Mycin,
and the ten recommendations made originally by the attending physicians in each case, for a total of 100 recommendations.
These were then shuffled and given to a panel of eight expert judges. Each judge was asked to score each recommendation as
a) equivalent to their own best judgment, b) not equivalent but acceptable, or c) unacceptable. This design, which controls for
judges' bias by blinding them to the origin of the recommendations, is shown in Figure 3.3.
You might think this design is an ironclad evaluation of Mycin's performance. It isn't. The design as stated fails to control for
two possible explanations of Mycin's performance.
Imagine an expert system for portfolio management, the business of buying and selling stocks, bonds and other securities for
investment. I built a system for a related problem as part of my Ph.D. research. Naturally I wondered who the experts were. One
place to look is a ranking, published annually, of pension-fund managers-- the folks who invest our money for our old age. I
learned something surprising: very few pension-fund managers remain in the top 10% of the ranking from one year to the next.
The handful that do could be considered expert; the rest are lucky one year, unlucky the next. Picking stocks is notoriously
difficult (see Rudd and Clasing, 1982), which is why I avoided the problem in my dissertation research. But suppose I had built a
stock picking system. How could I have evaluated it? An impractical approach is to invest a lot of money and measure the profit
five years later. A better alternative might be to convene a panel of experts, as Shortliffe did for Mycin, and ask whether my
stock-picking program picked the same stocks as the panel. As with the Mycin judges, we face the problem that the experts
won't agree. But the disagreements signify different things: When portfolio managers don't agree, it is because they don't know
what they are doing. They aren't experts. Few outperform a random stock-picking strategy. Now you see the crucial control
condition: One must first establish that the ``experts'' truly are expert, which requires comparing ``experts'' to nonexperts.
Nonexpert performance is the essential control condition.
Surely, though, Professors of Medicine at Stanford University must be real experts. Obviously, nothing could be learned from the
proposed condition; the Professors would perform splendidly and the novices would not. Shortliffe didn't doubt the Professors
were real experts, but he still included novices on the Mycin evaluation panel. Why?
Imagine you have built a state-of-the-art parser for English sentences and you decide to evaluate it by comparing its parse trees
with those of expert linguists. If your parser produces the same parse trees as the experts, then it will be judged expert. You
construct a set of test sentences, just as Shortliffe assembled a set of test cases, and present them to the experts. Here are the
test sentences:
Because your program produces parse trees identical to the experts', you assert your program performs as well as experts.
Then someone suggests the obvious control condition: Ask a ten-year-old child to parse the sentences. Not surprisingly, the
child parses these trivial sentences just as well as the experts (and your parser).
Shortliffe put together a panel of eight human therapy recommenders and compared Mycin's performance to theirs. Five of the
panel were faculty at Stanford Medical School, one was a senior resident, one a senior postdoctoral fellow, and one a senior
medical student. This panel and Mycin each solved ten problems, then Shortliffe shipped the solutions, without attribution, to eight judges around the country. For each
solution that a judge said was equivalent or acceptable to the judge's own, Shortliffe awarded one point. Thus, each human therapy recommender, and Mycin, could score a maximum of 80 points--for eight ``equivalent or acceptable'' judgments on each of the ten problems. The results are shown in Figure 3.4.
The expert judges actually agreed slightly more with Mycin's recommendations than with those of the Stanford Medical School faculty.
By including novices on the Mycin evaluation panel, Shortliffe achieved three aims. Two relate to control--ruling out particular explanations of Mycin's high level of performance. One explanation is that
neither Mycin nor the experts are any good. Shortliffe controlled against this explanation by showing that neither Mycin nor the experts often agreed with the novice panel members. This doesn't prove that
Mycin and the Professors are better than the novices, but they are different. (If five Professors recommend, say, ampicillin as therapy, and five novices give five different answers, none of which is ampicillin,
then ampicillin isn't necessarily a better therapy, but which answer would you bet your life on?) Another explanation of Mycin's performance is that Shortliffe gave Mycin easy cases to solve. If ``easy'' means
``anyone can do it,'' then the novices should have made the same therapy recommendations as the experts, and they didn't.
The third advantage of including novices on the evaluation panel is that it allowed Shortliffe to test a causal
hypothesis about problem-solving performance in general, and Mycin's performance in particular. Before
discussing the hypothesis, consider again the results in Figure 3.4. What is the x axis of the graph? It is unlabeled
because the factors that determine performance have not been explicitly identified. What could these factors be?
Mycin certainly does mental arithmetic more accurately and more quickly than Stanford faculty; perhaps this is
why it performed so well. Mycin remembers everything it is told; perhaps this explains its performance. Mycin
reasons correctly with conditional probabilities, and many doctors do not (Eddy 1982); perhaps this is why it did
so well. Even if you know nothing about Mycin or medicine, you can tell from Figure 3.4 that some of these
explanations are wrong. The mental arithmetic skills of Stanford faculty are probably no better than those of
postdoctoral fellows, residents, or even medical students; nor are the faculty any more capable of reasoning
about conditional probabilities than the other human therapy recommenders; yet the faculty outperformed the
others. To explain these results, we are looking for something that faculty have in abundance, something that
distinguishes fellows from residents from medical students. If we didn't already know the answer, we would
certainly see it in Figure 3.4: knowledge is power! This is the hypothesis that the Mycin evaluation tests, and that
Figure 3.4 so dramatically confirms:
To be completely accurate, Figure 3.4 supports this hypothesis only if we define high performance as a high
degree of agreement with the eight expert judges, but if Shortliffe didn't believe this, he wouldn't have used these
judges as a gold standard.
● To control for judges' bias, blind them to origin of the items they are judging. For example, do not tell
them whether recommendations were produced by a program or a person.
● To control for the possibility that high performance is due to easy problems, include a control group of
problem solvers who can solve easy problems but not difficult ones. For example, if a student performs
as well as faculty, then the problems are probably easy.
● To control for the possibility that the ``gold standard'' against which we measure performance is not a
high standard, include a control group that sets a lower standard. For example, if a chimpanzee throwing
darts at the Big Board picks stocks as well as professional portfolio managers, then the latter do not set
a high standard.
● To test the hypothesis that a factor affects performance, select at least two (and ideally more) levels of
the factor and compare performance at each level. For example, to test the hypothesis that knowledge
affects performance, measure the performance of problem solvers with four different levels of
knowledge--faculty, post-doc, resident, student. Note that this is an observation experiment because
problem solvers are classified according to their level of knowledge. It generally isn't practical to
manipulate this variable because it takes so long to train people to expert levels. The
knowledge-is-power hypothesis might also be tested in a manipulation experiment with Mycin by
directly manipulating the amount Mycin knows--adding and subtracting rules from its knowledge
base--and observing the effects on performance (chapter 6). In all these designs, it is best to have more
than two levels of the independent variable. With only two, the functional relationship between x and y
must be approximated by a straight line.
MYCIN was an early expert system that used artificial intelligence to identify bacteria
causing severe infections, such as bacteremia and meningitis, and to recommend
antibiotics, with the dosage adjusted for the patient's body weight. The name derives from the
antibiotics themselves, as many antibiotics have the suffix "-mycin".
The Mycin system was also used for the diagnosis of blood clotting diseases.
MYCIN was developed over five or six years in the early 1970s at Stanford
University.
It was written in Lisp as the doctoral dissertation of Edward Shortliffe under the
direction of Bruce G. Buchanan, Stanley N. Cohen and others. It arose in the laboratory that
had created the earlier Dendral expert system.
MYCIN was never actually used in practice but research indicated that it proposed an
acceptable therapy in about 69% of cases, which was better than the performance of
infectious disease experts who were judged using the same criteria.
MYCIN was the first large expert system to perform at the level of a human expert
and to provide users with an explanation of its reasoning. Most expert systems developed
since MYCIN have used MYCIN as a benchmark to define an expert system. Moreover,
the techniques developed for MYCIN have become widely available in the various
small expert-system building tools.
ARCHITECTURE
MYCIN consists of three programs and two logical databases. The three
programs are the consultation, explanation, and knowledge acquisition systems. The two
logical databases are the dynamic and static databases. The static database contains the
rules; the dynamic database contains patient data, laboratory data, a context tree, and a rule trace.
Consultation System
The consultation system performs diagnosis and therapy selection. Its control structure reads
the static database (rules) and reads from and writes to the dynamic database (patient data and
context). It is linked to the explanation system and to the terminal interface used by the physician.
Consultation system user-friendly features:
# Users can request rephrasing of questions, and a synonym dictionary allows latitude in
user responses.
# Questions are asked when more data is needed.
# If data cannot be provided, the system ignores the relevant rules.
Production Rules:
The production rule hierarchy contains the domain knowledge. This domain
knowledge implicitly contains the disease taxonomy and explicitly contains organism-specific
data. MYCIN uses over 450 rules in premise-action (if-then)
form. Each rule is completely modular: all relevant context is contained in the rule,
with explicitly stated premises.
Meta-Rules: The meta-rules contain domain-dependent search strategies (such as the ordering of
sub-goals, to focus on more likely connections first) in an attempt to prune the search space. They
are an alternative to exhaustive invocation of all the rules. Strategy rules suggest an approach for
a given sub-goal, ordering which rules to try first and effectively pruning the search tree. This creates a
search space with embedded information on which branch is best to take. Higher-order meta-rules
(i.e., meta-rules for meta-rules) are powerful, but little used in practice. Their impact on the explanation
system is that they encode knowledge formerly held in the control structure, and they sometimes create “murky”
explanations.
Templates: The production rules are all based on template structures. This aids knowledge-base
expansion, because the system can “understand” its own representations. Templates are updated
by the system when a new rule is entered.
Preview Mechanism: The interpreter reads rules before invoking them. This avoids unnecessary
deductive work if the sub-goal has already been tested or determined, and ensures that self-referencing
sub-goals do not enter recursive infinite loops.
Dynamic Database:
The dynamic database contains patient data, laboratory data, a context tree, and a rule
trace.
It is built by the consultation system and used by the explanation system.
Context Tree
The context tree contains domain knowledge as it relates to conclusions about a specific
consultation. This context tree is a hierarchy of hypothetical conclusions.
Therapy Selection:
Therapy selection is a plan-generate-and-test process.
Therapy list creation:
# A set of specific rules recommends treatments based on the probability (not CF) of organism sensitivity.
# Probabilities are based on laboratory data.
# There is one therapy rule for every organism.
Assigning item numbers:
# Only hypotheses with organisms deemed “significantly likely” (by CF) are considered.
# The most likely (by CF) identities of the organisms themselves are then determined and assigned an item number.
# Each item is assigned a probability of likelihood and a probability of sensitivity to each drug.
Final selection is based on:
# Sensitivity
# Contraindication screening
# Using the minimal number of drugs while maximizing the coverage of organisms
Experts can ask for alternate treatments; therapy selection is then repeated with the previously
recommended drugs removed from the list.
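The plan-generate-and-test loop for therapy selection can be sketched as follows; the organisms, drugs, certainty values, and sensitivity probabilities are invented for illustration:

```python
# Generate drug combinations of increasing size, test coverage of the
# likely organisms, and stop at the first (hence minimal) covering set.
from itertools import combinations

organisms = {"e.coli": 0.8, "pseudomonas": 0.4}   # CF of likelihood per item
sensitivity = {                                    # P(organism sensitive to drug)
    "gentamicin": {"e.coli": 0.9, "pseudomonas": 0.9},
    "ampicillin": {"e.coli": 0.9, "pseudomonas": 0.1},
}
# Only organisms deemed significantly likely are considered.
significant = [o for o, cf in organisms.items() if cf >= 0.3]

def covers(drugs, threshold=0.5):
    """Test step: every significant organism must be likely sensitive
    to at least one chosen drug."""
    return all(any(sensitivity[d][o] >= threshold for d in drugs)
               for o in significant)

# Generate step: smaller drug sets first, i.e. minimal number of drugs.
for size in range(1, len(sensitivity) + 1):
    plans = [c for c in combinations(sensitivity, size) if covers(c)]
    if plans:
        print(plans[0])  # ('gentamicin',)
        break
```

Alternate treatments correspond to rerunning the loop with previously recommended drugs removed from the sensitivity table.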
Explanation System:
The explanation system uses the rules, translation templates, the context tree, and the
program trace to construct natural-language explanations for users.
It provides the reasoning for why a conclusion has been made, or why a question is
being asked. It ignores definitional rules (CF == 1).
Two modules are used in the explanation system:
# Q-A Module
# Reasoning Status Checker
The overall goal of an explanatory interface is to bridge the gap between the complex reasoning
processes of an expert system and the user's need for understandable and trustable decisions.