Module_1_part2_NLP

Language Modelling

Two Approaches to Language Modelling

• One is to define a grammar that can handle the language.
• The other is to capture the patterns of the language statistically.
These give two primary families of models:
Grammar-based models
Statistical language models

• These include lexical functional grammar, government and binding, and the Paninian framework (grammar-based), and n-gram based models (statistical).
Introduction
• A model is a description of some complex entity or process. A language model is thus a description of a language.
• Natural language is a complex entity; in order to process it we need to represent it, i.e., build a model of it. This is known as language modelling.
• Language modelling can be viewed either as a problem of grammar inference or as a problem of probability estimation.
• A grammar-based language model attempts to distinguish a grammatical sentence from a non-grammatical one, whereas a probabilistic model estimates a maximum likelihood estimate of the language's patterns.
Grammar-Based Language Models use a grammar to create the model and attempt to represent the syntactic structure of the language.
***The grammar consists of hand-coded rules defining the structure and ordering of constituents, and utilizes these structures and relations.

**Grammar-based models are:

1. Generative Grammars (TG, Chomsky 1957)
2. Hierarchical Grammars (Chomsky 1956)
3. Government and Binding (GB) (Chomsky 1981)
4. Lexical Functional Grammar (LFG) (Kaplan 1982)
5. Paninian Framework (Joshi 1985)
Statistical Language Models (SLM)
• This approach creates a model by training it on a corpus (which should be large enough to expose regularities).
• "SLM is the attempt to capture the regularities of a natural language for the purpose of improving the performance of various natural language applications." – Rosenfeld (1994)
• SLMs are a fundamental component of many NLP applications such as speech recognition, spelling correction, machine translation, question answering, information retrieval, and text summarization.
- n-gram models
Grammar Based Language Models
I. Generative Grammars
According to Syntactic Structures (Chomsky, 1957):
• We can generate sentences if we know a collection of words and the rules of a language. This view dominated computational linguistics and is appropriately termed generative grammar.
• If we have a complete set of rules that can generate all possible sentences in a language, those rules provide a model of that language.
• Language is a relation between sound (or written text) and meaning; a model of a language therefore also needs to deal with both syntax and meaning.
• Most of these grammars, however, deal only with syntax and can produce perfectly grammatical but meaningless sentences.
II. Hierarchical Grammar
• Chomsky (1956) described classes of grammars in which each higher type contains its subclasses:
• Type 0 (unrestricted)
• Type 1 (context sensitive)
• Type 2 (context free)
• Type 3 (regular)
For given classes of formal grammars, this hierarchy can be extended to describe grammars at various levels, as in a class–subclass relationship.
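As an illustration of the idea that a finite set of rules can generate the sentences of a language, here is a minimal sketch of a Type 2 (context-free) toy grammar with a naive random generator (the grammar and vocabulary are made-up examples, not from the slides):

import random

# A tiny context-free (Type 2) grammar: each non-terminal maps to a
# list of alternative right-hand sides (illustrative toy rules only).
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "N", "PP"]],
    "VP":  [["V", "NP"], ["V", "NP", "PP"]],
    "PP":  [["P", "NP"]],
    "Det": [["the"], ["a"]],
    "N":   [["child"], ["food"], ["dhaba"]],
    "V":   [["ate"], ["saw"]],
    "P":   [["in"]],
}

def generate(symbol="S"):
    """Expand a symbol into a list of words by randomly choosing rules."""
    if symbol not in GRAMMAR:          # terminal word
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])
    words = []
    for sym in rhs:
        words.extend(generate(sym))
    return words

if __name__ == "__main__":
    for _ in range(3):
        print(" ".join(generate()))    # e.g. "the child ate the food in a dhaba"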
III. Government and Binding
• Explains how sentences are structured in human languages using a set of principles and rules.
• GB theory can be seen as a set of tools that helps us understand how words and phrases are arranged in a sentence.
  - Government refers to how one word influences another word.
  - Binding refers to how pronouns (he, she, himself, etc.) and noun phrases relate to each other in a sentence.
III. Government and Binding
• In computational linguistics, the structure of a language can be understood at the level of its meaning, in order to resolve structural ambiguity.
• Transformational Grammars assume two levels of existence of a sentence: one at the surface level, the other at the deep (root) level.
• Government and Binding theory has renamed these the s-level and d-level, and identified two more levels of representation, called Phonetic Form and Logical Form.
• In GB theory, a language can be considered for analysis at the levels shown below:

        d-structure
             |
        s-structure
          /       \
  Phonetic Form (PF)   Logical Form (LF)

  (Fig 1: different levels of representation in GB)

• If we regard language as the representation of sound and meaning, GB considers both PF and LF, but it is concerned with LF rather than PF.
PF: How a sentence is pronounced (sound representation)
  - "He will go to the park" → "He'll go to the park"

LF: The meaning of a sentence (semantic interpretation)
  - "Everyone loves someone"
    Reading 1: Everyone loves at least one person, but not necessarily the same person.
    Reading 2: There is one specific person whom everyone loves.
• Transformational Grammar has hundreds of rewriting rules, generally language-specific and construct-specific, e.g., separate rules for assertive and interrogative sentences in English, or for active and passive voice.
• GB envisages that if we define rules over structural units at the deep level, we can generate any language with a few rules. Deep-level structures are abstractions of noun phrases, verb phrases, etc., common to all languages (e.g., in child language acquisition, an abstract structure enters the mind and gives rise to actual phonetic structures).
• GB thus posits deep-level, language-independent abstract structures, and the expression of these structures at the surface level, which is language-specific, using a few simple rules.
Surface Structure: represents the final form of the sentence
after transformations
Passive Movement: The object "Mukesh" moves to the subject position.
Insertion of Auxiliary Verb: "Be" is inflected to "was."
Deletion of Object Placeholder (e): The empty category (e) represents the removed
subject.
Deep Structure (D-structure) represents the basic, underlying form of the
sentence before any transformations occur.
Deep structure: "She married him"
Surface structure: "He was married"

Surface-structure tree of "He was married":

          S
        / | \
      NP INFL  VP
      |    |   / \
     He  past  V   VP
               |    |
              be    V
                    |
                 married
In Phrase Structure Grammar (PSG) each constituent consists of two
components:
• the head (the core meaning) and
• the complement (the rest of the constituent that completes the core
meaning).
For example, in the verb phrase "[ate icecream ravenously]", the complement 'icecream' is necessary for the verb 'ate', while 'ravenously' can be omitted. We have to disentangle the compulsory from the omissible in order to examine the smallest complete meaning of a constituent. This partition is suggested in Xʹ Theory (Chomsky, 1970).

Xʹ Theory (pronounced 'X-bar Theory') states that each constituent consists of four basic components:
• head (the core meaning),
• complement (compulsory element),
• adjunct (omissible element),
• and specifier (an omissible phrase marker).
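As a rough sketch (my own encoding, not from the slides), the X-bar partition of the verb phrase above can be written as a nested structure that separates the head, its compulsory complement, and the omissible adjunct:

# A hand-built X-bar analysis of "ate icecream ravenously" as nested dicts.
# The labels (head/complement/adjunct/specifier) follow the four components above.
vp = {
    "category": "VP",            # maximal projection of the head V
    "specifier": None,           # no specifier in this example
    "adjunct": "ravenously",     # omissible element
    "X_bar": {
        "head": "ate",           # the core meaning (V)
        "complement": "icecream" # compulsory element completing the head
    },
}

def core_meaning(phrase):
    """Return head + complement, i.e. the smallest complete meaning."""
    xbar = phrase["X_bar"]
    return f'{xbar["head"]} {xbar["complement"]}'

print(core_meaning(vp))  # -> "ate icecream"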

Components of GB
• Government and Binding comprises a set of theories that map structures from d-structure to s-structure. A general transformational rule called 'Move α' is applied to d-structure and s-structure.
• Move α can move constituents to any place, as long as it does not violate the constraints put forward by GB's various theories and principles.
Stage              | Description                                           | Example
D-Structure        | Basic structure before movement                       | "John likes Mary."
Move α             | Elements are moved to form grammatical sentences      | "Mary is liked by John."
S-Structure        | The sentence after transformations                    | "What do you want?"
Move α (again)     | More transformations before final meaning processing  |
Logical Form (LF)  | Final meaning interpretation                          | "Everyone saw a movie." (Did they see the same movie?)
• GB consists of 'a series of modules that contain constraints and principles' applied at various levels of its representations, together with the transformation rule Move α.
• These modules include X-bar theory, the projection principle, θ-theory, the θ-criterion, C-command and government, case theory, the empty category principle (ECP), and binding theory.
• GB considers three levels of representation (d-structure, s-structure, and LF) as syntactic; LF is also related to meaning or semantic representation.
Eg: Two countries are visited by most travellers.
• An important concept in GB is that of constraints, which can prohibit certain combinations and movements. GB states its constraints cross-lingually, e.g., 'a constituent cannot be moved from position X' (the rules are language-independent).
X-bar Theory
• This is one of the central concepts in GB. Instead of defining several phrase structures and sentence structures with separate sets of rules, X-bar theory defines them all as maximal projections of some head.
• The entities defined become language-independent. Thus, noun phrase (NP), verb phrase (VP), adjective phrase (AP), and prepositional phrase (PP) are maximal projections of the noun (N), verb (V), adjective (A), and preposition (P) heads, where X = {N, V, A, P}.
• GB envisages a semi-phrasal level denoted by X̄ (X-bar) and a second, maximal projection at the phrasal level denoted by XP (X double-bar).
• Move α (Move Alpha) is applied to "most travellers", moving it to the front.
• The subscript "i" shows that "most travellers" has moved from its original position.
• The empty category (e) is left behind, representing the movement.
• This changes the interpretation to: "For most travellers, there exist two specific countries that they visit."
• In LF1, the focus is on the countries.
• In LF2, the focus is on the travellers.
Understanding Figure 2.7(a) – X-Bar Theory
General Structure of a Phrase
The X-Bar Theory is a model of phrase structure that explains how words form larger
syntactic units.
• X̄ (X-bar) Theory suggests that phrases have a hierarchical structure with four
key components:
• Head (X) – The core element that determines the type of phrase (e.g., Noun
for NP, Verb for VP).
• Specifier – A word that modifies or specifies the head (e.g., articles like "the"
in "the food").
• Modifier – Additional elements (adjectives, adverbs, or prepositional
phrases) that modify the phrase.
• Argument – A required element that completes the meaning of the head.
• The Maximal Projection (XP, the X double-bar level) is the highest level of the phrase (e.g., NP for a noun phrase, VP for a verb phrase).
Understanding Figure 2.7(b) – NP Structure
Example: "The food in a dhaba"
This part of the image explains the syntactic structure of the noun phrase
(NP):
Determinant (Det): "the"
Noun (N): "food"
Prepositional Phrase (PP): "in a dhaba" (modifies the noun "food")
Tree Breakdown:
The entire phrase is an NP (Noun Phrase).
"The" (Det) is the Specifier of the noun.
"Food" (N) is the Head of the NP.
VP (Verb Phrase) Structure
• VP (Verb Phrase) is the main phrase.
• Head (V): "ate"
• NP (Noun Phrase) as object: "the food"
• PP (Prepositional Phrase) as modifier: "in a dhaba"

This structure shows that "ate" is the main verb, "the food" is the object, and
"in a dhaba" is an optional prepositional phrase modifying the VP.
AP (Adjective Phrase) Structure
• AP (Adjective Phrase) is the main phrase.
• Head (A): "proud"
• Degree Modifier (Deg): "very"
• PP (Prepositional Phrase) as modifier: "of his country"

An adjective is a word that modifies (describes or gives more information about) a noun or pronoun. It tells us what kind, how many, or which one. Ex: She has a beautiful dress.
Tree Explanation:
• The adjective "proud" is the Head.
• The degree modifier "very" strengthens the adjective.
• The PP "of his country" gives additional information about "proud".

This structure shows that "proud" is the main adjective, "very" intensifies it, and "of his country" modifies it.
PP (Prepositional Phrase) Structure
• PP (Prepositional Phrase) is the main phrase.
• Head (P): "in"
• NP (Noun Phrase) as complement: "a dhaba"

Tree Explanation:
The preposition "in" is the Head.
The NP "a dhaba" is the Complement.
"a" is the determiner (Det).
"dhaba" is the noun (N).
👉 This structure shows that "in" is the preposition, which takes "a dhaba" as its complement.
Maximal Projection of Sentence Structure

S̄ (S-bar, or CP – Complementizer Phrase) → COMP (Complementizer) + S

• The word "that" is a complementizer, a word that introduces an embedded clause (subordinate clause).
• Complementizers like "that", "whether", and "if" are used to connect a dependent clause to a main clause.
Ex: I know that she ate the food in a dhaba. (Here, "that she ate the food in a dhaba" is the complement clause of "I know".)
Subcategorization
• GB does not use traditional phrase-structure rules; it works with maximal projections and subcategorization.
• Any maximal projection can be an argument of a head, but subcategorization is used as a filter to permit various heads to select only a certain subset of the range of maximal projections.
Eg:
• The verb 'eat' can subcategorize for an NP,
• whereas the verb 'sleep' cannot; so 'ate food' is well-formed but 'slept the bed' is not.
Subcategorization tells us which grammatical elements (NP, PP, S', etc.) must or can follow a verb. It explains why "She ate food" is correct but "She slept the bed" is wrong.
• "Sleep" is intransitive → no NP object allowed.
• "Eat" is transitive → an NP object is required.
• "Slept the bed" is wrong because "sleep" does not take an NP.
• "Ate food" is correct because "eat" requires an NP.
• "Slept on the bed" is correct because "on the bed" is a PP, not an NP.
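A minimal sketch of subcategorization as a filter; the lexicon entries below are illustrative assumptions covering only the two verbs in the example:

# Each verb lists the categories of complements it subcategorizes for.
SUBCAT = {
    "eat":   [["NP"]],          # transitive: requires an NP object
    "sleep": [[], ["PP"]],      # intransitive: no NP, an optional PP is fine
}

def licensed(verb, complements):
    """Return True if the verb's subcategorization frame allows these complements."""
    return complements in SUBCAT.get(verb, [])

print(licensed("eat",   ["NP"]))   # True  -> "ate food"
print(licensed("sleep", ["NP"]))   # False -> *"slept the bed"
print(licensed("sleep", ["PP"]))   # True  -> "slept on the bed"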
Projection Principle

• This is another basic notion in GB. It places a constraint on the three syntactic representations and on their mapping from one to the other: all syntactic levels are projected from the lexicon.

❖ Theta Theory, or the Theory of Thematic Relations

Subcategorization puts restrictions only on the syntactic categories which a head can accept. GB puts further restrictions on lexical heads: they assign roles to their arguments, and these role assignments are related to 'semantic relations'.
• Theta role and theta criterion

The thematic roles from which a head can select are mentioned in the lexicon; e.g., the word 'eat' can take (Agent, Theme).
Eg: Mukesh ate food (agent role to Mukesh, theme role to food).

❖ Roles are assigned based on the syntactic positions of the arguments; it is important that there be a match between the number of roles and the number of arguments, as required by the theta criterion.
❖ The theta criterion states that 'each argument bears one and only one theta role, and each theta role is assigned to one and only one argument'.
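A small sketch of the theta criterion as a counting check; the theta grids below are illustrative assumptions:

# Theta grids: the thematic roles each verb assigns, as listed in the lexicon.
THETA_GRID = {
    "eat":   ["Agent", "Theme"],
    "sleep": ["Agent"],
}

def satisfies_theta_criterion(verb, arguments):
    """Each argument gets exactly one role and each role exactly one argument,
    which for this simple check means the counts must match."""
    roles = THETA_GRID[verb]
    return len(roles) == len(arguments)

print(satisfies_theta_criterion("eat", ["Mukesh", "food"]))  # True
print(satisfies_theta_criterion("eat", ["Mukesh"]))          # False: the Theme role is unassigned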
C-command (Constituent Command) and Government
• C-command is a syntactic relationship that helps define structural dependencies in a sentence, such as binding, scope, and the interpretation of pronouns and anaphors.
• C-command is defined in terms of the scope of maximal projections: if any word or phrase falls within the scope of, and is determined by, a maximal projection, we say that it is dominated by that maximal projection. For two structures α and β related in such a way that "every maximal projection dominating α also dominates β", we say that α C-commands β.
• The definition of C-command does not refer to all maximal projections dominating β, only to those dominating α.
Definition of C-Command
A node α C-commands a node β if and only if:
1. Every maximal projection that dominates α also dominates β.
2. α does not dominate β, and β does not dominate α.

This means that α and β must be "sibling-like" structures, with a common dominating node, but neither should dominate the other.

        S
       / \
     NP    VP
     |    /  \
     N   V    NP
     |   |     |
   John loves Mary

• "John" C-commands "loves" and "Mary", because the NP ("John") and the VP ("loves Mary") are both dominated by S, their closest maximal projection.
• However, the noun "John" (the N node inside the subject NP, as opposed to the NP itself) does NOT C-command "Mary", because "Mary" lies outside a maximal projection (the subject NP) that dominates the N node.
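A minimal sketch of the c-command check on the tree above, using the simplified definition given on this slide (the tree encoding and helper functions are my own):

# Tree encoded as (label, children...); leaves are plain word strings.
TREE = ("S",
        ("NP", ("N", "John")),
        ("VP", ("V", "loves"), ("NP", ("N", "Mary"))))

MAXIMAL = {"S", "NP", "VP", "AP", "PP"}   # simplified: phrasal labels count as maximal projections

def paths(tree, ancestors=()):
    """Yield (node, ancestors) for every subtree and leaf."""
    yield tree, ancestors
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from paths(child, ancestors + (tree,))

def find(name):
    """Return (node, ancestors) of the first node whose label or word is `name`."""
    for node, ancestors in paths(TREE):
        label = node[0] if isinstance(node, tuple) else node
        if label == name:
            return node, ancestors
    raise ValueError(name)

def c_commands(a, b):
    node_a, anc_a = find(a)
    node_b, anc_b = find(b)
    if node_a in anc_b or node_b in anc_a:       # neither may dominate the other
        return False
    max_over_a = [n for n in anc_a if n[0] in MAXIMAL]
    return all(n in anc_b for n in max_over_a)   # every maximal projection over a is over b too

print(c_commands("NP", "VP"))      # True:  the subject NP c-commands the VP (and "loves", "Mary")
print(c_commands("John", "Mary"))  # False: the N "John" is trapped inside the subject NP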
Government, Movement, Empty Category and Co-indexing
• "α governs β" iff:
  - α C-commands β,
  - α is an X head (e.g., noun, verb, preposition, adjective or inflection), and
  - every maximal projection dominating β also dominates α.
• Movement
  In GB, Move α is described as 'move anything anywhere', though restrictions are provided for valid movement.
• In GB, movements include the active-to-passive transformation, wh-movement (wh-questions) and NP-movement.
  What did Mukesh eat? [Mukesh INFL eat [what]]
• Lexical categories (N, V, A) must exist at all three levels.
• GB assumes the existence of abstract entities called empty categories (invisible elements).
  In GB there are four types of empty categories: two empty NP positions, called wh-trace and NP-trace, and two pronouns, called small pro and big PRO.
  They are distinguished by two properties: anaphoric (+a or -a) and pronominal (+p or -p).
• Co-indexing is the indexing of the subject NP and AGR at d-structure, which is preserved by Move α.

  1. Wh-trace    -a  -p
  2. NP-trace    +a  -p
  3. Small pro   -a  +p
  4. Big PRO     +a  +p

Properties of Empty Categories

Empty categories are classified based on two properties:
1. Anaphoric (+a or -a):
   - If +a, the category depends on an antecedent for its meaning (e.g., traces).
   - If -a, it does not require an antecedent (e.g., pro).
2. Pronominal (+p or -p):
   - If +p, the category behaves like a pronoun (e.g., pro, PRO).
   - If -p, it does not behave like a pronoun (e.g., traces).
Classification of Empty Categories in GB Theory
Empty categories (ECs) are syntactic elements that exist in sentence structure but are not pronounced. They are classified based on two properties:
1. Anaphoricity (+a or -a) → Does it need an antecedent?
   • If (+a) → Needs an antecedent (its meaning comes from another element in the sentence).
     • Example: traces (wh-trace, NP-trace)
     • "What did John eat t?" – the trace t refers back to "what" (the wh-word), so it needs an antecedent → +a
   • If (-a) → Does NOT need an antecedent (its meaning is understood without a reference).
     • Example: pro (in pro-drop languages like Kannada): "(pro) speaks Kannada fluently"
     • pro is understood as "he/she/they", but it doesn't depend on any explicit antecedent → -a
2. Pronominality (+p or -p) → Does it behave like a pronoun?
This property determines whether an empty category behaves like a pronoun (i.e., whether it can refer to a person or thing).
   • If (+p) → Acts like a pronoun (can take the role of "he", "she", "it", etc.).
     • Example: pro, PRO
     • Kannada "ಓದುತ್ತಾನೆ" ("[he] reads"): the omitted subject refers to a person (like "he/she"), so it behaves like a pronoun → +p
   • If (-p) → Does NOT act like a pronoun (doesn't refer to a person/thing).
     • Example: traces (wh-trace, NP-trace)
     • "The cake was eaten t." – the trace t is just a placeholder, NOT a pronoun → -p
Wh-trace (t_wh) – Wh-Movement

Definition: A wh-trace (t_wh) is created when a wh-word (such as what, who, where) moves to the front of a sentence in wh-movement. The original position of the wh-word is left empty, creating a trace.
Example: John ate what?
  → What_i did John eat t_i?

NP-trace (t_NP) – NP-Movement

• An NP-trace (t_NP) occurs when an NP moves due to passivization or raising verbs.
• The original position of the NP is left empty, and a trace is left behind.
Example: Someone ate the cake.
  → The cake_i was eaten t_i.
Small pro (pronoun)
Small pro refers to an empty subject pronoun found in languages that allow pro-drop, like Kannada, Spanish, Italian, and Chinese.
Example:
ನಾನು ಶಾಲೆಗೆ ಹೋಗುತ್ತೇನೆ → "I go to school."
(small pro) ಶಾಲೆಗೆ ಹೋಗುತ್ತೇನೆ → "(I) go to school." (the subject is omitted)

The subject (ನಾನು / "I") is not spoken but understood. This missing subject is represented as small pro in GB theory.
Big PRO (Pronoun)
•The term PRO (uppercase) refers to an empty subject in
control constructions.
•It is called "Big" because it appears in non-finite clauses
(clauses without tense), unlike small pro, which appears in
finite clauses.
•PRO is "big" because it has more syntactic restrictions
than small pro.
Example
The teacher told Vishnu [PRO to study].
The meaning is: The teacher told Vishnu that Vishnu should study
Incorrect: The teacher told Vishnu [he to study]
Binding Theory
• Binding is defined as: α binds β iff
  - α C-commands β, and
  - α and β are co-indexed.
Eg: Mukesh was killed.
  d-structure: [e_i INFL kill Mukesh_i]
  s-structure: [Mukesh_i was killed (by e_i)]
  The empty category (e_i) and Mukesh (NP_i) are bound.
Binding theory can be given as follows:
(a) An anaphor (+a) is bound in its government category.
(b) A pronominal (+p) is free in its government category.
(c) An R-expression is free everywhere.
1. Principle A (Anaphors: +a, -p)
An anaphor (e.g., "himself", "herself") must be bound in its government
category.
• Example:
• Correct: John saw himself. → "Himself" is bound by "John" in the same clause.
• Incorrect: John said that Mary saw himself. → "Himself" has no local binder
(wrong).
2. Principle B (Pronominals: -a, +p)
A pronominal (e.g., "he", "she") must be free in its government category.
• Example:
• Correct: John said that he left. → "He" is free in the embedded clause.
• Incorrect: John saw him. → "Him" is bound in the same clause (wrong).
3. Principle C (R-expressions: -a, -p)
An R-expression (Referential) (e.g., "Mukesh", "John", "Mary") must be free (not
bound) everywhere.
• Example:
• Correct: John said that Mary left. → "Mary" is not bound.
• Incorrect: He said that John left. → If "He" refers to "John", it's wrong because
"John" must be free.
Empty Category Principle (ECP):
α properly governs β iff: α governs β and α is lexical (i.e., N, V, A or P), or α locally A-binds β.
The ECP says: 'A trace must be properly governed.'
Example (allowed): What did John eat __?
Example (ECP violation): *Who do you think that __ ate the cake? (the trace after "that" is not properly governed)
Bounding Theory, Case Theory and the Case Filter

• In GB, case theory deals with the distribution of NPs and requires that each NP must be assigned a case.
  (Case refers to the grammatical role that a noun or pronoun plays in a sentence. Ex: She ate an apple.)

• In English we have nominative, objective, genitive, etc. cases, which are assigned to NPs at particular positions.

• Indian languages are rich in case markers, which are carried along even during movements.

Case Filter:
An NP is ungrammatical if it has phonetic content, or is an argument, and is not case-marked.

Phonetic content here refers to some physical realization, as opposed to empty categories. The case filter restricts NP movement.
LFG Model: Lexical Functional Grammar (LFG)

Two syntactic levels:

constituent structure (c-structure) – phrase structure representation (tree format)

functional structure (f-structure) – grammatical function representation (subject, object, etc.)

ATN (Augmented Transition Networks) – an earlier computational model linking syntax and meaning, which used phrase structure trees to represent the surface form of sentences and an underlying predicate–argument structure.

LFG instead pairs a c-structure with an f-structure for each sentence.

Example: She saw stars in the sky

• ↑ (Up Arrow): refers to the f-structure of the mother node (the phrase containing the element).
• ↓ (Down Arrow): refers to the f-structure of the current node (the specific element itself).

Rule 1 (S → NP VP):
• (↑ subj) = ↓ means that the f-structure of the NP (subject) goes into the f-structure of the entire sentence (S) as its subject.
• ↑ = ↓ means that the f-structure of the VP (verb phrase) is identified directly with the f-structure of S (since the verb defines the sentence's action).
Functional Notation in Lexical Functional Grammar (LFG)
f-structure of the given sentence

The order follows a grammatical hierarchy where features affecting agreement appear first.

Three key properties of f-structure in linguistic theory:
1. Consistency – Each attribute in an f-structure can have only one value. If conflicting values are found (e.g., singular and plural for a noun), the structure is rejected.
2. Completeness – An f-structure must include all the functions required by its predicate. If a predicate requires an object but none is provided (e.g., "He saw" without specifying what was seen), the structure is incomplete.
3. Coherence – All governable functions in an f-structure must be governed by a predicate. If an object appears where the verb does not allow it, the structure is rejected.
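A minimal sketch of the Completeness and Coherence checks on a toy f-structure (the dictionary encoding and the predicate lexicon are my own illustrative assumptions; Consistency is enforced automatically here because a dict can hold only one value per attribute):

# Toy f-structures as Python dicts.
LEXICON = {"see": ["SUBJ", "OBJ"], "sleep": ["SUBJ"]}   # functions governed by each predicate
GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL"}

def complete(fs):
    """Completeness: every function required by the predicate must be present."""
    return all(f in fs for f in LEXICON[fs["PRED"]])

def coherent(fs):
    """Coherence: every governable function present must be governed by the predicate."""
    return all(f in LEXICON[fs["PRED"]] for f in fs if f in GOVERNABLE)

fs_ok  = {"PRED": "see", "SUBJ": {"PRED": "she"}, "OBJ": {"PRED": "stars"}}
fs_bad = {"PRED": "sleep", "SUBJ": {"PRED": "he"}, "OBJ": {"PRED": "bed"}}  # *"he slept the bed"

print(complete(fs_ok), coherent(fs_ok))    # True True
print(complete(fs_bad), coherent(fs_bad))  # True False  (OBJ is not governed by 'sleep')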
Lexical Rules in Lexical-Functional Grammar (LFG)
Active Sentences (The subject performs the action, and the object receives it)
Example:
Tara ate the food. → ("Tara" is the subject, "ate" is the verb, and "the
food" is the object.)

Passive Sentences (The object becomes the subject, and the original subject moves to
an optional phrase.)
Example:
• Passive: The food was eaten by Tara. → ("The food" is now the
subject, "was eaten" is the verb, and "by Tara" is an optional agent
phrase.)
Active Structure: Pred = ‘eat < (↑ Subj) (↑ Obj) >’
Passive Structure: Pred = ‘eat < (↑ Obl_ag) (↑ Subj) >’
[oblique agent phrase (Obl_ag): special grammatical element used in passive
sentences to indicate who performed the action. Example: Active Voice: Tara ate
the food. Passive Voice (Sentence Rewritten): The food was eaten by Tara. “by
Tara" is the oblique agent phrase (Obl_ag) because it represents the original subject
(Tara) and it is no longer the main subject of the sentence ]
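A small sketch of the passive lexical rule as a mapping over the predicate's argument list (the encoding is my own; it simply renames SUBJ → OBL_ag and OBJ → SUBJ, as described above):

# Active entry for "eat": pred = 'eat<(↑SUBJ)(↑OBJ)>'
active = {"pred": "eat", "args": ["SUBJ", "OBJ"]}

def passivize(entry):
    """Apply the lexical rule: SUBJ -> OBL_ag (optional 'by' phrase), OBJ -> SUBJ."""
    rename = {"SUBJ": "OBL_ag", "OBJ": "SUBJ"}
    return {"pred": entry["pred"],
            "args": [rename.get(a, a) for a in entry["args"]]}

print(passivize(active))   # {'pred': 'eat', 'args': ['OBL_ag', 'SUBJ']}
# i.e. pred = 'eat<(↑OBL_ag)(↑SUBJ)>'  ->  "The food was eaten by Tara."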
Causativization (Making Someone Do Something)

Causativization is when an action is caused by someone rather than being done


directly.
Example:
Active: तारा हंसी (Taaraa hansii) → Tara laughed
Pred = ‘Laugh < (↑ Subj) >’
Causative (when someone causes an action to happen):
मोनिका ने तारा को हँसाया (Monika ne Tara ko hansaayaa) →
Monika made Tara laugh
Pred = ‘cause < (↑ Subj) (↑ Obj) (Comp) >’
Subject (Subj): Monika (the causer)
Object (Obj): Tara (the one affected)
Complement (Comp): Tara laugh (this is the action that was caused). (complement
that tells what action is being caused)
Long-Distance Dependencies and Coordination in LFG
Long-distance dependency happens when a word or phrase is moved from its usual position
in a sentence to another position. In English, this often happens with questions (wh-
movement).
Example: Tara likes which picture most?
(Wh-Movement) Which picture does Tara like most?
In GB Theory (Government and Binding Theory), when a word or phrase moves to a
different position, it leaves an empty category (a placeholder for the missing part).
Which picture does Tara like __ most? (invisible placeholder or trace)

LFG does not create empty categories like GB Theory. Instead, it uses functional structures
to maintain connections and Coordination.
The moved phrase (which picture) is still linked to its original position functionally.
Coordination refers to how different sentence elements are linked logically.
Example: Tara likes ‘tea and coffee’
"Which picture does Tara like__ most?“

1. Focus
• Represents the wh-word phrase, which is in focus.
• In this case, "Which picture" is the focus.
2. Pred (‘picture ⟨(Oblth)⟩’)
• The predicate (Pred) represents the main word of the phrase.
• Here, "picture" is the noun, and Oblth (oblique thematic) indicates that the picture is
related to something else (like an owner or subject).
3. Oblth(Oblique Thematic Role)
• It refers to an oblique phrase that provides additional information, such as who the
picture is related to.
• Contains:
• pred ‘PRO’ → PRO stands for a pronoun-like element.
• Refl + → Suggests reflexive or reference to something already mentioned.
4. Subj (Subject)
• pred ‘Tara’ → Identifies "Tara" as the subject of the sentence.
5. Obj (Object)
• The object is left empty, as the wh-word ("which picture") has moved.
6. Pred (‘like ⟨(↑Subj) (↑Obj)⟩’)
• Represents the verb "like", which takes a subject and an object.
Paninian Framework
It is a linguistic model based on Paninian Grammar (PG), which was written by Panini in 500 BC in Sanskrit (originally titled Astadhyayi). Though originally designed for Sanskrit, this framework can be applied to other Indian languages and some Asian languages.
Key Features
1. SOV Structure: Unlike English (which follows Subject-Verb-Object [SVO] order), most Asian languages follow Subject-Object-Verb [SOV] order.
   Example:
   - English (SVO): Tara likes apples.
   - Hindi (SOV): तारा को सेब पसंद है। (Tara ko seb pasand hai)
2. Inflectional Richness: Many Indian languages rely on inflectional changes to convey grammatical relationships (e.g., tense, case, gender), instead of relying solely on word order.
   Example (in Sanskrit):
   - रामः ग्रामं गच्छति (Rāmaḥ grāmaṁ gacchati) → Rama goes to the village.
   - रामेण ग्रामः गम्यते (Rāmeṇa grāmaḥ gamyate) → The village is gone to by Rama.
   - Here, "Rama" (रामः / रामेण) changes its form based on its grammatical role.
3. Syntactic and Semantic Cues: The Paninian framework focuses on meaning-based analysis rather than just word order, making it useful for analyzing complex Indian languages.
4. Ongoing Research: The framework is still being explored for its application to Indian languages, as many complexities remain to be explained.
Some Important Features of Indian Languages
Layered representation of PG
• GB, in general, considers deep structure, surface structure and LF, where LF is nearer to semantics.
• The Paninian grammar framework is said to be syntactico-semantic: it moves from the surface layer to deep semantics by passing through intermediate layers.

• Language is treated as a multi-layered process. You start with spoken words (surface level), add grammatical roles (vibhakti level), determine who is doing what (karaka level), and finally understand the real meaning (semantic level).
• Paninian Grammar follows this approach to ensure sentences preserve meaning even if word order changes.
• This is useful in languages like Hindi and Sanskrit, where word order is flexible, but meaning remains clear.
• Vibhakti literally means inflection, but here it refers to word groups (noun, verb, or other) formed on the basis of case endings, postpositions, compound verbs, or main and auxiliary verbs, etc.
• Instead of talking about NP, VP, AP, PP, etc., word groups are formed based on various kinds of markers. These markers are language-specific, but all Indian languages can be represented at the vibhakti level.
• The karaka level roughly corresponds to what case, theta roles and the theta criterion are in GB.
• PG has its own way of defining karaka relations; these relations describe how word groups participate in the activity denoted by the verb group (they are syntactic as well as semantic).
KARAKA THEORY
• This is the central theme of the PG framework: relations are assigned based on the roles played by the various participants in the main activity.
• These roles are reflected in the case endings and postposition markers.
• Case relations exist in English as well, but not with the richness of the case endings found in Indian languages.
• The karakas are Karta (subject/agent), Karma (object/theme), Karana (instrument), Sampradana (beneficiary), Apadana (separation/source) and Adhikarana (locus).
Issues in Paninian Grammar (PG)

• Computational implementation of PG – translating PG into a computer-friendly format is complex.
• Adaptation of PG to Indian and other similar languages.
• Mapping vibhakti to semantics: a single vibhakti can correspond to several semantic relations.
n-gram Model
• n-gram model: is used in statistical language modeling to estimate the
likelihood (probability) of a sentence appearing in a given language.
• The n-gram model helps us calculate this probability using past words in a
sentence.
• Instead of treating the whole sentence as a single unit, the model breaks it
down into smaller parts and calculates probabilities step by step.
• This follows the chain rule of probability, which means the probability of a
sentence P(s) is the product of the probabilities of each word appearing,
given the previous words.

Suppose we have a sentence consisting of words:
  s = (w1, w2, w3, ..., wn), where w1 is the first word, w2 the second, and so on; wn is the last word.
By the chain rule,
  P(s) = P(w1) · P(w2 | w1) · P(w3 | w1 w2) · ... · P(wn | w1 ... wn-1),
and an n-gram model approximates each factor using only the previous n-1 words.

Special pseudo-words are introduced to mark the beginning and end of a sentence:
• <s> → marks the beginning of a sentence.
• </s> → marks the end of a sentence.
• In trigram models, we use <s1> and <s2> to mark the start.
How Do We Estimate These Probabilities?

• To train an n-gram model, we use real-world text data


(corpus). We count how often a particular n-gram
appears and divide it by the total occurrences of its
history.
• The formula for estimating probabilities using Maximum
Likelihood Estimation (MLE) is:
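In its standard form, the estimate is:
  P(w_i | w_{i-n+1} ... w_{i-1}) = C(w_{i-n+1} ... w_{i-1} w_i) / C(w_{i-n+1} ... w_{i-1}),
i.e., the count of the n-gram divided by the count of its history.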
In general, the sum of the counts of all n-grams that share the same first n-1 words equals the count of that common prefix (history).
The trained model M can then be evaluated by the probability P(T | M) that it assigns to the training (or test) set T.
Training Set
1. The Arabian Knights
2. These are the fairy tales of the east
3. The stories of the Arabian knights are translated in many languages
We estimate probabilities using a bigram model.

Formula for bigram model
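In its standard form, the bigram estimate is:
  P(w_i | w_{i-1}) = C(w_{i-1} w_i) / C(w_{i-1}).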


P(The | <s>) = 2/3 ≈ 0.67
• C(<s>, The) = 2 (because "The" appears as the first word of two sentences: "The Arabian Knights" and "The stories of the Arabian knights...")
• C(<s>) = total number of sentences = 3
P(Arabian / The) = 0.4
•C(The) = 5 (appears 5 times in total)
•C(The, Arabian) = 2 (word pair appears twice: "The
Arabian Knights" and "The stories of the Arabian
knights...")
P(Knights / Arabian) = 1.0
•C(Arabian) = 2 (appears twice)
•C(Arabian, Knights) = 2 (word pair "Arabian Knights"
appears twice)
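A short sketch that reproduces these bigram estimates from the three training sentences (counting is case-insensitive, and <s>/</s> markers are added as described earlier):

from collections import Counter

corpus = [
    "The Arabian Knights",
    "These are the fairy tales of the east",
    "The stories of the Arabian knights are translated in many languages",
]

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def p(word, history):
    """MLE bigram probability P(word | history) = C(history, word) / C(history)."""
    return bigrams[(history, word)] / unigrams[history]

print(round(p("the", "<s>"), 2))          # 0.67
print(round(p("arabian", "the"), 2))      # 0.4
print(round(p("knights", "arabian"), 2))  # 1.0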
Add-One Smoothing (Laplace Smoothing)

In n-gram models, we estimate the probability of a word appearing after a


sequence of previous words based on how often that sequence occurs in the
training data.
However, a major problem arises when we encounter word sequences that
never appeared in the training data. If a certain n-gram was not seen in
training, its probability is calculated as zero, which means the model cannot
generate or recognize new sequences.
Ex: The Arabian knights are strong
P(strong | are) = 0.

Solution is smoothing
Smoothing is a technique used to fix the zero-probability problem by
adjusting how probabilities are assigned. It does this by giving a small amount
of probability to unseen n-grams so that nothing has zero probability.
One of the simplest smoothing techniques is Add-One Smoothing, also called
Laplace Smoothing.
What is Add-One Smoothing?
Add-One Smoothing (Laplace Smoothing) is a simple method where
we add 1 to all n-gram counts before normalizing probabilities.

Formula (For n-gram Model):
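In its standard form, with vocabulary size V:
  P_Laplace(w_i | w_{i-n+1} ... w_{i-1}) = (C(w_{i-n+1} ... w_{i-1} w_i) + 1) / (C(w_{i-n+1} ... w_{i-1}) + V),
so every n-gram, seen or unseen, receives a non-zero probability.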


Good-Turing Smoothing
• Good-Turing Smoothing is a statistical technique proposed by Alan Turing and I.
J. Good (1953) to handle the problem of data sparsity in n-gram language
models. The main idea is to adjust the estimated probabilities of observed and
unseen n-grams by redistributing some probability mass from frequent n-
grams to infrequent and unseen ones.
• Good-Turing modifies the count (frequency) f of an n-gram and replaces it with
an adjusted count f∗:
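In its usual form, if N_f is the number of n-grams that occur exactly f times in the training data, then
  f* = (f + 1) · N_{f+1} / N_f,
and the probability mass N_1 / N (with N the total number of observed n-grams) is left over for unseen n-grams.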
Example:
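As an illustration (with made-up counts, not the slide's original figures): if N_1 = 100 bigrams occur exactly once and N_2 = 30 occur exactly twice, a bigram seen once gets the adjusted count f* = (1 + 1) × 30 / 100 = 0.6; the mass saved by this discounting is redistributed to unseen bigrams.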
Caching Technique
• Caching is an optimization technique that
improves the basic n-gram model by storing
recently seen n-grams and giving them higher
probabilities.

Why is Caching Needed?


•Language is Context-Dependent
• Certain words or phrases occur more often in specific sections of text but
not uniformly across the dataset.
•Standard N-gram Models Ignore Recent History
• A basic n-gram model treats every sentence independently and does not
consider recent occurrences of words.
