Shallow Parsing

The document discusses shallow parsing and chunk parsing. Shallow parsing aims to divide text into non-overlapping chunks like NPs and VPs without fully analyzing syntax. Chunk parsing can be done with rules or machine learning and is useful for applications like information extraction.

Uploaded by

saisuraj1510

Shallow Parsing

Parsing

Parsing is an intermediate stage
✓ Builds structures used by later stages of processing

Full parsing
✓ Provides more information than we need
✓ A full parse has too much structure: it is too nested
✓ Not a necessary stage for many practical NLP tasks

Shallow Parsing

More locality
• Less long-range context dependence
• Less ambiguity
Shallow/Chunk Parsing

Goal: Divide a sentence into a sequence of chunks (NPs, VPs, AdjPs, PPs).

Chunks are non-overlapping regions of a text
• [I] saw [a tall man] in [the park].
Chunks are non-recursive
• A chunk cannot contain other chunks
Chunks are non-exhaustive
• Not all words are included in chunks
Chunks vs Constituents

A constituent is part of some higher unit in the hierarchical syntactic parse

Constituents: [[a tall man] [in [the park]]].
Chunks: [a tall man] in [the park].

✓ Chunks are not constituents: constituents are recursive, chunks are not
✓ Chunks do not cross major constituent boundaries
Chunk Parsing Examples

Noun-phrase chunking:
[I] saw [a tall man] in [the park].
Verb-phrase chunking:
The man who [was in the park] [saw me].
Prosodic chunking (in speech synthesis):
Prosody is the study of the tune and rhythm of speech.
[I saw] [a tall man] [in the park].
Applications
Locating information
Text retrieval
✓Index a document collection on its noun phrases
Question answering
✓Which [Spanish explorer] discovered [the Mississippi River]?

Constructing annotated text for other applications

Extraction of specialized terminology or multi-word expressions
✓Medical texts or manuals
Chunking

Chunk rule NP: {<DT>?<JJ>*<NN>}
(an optional determiner, then any number of adjectives, then a noun)

Chunk all matching subsequences:

Input
the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN

Apply chunk rule
[the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
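
The chunk rule above can be sketched with Python's standard `re` module. This is a minimal illustration, not the implementation of any particular toolkit: the helper names `parse_tagged`, `chunk_np`, and `render` are hypothetical, and the approach (encoding the tag sequence as a string and matching a regular expression over it) is an assumption made for the example.

```python
import re

def parse_tagged(sentence):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    return [tuple(tok.rsplit("/", 1)) for tok in sentence.split()]

def chunk_np(tagged):
    """Return (start, end) token spans matching the rule <DT>?<JJ>*<NN>."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN><VBD>".
    tags = "".join(f"<{tag}>" for _, tag in tagged)
    # Map the character offset where each tag starts to its token index.
    starts, pos = {}, 0
    for i, (_, tag) in enumerate(tagged):
        starts[pos] = i
        pos += len(tag) + 2  # +2 for the angle brackets
    starts[pos] = len(tagged)
    pattern = re.compile(r"(?:<DT>)?(?:<JJ>)*<NN>")
    return [(starts[m.start()], starts[m.end()]) for m in pattern.finditer(tags)]

def render(tagged, spans):
    """Bracket the chunked spans and leave all other tokens as they are."""
    out, i = [], 0
    for start, end in spans:
        out.extend(f"{w}/{t}" for w, t in tagged[i:start])
        out.append("[" + " ".join(f"{w}/{t}" for w, t in tagged[start:end]) + "]")
        i = end
    out.extend(f"{w}/{t}" for w, t in tagged[i:])
    return " ".join(out)

tagged = parse_tagged("the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN")
print(render(tagged, chunk_np(tagged)))
# [the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
```

Words outside any match (sat, on) stay unbracketed, which is the non-exhaustive property of chunks.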
NP Chunking

[NP I] saw [NP the man] on [NP the hill] with [NP a telescope]
Approaches

Finite-State Parsers

Stochastic Processes which “learn” a grammar

Machine Learning based


Approaches: FST Chunking

Use regular expressions over POS tags to identify constituents, e.g. NP → (DT)? NN+

• Find longest matching chunk
• Hand-built rules
• No recursion
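
A hand-written finite-state scanner for the example rule NP → (DT)? NN+ might look like the following. It is a sketch under stated assumptions: the function name `fst_chunk` is hypothetical, the input is a plain list of POS tags, and "longest match" is taken to mean consuming as many NN tags as possible from each start position.

```python
def fst_chunk(tags):
    """Return (start, end) spans matching (DT)? NN+ with longest-match scanning."""
    spans = []
    i, n = 0, len(tags)
    while i < n:
        j = i
        if tags[j] == "DT":                 # optional determiner
            j += 1
        k = j
        while k < n and tags[k] == "NN":    # one or more nouns, as many as possible
            k += 1
        if k > j:                           # at least one NN consumed: accept the chunk
            spans.append((i, k))
            i = k                           # resume after the chunk (no overlap)
        else:
            i += 1                          # no chunk starts here; move on
    return spans

print(fst_chunk(["DT", "NN", "NN", "VBD", "DT", "JJ", "NN"]))
# [(0, 3), (6, 7)]
```

Because this rule has no adjective slot, the second determiner is not attached to the final noun. Hand-built rules cover only the patterns their author anticipated.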
Approaches: ML-based Chunking

Require an annotated corpus

Train a classifier to label each element of the input in sequence (BIO tagging)

✓ B (beginning of sequence)
✓ I (internal to sequence)
✓ O (outside of any sequence)

BIO tagging representation:
Book/B_VP that/B_NP flight/I_NP quickly/O
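
The BIO representation can be sketched as a pair of conversions between chunk spans and per-token tags. The function names `to_bio` and `from_bio` are hypothetical, chunks are assumed to be `(start, end, label)` triples with an exclusive end, and the decoder assumes a well-formed tag sequence (it does not check that an I tag carries the same label as the preceding B tag).

```python
def to_bio(n_tokens, chunks):
    """Encode (start, end, label) chunks as one B_/I_/O tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in chunks:
        tags[start] = f"B_{label}"
        for i in range(start + 1, end):
            tags[i] = f"I_{label}"
    return tags

def from_bio(tags):
    """Decode a well-formed BIO tag sequence back into (start, end, label) chunks."""
    chunks = []
    start = label = None
    for i, tag in enumerate(tags + ["O"]):      # sentinel closes a trailing chunk
        if start is not None and not tag.startswith("I_"):
            chunks.append((start, i, label))    # current chunk ends before token i
            start = label = None
        if tag.startswith("B_"):
            start, label = i, tag[2:]
    return chunks

words = ["Book", "that", "flight", "quickly"]
chunks = [(0, 1, "VP"), (1, 3, "NP")]
tags = to_bio(len(words), chunks)
print(" ".join(f"{w}/{t}" for w, t in zip(words, tags)))
# Book/B_VP that/B_NP flight/I_NP quickly/O
```

With this encoding, chunking becomes a sequence labeling problem: the classifier predicts one tag per token, and the chunks are recovered by decoding the tags.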
Chunk Evaluation

Compare system output with Gold Standard using


✓Precision: # correct chunks / total chunks retrieved by system
✓Recall: # correct chunks / total actual chunks in test corpus
✓F-measure: (weighted) harmonic mean of precision and recall

F1 = 2PR / (P + R)
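
These measures can be computed over sets of chunk spans, as in the sketch below. It assumes a chunk counts as correct only when its boundaries and label both match the gold standard exactly. The function name `chunk_prf` and the predicted chunks are made up for illustration.

```python
def chunk_prf(gold, predicted):
    """Precision, recall, and F1 over exact-match (start, end, label) chunks."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Gold chunks for: [NP I] saw [NP the man] on [NP the hill] with [NP a telescope]
gold = [(0, 1, "NP"), (2, 4, "NP"), (5, 7, "NP"), (8, 10, "NP")]
# Hypothetical system output: two exact matches and one chunk with a wrong boundary
predicted = [(0, 1, "NP"), (2, 4, "NP"), (5, 6, "NP")]

p, r, f1 = chunk_prf(gold, predicted)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
# P=0.667 R=0.500 F1=0.571
```

A chunk with one wrong boundary counts against both precision and recall, so exact-match scoring is strict about boundary errors.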
Limitations

✓Depends on POS tagging accuracy


✓Ambiguities and errors in labels of training corpus
Distribution of Chunks in the CoNLL Shared Task
(CoNLL: Conference on Computational Natural Language Learning)
Limitations: Shallow or Partial Parsing

Prepositional phrases, which contain important semantic information for interpreting events, are left unattached.
