
NLP Final


Q1) a) What do you mean by part-of-speech tagging? What is the need for this task in NLP?

Part-of-speech tagging is the process of assigning grammatical categories (such as noun, verb, adjective) to
words in a text.
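
As a minimal sketch of the idea, the snippet below tags tokens by looking them up in a tiny hand-written lexicon. The lexicon, the `tag` function, and the "Unknown" fallback are assumptions made for illustration; real taggers are trained on labeled corpora and use context to resolve ambiguous words.

```python
# Toy lexicon (hypothetical): maps a lowercase word to one grammatical category.
LEXICON = {
    "the": "Determiner",
    "cat": "Noun",
    "is": "Verb",
    "sleeping": "Verb",
}

def tag(tokens):
    """Pair each token with its category, or "Unknown" if it is not in the lexicon."""
    return [(tok, LEXICON.get(tok.lower(), "Unknown")) for tok in tokens]

print(tag(["The", "cat", "is", "sleeping"]))
```

A pure lookup like this cannot decide between the noun and verb readings of a word such as "duck", which is why practical taggers also look at the neighbouring words.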

Need for Part-of-Speech Tagging in NLP

1) Syntactic Analysis: Helps identify the role of words in a sentence for parsing and grammar
analysis.
2) Semantic Analysis: Aids in understanding the meaning of words and their relationships in a
sentence.
3) Machine Translation: Facilitates accurate translation by preserving grammatical structure.
4) Information Retrieval: Enhances search accuracy by considering word usage and context.
5) Named Entity Recognition (NER): Helps identify and categorize named entities like people,
organizations, and locations, since these are typically tagged as proper nouns.
6) Text-to-Speech Systems: Assists in generating natural-sounding speech by providing proper
pronunciation cues.
7) Grammar Checking: Enables automated proofreading and correction by identifying word
usage errors.
8) Information Extraction: Supports extracting relevant information from text by understanding
word roles.
9) Improving NLP Models: Contributes to training better NLP models by providing labeled data
for supervised learning.

b) Differentiate between natural languages and programming languages.

No.  Natural Language Processing                         Programming Language

1    Concerned with processing human natural language    A way of writing instructions for the computer
2    Works with flexible human language syntax           Strict syntax for every language
3    Enables computers to interact with human language   Solves tasks and computational problems
4    Works with unstructured text and speech data        Works with structured data, variables, and program logic
5    Focuses on processing and understanding human       Used for specifying algorithms and manipulating data
     language text
6    Used for chatbots, language translation, speech     Used to develop software, applications, and algorithms
     recognition, etc.
7    Ex. machine translation, sentiment analysis         Ex. C, C++, Java, Python, etc.
8    Tools: NLTK, TensorFlow                             Tools: IDEs (Integrated Development Environments), compilers

c) Explain Tokenization with its different types.

- Tokenization is the process of breaking a text into smaller units called tokens.
- Tokens are usually words, phrases, or symbols.

Types of Tokenization:

1) Word Tokenization: breaking a text into individual words


Ex. "The quick brown fox jumps over the lazy dog" is tokenized into ["The", "quick", "brown", "fox",
"jumps", "over", "the", "lazy", "dog"].
2) Sentence Tokenization: splitting a text into individual sentences.
Ex. "This is the first sentence. This is the second sentence." is tokenized into ["This is the first
sentence.", "This is the second sentence."].

3) Whitespace Tokenization: using spaces as separators to break the text into tokens.
Ex. "The sun is shining" is tokenized into ["The", "sun", "is", "shining"].

4) Punctuation Tokenization: using punctuation marks as separators to split the text into
tokens.
Ex. "He said, 'Hello! How are you?'" is tokenized into ["He", "said", ",", "'", "Hello", "!", "How", "are", "you",
"?", "'"].

5) Morphological Tokenization: breaking words down into their morphemes (roots and affixes).
Ex. For "running", morphological tokenization might produce ["run", "-ing"].
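
The first four tokenization types can be sketched with Python's standard `re` module. The function names and regular expressions below are illustrative assumptions, not a standard API; libraries such as NLTK handle many more edge cases (abbreviations, contractions, and so on).

```python
import re

def whitespace_tokenize(text):
    """Split on runs of whitespace only; punctuation stays attached to words."""
    return text.split()

def word_tokenize(text):
    """Keep runs of word characters and drop punctuation."""
    return re.findall(r"\w+", text)

def sentence_tokenize(text):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    return re.split(r"(?<=[.!?])\s+", text.strip())

def punctuation_tokenize(text):
    """Return words and individual punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

print(word_tokenize("The quick brown fox jumps over the lazy dog"))
print(sentence_tokenize("This is the first sentence. This is the second sentence."))
print(whitespace_tokenize("The sun is shining"))
print(punctuation_tokenize("He said, 'Hello! How are you?'"))
```

The naive sentence rule would wrongly split after abbreviations such as "Dr.", which is one reason trained sentence splitters exist.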

Q2) a) What is Natural Language Processing (NLP)? Discuss various stages involved in NLP process with suitable
example.

- Natural Language Processing (NLP) is a subfield of AI.
- It focuses on the interaction between computers and human language.
- It enables machines to understand, interpret, and generate human language.

Stages:
1) Text Acquisition: Gathering relevant textual data.
Ex. Extracting text from news articles for sentiment analysis or information retrieval.

2) Preprocessing: Cleaning and preparing the data for analysis.


Ex. Converting all text to lowercase and removing common words (stop words) like "the" and "and" to
reduce noise in the data.

3) Tokenization: Breaking text into individual words or tokens.


Ex. Tokenizing the sentence "The cat is sleeping" into ["The", "cat", "is", "sleeping"].

4) Part of Speech Tagging: Assigning grammatical categories to tokens.


Ex. Tagging "The cat is sleeping" as [Determiner, Noun, Verb, Verb] where "The" is a
determiner, "cat" is a noun, and so on.

5) Parsing: Analyzing the syntactic structure of sentences.


Ex. Parsing the sentence "The cat chased the mouse" to understand the subject-verb-object
relationship.

6) Semantic Analysis: Understanding the meaning of the text.
Ex. Determining the semantic similarity between two sentences, such as "The cat is on the
mat" and "A feline is resting on the rug."

7) Discourse Integration: Coherent interpretation of a sequence of sentences.


Ex. Understanding the narrative flow and connection between sentences in a paragraph or
document.
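
The preprocessing stage (stage 2) can be sketched in a few lines. The stop-word set here contains only the two words mentioned in the example above and the `preprocess` name is an assumption; real systems use much longer stop-word lists.

```python
# Stop words taken from the preprocessing example above; real lists are far longer.
STOP_WORDS = {"the", "and"}

def preprocess(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [tok for tok in tokens if tok not in STOP_WORDS]

print(preprocess("The cat and the dog"))
```

The output of this step would then be passed on to tagging, parsing, and the later stages.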
b) Discuss the challenges of Natural Language Processing.

1) Ambiguity: Words and phrases often have multiple meanings.


Ex. In the sentence "I saw her duck," the word "duck" could be a bird or an action (to lower oneself).

2) Context Dependency: Interpretation depends on context.


Ex. The word "bank" can refer to a financial institution or the side of a river, depending on context.

3) Lack of Standardization: Languages evolve, introducing variations.


Ex. Differences in spelling between British and American English (e.g., "colour" vs. "color") can pose
challenges.

4) Cultural Nuances: Understanding cultural context in language.


Ex. Idiomatic expressions and culturally specific references may be challenging for models trained on a
different cultural context.

5) Handling Rare Cases: Dealing with uncommon or specialized terms.


Ex. Technical jargon in niche fields may be challenging for models without specific domain knowledge.

6) Data Privacy and Security: Text data often contains sensitive information that must be handled
carefully.
Ex. Emails or chat logs used to train a model may contain names, addresses, or other personal details
that need to be anonymized or protected.

Q3) a) Derive a top-down, depth-first, left-to-right parse tree for the given sentence: “The angry bear chased
the frightened little squirrel” Use the following grammar rules to create the parse tree:

S -> NP VP
NP -> Det Nom
VP -> V NP
Nom -> Adj Nom | N
Det -> the
Adj -> little | angry | frightened
N -> squirrel | bear
V -> chased
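
One way to carry out the derivation is a small recursive-descent parser, sketched below under some assumptions: the function names (`derive`, `expand`, `parse`, `bracket`) and the tuple-based tree format are illustrative, and the input is lowercased so that "The" matches the rule Det -> the. The parser starts from S (top-down), fully expands the first symbol before moving on (depth-first), and processes each right-hand side from left to right, backtracking when a rule fails.

```python
# Grammar from the question: each non-terminal maps to its alternatives, in order.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "Nom"]],
    "VP":  [["V", "NP"]],
    "Nom": [["Adj", "Nom"], ["N"]],
    "Det": [["the"]],
    "Adj": [["little"], ["angry"], ["frightened"]],
    "N":   [["squirrel"], ["bear"]],
    "V":   [["chased"]],
}

def derive(symbol, tokens, pos):
    """Yield (tree, next_pos) for each way `symbol` can cover tokens[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:  # alternatives are tried in the listed order
        yield from expand(rhs, tokens, pos, symbol, [])

def expand(rhs, tokens, pos, label, children):
    """Expand the symbols of `rhs` left to right, backtracking on failure."""
    if not rhs:
        yield (label, children), pos
        return
    for child, next_pos in derive(rhs[0], tokens, pos):
        yield from expand(rhs[1:], tokens, next_pos, label, children + [child])

def parse(sentence):
    """Return the first parse tree that covers the whole sentence, or None."""
    tokens = sentence.lower().split()
    for tree, end in derive("S", tokens, 0):
        if end == len(tokens):
            return tree
    return None

def bracket(tree):
    """Render a tree in bracketed notation."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "(" + label + " " + " ".join(bracket(c) for c in children) + ")"

print(bracket(parse("The angry bear chased the frightened little squirrel")))
```

For the given sentence this prints the tree in bracketed form:
(S (NP (Det the) (Nom (Adj angry) (Nom (N bear)))) (VP (V chased) (NP (Det the) (Nom (Adj frightened) (Nom (Adj little) (Nom (N squirrel)))))))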

b) Explain Derivational and Inflectional morphology in detail with suitable example.


Derivational Morphology: The process of creating new words by adding affixes (prefixes or suffixes)
to a base, changing its meaning or grammatical category.

Ex. 1) Create → Creation (Verb to Noun)


2) Friend → Friendly (Noun to Adjective)
3) Nation → National (Noun to Adjective)
4) Deep → Deepen (Adjective to Verb)

Inflectional Morphology: Adding morphemes to a word to convey grammatical information (such as tense,
number, or person) without changing its core meaning or grammatical category.

Ex. 1) Walk → Walks (3rd person singular)


2) Child → Children (Plural)
3) Run → Ran (Past Tense)
4) Sing → Singing (Present Participle)
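
The distinction can be illustrated with a toy suffix lookup covering only the regular examples above. The table and the `classify_pair` function are assumptions made for illustration; the same suffix can be inflectional in one word and derivational in another, so a real analyzer needs a lexicon. Irregular forms such as Run → Ran and Child → Children, and pairs with spelling changes such as Create → Creation, are deliberately not handled.

```python
# Suffix classes taken from the regular examples above (a deliberate simplification).
SUFFIX_TYPE = {
    "s": "inflectional",    # walk -> walks
    "ing": "inflectional",  # sing -> singing
    "ly": "derivational",   # friend -> friendly
    "al": "derivational",   # nation -> national
    "en": "derivational",   # deep -> deepen
}

def classify_pair(base, derived):
    """Return (suffix, type) if `derived` is `base` plus a known suffix, else None."""
    if derived == base or not derived.startswith(base):
        return None
    suffix = derived[len(base):]
    kind = SUFFIX_TYPE.get(suffix)
    return (suffix, kind) if kind else None

for base, derived in [("walk", "walks"), ("friend", "friendly"), ("run", "ran")]:
    print(base, derived, classify_pair(base, derived))
```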

Q4) a) What are probabilistic context-free grammars? State the benefits of probabilistic parsing.

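Probabilistic Context-Free Grammars (PCFG):

A PCFG extends a context-free grammar by attaching a probability to each production rule, such that
the probabilities of all rules with the same left-hand side sum to 1. The probability of a parse tree
is the product of the probabilities of the rules used to build it. PCFGs are commonly used in
syntactic parsing.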
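
As a minimal sketch, the snippet below scores a parse tree by multiplying the probabilities of the rules it uses. The rule probabilities are invented for illustration (a real PCFG estimates them from a treebank), and the tree format and the `tree_probability` name are assumptions.

```python
import math

# Hypothetical rule probabilities; for each left-hand side they sum to 1.
PCFG = {
    ("NP", ("Det", "Nom")): 1.0,
    ("Nom", ("Adj", "Nom")): 0.4,
    ("Nom", ("N",)): 0.6,
    ("Det", ("the",)): 1.0,
    ("Adj", ("angry",)): 0.3,
    ("Adj", ("little",)): 0.3,
    ("Adj", ("frightened",)): 0.4,
    ("N", ("bear",)): 0.5,
    ("N", ("squirrel",)): 0.5,
}

def tree_probability(tree):
    """Multiply the probabilities of all rules used in `tree`.

    A tree is either a word (str) or a pair (label, [children]).
    """
    if isinstance(tree, str):
        return 1.0  # words themselves carry no rule
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    prob = PCFG[(label, rhs)]
    for child in children:
        prob *= tree_probability(child)
    return prob

# Parse tree for the noun phrase "the angry bear".
TREE = ("NP", [("Det", ["the"]),
               ("Nom", [("Adj", ["angry"]),
                        ("Nom", [("N", ["bear"])])])])

print(tree_probability(TREE))  # 1.0 * 1.0 * 0.4 * 0.3 * 0.6 * 0.5, about 0.036
```

When a sentence has several possible parse trees, a probabilistic parser can compute this score for each one and keep the most probable.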

Benefits of Probabilistic Parsing:


1) Capturing Ambiguity
2) Statistical Learning
3) Flexible Language Modeling
4) Handling Out-of-Vocabulary Words
5) Syntactic Disambiguation
6) Adaptability to Different Domains
7) Scalability

b) Explain with suitable examples the following relationships between word meanings: 1. Homonymy 2. Polysemy 3.
Synonymy 4. Hyponymy

1) Homonymy: Words with the same form but different, unrelated meanings.


Ex. "bat" (flying mammal) vs. "bat" (sports equipment)

2) Polysemy: A single word with multiple related meanings.

Ex. "head" (part of the body) vs. "head" (leader of a department)

3) Synonymy: Words with similar meanings.


Ex. "big" and "large"

4) Hyponymy: Relationship in which a more specific term (hyponym) is a kind of a more general term (hypernym).


Ex. "rose" is a hyponym of "flower"
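
Hyponymy forms an "is a kind of" hierarchy, which can be sketched as a simple lookup table. The taxonomy below is a hypothetical toy (only "rose" and "flower" come from the example above), and `is_hyponym` is an illustrative name; lexical databases such as WordNet store these relations at scale.

```python
# Toy taxonomy (hypothetical): each word maps to its more general term.
HYPERNYM = {
    "rose": "flower",
    "tulip": "flower",
    "flower": "plant",
}

def is_hyponym(specific, general):
    """Return True if `specific` is, directly or indirectly, a kind of `general`."""
    current = specific
    while current in HYPERNYM:
        current = HYPERNYM[current]
        if current == general:
            return True
    return False

print(is_hyponym("rose", "flower"))
print(is_hyponym("rose", "plant"))
print(is_hyponym("flower", "rose"))
```

The relation is transitive but not symmetric: a rose is a kind of plant, yet a flower is not a kind of rose.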
