NLP Q&A
1. What is a Chatbot?
A chatbot is a computer program that is designed to simulate human conversation through voice
commands, text chats, or both. E.g. Mitsuku Bot, Jabberwacky, etc.
OR
A chatbot is a computer program that can learn over time how to best interact with humans. It can
answer questions and troubleshoot customer problems, evaluate and qualify prospects, generate sales
leads, and increase sales on an e-commerce site.
OR
A chatbot is a computer program designed to simulate conversation with human users. A chatbot is
also known as an artificial conversational entity (ACE), chat robot, talk bot, chatterbot or chatterbox.
OR
A chatbot is a software application used to conduct an online chat conversation via text or
text-to-speech, in lieu of providing direct contact with a live human agent.
1. What are the types of data used for Natural Language Processing applications?
Natural Language Processing takes in natural-language data in the form of the written and spoken
words that humans use in their daily lives, and operates on this data.
6. Which words in a corpus have the highest values and which ones have the least?
Stop words like “and”, “this”, “is”, “the”, etc. occur most frequently and therefore have the highest
occurrence values in a corpus, but these words do not tell us anything about the corpus itself. Hence,
they are termed stop words and are mostly removed at the pre-processing stage.
Rare or valuable words occur the least but add the most meaning to the corpus. Hence, when we look
at the text, we take both frequent and rare words into consideration.
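For illustration, here is a small Python sketch (the example sentences are made up for this note) showing that when word frequencies are counted, stop words dominate the top of the list while topic words are rare:
from collections import Counter

corpus = ("Raj and Vijay are best friends and they are always together. "
          "Raj likes cricket and Vijay prefers football.")
tokens = corpus.lower().replace(".", "").split()
frequencies = Counter(tokens)

print(frequencies.most_common(3))      # the stop word "and" is the most frequent
print(frequencies.most_common()[-3:])  # topic words like "cricket" and "football" occur only once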
7. Does the vocabulary of a corpus remain the same before and after text normalization? Why?
No, the vocabulary of a corpus does not remain the same before and after text normalization. Reasons
are –
● During normalization the text is passed through various steps and reduced to a minimal
vocabulary, since the machine does not need grammatically correct statements, only their essence.
● During normalization, stop words, special characters and numbers are removed.
● During stemming, the affixes of words are removed and the words are converted to their base form.
So, after normalization, we get a reduced vocabulary.
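A minimal Python sketch of such a normalization pipeline (it assumes the NLTK library is installed and its "stopwords" data can be downloaded; the exact steps can vary):
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download("stopwords", quiet=True)

def normalize(text):
    text = text.lower()                                    # case normalization
    text = re.sub(r"[^a-z\s]", " ", text)                  # drop special characters and numbers
    tokens = text.split()                                  # simple whitespace tokenization
    stop_words = set(stopwords.words("english"))
    tokens = [t for t in tokens if t not in stop_words]    # remove stop words
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens]               # reduce each word to its base form

print(normalize("Raj and Vijay are 2 best friends!"))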
13. What are stop words? Explain with the help of examples.
“Stop words” are the most common words in a language like “the”, “a”, “on”, “is”, “all”. These
words do not carry important meaning and are usually removed from texts. It is possible to remove
stop words using Natural Language Toolkit (NLTK), a suite of libraries and programs for symbolic
and statistical natural language processing.
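A short Python sketch of stop-word removal with NLTK (assuming the library is installed and the "stopwords" corpus has been downloaded):
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
stop_words = set(stopwords.words("english"))

sentence = "the chatbot is a program that runs on a server"
filtered = [word for word in sentence.split() if word not in stop_words]
print(filtered)   # ['chatbot', 'program', 'runs', 'server']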
[Document vector table from a bag-of-words exercise: four documents over a 12-word vocabulary]
1 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 1 0 0 0
1 1 1 1 0 0 1 1 1 0 0 0
0 0 0 0 1 0 0 1 0 1 1 1
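These rows appear to be binary document vectors. As a rough Python sketch (the actual corpus behind the table is not given here, so the documents below are only placeholders; assumes scikit-learn is installed), such vectors can be produced like this:
from sklearn.feature_extraction.text import CountVectorizer

documents = [
    "Raj and Vijay are best friends",
    "Raj likes to play cricket",
    "Vijay prefers to play football",
    "Raj and Vijay often play together",
]
vectorizer = CountVectorizer(binary=True)   # 1 if the word occurs in the document, else 0
vectors = vectorizer.fit_transform(documents)

print(vectorizer.get_feature_names_out())   # the vocabulary, i.e. the column order
print(vectors.toarray())                    # one row of 0s and 1s per document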
2. Classify each of the images according to how well the model’s output matches the data
samples:
Here, the red dashed line is the model’s output while the blue crosses are the actual data samples.
● In the first case, the model’s output does not match the true function at all. Hence the model is said
to be underfitting and its accuracy is lower.
● In the second case, the model is trying to cover all the data samples even if they are out of alignment
with the true function. This model is said to be overfitting, and this too has a lower accuracy.
● In the third case, the model’s output matches the true function well, which shows that the model has
optimum accuracy; such a model is called a perfect fit.
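A small Python sketch (using numpy, assumed installed) that mirrors these three cases by fitting polynomials of different degrees to noisy data: a degree-1 fit underfits, a very high-degree fit chases the noise (overfits), while a moderate degree is close to the true shape:
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 20)
y = x**3 - x + rng.normal(scale=0.05, size=x.shape)   # noisy samples of a cubic "true function"

for degree in (1, 15, 3):
    coefficients = np.polyfit(x, y, degree)
    training_error = np.mean((np.polyval(coefficients, x) - y) ** 2)
    print(f"degree {degree}: training error = {training_error:.4f}")
# The degree-15 curve attains the lowest training error by following the noise (overfitting),
# while the degree-1 line misses the shape of the data entirely (underfitting).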
Tokenization:
Sentence: Raj and Vijay are best friends.
Tokens: Raj | and | Vijay | are | best | friends | .
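A quick Python sketch of the same tokenization using NLTK (assuming the library and its "punkt" tokenizer data are available):
import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
print(word_tokenize("Raj and Vijay are best friends."))
# Expected tokens: ['Raj', 'and', 'Vijay', 'are', 'best', 'friends', '.']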
Stemming (affix removal):
Likes   (-s) → Like
Prefers (-s) → Prefer
Wants   (-s) → Want
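The same affix removal can be sketched in Python with NLTK's PorterStemmer (assuming NLTK is installed):
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["likes", "prefers", "wants"]:
    print(word, "->", stemmer.stem(word))   # likes -> like, prefers -> prefer, wants -> want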
7. What are the different applications of NLP used in real-life scenarios?
Answer – Some of the applications used in real-life scenarios are –
• Automatic Summarization – Automatic summarization is useful for gathering data from social
media and other online sources, as well as for summarizing the meaning of documents and other
written materials. It is particularly relevant when used to summarize a news story or blog post while
eliminating redundancy across sources and increasing the diversity of the content gathered.
• Sentiment Analysis – The aim of sentiment analysis is to detect sentiment in posts where emotion is
not expressed directly, or where several sentiments appear in the same post. Businesses employ
natural language processing tools like sentiment analysis to better understand what internet users are
saying about their goods and services.
• Text Classification – Text classification enables you to classify a document and organize it to
make it easier to find the information you need or to carry out certain tasks. Spam screening in email
is one example of how text categorization is used.
• Virtual Assistants – These days, digital assistants like Google Assistant, Cortana, Siri, and Alexa
play a significant role in our lives. Not only can we communicate with them, but they can also make
our lives easier. By having access to our data, they can assist us in making notes about our tasks,
making calls for us, sending messages, and much more.
8. Explain the types of Chatbots.
Answer – There are two types of Chatbots –
• Script Bot – An Internet bot, sometimes known as a web robot, robot, or simply bot, is a software
program that performs automated tasks (scripts) over the Internet, typically with the aim of
simulating large-scale human online activity such as conversing.
• Smart Bot – A smart bot is an artificial intelligence (AI) system that can learn from its surroundings
and past experiences and develop new skills based on that knowledge. Smart bots that are intelligent
enough can operate alongside people and learn from their actions.
17. What does the relationship between a word’s value and its frequency in a corpus look like in the
given graph?
Answer – The graph demonstrates the inverse relationship between word frequency and word value.
The most frequent terms, such as stop words, are of little significance. As the frequency of words
decreases, their value increases; such words are referred to as precious or rare words. These are the
least frequently occurring but most valuable terms in the corpus.
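This inverse relationship is often made concrete with a measure such as TF-IDF (not named explicitly above, so this is an assumption): words that occur in every document get a low score, while rare, document-specific words get a high score. A minimal Python sketch with scikit-learn (assumed installed), using made-up sentences:
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "the match was a great match",
    "the weather is pleasant today",
    "the team played a great game of cricket",
]
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(documents).toarray()
vocab = vectorizer.vocabulary_

# Compare scores inside the third document: "the" occurs in every document,
# so its score is low; "cricket" occurs only here, so its score is higher.
print("the:", round(matrix[2, vocab["the"]], 3))
print("cricket:", round(matrix[2, vocab["cricket"]], 3))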
19. Explain the difference between lemmatization and stemming. Give an example to support your
explanation.
Answer – Stemming is the process of stripping words of their affixes and reducing them to a base
form; the resulting stem may not itself be a meaningful word.
In lemmatization, after the affix is removed we are left with a meaningful word known as a lemma.
Lemmatization takes more time than stemming because it ensures that the lemma is a proper word
with meaning.
The following example illustrates the distinction between stemming and lemmatization:
Caring >> Lemmatization >> Care
Caring >> Stemming >> Car
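A small Python sketch comparing the two with NLTK (assumes NLTK is installed and the "wordnet" data has been downloaded; note that the exact stem can differ between stemmers):
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["caring", "studies"]:
    print(word,
          "| stem:", stemmer.stem(word),
          "| lemma:", lemmatizer.lemmatize(word, pos="v"))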