
Expert Systems With Applications 165 (2021) 114130

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

Extending latent semantic analysis to manage its syntactic blindness


Raja Muhammad Suleman *, Ioannis Korkontzelos
Department of Computer Science, Edge Hill University, Ormskirk, Lancashire L39 4QP, United Kingdom

ARTICLE INFO

Keywords: Natural Language Processing; Natural Language Understanding; Latent Semantic Analysis; Semantic Similarity

ABSTRACT

Natural Language Processing (NLP) is the sub-field of Artificial Intelligence that represents and analyses human language automatically. NLP has been employed in many applications, such as information retrieval, information processing and automated answer ranking. Semantic analysis focuses on understanding the meaning of text. Among other proposed approaches, Latent Semantic Analysis (LSA) is a widely used corpus-based approach that evaluates the similarity of texts based on the semantic relations among words. LSA has been applied successfully in diverse language systems for calculating the semantic similarity of texts. However, LSA ignores the structure of sentences, i.e., it suffers from a syntactic blindness problem: it fails to distinguish between sentences that contain semantically similar words but have opposite meanings. Disregarding sentence structure, LSA also cannot differentiate between a sentence and a list of keywords; if the list and the sentence contain similar words, comparing them with LSA leads to a high similarity score. In this paper, we propose xLSA, an extension of LSA that focuses on the syntactic structure of sentences to overcome the syntactic blindness problem of the original LSA approach. xLSA was tested on sentence pairs that contain similar words but have significantly different meanings. Our results show that xLSA alleviates the syntactic blindness problem, providing more realistic semantic similarity scores.

1. Introduction

Natural Language Processing (NLP) is the sub-field of Artificial Intelligence that focusses on understanding and generating natural language by machines (Khurana et al., 2017). Formally, NLP is defined as "a theoretically motivated range of computational techniques for studying and representing naturally occurring texts (of any mode or type) at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications" (Liddy, 2001). NLP is an interdisciplinary field lying at the intersection of computing science, computational linguistics, artificial intelligence and cognitive science. It is concerned with the research and development of novel applications for Human Computer Interaction (HCI), with human languages as a medium of communication. NLP applications include human language understanding, lexical analysis, machine translation, text summarization, speech recognition, sentiment analysis, expert systems, question answering and reasoning, intelligent tutoring systems and conversational interfaces.

Calculating the similarity between text snippets is an important task for many NLP applications. Similarity scoring schemes range from basic string-based metrics to more complex techniques that employ semantic analysis. Simple string-based metrics only apply in cases of exact word matching: they do not consider inflection, synonyms or sentence structure. To capture these text variations, more sophisticated text processing techniques, able to calculate text similarity on the basis of semantics, are needed. Latent Semantic Analysis (LSA) is one such technique, allowing the computation of the "semantic" overlap between text snippets. Introduced as an information retrieval technique for query matching, LSA performed as well as humans on simple tasks (Deerwester et al., 1990). LSA's abilities to handle complex tasks, such as modelling human conceptual knowledge, cognitive phenomena and morphology induction, have been assessed on a variety of tasks, consistently achieving promising results (Landauer et al., 1998, 2007; Landauer & Dumais, 2008; Schone & Jurafsky, 2000). As its underlying principle, LSA considers the meaning of text to be in direct relationship with the occurrence of distinct words. Intuitively, LSA assumes that words with similar meaning will occur in similar contexts. It has been used successfully in a diverse range of NLP applications (Landauer, 2002; Vrana et al., 2018; Wegba et al., 2018; Jirasatjanukul et al., 2019). For example, it has been extensively used as an approximation to human semantic knowledge and verbal intelligence in the context of Intelligent Tutoring Systems (ITS). LSA-based ITSs, such as AutoTutor and Write To Learn (Lenhard, 2008), allow learners to interact with the system using a natural language interface.

* Corresponding author.
E-mail addresses: sulemanr@edgehill.ac.uk (R.M. Suleman), yannis.korkontzelos@edgehill.ac.uk (I. Korkontzelos).

https://doi.org/10.1016/j.eswa.2020.114130
Received 30 January 2020; Received in revised form 26 July 2020; Accepted 14 October 2020
Available online 24 October 2020
0957-4174/© 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Even though LSA provides promising results in a multitude of applications, its major shortcomings come from the fact that it completely ignores syntactic information during similarity computations. LSA suffers from the following inherent problems:

1) LSA is based on the semantic relations between words and ignores the syntactic composition of sentences. Consequently, it may treat sentences with very different or even opposite meanings as semantically similar (Cutrone & Chang, 2011).
2) LSA does not consider the positions of the subject and object of a verb as distinct while comparing sentences. For example, LSA considers the sentences "The boy stepped on a spider" and "The spider stepped on a boy" as semantically identical, although their meanings are opposite to each other.
3) LSA considers a list of words as a complete sentence, despite the lack of proper structure (Islam & Hoque, 2010; Braun et al., 2017). For example, "boy spider stepped" is considered equivalent to the sentences in (2), and LSA considers them all semantically identical.
4) LSA does not consider negation. Consequently, it cannot differentiate between two semantically similar sentences of which one contains a negation, for example "Christopher Columbus discovered America" and "Christopher Columbus did not discover America". Negation inverts the sentence's meaning; however, LSA assigns a similarity score of more than 90% to this pair of sentences.

In this paper, we explore ways to enrich LSA with syntactic information to enhance its accuracy when comparing short text snippets. We employ Parts-of-Speech (PoS) tags and Sentence Dependency Structure (SDS) to enrich the input text with syntactic information. Current trends in NLP research focus on Deep Learning, where neural network-based architectures are employed to model complex human behaviours in natural language. These methods have achieved top performance levels for many semantic understanding tasks, arguably due to their ability to capture syntactic representations of text (Gulordava et al., 2018; Kuncoro et al., 2018; Linzen et al., 2016; Hewitt & Manning, 2019). Lately, a variety of models that produce embeddings capturing the linguistic context of words have been proposed, ranging from Word2vec (Mikolov, 2013) to state-of-the-art transformer-based architectures, such as BERT (Devlin et al., 2018) and XLNet (Yang et al., 2019). We evaluate our method against some of the current neural network-based methods, namely the Universal Sentence Encoder (USE) (Cer, 2018), Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) and XLNet (Yang et al., 2019). The results show that xLSA performs consistently better than these techniques on short text snippets.

The rest of the paper is organized as follows: Section 2 provides an overview of text similarity approaches and describes research work on enriching the LSA model with syntactic information. Section 3 introduces xLSA and provides details about the proposed extension to LSA. Sections 4 and 5 describe the experimental settings and the experiments, Section 6 summarizes the results of the comparative analysis, and Section 7 concludes the findings and proposes directions for future work.

2. Background and related work

NLP methods aim at allowing computers to understand and manipulate language like humans do. NLP applications have been successful in opening new dimensions of Human-Computer Interaction (HCI).

2.1. Natural language understanding

Natural Language Understanding (NLU) deals with tasks that extract structured semantic information from unstructured text or speech (Braun et al., 2017). NLU breaks down natural language into a structured ontology, allowing computers to understand it and identify artefacts such as intents, semantics and sentiments. In recent years, NLU has been an active area of research, due to its applications to HCI, especially with the recent popularity of semantic search and conversational interfaces, also known as chatbots (Pereira & Díaz, 2019).

2.2. Text similarity approaches

String similarity can be measured based on lexical or semantic analysis. Strings are lexically similar if they consist of the same sequence of characters. They are semantically similar if they have the same meaning or are used in similar contexts. String-based similarity is evaluated on character composition and word sequence, whereas corpus-based similarity is evaluated on the basis of a large corpus. Knowledge-based similarity is determined on the basis of information in a semantic network (Gomaa & Fahmy, 2013).

2.3. String-based similarity

String-based similarity measures can be split into two major categories: character-based similarity measures and term-based similarity measures. Longest Common SubString (LCS) and Damerau-Levenshtein are among the most popular character-based similarity techniques. Character-based techniques are of limited applicability, because they can only capture exact matches. Cosine and Jaccard similarity are two commonly used term-based similarity techniques.
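As an illustration of the term-based measures just mentioned, the following is a minimal sketch, not taken from the paper, of Jaccard and cosine similarity computed over naively tokenised word sets and counts; the tokenisation and the example pair are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of word types in two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def cosine(a: str, b: str) -> float:
    """Cosine similarity over raw term-frequency vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Both measures return 1.0 for a word-reordered pair, previewing the
# structural blindness that also affects LSA (Section 1).
print(jaccard("the boy stepped on a spider", "the spider stepped on a boy"))
print(cosine("the boy stepped on a spider", "the spider stepped on a boy"))
```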
2.4. Corpus-based similarity

Corpus-based similarity approaches compute the semantic similarity between two strings or words based on information gathered from a corpus, i.e., a large collection of text. Hyperspace Analogue to Language (HAL), Pointwise Mutual Information – Information Retrieval (PMI-IR) and Latent Semantic Analysis are some of the most popular corpus-based similarity approaches.

2.5. Knowledge-based similarity

Knowledge-based similarity approaches evaluate word similarity based on information retrieved from semantic networks. WordNet is the most popular semantic network for computing knowledge-based similarity. Knowledge-based similarity measures can be divided into two categories: semantic similarity measures and semantic relatedness measures. Two words are semantically similar if they have the same meaning or are synonymous. Two words are semantically related if they are used in proximity. Mohler and Mihalcea (2009) provided a comparison between corpus-based and knowledge-based similarity measures. According to their findings, the corpus-based approaches can be improved by accounting for corpus size and domain.

2.6. Latent semantic analysis (LSA)

LSA considers the meaning of a document or a passage as directly associated with the occurrence of particular words in it. Kuechler (2007) provided a detailed overview of a number of information systems and business applications of textual data analysis that use LSA. It has been extensively used in reviewing the literature quantitatively, in computer-mediated textual data analysis, in customer feedback and interview analysis, and in knowledge repositories management (Evangelopoulos et al., 2012). LSA assumes that words that have similar meaning are likely to occur in related pieces of text. LSA starts by populating a matrix of word counts per sentence or paragraph: each column represents a sentence or paragraph and each row represents a unique word. Singular Value Decomposition (SVD), a well-known dimensionality reduction method, is used to reduce the number of columns, preserving the similarity structure among rows. Words are matched by calculating the cosine similarity between two vectors, which ranges between zero and one (Landauer & Dumais, 1997).


LSA-based automated text grading systems have been shown to outperform, or at least perform comparably with, human graders in multiple experiments (Toutanova et al., 2003).

A range of approaches have been applied to enhance LSA with morphological and syntactic knowledge. The Tagged LSA (TLSA) (Wiemer-Hastings et al., 2001) added syntactic information to LSA. It considered a word together with its PoS tag as a single term, whereas the original LSA does not differentiate between different parts of speech of the same word. The Syntactically Enhanced LSA (SELSA) (Kanejiya et al., 2003) is similar to TLSA. This method populates a matrix where each row consists of a focus word and the PoS of the previous word, and each column corresponds to a document or sentence. The Parts-of-Speech Enhanced LSA (POSELSA) (Kakkonen et al., 2006) focused on enhancing LSA by adding PoS information. The technique used three Word by Context Matrices (WCM) for each word: the first entry was for the PoS tag of a focus word, the second entry was for the PoS tags of the focus word and its preceding word, and the third entry was for the PoS tags of the focus word and its succeeding word. Results showed that using parts of speech improved accuracy by 5% to 10% in comparison to the original LSA. However, the computational complexity of POSELSA was very high. The Polarity Inducing LSA (Yih et al., 2012) introduced the notion of polarity, allowing the system to handle two opposite relationships between words, i.e., synonyms and antonyms. As part of a modified LSA that was applied to automatic Arabic essay scoring, TF-POS was proposed: a transformed version of Term Frequency-Inverse Document Frequency (TF-IDF) that combines PoS tagging with TF to add syntactic information into the vectors of words (Mezher & Omar, 2016). The model was trained on 488 student answers and tested on 183 answers. The results showed enhancements in the modified LSA score, when compared to original LSA scores.

The methods for enhancing LSA described above add syntactic information to the data used to train LSA models. Some approaches have only used PoS tags, whereas others have combined PoS tags with SDS information to enrich their training data. Our research focuses on the task of calculating the semantic similarity of short sentences. To address it, we introduce a wrapper around LSA: we do not train our own model, but use an existing LSA model trained on the UMBC WebBase corpus (Han et al., 2013). We use syntactic information of the input tokens to generate corresponding candidates for LSA comparison. The results of the token-pair comparisons are then combined to generate an overall semantic similarity score for the input sentences.

Fig. 1. xLSA Execution Flow.


3. Extended Latent semantic analysis (xLSA)

3.1. xLSA overview

Given LSA's syntactic blindness problem, we propose an algorithmic extension to address it by combining Sentence Dependency Structure (SDS) and Parts-of-Speech (PoS) tags. The proposed algorithm has been developed for the English language and validated over a test set of sentences collected from various corpora (Bowman et al., 2015; Young et al., 2014). Fig. 1 shows the flow of the proposed system.

A sentence is a complete thought expressed in writing, consisting of a subject and a predicate. The subject is "a person, thing or place that is performing some action", whereas the predicate describes this action. The simplest form of the predicate is just a verb, e.g., in the sentence "I breathe.". A more complex predicate can include an object, i.e., "a noun or pronoun that can be affected by the action of a subject". Simple sentences in English follow the Subject-Verb-Object (SVO) rule, where the verb shows the relationship between the subject and the object. xLSA uses SDS and PoS tags to identify the Subject, Verb and Object in a sentence along with their asymmetric relationships. This information is used to calculate the similarity of two sentences by matching their SVO structure. xLSA works in two phases: (i) the pre-processing phase and (ii) the evaluation phase. The pre-processing phase tokenises the input sentences and assigns a PoS tag to each token. For each input sentence, it also computes its SDS, which is then used in the evaluation phase to determine the structural similarity among the input sentences.

Our method uses PoS tagging, SDS and Sentence Decomposition to enrich the sentences for comparison. For tokenisation and PoS tagging, the system uses the spaCy Tokenizer and the spaCy PoS tagger, respectively (Honnibal & Johnson, 2015).

PoS ambiguity refers to cases where the same form of a word may occur in text with different PoS. For example, in the sentence "the boy steps on the spider", "steps" is a verb describing the boy's action, whereas in the phrase "the steps are broken" the same word is a noun. PoS ambiguous words, such as "steps", can have a different PoS and meaning depending on their context. The spaCy PoS tagger can handle PoS ambiguity, and Fig. 2 shows an example.

Dependency grammar is a class of modern syntactic theories based on the dependency relation (Nivre, 2005). Dependency is a formalism that represents relations between words as directed links. Verbs are considered the structural centres of clauses, and other structural units link with verbs by directed links. The dependency structure provides information about the grammatical functions of words in relation to other words in the sentence. The English language has four types of sentences:

1) Simple sentences
2) Compound sentences
3) Complex sentences
4) Compound-complex sentences

Complex, compound and compound-complex sentences are broken down into simple sentences for SVO comparison (Adhya & Setua, 2016). The spaCy library is used to generate the Dependency Structure of the input sentences. The Dependency Structure provides the system with information about sentence structure, to ensure that the provided input has a proper sentence format. The results are used to check whether the input is a proper sentence or an arbitrary list of words.

During Decomposition, the sentences are split into subjects, verbs and objects. In the case of active-voice sentences, the spaCy library uses the "Nominal Subject" and "Direct Object" tags to specify the subject and object, respectively. To deal with passive-voice sentences, spaCy denotes subjects and objects as "Nominal Subject (Passive)" and "Object of Preposition", respectively. As mentioned earlier, the spaCy library is capable of resolving PoS ambiguity, ensuring that the root verb of a sentence is identified correctly. Nouns that are right descendants of the root verb are considered objects, and nouns that appear before the root verb, i.e., its left descendants, are considered subjects. At this stage, we compute Subject-Verb Agreement (SVA) for each sentence. SVA is an NLP task useful for Grammatical Error Detection (GED) (Leacock et al., 2010; Enguehard et al., 2017; Wang & Zhao, 2015). SVA entails that the subjects and verbs in a sentence must agree on their multiplicity, i.e., a singular verb takes a singular subject and a plural verb takes a plural subject. During sentence decomposition, we create a list of the subjects and verbs in the input sentences along with their relational dependencies. This information is used to assign an SVA flag to each sentence, specifying whether there is number agreement between its subject and verb.

In succession, we check the input sentences for negation. To denote negation, the spaCy Dependency Parser assigns a negation relationship ("neg") between the auxiliary (AUX) token and the particle (PART) token. We use this information to check whether both sentences are negated or only one of them is. In the latter case, we update the isNegated flag to highlight the negation disagreement.
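To make the decomposition step concrete, the snippet below is a minimal sketch of how the subjects, verbs, objects, SVA and negation information described above could be collected with spaCy. It is an illustrative reconstruction rather than the authors' code: the function name, flag names and the lemmatisation (standing in for the stemming used later in the paper) are our assumptions, while the dependency labels (nsubj, nsubjpass, dobj, pobj, neg, ROOT) are the standard spaCy names for the relations mentioned in the text.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # any English spaCy model with a parser

def decompose_svo(sentence: str) -> dict:
    """Sketch of xLSA decomposition: SVO lists plus SVA and negation flags."""
    doc = nlp(sentence)
    subjects, verbs, objects, negated = [], [], [], False
    for token in doc:
        if token.dep_ in ("nsubj", "nsubjpass"):      # active / passive subject
            subjects.append(token)
        elif token.dep_ in ("dobj", "pobj"):          # direct object / object of preposition
            objects.append(token)
        elif token.dep_ == "ROOT" and token.pos_ in ("VERB", "AUX"):
            verbs.append(token)                       # root verb of the clause
        elif token.dep_ == "neg":                     # negation particle, e.g. "not"
            negated = True
    # Coarse Subject-Verb Agreement flag based on morphological number.
    sva_ok = all(
        ("Plur" in s.morph.get("Number")) == ("Plur" in v.morph.get("Number"))
        for s in subjects for v in verbs
        if s.morph.get("Number") and v.morph.get("Number")
    )
    return {
        "subjects": [t.lemma_.lower() for t in subjects],  # base forms for matching
        "verbs": [t.lemma_.lower() for t in verbs],
        "objects": [t.lemma_.lower() for t in objects],
        "is_negated": negated,
        "sva_ok": sva_ok,
    }

print(decompose_svo("The spider is being stepped on by a boy"))
```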
3.2. Evaluation

SVO Comparison: After decomposition, the sentences are compared on the basis of their subjects, verbs and objects (Ab Aziz et al., 2009; Adhya & Setua, 2016; Wiemer-Hastings et al., 2001). Before the comparison, the lists of subjects, verbs and objects are stemmed. Stemming maps words to their base form and allows easy comparison between different inflections of a word (Cutrone & Chang, 2011). For example, the common base form of "processing" and "processed" is "process". Stemming is applied to simplify the process of matching terms. After stemming, xLSA compares the subject(s) of the first sentence with the subject(s) of the second sentence, the verb(s) of the first sentence with the verb(s) of the second sentence, and the object(s) of the first sentence with the object(s) of the second sentence. To compute the similarity of subjects, verbs and objects, they must exist in both sentences; if a component exists in only one sentence, for example when the first sentence has an object and the second does not, its similarity score is set to zero.

To compute similarity, we used the UMBC STS (Semantic Textual Similarity) Service API, available at swoogle.umbc.edu/SimService. UMBC STS uses a hybrid approach, combining distributional similarity and LSA to compute word similarity. The UMBC service was evaluated on different token pairs to determine the lower and upper acceptance thresholds, testing values ranging from 0.1 to 1.0 with a 0.1 increment between consecutive tests. Similarity scores of less than 0.4 (40%) were observed for tokens that were completely unrelated, with no semantic relevance, whereas similarity scores greater than 0.7 (70%) were consistently observed for tokens that were semantically or contextually similar. If the subject-to-subject and object-to-object similarity scores for the two sentences are less than the minimum threshold, then xLSA cross-compares the subjects and objects. If the cross-similarity scores for the subjects and objects, and the similarity score for the verbs, are greater than or equal to the upper threshold value, then xLSA sets the inverse flag to 1 for the pair of sentences.

After computing the inverse flag, the xLSA similarity of the complete sentences is calculated using the averaging method (Wiemer-Hastings et al., 2001). The xLSA similarity provides a measure of the semantic and syntactic similarity of the sentences. The score is averaged with respect to the number of subjects, objects and verbs in the sentences. For sentences that are found to be semantically similar based on the SVO comparison, the isNegated flag specifies whether one of the sentences in the pair negates the other.
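The comparison logic just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions: word_similarity stands in for the UMBC STS word-level call (here a trivial exact-match placeholder so the sketch is self-contained), the 0.4 and 0.7 thresholds are the ones reported above, the aggregation over multiple subjects, verbs and objects is one plausible reading of the averaging described, and the function and field names are ours, consuming the output of the decompose_svo sketch in Section 3.1.

```python
LOWER, UPPER = 0.4, 0.7  # acceptance thresholds reported for the UMBC STS service

def word_similarity(a: str, b: str) -> float:
    """Placeholder for an LSA-based word similarity (e.g. the UMBC STS API)."""
    return 1.0 if a == b else 0.0

def component_similarity(first: list, second: list) -> float:
    """Average, over the first list, of the best match found in the second list."""
    if not first or not second:
        return 0.0  # a component missing from either sentence scores zero
    return sum(max(word_similarity(a, b) for b in second) for a in first) / len(first)

def compare_svo(svo_1: dict, svo_2: dict) -> dict:
    sub = component_similarity(svo_1["subjects"], svo_2["subjects"])
    verb = component_similarity(svo_1["verbs"], svo_2["verbs"])
    obj = component_similarity(svo_1["objects"], svo_2["objects"])

    # Inverse detection: when like-for-like scores are low but the verbs match,
    # cross-compare subjects with objects; high cross scores flag an inverted
    # pair (the reported component scores stay like-for-like, as in Table 6).
    inverse = 0
    if sub < LOWER and obj < LOWER and verb >= UPPER:
        cross_1 = component_similarity(svo_1["subjects"], svo_2["objects"])
        cross_2 = component_similarity(svo_1["objects"], svo_2["subjects"])
        if min(cross_1, cross_2) >= UPPER:
            inverse = 1

    score = (sub + verb + obj) / 3  # averaged over the SVO components

    # Negation disagreement is only reported for pairs judged similar.
    negation = 0
    if score >= UPPER and svo_1.get("is_negated") != svo_2.get("is_negated"):
        negation = 1
    return {"xlsa_score": round(score, 2), "inverse": inverse, "negation": negation}

svo_cat = {"subjects": ["cat"], "verbs": ["climb"], "objects": ["tree"], "is_negated": False}
svo_tree = {"subjects": ["tree"], "verbs": ["climb"], "objects": ["cat"], "is_negated": False}
print(compare_svo(svo_cat, svo_tree))  # low SVO score, inverse flag set to 1
```

With the UMBC word-level scores plugged into word_similarity, the like-for-like subject and object values become graded (e.g. the 0.16 entries in Table 6) instead of the 0/1 values produced by the exact-match placeholder.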


Fig. 2. The spaCy POS tagger produces different SDSs for the PoS ambiguous word “steps”.

4. Experiments and results

4.1. Dataset

For our experiments, we used sentences from two publicly available datasets (Bowman et al., 2015; Young et al., 2014). The datasets contain pairs of English sentences that are semantically similar to each other. The SNLI corpus (Bowman et al., 2015) is a collection of 570k human-written English sentences. The Flickr corpus (Young et al., 2014) contains 30k English sentences. For both corpora, the selection criterion for sentence pairs was semantic relatedness.

We selected three categories of sentences. The first category contains semantically similar pairs of sentences. The second category contains pairs of sentences with similar words, but completely opposite (inverse) meaning. The third category contains pairs of semantically related sentences, where one of the sentences contains a negation. All three categories include sentences in active voice or passive voice, and sentences can contain multiple subjects, verbs and objects. Some sentences have no main verb and use only a helping verb. A few sentences also include gerunds, which can act as a noun or a verb depending upon the context of the sentence.

5. Experiments

5.1. Experiment I

The first experiment compared pairs of semantically similar sentences, where one sentence was in active voice and the second sentence was in passive voice. We considered pairs of sentences such as "the boy is stepping on a spider" and "the spider is being stepped on by a boy". Table 1 shows the PoS tags for each sentence.

Table 1. Experiment I: Words with corresponding PoS tags.
First sentence:  The/DT boy/NN is/VBZ stepping/VBG on/IN a/DT spider/NN
Second sentence: The/DT spider/NN is/VBZ being/VBG stepped/VBN on/IN by/IN a/DT boy/NN

After PoS tagging, xLSA decomposed the sentences into subjects, verbs and objects on the basis of their dependency structures. Table 2 shows the generated SVO structure. The subject, verb and object of the first sentence were compared with the subject, verb and object of the second sentence. This comparison was used to calculate the similarity scores between the two sentences on the basis of the SVO values. Table 3 shows the Subject Similarity Score (SubSim Score), Object Similarity Score (ObjSim Score) and Verb Similarity Score (VerbSim Score). The xLSA Similarity Score (xLSA Score) was calculated using an averaging formula. In this scenario, xLSA and LSA produced equal sentence similarity scores, since the sentences have the same meaning. The Inverse and Negation flags were set to zero, since the sentences have the same subjects and objects and neither of them is negated.

Table 2. Experiment I: SVO structure of sentences.
Sentence   | Subject | Verb     | Object
Sentence-1 | boy     | stepping | spider
Sentence-2 | boy     | stepped  | spider

Table 3. Experiment I: LSA vs xLSA.
SubSim Score | VerbSim Score | ObjSim Score | xLSA Score | LSA Score | Inverse | Negation
1            | 1             | 1            | 1          | 1         | 0       | 0

5.2. Experiment II

In the second experiment, we compared semantically related sentences with inverse meaning. For example, let us consider the sentences "The cat climbs on the tree" and "The tree climbs on the cat". Table 4 shows the PoS tags of their words.

Table 4. Experiment II: Words with corresponding PoS tags.
First sentence:  The/DT cat/NN climbs/VBZ on/IN the/DT tree/NN
Second sentence: The/DT tree/NN climbs/VBZ on/IN the/DT cat/NN

After PoS tagging, the dependency structures of the sentences were generated. Then, the sentences were decomposed into subjects, verbs and objects on the basis of their dependency structure, as shown in Table 5. The subject, verb and object similarity scores were computed by matching subject to subject, verb to verb and object to object. If the similarity score of the subjects and objects is less than 40% and the verb similarity score is greater than 70%, then a cross-comparison of the subject of the first sentence with the object of the second sentence, and of the object of the first sentence with the subject of the second sentence, is performed. If the cross-similarity score is greater than 70%, then the inverse flag is set to one; its default value is zero. Table 6 shows that LSA computed a similarity score of 100%, whereas xLSA assigned a similarity score of 44% to this sentence pair. In addition, xLSA detected that the two sentences were the inverse of each other and set the Inverse flag to one.

Table 5. Experiment II: SVO structure of sentences.
Sentence   | Subject | Verb   | Object
Sentence-1 | cat     | climbs | tree
Sentence-2 | tree    | climbs | cat

Table 6. Experiment II: LSA vs xLSA scores.
SubSim Score | VerbSim Score | ObjSim Score | xLSA Score | LSA Score | Inverse | Negation
0.16         | 1             | 0.16         | 0.44       | 1         | 1       | 0
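For readers who want to reproduce the decomposition shown in Table 5, a snippet along the lines below can be used. It assumes the small English spaCy model en_core_web_sm is installed; the exact dependency labels produced can vary slightly between spaCy versions and models.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model has been downloaded

for text in ("The cat climbs on the tree", "The tree climbs on the cat"):
    doc = nlp(text)
    triple = {
        "subject": [t.text for t in doc if t.dep_ in ("nsubj", "nsubjpass")],
        "verb": [t.text for t in doc if t.dep_ == "ROOT"],
        "object": [t.text for t in doc if t.dep_ in ("dobj", "pobj")],
    }
    # Expected (model-dependent): subject and object swap between the two
    # sentences while the verb stays the same, mirroring Table 5.
    print(triple)
```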


5.3. Experiment III

The third experiment dealt with semantically similar sentences, where each sentence negates the other. For example, we considered sentences such as "Alessandro Volta invented the battery" and "Battery was not invented by Alessandro Volta". Table 7 shows the words and PoS tags of the sentences. Then, the sentence dependency structures were used to determine the voice as well as the subjects, verbs and objects, as shown in Table 8.

Table 7. Experiment III: Words with corresponding PoS tags.
First sentence:  Alessandro/NNP Volta/NNP invented/VBD the/DT battery/NN
Second sentence: Battery/NN was/VBD not/RB invented/VBN by/IN Alessandro/NNP Volta/NNP

Table 8. Experiment III: SVO structure of sentences.
Sentence   | Subject          | Verb     | Object
Sentence-1 | alessandro volta | invented | battery
Sentence-2 | alessandro volta | invented | battery

The subjects, verbs and objects were stemmed to their base forms, which were then compared to compute semantic relatedness. The xLSA similarity score was evaluated based on the similarity between subjects, verbs and objects. If the xLSA similarity score is greater than 0.7, then xLSA checks for negation in both sentences. If only one of the sentences is negated, then the negation flag is set to one; otherwise it remains zero. Table 9 shows that LSA computed a similarity score of 83% for this pair of sentences, whereas xLSA computed a similarity score of 100%, since it was able to handle the change of voice between the sentences. The LSA score was lower than the xLSA score because LSA took the adverb "not" into account during the computation, yet it was unable to identify that the sentences were semantically related and that the second sentence was negated, which inverted its meaning. xLSA produced a similarity score of 100% because it only considered the subject, verb and object in each sentence, and it set the negation flag to one. This means that xLSA identified that the sentences are semantically similar but mutually contradicting, as one of them is negated.

Table 9. Experiment III: LSA vs xLSA scores.
SubSim Score | VerbSim Score | ObjSim Score | xLSA Score | LSA Score | Inverse | Negation
1            | 1             | 1            | 1          | 0.83      | 0       | 1

5.4. Experiment IV

In the fourth experiment we compared sentences with similar words, where one of the words appears with a different PoS in each sentence. For example, consider the sentences "john writes a report" and "john reports a murder". In the first sentence, "report" is a noun, whereas in the second sentence "reports" is a verb. Table 10 shows the words and PoS tags of the sentences. In the first sentence, "writes" was tagged as the verb and "report" was tagged as a noun, which qualifies it as an object. In the second sentence, "reports" was tagged as a noun and no verbs were found. For sentences without verbs, xLSA counts the number of nouns and, if it is greater than one, it matches the list of nouns (subjects) against a pre-defined array of verbs whose forms are also used as nouns. If a match is found, xLSA considers that noun as a verb and marks its position. In the second sentence, the word "reports" was marked as the verb. In succession, the subjects, verbs and objects in the sentences were identified, as shown in Table 11. The stemmed subjects, verbs and objects were then compared to compute the similarity scores shown in Table 12. The two sentences are quite different from each other on the semantic level; however, LSA assigned a similarity score of 72%. xLSA gave a similarity score of 52%, which is below the acceptable similarity threshold of 70% for our scheme.

Table 10. Experiment IV: Words with corresponding PoS tags.
First sentence:  John/NNP writes/VBZ a/DT report/NN
Second sentence: John/NNP reports/NNS a/RB murder/NN

Table 11. Experiment IV: SVO structure of sentences.
Sentence   | Subject | Verb    | Object
Sentence-1 | john    | writes  | report
Sentence-2 | john    | reports | murder

Table 12. Experiment IV: LSA vs xLSA scores.
SubSim Score | VerbSim Score | ObjSim Score | xLSA Score | LSA Score | Inverse | Negation
1            | 0.47          | 0.11         | 0.52       | 0.72      | 0       | 0

5.5. Experiment V

As mentioned in the beginning of this paper, current NLP research has mainly focussed on Deep Learning methods, which exploit neural networks to learn representations of text in order to solve NLP tasks. Many state-of-the-art methods have shown promising results on a variety of NLP tasks, such as Text Classification, Named Entity Recognition, Semantic Role Labelling, Grammatical Error Detection, Information Extraction, Intent Detection and Slot Filling, and Language Modelling. A survey on the applications of Deep Learning for NLP provides an insight into the depth and breadth of current NLP research (Otter et al., 2020). Since the focus of our study is LSA, i.e., a statistical approach, it makes sense to see how it compares to these more recent techniques. We evaluated our approach against some well-known, publicly available NLP models, Google's USE, BERT and XLNet, on the task of computing semantic similarity for short, simple English sentences. USE adopts a transformer-based architecture, able to handle context in text spans, which allows it to generate sentence-level embeddings. BERT is also based on a transformer architecture that uses an attention mechanism to learn contextual relations amongst tokens in text; it uses encoders and decoders to read text and generate predictions, respectively.


XLNet is a generalized autoregressive pre-training method that exploits directional dependencies of context words to predict the following words in text. It also makes use of a transformer architecture, in particular Transformer-XL (Dai et al., 2019), to learn long-term dependencies.

For this evaluation, we used the same sentence pairs that were used in the evaluation of xLSA against simple LSA. Google provides pre-trained models for USE along with an interactive Jupyter Notebook on its cloud-based environment, Google Colab [2]. The notebook code was not modified, apart from adding a command to display the similarity results as real numbers rather than plotting them on a heatmap. Similarly, spaCy provides a Jupyter notebook for the Colab environment that allows the use of spaCy's implementation of the BERT and XLNet models [3]. Again, the models were used out-of-the-box, i.e., no code changes or parameter tuning was performed for any of the models. Table 13 shows the results of a single sentence-pair comparison, while the complete evaluation is presented in the next section. xLSA assigns the lowest and most accurate similarity score to this sentence pair.

[2] colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/semantic_similarity_with_tf_hub_universal_encoder.ipynb
[3] colab.research.google.com/github/explosion/spacy-pytorch-transformers/blob/master/examples/Spacy_Transformers_Demo.ipynb

Table 13. Experiment V: xLSA vs USE, BERT, XLNet (Sentence-1: "John writes a report"; Sentence-2: "John reports a murder").
USE Score | BERT Score | XLNet Score | xLSA Score
0.63      | 0.78       | 0.78        | 0.52
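For reference, the USE comparison reported in Table 13 can be reproduced outside the Colab notebook with a few lines of TensorFlow Hub code. This is a sketch rather than the notebook's exact code; the model URL points to the public Universal Sentence Encoder module (version 4 shown, other versions exist), and the similarity is the inner product of the sentence embeddings, as in Google's example notebook.

```python
import numpy as np
import tensorflow_hub as hub

# Public Universal Sentence Encoder module.
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

sentences = ["John writes a report", "John reports a murder"]
vectors = embed(sentences).numpy()

# USE embeddings are approximately unit length, so the inner product acts
# as a cosine similarity between the two sentences.
similarity = float(np.inner(vectors[0], vectors[1]))
print(round(similarity, 2))
```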
6. Results

The experiments discussed in Section 5 highlight xLSA's capability to handle frequently occurring scenarios in text matching. Due to subtle ambiguities in natural language, the results of semantic similarity measures can be unpredictable: sentences may be assessed as highly similar due to the occurrence of common terms, but still have completely different meanings. For the evaluation, we tested xLSA against simple LSA, Google's USE, BERT and XLNet on a set of 100 sentence pairs. Table 14 provides the average similarity scores produced by each technique.

Table 14. Evaluation results.
Sentence-pair type | USE Score (avg.) | BERT Score (avg.) | XLNet Score (avg.) | LSA Score (avg.) | xLSA Score (avg.)
Similar            | 0.90             | 0.92              | 0.98               | 0.99             | 0.96
Inverse            | 0.88             | 0.87              | 0.97               | 1.0              | 0.38
Negated*           | **               | **                | **                 | **               | 1.0+
* Checked only when the sentences have a high similarity score. ** Negation not handled. + xLSA captured all negated sentences correctly.

6.1. xLSA vs LSA

Simple LSA gives a semantic similarity score of 100% to all sentence pairs that contain similar words, irrespective of the effect those words have on the meaning of a sentence. xLSA has been designed to calculate semantic similarity not only based on similar words, but also on the syntactic structure of the sentences and the positioning of words in them. This allows xLSA to distinguish between sentences that are semantically related on the surface level, i.e., based on the words that they contain, but convey completely different meanings.

LSA does not consider the impact of negation on the meaning of sentences, and therefore it fails to assess similarity correctly when one of the sentences is negated. Using xLSA, all sentence pairs in the test set that contained at least one negated sentence were identified successfully. This means that two sentences might have a high semantic relatedness score, since they have common words; however, if one of the sentences negates the other, then the semantic similarity between them is adjusted to reflect this negation. Table 15 shows some examples of inverse sentences in the test set, which were successfully flagged as inverse by our algorithm.

Table 15. Inverse sentences similarity scores.
Sentence pair                                                                               | LSA Score | xLSA Score | Inverse
the earth must revolve around the sun. / the sun must revolve around the earth.             | 1         | 0.55       | 1
koko was asked to choose a house or a tree. / a house or a tree were asked to choose koko.  | 1         | 0.34       | 1
money cannot buy happiness. / happiness cannot buy money.                                   | 1         | 0.36       | 1
the hard disk stores data. / the data stores hard disk.                                     | 1         | 0.42       | 1
the cat climbs on a tree. / the tree climbs on a cat.                                       | 1         | 0.44       | 1
the dog bit a child. / the child bit a dog.                                                 | 1         | 0.47       | 1
tom is writing a letter and a book. / letter and book are writing tom.                      | 1         | 0.33       | 1

LSA also does not consider the syntactic structure of sentences during comparison. This means that comparing a complete sentence with a list of words can yield a similarity score as high as 100%. This might be counter-intuitive for applications that require proper sentences to be matched, e.g., automated answer grading systems. To overcome this, xLSA not only tests the sentences on the semantic level, but also validates their syntactic structure to ensure that the input is not a list of keywords. xLSA identified all such instances successfully.

6.2. xLSA vs Deep Learning-based techniques

We evaluated xLSA against three Deep Learning-based (DL) models, USE, BERT and XLNet, on the same set of sentence pairs that was used to evaluate xLSA against simple LSA. The dataset contained short, simple sentences in English. xLSA, along with all the DL models, provided high similarity scores for sentences that were semantically similar. DL models use contextual word embeddings to analyse the meaning of text and compute similarity. Like simple LSA, these approaches overlook changes in the sentence structure. For example, the sentences "the cat climbed a tree" and "a tree climbed the cat" have completely opposite meanings; however, all of the DL models gave a greater than 85% similarity score to this pair, as well as to all other similar sentence pairs in the test set. In addition, these models do not capture negation, hence sentences such as "the cat climbed the tree" and "the cat did not climb the tree" receive a greater than 85% similarity score. On the other hand, since DL models are trained on huge amounts of textual data, they are able to generalize better and perform well for sentences with ambiguous structures.

7. Conclusion

Natural language carries huge complexity and uncertainty due to its ambiguous nature. This makes the automated analysis and extraction of useful information from a given text a very difficult task. Natural Language Understanding (NLU) has garnered a lot of research interest due to its use in achieving seamless virtual conversational interfaces. Understanding text forms the basis of many advanced NLP tasks, and requires systems to gracefully manage the ambiguities of natural language. Latent Semantic Analysis (LSA) is a corpus-based approach that computes the similarity of text within a corpus using algebraic techniques. LSA is used in document classification, semantic search engines, automated short-answer grading and many more tasks. LSA-based evaluation has been shown to correlate strongly with human grading results (Gutierrez et al., 2013). LSA considers the semantic relationship among words, but it overlooks the structure of a sentence, which may cause a logically wrong answer to be treated as correct. Syntax plays a key role in understanding the meaning of a sentence, and traditional LSA is blind to it.

To mitigate LSA's syntactic blindness problem, this paper provided an extension to LSA (xLSA), focussing on syntactic composition as well as the semantic relations in sentences. xLSA analyses sentences to identify their proper sentence structure using Sentence Dependency Structures (SDS) and the positioning of Parts-of-Speech (PoS) tags. If the sentences have a proper structure, then xLSA focuses on their dependency structures and decomposes each sentence into Subject, Verb and Object (SVO). The sentences are compared based on the similarity between the SVOs. xLSA can identify inverse sentences by cross-comparing the subjects and objects of the two sentences. xLSA also identifies negation in a pair of semantically related sentences, where one of the sentences negates the other.

In English, many words are PoS ambiguous, i.e., they can be used both as verbs and as nouns. Most PoS taggers cannot differentiate among these words in a sentence. xLSA addresses this problem during the dependency structure phase, by using a list of words that can be used both as nouns and verbs; our solution is limited to this list of PoS ambiguous words. We have tested xLSA with semantically similar sentences from two corpora against simple LSA and three Deep Learning models. xLSA's results are very promising, but are limited by the number and categories of sentences in the test set that was used for evaluation. We aim to address these limitations in the future, by increasing the types of sentences that can be handled by xLSA and running a more thorough evaluation.


CRediT authorship contribution statement

Raja Muhammad Suleman: Conceptualization, Methodology, Investigation, Software. Ioannis Korkontzelos: Supervision, Methodology, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research work is part of the TYPHON Project, which has received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No. 780251.

References

Ab Aziz, M. J., Dato'Ahmad, F., Ghani, A. A. A. & Mahmod, R. (2009, October). Automated marking system for short answer examination (AMS-SAE). In 2009 IEEE symposium on industrial electronics & applications (Vol. 1, pp. 47–51). IEEE.
Adhya, S. & Setua, S. K. (2016). Automated short answer grader using friendship graphs. In Computer science and information technology – proceedings of the sixth international conference on advances in computing and information technology (ACITY 2016) (Vol. 6, No. 9, pp. 13–22).
Bowman, S. R., Angeli, G., Potts, C. & Manning, C. D. (2015). A large annotated corpus for learning natural language inference. arXiv preprint arXiv:1508.05326.
Braun, D., Mendez, A. H., Matthes, F. & Langen, M. (2017, August). Evaluating natural language understanding services for conversational question answering systems. In Proceedings of the 18th annual SIGdial meeting on discourse and dialogue (pp. 174–185).
Cer, D., Yang, Y., Kong, S. Y., Hua, N., Limtiaco, N., John, R. S. & Sung, Y. H. (2018). Universal sentence encoder. arXiv preprint arXiv:1803.11175.
Cutrone, L. & Chang, M. (2011, July). Auto-assessor: Computerized assessment system for marking students' short-answers automatically. In 2011 IEEE international conference on technology for education (pp. 81–88). IEEE.
Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q. V. & Salakhutdinov, R. (2019). Transformer-XL: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.
Devlin, J., Chang, M. W., Lee, K. & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Enguehard, E., Goldberg, Y. & Linzen, T. (2017). Exploring the syntactic abilities of RNNs with multi-task learning. arXiv preprint arXiv:1706.03542.
Evangelopoulos, N., Zhang, X., & Prybutok, V. R. (2012). Latent semantic analysis: Five methodological recommendations. European Journal of Information Systems, 21(1), 70–86.
Gomaa, W. H., & Fahmy, A. A. (2013). A survey of text similarity approaches. International Journal of Computer Applications, 68(13), 13–18.
Gulordava, K., et al. (2018). Colorless green recurrent networks dream hierarchically. In Proceedings of the 2018 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies (pp. 1195–1205).
Gutierrez, F., Dou, D., Martini, A., Fickas, S. & Zong, H. (2013, December). Hybrid ontology-based information extraction for automated text grading. In 2013 12th international conference on machine learning and applications (Vol. 1, pp. 359–364). IEEE.
Han, L., Kashyap, A. L., Finin, T., Mayfield, J. & Weese, J. (2013, June). UMBC_EBIQUITY-CORE: Semantic textual similarity systems. In Second joint conference on lexical and computational semantics (*SEM), Volume 1: Proceedings of the main conference and the shared task: Semantic textual similarity (pp. 44–52).
Hewitt, J. & Manning, C. D. (2019, June). A structural probe for finding syntax in word representations. In Proceedings of the 2019 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies, Volume 1 (Long and Short Papers) (pp. 4129–4138).
Honnibal, M. & Johnson, M. (2015, September). An improved non-monotonic transition system for dependency parsing. In Proceedings of the 2015 conference on empirical methods in natural language processing (pp. 1373–1378).
Islam, M. M. & Hoque, A. L. (2010, December). Automated essay scoring using generalized latent semantic analysis. In 2010 13th international conference on computer and information technology (ICCIT) (pp. 358–363). IEEE.
Jirasatjanukul, K., Nilsook, P., & Wannapiroon, P. (2019). Intelligent human resource management using latent semantic analysis with the internet of things. International Journal of Computer Theory and Engineering, 11(2).
Kakkonen, T., Myller, N. & Sutinen, E. (2006). Applying part-of-speech enhanced LSA to automatic essay grading. arXiv preprint cs/0610118.
Kanejiya, D., Kumar, A. & Prasad, S. (2003). Automatic evaluation of students' answers using syntactically enhanced LSA. In Proceedings of the HLT-NAACL 03 workshop on building educational applications using natural language processing (pp. 53–60).
Khurana, D., Koli, A., Khatter, K. & Singh, S. (2017). Natural language processing: State of the art, current trends and challenges. arXiv preprint arXiv:1708.05148.
Kuechler, W. L. (2007). Business applications of unstructured text. Communications of the ACM, 50(10), 86–93.
Kuncoro, A., et al. (2018). LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better. In Proceedings of the 56th annual meeting of the Association for Computational Linguistics (pp. 1426–1436).
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211.
Landauer, T. K., McNamara, D. S., Dennis, S. & Kintsch, W. (Eds.). (2007). Handbook of latent semantic analysis. Lawrence Erlbaum Associates Publishers.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to latent semantic analysis. Discourse Processes, 25, 259–284.


Landauer, T. K., & Dumais, S. (2008). Latent semantic analysis. Scholarpedia, 3(11), 4356.
Landauer, T. K. (2002). Applications of latent semantic analysis. In Proceedings of the annual meeting of the Cognitive Science Society, 24.
Lenhard, W. (2008). Bridging the gap to natural language: A review on intelligent tutoring systems based on latent semantic analysis.
Liddy, E. D. (2001). Natural language processing.
Leacock, C., et al. (2010). Automated grammatical error detection for language learners. Synthesis Lectures on Human Language Technologies, 1–134.
Linzen, T., Dupoux, E., & Goldberg, Y. (2016). Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics, 521–535.
Mezher, R., & Omar, N. (2016). A hybrid method of syntactic feature and latent semantic analysis for automatic Arabic essay scoring. Journal of Applied Sciences, 16(5), 209.
Mikolov, T., Chen, K., Corrado, G. & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Mohler, M. & Mihalcea, R. (2009, March). Text-to-text semantic similarity for automatic short answer grading. In Proceedings of the 12th conference of the European chapter of the ACL (EACL 2009) (pp. 567–575).
Nivre, J. (2005). Dependency grammar and dependency parsing. MSI Report, 5133(1959), 1–32.
Otter, D. W., Medina, J. R., & Kalita, J. K. (2020). A survey of the usages of deep learning for natural language processing. IEEE Transactions on Neural Networks and Learning Systems.
Pereira, J., & Díaz, Ó. (2019). What matters for chatbots? Analyzing quality measures for Facebook Messenger's 100 most popular chatbots. In Towards integrated web, mobile, and IoT technology (pp. 67–82). Cham: Springer.
Schone, P. & Jurafsky, D. (2000). Knowledge-free induction of morphology using latent semantic analysis. In Proceedings of the 2nd workshop on learning language in logic and the 4th conference on computational natural language learning (Vol. 7, pp. 67–72). Association for Computational Linguistics.
Toutanova, K., Klein, D., Manning, C. D. & Singer, Y. (2003, May). Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology (Vol. 1, pp. 173–180). Association for Computational Linguistics.
Vrana, S. R., Vrana, D. T., Penner, L. A., Eggly, S., Slatcher, R. B., & Hagiwara, N. (2018). Latent semantic analysis: A new measure of patient-physician communication. Social Science & Medicine, 198, 22–26.
Wang, Y. & Zhao, H. (2015, October). A light rule-based approach to English subject-verb agreement errors on the third person singular forms. In Proceedings of the 29th Pacific Asia conference on language, information and computation: Posters (pp. 345–353).
Wegba, K., Lu, A., Li, Y. & Wang, W. (2018, March). Interactive storytelling for movie recommendation through latent semantic analysis. In 23rd international conference on intelligent user interfaces (pp. 521–533).
Wiemer-Hastings, P. & Zipitria, I. (2001). Rules for syntax, vectors for semantics. In Proceedings of the annual meeting of the Cognitive Science Society (Vol. 23, No. 23).
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R. & Le, Q. V. (2019). XLNet: Generalized autoregressive pretraining for language understanding. In Advances in neural information processing systems (pp. 5754–5764).
Yih, W. T., Zweig, G. & Platt, J. C. (2012, July). Polarity inducing latent semantic analysis. In Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning (pp. 1212–1222). Association for Computational Linguistics.
Young, P., Lai, A., Hodosh, M., & Hockenmaier, J. (2014). From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics, 2, 67–78.
