IJISRT23DEC1110 (1)

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/377895701
Performance Evaluation of Word Embedding Algorithms
Article in International Journal of Innovative Science and Research Technology · December 2023
DOI: 10.5281/zenodo.10443962
CITATIONS READS
0 369
2 authors, including:
Lokesh Lodha
Jaipur National University
14 PUBLICATIONS 20 CITATIONS
SEE PROFILE
All content following this page was uploaded by Lokesh Lodha on 02 February 2024.
The user has requested enhancement of the downloaded file.

Volume 8, Issue 12, December – 2023 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165
Performance Evaluation of Word Embedding

Algorithms
Shilpi Kulshretha Lokesh Lodha
Department of CSE Department of Electronics. & Comm.
Jagannath University, Jaipur, India Jaipur National University, Jaipur, India
Abstract:- This study intends to explore the field of word accuracy is sufficient, it can be applied to additional
embedding and thoroughly examine and contrast various classification tasks; if not, more complexity is required to
word embedding algorithms. Words retain their semantic achieve optimal outcomes. This comparative study may help
relationships and meaning when they are transformed researchers and practitioners select the best word embedding
into vectors using word embedding models. Numerous strategy for their particular applications.
methods have been put forth, each with unique benefits
and drawbacks. Making wise choices when using word The research study is structured in five sections where
embedding for NLP tasks requires an understanding of first section introduces the NLP tasks and word embedding
these methods and their relative efficacy. The study what about. Second section gives an overview of the basic
presents methodologies, potential uses of each technique ideas and precepts guiding word embedding. Section 3
and discussed advantages, disadvantages. The discusses the popular word embedding techniques in detail,
fundamental ideas and workings of well-known word outlining their strengths and weaknesses. Section 4 discusses
embedding methods, such as Word2Vec, GloVe, the evaluation framework and criteria used for comparing
FastText, contextual embedding ELMo, and BERT, are these techniques. Section 5 presents the experimental results
evaluated in this paper. The performance of these and comparative analysis. Finally these methods are
algorithms are evaluated for three datasets on the basis of compared according to a range of criteria, including
words similarity and word analogy and finally results are transferability, quality of embedding, computational
compared. efficiency, and adaptability to different languages or
domains.
Keywords:- Embedding, Word2Vec, Global Vectors for Word
Representation (GloVe), Embedding from Language II. OVERVIEW ON WORD EMBEDDING
Models (ELMo), BERT.
A. Vector Space Model (VSM)
I. INTRODUCTION The Vector Space Model forms the foundational
concept for word Embedding. It represents words as vectors
Word embedding is technique uses Vector Space Model in a multi-dimensional space, where each dimension
concept to transform words into vectors in natural language corresponds to a specific aspect or feature of the word. This
processing. This technique offers a dense representation of representation allows mathematical operations and
words in a continuous vector space and now a days it has computations on words, enabling algorithms to understand
become a key technique in machine translation and NLP relationships and similarities. Here we try to densely pack the
tasks. These techniques help machines comprehend and information of the text into a vector which formally takes
process human language more effectively by capturing some hundred or thousand dimensions [1]. It had the first use
contextual information and semantic relationships from case in the SMArt Information Retrieval System. The VSM
words and sentences. The effectiveness of these algorithms has many use cases some of which are:
depends on the selection of embedding technique for the  Relevancy Ranking
application. Several NLP tasks, including named entity  Information Retrieval
recognition, machine translation, sentiment analysis, and  Information Gathering
more, can be strongly impacted by the words.
In word embedding a fundamental principle dictates
Humans have always attempted to complete that words appearing in comparable contexts tend to manifest
complicated tasks at the speed of light, thanks to the proximity within the vector space representation. This
development of computers and computational capacity which signifies that their corresponding vectors exhibit similarity,
made this possible. Word embedding provides a continuous emphasizing the preservation of contextual meaning and
and distributed representation that captures semantic and semantic relationships during the embedding process. For
contextual information from sentences, in contrast to example, If the words "cat" and "dog" are frequently
traditional approaches that represented words as discrete observed in the dataset within the context of "owner," the
symbols or sparse representations. This is a supervised resulting word Embedding for "cat" and "dog" will
learning task that discards the classification accuracy by demonstrate closeness in the vector representation. This
using the categories and the data vector as input. If the proximity reflects their shared contextual relationship with
IJISRT23DEC1110 www.ijisrt.com 1555

ISSN No:-2456-2165
"owner" and highlights the ability of word embedding to
capture semantic associations based on the co-occurrence of
words.
B. Semantic Meaning and Context

A vector space which contains the semantic and
contextual information of word. These words that are
semantically similar or contextually related are positioned
closer to each other in this vector space. This characteristic
simplifies the NLP task of capturing nuances and semantics
essential for accurate interpretation. Semantic representation
is an essential component of NLP, and it enables both
humans and computers to better understand the meaning of
language. For example division by zero in mathematics is Fig. 1. POS Tagging for Words
erroneous. The mathematical rules of grammar do not allow
us to divide anything by zero. This error is in the context of F. Co-Reference Resolution:
mathematics, and it is an example of semantic errors [2]. Co-reference resolution is a difficult task because it
necessitates a thorough comprehension of the text's meaning,
C. Word Sense Disambiguation including the connections between various entities. In the
Word sense disambiguation is the task of identifying the sentence "Mary went to the store and bought some apples,"
correct meaning of a word in each context in the sentence. for instance. "She" refers to the subject "Mary" in the
Sometimes, we are unable to distinguish what the person sentence "She ate them on the way home." To identify this
wants to say and we do not get the meaning. For example we coreference, the system must understand that "Mary" is the
have a sentence "Let's eat Grandma". It is evident to us that subject of the first sentence and the antecedent of the
this sentence has two meanings. Grandmother is being eaten pronoun "she" in the second sentence.CR systems are
in the first meaning and we know that it is not something that typically trained on large corpora of text that have been
anyone would want. In second case Grandma join you at the manually annotated with coreference information. A CR
table to eat something. Because many words have multiple system can be used to resolve coreferences in fresh text once
meanings and because a word's meaning can change it has been trained. In the sentence "Mary went to the store
depending on the context in which it is used, word sense and bought some apples," for instance. "She" refers to the
discrimination (WSD) is a difficult task [3] because it subject "Mary" in the sentence "She ate them on the way
depends on previous sentences. WSD can be used in many home". A machine would resolve this coreference by
different ways and applications, such as rule-based, identifying all potential coreferences ("Mary" and "she") and
statistical, and machine learning approaches. Although WSD using a variety of factors, such as number agreement, gender
is a difficult and demanding task because it is being utilized agreement, semantic similarity, and recency, to determine
in many NLP applications. whether the two expressions actually refer to the same entity.
Coreference resolution is a challenging task, but it is essential
D. Named Entity Recognition for many NLP tasks [4].
Named entity recognition is a subtask of NLP that
extract and identify essential information from the text ant it III. LITERATURE REVIEW
a key function in NLP. It recognizes and categorize named
entities, such as individuals, groups, places, and goods, and is Saqqa and Samar et al [5] suggested approach Bengali
the initial stage in deciphering the meaning of text. We can language NLP researchers can quickly construct the
attain higher accuracy in other NLP tasks, like machine necessary word embedding vectors for word representation in
translation, text summarization, and question answering, if NLP. Jeffrey et al [6] performed a study on embedding and in
we can execute NER accurately and efficiently. To keep this study they find that word2vec is efficient embedding
things simple, we take an example: "The man with the technique which gives highest accuracy.
telescope is who I saw." Here, it's unclear that if I was the
one with the telescope or if someone else was holding the Several well-known word embeddings, including
telescope. word2vec, Glove, and FastText, were investigated by
Cagliero et al. [7] in a number of downstream tasks, such as
E. Part of Speech Tagging sentiment analysis and text inference. According to their
The part of speech tagging is the process of giving each findings, starting the embedding layer at random can be
word in a text corpus and it is known as "POS Tagging." In trained to produce results that are comparable to starting it
corpus linguistics, POS tagging is the process of marking a with pre-trained classic embeddings. Using cross-language
word in a text as corresponding to a particular part of speech datasets (English and Arabic), M. Fawzy et al. [8] examined
on the basis of definition and context. Noun, Pronoun, latent semantic analysis on word2vec and GloVe in the topic
Adjective, Verb, Adverb, Preposition, Conjunction, and segmentation (TS) task and they conducted a thorough
Interjection are the parts of speech. A text corpus contains analysis of the word2vec model and investigate its influence
assortment of textual data used for NLP model. on TS using various training strategies and they concluded
that when training algorithms are carefully selected based on

ISSN No:-2456-2165
the features of a language-specific dataset, word2vec with a window size of 4: [[1, 2, 4, 5], 3]. In this case center
performs well. P. Shah et al [9] performed a study and used word text is 3 and the final calculation is compared to the
datasets from multiple domains to investigate the effects of center word index, and the neural network's input is [1, 2, 3,
pre-trained word embedding. Their findings show that using 4]. For mapping, a loss function is used to aid in training and
pre-trained embedding as feature representations has a direct the neural network towards the acquisition of
substantial effect on RC's performance and make the system meaningful representations and a logarithmic function is a
easier. good choice.
IV. METHODOLOGY  GloVe: This word embedding technique uses a novel

method to capture word semantics and creates word
This study performs a comparative analysis on popular vectors by utilizing global word co-occurrence statistics,
word embedding techniques. This study includes the popular highlighting word analogy and similarity connections.
word embedding techniques like Word2Vec, GloVe, GloVe is well-known for its efficiency and scalability and
fastText, ELMo and BERT. The examination process and it has become more and more popular in NLP and
brief discussion of each algorithms is given below. provides rich, context-aware word representations. GloVe
support a wide range of applications, including sentiment
 Word2Vec: Word2vec is a group of related models that analysis and machine translation and many others. This is
are used to produce word embedding and these model unsupervised approach in the field of word embedding
used shallow neural networks that are trained to and NLP.
reconstruct linguistic contexts of words. This approach
uses continuously sliding Skip-gram or continuously  Word Commonality and Semantic Relations: GloVe word
sliding Bag-of-Words (CBOW) which are two well- embedding provides a nuanced understanding that is
known techniques for creating datasets. Integration of this beneficial for tasks like NER and word similarity
Word2Vec are used NLP and deep learning libraries is measurements. When it examines semantic relationships
proof of its widespread adoption. The Word2Vec converts and word commonality, it performs exceptionally very
words into dense vectors, facilitating the capture of well.
complex semantic relationships, and has been crucial in  Global Corpus Statistics: GloVe and Word2Vec
the advancement of numerous NLP applications. integrates global corpus statistics into its model
architecture. This integration enhances accuracy and
 C-BOW Model – The CBOW model works on the basis of model performance, particularly in limited size of word
surrounding words concepts. It takes word as input and corpus when training is performed. The ability of this
trying to predict the target word in the center of the method is to leverage these global statistics contributes to
window. Predicting the central word in each corpus is the a more robust word embedding model.
task of the CBOW. In the CBOW model, the distributed
representations of word are combined to predict the word  FastText: This competitive algorithm was created by
in the middle and to be more precise, given the words that Facebook Research in 2016 and it uses the concept of
come before and after the target word, the goal is to sliding window for creating a training dataset. One of the
predict and identify the word that is in the middle of this excelling libraries for deep learning was also created by
context and Skip gram model predicts the context. this group. This algorithm was developed mostly for text
classification, but it has found its use in word embedding
 Skip-gram Model- This method has many uses in NLP too. In the other methods, a word was being used as a
and has proven effective in capturing complex semantic fundamental quantity for text processing, but the FastText
relationships between words. This focuses on context method uses characters as a base for its purpose. Since a
word prediction given a target word. By increasing the bunch of characters can itself make a big dataset,
likelihood of adjacent words, it seeks to obtain a thorough therefore less training is needed for FastText to work.
grasp of the context around the given word. For example
given the word is "jumped" in the sentence, we ought to  Example Sentence: "The performance of the model was
be able to guess the other words, such as "the," "cat," exceptional"
"over," and "puddle" in the sentence. During the training  Sub word Segmentation: FastText segments each word
phase of the neural network, this is the essential for the into sub words, commonly using bi-grams and trigrams.
construction of vectors. Collection of texts is the first step For the given sentence, the sub word segmentation would
to create a list of distinct words and each of which is include sub words such as <Th, he, e , p, er, fo, or, rf, ,
given a unique index, in order to construct a dataset for rm, , nc, ce, of, ma, th, he, m, de, el, l, wa, mo, od, as, s,
training. In this scenario, each word is unique, ce, ep, pt, ti, ex, xc, io, na, al, on >.
corresponding to indices such as [1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12].  Sub word Vector Representation: Each sub word is
associated with a pre-trained vector representation,
The sliding window is used to make training simpler. obtained through an unsupervised learning process. These
The words inside the window serve as the input for a given vectors capture the semantic and morphological
window size, and the target word is located in the center of information of the sub words.
the sentence. For example, the dataset could look like this

ISSN No:-2456-2165
 Word Vector Computation: The word vector for a V. PERFORMANCE EVALUATION
particular word is computed as the sum of its sub word
vectors. For example, the vector representation of the There are numerous word embedding methods available
word "model" would be the sum of its subword vectors, and it is crucial to select a set of evaluation criteria that are
where each subword is associated with a pre-trained pertinent to the task or tasks for which word embedding will
vector. For example, <mo> + <od> + <de> + <el>. be used in order to compare various word embedding
techniques. Every algorithm or technique has its advantages
This indicates that the word vector for the word "model" and disadvantages. To select a benchmark dataset that
is the total of all of its bigrams. Every term in the lexicon is adequately represent the tasks for which the term
depicted as a single-hot encoded vector. The size of the "Embedding" will be used, is also crucial. Two parameters
vocabulary is represented by this binary vector, where all semantic similarity and syntactic similarity are frequently
other values are set to 0 and only the index corresponding to used evaluation criterion that gauges how well a embedding
the current word is set to 1. Text can also be generated with captures the meaning or relatedness of words. Both
FastText and it is an effective tool that can be applied to a requirements semantic and syntactic frequently call for
range of NLP tasks. This is easy to use and delivers cutting- figuring out similarities or distances within the embedding
edge outcomes for a variety of tasks and it generates a word space.
sequence as output in order to accomplish a word sequence.
The output sequence of the model is generated by forecasting Evaluation of Word Similarity: Calculate the degree of
the subsequent word in the sequence using the words. similarity between words embedding in word pairs and
contrast the results with scores of human annotations.
 BERT: This uses masked language model which is a pre- Pearson or Spearman correlation are two common measures
trained model to predict the masked words based on the of similarity. Extrinsic evaluation is just as important as
surrounding words' context, and random words in the intrinsic evaluation and it consists the term "Embedding" into
input sentences are masked. Now days it has seen a rise in tasks related to downstream natural language processing.
interest in pre-trained language models (PLMs) as a way Computational efficiency, scalability, and adaptability of
to improve natural language processing related tasks. word embedding across domains should be evaluated and
PLMs learn to represent the meaning of words and how different embedding techniques handle out-of-
phrases by considering their context from both sides. One vocabulary words, rare words, and multilingual contexts is
of the most popular PLMs is called BERT (Bidirectional important.
encoder representation from transformers), which has
demonstrated state-of-the-art performance on a range of TABLE 1. PRE-TRAINED MODELS USED FOR OUR
natural language processing tasks, such as text TESTS
summarization, sentiment analysis, and question Serial Name Training Approximate
answering. By taking into account both left and right No. Corpus size
context, this bidirectional approach differs from previous 1 Googlenews- Google News 1.5GB
models and enables this algorithms to capture a more Vectors- Corpus
thorough understanding of word relationships. Negative300.bin
2 glove.42B.300d Common 4.5GB
 GPT: Generative Pre-trained Transformer uses self- Crawl(Websites)
attention mechanisms while generating predictions, the 3 wiki-news-300d-1M- Wikipedia 2GB
attention mechanism enables the model to concentrate on subword.vec
distinct segments of the input sequence. GPT is a key
component of contemporary NLP and it is a transformer- The insights gained from these evaluations, can guide
based architecture that works on the pre-training and fine- researchers to select the most suitable embedding technique
tuning principles. The most recent model in the GPT for their specific NLP tasks. Choosing the appropriate
family uses three orders of magnitude more parameters. evaluation criteria and benchmark datasets plays an important
These types’ transformers have proven to be an extremely role in assessing and comparing various word embedding
effective tools for producing text or sentences like a techniques. Word Similarity test, Quality of classification,
human being. and word analogy test are chosen for analysis.
A fundamental component of GPT is a transformer TABLE 2. DATASETS USED TO CHECK EFFICIENCY
neural network which uses attention mechanisms. The ON WORD SIMILARITY
architecture of the model consists of feed-forward neural Name Entries Year
networks, positional encodings, and several layers of
attention. Positional encoding is added to the input WordSim-353 353 2002
embedding to provide information about the positions of WordSim-353-SIM 203 2009
tokens in the sequence, as transformers do not understand the WordSim-353-REL 253 2009
sequential order of input tokens by default.
Miller-Charles (MC-30) 30 1991
Dataset
Rubenstein & Goodenough 65 1965

ISSN No:-2456-2165
(RG-65) Categorization of words into different clusters in a
machine learning task because of semantic similarity. Three
SimVerb-999 999 2014
datasets were used for this purpose namely AP dataset and
SimLex-999 999 2014 BLESS dataset.
For performing word analogy test, the Google dataset TABLE 5. COMPARISON OF CONCEPT
and the MSR dataset are used. These dataset evaluates the CATEGORIZATION
ability of word embedding and to capture the semantic and Name Categories Word2vec GloVe fastText
syntactic relationships between words. The Google dataset
contains 19,544 number of questions which can be divided AP 21 65.7 61.4 59.0
into groups one is "morpho-syntactic" and other one is Dataset
"semantic". There are total 8,000 analogy issues in the other BLESS 56 74.0 82.0 73.0
MSR dataset. Dataset
BM 27 45.1 43.6 41.9
VI. RESULTS AND DISCUSSION Dataset
Table 3, 4, 5, and 6 show the results of the study and The final evaluation criteria we chose was the Outlier
these show that Word2Vec performed faster and more Detection criteria. We adopted two datasets for outlier
accurately than GloVe and FastText on every dataset. The detection: WordSim-500 and 8-8-8 datasets.
reason is that Word2Vec could accurately capture the
semantic relationships between words. It gives highest word Each of the 500 clusters in the WordSim-500 is
similarity 81.3 for RG-65 dataset and gives highest word represented by a set of eight words with five to seven
analogy 74.4. outliers. Eight clusters, each consisting of a set of eight
words with eight outliers, make up the 8-8-8 dataset. We
Word2Vec, GloVe, and FastText are the three word computed the Outlier Position Percentage (OPP) in addition
embedding methods we employed in our experiment. Words to accuracy. Between the two datasets, the results, displayed
are learned to be represented as vectors of real numbers using in Table V, were inconsistent. On the WordSim-500 dataset,
neural network-based models, which underpin all three word for instance, GloVe performed the best, but on the 8-8-8
embedding techniques. Following that, these vectors can be dataset, it had the lowest accuracy.
applied to a range of natural language processing tasks,
including machine translation, sentiment analysis, and text TABLE 6. COMPARISON OF OUTLIER DETECTION
classification. These are all highly well-liked methods in Name Word2vec GloVe fastText
NLP that are applied to some fascinating tasks such as:
machine translation, similarity detection, analogy detection, WS-500 14.02 15.09 10.68
named entity recognition (Accuracy)
WS-500 85.33 85.74 82.16
TABLE 3. COMPARISON OF WORD SIMILARITY
(OPP)
Name Word2vec GloVe fastText 8–8–8 56.25 50.0 57.81
(Accuracy)
WS-353 64.3 59.7 64.3 8–8–8 84.38 84.77 84.38
WS-353-REL 53.4 55.9 56.4 (OPP
WS-353-SIM 74 66.8 72.1
MC-30 74.7 74.2 76.3 This study does not cover advanced topics like machine
RG-65 81.3 75.1 77.3 translation because those require further training on our part.
SimVerb-999 24.5 17.2 21.9 In the machine translation, to understand the meaning of a
SimLex-999 37.2 32.4 35.2 text in one language and converting it into another language
is difficult task. It is necessary to take textual meaning of
TABLE 4. COMPARISON OF WORD ANALOGY both languages same and machine translation can perform
Name Word2vec GloVe fastText this up to some limitation. A robust machine translation
Google (Add) 70.7 68.4 40.5 systems can be created that are more accurate and effective
Google (Mul) 70.8 68.7 45.1 than ever before by utilizing the most recent developments.
Semantic (Add) 74.4 76.1 19.1
Semantic (Mul) 74.1 75.9 24.8 VII. CONCLUSION
Syntactic (Add) 67.6 61.9 58.3
Syntactic (Mul) 68.1 62.7 61.9 In this study, the performance of various word
MSR (Add) 56.2 50.3 48.6 embedding techniques was evaluated and compared. The
MSR (Add) 56.8 51.6 52.2 findings of this study gives insightful information about how
well various word embedding methods perform in tasks
involving natural language processing. According to the
result Word2Vec word embedding technique performs well
compared to the other techniques. This technique could be

ISSN No:-2456-2165
used to create machine translation or text classification [10]. Y. Singh, M. Saini and Savita, "Impact and
systems that are more precise and effective. This study also Performance Analysis of Various Activation Functions
discovered that using a larger dataset to train the models for Classification Problems," 2023 IEEE International
enhanced the performance of all three word embedding Conference on Contemporary Computing and
techniques. Pre-trained model glove42B has 4GB size. This Communications (InC4), Bangalore, India, 2023, pp. 1-
study used three word embedding strategies with some 7, doi: 10.1109/InC457730.2023.10263129.
limitations. This study is limited to the two NLP tasks and it [11]. Vispute, S., Saini, M.L. (2022). Automation in
would be intriguing to compare how well these methods Agriculture: A Systematic Survey of Research
perform with other word embedding strategies. Overall this Activities in Agriculture Decision Support Systems
study presents the good insights and can guide practitioners Using Machine Learning. In: Singh, P.K., Wierzchoń,
to select a good word embedding technique for their S.T., Chhabra, J.K., Tanwar, S. (eds) Futuristic Trends
applications. in Networks and Computing Technologies. Lecture
Notes in Electrical Engineering, vol 936. Springer,
REFERENCES Singapore. https://doi.org/10.1007/978-981-19-5037-
7_56
[1]. D. Suleiman and A. Awajan, "Comparative Study of [12]. M. L. Saini, V. K. Sharma and A. Kumar, "An Efficient
Word Embeddings Models and Their Usage in Arabic Single and Double Error Correcting Block Codes with
Language Applications," 2018 International Arab Low Redundancy for Digital Communications," 2023
Conference on Information Technology (ACIT), International Conference on Advancement in
Werdanye, Lebanon, 2018, pp. 1-7, doi: Computation & Computer Technologies (InCACCT),
10.1109/ACIT.2018.8672674. Gharuan, India, 2023, pp. 827-831, doi:
[2]. Wadud, Md. Anwar Hussen et al. “Word Embedding 10.1109/InCACCT57535.2023.10141727.
Methods for Word Representation in Deep Learning for [13]. Vispute, S.; Saini, M.L. Performance Analysis of Soil
Natural Language Processing.” Iraqi Journal of Health Classifiers Using Data Analytics Tools and
Science (2022): n. pag. Techniques for Best Model and Tool Selection. Int. J.
[3]. H. Sanjurjo-González, "Improving accuracy of an Online Biomed. Eng. 2022, 18, 169–189.
existing semantic word labelling tool using word [14]. Shailaja Pede and Madan Lal Saini, "A Brief
embeddings," 2021 16th Iberian Conference on Bibliometric Analysis and Visualisation of Scopus and
Information Systems and Technologies (CISTI), Chaves, WoS databases on Blockchain Technology in
Portugal, 2021, pp. 1-5, doi: Healthcare Domain", Library Philosophy and Practice
10.23919/CISTI52073.2021.9476323. (e-journal), vol. 5106, 2021.
[4]. M. N. Helaskar and S. S. Sonawane, "Text [15]. M. Gehlot and M. L. Saini, "Analysis of Different CNN
Classification Using Word Embeddings," 2019 5th Architectures for Tomato Leaf Disease Classification,"
International Conference On Computing, 2020 5th IEEE International Conference on Recent
Communication, Control And Automation (ICCUBEA), Advances and Innovations in Engineering (ICRAIE),
Pune, India, 2019, pp. 1-4, doi: Jaipur, India, 2020, pp. 1-6, doi:
10.1109/ICCUBEA47591.2019.9129565. 10.1109/ICRAIE51050.2020.9358279.
[5]. Al-Saqqa, Samar and Arafat A. Awajan. “The Use of [16]. Varun Sapra, Madan Lal Saini, “Deep learning network
Word2vec Model in Sentiment Analysis: A for identification of Ischemia using clinical data”,
Survey.” Proceedings of the 2019 International International Journal of Engineering and Advanced
Conference on Artificial Intelligence, Robotics and Technology, ISSN: 2249-8958, Volume-8 Issue-5, June
Control (2019): n. pag. 2019.
[6]. Pennington, Jeffrey et al. “GloVe: Global Vectors for [17]. Varun Sapra, Madan Lal Saini, “Computational
Word Representation.” Conference on Empirical Intelligence for Detection of Coronary Artery Disease
Methods in Natural Language Processing (2014). with Optimized Features”, International Journal of
[7]. L. Cagliero and M. L. Quatra, "Inferring Multilingual Innovative Technology and Exploring Engineering,
Domain-Specific Word Embeddings From Large ISSN: 2278-3075 Volume 8, Issue-6C, Pages 144-148,
Document Corpora," in IEEE Access, vol. 9, pp. April 2019.
137309-137321, 2021, doi: [18]. S. Kulshrestha and M. L. Saini, "Study for the
10.1109/ACCESS.2021.3118093. Prediction of E-Commerce Business Market Growth
[8]. M. Fawzy, M. W. Fakhr and M. A. Rizka, "Word using Machine Learning Algorithm," 2020 5th IEEE
Embeddings and Neural Network Architectures for International Conference on Recent Advances and
Arabic Sentiment Analysis," 2020 16th International Innovations in Engineering (ICRAIE), Jaipur, India,
Computer Engineering Conference (ICENCO), Cairo, 2020, pp. 1-6, doi:
Egypt, 2020, pp. 92-96, doi: 10.1109/ICRAIE51050.2020.9358275
10.1109/ICENCO49778.2020.9357377. [19]. Sapra Varun , Saini M.L and Verma Luxmi,
[9]. P. Shah, S. Shah and S. Joshi, "A Study of Various Identification of Coronary Artery Disease using
Word Embeddings in Deep Learning," 2022 3rd Artificial Neural Network and Case-Based Reasoning,
International Conference for Emerging Technology Recent Advances in Computer Science and
(INCET), Belgaum, India, 2022, pp. 1-5, doi: Communications 2021; 14(8) .
10.1109/INCET54531.2022.9824963.

ISSN No:-2456-2165
https://dx.doi.org/10.2174/26662558139992006132254
04
[20]. M. L. Saini, J. Panduro-Ramirez, J. Padilla-Caballero,
A. Saxena, M. Tiwari and K. Ravi, "A Study on the
Potential Role of Blockchain in Future Wireless Mobile
Networks," 2023 3rd International Conference on
Advance Computing and Innovative Technologies in
Engineering (ICACITE), Greater Noida, India, 2023,
pp. 547-550, doi:
10.1109/ICACITE57410.2023.10182375.
[21]. Kulshrestha, Shilpi and Lal Saini, Madan, The
Analytical Study of an E-Business Model for
Establishing the Concept of Customer Retention (A
Case Study of Myntra.com) (August 29, 2019).
Proceedings of International Conference on
Advancements in Computing & Management (ICACM)
2019, http://dx.doi.org/10.2139/ssrn.3444797
[22]. Kavita Lal, Madan Lal Saini; A study on deep fake
identification techniques using deep learning. AIP Conf.
Proc. 15 June 2023; 2782 (1): 020155.
https://doi.org/10.1063/5.0154828
View publication stats

IJISRT23DEC1110 (1)

Uploaded by

Copyright:

Available Formats

IJISRT23DEC1110 (1)

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

IJISRT23DEC1110 (1)

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Performance Evaluation of Word Embedding Algorithms

The user has requested enhancement of the downloaded file.

Performance Evaluation of Word Embedding

IJISRT23DEC1110 www.ijisrt.com 1555

B. Semantic Meaning and Context

IJISRT23DEC1110 www.ijisrt.com 1556

IV. METHODOLOGY  GloVe: This word embedding technique uses a novel

IJISRT23DEC1110 www.ijisrt.com 1557

IJISRT23DEC1110 www.ijisrt.com 1558

IJISRT23DEC1110 www.ijisrt.com 1559

IJISRT23DEC1110 www.ijisrt.com 1560

IJISRT23DEC1110 www.ijisrt.com 1561

View publication stats

You might also like