Omar Meriwani

University of Essex, School of Computer Science and Electronic Engineering, Graduate Student

MSc in big data and text analytics, highly interested in text mining and NLP
Phone: 00447399161128
Address: Colchester, United Kingdom

Thesis Chapters by Omar Meriwani

Mediaeval 2021 Emerging News: Detection of Emerging News from Live News Stream Based on Categorization of News Annotations

This paper describes the contribution of RS_OMERIWANI to the Mediaeval 2021 Emerging News task. Among the various definitions of emerging news, this work adopts the one under which emerging news is news that gains more attention from news sources, i.e. the same story is published with higher frequency. Relying on the categorization of the news annotations, the classification was carried out in two steps: unsupervised clustering generated the training data, and a supervised neural network model then classified each news item based on the categories mentioned in it. The final model achieved an accuracy of 74% and an F-score of 65% for detecting emerging news. It fulfilled the requirements of newsworthiness and completeness of reported events, as well as the relevance criteria in the task evaluation.
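
As a reminder of how the two reported metrics differ, the following minimal sketch computes accuracy and an F-score for a binary "emerging / not emerging" labelling. It assumes the F-score is the balanced F1 (the abstract does not say which variant was used), and the function name and the toy labels are hypothetical, not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    Assumption: F1 (harmonic mean of precision and recall) for the
    `positive` class; the paper may have used a different F-score variant.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1
```

Because accuracy counts correct predictions for both classes while F1 only looks at the positive class, the two values can differ noticeably when emerging news is the minority class.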

Enhancing Deep Neural Network Performance on Small Datasets by Using a Deep Autoencoder

An assignment in data science CSEE University of Essex, 2019

Deep neural networks (DNNs) have proven highly effective in many industrial solutions and in academic research. However, they face many limitations and challenges, such as insufficient data or noise effects, which lead to the common overfitting problem. Among the many solutions proposed to address these challenges, we propose the use of a deep autoencoder (DAE) to improve the accuracy of a DNN with and without the effect of noise. The experiment was carried out on three medical datasets for diagnosing diabetes, heart attack seizures and autism in toddlers. We built a DAE model and a DNN model using Keras, and evaluated the performance of the DNN with and without the DAE as an auxiliary task. We then added noise and evaluated the results again. The DAE enhanced the performance of one model both with and without noise, while it enhanced the performance of the other two models only when noise was applied.

Movie Reviews Sentiment Analysis Using Latent Semantic Analysis and Statistical Features

An assignment in text analytics module CSEE University of Essex, 2019

Movie review sentiment analysis is a special case of sentiment analysis with several issues that make it more challenging than other sentiment analysis or text classification problems. Reviewers do not necessarily use formal language, the length of reviews can vary widely, and there are syntactic challenges such as negation. In this project, we propose the use of different features and techniques to build a movie review classifier. The techniques include sentiment polarity; hand-crafted sentence polarity measures that rely on lists of negative and positive words; latent semantic analysis using word2vec; and term frequency. The techniques were tested separately and together. The best result was achieved with the statistical features, while the polarity values and the latent semantic analysis did not yield significant results. The achieved accuracy is 59.1% on the training data and 61.7% on the test data after upload to the Kaggle online competition.
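
The word-list polarity idea and the negation issue mentioned in this abstract can be illustrated with a minimal sketch. The word lists, the negator set and the one-token negation window below are all assumptions made for illustration; the lists and rules used in the project itself are not given here.

```python
import re

# Hypothetical, tiny word lists for illustration only.
POSITIVE = {"good", "great", "excellent", "enjoyable", "funny"}
NEGATIVE = {"bad", "boring", "awful", "dull", "terrible"}
NEGATORS = {"not", "no", "never"}


def polarity(review: str) -> int:
    """Count positive minus negative words in a review.

    Assumed negation rule: a sentiment word changes sign when the token
    immediately before it is a negator or ends in "n't".
    """
    tokens = re.findall(r"[a-z']+", review.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = (token in POSITIVE) - (token in NEGATIVE)  # +1, -1 or 0
        if value and i > 0:
            previous = tokens[i - 1]
            if previous in NEGATORS or previous.endswith("n't"):
                value = -value
        score += value
    return score
```

A score like this could serve as one numeric feature among others; a single-token window misses longer constructions such as "not very good", which is part of why negation is listed as a challenge.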

Distant Supervision of Named Entity Recognition

An assignment in text analytics CSEE University of Essex, 2019

Named entity recognition (NER) is a crucial task in information extraction and text analytics. It has been discussed in many studies, and multiple platforms now reach accuracy levels comparable to human recognition of named entities. However, the rapid contemporary growth of information, accompanied by an increase in the number and variety of named entities, has produced a parallel development in NER methods, which differ in their mode of supervision and in their domain. In this document, we discuss distantly supervised classification for NER and propose an NER framework that uses a pre-annotated dataset to select significant features for detecting named entities in English. In addition, we use multiple machine learning models, including naïve Bayes, support vector machines and feed-forward neural networks, and compare their performance.

Fake financial news detection framework

MSc Dissertation, 2019

Fake news is considered one of the main threats to the global information revolution. It tends to cause serious damage to the reputation of persons or organizations, and it also tends to cause many indirect effects on other aspects, depending on the type of fake news. Fake financial news is one of the serious types of fake news; it can cause direct and serious damage to the stock market. It is also more difficult to detect, because it resembles real news in form and because the meaning of numbers is hard to check when the news includes specific numerical facts. However, it shares common features with other fake news types. In this work, we present five models to detect fake financial news using sentiment analysis, news source checking, an objectivity check, checking against existing news and a fact-checking method. Datasets were created especially for this project, in addition to the data sources available online. The sentiment analysis model was built using a deep learning model and achieved 87% accuracy, while the objectivity check did not achieve significant results. The news source analysis was treated as a traditional term frequency problem, and the solution achieved 94% accuracy. Due to the lack of sufficient data sources, the fact-checking solution ended up producing a dataset that is ready for fact-checking against any relational dataset of periodic values, such as the stock market. Finally, the similarity check against existing news achieved 76% accuracy.
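
One simple way to picture a similarity check against existing news is cosine similarity over raw term-frequency vectors, sketched below. This is only an illustration under that assumption: the dissertation's actual similarity measure is not described in this abstract, and the function names and example headlines are hypothetical.

```python
import math
import re
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Lower-case the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(claim: str, known_news: list[str]) -> tuple[float, str]:
    """Return (similarity, headline) for the known item closest to the claim.

    `known_news` must be non-empty.
    """
    claim_tf = term_frequencies(claim)
    return max(
        (cosine_similarity(claim_tf, term_frequencies(item)), item)
        for item in known_news
    )
```

A news item whose best similarity to any trusted headline falls below some chosen threshold could then be flagged for further checking; the threshold itself would have to be tuned on labelled data.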
