Emeka is a resource person and lead trainer at many state and federal IT-related conferences and workshops. His research interests are in the areas of Big Data Analytics, Web/Text Mining, Machine Learning, and Sentiment Analysis.
The extraction of public opinions from online communication platforms can serve several purposes in corporate institutions, state politics, and governance. The analysis of these opinions may be useful for both immediate business decision making and professional planning. This analysis is becoming relevant in managing social movements and digital activism by applying computational technology. There is a need to deploy this opinion mining technology to the largest recent digital activism campaign in Nigeria, the #EndSARS movement. In this work, we propose the EndSARS live analytics framework, which offers a promising response to social unrest and may help curb the vandalism that results from unresolved protest issues. Using a dataset of 12,357 tweets, we demonstrate that computational technology can be relevant to addressing online protests. The result of the analysis shows the eight basic emotions expressed during the protest and approaches the government m...
The dataset contains 826,412 raw tweets matching COVID-19 and Lockdown, posted between February 14, 2020, and August 14, 2020, from five selected African countries: Nigeria, South Africa, Algeria, Egypt, and Sudan. It was cleaned to 619,203 unique tweets and is relevant to researchers in data science, natural language processing, social science, informatics, tourism, and infodemiology.
Journal of Applied Sciences and Environmental Management, 2020
Patients share key information about their health with medical practitioners during clinic consultations. This key information may include their past medications and allergies, current situations or issues, and expectations. Healthcare professionals store this information in an Electronic Medical Record (EMR). EMRs have empowered research in healthcare; the information hidden in them, if harnessed properly through Natural Language Processing (NLP), can be used for disease registries, drug safety, epidemic surveillance, disease prediction, and treatment. This work illustrates the application of NLP techniques to design and implement a Key Information Retrieval System (KIRS framework) using the Latent Dirichlet Allocation algorithm. The cross-industry standard process for data mining methodology was applied in an experiment with an EMR dataset from PubMed to demonstrate the framework. The new system extracted the common problems (ailments) and prescriptions across the five (5) countries pr...
Journal of Computer Science and Its Application, 2021
The advent of the internet, with its attendant democratization of data and deluge of information, has given rise to an avalanche of news media agencies. These agencies publish news articles with varying emotional content, especially stories conveying bad sentiments to the public. As major news agencies operate micro-blogging websites and establish their presence on social media channels, the distribution of bad news increases. It has been shown that constant exposure to bad news presented in text, graphics, and video or audio contributes to an increase in high blood pressure, anxiety attacks, bowel disorders, stroke, and/or heart failure. In this work, we present a sentiment analysis framework to extract news articles from the front pages of online newspapers and generate contextual wordlists to support positive news broadcasting. Using a set of 12 Nigerian online news channels, we employed a hybrid method of dictionary and corpus-based lexicon approaches to achieve the wordlist deri...
Organizations may be related in terms of similar operational procedures, management, and supervisory agencies coordinating their operations. Supervisory agencies may be governmental or non-governmental but, in all cases, they perform oversight functions over the activities of the organizations under their control. Multiple organizations that are related through oversight by the same supervisory agencies may differ significantly in their geographical locations, aims, and objectives. To harmonize these differences so that comparative analysis is meaningful, data about the operations of multiple organizations under one control or management can be collected in a uniform format. In this format, data is easily harvested, and the ease with which it can be used for cross-population analysis, referred to as data comparability, is enhanced. The current practice, whereby organizations under one control maintain their data in independent databases, specific to an ent...
Transactions on Machine Learning and Artificial Intelligence, 2020
The social media space has evolved into a large labyrinth of information exchange platforms and, due to the growing adoption of different social media platforms, there has been an increasing wave of interest in sentiment analysis as a paradigm for mining and analysing users' opinions and sentiments based on their posts. In this paper, we present a review of contextual sentiment analysis on social media entries with a specific focus on Twitter. Sentiment analysis follows two broad approaches: machine learning, which uses classification techniques to classify text and is further categorized into supervised and unsupervised learning; and the lexicon-based approach, which uses a dictionary and, unlike the machine learning approach, requires no training or test data set.
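The lexicon-based approach described in the abstract above can be illustrated with a minimal sketch. It assumes a tiny hand-made dictionary (`TOY_LEXICON`) and a one-token negation rule, both hypothetical and chosen only for illustration; they are not taken from the reviewed paper or from any published lexicon.

```python
import re

# Hypothetical toy lexicon mapping words to polarity weights (illustration only).
TOY_LEXICON = {
    "good": 1, "great": 1, "love": 1, "happy": 1,
    "bad": -1, "terrible": -1, "hate": -1, "sad": -1,
}
# Hypothetical negation words; each flips the polarity of the very next token.
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the polarity weights of known words; no training data is involved."""
    score = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        weight = lexicon.get(token, 0)
        score += -weight if negate else weight
        negate = False  # negation only reaches the immediately following token
    return score


def classify(text, lexicon=TOY_LEXICON):
    """Map the summed score to a polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because the dictionary carries all the knowledge, swapping in a different lexicon changes the behaviour without any retraining, which is the contrast with the machine learning approach drawn in the abstract.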
The coronavirus disease of 2019 (COVID-19) is a pandemic that is ravaging Nigeria and the world at large. This data article provides a dataset of daily updates of COVID-19 as reported online by the Nigeria Centre for Disease Control (NCDC) from February 27, 2020, to September 29, 2020. The data were obtained through web scraping from different sources and include some economic variables such as the 2020 budget for each Nigerian state, population estimates, healthcare facilities, and the COVID-19 laboratories in Nigeria. The dataset has been processed according to the FAIR data principles, which encourage findability, accessibility, interoperability, and reusability, and will be relevant to researchers in different fields such as Data Science, Epidemiology, Earth Modelling, and Health Informatics.
The Internet has continued to span great geographical distances and a broad range of interests. It has provided ample space for social interaction and information exchange. It is hard to imagine a world without the internet. Like other fields of human endeavour, research, especially in the sciences, is no doubt being revolutionized by the internet. Regardless of viewpoint, research follows formal, methodical, and rigorous processes, specifically the application of scientific methods of problem recognition, definition, solution development, data collection, analysis, and conclusions. As expected, the introduction of the Internet heralded the upswing of a new soft form of learning, with the aim of achieving speedy and cost-effective diffusion of knowledge. The internet has also made it easy to aggregate such knowledge, which can be shared amongst geographically detached partners. So, whether it involves
This study explores the prevalent Machine and Deep Learning approaches for the control of COVID-19. It reveals the impact of Artificial Intelligence on the case prediction, analysis, diagnosis, and treatment of the disease. Apart from discussing four (4) knowledge areas where Machine Learning and Deep Learning approaches were employed in the fight against the pandemic, we propose a Generalized Artificial Intelligence Response Framework using those areas. We observed that most of the works seeking Artificial Intelligence solutions to the pandemic used chest X-ray images and chest computed tomography scans for prognosis and diagnosis while applying different Machine and Deep Learning approaches to available data dashboards. However, a production-ready landmark contribution towards the control of the disease through Artificial Intelligence is still a work in progress. Hence, the need for a response framework to give researchers and prac...
Text classification is a method of grouping document text into different predefined categories. This method has been applied in different areas such as classification of scientific articles, spam filtering, and classification of document genre. Text classification is a popular task in data mining because of its level of accuracy and easy application. The Internet is a common message transmission medium; billions of messages move around it daily through different platforms such as e-mail, Facebook, Twitter, etc. Some of these messages are transmitted with wrong motives, so it has become imperative to design a model that uses data mining algorithms to sieve unwanted messages out of circulation. In light of this, this paper applied three data mining techniques, namely Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbour (KNN), to develop models that can be applied to fil...
The Internet has continued to span great geographical distances and a broad range of interests. It has provided ample space for social interaction and information exchange. It is hard to imagine a world without the internet. Like other fields of human endeavour, research, especially in the sciences, is no doubt being revolutionized by the internet. Regardless of viewpoint, research follows formal, methodical, and rigorous processes, specifically the application of scientific methods of problem recognition, definition, solution development, data collection, analysis, and conclusions. As expected, the introduction of the Internet heralded the upswing of a new soft form of learning, with the aim of achieving speedy and cost-effective diffusion of knowledge. The internet has also made it easy to aggregate such knowledge, which can be shared amongst geographically detached partners. So, whether it involves fundamental/pure or basic distributed research, action, applied re...
There are publicly available general purpose sentiment lexicons in some high resource languages, but very few exist in low resource languages. This makes it difficult to perform sentiment analysis tasks directly in such languages. The objective of this work is to create a general purpose sentiment lexicon for the Igbo language that can determine the sentiment of documents written in Igbo without having to translate them into English. The material used was an automatically translated version of Liu’s lexicon, with manual addition of native Igbo words. The result of this work is a general purpose lexicon, IgboSentilex. Its performance was tested on the BBC Igbo news channel. It returned an average polarity agreement of 95% with other general purpose sentiment lexicons.
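One plausible reading of "polarity agreement" is the fraction of documents on which two lexicons assign the same polarity label. The sketch below computes that fraction; the function name and the label values are hypothetical, and the paper may define or average agreement differently.

```python
def polarity_agreement(labels_a, labels_b):
    """Return the share of documents given the same polarity label by two lexicons.

    labels_a and labels_b are equal-length sequences of labels such as
    "positive", "negative", or "neutral" (label names are an assumption).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label sequences must cover the same documents")
    if not labels_a:
        raise ValueError("at least one document is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical labels for four documents from two different lexicons.
lexicon_one = ["positive", "negative", "neutral", "positive"]
lexicon_two = ["positive", "negative", "positive", "positive"]
agreement = polarity_agreement(lexicon_one, lexicon_two)
```

Averaging this value over several reference lexicons would give a single summary figure of the kind quoted in the abstract, under the assumption stated above.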
International Journal of Innovation and Scientific Research, 2015
The Internet has continued to span great geographical distances and a broad range of interests. It has provided ample space for social interaction and information exchange. It is hard to imagine a world without the internet. Like other fields of human endeavour, research, especially in the sciences, is no doubt being revolutionised by the internet. Regardless of viewpoint, research follows formal, methodical, and rigorous processes, specifically the application of scientific methods of problem recognition, definition, solution development, data collection, analysis, and conclusions. As expected, the introduction of the Internet heralded the upswing of a new soft form of learning, with the aim of achieving speedy and cost-effective diffusion of knowledge. The internet has also made it easy to aggregate such knowledge, which can be shared amongst geographically detached partners. So, whether it involves fundamental/pure or basic distributed research, action, applied re...
The social media space has evolved into a large labyrinth of information exchange platforms and, due to the growing adoption of different social media platforms, there has been an increasing wave of interest in sentiment analysis as a paradigm for mining and analysing users' opinions and sentiments based on their posts. In this paper, we present a review of contextual sentiment analysis on social media entries with a specific focus on Twitter. Sentiment analysis follows two broad approaches: machine learning, which uses classification techniques to classify text and is further categorized into supervised and unsupervised learning; and the lexicon-based approach, which uses a dictionary and, unlike the machine learning approach, requires no training or test data set. The paper explores generic application areas including product/services analysis and security/terrorism investigations.
Object detection is a key aspect of the digital forensics of visual evidence from video surveillance systems and forensic photographs. Recognizing instances of object classes over a wide range of image data using computational techniques has gained continuous attention over the years because of its numerous practical applications. Several algorithms and techniques have been specified for object detection and recognition, with Machine Learning gaining prominence and delivering remarkable performance in object detection and recognition systems. This study presents a comprehensive review of the frameworks and applications of Machine Learning in object detection and classification, with particular applications to Digital Forensics. The analysis covers a wide range of publications between 2007 and 2019 available in different indexed and non-indexed databases, and the candidate papers were selected using exclusion criteria proposed in Kitchenham’s methodology. To streamline future research, the study categorized digital forensic research into six knowledge areas and identified the convolutional neural network as a state-of-the-art algorithm for machine learning-based digital forensics.
Papers by Emeka Ogbuju