Panagiotis Karampelas holds a Ph.D. in Electronic Engineering from the University of Kent at Canterbury, UK, and a Master of Science degree from the Department of Informatics, Kapodistrian University of Athens, with a specialization in “High Performance Algorithms”. He also holds a Bachelor's degree in Mathematics from the same university, majoring in Applied Mathematics. His areas of interest include Human Computer Interaction, Information Visualization, Data Mining, Social Network Analysis, Counterterrorism Informatics, Power Management Systems, Artificial Neural Networks, and Power Transmission and Distribution Systems. He has published a number of books and research articles in his major areas of interest in international journals and conferences. He is currently with the Department of Informatics and Computers at the Hellenic Air Force Academy, teaching courses to pilots and engineers.
Internet-enabled devices, or the Internet of Things as the term has prevailed, are increasing exponentially every day. The lack of security standards in the manufacturing of these devices, along with the haste of manufacturers to increase their market share in this area, has created a very large network of vulnerable devices that can easily be recruited as bot members and used to initiate very large volumetric Distributed Denial of Service (DDoS) attacks. The significance of the problem is evident from the large number of recently revealed attacks on institutions, enterprises and even countries. In the current paper a novel method is introduced, based on a data mining technique that can analyze incoming IP traffic details and give the network administrator early warning of a potentially developing DDoS attack. The method can scale, depending on the available infrastructure, from a conventional laptop computer to a complex cloud infrastructure. As the experiments show, depending on the hardware configuration the method can easily monitor and detect abnormal network traffic of several Gbps in real time using minimal hardware.
For the past few years, climate change and frequent disasters attributed to extreme weather phenomena have received considerable attention. Technical advancements both in hardware, such as sensors, satellites and cluster computing, and in analytical tools, such as machine learning, deep learning and network analysis, have allowed the collection and analysis of a large volume of complex weather-related data. In this chapter, we study the temperatures of European capitals by implementing the novel “General Purpose Sequence Clustering” (GPSC) methodology, which allows the analysis and clustering of numerous long time series using low-cost, widely available commercial hardware. Using this methodology, we have managed to cluster two-year temperature time series of 38 European capitals. The clustering is based not just on typical seasonality but on complex patterns at a more in-depth level. The results showed the efficiency and effectiveness of the methodology by identifying several clusters with similarities that could help weather specialists discover more advanced weather prediction models.
Sequential frequent itemset detection is one of the core problems in data mining, with many applications in business, marketing, data stream analysis, etc. In the current paper, we propose a new methodology based on our previous work on the detection of all repeated patterns in a sequence, i.e., frequent and non-frequent itemsets. By analyzing big datasets of up to one million transactions from the FIMI website, we were able to detect not only the most frequent sequential itemsets but also any sequential itemset that occurred at least twice in the dataset and, therefore, to detect outliers that may be important, an analysis that no other methodology can perform. For this purpose, we have used the novel data structure LERP-RSA (Longest Expected Repeated Pattern-Reduced Suffix Array) and the innovative ARPaD algorithm, which allows the detection of all repeated patterns in a string. The methodology uses a pre-statistical analysis of the transactions, which allows smaller LERP-RSA data structures to be constructed very efficiently for each transaction. The integration and classification of all LERP-RSAs allow the ARPaD algorithm to be executed in parallel, which can accelerate the process and find the itemsets very efficiently.
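To make the idea of "all repeated patterns" concrete, the following is a minimal, naive sketch in Python. It is not the LERP-RSA data structure or the ARPaD algorithm, whose details the abstract does not give; it is a brute-force illustration under the assumption that a pattern is any contiguous substring and that a maximum pattern length is fixed in advance. The function name `repeated_patterns` and the parameter `max_len` are hypothetical.

```python
from collections import defaultdict


def repeated_patterns(sequence, max_len):
    """Return {pattern: [start positions]} for every substring of length
    1..max_len that occurs at least twice in `sequence`.

    Brute-force illustration only: it enumerates every bounded-length
    substring, so it is far slower than a suffix-array-based method.
    """
    positions = defaultdict(list)
    n = len(sequence)
    for start in range(n):
        # Limit the pattern length, loosely echoing the idea of a
        # "longest expected repeated pattern" bound (an assumption here).
        for length in range(1, min(max_len, n - start) + 1):
            positions[sequence[start:start + length]].append(start)
    # Keep frequent and non-frequent patterns alike, as long as they repeat.
    return {p: pos for p, pos in positions.items() if len(pos) >= 2}


if __name__ == "__main__":
    for pattern, starts in sorted(repeated_patterns("abcabc", 3).items()):
        print(pattern, starts)
```

Note that patterns occurring only twice are reported alongside the most frequent ones, which is the property the abstract highlights for outlier detection.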
According to Thomson-Reuters, the top cyber threat today is phishing, in which people are tricked into either clicking a malicious link or giving out personal information. It is a fact that 96% of these phishing attacks come from emails, which amount to more than 3.4 billion daily, as reported by Cisco. Austrian aerospace company FACC, Belgian bank Crelan, Acorn financial services and many other companies recently fell victim to phishing emails, losing millions of dollars. Even though experts provide lists of signs that users should look for in an email to judge whether it is legitimate or a scam, attackers have raised the quality of their email messages, making them believable and very hard to detect. In order to respond to this elevated threat, unconventional user training is required, focusing on recognizing a phishing email. We claim that knowing how an attacker thinks and prepares the attack vector against a target will make users more suspicious when they receive one. ...
How do people make their decisions? Searching for the answer in the relevant literature, we find that decisions are based either on rationality or on intuition. Rational thinking is mainly observed in situations characterized by certainty (in terms of data or the consequences of decisions), while heuristic, intuitive methods are mainly observed in situations of uncertainty. Training for the enhancement of decision-making skills usually employs problem-based activities that focus either only on rationality or only on intuition. However, real-life problems cannot always be solved with only one way of thinking. In a decision-making process, rationality often works up to a point, and then intuition leads to the final decision. For this reason, we designed and developed a game-based learning activity that enhances both rational and intuitive decision-making skills. More specifically, we created a decision scenario in a virtual environment in which parti...
With the emergence of high-speed internet applications and advanced Web 2.0 based Rich Internet Applications (e.g., blogs, wikis, etc.), it has become much easier for users to publish data over the Web. This poses a challenge for Web search solutions in letting individual users find the right information according to their preferences, because traditional Web search engines have been built on a “one size fits all” concept. Different users of the Web may have different preferences, so search results for the same query raised by different users may differ in priority for individual users. In this book chapter, we present the extended version and results of our proposal on community-aware personalized Web search. It is quite challenging for search engines to know the preferences of users. We have designed and developed a unique approach for finding the preferences of users from the relevant parts of the user’s social network and community. We believe that the information related to the queries posed by users may correlate strongly with the relevant information in their social networks. In order to find out personal interest and social context, we find (1) the activities of users in their social network, and (2) relevant information from the user’s social networks, based on our proposed trust and relevance matrices. We have further developed a mechanism that extracts information from the user’s social network to be used to re-rank the results from a search engine. We also discuss the implementation and evaluation details of our proposed solution.
The SIREN project (Secure, Interoperable, UAV-assisted, Rapid Emergency Deployment Communication and sensing Infrastructure) implements a secure, distributed, open, self-configured and emergency-aware network and service platform for automated, secure and dependable support of multiple mission critical applications in highly demanding and dynamic emergency environments.
2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2016
Big data streaming analysis has become one of the most important topics for data analysts, since enormous amounts of data are produced daily by numerous smart devices. The analysis of such data is very important, and the detection of frequent or even non-frequent patterns can be critical for many aspects of our lives. In the current paper, we propose a new methodology, based on our previous work on the detection of all repeated patterns in a string, to analyze a very big data stream of 1 trillion digits, composed of 1 thousand subsequences of 1 billion digits each. More specifically, using the novel data structure LERP Reduced Suffix Array and the innovative ARPaD algorithm, which allows the detection of all repeated patterns in a string, we managed to analyze each one of the 1 billion data points, using 10 computers with a standard hardware configuration, in 33 minutes. This is equivalent to data point generation every 2 microseconds and, to the best of our knowledge, outperforms any other existing methodology.
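The relation between the 33 minutes and the 2 microseconds quoted above can be checked with a short calculation. The sketch below assumes that the 33 minutes refers to processing 1 billion data points, which is the reading under which the two figures agree; the function name `time_per_point_us` is hypothetical.

```python
def time_per_point_us(minutes, points):
    """Average processing time per data point, in microseconds."""
    seconds = minutes * 60
    return seconds / points * 1_000_000


if __name__ == "__main__":
    # 33 minutes = 1980 seconds spread over 10**9 data points.
    per_point = time_per_point_us(33, 10**9)
    print(f"{per_point:.2f} microseconds per data point")  # about 1.98
```

Under that assumption, 1980 s divided by 10^9 points gives roughly 1.98 microseconds per point, consistent with the rounded figure of 2 microseconds.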
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2019
In recent years, disasters attributed to climate change have been reported very frequently, and several reports suggest that these extreme phenomena will further affect people not only as weather disasters but also indirectly, through shortages of natural resources such as water or food. To this end, ongoing research studies weather phenomena by collecting data not only at the surface of the globe but also at different levels of the atmosphere. With such a large volume of data, traditional numerical weather prediction models may not be able to assimilate the data and extract knowledge useful for predicting extreme phenomena. Thus, the analysis of weather data has been transformed into a big data analytics problem, which may enable weather scientists to better understand the interrelations of the weather variables and use the knowledge discovered to improve their prediction models. In this context, the current paper proposes a big data analytics methodology that can detect all common patterns between different weather variables at neighboring or distant points in a specific time window, revealing useful associations between weather variables that cannot otherwise be detected with traditional numerical methods. The proposed methodology is based on a data structure that can store the magnitude of the weather data in different dimensions and a pattern detection algorithm that can detect all common patterns. The experimental results, using weather data from the National Oceanic and Atmospheric Administration (NOAA), revealed interesting, otherwise unknown patterns in two weather variables for the two specific locations that were studied.
This book introduces readers to novel, efficient and user-friendly software tools for power systems studies, to issues related to distributed and dispersed power generation, and to the correlation between renewable power generation and electricity demand. Discussing new methodologies for addressing grid stability and control problems, it also examines issues concerning the safety and protection of transmission and distribution networks, energy storage and power quality, and the application of embedded systems to these networks. Lastly, the book sheds light on the implications of these new methodologies and developments for the economics of the power industry. As such, it offers readers a comprehensive overview of state-of-the-art research on modern electricity transmission and distribution networks.
This chapter identifies the correlation between renewable electricity generation and electricity demand, using Portugal as a case study. It presents the current Portuguese electric system, the installed generation capacity, and the electricity demand pattern for the year 2012. It also presents the 2020 national strategy for this sector, namely the Renewables Plan of Action, its targets and challenges. It focuses on the three main natural resources that exist in the country and describes their technology as well as the technical challenges of integrating renewable generation into the electric system. Through this study, the effectiveness of hydro, solar and wind power in meeting electricity demand in Portugal is investigated. Statistical data on the seasonal and daily availability of the three main resources are presented and related to the electricity generated from those sources and their corresponding capacity factors. In addition, comparative analyses are performed between the electricity demand curve, both seasonal and daily, and statistical data on electricity generation from the renewable sources, using both the Pearson product-moment correlation coefficient and graphical illustrations. The results obtained confirm a correlation between renewables availability, namely hydro and wind power, and electricity demand during a typical year. They also suggest no correlation between demand and a solar/wind combination during a 24-h period. However, they reveal a complementarity between solar and wind power availability during a typical day, highlighting the need for and advantages of energy storage systems and “smart grid” technologies to adjust electricity generation curves to demand load curves. The authors strongly believe that this study can be useful for the development of national strategies for modern electric power systems.
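For readers unfamiliar with the Pearson product-moment correlation coefficient mentioned above, here is a minimal sketch of its computation in Python. The function name `pearson` and the sample series are hypothetical placeholders, not data from the chapter; they only show how a generation curve and a demand curve of equal length could be compared.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length
    numeric sequences. Raises ValueError for unusable input."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / spread


if __name__ == "__main__":
    # Hypothetical illustrative values only (not measurements).
    generation = [1.0, 2.0, 3.0, 4.0]
    demand = [2.0, 4.0, 6.0, 8.0]
    print(pearson(generation, demand))
```

A value near +1 indicates that the two curves rise and fall together, a value near -1 indicates that one rises as the other falls, and a value near 0 suggests no linear relationship.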
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
In recent years, many researchers have extensively studied the importance of social media in every aspect of human life. There are several instances in which information collected from social media not only can help companies and political parties design their strategy but also can save human lives if appropriately analyzed and used. There are numerous cases in which information posted to social media during emergencies and natural disasters was used by emergency responders to get immediate access to the areas in need, or by authorities to acquire a better understanding of the affected areas. In our work, we test whether information from social media can improve the effectiveness of first responders in refugee crises, especially in the Mediterranean Sea. Hundreds of thousands of people attempt to cross the Mediterranean Sea and enter Europe, but several of them lose their lives because of the dangerous boats that smugglers use. We simulated a Search and Rescue (SAR) mission in a virtual environment set in the 3D world of a real Greek island, and we tested our hypothesis that social media can help rescuers locate people in need by applying visual search theory. The experimental results were very promising for the specific application of social media surveillance.
The last two decades have witnessed tremendous and astonishing developments in technology. These pushed a visible revolution in communication and electronics design, leading to the production of computing devices of various sizes and capabilities, ranging from tiny sensors with limited specifications to mobile devices with huge power and rich functionality, among others. They have stimulated researchers and practitioners to work hard to obtain the best possible benefit from such novel devices in the service of humanity. Gathering huge amounts of data is far easier and more affordable than ever before. Indeed, there is a clear shift from paper-based manual data collection to fully automated data collection, even under severe conditions that were never feasible to consider before. Data is captured as a stream, which may encapsulate trends that reveal certain aspects essential to our daily life. Identifying such trends in data streams is the main theme of the study described in this chapter. We mainly concentrate on real-time stream data analysis to better serve time-critical applications where instant decision making is crucial. This study builds on our methodology described in (Xylogiannopoulos et al. Frequent and non-frequent pattern detection in big data streams: an experimental simulation in 1 trillion data points. In: Advances in social networks analysis and mining (ASONAM), pp. 931–938, 2016), which considers detecting all repeated patterns in a big data stream. In the new dynamic approach, a sliding window is employed with the LERP Reduced Suffix Array and the ARPaD algorithm to analyze one trillion digits composed of one million subsequences of one million digits each. We achieved a rate equivalent to generating one data point every 300 ns.
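The sliding-window idea mentioned above can be sketched generically as follows. This is not the chapter's implementation, which combines the window with the LERP Reduced Suffix Array and ARPaD; it only illustrates, under the assumption that the stream is an iterable of single-character strings, how fixed-size windows can be cut from a stream without holding the whole stream in memory. The names `sliding_windows`, `size` and `step` are hypothetical.

```python
from collections import deque


def sliding_windows(stream, size, step=1):
    """Yield successive windows of `size` items from `stream` as strings,
    advancing by `step` items between consecutive windows.

    Only the current window is kept in memory, so the stream itself may
    be arbitrarily long. Items are assumed to be strings (e.g. digits).
    """
    window = deque(maxlen=size)  # the oldest item is dropped automatically
    since_last = 0               # items consumed since the last emitted window
    for item in stream:
        window.append(item)
        since_last += 1
        if len(window) == size and since_last >= step:
            yield "".join(window)
            since_last = 0


if __name__ == "__main__":
    for w in sliding_windows(iter("0123456"), size=4, step=2):
        print(w)
```

Each emitted window could then be passed to a pattern detection routine; the choice of `size` and `step` trades detection latency against the amount of overlapping work.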
Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015
Sequential frequent itemset detection is one of the core problems in data mining. In the current paper we propose a new methodology based on our previous work on the detection of all repeated patterns in a string. By analyzing big datasets of up to one million transactions from the FIMI website, we were able to detect not only the most frequent sequential itemsets but any sequential itemset that occurred at least twice in the transaction database. For this purpose we have used a novel data structure, the LERP Reduced Suffix Array, and the innovative ARPaD algorithm, which allows the detection of all repeated patterns in a string. The methodology uses a pre-statistical analysis of the transactions that allows smaller LERP-RSA data structures to be constructed very efficiently for each transaction. The integration and classification of all LERP-RSAs allow the ARPaD algorithm to be executed in parallel and to detect, very efficiently, every sequential itemset that occurs at least twice.
Internet-enabled devices or Internet of Things as it has been prevailed are increasing exponentia... more Internet-enabled devices or Internet of Things as it has been prevailed are increasing exponentially every day. The lack of security standards in the manufacturing of these devices along with the haste of the manufacturers to increase their market share in this area has created a very large network of vulnerable devices that can be easily recruited as bot members and used to initiate very large volumetric Distributed Denial of Service (DDoS) attacks. The significance of the problem can be easily acknowledged due to the large number of cases regarding attacks on institutions, enterprises and even countries which have been recently revealed. In the current paper a novel method is introduced, which is based on a data mining technique that can analyze incoming IP traffic details and early warn the network administrator about a potentially developing DDoS attack. The method can scale depending on the availability of the infrastructure from a conventional laptop computer to a complex cloud infrastructure. Based on the hardware configuration as it is proved with the experiments the method can easily monitor and detect abnormal network traffic of several Gbps in real time using the minimum hardware equipment.
For the past few years, climate changes and frequent disasters that are attributed to extreme wea... more For the past few years, climate changes and frequent disasters that are attributed to extreme weather phenomena have received considerable attention. Technical advancement both in hardware, such as sensors, satellites, cluster computing, etc., and analytical tools such as machine learning, deep learning, network analysis, etc., have allowed the collection and analysis of a large volume of complex weather related data. In this chapter, we study the European capital temperatures by implementing the novel “General Purpose Sequence Clustering” methodology (GPSC), which allows to analyze and cluster numerous long time series using commercial widely available hardware of low cost. Using the specific methodology, we have managed to cluster two-years temperature time series of 38 European capitals. This is not just based on typical seasonality but in a more in-depth level using complex patterns. The results showed the efficiency and effectiveness of the methodology by identifying several clusters showing similarities that could help weather specialists in discovering more advanced weather prediction models.
Sequential frequent itemsets detection is one of the core problems in data mining with many appli... more Sequential frequent itemsets detection is one of the core problems in data mining with many applications in business, marketing, data stream analysis, etc. In the current paper, we propose a new methodology based on our previous work regarding the detection of all repeated patterns in a sequence, i.e., frequent and non-frequent itemsets. By analyzing big datasets from FIMI website of up to one million transactions we were able to detect not only the most frequent sequential itemsets, but also any sequential itemset that occurred at least twice in the dataset and, therefore, detect outliers which may be important while no other methodology can perform such analysis. For this purpose, we have used the novel data structure LERP-RSA (Longest Expected Repeated Pattern-Reduced Suffix Array) and the innovative ARPaD algorithm which allows the detection of all repeated patterns in a string. The methodology uses a pre-statistical analysis of the transactions and this allows constructing in a very efficient way smaller LERP-RSA data structures for each transaction. The integration and classification of all LERP-RSAs let the ARPaD algorithm to be executed in parallel which can accelerate the process and find the itemsets in a very efficient way.
According to Thomson-Reuters the top cyber threat today is phishing in which people are tricked e... more According to Thomson-Reuters the top cyber threat today is phishing in which people are tricked either to click a malicious link or give out personal information. It’s a fact that 96% of these phishing attacks comes from emails, which amount to more than 3.4 billion daily, as reported by Cisco. Austrian aerospace company FACC, Belgian bank Crelan, Acorn financial services and many other companies were recently fell victims of phishing emails losing millions of dollars. Even if experts provide lists of signs that users should seek in an email in order to understand if it is legitimate or scam, the attackers have elevated the quality of the email messages making them believable and very hard to discern them. In order to respond to this elevated threat, unconventional user training is required, focusing on recognizing a phishing email. Knowing how an attacker thinks and prepares the attack vector against a target, we claim that it will make users more suspicious when they receive one. ...
How do people make their decisions? Searching for the answer in the relevant literature, we can f... more How do people make their decisions? Searching for the answer in the relevant literature, we can find that decisions are based either on rationality or intuition. Rational thinking is mainly observed in situations characterized by certainty (in terms of data or the consequences of decisions), while heuristic intuitive methods are mainly observed in situations of uncertainty. Training for the enhancement of decision making skills usually employs problem-based activities which mainly focus either only on rationality or only on intuition. However, problems in real life cannot always be solved with the contribution of only one way of thinking. In a decision making process often rationality works up to an extent and then intuition will lead to the final decision. For this reason, we designed and developed a game-based learning activity that enhances both rational and intuitive decision making skills. More specifically, we created a decision scenario in a virtual environment in which parti...
ABSTRACT With the emergence of high speed internet applications and advanced Web 2.0 based Rich I... more ABSTRACT With the emergence of high speed internet applications and advanced Web 2.0 based Rich Internet Applications (i.e., blogs, wikis, etc.), it has become much easier for the users to publish data over the Web. This brings a challenge for the Web search solutions to let individual users find the right information as per their preferences, because traditional Web search engines have been built on “one size fits for all” concept. Different users of the Web may have different preferences. Search results for the same query raised by different users may differ in priority for individual users. In this book chapter, we present the extended version and results of our proposal on community-aware personalized Web search. It is quite challenging to know the preferences of the users by the search engines. We have designed and developed our unique approach of finding the preferences of users from the relevant parts of the user’s social network and community. We believe that the information related to the queries posed by the users may have strong correlation with the relevant information in their social networks. In order to find out personal interest and social-context, we find (1) activities of users in their social-network, and (2) relevant information from user’s social networks, based on our proposed trust and relevance matrices. We have further developed a mechanism that extracts from user’s social network information to be used to re-rank search results from a search engine. We also have discussed the implementation and evaluation details of our proposed solution.
The SIREN project (Secure, Interoperable, UAV-assisted, Rapid Emergency Deployment Communication ... more The SIREN project (Secure, Interoperable, UAV-assisted, Rapid Emergency Deployment Communication and sensing Infrastructure) implements a secure, distributed, open, self-configured and emergency-aware network and service platform for automated, secure and dependable support of multiple mission critical applications in highly demanding and dynamic emergency environments.
2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2016
Big data streaming analysis nowadays has become one of the most important topic in the list of da... more Big data streaming analysis nowadays has become one of the most important topic in the list of data analysts since enormous amount of data are produced daily by the numerous smart devices. The analysis of such data is very important and the detection of frequent or even non-frequent patterns can be critical for many aspects of our lives. In the current paper, we propose a new methodology based on our previous work regarding the detection of all repeated patterns in a string in order to analyze a very big data stream with 1 Trillion digits, composed from 1 thousand subsequences of 1 billion digits each one. More specifically, using the novel data structure, LERP Reduced Suffix Array, and the innovative ARPaD algorithm which allows the detection of all repeated patterns in a string we managed to analyze each one of the 1 billion data points, using 10 computers with standard hardware configuration, in 33 minutes which outperforms to the best of our knowledge any other existing methodology, which is equivalent to data point generation every 2 microseconds.
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2019
In recent years, there are very frequent reports of disasters attributed to the climate change an... more In recent years, there are very frequent reports of disasters attributed to the climate change and there are several reports that these extreme phenomena will further affect people not only as weather disasters but also indirectly with the shortage of natural resources such as water or food due to the climate change. Towards this direction, there is an on-going research that studies weather phenomena by collecting data not only in the surface of the globe but also at the different levels of the atmosphere. Having such a large volume of data, traditional numerical weather prediction models may not be able to assimilate those data and extract knowledge useful for the prediction of extreme phenomena. Thus, analysis of weather data has been transformed into a big data analytics problem which may enable weather scientists to better understand the interrelations of the weather variables and use the knowledge discovered to improve their prediction models. In this context, the current paper proposes a big data analytics methodology that is able to detect all common patterns between different weather variables in neighboring or distant points in a specific time window revealing useful associations between weather variables which is not possible to detect otherwise with the traditional numerical methods. The proposed methodology is based on a data structure that is able to store the magnitude of the weather data in different dimensions and a pattern detection algorithm which is able to detect all common patterns. The experimental results using weather data from the National Oceanic and Atmospheric Administration (NOAA) revealed interesting otherwise unknown patterns in two weather variables for two specific locations that were studied.
This book introduces readers to novel, efficient and user-friendly software tools for power syste... more This book introduces readers to novel, efficient and user-friendly software tools for power systems studies, to issues related to distributed and dispersed power generation, and to the correlation between renewable power generation and electricity demand. Discussing new methodologies for addressing grid stability and control problems, it also examines issues concerning the safety and protection of transmission and distribution networks, energy storage and power quality, and the application of embedded systems to these networks. Lastly, the book sheds light on the implications of these new methodologies and developments for the economics of the power industry. As such, it offers readers a comprehensive overview of state-of-the-art research on modern electricity transmission and distribution networks.
This chapter identifies the correlation between renewable electricity generation and electricity demand, using Portugal as a case study. It presents the current Portuguese electric system, the installed generation capacity, and the electricity demand pattern for the year 2012. It also presents the 2020 national strategy for this sector, namely the Renewables Plan of Action, its targets and challenges. It focuses on the three main natural resources that exist in the country and describes their technology as well as the technical challenges for the integration of renewable generation in the electric system. Through this study, the effectiveness of hydro, solar and wind power in meeting electricity demand in Portugal is investigated. Statistical data on the seasonal and daily availability of the three main resources are presented and related to the electricity generated from those sources and their corresponding capacity factors. In addition, comparative analyses are performed between the electricity demand curve, both seasonal and daily, and statistical data on the electricity generation from the renewable sources, using both the Pearson product-moment correlation coefficient and graphical illustrations. The results obtained confirm a correlation between the availability of renewables, namely hydro and wind power, and electricity demand during a typical year. They also suggest no correlation between demand and a solar/wind combination during a 24-h period. However, they reveal a complementarity between solar and wind power availability during a typical day, highlighting the need for and advantages of energy storage systems and “smart grid” technologies to adjust electricity generation curves to demand load curves. The authors strongly believe that this study can be useful for the development of national strategies for modern electric power systems.
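As a rough illustration of the Pearson product-moment correlation coefficient mentioned in the abstract above, the following minimal sketch computes it for two equally long series. The function name `pearson_r` and the sample values are hypothetical and are not data from the chapter.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical monthly values, for illustration only.
    demand = [4.1, 3.9, 3.6, 3.4, 3.5, 3.8]
    wind_generation = [1.2, 1.1, 0.9, 0.7, 0.8, 1.0]
    print(round(pearson_r(demand, wind_generation), 3))
```

A value near 1 or -1 indicates a strong linear relationship between the two series, while a value near 0 suggests none.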
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Many researchers have extensively studied the importance of social media in every aspect of human life in recent years. There are several instances in which information collected from social media can not only help companies and political parties design their strategy but can also save human lives if appropriately analyzed and used. There are numerous cases in which information posted to social media during emergencies and natural disasters was used by emergency responders to get immediate access to the areas in need or by authorities to acquire a better understanding of the affected areas. In our work, we attempt to test whether information from social media can improve the effectiveness of first responders in refugee crisis areas, especially in the Mediterranean Sea. Hundreds of thousands of people attempt to cross the Mediterranean Sea and enter Europe, but several of them lose their lives because of the dangerous boats that smugglers use. We simulated a Search and Rescue (SAR) mission in a virtual environment that takes place in the 3D world of a real Greek island, and we tested our hypothesis that social media can help rescuers locate people in need by applying visual search theory. The experimental results were very promising for the specific application of social media surveillance.
The last two decades witnessed tremendous and astonishing developments in technology. This pushed for a visible revolution in communication and electronics design, leading to the production of computing devices of various sizes and capabilities, ranging from tiny sensors with limited specifications to mobile devices with huge power and rich functionalities, among others. These stimulated researchers and practitioners to work hard seeking the best possible benefit from such novel devices to serve humanity. Gathering huge amounts of data is far easier and more affordable than ever before. Indeed, there is a clear shift from paper-based manual data collection to totally automated data collection, even under severe conditions which were never feasible to consider before. Data is captured as a stream which may encapsulate trends that reveal certain aspects essential to our daily life. Identifying such trends in data streams is the main theme of the study described in this chapter. We mainly concentrate on real-time stream data analysis to better serve time-critical applications where instant decision making is crucial. This study builds on our methodology described in (Xylogiannopoulos et al. Frequent and non-frequent pattern detection in big data streams: an experimental simulation in 1 trillion data points. In: Advances in social networks analysis and mining (ASONAM), pp. 931–938, 2016), which considers detecting all repeated patterns in a big data stream. In the new dynamic approach, a sliding window is employed with the LERP Reduced Suffix Array and the ARPaD algorithm to analyze one trillion digits composed of one million subsequences of one million digits each. We achieved a processing rate equivalent to generating one data point every 300 ns.
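The sliding-window idea in the abstract above can be sketched as follows. This is a deliberately naive illustration that recounts fixed-length patterns in every window; it is not the LERP Reduced Suffix Array or the ARPaD algorithm, and the function name `sliding_repeats` and its parameters are hypothetical.

```python
from collections import Counter, deque


def sliding_repeats(stream, window, k):
    """Yield, for each full window, the length-k patterns seen at least twice.

    stream: iterable of single-character symbols (for example, digits).
    window: number of most recent symbols kept in memory.
    k: length of the patterns to count inside each window.
    """
    if k < 1 or window < k:
        raise ValueError("require 1 <= k <= window")
    recent = deque(maxlen=window)  # the oldest symbol drops out automatically
    for symbol in stream:
        recent.append(symbol)
        if len(recent) == window:
            text = "".join(recent)
            # Count every overlapping substring of length k in the window.
            counts = Counter(text[i:i + k] for i in range(window - k + 1))
            yield {p: c for p, c in counts.items() if c >= 2}


if __name__ == "__main__":
    for repeats in sliding_repeats("121212", window=4, k=2):
        print(repeats)
```

Recounting each window from scratch keeps the sketch short; a production approach would update the counts incrementally as symbols enter and leave the window.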
Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2015
Sequential frequent itemset detection is one of the core problems in data mining. In the current paper we propose a new methodology based on our previous work on the detection of all repeated patterns in a string. By analyzing big datasets from the FIMI website of up to one million transactions, we were able to detect not only the most frequent sequential itemsets but any sequential itemset that occurred at least twice in the transactions' database. For this purpose we have used a novel data structure, the LERP Reduced Suffix Array (LERP-RSA), and the innovative ARPaD algorithm, which allows the detection of all repeated patterns in a string. The methodology uses a pre-statistical analysis of the transactions that allows smaller LERP-RSA data structures to be constructed very efficiently for each transaction. The integration and classification of all LERP-RSAs allow the ARPaD algorithm to be executed in parallel and to detect very efficiently every sequential itemset that occurs at least twice.
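To make "all repeated patterns in a string" concrete, the sketch below finds every substring that occurs at least twice using a plain sorted suffix list. It is a textbook-style illustration only, not the LERP-RSA data structure or the ARPaD algorithm from the paper, and the function names are hypothetical.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, overlapping ones included."""
    last_start = len(text) - len(pattern)
    return sum(1 for i in range(last_start + 1) if text.startswith(pattern, i))


def repeated_patterns(text):
    """Map every substring occurring at least twice to its occurrence count."""
    # Naive suffix array: suffix start positions in lexicographic order.
    order = sorted(range(len(text)), key=lambda i: text[i:])
    found = set()
    for a, b in zip(order, order[1:]):
        # Length of the longest common prefix of two neighbouring suffixes.
        lcp = 0
        while (a + lcp < len(text) and b + lcp < len(text)
               and text[a + lcp] == text[b + lcp]):
            lcp += 1
        # Every prefix of that common part occurs at least twice.
        for length in range(1, lcp + 1):
            found.add(text[a:a + length])
    return {p: count_occurrences(text, p) for p in found}


if __name__ == "__main__":
    print(sorted(repeated_patterns("abcabc").items()))
```

Sorting whole suffixes is quadratic or worse in memory and time, so this only suits short strings; it is meant to show what the output looks like rather than how to reach the scale reported in the abstract.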