2019 7th International conference on ICT & Accessibility (ICTA)
Humans prefer to interact with their smart devices through speech rather than with pointing tools and keyboards. Recent mobile devices can automatically convert voice to text, but they still suffer from many efficiency issues. In this paper, we present a new approach to vocal command processing based on Information Retrieval concepts (i.e. text pre-processing, cosine similarity) as well as machine learning algorithms, which enables smart mobile devices to access applications' functionalities better and faster.
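To make the Information Retrieval step concrete, here is a minimal sketch of matching a transcribed voice command against short textual descriptions of application functionalities using cosine similarity over term-frequency vectors. It is an illustration under stated assumptions, not the paper's implementation: the stopword list, the catalog entries, and the function names (`preprocess`, `cosine_similarity`, `best_match`) are all hypothetical, and the machine learning stage mentioned in the abstract is not covered.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"the", "a", "an", "to", "my", "please", "of", "for"}


def preprocess(text):
    """Lowercase, tokenize on alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bags of words (term-frequency vectors)."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(command, catalog):
    """Return (functionality_name, score) for the most similar description."""
    query = preprocess(command)
    scored = [
        (cosine_similarity(query, preprocess(description)), name)
        for name, description in catalog.items()
    ]
    score, name = max(scored)
    return name, score


# Hypothetical catalog of application functionalities.
CATALOG = {
    "alarm.set": "set an alarm clock",
    "sms.send": "send a text message",
    "camera.open": "open the camera take photo",
}
```

With this catalog, `best_match("please send a message to Bob", CATALOG)` selects `sms.send`, because two of the three query terms overlap with that description and none overlap with the others.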
2020 International Symposium on Networks, Computers and Communications (ISNCC)
Crowd-sourced air traffic communication networks have gained importance over the past decade. They use distributed networks that are randomly deployed. In contrast to traditional, carefully planned receiver networks, the crowd-sourced use of cheap sensors poses a number of new challenges to existing localization algorithms. The purpose of this paper is twofold: first, to survey the literature on real-time aircraft tracking in crowd-sourced air traffic networks, and second, to sketch a big data architecture allowing real-time tracking of aircraft and prediction of missing localization data.
2017 International Symposium on Networks, Computers and Communications (ISNCC)
The purpose of this paper is twofold: first, to survey the literature on the fundamentals of Multidimensional Database Modeling, and second, to sketch a research agenda that addresses the four V's of Big Data, namely Volume, Velocity, Veracity, and Variety.
2018 International Conference on Smart Communications and Networking (SmartNets), 2018
Large volumes of sensor network data and trajectories of mobile objects are being collected. Such data offer high-value knowledge for understanding moving objects and locations, fostering a broad range of applications in smart cities and enabling intelligent transportation systems and intelligent urban computing. The next generation of roads needs to be intelligent to accommodate a future transition to fully autonomous vehicles. Consequently, we need to engineer scalable and smart Trajectory Data Analytics Systems in order to analyze both historical data and real-time data flows, derive insights, and convert those insights into decisions and actions. The purpose of this paper is, first, to identify the key functional and non-functional requirements that a Trajectory Data Analytics System must meet and, second, to survey open-source technologies designed for the analysis of general geo-referenced data.
Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems, 2021
According to the American National Institute of Environmental Health Sciences (NIEHS), air pollutants are harmful to the health of humans and other living beings, and cause damage to the climate and to the ecosystem by polluting lakes, streams, and soils. Recent developments in sensor and Internet of Things (IoT) technologies provide an opportunity to use sensor networks to measure air quality, in real time, at a large number of locations. The adoption and deployment of IoT technologies for sensing air quality raise a challenging research agenda related to big data processing, including data analysis, scalable architectures, and algorithms for best managing and processing IoT data at different edges of the IoT ecosystem. In response to the DEBS'2021 contest, we design and implement a scalable solution for comparing the previous year's and the current year's air quality indexes for German cities, as well as for computing each city's longest streak of good air quality. It is based on, first, Apache Spark, an open-source unified analytics engine for large-scale data processing, and second, Apache Sedona, for creating spatial indexes and performing spatial operations over large-scale spatial data.
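The "longest streak of good air quality" computation can be illustrated on a single city with a short single-machine sketch. It is not the authors' Spark/Sedona implementation: the function name, the input layout (time-ordered `(timestamp_seconds, aqi)` pairs), and the threshold of 50 for "good" air quality are assumptions made for the example.

```python
def longest_good_streak(readings, good_threshold=50.0):
    """Return the duration, in seconds, of the longest run of consecutive
    readings whose AQI is at or below `good_threshold`.

    `readings` is an iterable of (timestamp_seconds, aqi) pairs sorted by
    timestamp. Both the layout and the threshold are illustrative assumptions.
    """
    best = 0
    streak_start = None  # timestamp at which the current good streak began
    for timestamp, aqi in readings:
        if aqi <= good_threshold:
            if streak_start is None:
                streak_start = timestamp
            best = max(best, timestamp - streak_start)
        else:
            # A bad reading ends the current streak.
            streak_start = None
    return best


# Hypothetical readings for one city, one reading every 5 minutes.
SAMPLE = [(0, 40), (300, 45), (600, 80), (900, 30), (1200, 20), (1500, 10), (1800, 60)]
```

On `SAMPLE`, the first good streak spans 300 seconds and the second spans 600 seconds, so the function returns 600. In a distributed setting the same logic would run per city after grouping sensor readings by the city polygon that contains them.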
Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, 2018
In this paper, we propose scalable algorithms allowing, first, to infer a map of vessels' trajectories and, second, to predict future locations of a vessel at sea. Our system is based on Apache Spark, a fast and scalable engine for large-scale data processing. The training dataset is event-based. Each event records the GPS position of the vessel at a timestamp. We propose and implement a workflow that computes trip patterns, with the GPS locations of each trip summarized using GeoHashing. The latter is an efficient encoding of a geographic location into a short string of letters and digits. In order to perform prediction queries efficiently, we propose (i) a geohash positional index which maps each geohash to a list of pairs (trip-pattern-identifier, offset of the geohash in the geohash sequence of the trip-pattern), (ii) a departure-port index which maps each departure port to a list of trip-pattern identifiers, as well as (iii) a pairwise geohash sequence alignment allowing t...
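The geohash encoding and the two indexes described in (i) and (ii) can be sketched as follows. This is a single-machine illustration, not the authors' Spark code: the function names, the layout of `trip_patterns`, and the sample port names are assumptions. The encoder follows the standard geohash scheme (interleaved longitude and latitude bits, five bits per base-32 character).

```python
from collections import defaultdict

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair into a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits yield one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def build_indexes(trip_patterns):
    """Build the geohash positional index and the departure-port index.

    `trip_patterns` maps a pattern id to (departure_port, geohash_sequence);
    this layout is an assumption made for the example.
    """
    positional = defaultdict(list)  # geohash -> [(pattern_id, offset), ...]
    by_port = defaultdict(list)     # departure port -> [pattern_id, ...]
    for pattern_id, (port, sequence) in trip_patterns.items():
        by_port[port].append(pattern_id)
        for offset, geohash in enumerate(sequence):
            positional[geohash].append((pattern_id, offset))
    return positional, by_port


# Hypothetical trip patterns.
PATTERNS = {
    "p1": ("PORT_A", ["u4pru", "u4prv", "u4prw"]),
    "p2": ("PORT_B", ["u4prv", "u4prx"]),
}
```

Given a vessel's current geohash, a lookup in the positional index returns every trip pattern that passes through that cell together with the position in the pattern, so the cells that follow in the sequence are candidate future locations; the departure-port index narrows the candidates to patterns starting at the vessel's port.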
Zhongguo Zhong yao za zhi = Zhongguo zhongyao zazhi = China journal of Chinese materia medica, 2021
A UPLC-MS/MS method for the rapid and simultaneous determination of psoralen, isopsoralen, apigenin, genistein, bavaisoflavone, neobavaisoflavone, bavachin, bavachinin, psoralenoside, and isopsoralenoside from Psoraleae Fructus in beagle dog plasma was established and then applied in a pharmacokinetic study after oral administration of Psoraleae Fructus extract to beagle dogs. The pharmacokinetic parameters were calculated with the WinNonlin software. A Waters HSS-T3 column (2.1 mm × 100 mm, 1.8 μm) was used for liquid chromatography separation, with acetonitrile-water (containing 0.004% formic acid) as the mobile phase for gradient elution. Mass spectrometric detection used an electrospray ionization (ESI) source in multiple reaction monitoring (MRM) mode, as well as positive ion mode. The analysis takes only 8.5 min. The method was validated in terms of specificity, accuracy, precision, linear range, recovery, matrix effect, and stability. The LC-MS analysis metho...
Smart farming and IoT technologies open up a new research agenda covering several inter-related areas within a Farm Management Information System, such as robot programming, task scheduling, and sensor data capture, management, and processing at different layers of the IoT ecosystem. Many research works address these topics, but to the best of our knowledge none has contributed a fully featured architecture design for monitoring and scheduling autonomous agricultural robots. In this paper, we propose the skeleton of an architecture for this kind of IoT system, called LambdAgrIoT. It is designed to support big data and different types of workload (real-time, near real-time, analytic, and transactional). We present the main features of each layer, as well as the implementation details and deployment of the Data Source and Speed layers in a real environment. The paper also discusses the open issues related to the other layers and to the deployment of the overall architecture at large scale.
The Image Adjustment Algorithm that an LH*RS client executes to update its image when the IAM comes back is as follows. Here a is the address of the last bucket to forward the request to the correct one, and j is the level of bucket a. These values are in the IAM. Notice that they come from a different bucket than the one considered in Litwin et al. [1996], which was the first bucket to receive the request. In many cases, the change produces an image whose extent is closer to the actual one. The search for key c = 60 in the file in Figure 1(b) illustrates one such case.
NoSQL systems rose alongside internet companies, which faced data management challenges that traditional RDBMS solutions could not cope with. Indeed, in order to handle the continuous growth of data efficiently, NoSQL technologies feature dynamic horizontal scaling rather than vertical scaling. To date, few studies address On-Line Analytical Processing challenges and solutions using NoSQL systems. In this paper, we first give an overview of NoSQL and adjacent technologies, then discuss analytics challenges in the cloud, and propose a taxonomy for a decision-support system workload, as well as specific analytics scenarios. The proposed scenarios aim at the best performance, the best availability, and a tradeoff between space, bandwidth, and computing overheads. Finally, we evaluate Hadoop/Pig using the TPC-H benchmark under different analytics scenarios, and report thorough performance tests on Hadoop for various data volumes, workloads, and cluster sizes.
Proceedings of the 13th International Conference on Management of Digital EcoSystems, 2021
The adoption and deployment of Internet of Things (IoT) technologies in agroecology raise a challenging research agenda. Agroecology IoT projects feature complex requirements involving heterogeneous hardware and software systems; data collection architectures; stream and queuing systems; and data management systems for real-time and batch processing with different data models. On top of that, agroecology IoT applications are characterized by complex spatio-temporal data and low-quality communication networks. Developing conceptual models of such complex systems is mandatory for successful projects, but it is much more challenging than for traditional systems. To the best of our knowledge, a comprehensive (end-to-end) data modeling method applicable to such systems has not been provided yet. This motivated us to propose and assess a new UML profile for data modeling across an IoT ecosystem for agroecology applications. The modeling approach makes it possible to represent the following components of a system: data producers, data integration and storage, and data analytics. The profile has been validated in a real project on monitoring autonomous agricultural robots.
Background: The aim of this study was to characterize the transmission chains and clusters of COVID-19 infection in Tunisia. Methods: All cases were confirmed by Reverse Transcriptase Polymerase Chain Reaction of a nasopharyngeal specimen. Contact tracing was undertaken for all confirmed cases in order to identify close contacts, who were systematically screened and quarantined. Transmission chains were identified based on field investigation, contact tracing, and results of screening tests, and by assessing all probable modes of transmission and interactions. Results: As of May 18, 2020, 656 cases out of a total of 1043 confirmed cases of Coronavirus disease 2019 belonged to 127 transmission chains identified during the epidemic (mean age 42.36 years, standard deviation 19.56, and sex ratio 0.86). Virus transmission was most concentrated in the governorates of Tunis (31.5%), Ariana (10.2%) and Ben Arous (10.2%). Virus transmission occurred 50 times (9.72% of secondary transmission even...
25th International Database Engineering & Applications Symposium, 2021
Big data systems are becoming mainstream for big data management, for both batch processing and real-time processing. In order to extract insights from data, it is particularly important to address quality issues. A veracity assessment model is consequently needed. In this paper, we propose a model which ties the quality of datasets to the quality of query result sets. In particular, we examine the quality issues raised by a given dataset, rank attributes by their fitness for use, and correlate veracity metrics with business queries. We validate our work using the open NYC taxi trips dataset.
Efficient implementations of DPLL with the addition of clause learning are the fastest complete Boolean satisfiability solvers and can handle many significant real-world problems, such as verification, planning and design. Despite its importance, little is known of the ultimate strengths and limitations of the technique. This paper presents the first precise characterization of clause learning as a proof system (CL), and begins the task of understanding its power by relating it to the well-studied resolution proof system. In particular, we show that with a new learning scheme, CL can provide exponentially shorter proofs than many proper refinements of general resolution (RES) satisfying a natural property. These include regular and Davis-Putnam resolution, which are already known to be much stronger than ordinary DPLL. We also show that a slight variant of CL with unlimited restarts is as powerful as RES itself. Translating these analytical results to practice, however, presents a c...
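For readers unfamiliar with the baseline procedure, the following is a minimal sketch of ordinary DPLL (unit propagation plus chronological backtracking), which is the starting point that clause learning extends. It deliberately does not implement clause learning, restarts, or the learning scheme studied in the paper. The clause representation (lists of non-zero integers, with a negative integer denoting a negated variable) and the function names are assumptions made for the example.

```python
def simplify(clauses, literal):
    """Assign `literal` to true: drop satisfied clauses and remove the
    opposite literal from the remaining ones."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause is satisfied
        result.append([l for l in clause if l != -literal])
    return result


def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool}, or None if the
    formula is unsatisfiable. Variables that never need a value are omitted."""
    assignment = dict(assignment or {})

    # Unit propagation: repeatedly assign literals forced by unit clauses.
    while True:
        if any(len(clause) == 0 for clause in clauses):
            return None  # an empty clause signals a conflict
        units = [clause[0] for clause in clauses if len(clause) == 1]
        if not units:
            break
        literal = units[0]
        assignment[abs(literal)] = literal > 0
        clauses = simplify(clauses, literal)

    if not clauses:
        return assignment  # every clause is satisfied

    # Branch on the first literal of the first remaining clause.
    literal = clauses[0][0]
    for choice in (literal, -literal):
        extended = {**assignment, abs(choice): choice > 0}
        result = dpll(simplify(clauses, choice), extended)
        if result is not None:
            return result
    return None


# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
SAT_CNF = [[1, 2], [-1, 2], [-2, 3]]
# x1 and not x1
UNSAT_CNF = [[1], [-1]]
```

A clause-learning solver differs at the conflict step: instead of simply backtracking, it derives a new clause from the conflict and adds it to the formula, which is what allows it to be analyzed as a proof system in its own right.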
Papers by Rim Moussa